Bug #1306

sed(1) adds trailing newline

Added by hasso almost 6 years ago. Updated over 5 years ago.

Status: Closed
Start date:
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: -
Target version: -

Description

Our gzip(1) has trouble unpacking lokigames/idsoftware etc. archives (a
shell script and a tar.gz in one file), which makes these packages fail in
pkgsrc. The main trouble is that the cause appears to be neither gzip(1)
itself nor, probably, libz, but something else. I have exactly
the same problem with GNU gzip from pkgsrc in DragonFly, but GNU gzip
works just fine with these archives on every other platform I have access
to. Our tar doesn't have a problem with these files either (it doesn't use
gzip, but libz directly).

Finding out what exactly causes it is beyond my skills at the moment ...

The testcase:

$ fetch \
ftp://ftp.estpak.ee/pub/FreeBSD/ports/distfiles/linuxq3apoint-1.32b.x86.run
$ sed '1,265d' linuxq3apoint-1.32b.x86.run | gzip -cd > /dev/null
gzip: input not gziped (MAGIC0)
$
$ sed '1,265d' linuxq3apoint-1.32b.x86.run | /usr/pkg/bin/gzip -cd \
> /dev/null
gzip: stdin: unexpected end of file
$
$ sed '1,265d' linuxq3apoint-1.32b.x86.run | /usr/bin/bsdtar zxfO - \
> /dev/null
$

History

#1 Updated by delphij almost 6 years ago

Hi, Hasso,

NetBSD r1.87 of src/usr.bin/gzip/gzip.c (you would probably also want
1.88) gives gzip(1) a better chance of surviving such an archive: it
issues a warning instead of failing. FWIW, FreeBSD's gzip does it this
way as well.

Personally, I would be inclined to issue such a warning (so the user
knows that something is wrong with the archive, but can still obtain
the data) rather than match gzip behavior 100%.
What do you think about this case?

Cheers,
- --
Xin LI http://www.delphij.net/
FreeBSD - The Power to Serve!

#2 Updated by hasso almost 6 years ago

Yes, I know. I updated gzip locally to the latest code from NetBSD. It
then gives a warning, but unpacks. But it also returns a nonzero value,
which means that "gtar zxf" doesn't work: gtar errors out.

Maybe ...

I wouldn't mind changing the code to whatever behavior, but ... why does
the very same gzip work _without_ warning/error with the same archive on
NetBSD? Why does GNU gzip work _without_ warning/error with the same
archive on every platform but DragonFly?

#3 Updated by qhwt+dfly almost 6 years ago

gzip is complaining about the newline that sed appended.

$ sed '1,265d' linuxq3apoint-1.32b.x86.run >a
$ perl -e 'open(A,"./linuxq3apoint-1.32b.x86.run");$_=join("",<A>);$_=~s/.*?\neval \$finish; exit \$res\n//s;print' >b
$ ls -l a b
-rw-r--r-- 1 qhwt wheel 31472010 Mar 4 09:19 a
-rw-r--r-- 1 qhwt wheel 31472009 Mar 4 09:17 b
$ hd -vs 31472000 a
01e03980 01 b1 e3 8c 0a 00 70 47 02 0a |......pG..|
01e0398a
$ hd -vs 31472000 b
01e03980 01 b1 e3 8c 0a 00 70 47 02 |......pG.|
01e03989
$ gzip -vt a b
gzip: input not gziped (MAGIC0)
a: NOT OK
gzip: a: uncompress failed
b: OK
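
A minimal sketch of a byte-exact way to split off such a payload, assuming the header is a known number of newline-terminated lines. It uses tail -n +N, which copies the remaining bytes through unchanged, instead of sed. The file names and the two-line header below are made up for illustration; for the archive in this report the equivalent would presumably be tail -n +266.

```shell
#!/bin/sh
# Sketch: build a tiny self-extracting-style file (text header + gzip
# payload) and extract the payload with tail instead of sed.
# All names here are hypothetical, chosen only for this example.
workdir=$(mktemp -d)

# The compressed payload that would normally follow the shell script.
printf 'payload contents\n' | gzip -c > "$workdir/payload.gz"

# Two header lines, then the raw gzip data with nothing appended.
{
    printf '#!/bin/sh\n'
    printf 'echo "header script"; exit 0\n'
    cat "$workdir/payload.gz"
} > "$workdir/archive.run"

# tail -n +3 starts output at line 3, right after the 2 header lines,
# and does not add a newline at the end of the stream.
extracted=$(tail -n +3 "$workdir/archive.run" | gzip -cd)
echo "$extracted"

# Remove "$workdir" when done.
```

Because tail only skips lines and then passes the rest through, the extracted stream should be identical to the original payload regardless of whether it ends in a newline.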

#4 Updated by hasso almost 6 years ago

Bingo! Can someone verify that the sed(1) in FreeBSD has the same problem
and report it? It probably does; our sed(1) is almost in sync with the one
in FreeBSD.

#5 Updated by joerg almost 6 years ago

It's not a sed bug.

Joerg

#6 Updated by hasso almost 6 years ago

Can you explain? Simple test:

$ hexdump test.txt
0000000 6f66 0a6f 6162 7372
0000008
$ gsed '1,1d' test.txt | hexdump
0000000 6162 7372
0000004
$ sed '1,1d' test.txt | hexdump
0000000 6162 7372 000a
0000005
$
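
The same check can be scripted so that it reports which behavior the local sed has. This is only a sketch: the temporary file and the variable names are hypothetical, and the input mirrors the 8-byte test.txt above ("foo", a newline, then "bars" with no final newline).

```shell
#!/bin/sh
# Sketch: detect whether the local sed appends a newline to a last line
# that did not end with one in the input.
tmpfile=$(mktemp)

# 8 bytes: "foo\nbars", deliberately without a trailing newline.
printf 'foo\nbars' > "$tmpfile"

# Delete the first line and count the bytes that come out.
out_bytes=$(sed '1d' "$tmpfile" | wc -c | tr -d ' ')

case "$out_bytes" in
    4) verdict="sed left the missing newline alone (gsed-like)" ;;
    5) verdict="sed appended a newline" ;;
    *) verdict="unexpected byte count: $out_bytes" ;;
esac
echo "$verdict"

rm -f "$tmpfile"
```

A count of 4 corresponds to the gsed output shown above and a count of 5 to the output of the sed under discussion.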

#7 Updated by joerg almost 6 years ago

sed is a stream editor; I think it is perfectly fine for it to output the
trailing newline. Nevertheless, a version that behaves like gsed has
its advantages, so merging improvements isn't wrong either.

Joerg

#8 Updated by corecode over 5 years ago

can we close this?

#9 Updated by hasso over 5 years ago

No, I think that we should change that to be compatible with others.

#10 Updated by alexh over 5 years ago

Standard[1] dictates: "Whenever the pattern space is written to standard
output or a named file, sed shall immediately follow it with a <newline>."

In my personal opinion, though, I'd prefer to see a gnu-compatible sed, even
if that means breaking standards compliance.

So what is the decision on this? Stick to the standard or stick to gnu?

[1]: http://www.opengroup.org/onlinepubs/9699919799/utilities/sed.html

#11 Updated by corecode over 5 years ago

Alex Hornung (via DragonFly issue tracker) wrote:
> Alex Hornung added the comment:
>
> Standard[1] dictates: "Whenever the pattern space is written to standard
> output or a named file, sed shall immediately follow it with a <newline>."
>
> In my personal opinion, though, I'd prefer to see a gnu-compatible sed, even
> if that means breaking standards compliance.

The standard is exceptionally clear on this, and I don't quite see why we should introduce a regression for an insignificant script which assumes sed is operating in a non-conforming way. Their shell script is wrong.

cheers
simon

#12 Updated by alexh over 5 years ago

We are sticking to the standard, as are other BSDs, so I'm closing this.
