Bug #1306
closed
sed(1) adds trailing newline
Added by hasso over 15 years ago.
Updated about 15 years ago.
Description
Our gzip(1) has trouble unpacking lokigames/idsoftware etc. archives (a
shell script and a tar.gz in one file), making these packages fail in
pkgsrc. The main trouble is that it's not gzip(1) itself, and probably not
libz either, that causes the problem, but something else. I have exactly
the same problem with GNU gzip from pkgsrc on DragonFly, but GNU gzip
works just fine with these archives on every other platform I have access
to. Our tar doesn't have a problem with these files either (it uses libz
directly, not gzip).
Finding out what exactly causes it is beyond my skills at the moment ...
The testcase:
$ fetch \
ftp://ftp.estpak.ee/pub/FreeBSD/ports/distfiles/linuxq3apoint-1.32b.x86.run
$ sed '1,265d' linuxq3apoint-1.32b.x86.run | gzip -cd > /dev/null
gzip: input not gziped (MAGIC0)
$
$ sed '1,265d' linuxq3apoint-1.32b.x86.run | /usr/pkg/bin/gzip -cd \
> /dev/null
gzip: stdin: unexpected end of file
$
$ sed '1,265d' linuxq3apoint-1.32b.x86.run | /usr/bin/bsdtar zxfO - \
> /dev/null
$
Hi, Hasso,
NetBSD r1.87 of src/usr.bin/gzip/gzip.c (you would probably also want
1.88) gives gzip(1) a better chance of surviving such an archive: it
issues a warning instead of failing. FWIW, FreeBSD's gzip does it this
way as well.
Personally I would be inclined to issue such a warning (so the user
knows that something is wrong with the archive, but can still obtain
the data) rather than match gzip behavior 100%.
What do you think about this case?
Cheers,
--
Xin LI <delphij@delphij.net> http://www.delphij.net/
FreeBSD - The Power to Serve!
Yes, I know. I updated gzip locally to the latest code from NetBSD. It
then gives a warning, but unpacks. But it also returns a nonzero exit
status, which means that "gtar zxf" doesn't work: gtar errors out.
Maybe ...
I wouldn't mind changing the code to either behavior, but ... why does the
very same gzip work without a warning/error on the same archive on
NetBSD? Why does GNU gzip work without a warning/error on the same archive
on every platform but DragonFly?
gzip is complaining about the newline that sed appended.
$ sed '1,265d' linuxq3apoint-1.32b.x86.run >a
$ perl -e 'open(A,"./linuxq3apoint-1.32b.x86.run");$_=join("",<A>);$_=~s/.*?\neval \$finish; exit \$res\n//s;print' >b
$ ls -l a b
-rw-r--r-- 1 qhwt wheel 31472010 Mar 4 09:19 a
-rw-r--r-- 1 qhwt wheel 31472009 Mar 4 09:17 b
$ hd -vs 31472000 a
01e03980 01 b1 e3 8c 0a 00 70 47 02 0a |......pG..|
01e0398a
$ hd -vs 31472000 b
01e03980 01 b1 e3 8c 0a 00 70 47 02 |......pG.|
01e03989
$ gzip -vt a b
gzip: input not gziped (MAGIC0)
a: NOT OK
gzip: a: uncompress failed
b: OK
Bingo! Can someone verify that the sed(1) in FreeBSD has the same problem
and report it? It probably does; our sed(1) is almost in sync with the one
in FreeBSD.
It's not a sed bug.
Joerg
Can you explain? Simple test:
$ hexdump test.txt
0000000 6f66 0a6f 6162 7372
0000008
$ gsed '1,1d' test.txt | hexdump
0000000 6162 7372
0000004
$ sed '1,1d' test.txt | hexdump
0000000 6162 7372 000a
0000005
$
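The same check can be run without a prepared test.txt. The following is only a minimal sketch: last_byte_hex is a hypothetical helper name, not part of any tool mentioned here, and it assumes tail(1), od(1) and tr(1) behave as POSIX describes. It prints the final byte of its input in hex, so a newline added at the end of the output shows up as 0a.

```shell
#!/bin/sh
# Hypothetical helper: print the last byte of stdin as two hex digits
# (prints nothing if the input is empty).
last_byte_hex() {
    tail -c 1 | od -An -tx1 | tr -d ' \n'
}

# The input deliberately has no newline after "bars".
printf 'foo\nbars' | last_byte_hex; echo

# Same input after deleting the first line with whatever sed is on PATH:
# 0a means sed added a trailing newline, 73 ('s') means it did not.
printf 'foo\nbars' | sed '1d' | last_byte_hex; echo
```

Which value the second command prints depends on the sed in use; the transcript above shows a newline being added by sed and not by gsed.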
sed is a stream editor; I think it is perfectly fine for it to output the
trailing newline. Nevertheless, a version that behaves like gsed has
its advantages, so merging improvements isn't wrong either.
Joerg
No, I think that we should change that to be compatible with others.
Standard[1] dictates: "Whenever the pattern space is written to standard
output or a named file, sed shall immediately follow it with a <newline>."
In my personal opinion, though, I'd prefer to see a gnu-compatible sed, even
if that means breaking standards compliance.
So what is the decision on this? Stick to the standard or stick to gnu?
[1]: http://www.opengroup.org/onlinepubs/9699919799/utilities/sed.html
Alex Hornung (via DragonFly issue tracker) wrote:
> Alex Hornung <ahornung@gmail.com> added the comment:
>
> Standard[1] dictates: "Whenever the pattern space is written to standard
> output or a named file, sed shall immediately follow it with a <newline>."
>
> In my personal opinion, though, I'd prefer to see a gnu-compatible sed, even
> if that means breaking standards compliance.
The standard is exceptionally clear on this, and I don't quite see why we should introduce a regression for an insignificant script which assumes sed is operating in a non-conforming way. Their shell script is wrong.
cheers
simon
We are sticking to the standard, as are other BSDs, so I'm closing this.
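For a script that needs the binary payload byte for byte, one option is to skip the header with a tool that copies the remaining bytes unchanged instead of sed. The following is a minimal sketch, not what the installers in question actually do: strip_header and the demo file are hypothetical, and it assumes tail -n +N (start output at line N) as specified by POSIX.

```shell
#!/bin/sh
# Hypothetical helper: emit everything after the first $2 lines of file $1.
# tail copies the remaining bytes as they are and adds nothing at the end.
strip_header() {
    tail -n +"$(( $2 + 1 ))" "$1"
}

# Build a small stand-in for a self-extracting archive: two script lines
# followed by a gzip stream (which does not end in a newline).
demo=$(mktemp)
{ printf '#!/bin/sh\nexit 0\n'; printf 'payload' | gzip -c; } > "$demo"

# Skip the two header lines and decompress the rest.
strip_header "$demo" 2 | gzip -cd; echo
rm -f "$demo"
```

For the test case from the description, the equivalent call would be strip_header linuxq3apoint-1.32b.x86.run 265, matching the '1,265d' range used there.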