Bug #242

Broken pipe error

Added by elekktretterr over 8 years ago. Updated about 8 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
Target version: -
Start date:
Due date:
% Done: 0%
Description

Ok, it seems that I found a bug.

elevator# man ppp
zcat: error writing to output: Broken pipe
zcat: /usr/share/man/cat8/ppp.8.gz: uncompress failed

The man page opens, but when I close it, the output above is left on
the screen.

Running ./locate.updatedb also produces this error sometimes, often
after lots and lots of files on the fs have been added or deleted.

Not a serious bug, but it might be an easy fix. So if anyone can have a
look at it, that would be great.

Cheers,

Petr

Attachment: showsig.c (814 Bytes), added by corecode, 07/15/2006 12:47 PM

History

#1 Updated by corecode over 8 years ago

Petr Janda wrote:
> Ok, it seems that I found a bug.
>
> elevator# man ppp
> zcat: error writing to output: Broken pipe
> zcat: /usr/share/man/cat8/ppp.8.gz: uncompress failed
>
>
> The man page opens but when I close it, that is on the screen.
>
> Also, when I run ./locate.updatedb also makes this error sometimes,
> often when lots and lots of files on the fs are added or deleted.
>
> Not a serious bug but might be an easy fix. So if anyone can have a look
> at it, would be great.

Is your /tmp full, or is your /usr mounted read-only?

cheers
simon

#2 Updated by elekktretterr over 8 years ago

Hey,

No, /tmp is definitely not full and /usr is not mounted read-only.

Petr

#3 Updated by dillon over 8 years ago

:Ok, it seems that I found a bug.
:
:elevator# man ppp
:zcat: error writing to output: Broken pipe
:zcat: /usr/share/man/cat8/ppp.8.gz: uncompress failed
:
:
:The man page opens but when I close it, that is on the screen.
:
:Also, when I run ./locate.updatedb also makes this error sometimes,
:often when lots and lots of files on the fs are added or deleted.
:
:Not a serious bug but might be an easy fix. So if anyone can have a look
:at it, would be great.
:
:Cheers,
:
:Petr

What happens if you bring up the manual page and scroll all the way
to the end of the page before quitting? Do you still get the
Broken Pipe?

It could simply be the zcat program complaining about the pager exiting
before zcat has managed to write out the whole page.

-Matt
Matthew Dillon

#4 Updated by elekktretterr over 8 years ago

Ok, scrolling a long way down the ppp man page makes the problem go
away. It doesn't need to be scrolled all the way to the end.
Interestingly enough, though, this problem doesn't occur at all for
pages such as pppd.

The updatedb problem is also interesting, as it seems to happen only when
there have been a lot of changes in the filesystem (thousands of files
added/deleted).

Petr

#5 Updated by dillon over 8 years ago

:Ok, brining the ppp man page down a lot makes the problem not occur. It
:doesnt need to be brought all the way to the end. Interestingly enough
:though, this problem doesnt occur for pages such as pppd at all.
:
:The updatedb problem is also interesting as it seems to happen only when
:there was a lot of changes in the filesystem(thousands of files
:added/deleted)
:
:Petr

I'll need the actual output from updatedb. Looking at the source
code, it doesn't seem to call zcat directly.

-Matt
Matthew Dillon

#6 Updated by swildner over 8 years ago

Petr Janda wrote:
> Ok, it seems that I found a bug.
>
> elevator# man ppp
> zcat: error writing to output: Broken pipe
> zcat: /usr/share/man/cat8/ppp.8.gz: uncompress failed

Do you get that for all man pages or just ppp(8)?

Sascha

#7 Updated by dillon over 8 years ago

:Petr Janda wrote:
:> Ok, it seems that I found a bug.
:>
:> elevator# man ppp
:> zcat: error writing to output: Broken pipe
:> zcat: /usr/share/man/cat8/ppp.8.gz: uncompress failed
:
:Do you get that for all man pages or just ppp(8)?
:
:Sascha
:
:--
:http://yoyodyne.ath.cx

The error is quite simply due to the fact that the pager program
(more) exited without reading all the input from the pipe. It
only occurs sometimes because it depends on whether the pipe
direct-maps zcat's (gunzip's) write to the pager program's read.

But, of course, the pager program exited because the user quit out
of it before it read all the input from zcat.

I think the solution is to modify the 'man' program to pass the '-q'
option to zcat (gunzip), and to modify gunzip to do a
signal(SIGPIPE, SIG_IGN) when used with the -q option.

Petr, please try rebuilding the 'man' program and the gzip program
using this patch. Please tell me if it stops the errors. Here is
the build sequence after applying the patch:

cd /usr/src/gnu/usr.bin/man
make clean obj all install
cd /usr/src/usr.bin/gzip
make clean obj all install

Patch is included below.

-Matt
Matthew Dillon

Index: gnu/usr.bin/man/Makefile.inc
===================================================================
RCS file: /cvs/src/gnu/usr.bin/man/Makefile.inc,v
retrieving revision 1.2
diff -u -r1.2 Makefile.inc
--- gnu/usr.bin/man/Makefile.inc 17 Jun 2003 04:25:46 -0000 1.2
+++ gnu/usr.bin/man/Makefile.inc 14 Jul 2006 18:14:49 -0000
@@ -22,7 +22,7 @@
refer= /usr/bin/refer
grap= # no grap
pic= /usr/bin/pic
-zcat= /usr/bin/zcat
+zcat= /usr/bin/zcat -q
compress= /usr/bin/gzip -c
compext= .gz

Index: usr.bin/gzip/gzip.c
===================================================================
RCS file: /cvs/src/usr.bin/gzip/gzip.c,v
retrieving revision 1.5
diff -u -r1.5 gzip.c
--- usr.bin/gzip/gzip.c 31 Jan 2005 19:28:57 -0000 1.5
+++ usr.bin/gzip/gzip.c 14 Jul 2006 18:17:11 -0000
@@ -314,6 +314,7 @@
break;
case 'q':
qflag = 1;
+ signal(SIGPIPE, SIG_IGN);
break;
case 'r':
rflag = 1;
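
The effect of the signal(SIGPIPE, SIG_IGN) line in the gzip.c hunk can be seen in isolation with a small sketch. It is not part of the patch: the function name errno_after_write_to_closed_pipe is made up for illustration, and closing the read end of the pipe stands in for the pager exiting early. With SIGPIPE ignored, the writer is not killed; write() fails and errno is set to EPIPE, which is the "Broken pipe" text in the zcat message.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from gzip or man): write one byte into a
 * pipe whose read end is already closed, with SIGPIPE ignored.
 * Returns the errno of the failed write (EPIPE is expected), 0 if the
 * write unexpectedly succeeded, or -1 if the pipe could not be created.
 */
int
errno_after_write_to_closed_pipe(void)
{
	int fds[2];
	int err = 0;
	void (*old)(int);

	if (pipe(fds) == -1)
		return -1;

	old = signal(SIGPIPE, SIG_IGN);	/* same call as in the gzip.c hunk */
	close(fds[0]);			/* the "pager" goes away */

	if (write(fds[1], "x", 1) == -1)
		err = errno;		/* an error is reported, no signal kills us */

	close(fds[1]);
	signal(SIGPIPE, old);		/* restore the caller's disposition */
	return err;
}
```

Calling the function from a test program and comparing the result with EPIPE shows why a program that ignores SIGPIPE has to handle (or silence) the write error itself.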

#8 Updated by corecode over 8 years ago

Petr Janda wrote:
> Ok, it seems that I found a bug.
>
> elevator# man ppp
> zcat: error writing to output: Broken pipe

Why is zcat not killed by SIGPIPE, and why does it receive EPIPE instead? This is the real bug.

cheers
simon
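
For comparison, this is roughly the behaviour Simon expects when SIGPIPE has its default disposition: the writer is terminated by the signal and never gets a chance to print an error. The sketch below is only an illustration; the name signal_that_killed_writer is hypothetical, and a forked child plays the role of zcat.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork a child that uses the default SIGPIPE
 * disposition and writes to a pipe that has no reader. Returns the
 * number of the signal that killed the child, 0 if the child exited
 * normally, or -1 on setup failure.
 */
int
signal_that_killed_writer(void)
{
	int fds[2], status;
	pid_t pid;

	if (pipe(fds) == -1)
		return -1;
	close(fds[0]);		/* no reader exists before the child is created */

	pid = fork();
	if (pid == -1) {
		close(fds[1]);
		return -1;
	}
	if (pid == 0) {
		signal(SIGPIPE, SIG_DFL);	/* make sure the default applies */
		if (write(fds[1], "x", 1) == -1)
			_exit(1);		/* only reached if SIGPIPE did not fire */
		_exit(0);
	}

	close(fds[1]);
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On a system behaving as Simon describes, the return value is SIGPIPE. A writer that instead prints "Broken pipe" was started with the signal ignored or blocked.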

#9 Updated by swildner over 8 years ago

Matthew Dillon wrote:
> :Petr Janda wrote:
> :> Ok, it seems that I found a bug.
> :>
> :> elevator# man ppp
> :> zcat: error writing to output: Broken pipe
> :> zcat: /usr/share/man/cat8/ppp.8.gz: uncompress failed
> :
> :Do you get that for all man pages or just ppp(8)?
> :
> :Sascha
> :
> :--
> :http://yoyodyne.ath.cx
>
> The error is quite simply due to the fact that the pager program
> (more) exited without reading all the input from the pipe. It
> only occurs sometimes because it depends on whether the pipe
> direct-maps zcat's (gunzip's) write to the pager program's read.

How come only Petr is having this problem?

Sascha

#10 Updated by corecode over 8 years ago

Simon 'corecode' Schubert wrote:
> Petr Janda wrote:
>> Ok, it seems that I found a bug.
>>
>> elevator# man ppp
>> zcat: error writing to output: Broken pipe
>
> why is zcat not killed by SIGPIPE but instead receives EPIPE? this is
> the real bug.

which shell are you using?

cheers
simon

#11 Updated by elekktretterr over 8 years ago

Hi Sascha,

ppp(8) is the only one I know about, but Matt's fix seems to have done
it for me.

Petr

#12 Updated by elekktretterr over 8 years ago

Matt,
Thanks, that "seems" to have fixed the problem. I haven't tested
./locate.updatedb with this patch applied yet, but I will give it a
shot as soon as I move lots of files. I might just move the DF source and
pkgsrc directories around a bit and see if it makes a difference.

Cheers,

Petr

#13 Updated by elekktretterr over 8 years ago

Ok, guys. I moved the directories around as I said, ran
./locate.updatedb, and got an error. However, in this case it's not
zcat but sort. Here it is:

elevator# ./locate.updatedb
sort: -: write error: Broken pipe

#14 Updated by elekktretterr over 8 years ago

I keep forgetting to say this, for anyone who asked what shell I'm using:
for root I use the default csh, and for my non-root user I use tcsh. I run
fluxbox and Eterm.

#15 Updated by dillon over 8 years ago

:Matt,
:Thanks, that "seems" to have fixed the problem. I havent tested
:./locate.updatedb yet with this patch applied now, but I will give it a
:shot as soon as i move lots of files. Might just move DF source and
:pkgsrc directories around a bit and see if it makes a difference.
:
:Cheers,
:
:Petr

I'm thinking the '-q' part of the patch might be sufficient, so I
will commit that bit. If Simon is right, the signal() part of
the patch ought to have no effect.

-Matt

#16 Updated by dillon over 8 years ago

:
:Ok, guys. I moved the directories around as I said and ran
:./locate.updatedb and have got an error. However in this case its not
:zcat but sort. Here it is:
:
:elevator# ./locate.updatedb
:sort: -: write error: Broken pipe

Well, this is odd. There are a few places where a pipe can be broken
early, all in /usr/src/usr.bin/locate/locate/mklocatedb.sh:

locate -d $filelist / | $bigram | $sort -nr | head -128 | ...

and

$bigram < $filelist | $sort -nr |
awk '{if (/^[ ]*[0-9]+[ ]+..$/) {printf("%s",$2)} else {exit 1}}' > $bigrams || exit 1

But we have the same issue that Simon brought up for the 'man' problem...
the broken pipe should terminate the sort with a signal instead of
reporting an error.

Petr, do me a favor and try this sequence. I want to ktrace the
'man' program bug you reported earlier.

* Remove the patches I had you apply earlier. If you have updated
to the most recent HEAD then I committed one of them and you will
have to edit /usr/src/gnu/usr.bin/man/Makefile.inc to remove the
'-q' I added.

* Recompile as per before.

* su to root (otherwise ktrace won't be able to trace through the
suid man program).

* Make sure you can reproduce the problem with 'man ppp'. Hopefully
you can.

* Then run:

ktrace -t cns -i man ppp

* Then run 'kdump' and put the output file somewhere where we can
retrieve it.

-Matt
Matthew Dillon

#17 Updated by elekktretterr over 8 years ago

Hi Matt,
Thanks. The ppp(8) error was certainly reproducible after I reverted the
patches. You can get the dump here:

http://220.233.111.100/system/manppp.dump.txt

Cheers,
Petr

#18 Updated by elekktretterr over 8 years ago

I just discovered that this zcat broken pipe error occurs only within X.
If I use the console it doesn't produce this error (so it seems), but
xterm, Eterm, etc. do.

#19 Updated by corecode over 8 years ago

Petr Janda wrote:
> Keep forgetting to say this, to anyone who asked what shell im using.
> For root I use the default csh, for my non-root I use tcsh. I run
> fluxbox and Eterm.

I am almost sure that the bug is somewhere in there. Could you compile and run the attached file and mail the output to the list?

How do you start xorg? via startx or some *dm?

cheers
simon
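
The attached showsig.c is not reproduced in this thread, so its contents are unknown. As a guess at the kind of check such a diagnostic could perform, the sketch below queries the current SIGPIPE disposition with sigaction() without changing it. The function name sigpipe_is_ignored is hypothetical and is not taken from the attachment.

```c
#define _POSIX_C_SOURCE 200809L	/* for sigaction() under strict ISO C modes */
#include <signal.h>
#include <stddef.h>

/*
 * Hypothetical diagnostic, not the attached showsig.c: report whether
 * the calling process currently ignores SIGPIPE. Passing NULL as the
 * new action only reads the disposition. Returns 1 if SIGPIPE is
 * ignored, 0 if it is not, or -1 on error.
 */
int
sigpipe_is_ignored(void)
{
	struct sigaction sa;

	if (sigaction(SIGPIPE, NULL, &sa) == -1)
		return -1;
	return sa.sa_handler == SIG_IGN;
}
```

Run from a shell inside the suspect terminal, a program built around this check would reveal whether the shell (and therefore man, zcat and sort) inherited an ignored SIGPIPE.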

#20 Updated by corecode over 8 years ago

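Petr Janda wrote:
> For root I use the default csh, for my non-root I use tcsh. I run
> fluxbox and Eterm.
              ^^^^^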

that's the bug:

Eterm/src/command.c:install_handlers():
signal(SIGPIPE, SIG_IGN);

(called from main)

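Eterm simply ignores SIGPIPE and does not restore the default before starting the shell. An ignored signal stays ignored across fork and exec, so the shell and everything it runs inherit it. zsh resets this to SIG_DFL, but sh/tcsh don't. I think that's a bug in Eterm.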

cheers
simon
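
The fix Simon has in mind for a terminal emulator can be sketched as follows. This is not Eterm's code: run_with_default_sigpipe is a made-up name, it resets only SIGPIPE (a real fix would probably cover every signal the terminal ignores), and it waits for the child only so that the result is easy to inspect.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical sketch, not taken from Eterm: start argv[0] in a child
 * process after restoring the default SIGPIPE disposition, so that an
 * ignored SIGPIPE in the parent is not inherited across exec.
 * Returns the child's wait status, or -1 on failure.
 */
int
run_with_default_sigpipe(char *const argv[])
{
	int status;
	pid_t pid = fork();

	if (pid == -1)
		return -1;
	if (pid == 0) {
		signal(SIGPIPE, SIG_DFL);	/* undo the parent's SIG_IGN */
		execv(argv[0], argv);
		_exit(127);			/* exec failed */
	}
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return status;
}
```

Even when the caller itself ignores SIGPIPE, a child started this way is terminated by the signal as usual, so writers such as zcat or sort would die quietly instead of reporting EPIPE.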

#21 Updated by dillon over 8 years ago

:that's the bug:
:
:Eterm/src/command.c:install_handlers():
: signal(SIGPIPE, SIG_IGN);
:
:(called from main)
:
:eterm is simply ignoring SIGPIPE and not changing this for the running shell.
:zsh resets this to SIG_DFL, but sh/tcsh don't. I think that's a bug in Eterm.
:
:cheers
: simon

Yup. And the kdump shows no signal being generated.

-Matt

#22 Updated by elekktretterr over 8 years ago

I get the same error in xterm too, though. Are you sure?

I start xorg via startx, but I will compile and run the file once I wake up!
I'm dead.

Petr

#23 Updated by corecode over 8 years ago

Petr Janda wrote:
> I get the same error in xterm too though. Are you sure?

I suspect you started the xterm from an Eterm...

> I start xorg via startx, but will run the compile/file once i wake up!

I don't think it's necessary; I already found the bug in Eterm's code.

cheers
simon

#24 Updated by elekktretterr over 8 years ago

Yep, correct. It's an Eterm bug; I just tried running an xterm that was
not started from Eterm. Any idea why it happened only on the ppp(8) page,
though? What can I do about it now? Perhaps Joerg could patch it in the
pkgsrc tree?

Petr

#25 Updated by corecode over 8 years ago

Petr Janda wrote:
> Yep, correct. Its an Eterm bug, I just tried running xterm not from
> Eterm. Any ideas on why it happened only on the ppp(8) page as well?
> What can I do about it now? Perhaps Joerg could patch it in the pkgsrc
> tree?

Because the ppp man page is so long that it doesn't fit in the default buffer, so zcat is still writing when the pager exits.

Workarounds:
1. Ignore it.
2. Recompile man; Matt changed it so that zcat doesn't report warnings.
3. Use a different terminal (isn't Eterm ugly as hell?).
4. Ask the Eterm authors why they do this, and tell them they should reset all signals to SIG_DFL when starting the shell.
5. Use zsh (a good idea anyway); it seems to reset the signals to default.

cheers
simon
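
Assuming the buffer Simon refers to is the kernel's pipe buffer between zcat and the pager, its size on a given system can be estimated with a sketch like the one below. The name pipe_capacity is hypothetical; the function fills a pipe in non-blocking mode until the kernel refuses more data, so the result is a lower bound in 512-byte steps rather than an exact figure.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: estimate how many bytes a pipe accepts before a
 * writer would block. The write end is made non-blocking and filled in
 * 512-byte chunks until write() fails (normally with EAGAIN).
 * Returns the number of bytes accepted, or -1 on setup failure.
 */
long
pipe_capacity(void)
{
	int fds[2], flags;
	char chunk[512] = { 0 };
	long total = 0;
	ssize_t n;

	if (pipe(fds) == -1)
		return -1;

	flags = fcntl(fds[1], F_GETFL);
	if (flags == -1 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) == -1) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}

	/* The read end stays open, so a full pipe yields an error, not SIGPIPE. */
	while ((n = write(fds[1], chunk, sizeof(chunk))) > 0)
		total += n;

	close(fds[0]);
	close(fds[1]);
	return total;
}
```

A formatted page that is smaller than this figure, plus whatever the pager has already read, can be written out completely before the user quits. A longer page such as ppp(8) leaves zcat still writing when the pager exits.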
