Haven't had a chance to really dig yet, but had ngIRCd sig 11 and core seemingly at random this afternoon.
Syslog:
Feb 10 02:38:23 cyrix-dlc ngircd[27448]: User "Java873!~Java873@210.105.36.60" unregistered (connection 4): Got QUIT command.
Feb 10 12:50:47 cyrix-dlc ngircd[27448]: Got ERROR from "delft.nl.eu.x386.net": Ping timeout!
Feb 10 12:50:47 cyrix-dlc ngircd[27448]: Server "delft.nl.eu.x386.net" unregistered (connection 0): Socket closed!
Feb 10 12:52:07 cyrix-dlc ngircd[27448]: Can't connect socket to "130.161.219.160:6667" (connection 0): Operation timed out!
Feb 10 12:53:22 cyrix-dlc ngircd[27448]: Can't connect socket to "130.161.219.160:6667" (connection 0): Operation timed out!
Feb 10 12:54:37 cyrix-dlc ngircd[27448]: Can't connect socket to "130.161.219.160:6667" (connection 0): Operation timed out!
Feb 10 13:21:05 cyrix-dlc ngircd[27448]: Write error on connection 0 (socket 5): Broken pipe!
Feb 10 13:46:10 cyrix-dlc /kernel: pid 27448 (ngircd), uid 0: exited on signal 11 (core dumped)
I have the 5.5MB gzip'd core available if there is any interest in looking at it, otherwise I'm open to suggestions on how to proceed with debugging.
Joshua Coombs
Hi Joshua
Sorry for answering so late ...
Haven't had a chance to really dig yet, but had ngIRCd sig 11 and core seemingly at random this afternoon.
Oops :-/
Feb 10 13:21:05 cyrix-dlc ngircd[27448]: Write error on connection 0 (socket 5): Broken pipe!
This message is generated in Handle_Write(), conn.c, line 761 whenever the daemon can't send data to a socket.
Feb 10 13:46:10 cyrix-dlc /kernel: pid 27448 (ngircd), uid 0: exited on signal 11 (core dumped)
Probably the server can't handle the following Conn_Close() correctly ... hm ...
I have the 5.5MB gzip'd core available if there is any interest in looking at it, otherwise I'm open to suggestions on how to proceed with debugging.
It would be very helpful if you could create a backtrace with gdb ("gdb <ngircd binary> <corefile>", then command "bt").
Regards Alex
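The "Broken pipe" text in the log line quoted above is the standard message for the EPIPE error, which a write returns once the peer has closed its end. The following is only a minimal sketch of that condition on a POSIX system, not code from ngIRCd's conn.c; the function name write_to_closed_peer() and the use of socketpair() to fake a vanished peer are assumptions made for illustration.

```c
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not part of ngIRCd): writes to a socket whose peer
 * has already closed and returns the resulting errno value.
 * Returns 0 if the write unexpectedly succeeded, -1 if setup failed. */
int write_to_closed_peer(void)
{
    int sv[2];
    int err = 0;

    /* Without this, the failing write would raise SIGPIPE and
     * terminate the process instead of returning an error. */
    signal(SIGPIPE, SIG_IGN);

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    close(sv[1]);                        /* the "remote server" goes away */

    if (write(sv[0], "PING\r\n", 6) < 0)
        err = errno;                     /* expected: EPIPE ("Broken pipe") */

    close(sv[0]);
    return err;
}
```

A daemon that sees this error on a link can no longer flush anything to that socket, so any data still queued for it has to be discarded rather than retried.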
Not a problem, here's the end of the backtrace:
#629081 0x8050edc in Try_Write ()
#629082 0x804ffeb in Conn_Close ()
#629083 0x80512da in Handle_Write ()
#629084 0x8050edc in Try_Write ()
#629085 0x804ffeb in Conn_Close ()
#629086 0x80512da in Handle_Write ()
#629087 0x8050edc in Try_Write ()
#629088 0x804ffeb in Conn_Close ()
#629089 0x80512da in Handle_Write ()
#629090 0x8050edc in Try_Write ()
#629091 0x804ffeb in Conn_Close ()
#629092 0x80512da in Handle_Write ()
#629093 0x8050edc in Try_Write ()
#629094 0x804ffeb in Conn_Close ()
#629095 0x80512da in Handle_Write ()
#629096 0x8050edc in Try_Write ()
#629097 0x804ffeb in Conn_Close ()
#629098 0x80512da in Handle_Write ()
#629099 0x804fb88 in Conn_Handler ()
#629100 0x8049e05 in main ()
#629101 0x80496b9 in _start ()
The rest of the trace is the Conn_Close, Handle_Write, Try_Write loop. I've got the entire backtrace here if needed. Having never done this before, doesn't look like much useful info there. Would this be a good time to switch to a debug build of ngircd and see if it tanks again?
Joshua Coombs
ngIRCd-ML mailing list ngIRCd-ML@arthur.ath.cx http://arthur.ath.cx/mailman/listinfo/ngircd-ml
Hi Joshua!
Not a problem, here's the end of the backtrace:
Thanks!
[ ... ]
#629093 0x8050edc in Try_Write ()
#629094 0x804ffeb in Conn_Close ()
#629095 0x80512da in Handle_Write ()
#629096 0x8050edc in Try_Write ()
#629097 0x804ffeb in Conn_Close ()
#629098 0x80512da in Handle_Write ()
#629099 0x804fb88 in Conn_Handler ()
#629100 0x8049e05 in main ()
#629101 0x80496b9 in _start ()
Hmpf. The daemon is in an endless loop here: Conn_Handler() wanted to empty some send buffer and called Handle_Write(), which failed and tried to close the socket: Conn_Close(). This call again detected that there was some data in the write buffer and tried to empty it (via Try_Write() and Handle_Write()) before closing it down ... -> Loop.
doesn't look like much useful info there.
Oh, it is useful!
I think I can fix this quite well. Expect a patch soon.
Would this be a good time to switch to a debug build of ngircd and see if it tanks again?
I don't think that this is necessary ...
Regards Alex
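The loop described above is unbounded mutual recursion: each failed write calls the close routine, which tries another write, until the stack is exhausted and the process dies with signal 11. The following is a minimal sketch of one common way to break such a cycle, a "closing" guard flag. It is not the actual ngIRCd patch; the conn_t struct, the lower-case function names, and simulate_broken_pipe() are hypothetical, and the Try_Write() step from the backtrace is folded into conn_close() for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified connection record; not ngIRCd's real struct. */
typedef struct {
    int    sock;            /* socket descriptor, -1 once closed        */
    size_t pending;         /* bytes still waiting in the write buffer  */
    bool   closing;         /* guard: true while a close is in progress */
    bool   peer_gone;       /* simulation switch: every write fails     */
    int    write_attempts;  /* instrumentation for the demo             */
} conn_t;

static void conn_close(conn_t *c);

/* Try to flush the write buffer; on failure, close the connection. */
static bool handle_write(conn_t *c)
{
    c->write_attempts++;
    if (c->peer_gone) {
        conn_close(c);      /* re-entry point of the original loop */
        return false;
    }
    c->pending = 0;
    return true;
}

/* Close the connection, attempting one last flush of queued data. */
static void conn_close(conn_t *c)
{
    if (c->sock < 0 || c->closing)
        return;             /* already closed or closing: stop the recursion */

    c->closing = true;
    if (c->pending > 0)
        handle_write(c);    /* a failure here can no longer recurse */

    c->sock = -1;
    c->pending = 0;         /* discard whatever could not be sent */
    c->closing = false;
}

/* Simulates the crash scenario and returns the number of write attempts. */
int simulate_broken_pipe(void)
{
    conn_t c = { .sock = 5, .pending = 10, .closing = false,
                 .peer_gone = true, .write_attempts = 0 };

    handle_write(&c);
    assert(c.sock == -1 && c.pending == 0);
    return c.write_attempts;
}
```

With the guard in place, the simulated broken connection is written to a bounded number of times and then closed, instead of recursing hundreds of thousands of frames deep as in the backtrace.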
Alexander Barton writes:
doesn't look like much useful info there.
Oh, it is useful!
I think I can fix this quite well. Expect a patch soon.
Guess I was expecting it to spit out:
Line 64225 - Ooops, did I do that? Comment this line to fix
Cool, I'll rebuild once the patch is ready and throw it into the fire to see what happens.
Joshua Coombs
Hi Everyone,
had the same problem here. We are using ngircd as backend for a web chat. I first ran into trouble with 0.6.0. Release 0.6.1 kept blowing up when there was some traffic. For now, we'll stick with 0.5.4 which does run fine. So, we're looking forward to the patch too!
Btw, to everyone developing ngircd, we really enjoy using it. We like the clean configuration and ease of use. Keep up the great work!
Best Regards, Martin Lorentz
On Saturday, 15.02.03, at 08:31 (Europe/Berlin), Martin Lorentz wrote:
I first ran into trouble with 0.6.0. Release 0.6.1 kept blowing up when there was some traffic. For now, we'll stick with 0.5.4 which does run fine. So, we're looking forward to the patch too!
Hm ... did you get some core dumps? If so, could you try to generate a backtrace using gdb? Are there some /tmp/ngircd-*.err's left after the daemon crashed with some information in them?
Btw, to everyone developing ngircd, we really enjoy using it. We like the clean configuration and ease of use. Keep up the great work!
Thank you :-)
Regards Alex
Hi folks!
I think I can fix this quite well. Expect a patch soon.
Okay, the patch had to wait for some days, I've been ill since last Tuesday :-(
The attached patch applies to CVS-HEAD, I committed it to HEAD too. Therefore it will be included in the next "nightly" tarball (20030222).
Let me know if it addresses your crashes!
Regards Alex
PS.: The patch _should_ apply to 0.6.1, too -- but you have to patch it "semi-manually": the patch utility succeeds with conn.h but fails on conn.c ...
I wrote:
PS.: The patch _should_ apply to 0.6.1, too -- but you have to patch it "semi-manually": the patch utility succeeds with conn.h but fails on conn.c ...
I have now applied the patch to CVS branch-0-6-x as well.
Has anybody had the chance to test it "in the real world"?
Regards Alex
So far I'm running clean.
Joshua Coombs www.x386.net