
 

Mysterious lighttpd crashes (emergency exit errors)

 

StinkFist

Well, there have been three strange lighttpd crashes over the last couple of weeks. Here's the snippet from the logs this last time:


2005-10-15 13:51:42: (mod_fastcgi.c.2674) emergency exit: fastcgi: connection-fd: 81 fcgi-fd: 83
2005-10-15 13:51:42: (mod_fastcgi.c.1466) connect failed: 83 Connection refused 111 0 /tmp/lighttpd-rails-fcgi.socket-13
2005-10-15 13:51:42: (mod_fastcgi.c.2317) child signaled: 6
2005-10-15 13:51:49: (mod_fastcgi.c.2674) emergency exit: fastcgi: connection-fd: 84 fcgi-fd: 86
2005-10-15 13:52:15: (mod_fastcgi.c.1466) connect failed: 86 Connection refused 111 0 /tmp/lighttpd-rails-fcgi.socket-1
2005-10-15 13:52:15: (mod_fastcgi.c.2317) child signaled: 6


and later:

2005-10-15 13:58:06: (mod_fastcgi.c.3230) killed: socket: /tmp/lighttpd-rails-fcgi.socket-14 pid 12881
2005-10-15 13:59:48: (connections.c.1253) error-handler not found: /dispatch.fcgi
2005-10-15 14:00:13: (mod_fastcgi.c.3230) killed: socket: /tmp/lighttpd-rails-fcgi.socket-14 pid 5437
2005-10-15 14:00:22: (connections.c.1253) error-handler not found: /dispatch.fcgi
2005-10-15 14:00:44: (connections.c.1253) error-handler not found: /dispatch.fcgi
2005-10-15 14:01:45: (mod_fastcgi.c.3230) killed: socket: /tmp/lighttpd-rails-fcgi.socket-15 pid 13344


followed by a ton of:

2005-10-15 14:01:46: (mod_fastcgi.c.3254) pid 22688 4 not found: No child processes
2005-10-15 14:01:46: (mod_fastcgi.c.3254) pid 19789 4 not found: No child processes
2005-10-15 14:01:46: (mod_fastcgi.c.3254) pid 8086 4 not found: No child processes
...


These were the rest of the log until the server restarted.
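With a log this noisy, a quick tally of which source locations are firing can help separate the first failure from the flood that follows. A rough Ruby sketch, assuming the `timestamp: (file.c.line) message` layout seen in the snippets above; `LOG_LINE` and `tally_log_sources` are made-up names for illustration, not anything lighttpd ships:

```ruby
# Matches lines such as:
#   2005-10-15 13:51:42: (mod_fastcgi.c.2674) emergency exit: ...
# and captures the "file.c.line" part between the parentheses.
LOG_LINE = /\A\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}: \(([^)]+)\) /

# Counts how many times each source location appears.
# Lines that do not match the assumed layout are skipped.
def tally_log_sources(lines)
  counts = Hash.new(0)
  lines.each do |line|
    match = LOG_LINE.match(line.strip)
    counts[match[1]] += 1 if match
  end
  counts
end
```

Something like `tally_log_sources(File.foreach("error.log"))` would run it over a whole file; the path there is a placeholder.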

I'm working on the wild guess that it has something to do with open file descriptors but I'm not holding my breath. If anyone has heard anything or is better at googling than I am I would love some input.
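One way to test the file descriptor guess is to watch how many descriptors a process actually holds and compare that with its limit. A minimal sketch, assuming a Linux box where `/proc/<pid>/fd` is readable; `open_fd_count` and `fd_limits` are hypothetical helper names:

```ruby
# Number of file descriptors a process currently holds open, read from
# /proc/<pid>/fd. Linux-specific: returns nil when that directory is
# missing or unreadable (other platforms, dead pid, no permission).
def open_fd_count(pid)
  Dir.children("/proc/#{pid}/fd").size
rescue SystemCallError
  nil
end

# Soft and hard limits on open files for the *current* process.
def fd_limits
  soft, hard = Process.getrlimit(Process::RLIMIT_NOFILE)
  { soft: soft, hard: hard }
end

count  = open_fd_count(Process.pid)
limits = fd_limits
puts "open fds: #{count.inspect}, soft limit: #{limits[:soft]}, hard limit: #{limits[:hard]}"
```

Note that `fd_limits` reports on the Ruby process running the script, not on lighttpd. To watch the server itself, pass the lighttpd pid to `open_fd_count` and see whether the number keeps climbing towards the limit before a crash.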

 

OvineWorrier

Which version of Lighttpd are you running? Are these errors occurring during high traffic or are they random?

Have you played with the following in your config or are they at their defaults?

server.max-keep-alive-requests
server.max-keep-alive-idle

Try lower numbers like 16 or 4 for each setting, or even set max-keep-alive-requests to 0 (might be worth a try, though only as a last resort).

How about Max connections?

server.max-fds <~~ what's that set to? default is 1024

Have you been tempted to geek out and try SCGI? smile

Bleat for me, baby...
 

StinkFist

I've actually just changed the max-keep-alive-requests to 4 based on some of the lighttpd docco but I've done it pretty much blindly smile

I haven't changed the max open file descriptors either, but 1024 open file descriptors is a bit much.


I just did a search on SCGI and it looks pretty interesting. If it's more stable than FastCGI I'll be keen to give it a go.

 

OvineWorrier

1024 is the default unless you specify otherwise.

I've seen suggestions to double that for high traffic sites - then again, the issues you're getting may not be down to traffic. (just suggesting stuff off the top of my head) So, maybe leave that as it is for the moment.

Wanna post your Lighttpd config?

I've only recently started looking into SCGI based on some ruby blogs but haven't got as far as trying it out in a production environment - still playing with it in VMWare/FC4. Am liking what I've seen so far. smile



Bleat for me, baby...
 

StinkFist

yeah, I guess it just seems weird to require 2048 open file descriptors smile

I really don't think we're getting that much traffic, but then again with oddball google links and such we could be getting hit.

 

OvineWorrier

How are the gnomes & unicorns holding up? (was the downtime earlier this week related to the Lighttpd issues you've been having?)

Btw, you had a play with this yet:

http://www.zedshaw.com/projects/scgi_rails/rdoc/index.html

Bleat for me, baby...
 

StinkFist

well, yeehaw. another gnome fell asleep on the job.

Looks like it happened again. I finally have a little bit of time today so I'll be playing sysadmin and see what I can find out. Thanks for the scgi links I think that's a really good place to start smile

 

tenPlus

there's always my offer of a box of matches

 

StinkFist

*burns everything to the ground*


big grin

 

tenPlus

so how'd they treat you up there in the boondocks?
The ppl up there are generally very friendly and easy going, it's a diff lifestyle, a whole diff country. Once you get to know the locals and find out more about the area you'll really appreciate that there really are many pieces of utopia in oz.
No pix huh?
Anyway, I hope it was a good experience and that you were able to see more than just a 'puter monitor while you were there.

 

StinkFist

I didn't grab any pics because Dana's got both of the cameras in Hawaii big grin

I had an amazing time, I flew into weipa and spent a couple of days between there and napanum. Anyway it was awesome, the people were super nice and the weather wasn't too hot at all smile

 

tenPlus

who was the work for, cisco, cydn, or someone else?

 

StinkFist

It's actually a project that Abuzz has been involved with in conjunction with the University of Queensland. We have kiosks set up in aboriginal communities that provide information on health and act as community notice boards.

Because there were issues with the extremely limited bandwidth in the indigenous communities, we wrote a lightweight interface for limited content creation that complements the more heavyweight content management system they use in Cairns, where the project is based. That way people in the community can be more involved in the content that is on their local kiosks.

The place where we field tested it was a cydn centre though smile

I tell ya, I could hang with a bit more time up there. It was pretty cool.

 

tenPlus

it's a more laid back lifestyle, but they play hard when they play. It's great to hear that you enjoyed it though as it really is an amazing part of the country with some genuine characters and unforgettable scenery.

 

Arsis

conifersconifersconifersconifersconifersconifersconifersconifersconifersconifersconifersconifersconifersconifersconifersconifers

 

tenPlus

update: you sure these mysterious crashes aren't because you're going the microsoft way Stinky? .. I hear that Paul Allen was in Cairns as well, that yacht of his sticks out like dogs nuts in a place like that.

 

StinkFist

Originally posted by: OvineWorrier
How are the gnomes & unicorns holding up? (was the downtime earlier this week related to the Lighttpd issues you've been having?)

Btw, you had a play with this yet:

http://www.zedshaw.com/projects/scgi_rails/rdoc/index.html


I did the SCGI thing and it appears to be hanging in just fine at the mo smile

We'll see how it holds up. I had a bit of futzing with installation because I forgot about the default rails .htaccess file, and things were being routed to dispatch.cgi rather than dispatch.scgi. Sneaky sneaky. Anyway, I just removed my .htaccess file and everything worked great.

 

OvineWorrier

*notes the time and optimism of stinky's post*

big grin

From the other thread, I take it you're now running the latest versions of all the components on top of Apache 1.3.3? Feels a bit sluggish in serving pages but you'll no doubt be able to fine tune the gnomes once the stability's sorted.

Is it worth having curl or something like daedilus to check the status of the server and reboot if it dies or is that introducing more problems at the moment?
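A rough idea of what such a watchdog could look like in Ruby, as a sketch only: `site_up?` and `Watchdog` are made-up names, the threshold of three consecutive failures is an arbitrary assumption, and the restart action is left as a callable so nothing here guesses at how the server is actually started.

```ruby
require 'net/http'
require 'uri'

# True when the URL answers a HEAD request with a 2xx or 3xx status
# within the timeout; any network or parse error counts as "down".
def site_up?(url, timeout: 5)
  uri = URI.parse(url)
  response = Net::HTTP.start(uri.host, uri.port,
                             use_ssl: uri.scheme == 'https',
                             open_timeout: timeout,
                             read_timeout: timeout) do |http|
    http.head(uri.request_uri)
  end
  response.is_a?(Net::HTTPSuccess) || response.is_a?(Net::HTTPRedirection)
rescue StandardError
  false
end

# Calls the restart action only after `threshold` consecutive failed
# checks, so a single slow response does not bounce the server.
class Watchdog
  attr_reader :failures

  def initialize(check:, restart:, threshold: 3)
    @check = check
    @restart = restart
    @threshold = threshold
    @failures = 0
  end

  # Runs one check; returns true if a restart was triggered.
  def tick
    if @check.call
      @failures = 0
      return false
    end
    @failures += 1
    return false if @failures < @threshold
    @restart.call
    @failures = 0
    true
  end
end
```

Run from cron or a loop with a sleep, `check` would be something like `-> { site_up?("http://localhost/") }` and `restart` whatever command brings the server back. Requiring several failures in a row is the design choice that keeps the watchdog from adding instability of its own.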

Btw, nice work keeping the site running. Nothing worse than having a day ruined with problem-solving.

smile

Bleat for me, baby...
 

StinkFist

Everything seems to be working, it's made it through the night smile

The problem was that Apache was dropping the connection every now and then and rails assumes it's writing to stdout (which doesn't just drop a connection). So I was getting connection reset errors that were killing rails.

Anyway I just moved the begin block in the CgiResponse.out method up a little to swallow any exceptions coming from the socket write and it seems to have done the trick.

from cgi_process.rb


class CgiResponse < AbstractResponse #:nodoc:
  def initialize(cgi)
    @cgi = cgi
    super()
  end

  def out(output = $stdout)
    convert_content_type!(@headers)
    output.binmode if output.respond_to?(:binmode)
    output.sync = false if output.respond_to?(:sync=)

    begin
      output.write(@cgi.header(@headers))

      if @cgi.send(:env_table)['REQUEST_METHOD'] == 'HEAD'
        return
      elsif @body.respond_to?(:call)
        @body.call(self, output)
      else
        output.write(@body)
      end

      output.flush if output.respond_to?(:flush)
    rescue Object => e
      # lost connection to the SCGI process -- ignore the output, then
    end
  end

  private
    def convert_content_type!(headers)
      %w( Content-Type Content-type content-type ).each do |ct|
        if headers[ct]
          headers["type"] = headers[ct]
          headers.delete(ct)
        end
      end
    end
end
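
The same idea can be tried outside Rails. A minimal sketch, assuming the failures surface as `Errno::EPIPE`, `Errno::ECONNRESET` or `IOError` (the post only says "connection reset errors"); `safe_write` is a hypothetical helper, not part of Rails or scgi_rails. Unlike `rescue Object`, it swallows only connection-loss errors and lets unrelated bugs propagate:

```ruby
require 'stringio'

# Writes data to an IO-like object and reports whether it went through.
# Only errors that indicate a lost or closed connection are swallowed;
# anything else (NoMethodError, etc.) is raised as usual.
def safe_write(output, data)
  output.write(data)
  output.flush if output.respond_to?(:flush)
  true
rescue Errno::EPIPE, Errno::ECONNRESET, IOError
  false
end

out = StringIO.new
safe_write(out, "hello")  # the write succeeds, so this returns true
out.close_write
safe_write(out, "again")  # IOError from the closed stream is swallowed
```

Whether narrowing the rescue is worth it depends on what else can go wrong during a response; catching everything is the blunter option that hides real errors along with the dropped connections.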

 

OvineWorrier

Nice one.

So.... when are you installing trac?

*runs away fast*

Bleat for me, baby...
 