Version française
Home     About     Download     Resources     Contact us    
Browse thread
CamlGI question
[ Home ] [ Index: by date | by threads ]
[ Search: ]

[ Message by date: previous | next ] [ Message in thread: previous | next ] [ Thread: previous | next ]
Date: -- (:)
From: Eric Stokes <eric.stokes@c...>
Subject: Re: [Caml-list] CamlGI question [doh]
I have several multi threaded applications written with Netcgi_fastcgi, 
and I can say that I have seen problems like this, but not to this 
degree. However I've only tested on apache 1.3.x and 2.0.x with 
mod_fastcgi 2.4.2. That said, printing stuff to stdout should not cause 
a problem, because stdout is a listening socket, the data would 
essentially go into a buffer which will never be read. I think the 
problem may actually be single_write vs. write. When I originally wrote 
the fastcgi code for ocamlnet I was new to Ocaml, and I did not realize 
that write was different from the libc equivalent. Under load, or in 
some situations, using Unix.write will definitely cause a protocol 
error and could explain some strange intermittent behavior that I have 
seen. Therefore I will commit a patch the fastcgi library to use 
single_write instead of write and we'll see what happens. I'll also try 
to determine the effect of an error, and get back to you.

Mike, thanks for sticking with us and providing data, we appreciate it 
a lot.

On Apr 19, 2005, at 11:06 AM, Gerd Stolpmann wrote:

> Am Dienstag, den 19.04.2005, 11:28 -0400 schrieb Mike Hamburg:
>> I'll try to help work out this bug in either Apache/FastCGI or Netcgi,
>> and if that doesn't work, I'll go to AJP.  I compiled 
>> mod_fastcgi-2.4.0
>> on Apache 2.0.53.  It successfully serves FastCGI pages written with
>> CamlGI, except that it sometimes pauses after 8k of data, and 
>> sometimes
>> breaks the pipe to the CGI.
>
> Just to make sure Netcgi works also for multi-threaded programs, I made
> a test for myself. I added a second thread to the example 
> add_fastcgi.ml
> (of Ocamlnet), such that it reads now
>
> let a = ref ["Z"; "X"; "A"; "P"; "B" ];;
>
> Thread.create
>   (fun () ->
>      while true do
>        a := List.sort compare !a
>      done
>   )
>   ()
> ;;
>
> (* start the fastcgi server *)
> serv process buffered_transactional_optype;;
>
> at the end. So there is now a thread sorting this list over and over
> again.
>
> I tried it with Apache 2.0.50 and mod_fastcgi 2.4.2 (those versions
> coming with Ubuntu Warty). The first test failed, I had to comment out
> the FastCgiWrapper which is part of the default configuration. After
> that, the program just worked.
>
> I also looked at the sources. There might be problems because the 
> O'Caml
> thread implementation uses signals to wake up threads in some
> situations. This may cause that the Unix.read/write (one should better
> use Unix.single_write, btw.) fail with EINTR. This is not caught.
>
>> Under Netcgi 1.0, the CGI version works but the FastCGI version does 
>> not
>> work at all.  The difference between the CGI and FastCGI versions is 
>> only:
>>
>> let () = Netcgi_fcgi.serv main (`Direct "<br><hr><br>");
>> (* let cgi = new Netcgi.std_activation ()
>> let () =  main cgi *)
>>
>> The application reads files from the outside world, and may launch
>> processes with fork+execv to thumbnail images.  It is multithreaded,
>> with at least 4 threads at all times: the main thread, a thumbnail 
>> cache
>> manager, a thread which handles signals (this cannot be done in a 
>> signal
>> handler, as it may block, so I wake up the handler thread instead), 
>> and
>> a thread which rereads the configuration files periodically, so that 
>> the
>> user does not have to send HUP.
>>
>> The application usually raises Unix.Unix_error (EPIPE, "write", "") on
>> the line Netcgi_fcgi.serv main (`Direct "<br><hr><br>") or rather,
>> inside Netcgi_fcgi.serv at some point; I believe this occurs before 
>> main
>> is called.
>
> You can figure this out exactly when you byte-compile the program with
> -g, and set OCAMLRUNPARAM=b=1, e.g. using a wrapper script and
> redirecting stderr to a file.
>
> The EPIPE error is quite surprising, because Netcgi wraps all its
> exceptions into FCGI_Error.
>
>>  The webserver also receives EPIPE and returns error 500.
>> Sometimes, I get other errors: sometimes main gets called, and I get
>> Failure("send_output_header").
>
> This is very strange, too. Failure("send_output_header") can only 
> happen
> when the HTTP header is printed at the wrong moment (i.e. too late).
>
>>  If this happens, then rollback_output
>> successfully prints "<br><hr><br>" but no more (i.e. no further
>> diagnostics).  Sometimes, the webserver times out waiting for the 
>> first
>> read, in which case the application still raises EPIPE.
>>
>> Is there any more information that would help to debug the problem?
>
> I suspect a part of the program prints something to stdout at the wrong
> moment, maybe because of a race condition. This is only my intuition,
> but this is the first thing I would try to look for. This would explain
> why the fcgi protocol is violated, and EPIPE is the logical consequence
> (the web server sees wrong data and shuts down the socket).
>
> Maybe an strace of the process can show what is really going on.
>
> Gerd
>
>>
>> Mike
>>
>> Alex Baretta wrote:
>>
>>> If at all possible, rather than AJP I'd stick to FastCGI.  The code 
>>> in
>>> Ocamlnet reportedly works fine, but I don't have experience with it.
>>> Are you sure that the broken pipe is caused by a bug in Ocamlnet and
>>> not in the web server?
>>>
>>> Alex
>>
>>
>>
>> Gerd Stolpmann wrote:
>>
>>> Am Montag, den 18.04.2005, 23:26 -0400 schrieb Mike Hamburg:
>>>
>>>
>>>> I'm obviously too tired to be programming.  That should, of course, 
>>>> read
>>>> (1) fix a broken pipe error in Netcgi_fast or
>>>>
>>>>
>>>
>>> Well, we would need more details to help you.
>>>
>>>
>>>
>>>> (2) port the application to (and configure the webserver for) AJP
>>>>
>>>>
>>>
>>> Porting the application is quite simple, just follow the examples 
>>> coming
>>> with Ocamlnet (in examples/jserv). For the webserver, you need mod_jk
>>> (jk=Jakarta). The configuration reference is here:
>>>
>>> http://jakarta.apache.org/tomcat/connectors-doc/config/apache.html
>>>
>>> Note that Ocamlnet only supports AJP-1.2, not 1.3 which is the 
>>> current
>>> default for Tomcat.
>>>
>>> Gerd
>>>
>>>
>>
>>
> -- 
> ------------------------------------------------------------
> Gerd Stolpmann * Viktoriastr. 45 * 64293 Darmstadt * Germany
> gerd@gerd-stolpmann.de          http://www.gerd-stolpmann.de
> ------------------------------------------------------------
>
>