[Larceny-users] Larceny Runtime - SIGPIPE Handled?

Ray Racine

2008-05-19 02:17:54 UTC

I have been doing some load testing, lots of threads/tasks with lots of
socket io on my Larceny WebServer.

Current test goes like this:

1. Client (browser, curl hammer scripts) makes an HTTP request to my
Larceny WebServer.

2. The Larceny server, in handling the request, opens additional outbound
HTTP requests to Amazon, Google, Wikipedia et al., snarfing content,
mashing content, and returning mashed content to the client.

3. I can hammer away. Stable, no mem leaks. Yea ha!

4. Until I _interrupt_ a client in mid request at which point Larceny
just exits. No msg, no error, no muss, no fuss, just exits back to the
command line.

I have traced it to here.

;; write(2)
;; int write( int fd, void *buf, int n )
(define unix-write (foreign-procedure "write" '(int boxed int) 'int))

When called, it never returns.

Of course, I'm doing _tons_ of unix-reads as well. And I'm playing many
more games with reading, such as non-blocking I/O, EAGAIN, and epolling. For
write I just do a simple write without much fanfare on the file
descriptor. Surprise, surprise: no problems on read, only on write.

Googling on socket writing I find...

Signals
When writing onto a connection-oriented socket that has been shut down
(by the local or the remote end) SIGPIPE is sent to the writing process
and EPIPE is returned. The signal is not sent when the write call
specified the MSG_NOSIGNAL flag.

My theory is, when the client is interrupted the socket is closed. The
Larceny server continues to try writing to the socket fd resulting in
the SIGPIPE signal and causing the exit. I looked briefly at signals.c
and don't see it being handled or masked. But I didn't look very
hard :).
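
The theory can be checked outside Larceny with a small C program. The following is only a sketch, assuming a POSIX system; the function name is made up for illustration, and a socketpair() stands in for the real client connection. A child process writes to a socket whose peer is already closed, and the parent reports which signal, if any, ended it.

```c
#include <signal.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the number of the signal that killed the
 * child writer, 0 if the child exited normally, or -1 on setup failure. */
int signal_that_killed_writer(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: plays the role of the server process. */
        int sv[2];
        signal(SIGPIPE, SIG_DFL);      /* make sure the default action applies */
        if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
            _exit(2);
        close(sv[1]);                  /* the "client" goes away */
        if (write(sv[0], "x", 1) < 0)  /* default SIGPIPE action ends us here */
            _exit(1);                  /* reached only if no signal arrived */
        _exit(0);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

If the theory holds, the returned value is SIGPIPE, and the child prints nothing before it dies, which matches the silent exit described above.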

I have 2 options that I know of.

1) Use send instead of write. Send allows for additional flags over
write. In particular, the following flag:

MSG_NOSIGNAL
Requests not to send SIGPIPE on errors on stream oriented sockets when
the other end breaks the connection. The EPIPE error is still
returned.

2) The Larceny runtime handles/ignores the SIGPIPE signal (if it does
not already).
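
For reference, option 1 might look roughly like this in C. This is a sketch under assumptions, not Larceny code: it assumes Linux (MSG_NOSIGNAL is not available everywhere), uses a socketpair() with one end closed to stand in for a broken connection, and the function name is hypothetical.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: sends one byte to a socket whose peer is already
 * closed, using MSG_NOSIGNAL. Returns the errno reported by send(),
 * 0 if send() unexpectedly succeeded, or -1 on setup failure. */
int errno_from_send_to_closed_peer(void)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    close(sv[1]);                                  /* peer goes away */
    ssize_t n = send(sv[0], "x", 1, MSG_NOSIGNAL); /* no SIGPIPE is raised */
    int err = (n < 0) ? errno : 0;

    close(sv[0]);
    return err;
}
```

The SIGPIPE disposition is left untouched here, so the process surviving the call and seeing EPIPE is down to the flag alone. The catch is that every write site has to be changed to send(), and send() only works on sockets.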

Chris Brannon

2008-05-19 04:18:15 UTC

Post by Ray Racine
My theory is, when the client is interrupted the socket is closed. The
Larceny server continues to try writing to the socket fd resulting in
the SIGPIPE signal and causing the exit. I looked briefly at signals.c
and don't see it being handled or masked. But I didn't look very
hard :).

Are you checking the return value of write(2), to make sure that you don't
do a partially successful write? Most people only check for a negative
return value, assuming that they check at all. If your write(2) call
doesn't write all the data that you supplied, then you had a partial write,
and it's probably safe to assume that the connection was closed at the
other end.
Here's an example, in C.

while(1) {
    char *msg = get_next_datum(); /* Get something to write. */
    int msg_length = strlen(msg);
    int num_written = write(some_socket, msg, msg_length);
    if(num_written != msg_length) {
        printf("Error sending data.\n");
        /* Do whatever is necessary to handle error. */
    }
}

In other words, I think you can avoid sigpipe by simply checking the
return value of write(2). Someone correct me if I'm wrong.
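
A common way to deal with short writes is a retry loop, sketched below. The name write_all is made up for illustration. Note that this only handles partial writes and EINTR; going by the man page text quoted earlier in the thread, SIGPIPE is sent at the time of the write itself, so checking the return value does not by itself suppress the signal.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: writes all len bytes unless an error occurs.
 * Returns 0 on success, -1 on error with errno set by write(). */
int write_all(int fd, const char *buf, size_t len)
{
    size_t done = 0;

    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;      /* interrupted before writing anything; retry */
            return -1;         /* e.g. EPIPE when the peer has gone away */
        }
        done += (size_t)n;     /* short write: carry on with the remainder */
    }
    return 0;
}
```

A short write on its own usually just means the kernel buffer was full at that moment, so the loop continues rather than treating it as a closed connection.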

-- Chris

Ray Racine

2008-05-20 00:19:51 UTC

Post by Chris Brannon
Are you checking the return value of write(2), to make sure that you don't
do a partially successful write? Most people only check for a negative
return value, assuming that they check at all. If your write(2) call
doesn't write all the data that you supplied, then you had a partial write,
and it's probably safe to assume that the connection was closed at the
other end.

Hi Chris,

I do check for the return value +/-/0. My testing shows that the FFI
call to write() never returns at all even if the previous attempted
write() call succeeded in writing the full value. In Larceny's iosys
this is typically 1024 chunks.

Best guess at this time is the unhandled SIGPIPE signal that is raised.

I'm going to follow Will's suggestion and attach a null handler in the
Larceny runtime modeled after the SIGFPE code in signals.c

At the same time I'll put in additional code to carefully look at the
write() return value.

I'll update the list after my test on what I've found.

Ray

William D Clinger

2008-05-19 13:34:54 UTC

1. Does my theory make sense?

Sounds good to me.

2. What would Larceny do if a SIGPIPE was thrown? Exit?

Yes, I think that is what the current version of Larceny
will do.

3. Assuming SIGPIPE indeed needs to be dealt with, out of the two
options 1) change to send with NOSIGNAL flag 2) Larceny Runtime
handle/ignore/mask SIGPIPE which approach would you suggest.

I think the second option is the better way to go for the
long term. See

src/Rts/Sys/signals.h
src/Rts/Sys/signals.c

It should be fairly easy to modify signals.c to treat
SIGPIPE the same as SIGFPE, which would allow you to
test your theory and to see whether the second approach
would solve the problem. The hard part is testing the
signal-handling code under all of

BSD_SIGNALS
XOPEN_SIGNALS
POSIX_SIGNALS
STDC_SIGNALS
WIN32_SIGNALS

If you can send us a prototype of the signal-handling
code you want for one of those, with a fairly simple
test case, then we should be able to generalize it to
all five.

Will

Ray Racine

2008-05-20 02:13:02 UTC

Post by William D Clinger
If you can send us a prototype of the signal-handling
code you want for one of those, with a fairly simple
test case, then we should be able to generalize it to
all five.
Will

It was the SIGPIPE signal causing Larceny to exit.

On Linux "kill -s 13 <pid>" sends the given process a SIGPIPE.

Given a running instance of Larceny, the above command does cause
Larceny to silently exit.

OTOH, "kill -s 8 <pid>" sends a SIGFPE which puts the Larceny process
into the debugger.

My design goal is to have Larceny ignore the SIGPIPE signal and have the
Scheme custom I/O port socket code deal with the EPIPE (32) error code
returned from the write() call on a broken connection.

Turns out on Linux there exists a pre-canned ignore handler SIG_IGN.
So the following does the trick.

act.sa_handler = SIG_IGN;
sigaction( SIGPIPE, &act, (struct sigaction*)0 );

In signals.c

==============================================

diff --git a/trunk/larceny_src/src/Rts/Sys/signals.c b/trunk/larceny_src/src/Rts/Sys/signals.c
index bb4a752..dc81e52 100644
--- a/trunk/larceny_src/src/Rts/Sys/signals.c
+++ b/trunk/larceny_src/src/Rts/Sys/signals.c
@@ -152,6 +152,9 @@ void setup_signal_handlers( void )

act.sa_handler = fpehandler;
sigaction( SIGFPE, &act, (struct sigaction*)0 );
+
+ act.sa_handler = SIG_IGN;
+ sigaction( SIGPIPE, &act, (struct sigaction*)0 );
#elif defined(WIN32_SIGNALS)
SetConsoleCtrlHandler(win32_inthandler,TRUE);
signal(SIGFPE, fpehandler);

==============================================

After recompiling Larceny,
kill -s 13 <pid> is, as expected, ignored by Larceny.

My test case is pretty easy on my side, but I'm struggling for a simple
test case for the Larceny developers to test cross O/S.

On my side, I start my Larceny webserver, go to my test home page, and
execute 2 rapid refresh clicks. The browser terminates the first request
on the second refresh click, causing the SIGPIPE. This happens
pretty much every time.

With the above change Larceny does not exit, and the socket scheme code
sees the expected EPIPE error code returned from the FFI write() call.

To simulate the problem with standard Larceny and its accompanying
libraries the following test outline should reproduce the problem.

In thread or process #1 open up a server socket. Upon accepting a
client connection, start writing a 1K buffer of data in a loop, say
100 times, i.e. send 100K of data.

In thread or process #2 create a client socket which connects to #1's
server socket. Read the first 1K or two of data, then #2 closes the
socket without reading the rest of the data.

#1 should silently exit Larceny to the command line on the SIGPIPE
signal.
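
As a rough stand-in for that outline in plain C (not Larceny code), the sketch below uses socketpair() and fork() instead of a listening socket, and ignores SIGPIPE so the writer can report what it saw instead of dying. All names and the WARMUP constant are made up for illustration. To avoid a race, the writer waits for the reader to exit after the first few chunks.

```c
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum { CHUNK = 1024, CHUNKS = 100, WARMUP = 4 };

/* Reader (#2): read about 2K, then close and exit, leaving data unread. */
static void reader(int fd)
{
    char buf[CHUNK];
    size_t got = 0;

    while (got < 2 * CHUNK) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n <= 0)
            break;
        got += (size_t)n;
    }
    close(fd);
    _exit(0);
}

/* Writer (#1): returns how many chunks were written before write() failed
 * and stores that errno in *err (0 if nothing failed); -1 on setup failure. */
int run_repro(int *err)
{
    int sv[2], i, reaped = 0;
    char chunk[CHUNK];
    pid_t pid;

    *err = 0;
    memset(chunk, 'a', sizeof chunk);
    signal(SIGPIPE, SIG_IGN);   /* without this the writer is killed instead */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        close(sv[0]);
        reader(sv[1]);          /* does not return */
    }
    close(sv[1]);               /* writer keeps only its own end */

    for (i = 0; i < CHUNKS; i++) {
        if (i == WARMUP) {      /* by now the reader has all the data it wants */
            waitpid(pid, NULL, 0);
            reaped = 1;
        }
        if (write(sv[0], chunk, sizeof chunk) < 0) {
            *err = errno;
            break;
        }
    }
    if (!reaped)
        waitpid(pid, NULL, 0);
    close(sv[0]);
    return i;
}
```

With the signal(SIGPIPE, SIG_IGN) line removed, the writer should die silently on the first write after the reader closes, which is the behaviour the outline is meant to reproduce.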

Ray

William D Clinger

2008-05-20 13:52:38 UTC

Post by Ray Racine
My test case is pretty easy on my side, but I'm struggling for a simple
test case for the Larceny developers to test cross O/S.

I think we can proceed on the basis of what you've
already given us. OTOH, we may have a SIGPIPE put
Larceny in the debugger by default, just as a SIGFPE
already does, but programmers can write an exception
handler to ignore it instead.

Thanks!

Will