ITEM: K8725L

Need help with the poll command.


Question:

I am using the poll command with sockets and message queues. I am at
"<3.2.5".  I am having problems, and need some help.

Response:

Customer is using the poll subroutine with a timer on an Internet
stream socket.  Customer is using poll because it
is far more efficient than simply waiting on a nonblocking
socket function.  Everything works fine until the remote end
of the socket is shut down.  At that point the program hangs
on the poll, and the socket connection is never reported
broken, although it should be.  What is going wrong?  Customer
is setting the POLLIN and POLLPRI bits.


Response:

Customer called back; further debugging revealed that the 
process is not really stuck in the poll call.  After the
socket breaks, poll returns with a value that
indicates there is something to be read on the socket 
descriptor.  When the socket descriptor is read, the process
gets an input length of 0 and EWOULDBLOCK in errno.  
The process then gets stuck in an infinite loop, with poll
returning each time on a zero-byte input.  Should I treat this
0 return value as end of file?


Response:

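A zero return value from a read does indicate an
EOF condition on the file descriptor.  ( See the
man pages on read, readx, readv, or readvx )  When read
returns 0, errno is not set, so the EWOULDBLOCK seen
there is left over from an earlier call; a nonblocking
read with no data pending returns -1, not 0.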

Once the remote process has died, there would seem
to be only two ways to be certain that the connection
has died.  

        First you can attempt to write to the socket 
when a 0 is returned on the nonblocking socket read.  
If the socket is truly dead, then the writing process 
will receive a SIGPIPE (with TCP, possibly not until the
second write, since the first may only draw a reset from
the peer).  Be advised, however, that unless
the program has been coded to handle a SIGPIPE gracefully
it will terminate.  This provides the fastest feedback
on the status of the socket.

        The process might also enable keepalives on the
socket using the socket option SO_KEEPALIVE.  With this option
set, the system will probe connections that have
been idle for the length of time specified by tcp_keepidle.
( set with the no command; check with 'no -a | pg' ) 
This will correctly find a socket whose peer has dropped
the connection, though only after tcp_keepidle
has expired.  In AIX this value is set to the minimum 
allowable value specified in the Host Requirements RFC,
which is 2 hours.  ( 14400 half seconds )  Changing
tcp_keepidle to make this return faster will affect
all processes on the system, however.
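
Turning the option on for one socket is a single setsockopt
call.  A minimal sketch follows; enable_keepalive is a
hypothetical wrapper name, and the idle time itself still comes
from the system-wide tcp_keepidle setting described above.

```c
#include <sys/socket.h>

/* Enable keepalive probes on an existing socket descriptor.
 * Returns 0 on success, -1 with errno set on failure. */
int enable_keepalive(int fd)
{
    int on = 1;
    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on);
}
```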

        A third possibility is a combination of the
previous two methods.  The Customer could code in
a timer that manually keeps track of how long the 
connection has been idle and then forces a write
to the socket to confirm its status after a 
certain timeout period.  This requires more effort on 
the part of the programmer, but has two distinct 
advantages:

  1)    The timeout for the socket could be set to
        a value much less than tcp_keepidle without
        affecting the other processes on the system
        by actually modifying tcp_keepidle.

  2)    Because the socket is nonblocking, this method
        will eliminate the need for a write statement
        every time a read returns with a 0 (EOF) value.

        TCP/IP is working as designed.  The keepalives
provided as part of AIX and several other UNIX implementations
are an elective portion of the protocol, and are therefore not
activated by default at socket creation.

References:

_TCP/IP Illustrated, Volume 1_, W. Richard Stevens, 1994,
Addison-Wesley Publishing Co.

_TCP/IP Tutorial and Technical Overview_, IBM, 1992,
GC24-3376-03.

Support Line: Need help with the poll command. ITEM: K8725L
Dated: July 1994 Category: N/A