[SR-3360] on Linux, using DispatchIO with reading and writing simultaneously to a channel can cause the program to hang #707
Comments
Comment by Pierre Habouzit (JIRA) This is a libkqueue bug, not a libdispatch one.
phabouzit (JIRA User), I agree but there is no
Comment by Pierre Habouzit (JIRA) mheily/libkqueue#29 was the right thing to do, this is a dependency and not part of Swift directly.
phabouzit (JIRA User), cool but the thing is that I and other people will try to use
Restoring the libdispatch component since our choice of libkqueue in swift-corelibs-libdispatch is owned by the same folks.
Comment by Pierre Habouzit (JIRA) Will be fixed by https://github.com/apple/swift-corelibs-libdispatch/compare/das-darwin-libdispatch-806-merge-master
Comment by Pierre Habouzit (JIRA) Moving to "Invalid" as in "no longer applies" because with https://github.com/apple/swift-corelibs-libdispatch/compare/das-darwin-libdispatch-806-merge-master we have native epoll support and it doesn't suffer from that libkqueue bug.
Attachment: Download
Additional Detail from JIRA
md5: b40f78e80442b68d649a1548a10b4b87
relates to:
Issue Description:
Description
On Linux, when using a `DispatchIO` channel for reading and writing simultaneously, it's possible that the program just gets stuck. The problem is the following:

- When doing `channel.read(...)`, the underlying `DispatchSource` calls `epoll_ctl(..., EPOLL_CTL_ADD, ...)` (with `EPOLLIN` set) to register the DispatchChannel's fd with epoll. In fact this is done by the libkqueue kqueue-to-epoll conversion layer.
- When (before the read has finished) you also start writing enough using `channel.write(...)`, `libdispatch` tries to add the channel's file descriptor to the epoll group, again using `EPOLL_CTL_ADD`. That results in an `EEXIST` error as documented in the epoll man page.

In other words, the problem is that the file descriptor is already in the epoll set (for reading) and later on there's an attempt to add it again (for writing). With `epoll`, however, you need to modify (`EPOLL_CTL_MOD`) the existing file descriptor's entry to be `EPOLLIN | EPOLLOUT`.

The bug is quite hard to reproduce, as `libdispatch` only tries to register a writer in the epoll set if the `write(2)` syscall ever gets `EAGAIN`.

The attached Swift program reliably reproduces the problem.
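The double registration described above can also be observed with plain epoll calls, without libdispatch or libkqueue involved. The following C sketch is not taken from either library; `demo_double_add` is a hypothetical name chosen for illustration. It assumes a Linux system and registers one end of a socket pair twice with `EPOLL_CTL_ADD`, then retries with `EPOLL_CTL_MOD`.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo, not libkqueue code.
 * Registers one end of a socket pair for reading, then tries to register
 * the same fd again for writing with EPOLL_CTL_ADD.
 * Returns the errno of that second EPOLL_CTL_ADD (0 if it succeeded),
 * or -1 if the setup failed. If mod_result is non-NULL, it receives the
 * return value of a follow-up EPOLL_CTL_MOD with EPOLLIN | EPOLLOUT. */
int demo_double_add(int *mod_result)
{
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return -1;

    int ep = epoll_create1(0);
    if (ep < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    int second_errno = -1;
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = fds[0] };

    /* First registration: reader interest only. */
    if (epoll_ctl(ep, EPOLL_CTL_ADD, fds[0], &ev) == 0) {
        /* Second registration of the same fd: writer interest. */
        ev.events = EPOLLOUT;
        if (epoll_ctl(ep, EPOLL_CTL_ADD, fds[0], &ev) == -1)
            second_errno = errno;
        else
            second_errno = 0;

        /* Modifying the existing entry is the supported way to widen it. */
        ev.events = EPOLLIN | EPOLLOUT;
        int rc = epoll_ctl(ep, EPOLL_CTL_MOD, fds[0], &ev);
        if (mod_result != NULL)
            *mod_result = rc;
    }

    close(ep);
    close(fds[0]);
    close(fds[1]);
    return second_errno;
}
```

On Linux the second `EPOLL_CTL_ADD` should fail with `EEXIST`, while the `EPOLL_CTL_MOD` call should succeed, which matches the behaviour described in the epoll man page.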
How the program works:

- opens a socket pair
- registers a `DispatchIO` reader on the one end (which I'll call the "server end" from now on)
- uses the `write` system call directly to write some bytes to the other end (which I'll call the "client end" from now on)
- we now wait until the `DispatchIO.read` on the server side sees that write
- then I write lots of data (to force the `EAGAIN` to happen) into the server end but do not read it off the client end yet
- when all the writes have been triggered, not all of them can be completed because the pipe which implements the socket pair under the hood is currently full
- lastly, I start a `read(2)` loop on the client end which should read all the bytes that have been written before to the server side

Expected
The program runs to completion and all the bytes that have been written can be read out again. This is the behaviour observed on macOS.
Actual (on Linux)
The program gets stuck: not all of the writes (and therefore also the client's read attempts) succeed.
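The reproduction relies on `write(2)` returning `EAGAIN` once the socket buffer is full and nobody reads from the other end. The C sketch below illustrates only that condition; it is not the attached Swift program, and the names `fill_until_blocked` and `demo_eagain` are hypothetical. It assumes a non-blocking `AF_UNIX` stream socket pair.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: writes fixed-size chunks to a non-blocking fd until
 * write(2) fails. Returns the errno of the failing write and stores the
 * number of bytes accepted so far in *written. */
int fill_until_blocked(int fd, size_t *written)
{
    char chunk[4096];
    memset(chunk, 'x', sizeof chunk);
    *written = 0;

    for (;;) {
        ssize_t n = write(fd, chunk, sizeof chunk);
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted, simply retry */
            return errno;          /* EAGAIN once the buffer is full */
        }
        *written += (size_t)n;
    }
}

/* Hypothetical demo: opens a socket pair, makes one end non-blocking and
 * fills it without ever reading from the peer.
 * Returns the errno that stopped the writes, or -1 if the setup failed. */
int demo_eagain(size_t *written)
{
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return -1;

    int err = -1;
    int flags = fcntl(fds[0], F_GETFL, 0);
    if (flags >= 0 && fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) == 0)
        err = fill_until_blocked(fds[0], written);

    close(fds[0]);
    close(fds[1]);
    return err;
}
```

At the point where this loop stops, an event-driven writer has to register for `EPOLLOUT` to learn when the fd becomes writable again, which is the registration that fails in the scenario described in this issue.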
Notes
A run on Linux looks like:
The `EEXIST` from `epoll_ctl` is best seen with `strace`:
A successful run should look like this:
In the code
The problem is that `EPOLL_CTL_ADD` is unconditionally used here https://github.com/mheily/libkqueue/blob/794155d5e8a6c7389741f448e1f4e8836fe450cd/src/linux/read.c#L183-L202 and https://github.com/mheily/libkqueue/blob/794155d5e8a6c7389741f448e1f4e8836fe450cd/src/linux/write.c#L83-L94 .