Linux 3.9 introduced a new way of writing socket servers
Linux kernel 3.9 came with an interesting feature: the ability to bind multiple listening sockets to the same port on the same host, by unrelated processes or threads. The socket option is called SO_REUSEPORT, and here is an article by Michael Kerrisk describing the feature.

The article makes SO_REUSEPORT look like a performance hack relevant only to multithreaded Google servers, but the option also works for processes and suggests a new way of thinking about multiprocess socket servers. There are two standard UNIX designs for such servers:

  • fork - the main process creates a listening socket and calls accept. After a client connects, a child process is created; it inherits the newly accepted connection, communicates with the client, and dies when the communication ends,
  • prefork - the main process creates a listening socket and forks N child processes, which inherit the listening socket. The child processes call accept on that socket, and the operating system is responsible for distributing incoming connections among them (see the sketch after this list).

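To make the contrast concrete, here is a minimal sketch of the classic prefork model in Python; the port and the pool size are arbitrary, and a real server would also watch and respawn its children:

import os
import socket

N = 4  # size of the preforked pool

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(('', 10000))
s.listen(64)

for _ in range(N):
    if os.fork() == 0:
        # Child process: it inherits the listening socket and blocks in accept();
        # the kernel wakes one of the blocked children per incoming connection.
        while True:
            conn, addr = s.accept()
            conn.send(conn.recv(1024))
            conn.close()

# Parent process: in the classic model it would monitor and respawn children;
# here it just waits for them.
for _ in range(N):
    os.wait()
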
The prefork model works better (when it can be used [1]) - it uses a preallocated pool of processes instead of calling fork() for every connection. It also gives more control over system resource usage, because the size of the process pool can be chosen, whereas in the fork model the number of processes can grow uncontrollably.

The SO_REUSEPORT option can be viewed as a vastly simplified prefork model. There is no need to run a parent process, manage children, set up signals, etc. Just spawn any number of single-process, single-threaded servers, kill them and add new ones as necessary. The kernel does all the work of the parent process from the classic prefork model.

Let's see an example. Here is a Python program that sets the SO_REUSEPORT option on the listening socket. Besides that, it's a normal echo server. We print the PID after a client connects to see which process accepted the connection. Note also that I defined the SO_REUSEPORT constant by hand, because even the newest Python (3.3.2) lacks its definition in the socket module.

import socket
import os

# Missing from Python 3.3's socket module; 15 is the value on Linux.
SO_REUSEPORT = 15

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# The option must be set before bind().
s.setsockopt(socket.SOL_SOCKET, SO_REUSEPORT, 1)
s.bind(('', 10000))
s.listen(1)
while True:
    conn, addr = s.accept()
    print('Connected to {}'.format(os.getpid()))
    data = conn.recv(1024)
    conn.send(data)
    conn.close()

Let's now spawn two servers and talk to them using nc as a client:

$ python server.py&
[1] 12649
$ python server.py&
[2] 12650
$ echo data | nc localhost 10000
Connected to 12649
data
$ echo data | nc localhost 10000
Connected to 12650
data
$ echo data | nc localhost 10000
Connected to 12649
data
$ echo data | nc localhost 10000
Connected to 12650
data

It works - we were connected to one of the two spawned processes [2]. Of course, "dynamically" adding a new server also works:

$ python server.py&
[3] 14021
$ echo data | nc localhost 10000
Connected to 12650
data
$ echo data | nc localhost 10000
Connected to 14021
data

Now the question is why bother with multiprocess socket servers at all - aren't threads and events better? There is at least one good niche for them: dynamic languages like Python or Ruby, which need multiple OS processes to achieve real concurrency.

Actually, I became interested in the SO_REUSEPORT option after using gunicorn as a Python HTTP server; gunicorn is based on the Ruby server Unicorn, which uses the prefork model. With prefork, one must be very careful about which code is executed in which process. For example, creating a database connection in the parent process and using it (implicitly cloned by fork()) in the child processes is a recipe for disaster, with the possibility of interleaved data coming from multiple processes being seen by the database server as a single connection [3]. Another bag of inconveniences results from the fact that the pool of child processes is controlled by the parent process, so one must rely on functionality implemented in the server to control the children: adding or killing a process, querying statuses, etc.
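
To make the hazard concrete, here is a minimal sketch of it - not how gunicorn itself works; psycopg2 and the connection string are purely illustrative assumptions, and any client library behaves the same way:

import os
import psycopg2  # illustrative; any database client library has the same issue

# WRONG: a connection opened in the parent is implicitly shared, via fork(),
# by every child, and their traffic can interleave on the single TCP stream.
# shared_conn = psycopg2.connect('dbname=test')

for _ in range(4):
    if os.fork() == 0:
        # RIGHT: each worker opens its own connection after fork().
        conn = psycopg2.connect('dbname=test')
        # ... handle requests using conn ...
        os._exit(0)

for _ in range(4):
    os.wait()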

SO_REUSEPORT simplifies the situation. There is no parent, so there are no worries about implicitly inherited data. If setup code that must run before accepting connections is needed, it can be placed in a separate script and run before the servers. And the processes can be managed using standard Unix tools, or a process manager like Supervisor.
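
For example, a Supervisor configuration along the following lines could keep four independent copies of the echo server running; the program name and path are illustrative, not taken from any real deployment:

[program:echoserver]
command=python /path/to/server.py
numprocs=4
process_name=%(program_name)s_%(process_num)02d
autostart=true
autorestart=true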

I'm not sure how much needs to be implemented to start using SO_REUSEPORT in Python WSGI servers. The only thing needed seems to be setting the socket option; the rest should be a minimal single-threaded, single-process server, with all the processes wired together by a process manager.
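
As a rough illustration of how little that might be, here is a sketch built on the standard library's wsgiref; the subclass name and the application are my own inventions, and SO_REUSEPORT is again hard-coded with its Linux value:

import os
import socket
from wsgiref.simple_server import make_server, WSGIServer

SO_REUSEPORT = 15  # Linux value, as in the echo server above

class ReusePortServer(WSGIServer):
    def server_bind(self):
        # Set the option before the underlying bind() happens.
        self.socket.setsockopt(socket.SOL_SOCKET, SO_REUSEPORT, 1)
        super().server_bind()

def app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [('served by pid {}\n'.format(os.getpid())).encode()]

make_server('', 10000, app, server_class=ReusePortServer).serve_forever()

Several copies of this script could then be started and stopped independently, just like the echo servers above.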

[1] Slow clients or a stateful network protocol can quickly make all the processes busy, with most of their time spent waiting for network I/O.

[2] Further attempts showed that processes are not selected in round-robin fashion; the same process can receive connections multiple times in a row. I'm using kernel 3.10.3.

[3] In the case of PostgreSQL, the libpq documentation forbids using a database connection after fork().

23 comment(s)
Philipp, 2013-08-24 19:17:43
Interesting. Reusing ports. That's wild. Never thought that was possible. Aren't there attack scenarios where one could steal a connection?
Anonymous, 2013-08-24 19:22:26
From the linked article: "To prevent unwanted processes from hijacking a port that has already been bound by a server using SO_REUSEPORT, all of the servers that later bind to that port must have an effective user ID that matches the effective user ID used to perform the first bind on the socket. "
Sean, 2013-08-24 20:03:21
Wouldn't a better way to secure the port be to set a flag indicating its exclusive to the parent process id?
Keith, 2013-08-24 20:24:31
'New' is something of an exaggeration. SO_REUSEPORT is a feature that's existed in BSD derivatives for decades.
Jacky Alciné, 2013-08-24 21:05:29
Keith, it's new to Linux.
Anonymous, 2013-08-24 21:07:52
DECADES.
Anonymous, 2013-08-24 21:11:03
Wow! As maintainer of an app that uses php and xinetd this could have some, really nice benefits to performance!
Anonymous, 2013-08-24 22:00:52
not good overall
Anonymous, 2013-08-24 22:25:12
Sean, the point is to allow processes which are not necessarily siblings to listen/accept on the same socket. (No forking, no common parent.)
Daniel, 2013-08-24 23:03:52
This is also a great way to hotswap network services with zero downtime: just bring up the new version of the daemon online, which will start serving requests right away, then send a signal to the old daemon telling it to not accept() requests anymore, just finish handling the current request and then die gracefully.
bert hubert, 2013-08-25 07:33:26
As the tiniest nit to this wonderful example code, I make it a point to use AF_INET6 for all my samples. This means a copy paster will at least have to do work to turn it into an IPv4-only solution!
Boris, 2013-08-25 09:50:09
SO_REUSEADDR also described in Kerrisk' book "The Linux Programming Interface" by No Starch Press.
And yes, it is cool.
Lothiraldan, 2013-08-25 10:39:56
It's weird, on Mac OS X, I've got a SO_REUSEPORT in socket module and the value is different:
Python 2.7.5 (default, May 19 2013, 00:47:00)
In [1]: import socket
In [2]: socket.SO_REUSEPORT
Out[2]: 512
Anonymous, 2013-08-25 12:12:53
Lothiraldan,

This is part of why you should use constants from your libraries by name when there is an option. Doing so makes it more portable where the value is different under different environments. The other part is so that the magic number isn't an unclear blight on your code.
Pavel, 2013-08-25 15:57:55
1. man 7 unix
2. Look for SCM_RIGHTS
3. ???
4. PROFIT
Seriously though. No new way has been introduced, it's just the old one was facilitated.
Anonymous, 2013-08-25 17:41:07
Is this web sockets?
Alex, 2013-08-25 18:11:27
Lothiraldan: Mac OS X is not Linux. In fact it was based on the BSD kernel.
Lothar Schwab, 2013-08-25 21:12:27
@Pavel,
good point. I has been possible all along. And passing file descriptors through a domain socket appears to me as the better solution because it allows the solicitor of the file descriptor to have more control over who will get the file descriptor. Using SO_REUSEPORT appears to open a barn door of security concerns.
Thomas Bratt, 2013-08-26 08:54:39
A quick comment on footnote 2: The kernel may reuse the same process to improve performance and memory usage - due to paging and caching.
Yann Droneaud, 2013-08-26 10:11:40
@Pavel,
Using SCM_RIGHTS, there's one process doing the accept() call ... this doesn't scale well on a multiprocessor, multi-NIC system.
  
o11c, 2013-08-26 17:17:39
@Yann Droneaud

No, you pass the listening socket fd, on which you do the accept().
Anonymous, 2013-08-26 19:00:20
what happens when both processes are blocked? prefork could detect blocked child processes and kill them off.
ixti, 2013-09-02 15:03:22
Ruby version of python example:
https://gist.github.com/ixti/6413826