Non-blocking I/O for Python

Motivation: Ever wondered, “What if Node.js-style non-blocking I/O could somehow be ported to Python?”

Handling millions of long-running TCP connections in Python would normally require a pool of threads or processes. Below is a snippet for handling concurrent TCP connections using a thread pool.

Concurrent TCP echo server using Thread Pool
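A minimal sketch of what such a thread-pool echo server might look like is given here. The function names (`make_listener`, `serve`, `handle_client`), the buffer size and the pool size are assumptions made for illustration, not a definitive implementation.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def make_listener(host="127.0.0.1", port=0):
    """Create a listening TCP socket; port 0 lets the OS pick a free port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen()
    return listener


def handle_client(conn):
    """Echo everything one client sends until it disconnects."""
    with conn:
        while True:
            data = conn.recv(4096)   # blocks this worker thread
            if not data:             # empty bytes means the peer closed
                break
            conn.sendall(data)


def serve(listener, max_workers=8):
    """Accept connections forever and hand each one to a pool thread."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while True:
            try:
                conn, _addr = listener.accept()   # blocks the accepting thread
            except OSError:                       # listener was closed
                break
            pool.submit(handle_client, conn)
```

Calling `serve(make_listener(port=9000))` would start the server on a hypothetical port 9000. Note that each connected client occupies one worker thread for its whole lifetime, so with `max_workers=8` a ninth client has to wait until another one disconnects.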

The threaded TCP server above exhibits the following insane behavior:

  1. Python threads are real POSIX threads, managed by the OS and not by the language runtime.
  2. It brings in the horrifying universe of deadlocks, mutexes, condition variables, futexes, data races, thread synchronization and thread-safe queues.
  3. Don’t use it.
  4. Finally, please don’t use it. (The GIL gods are happy that way.)

So, do we have any alternatives for writing asynchronous, non-blocking I/O in Python without thread or process-fork magic? The answer is “YES”. Enter asyncio. This is a fairly new package in Python’s standard library that tries to overlap computation and I/O, so that user space is not blocked during I/O system calls. However, it has major downsides. A few of them are:

  1. Suffering and pain (complicated API).
  2. It’s a library and not a runtime.
  3. All the major blocking I/O drivers and libraries block the event loop when called directly, so they are effectively useless with it. Examples: psycopg2, pymongo, socket, requests, SQLAlchemy, etc. Forget about them.
  4. Weird syntax that wraps callbacks inside coroutines and exposes Future as the result primitive.
  5. Yet another queue, yet another future.
# OLD
from concurrent.futures import Future
import queue
# NEW
from asyncio import Future
from asyncio import Queue
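To make the "yet another future" point concrete, here is a small sketch showing that the two Future types are not interchangeable and need an adapter. `blocking_work`, `main` and `demo` are hypothetical names used only for this illustration; `asyncio.wrap_future` is the standard-library bridge between the two.

```python
import asyncio
import concurrent.futures


def blocking_work():
    """Stand-in for a blocking call (hypothetical example function)."""
    return 42


async def main():
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        # submit() returns a concurrent.futures.Future, which cannot be awaited.
        thread_future = pool.submit(blocking_work)
        # wrap_future() adapts it into an asyncio.Future tied to the running loop.
        loop_future = asyncio.wrap_future(thread_future)
        result = await loop_future
    return (
        isinstance(thread_future, asyncio.Future),
        isinstance(loop_future, asyncio.Future),
        result,
    )


def demo():
    return asyncio.run(main())   # (False, True, 42)
```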

Yeah, it quickly burns my eyes with the complexity it brings: Future (Promise in JS), Task, event loop, coroutines, generators, async/await, coroutines wrapped in Tasks, thread executors, wrapped futures, callbacks, etc.

Asyncio surely does bring an API similar to what Node.js has to offer. However, its event loop lives in a Python library instead of in the runtime, which unfortunately exposes lower-level implementation details, as well as friction between threads and the principle of single-threaded, event-based I/O within Python.

So what is my goal here?

  1. Don’t use Node.js. It’s single-threaded and non-blocking, but switching languages? Really?
  2. Don’t use threads.
  3. Don’t use process pools.
  4. Don’t use asyncio either.
  5. Don’t use async/await. Nope!!
  6. Be concurrent and be able to handle millions of TCP connections while being single-threaded.

Enter the combination of select and queue. So what is select?

The select system call is used to determine when there’s any activity on an I/O descriptor. What makes the call interesting is that it can provide notification for not just one descriptor, but many. For each descriptor, you can request notification of the descriptor’s ability to write data, the availability of data to read, and whether an error has occurred. Below is the typical communication between user space and kernel space using the select system call.

Communication between user space and kernel

If you are interested in the API surrounding select, please follow this link: https://developer.ibm.com/articles/l-async/

In essence, it has mainly three properties:

  1. It selects only the sockets that are ready to read, write or both from the pool of sockets provided by user space.
  2. It blocks until at least one socket is available for read or write.
  3. However, it accepts a timeout, so the caller can get back to CPU tasks when no socket becomes ready in time.
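These three properties can be seen in a few lines with Python’s `select.select`. This is a small sketch only: `poll_once` and `demo` are hypothetical helper names, and a local `socket.socketpair()` stands in for real network connections.

```python
import select
import socket


def poll_once(sockets, timeout):
    """Return the sockets that become readable within `timeout` seconds."""
    readable, _writable, _errored = select.select(sockets, [], sockets, timeout)
    return readable


def demo():
    a, b = socket.socketpair()   # two connected sockets in one process
    try:
        # Nothing has been sent yet: select() blocks, then gives up after 0.1 s.
        before = poll_once([a, b], timeout=0.1)
        b.sendall(b"ping")
        # Now only `a` has data waiting, so only `a` is selected.
        after = poll_once([a, b], timeout=0.1)
        return len(before), after == [a]
    finally:
        a.close()
        b.close()
```

Here `demo()` returns `(0, True)`: the first call times out with no ready sockets, and the second reports only the socket that actually has data to read.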

And as for the queue, straight out of Wikipedia:

In computer science, a queue is a collection of entities that are maintained in a sequence and can be modified by the addition of entities at one end of the sequence and removal from the other end of the sequence.

Below is a pure Python implementation using just the select system call and a custom event loop.

TCP echo server using select and queue
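The following is a minimal sketch of how such a server could be put together, and not necessarily identical to the original gist. It assumes one `deque` of pending bytes per client as the queue, and the names `make_listener`, `run_once` and `serve_forever` are made up for illustration.

```python
import select
import socket
from collections import deque


def make_listener(host="127.0.0.1", port=0):
    """Non-blocking listening socket; port 0 lets the OS pick a free port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen()
    listener.setblocking(False)
    return listener


def drop(sock, outgoing):
    """Forget a client and close its socket."""
    outgoing.pop(sock, None)
    sock.close()


def run_once(listener, outgoing, timeout=None):
    """One event-loop turn. `outgoing` maps client socket -> deque of bytes."""
    readers = [listener, *outgoing]
    writers = [s for s, pending in outgoing.items() if pending]
    readable, writable, _ = select.select(readers, writers, [], timeout)

    for sock in readable:
        if sock is listener:
            try:
                conn, _addr = listener.accept()
            except BlockingIOError:       # the connection went away again
                continue
            conn.setblocking(False)
            outgoing[conn] = deque()      # fresh queue for the new client
            continue
        try:
            data = sock.recv(4096)
        except ConnectionError:
            data = b""
        if data:
            outgoing[sock].append(data)   # enqueue the bytes to echo back
        else:
            drop(sock, outgoing)          # peer closed the connection

    for sock in writable:
        pending = outgoing.get(sock)
        if not pending:                   # dropped during the read phase
            continue
        chunk = pending.popleft()
        try:
            sent = sock.send(chunk)
        except ConnectionError:
            drop(sock, outgoing)
            continue
        if sent < len(chunk):             # partial write: requeue the rest
            pending.appendleft(chunk[sent:])


def serve_forever(listener):
    """Single-threaded loop: every client shares this one thread."""
    outgoing = {}
    while True:
        run_once(listener, outgoing)
```

Calling `serve_forever(make_listener(port=9000))` would run the loop on a hypothetical port 9000. The design choice is that nothing ever blocks on a single client: reads and writes happen only on sockets that select has reported ready, and data waiting to be written sits in the per-client queue in the meantime.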

P.S. For those comparing the above event loop with libuv (Node.js), here is what it lacks:

  1. No task scheduler using sleep and timeouts.
  2. No task cancellation API.
  3. No interoperability between Future/Promise and callbacks.
  4. No waiting and sleeping queue.
  5. No select timeout.
  6. No prioritization between I/O and CPU bound tasks.

The implementation is still in its infancy and solely provides a proof of concept of asynchronous programming in Python and of how we could achieve I/O performance similar to Node.js. Also, I was curious about how Python, Node.js or any other language with asynchronous I/O achieves its non-blocking capability.

The code is fairly simple, and you might want to clone the gist and run it yourself if you are interested in the implementation.

Thank you for the read.