Performance tips

CPython is not the fastest interpreter out there, and inter-process communication suffers of both serialization and data transfer overhead, but these considerations will help you avoid common performance pitfalls.

Simplify serialized data

Using simpler data types (like python primitives) will dramatically reduce the time spent on serialization, while reducing the chance of transferring unnecessary data.

Custom serialization

When defining your own classes aimed to be sent to and from actors, consider implementing some pickle serialization interfaces in order to customize how they will be serialized, so unnecessary state data will be ignored.

Class optimization

By defining the __slots__ magic property on your classes (and by not adding __dict__ to it), their property mapping will become immutable, dramatically reducing their initialization cost.

Tip: if you plan to weakref those instances, you’ll need to add __weakref__ to __slots__.

External storage for big data-streams

In some cases, actors might need to transfer huge data blobs of between them.

In general, message-passing protocols are usually not the best at this, it might be better to persistently store that data somewhere else while only sending, as the message, what’s necessary to externally fetch that data.

You can see how to achieve this in our Intermediate result storage section.

Pickle5 (hack)

Traditionally, multiprocessing, and more specifically pickle, were not particularly optimized for binary data buffer transmission.

Python 3.8 introduced a new pickle protocol (PEP 574), greatly optimizing the serialization of buffer objects (like bytearray, memoryview, numpy.ndarray).

For compatibility reasons, multiprocessing does not use the latest pickle protocol available, and it does not expose any way of doing so other than patching it globally.

Workaround (tested on CPython 3.8 and 3.9, to use the latest protocol):

import multiprocessing.connection as mpc

class ForkingPickler5(mpc._ForkingPickler):
    @classmethod
    def dumps(cls, obj, protocol=-1):
        return super().dumps(obj, protocol)

mpc._ForkingPickler = ForkingPickler5

For previous CPython versions, a pickle5 backport is available, but the patch turns out a bit messier because of implementation details.

Workaround (tested on CPython 3.6 and 3.7, to use the pickle5 backport):

import io
import multiprocessing.connection as mpc
import pickle5

class ForkingPickler5(pickle5.Pickler):
    wrapped = mpc._ForkingPickler
    loads = staticmethod(pickle5.loads)

    @classmethod
    def dumps(cls, obj, protocol=-1):
        buf = io.BytesIO()
        cls(buf, protocol).dump(obj)
        return buf.getbuffer()

    def __init__(self, file, protocol=-1, **kwargs):
        super().__init__(file, protocol, **kwargs)
        self.dispatch_table = \
          self.wrapped(file, protocol, **kwargs).dispatch_table

mpc._ForkingPickler = ForkingPickler5

Keep in mind these snippets are no more than dirty workarounds to one of many multiprocessing implementation issues, so use this code with caution.