# Performance tips

CPython is not the fastest interpreter out there, and
[inter-process communication][ipc] suffers of both serialization
and data transfer overhead, but these considerations will help you avoid
common performance pitfalls.

## Simplify serialized data

Using simpler data types (like python primitives) will dramatically reduce
the time spent on serialization, while reducing the chance of transferring
unnecessary data.

## Custom serialization

When defining your own classes aimed to be sent to and from actors,
consider implementing some [pickle] serialization [interfaces][pickle-reduce]
in order to customize how they will be serialized, so unnecessary state
data will be ignored.

## Class optimization

By defining the `__slots__` magic property on your classes (and by not
adding `__dict__` to it), their property mapping will become immutable,
dramatically reducing their initialization cost.

**Tip:** if you plan to [weakref] those instances, you'll need to add
`__weakref__` to `__slots__`.

## External storage for big data-streams

In some cases, actors might need to transfer huge data blobs of between them.

In general, message-passing protocols are usually not the best at this,
it might be better to persistently store that data somewhere else while only
sending, as the message, what's necessary to externally fetch that data.

You can see how to achieve this in our
[Intermediate result storage](./result_storage.md) section.

## Pickle5 (hack)

Traditionally, [multiprocessing], and more specifically [pickle],
were not particularly optimized for binary data buffer transmission.

Python 3.8 introduced a new pickle protocol ([PEP 574][pep574]),
greatly optimizing the serialization of [buffer] objects
(like [bytearray], [memoryview], [numpy.ndarray]).

For compatibility reasons, [multiprocessing] does not use the latest
pickle protocol available, and it does not expose any way of doing so
other than patching it globally.

Workaround (tested on CPython 3.8 and 3.9, to use the latest protocol):
```python
import multiprocessing.connection as mpc

class ForkingPickler5(mpc._ForkingPickler):
    @classmethod
    def dumps(cls, obj, protocol=-1):
        return super().dumps(obj, protocol)

mpc._ForkingPickler = ForkingPickler5
```

For previous CPython versions, a [pickle5 backport][pickle5] is available, but
the patch turns out a bit messier because of implementation details.

Workaround (tested on CPython 3.6 and 3.7, to use the pickle5 backport):
```python
import io
import multiprocessing.connection as mpc
import pickle5

class ForkingPickler5(pickle5.Pickler):
    wrapped = mpc._ForkingPickler
    loads = staticmethod(pickle5.loads)

    @classmethod
    def dumps(cls, obj, protocol=-1):
        buf = io.BytesIO()
        cls(buf, protocol).dump(obj)
        return buf.getbuffer()

    def __init__(self, file, protocol=-1, **kwargs):
        super().__init__(file, protocol, **kwargs)
        self.dispatch_table = \
          self.wrapped(file, protocol, **kwargs).dispatch_table

mpc._ForkingPickler = ForkingPickler5
```

Keep in mind these snippets are no more than dirty workarounds to one of many
[multiprocessing][multiprocessing] implementation issues, so use this code
with caution.

[ipc]: https://en.wikipedia.org/wiki/Inter-process_communication
[weakref]: https://docs.python.org/3/library/weakref.html
[pep574]: https://www.python.org/dev/peps/pep-0574/
[buffer]: https://docs.python.org/3/c-api/buffer.html
[bytearray]: https://docs.python.org/3/library/functions.html#func-bytearray
[pickle]: https://docs.python.org/3/library/pickle.html
[pickle-reduce]: https://docs.python.org/3/library/pickle.html#pickling-class-instances
[memoryview]: https://docs.python.org/3/library/stdtypes.html#memoryview
[numpy.ndarray]: https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html
[multiprocessing]: https://docs.python.org/3/library/multiprocessing.html
[pickle5]: https://pypi.org/project/pickle5/