# Performance tips

CPython is not the fastest interpreter out there, and [inter-process communication][ipc] suffers from both serialization and data transfer overhead, but the following considerations will help you avoid common performance pitfalls.

## Simplify serialized data

Using simpler data types (like Python primitives) will dramatically reduce the time spent on serialization, while reducing the chance of transferring unnecessary data.

## Custom serialization

When defining your own classes intended to be sent to and from actors, consider implementing some of the [pickle] serialization [interfaces][pickle-reduce] to customize how they are serialized, so that unnecessary state data is left out.

## Class optimization

By defining the `__slots__` attribute on your classes (and by not adding `__dict__` to it), instances no longer carry a per-instance `__dict__`: their set of attributes becomes fixed, which reduces both their memory footprint and their initialization cost.

**Tip:** if you plan to [weakref] those instances, you'll need to add `__weakref__` to `__slots__`.

## External storage for big data-streams

In some cases, actors might need to transfer huge data blobs between them. Message-passing protocols are usually not the best fit for this. It might be better to store that data persistently somewhere else and send, as the message, only what is necessary to fetch it externally.

You can see how to achieve this in our [Intermediate result storage](./result_storage.md) section.

## Pickle5 (hack)

Traditionally, [multiprocessing], and more specifically [pickle], were not particularly optimized for binary data buffer transmission.

Python 3.8 introduced a new pickle protocol ([PEP 574][pep574]), greatly optimizing the serialization of [buffer] objects (like [bytearray], [memoryview], [numpy.ndarray]).

For compatibility reasons, [multiprocessing] does not use the latest pickle protocol available, and it does not expose any way of doing so other than patching it globally.

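
To give a rough idea of what protocol 5 changes for buffer objects, here is a minimal sketch using only the standard `pickle` module on Python 3.8+. It is independent of [multiprocessing] and of the workarounds in this section, and the `payload` name and its 1 MiB size are purely illustrative assumptions.

```python
import pickle

# Illustrative payload: 1 MiB of binary data standing in for a real buffer.
payload = bytearray(b"x" * (1 << 20))

# In-band: the buffer contents are copied into the pickle stream.
in_band = pickle.dumps(pickle.PickleBuffer(payload), protocol=5)

# Out-of-band: the buffer is handed to the callback instead of being copied,
# so the pickle stream itself only holds a small placeholder.
buffers = []
out_of_band = pickle.dumps(
    pickle.PickleBuffer(payload), protocol=5, buffer_callback=buffers.append
)

# Loading needs the same buffers back, in the same order.
restored = pickle.loads(out_of_band, buffers=buffers)

print(len(in_band) > len(payload))   # the stream embeds the whole payload
print(len(out_of_band) < 100)        # the stream is only a few bytes long
print(bytes(memoryview(restored)) == bytes(payload))
```

The out-of-band buffers still have to reach the receiving side by some other channel, which this sketch does not cover.
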

Workaround (tested on CPython 3.8 and 3.9, to use the latest protocol):

```python
import multiprocessing.connection as mpc


class ForkingPickler5(mpc._ForkingPickler):
    @classmethod
    def dumps(cls, obj, protocol=-1):
        # protocol=-1 selects the highest pickle protocol available
        return super().dumps(obj, protocol)


# Patch globally: connections will now pickle with ForkingPickler5
mpc._ForkingPickler = ForkingPickler5
```

For earlier CPython versions, a [pickle5 backport][pickle5] is available, but the patch is a bit messier because of implementation details.

Workaround (tested on CPython 3.6 and 3.7, to use the pickle5 backport):

```python
import io
import multiprocessing.connection as mpc

import pickle5


class ForkingPickler5(pickle5.Pickler):
    wrapped = mpc._ForkingPickler
    loads = staticmethod(pickle5.loads)

    @classmethod
    def dumps(cls, obj, protocol=-1):
        # protocol=-1 selects the highest protocol supported by pickle5
        buf = io.BytesIO()
        cls(buf, protocol).dump(obj)
        return buf.getbuffer()

    def __init__(self, file, protocol=-1, **kwargs):
        super().__init__(file, protocol, **kwargs)
        # Reuse the reducers registered on the original ForkingPickler
        self.dispatch_table = \
            self.wrapped(file, protocol, **kwargs).dispatch_table


# Patch globally, as in the previous snippet
mpc._ForkingPickler = ForkingPickler5
```

Keep in mind that these snippets are no more than dirty workarounds for one of many [multiprocessing][multiprocessing] implementation issues, so use this code with caution.

[ipc]: https://en.wikipedia.org/wiki/Inter-process_communication
[weakref]: https://docs.python.org/3/library/weakref.html
[pep574]: https://www.python.org/dev/peps/pep-0574/
[buffer]: https://docs.python.org/3/c-api/buffer.html
[bytearray]: https://docs.python.org/3/library/functions.html#func-bytearray
[pickle]: https://docs.python.org/3/library/pickle.html
[pickle-reduce]: https://docs.python.org/3/library/pickle.html#pickling-class-instances
[memoryview]: https://docs.python.org/3/library/stdtypes.html#memoryview
[numpy.ndarray]: https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html
[multiprocessing]: https://docs.python.org/3/library/multiprocessing.html
[pickle5]: https://pypi.org/project/pickle5/