How-to — Serializer Combinations


The Basics

To extend what a serializer can do, you can wrap it in another serializer. The wrapper customizes how the serialized data is handled without needing to know what is in it.

For example, you could:

  • compress the data

  • add a checksum to verify data integrity

  • encode/encrypt the data

  • and much more.

Such a class is called a serializer wrapper. The easynetwork.serializers.wrapper package provides some useful classes to handle these cases.
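
To make the idea concrete, here is a minimal standard-library sketch of the wrapping pattern. The names Serializer, PickleLikeSerializer and CompressingWrapper are hypothetical and exist only for this illustration; they are not the classes shipped in easynetwork.serializers.wrapper, whose real interface may differ.

```python
import pickle
import zlib
from typing import Any, Protocol


class Serializer(Protocol):
    """Hypothetical minimal interface: objects in, bytes out, and back."""

    def serialize(self, packet: Any) -> bytes: ...

    def deserialize(self, data: bytes) -> Any: ...


class PickleLikeSerializer:
    """Stand-in for a concrete serializer (not the library's PickleSerializer)."""

    def serialize(self, packet: Any) -> bytes:
        return pickle.dumps(packet)

    def deserialize(self, data: bytes) -> Any:
        # Only unpickle data that comes from a trusted source.
        return pickle.loads(data)


class CompressingWrapper:
    """Wraps any serializer and compresses its output without inspecting it."""

    def __init__(self, inner: Serializer) -> None:
        self._inner = inner

    def serialize(self, packet: Any) -> bytes:
        return zlib.compress(self._inner.serialize(packet))

    def deserialize(self, data: bytes) -> Any:
        return self._inner.deserialize(zlib.decompress(data))


wrapped = CompressingWrapper(PickleLikeSerializer())
payload = wrapped.serialize({"data": 42})
restored = wrapped.deserialize(payload)
```

Because the wrapper only sees bytes, the same class could wrap any other serializer that follows the same two-method shape.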

Use Cases

To illustrate why this concept exists and can be useful to you, let’s take a concrete example of an existing serializer.

Let’s say we have a serializer named… PickleSerializer.

Enable The Stream Context

There is no efficient way to determine where a pickle-serialized object ends without calling pickle.loads() and hoping that it works. And since unpickling runs Python code to rebuild objects recursively, deserializing a partially received object is even less safe.

Therefore, this class does not implement the AbstractIncrementalPacketSerializer class. But what if we want to use it over a stream (say, a TCP connection)?

Fortunately, it is possible to encode pickle data in another format: base64. Because base64 output uses a restricted alphabet, a delimiter outside that alphabet can split the received data into lines, one packet per line. This is exactly what the Base64EncoderSerializer already does:

    # PickleSerializer, StreamProtocol and TCPNetworkClient are assumed
    # to be imported at module level (imports not shown here).
    def main() -> None:
        from easynetwork.serializers.wrapper import Base64EncoderSerializer

        # Use of Base64EncoderSerializer as an incremental serializer wrapper
        # Each line (by default delimited with \r\n) is a base64-encoded pickle object.
        serializer = Base64EncoderSerializer(PickleSerializer())
        protocol = StreamProtocol(serializer)

        with TCPNetworkClient(("remote_address", 12345), protocol) as endpoint:
            # Sends b'gASVDQAAAAAAAAB9lIwEZGF0YZRLKnMu\r\n' over the TCP socket.
            endpoint.send_packet({"data": 42})

            ...
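
The framing idea can be sketched with the standard library alone. The helpers encode_packet and split_packets below are hypothetical names for this illustration, not part of EasyNetwork; the sketch only assumes the "\r\n" delimiter mentioned in the comment above.

```python
import base64
import pickle
from typing import Any

DELIMITER = b"\r\n"  # never appears in base64 output, so it is a safe separator


def encode_packet(packet: Any) -> bytes:
    """Pickle the object, base64-encode it, and terminate the line."""
    return base64.b64encode(pickle.dumps(packet)) + DELIMITER


def split_packets(buffer: bytes) -> tuple[list[Any], bytes]:
    """Decode every complete line in buffer; return (packets, unread remainder)."""
    *lines, remainder = buffer.split(DELIMITER)
    packets = [pickle.loads(base64.b64decode(line)) for line in lines]
    return packets, remainder


stream = encode_packet({"data": 42}) + encode_packet([1, 2, 3])

# Simulate a TCP read that stops in the middle of the second packet.
cut = len(encode_packet({"data": 42})) + 5
first_chunk, second_chunk = stream[:cut], stream[cut:]

packets, pending = split_packets(first_chunk)
more_packets, pending = split_packets(pending + second_chunk)
```

Only complete lines are ever passed to pickle.loads(); the incomplete tail waits in the buffer until the rest of the line arrives.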

Ensure Data Integrity

The pickle protocol is incredibly handy, but not very secure. Beyond the risk of processing malicious data, a packet can also be corrupted in transit, which is even more problematic because pickle relies on the data to drive code execution, not the other way around.

To overcome this, we can tell the serializer to add an extra layer of verification:

    def main() -> None:
        from easynetwork.serializers.wrapper import Base64EncoderSerializer

        # Add checksum parameter to append a hash used to verify data integrity
        serializer = Base64EncoderSerializer(PickleSerializer(), checksum=True)
        protocol = StreamProtocol(serializer)

        with TCPNetworkClient(("remote_address", 12345), protocol) as endpoint:
            endpoint.send_packet({"data": 42})

            ...
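
As a rough illustration of what such a verification layer does, the sketch below appends a SHA-256 digest to the pickled bytes and checks it before unpickling. The choice of SHA-256, the byte layout, and the function names are assumptions made for this example; the wire format actually used by Base64EncoderSerializer is not described here.

```python
import hashlib
import pickle
from typing import Any

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes


def serialize_with_checksum(packet: Any) -> bytes:
    """Return the pickled packet followed by its SHA-256 digest."""
    body = pickle.dumps(packet)
    return body + hashlib.sha256(body).digest()


def deserialize_with_checksum(data: bytes) -> Any:
    """Verify the trailing digest, then unpickle the body."""
    body, digest = data[:-DIGEST_SIZE], data[-DIGEST_SIZE:]
    if hashlib.sha256(body).digest() != digest:
        raise ValueError("checksum mismatch: data was corrupted in transit")
    # The digest is verified before any byte reaches pickle.loads().
    return pickle.loads(body)


frame = serialize_with_checksum({"data": 42})
corrupted = bytes([frame[0] ^ 0xFF]) + frame[1:]  # flip the bits of the first byte
```

Calling deserialize_with_checksum(corrupted) raises ValueError instead of handing damaged bytes to the unpickler. A plain digest like this detects accidental corruption only; anyone can recompute it, so it says nothing about who sent the data.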

Use Less Bandwidth

Let’s say {"data": 42} produces a BIG packet (e.g., 1 MB in size).

This is often the case with large projects that handle a lot of data. It’s common practice to keep packets as small as possible to limit bandwidth usage and network traffic.

The first thing to do would be to reduce the pickle data itself:

    def main() -> None:
        from easynetwork.serializers.wrapper import Base64EncoderSerializer

        # Reduce pickle size by adding pickler_optimize=True
        serializer = Base64EncoderSerializer(PickleSerializer(pickler_optimize=True), checksum=True)
        protocol = StreamProtocol(serializer)

        with TCPNetworkClient(("remote_address", 12345), protocol) as endpoint:
            endpoint.send_packet({"data": 42})

            ...
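
This page does not say how pickler_optimize works internally. One standard-library tool with a comparable effect is pickletools.optimize(), which drops memo opcodes that the pickle never uses. The sketch below shows that function on its own, as an assumption about the kind of saving involved rather than a description of PickleSerializer.

```python
import pickle
import pickletools

raw = pickle.dumps({"data": 42})

# Remove unused memo opcodes; the result unpickles to the same object.
optimized = pickletools.optimize(raw)

saved_bytes = len(raw) - len(optimized)
restored = pickle.loads(optimized)
```

The saving is typically a few bytes per stored object, which explains why this step alone is rarely enough for large packets.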

This helps, but it’s clearly not enough. So, last option: compress the data using zlib:

    def main() -> None:
        from zlib import Z_BEST_COMPRESSION

        from easynetwork.serializers.wrapper import Base64EncoderSerializer, ZlibCompressorSerializer

        # Add ZlibCompressorSerializer to compress pickle output
        serializer = Base64EncoderSerializer(
            ZlibCompressorSerializer(PickleSerializer(pickler_optimize=True), compress_level=Z_BEST_COMPRESSION),
            checksum=True,
        )
        protocol = StreamProtocol(serializer)

        with TCPNetworkClient(("remote_address", 12345), protocol) as endpoint:
            endpoint.send_packet({"data": 42})

            ...
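
To get a feel for the gain, the following standalone sketch compresses a large, repetitive pickle with zlib at the same compression level. The payload is invented for the example, and the ratio you observe depends entirely on your own data.

```python
import pickle
import zlib

# A deliberately large and repetitive payload (made up for this example).
packet = {"data": list(range(1000)) * 100}

raw = pickle.dumps(packet)
compressed = zlib.compress(raw, level=zlib.Z_BEST_COMPRESSION)

# Decompressing restores the exact pickle bytes, so the round trip is lossless.
restored = pickle.loads(zlib.decompress(compressed))
ratio = len(compressed) / len(raw)
```

Note the nesting order in the example above: compression wraps the pickle serializer and base64 wraps the compressor. Compressing first matters, because base64 text grows the data by about a third and compresses less effectively than the raw bytes.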

Deserialize Trusted Data

Adding a checksum to verify the data is all well and good, but it’s not enough in terms of security if there’s no authentication protocol to ensure the provenance of the data, and even less so when it comes to pickle.

The last thing to do would be to add a signature with a key known only to the two communicating parties:

    def main() -> None:
        from zlib import Z_BEST_COMPRESSION

        from easynetwork.serializers.wrapper import Base64EncoderSerializer, ZlibCompressorSerializer

        # Sign the checksum hash with a key known only to both sides of the pipe to verify the origin of the data.
        verification_key = b"very secret key"
        serializer = Base64EncoderSerializer(
            ZlibCompressorSerializer(PickleSerializer(pickler_optimize=True), compress_level=Z_BEST_COMPRESSION),
            checksum=verification_key,
        )
        protocol = StreamProtocol(serializer)

        with TCPNetworkClient(("remote_address", 12345), protocol) as endpoint:
            endpoint.send_packet({"data": 42})

            ...
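
A keyed signature of this kind is commonly built with HMAC. The sketch below uses the standard hmac module with SHA-256; the algorithm, the byte layout and the function names are assumptions for illustration, since this page does not specify how the library signs its checksum.

```python
import hashlib
import hmac
import pickle
from typing import Any

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes


def sign(packet: Any, key: bytes) -> bytes:
    """Return the pickled packet followed by its HMAC-SHA256 signature."""
    body = pickle.dumps(packet)
    return body + hmac.new(key, body, hashlib.sha256).digest()


def verify_and_load(data: bytes, key: bytes) -> Any:
    """Check the signature with the shared key before unpickling anything."""
    body, signature = data[:-DIGEST_SIZE], data[-DIGEST_SIZE:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    # compare_digest runs in constant time to avoid leaking timing information.
    if not hmac.compare_digest(signature, expected):
        raise ValueError("invalid signature: data is corrupted or from an unknown sender")
    return pickle.loads(body)


key = b"very secret key"
signed = sign({"data": 42}, key)
```

A receiver holding a different key rejects the packet, so data from a sender who does not know the key never reaches pickle.loads(). In real use the key should be generated randomly and kept out of source code.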

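Now we can use this StreamProtocol for both the server and the client, and have data transfer whose integrity and origin are verified. Note that signing does not hide the content: the data is encoded and compressed, not encrypted.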