Normally I do something related to programming in my spare time every month. I read something that I find interesting and want to share. Or I have some thought related to programming that I want to share. This is the first month since I started these monthly updates in June 2019 that I have neither. I’ve been occupied with other things, and I’ve also done quite a bit of programming at work. Perhaps that has satisfied my interest in programming.

One thing that I’ve done a lot at work this month is wrapping simple data structures in classes. Instead of passing around lists and dictionaries, I’ve created classes holding that data and only serialized it at the edges of the application. Every time I have done this, I have wished I had done it sooner. It’s so good. And nine times out of ten, those classes have attracted some functionality that fits perfectly. They have provided one place to put functionality instead of scattering it throughout the codebase.

What am I talking about? Let me give an example.

Imagine that we talk to a user service that has an API something like this:

service.get_all_users() -> ["user1", "user2"]
service.set_users_in_group("group1", ["user1", "user2"]) -> OK

The API works with users represented as a list of strings. Now we want to write a function that applies some domain logic to assign users to groups. It is so easy and tempting to write something like this:

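def update_r_users(service):
    # Domain logic (which users have an "r" in their name) is interleaved
    # with the calls to the service.
    r_users = []
    for user in service.get_all_users():
        if "r" in user:
            r_users.append(user)
    service.set_users_in_group("users_with_r_in_name", r_users)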

One problem with this code is that it mixes calls to the user service with domain logic, so it is difficult to test the domain logic without invoking the service. It’s also more difficult to reason about.

What if we instead did this:

def update_r_users(service):
    service.set_users_in_group(
        "users_with_r_in_name",
        Users.from_service(service.get_all_users()).filter_name("r").serialize()
    )

class Users:

    @classmethod
    def from_service(cls, users):
        return cls(users)

    def __init__(self, users):
        self.users = users

    def filter_name(self, text):
        return Users([
            user
            for user in self.users
            if text in user
        ])

    def serialize(self):
        return self.users

This is what I mean by wrapping simple data structures in classes. Why is this better?

First of all, I think update_r_users now reads a lot better. The filtering logic is no longer mixed with the calls to the service.
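
It also means the filtering logic can be tested on its own. Here is a minimal sketch of such a test, with no service involved. The Users class is repeated from above so the snippet runs by itself; the test function name and the sample user names are made up for illustration.

```python
class Users:
    """Same wrapper as above, repeated so this snippet is self-contained."""

    @classmethod
    def from_service(cls, users):
        return cls(users)

    def __init__(self, users):
        self.users = users

    def filter_name(self, text):
        return Users([user for user in self.users if text in user])

    def serialize(self):
        return self.users


def test_filter_name_keeps_only_matching_users():
    # Hypothetical sample data; the domain logic runs on plain values,
    # so no service (real or fake) is needed.
    users = Users.from_service(["rickard", "anna", "peter"])
    assert users.filter_name("r").serialize() == ["rickard", "peter"]


test_filter_name_keeps_only_matching_users()
```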

This comes at the cost of writing the Users class, which is quite long relative to the functionality that it provides. In the beginning, I often find it hard to motivate myself to write these classes. It feels like a lot of boilerplate code for not much benefit. However, I often find that these sorts of classes attract functionality, at which point they start to become more useful.

Another thing that they do is encapsulate the data format from the user service. Say that the API of the service changes. There is now more information about users:

service.get_all_users() -> [{"name": "user1", "age": 21}, {"name": "user2", "age": 43}]
service.set_users_in_group("group1", ["user1", "user2"]) -> OK

We can update the Users class accordingly:

class Users:

    @classmethod
    def from_service(cls, users):
        return cls(users)

    def __init__(self, users):
        self.users = users

    def filter_name(self, text):
        return Users([
            user
            for user in self.users
            if text in user["name"]
        ])

    def serialize(self):
        return [user["name"] for user in self.users]

The update_r_users function stays the same. We control the API of Users, so our application can safely depend on it, and we remain free to change the internals.
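
To check that claim, here is a rough sketch that runs the unchanged update_r_users against the new data format. FakeService is a hypothetical in-memory stand-in that I made up for this example; it only mimics the two calls shown above and is not the real service. The sample users are invented too.

```python
class Users:
    """The updated wrapper: each user is now a dictionary from the service."""

    @classmethod
    def from_service(cls, users):
        return cls(users)

    def __init__(self, users):
        self.users = users

    def filter_name(self, text):
        return Users([user for user in self.users if text in user["name"]])

    def serialize(self):
        return [user["name"] for user in self.users]


def update_r_users(service):
    # Identical to the version written before the API changed.
    service.set_users_in_group(
        "users_with_r_in_name",
        Users.from_service(service.get_all_users()).filter_name("r").serialize(),
    )


class FakeService:
    """Hypothetical in-memory stand-in for the user service."""

    def __init__(self, users):
        self.users = users
        self.groups = {}

    def get_all_users(self):
        return self.users

    def set_users_in_group(self, group, names):
        # Record the call so it can be inspected afterwards.
        self.groups[group] = names


service = FakeService([
    {"name": "user1", "age": 21},
    {"name": "anna", "age": 43},
])
update_r_users(service)
print(service.groups)  # {'users_with_r_in_name': ['user1']}
```

Only Users knows that a user is a dictionary with a "name" key. The service still receives a plain list of names.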

I couldn’t find a name for this sort of pattern. My first thought was that it was a way to avoid primitive obsession. And it is. But it feels like more than that. Bill suggested that it might be just “good old encapsulation”. Dan said that it depends on the context and that it could be an anti-corruption layer in DDD terms. What would you call it?

Anyway, I’ve been doing this sort of thing a lot this month, and I thought it was worth sharing.