Merging Dictionaries In Python

A massive merging of industrial trains in a train yard with blue and brown colors apparently in the morning.

Photo by Zach Brown on Unsplash. Original post from Jason's blog.

At Tincre I use a dictionary merging (and class construction) strategy that is highly flexible and allows us to reason well with our JavaScript/TypeScript frontends that serve client users. You can read about that on our two part Slightly Sharpe blog series for the deets.

But herein, I'm simply going to show you how we actually merge Python dictionaries that have lots of Nonetype values and sparsely, str or other value types, like int or float values.

Merging strategies

So let's start with an example Pydantic PaymentFeatures class, which we'll use as our dictionary, per se.

You can use the .dict() method to get back a dictionary on Pydantic objects.

# pydantic

class PaymentFeatures(BaseModel):
    """A class to model data features (columns) for
    the Payment table stored in our PostgreSQL database"""
    myParam: Union[str, None] = None
    myOtherParam: Union[str, None] = None

Oftentimes I need to merge several dictionaries with the same keys but don't want None values to overwrite keys with str or to be specific, non-Nonetype, values.

If you're a regular Pythonista you know that this behavior isn't totally built-in or rather, not available off-the-shelf.

Regular dictionary merging

If you want to simply merge two dictionaries, keeping the last seen value for shared keys, Python 3.9 adds the Dict union operator with PEP 584.

Check out how the CPython API was updated via the original PR #12088.

To combine two dictionaries simply:

my_dict0 = dict(awesomeKeyName = "awesome value 0")
my_dict1 = dict(awesomeKeyName = "awesome value 1")
# return a new dict
my_merged_dict_1_overwrites_0 = my_dict0 | my_dict1
print(my_merged_dict_1_overwrites_0)

Yeah, you're right, that's super cool. And the community was thrilled with the addition to the Python dict built-in. Life was made easier, thanks to open-source, yet again. I digress.

And yes, for those language lawyers out there, you can also do the same thing in-place:

my_dict0 = dict(awesomeKeyName = "awesome value 0",)
my_dict1 = dict(
    awesomeKeyName = "awesome value 1",
    anotherAwesomeKeyName = "awesome value 1 for anotherAwesomeKeyName",
)
# in-place modification of my_dict0
my_dict0 |= my_dict1
print(my_dict0)

But that `None` behavior...

What happens when one of these key values is of type None? In particular, will Nonetype values overwrite strings and other values?

Let's examine the situation by replacing the string value for awesomeKeyName in my_dict1 with None.

Look at this result:

my_dict0 = dict(awesomeKeyName = "awesome value 0",)
my_dict1 = dict(
    awesomeKeyName = None,
    anotherAwesomeKeyName = "awesome value 1 for anotherAwesomeKeyName",
)
# in-place modification of my_dict0
my_dict0 |= my_dict1
print(my_dict0)

Oh no! That's not what we want (but what we should expect).

`Nonetype` avoidance merging

So let's use dictionary comprehensions to get the job done and then throw that into a simple function you can reuse.

my_dict0 = dict(awesomeKeyName = "awesome value 0",)
my_dict1 = dict(
    awesomeKeyName = None,
    anotherAwesomeKeyName = "awesome value 1 for anotherAwesomeKeyName",
)
# merge the two dictionaries avoiding None values
my_merged_dict_1_overwrites_0_but_avoiding_nonetype = \
    {key: val for key, val in my_dict0.items() if val is not None} | \
    {key: val for key, val in my_dict1.items() if val is not None}
print(my_merged_dict_1_overwrites_0_but_avoiding_nonetype)

Perfect. Empty strings, boolean values, and everything else that isn't Nonetype is left alone. But those pesky None values are tossed (along with their keys, if nothing other than Nonetype values are present).

Make it reusable

def merge_without_none(lhs: dict, rhs: dict) -> dict:
    """Merge lhs and rhs dictionaries, excluding from both and
    avoiding overwriting lhs with Nonetype values."""
    return {key: val for key, val in lhs.items() if val is not None} | \
    {key: val for key, val in rhs.items() if val is not None}

assert "test val" in merge_without_none(
    {"testKey": "test val"},
    {"testKey": None},
).get("testKey")
assert not merge_without_none(
    {"testKey": None},
    {"testKey": None},
).get("testKey")

I'll leave it as an exercise for the reader to properly type the lhs and rhs.

Putting it all together

Let's back up to our PaymentFeatures class from before and use it to merge two of them that we instantiate.

Instantiate our `PaymentFeatures` objects

We'll rely on the .dict() method from Pydantic, as mentioned above.

# instantiate the objects
pf0 = PaymentFeatures(
    myParam=42,
    myOtherParam="Is the meaning of life",
)
pf1 = PaymentFeatures(
    myParam=42,
    myOtherParam=None,
)

Merge the dictionaries

Let's use our sparklin' new function and combine the two, keeping "Is the meaning of life"

pf_merged = merge_without_none(
    pf0.dict(),
    pf1.dict(),
)

Remember that the pf0 and pf1 objects we used are Pydantic models so convert them to dictionaries prior to merging!

Gotchas

Recall that the ordinary Python "merge" methods always rewrite keys present in both the left-hand-side and right-hand-side with the right-hand-side values.

That means that if we add something besides a None value to pf1's myOtherParam attribute, we should expect the merge_without_none function to overwrite the myOtherParam value from pf0.

Summary

To summarize, we

reviewed the standard way to merge two dictionaries in Python 3.9 via the newfangled union operator,
showed a simple dictionary comprehension method to merge and avoid None values ,
added the method to a reusable function, and
used that function with Pydantic models, as a toy, but real-world example.

If you liked this article, sign up for the newsletter so you get notified of additional rockin' posts from me.

Subscribe to the newsletter

✨ Lastly, check out my baby, Tincre Promo, to add a marketing agency to your app, site, or platform. ✨