Python pattern: Using defaultdicts to initialise dictionaries


When you are working with dictionaries in Python, you may be familiar with the following process:

  1. You initialize a dictionary.
  2. You iterate over data.
  3. You check if a key is in the dictionary. If it isn’t, you add a default value.
  4. You update the value associated with the key (e.g. increment a counter, or add an item to a list or set).

With the collections.defaultdict class, you can skip step 3.

collections.defaultdict is a dictionary subclass that takes a factory function, such as int or list, and calls it to create a value whenever you look up a key that doesn’t exist yet. In every other respect a defaultdict behaves like a regular Python dictionary, so you can manipulate a value without first checking for the presence of its key.
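As a minimal sketch of this behavior (the groups dictionary and its keys are made up for illustration), the snippet below shows that indexing a missing key creates it, while .get() and the in operator leave the dictionary untouched:

```python
from collections import defaultdict

# A defaultdict whose factory is `list`: missing keys start as an empty list.
groups = defaultdict(list)

# Indexing a missing key calls the factory, stores the result, and returns it.
groups["fruit"].append("apple")

# .get() and `in` do not call the factory and do not insert anything.
missing = groups.get("vegetable")
has_vegetable = "vegetable" in groups

print(dict(groups))   # {'fruit': ['apple']}
print(missing)        # None
print(has_vegetable)  # False
```

This matters if you use a defaultdict for lookups as well as updates: reading a key with square brackets adds it, so prefer .get() or in when you only want to check.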

With int as the factory, the default value for a new key is 0; with list, it is an empty list; with dict, it is an empty dictionary. The default is only used for missing keys, so any value you assign yourself replaces it.
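The defaults for a few common factories can be checked directly. This is a small sketch with illustrative variable names, not code from the original script:

```python
from collections import defaultdict

counts = defaultdict(int)     # missing keys start at 0
totals = defaultdict(float)   # missing keys start at 0.0
items = defaultdict(list)     # missing keys start as []
nested = defaultdict(dict)    # missing keys start as {}
unique = defaultdict(set)     # missing keys start as set()

print(counts["a"])  # 0
print(totals["a"])  # 0.0
print(items["a"])   # []
print(nested["a"])  # {}
print(unique["a"])  # set()

# Assigning a value yourself replaces the default.
counts["a"] = 10
print(counts["a"])  # 10
```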

Before defaultdict

Consider a program that counts words:


sentence = "the quick brown fox jumps over the lazy dog"

words = {}

for word in sentence.split():
    # First sighting of a word: create its key so the increment below works.
    if word not in words:
        words[word] = 0

    words[word] += 1

After the loop, the words dictionary contains:


{'the': 2, 'quick': 1, 'brown': 1, 'fox': 1, 'jumps': 1, 'over': 1, 'lazy': 1, 'dog': 1}

In the above program, a words dictionary is initialized. The keys are the unique words in sentence, and each value is the number of times that word appears in the sentence.

When the program encounters a word for the first time, the script populates the words dictionary with a new entry whose value is 0. This is necessary because incrementing the value for a key that hasn’t been added yet raises a KeyError in a regular dictionary.
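To see why the membership check is needed, here is a short sketch of what happens when a regular dictionary is incremented without it (the try/except is added only to keep the example runnable):

```python
words = {}
missing_key = None

try:
    # `+=` first reads words["the"], which does not exist yet.
    words["the"] += 1
except KeyError as error:
    missing_key = error.args[0]
    print(f"KeyError: {missing_key!r}")  # KeyError: 'the'

# The failed increment left the dictionary unchanged.
print(words)  # {}
```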

Using defaultdict

We can reduce the code in our script by using a defaultdict instead of a regular dictionary:


from collections import defaultdict

sentence = "the quick brown fox jumps over the lazy dog"

words = defaultdict(int)

for word in sentence.split():
    words[word] += 1

In this code, we initialize a defaultdict with the int argument. This tells the dictionary to call int() whenever a missing key is accessed. Because int() returns 0, every new word starts with a count of 0 before it is incremented.
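The link between the int argument and the default of 0 can be verified in a few lines. This is a sketch for illustration; the key "unseen" is arbitrary:

```python
from collections import defaultdict

# Calling int with no arguments returns 0.
print(int())  # 0

words = defaultdict(int)

# The factory passed to the constructor is stored on the instance.
print(words.default_factory is int)  # True

# Accessing a missing key calls the factory and stores its result.
print(words["unseen"])  # 0
print(len(words))       # 1
```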

This code produces the same counts as our first script:


{'the': 2, 'quick': 1, 'brown': 1, 'fox': 1, 'jumps': 1, 'over': 1, 'lazy': 1, 'dog': 1}
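The same pattern works when the value is a list rather than a counter. As a hedged sketch (grouping by word length is an example chosen here, not part of the original script), the code below collects the words of the sentence by their length:

```python
from collections import defaultdict

sentence = "the quick brown fox jumps over the lazy dog"

# Each missing length starts as an empty list, so append() works immediately.
by_length = defaultdict(list)

for word in sentence.split():
    by_length[len(word)].append(word)

print(by_length[3])  # ['the', 'fox', 'the', 'dog']
print(by_length[4])  # ['over', 'lazy']
print(by_length[5])  # ['quick', 'brown', 'jumps']
```

Swapping list for set, and append() for add(), would keep only the unique words for each length.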
