Introduction
Python’s collections module, part of the standard library, offers several specialized container types that extend the capabilities of the built-in dict, list, and tuple. In this tutorial, we’ll explore the most widely used of them (defaultdict, Counter, deque, and namedtuple) and demonstrate how they can simplify your code and improve performance in data processing tasks.
defaultdict
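The defaultdict is a subclass of the built-in dict that handles missing keys for you: when a missing key is accessed, it calls a default factory to create a value, stores that value under the key, and returns it. This eliminates the need for explicit key existence checks.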
#|label: defaultdict-example
from collections import defaultdict

# Create a defaultdict with list as the default factory.
dd = defaultdict(list)

# Append values to keys.
dd["fruits"].append("apple")
dd["fruits"].append("banana")
dd["vegetables"].append("carrot")

print("defaultdict:", dict(dd))
outputs:
defaultdict: {'fruits': ['apple', 'banana'], 'vegetables': ['carrot']}
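To make the point about key existence checks concrete, here is a minimal sketch that groups words by first letter twice, once with a plain dict and once with a defaultdict. The word list and variable names are illustrative and not part of the original example.

```python
from collections import defaultdict

# Hypothetical sample data for illustration.
words = ["apple", "avocado", "banana", "blueberry", "cherry"]

# Plain dict: each key must be checked (and initialised) before appending.
by_letter_plain = {}
for word in words:
    if word[0] not in by_letter_plain:
        by_letter_plain[word[0]] = []
    by_letter_plain[word[0]].append(word)

# defaultdict: the list factory creates the missing entry on first access.
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

print(dict(by_letter) == by_letter_plain)  # True
```

Note that merely reading a missing key with square brackets also inserts it. To test for a key without creating it, use the in operator or the get() method.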
Counter
The Counter class is a dict subclass for counting hashable objects. It makes it easy to count occurrences and perform operations such as finding the most common elements.
#|label: counter-example
from collections import Counter

data = ["apple", "banana", "apple", "orange", "banana", "apple"]
counter = Counter(data)

print("Counter:", counter)
print("Most common:", counter.most_common(2))
outputs:
Counter: Counter({'apple': 3, 'banana': 2, 'orange': 1})
Most common: [('apple', 3), ('banana', 2)]
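Beyond most_common(), Counter objects also support arithmetic. The sketch below, using made-up inventory counts, shows addition, subtraction, and the behaviour for keys that were never counted.

```python
from collections import Counter

# Hypothetical inventories for illustration.
stock = Counter(apple=3, banana=2)
delivery = Counter(apple=1, orange=4)

# Addition sums the counts key by key.
combined = stock + delivery

# Subtraction keeps only the keys whose resulting count is positive.
remaining = combined - Counter(apple=2, banana=5)

print(combined["apple"], combined["banana"], combined["orange"])  # 4 2 4
print(remaining["apple"], remaining["orange"])  # 2 4
print(combined["kiwi"])  # 0: a missing key counts as zero and is not inserted
```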
deque
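A deque (double-ended queue) supports O(1) appends and pops at both ends, making it ideal for implementing queues or stacks. By contrast, removing the first item of a list is O(n), because every remaining element has to shift.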
#|label: deque-example
from collections import deque

# Create a deque and add elements to both ends.
d = deque([1, 2, 3])
d.append(4)      # Append to the right
d.appendleft(0)  # Append to the left

print("Deque:", d)

# Pop elements from both ends.
right = d.pop()
left = d.popleft()

print("After popping, deque:", d)
print("Popped from right:", right, "and left:", left)
outputs:
Deque: deque([0, 1, 2, 3, 4])
After popping, deque: deque([1, 2, 3])
Popped from right: 4 and left: 0
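Two other deque features often come up in practice: a bounded length via the maxlen argument, and rotate(). The following sketch uses invented sample values to show both.

```python
from collections import deque

# A bounded deque keeps only the most recent maxlen items;
# older items are discarded from the opposite end.
recent = deque(maxlen=3)
for reading in [10, 20, 30, 40, 50]:
    recent.append(reading)
print(recent)  # deque([30, 40, 50], maxlen=3)

# rotate(n) shifts items n steps to the right; a negative n shifts left.
ring = deque([1, 2, 3, 4])
ring.rotate(1)
print(ring)  # deque([4, 1, 2, 3])
```

A bounded deque like this is a convenient way to keep a sliding window of recent values without trimming a list by hand.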
namedtuple
namedtuple is a factory function that creates lightweight tuple subclasses with named fields. Instances are immutable and can be used like regular tuples, while the field names improve code readability.
#|label: namedtuple-example
from collections import namedtuple

# Define a namedtuple called 'Point'.
Point = namedtuple("Point", ["x", "y"])

# Create a Point instance.
p = Point(x=10, y=20)

print("Namedtuple Point:", p)
print("X coordinate:", p.x)
print("Y coordinate:", p.y)
outputs:
Namedtuple Point: Point(x=10, y=20)
X coordinate: 10
Y coordinate: 20
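The claims that instances are immutable and behave like regular tuples can be checked directly. This sketch reuses the Point type from above and adds a few illustrative variable names.

```python
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])
p = Point(x=10, y=20)

# Unpacking and indexing work exactly as with a regular tuple.
x, y = p
first = p[0]

# _replace() returns a new instance; the original is left unchanged.
moved = p._replace(x=15)

# _asdict() converts the fields to a dictionary.
as_dict = p._asdict()

# Assigning to a field raises AttributeError because instances are immutable.
immutable = False
try:
    p.x = 99
except AttributeError:
    immutable = True

print(moved, dict(as_dict), immutable)
```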
Additional Collections
While the above examples cover the most common advanced collections, Python’s collections module also includes:
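- OrderedDict: Although regular dicts preserve insertion order since Python 3.7, OrderedDict adds reordering methods such as move_to_end() and popitem(last=False), and its equality comparison is order-sensitive.
- ChainMap: Groups multiple dictionaries into a single view, searching them in order on lookup.
- UserDict, UserList, and UserString: Wrapper classes for easier subclassing of dict, list, and str.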
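As a brief illustration of the first two entries, the sketch below reorders an OrderedDict and layers two dictionaries with a ChainMap. The settings dictionaries are hypothetical examples, not taken from the tutorial.

```python
from collections import ChainMap, OrderedDict

# OrderedDict: explicit reordering operations.
od = OrderedDict(a=1, b=2, c=3)
od.move_to_end("a")               # order is now b, c, a
oldest = od.popitem(last=False)   # removes and returns the first item
print(list(od), oldest)  # ['c', 'a'] ('b', 2)

# ChainMap: lookups search the mappings from left to right.
defaults = {"color": "blue", "size": "M"}   # hypothetical default settings
overrides = {"color": "red"}                # hypothetical user overrides
settings = ChainMap(overrides, defaults)
print(settings["color"], settings["size"])  # red M

# Writes affect only the first mapping; the defaults stay untouched.
settings["size"] = "L"
print(overrides, defaults)
```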
Conclusion
Advanced data structures in the collections module can simplify many tasks by providing built-in functionality that goes beyond standard data types. Using tools like defaultdict, Counter, deque, and namedtuple can lead to cleaner, more efficient, and more readable code. Experiment with these collections to see how they can improve your data processing and application logic.
Further Reading
- Comprehensive Guide to Python Data Structures
- Handling Nested Data Structures in Python
- Advanced Operations on Data Structures in Python
- Performance Comparisons and Best Practices for Python Data Structures
Happy coding, and enjoy mastering Python’s advanced collections!
Citation
@online{kassambara2024,
author = {Kassambara, Alboukadel},
title = {Advanced {Python} {Collections:} Defaultdict, {Counter,}
Deque, Namedtuple, Etc.},
date = {2024-02-09},
url = {https://www.datanovia.com/learn/programming/python/additional-tutorials/collections-module.html},
langid = {en}
}