Reducing Pydantic's memory footprint using bitsets
Blog post from Pydantic
In July 2025, a user raised a concern that Pydantic models might be consuming excessive memory, initially suspecting PEP 412 as the cause. PEP 412 optimizes Python dictionaries by sharing keys across instances, and the investigation showed that it was not responsible. Instead, the problem was traced to Pydantic's model validation logic, where a significant amount of memory went to tracking explicitly set fields in a mutable Python set. The solution was to track these fields with a bitset instead, which reduced memory usage by approximately 55%. Because a model's fields have a fixed order, each field can be assigned a bit position, and the collection of explicitly set fields can be stored as a single binary number. The savings are most significant for models with numerous fields. The post also highlights the challenges of making such changes in an open-source project like Pydantic: caution is needed to avoid breaking existing functionality and to stay compatible with user expectations.
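To make the bitset idea concrete, here is a minimal sketch in plain Python. It is not Pydantic's actual implementation; the names `FIELDS`, `INDEX`, `mark_set`, `is_set`, and `to_names` are hypothetical and exist only for illustration. It assumes the field order is fixed and that the name-to-position table is shared by every instance of a model, so each instance stores a single integer.

```python
import sys

# Hypothetical model fields, in declaration order (an assumption for this sketch).
FIELDS = ("id", "name", "email", "age")

# Shared lookup table: field name -> bit position. Built once, not per instance.
INDEX = {name: i for i, name in enumerate(FIELDS)}


def mark_set(bits: int, name: str) -> int:
    """Return a new bitset with the bit for `name` switched on."""
    return bits | (1 << INDEX[name])


def is_set(bits: int, name: str) -> bool:
    """Check whether the bit for `name` is on."""
    return bool((bits >> INDEX[name]) & 1)


def to_names(bits: int) -> set[str]:
    """Rebuild an ordinary set of field names, e.g. for a public API that returns one."""
    return {name for i, name in enumerate(FIELDS) if (bits >> i) & 1}


if __name__ == "__main__":
    bits = 0
    bits = mark_set(bits, "id")     # bit 0
    bits = mark_set(bits, "email")  # bit 2
    print(bin(bits), to_names(bits))

    # Rough per-instance comparison; exact sizes depend on the Python build.
    print("set:", sys.getsizeof({"id", "email"}), "bytes")
    print("int:", sys.getsizeof(bits), "bytes")
```

Converting back to a set only on request, as `to_names` does, is one way such a change could keep the user-facing behavior unchanged while the internal storage shrinks.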