Company
Date Published
Author
Vadim Markovtsev
Word count
2726
Language
English
Hacker News points
None

Summary

The engineers at Athenian describe how they made their Python API server code 100x faster. They focused on three areas: coroutines, groupby operations, and data serialization. Ordering the arguments of `asyncio.gather()` according to expected IO wait time reduced execution times. Using shared filters in SQLAlchemy Core improved query performance. Constructing Pandas DataFrames directly from `asyncpg.Record`s eliminated unnecessary memory copies. Iterating lists in Cython without the Global Interpreter Lock (GIL) increased speed by 10x. Zero-copy serialization using a structured numpy array wrapper was at least 10-50x faster than both pickle and storing fields in individual SQL table columns. Replacing pandas' groupby with pure numpy operations reduced execution times by 20-50x, depending on the dataframe size and column values. Together, these optimizations let Athenian's analytics backend process hundreds of thousands of items in milliseconds, a 1000x performance improvement over the initial MVP codebase.
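
The `asyncio.gather()` ordering point can be illustrated with a minimal sketch. It is not Athenian's code: the `fetch` helper, the query names, and the timings are hypothetical, `time.sleep` stands in for CPU work done before the first `await`, and `asyncio.sleep` stands in for the IO wait. The sketch also assumes the useful ordering is longest expected wait first, which the summary does not spell out. Since `gather()` schedules its awaitables in argument order and each one runs until its first suspension point, putting the slowest request first lets its wait overlap with the preparation of the others.

```python
import asyncio
import time


async def fetch(name, cpu_seconds, io_seconds, started):
    """Hypothetical request: blocking preparation, then an awaited IO wait."""
    started.append(name)             # record the order in which coroutines begin
    time.sleep(cpu_seconds)          # stand-in for CPU work before the first await
    await asyncio.sleep(io_seconds)  # stand-in for waiting on a database reply
    return name


async def run(specs):
    """Gather all requests in the given order and measure the wall time."""
    started = []
    begin = time.perf_counter()
    results = await asyncio.gather(
        *(fetch(name, cpu, io, started) for name, cpu, io in specs)
    )
    return time.perf_counter() - begin, started, results


# (name, CPU seconds before the first await, IO wait seconds); made-up numbers
specs = [
    ("slow_query", 0.05, 0.30),
    ("fast_query_1", 0.05, 0.05),
    ("fast_query_2", 0.05, 0.05),
]

longest_first, start_order, results = asyncio.run(run(specs))
longest_last, _, _ = asyncio.run(run(specs[::-1]))

print(f"longest wait first: {longest_first:.2f}s")
print(f"longest wait last:  {longest_last:.2f}s")
```

In this toy setup the slow request's wait begins roughly 0.1 s earlier when it is listed first, so the whole batch finishes sooner. The gain in a real server depends on how much synchronous work each coroutine does before its first `await`.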