MLOps London October: Testing and Quality in Data Science
Blog post from Seldon
At the October MLOps London meetup, Philip Henry and Chris Monit gave a talk on integrating automated testing and quality assurance into data science projects, with the aim of bridging academia and industry so that data science work becomes more efficient and has more impact. They emphasized the benefits of automated regression testing: the tests exercise production code paths and assert expected results, so a change that breaks existing behavior is caught as a failing test. The speakers also highlighted synthetic data as a way to reduce bugs and enable quick fixes. Its advantages include control over what is modeled, privacy safety, and the ability to run tests locally; the cost is the initial investment in building synthetic data generators. By adopting software engineering testing strategies, data scientists can develop reliable analytical code and robust production systems. The event was organized by Ed Shee and Seldon and hosted by Rise (Barclays), and it offered insights into improving data science quality through test-driven development.
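To make the idea of a regression test backed by synthetic data concrete, here is a minimal sketch in Python. It is not taken from the talk: `make_synthetic_data`, `fit_line`, and the test name are hypothetical, and `fit_line` is only a stand-in for whatever production analytics code a project would actually exercise. The generator plants a known linear signal, and the test asserts that the code under test recovers it.

```python
import random


def make_synthetic_data(n, slope, intercept, noise_sd, seed):
    """Hypothetical generator: y = slope * x + intercept + Gaussian noise.

    A fixed seed makes the data reproducible, and no real (private)
    records are involved, so the test can run on a laptop or in CI.
    """
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [slope * x + intercept + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Stand-in for production code: ordinary least squares on one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def test_fit_line_recovers_planted_signal():
    # Because we control the generator, we know the "right answer" in advance.
    xs, ys = make_synthetic_data(
        n=500, slope=2.0, intercept=1.0, noise_sd=0.1, seed=42
    )
    slope, intercept = fit_line(xs, ys)
    # Tolerances are loose relative to the noise level, so the test fails
    # only when the fitting logic itself regresses.
    assert abs(slope - 2.0) < 0.05
    assert abs(intercept - 1.0) < 0.05
```

A test runner such as pytest would collect the `test_` function automatically. The design choice worth noting is that the expected values come from the generator's parameters rather than from a snapshot of real data, which is what gives the control and privacy properties described above.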