Build an End-to-End Encrypted 23andMe-like Genetic Testing Application using Concrete ML
Blog post from Zama
Season 5 of the Zama Bounty Program challenged participants to build a machine learning system that uses Fully Homomorphic Encryption (FHE) to determine ancestry from encrypted DNA data, so that sensitive genetic information is never exposed in the clear. Two developers, Alephzerox and Soptq, shared first prize with distinct solutions. Soptq used a logistic regression-based method that segments chromosome data into windows and predicts ancestry for each window, achieving 96% accuracy with a latency of around 300 seconds on an encrypted genome. Alephzerox implemented a similarity search against a reference panel of single-ancestry ("pure-blooded") genomes, also reaching 96% accuracy, with accuracy rising as more reference genomes are added to the panel. Although the two methods were equally accurate, Soptq's machine learning approach had the lower inference latency on encrypted data. Both results show that FHE can be used to protect DNA information in practice.
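To make the two ideas above concrete, here is a minimal cleartext sketch in Python with NumPy. It is not either winner's code and it involves no FHE or Concrete ML calls. It only illustrates cutting a genome into windows and picking the closest reference genome for each window. The function names (`split_into_windows`, `nearest_ancestry`, `predict_by_window`), the 0/1 SNP encoding, the Hamming distance metric, and the toy data are all assumptions made for illustration. Combining windowing with a reference-panel lookup is likewise a simplification chosen here, not something the post describes.

```python
import numpy as np


def split_into_windows(snps, window_size):
    """Cut a 1-D SNP vector into consecutive fixed-size windows.

    Trailing positions that do not fill a whole window are dropped.
    """
    n_windows = len(snps) // window_size
    return snps[: n_windows * window_size].reshape(n_windows, window_size)


def nearest_ancestry(query, panel, labels):
    """Return the label of the reference row closest to `query`.

    Closeness is the Hamming distance (number of differing positions),
    an assumed metric chosen for simplicity.
    """
    distances = np.count_nonzero(panel != query, axis=1)
    return labels[int(np.argmin(distances))]


def predict_by_window(query, panel, labels, window_size):
    """Predict one ancestry label per window of the query genome."""
    predictions = []
    for i, window in enumerate(split_into_windows(query, window_size)):
        start = i * window_size
        # Compare against the same positions in every reference genome.
        reference_window = panel[:, start : start + window_size]
        predictions.append(nearest_ancestry(window, reference_window, labels))
    return predictions


# Toy reference panel: one genome per (hypothetical) ancestry label.
panel = np.array([
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
])
labels = ["A", "B"]

# Toy query whose first half resembles "A" and second half resembles "B".
query = np.array([0, 0, 1, 0, 1, 1, 1, 0])

window_calls = predict_by_window(query, panel, labels, window_size=4)
print(window_calls)
```

In an encrypted setting the comparison or classification step would run on ciphertexts, which is where most of the latency discussed above comes from. This sketch leaves that part out entirely.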