Titanic Competition with Privacy Preserving Machine Learning
Blog post from Zama
The blog post presents a Privacy-Preserving Machine Learning (PPML) solution to Kaggle's Titanic challenge built with the Concrete-ML toolkit, showing how Fully Homomorphic Encryption (FHE) can keep data encrypted during model predictions without a meaningful loss of accuracy.

The workflow starts with standard data preparation: cleaning, feature engineering, and one-hot encoding ("dummification") of categorical variables so that every feature is numeric. An XGBoost classifier is then trained, with its hyperparameters tuned via scikit-learn's GridSearchCV, and compared against an equivalent Concrete-ML model that runs inference under FHE.

Both models perform comparably: the FHE model reaches a slightly higher accuracy of 78%, versus 77% for the plaintext XGBoost model, demonstrating that encrypted inference need not degrade predictive quality. The post emphasizes how easily Concrete-ML slots into an existing data-science workflow, since it mirrors the scikit-learn API and requires no cryptography expertise, and highlights the potential of FHE for secure machine learning applications.