Titanic Competition with Privacy Preserving Machine Learning
Blog post from Zama
The blog post presents a Privacy-Preserving Machine Learning (PPML) solution to Kaggle's Titanic challenge built with the Concrete-ML toolkit, showing how Fully Homomorphic Encryption (FHE) can keep data encrypted during model predictions without a meaningful loss of accuracy.

The workflow starts with standard data preparation: cleaning, feature engineering, and one-hot encoding ("dummification") of categorical variables so that every feature is numeric. An XGBoost classifier is then trained, with its hyperparameters tuned via scikit-learn's GridSearchCV, and compared against an equivalent Concrete-ML model that runs inference under FHE.

Both models perform comparably: the FHE model reaches a slightly higher accuracy of 78%, versus 77% for the plaintext XGBoost model, demonstrating that encrypted inference need not degrade predictive quality. The post emphasizes how easily Concrete-ML slots into an existing data-science workflow, since it mirrors the scikit-learn API and requires no cryptography expertise, and highlights the potential of FHE for secure machine learning applications.