The difference between SEMMA and CRISP-DM

Blog post from Starburst

Post Details
Company: Starburst
Author: Cindy Ng
Word Count: 519
Language: English
Summary

SEMMA and CRISP-DM are two prominent process models used in data mining and machine learning to structure the work of building predictive models and extracting insights from data. CRISP-DM, developed in the late 1990s, is a comprehensive and widely recognized framework with six phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment. Because it covers a project from the business problem through to deployment, it is flexible enough to suit many kinds of projects. SEMMA, developed by SAS, has five phases: sample, explore, modify, model, and assess. It concentrates on the modeling work itself, is closely tied to SAS software, and is often used alongside a broader methodology such as CRISP-DM. CRISP-DM is widely adopted thanks to its extensive documentation and community support, while SEMMA is more specialized and specific to the SAS environment, so CRISP-DM is generally the more practical choice for diverse data mining projects.
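To make the difference in scope concrete, the two phase lists can be written down and lined up against each other. The Python sketch below is illustrative only: the phase names come from the summary, but the SEMMA-to-CRISP-DM alignment in `SEMMA_TO_CRISP_DM` is an assumed, commonly cited correspondence rather than something the post defines, and the function name `uncovered_crisp_dm_phases` is hypothetical.

```python
# Phase names as listed in the summary.
CRISP_DM_PHASES = [
    "business understanding",
    "data understanding",
    "data preparation",
    "modeling",
    "evaluation",
    "deployment",
]

SEMMA_PHASES = ["sample", "explore", "modify", "model", "assess"]

# ASSUMPTION: an illustrative alignment of each SEMMA phase with the
# CRISP-DM phase it most resembles. The post does not give this mapping.
SEMMA_TO_CRISP_DM = {
    "sample": "data understanding",
    "explore": "data understanding",
    "modify": "data preparation",
    "model": "modeling",
    "assess": "evaluation",
}


def uncovered_crisp_dm_phases(mapping=SEMMA_TO_CRISP_DM):
    """Return the CRISP-DM phases that no SEMMA phase maps onto."""
    covered = set(mapping.values())
    return [phase for phase in CRISP_DM_PHASES if phase not in covered]


if __name__ == "__main__":
    for phase in SEMMA_PHASES:
        print(f"{phase:8} -> {SEMMA_TO_CRISP_DM[phase]}")
    print("No SEMMA counterpart:", uncovered_crisp_dm_phases())
```

Under this assumed alignment, business understanding and deployment are the two CRISP-DM phases left without a SEMMA counterpart, which is one way to read the summary's point that SEMMA is narrower and is often paired with a broader methodology.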