The difference between SEMMA and CRISP-DM
Blog post from Starburst
SEMMA and CRISP-DM are two prominent process models for data mining and machine learning projects. Both structure the work of building predictive models and extracting insights from data, but they differ in scope. CRISP-DM (Cross-Industry Standard Process for Data Mining), developed in the late 1990s, is a widely recognized, tool-neutral framework with six phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment. Because it runs from the business question through to the deployed result, it adapts to many kinds of projects. SEMMA, developed by SAS, has five phases: sample, explore, modify, model, and assess. It concentrates on the modeling side of a project, has no explicit business understanding or deployment phase, and is closely tied to SAS software, so it is often used alongside a broader methodology such as CRISP-DM. CRISP-DM is widely adopted thanks to its extensive documentation and community support, while SEMMA is more specialized and specific to the SAS environment, which makes CRISP-DM generally the more practical choice for diverse data mining projects.
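One way to see why SEMMA is often paired with a broader methodology is to line the two phase lists up side by side. The sketch below is illustrative only: the phase names come from the post, but the `SEMMA_TO_CRISP_DM` alignment and the `uncovered_crisp_dm_phases` helper are assumptions made for this example, not part of either methodology's official definition.

```python
from enum import Enum


class CrispDm(Enum):
    """The six CRISP-DM phases, in order."""
    BUSINESS_UNDERSTANDING = 1
    DATA_UNDERSTANDING = 2
    DATA_PREPARATION = 3
    MODELING = 4
    EVALUATION = 5
    DEPLOYMENT = 6


class Semma(Enum):
    """The five SEMMA phases, in order."""
    SAMPLE = 1
    EXPLORE = 2
    MODIFY = 3
    MODEL = 4
    ASSESS = 5


# ASSUMPTION: an approximate alignment chosen for illustration.
# Neither methodology defines an official mapping to the other.
SEMMA_TO_CRISP_DM = {
    Semma.SAMPLE: CrispDm.DATA_UNDERSTANDING,
    Semma.EXPLORE: CrispDm.DATA_UNDERSTANDING,
    Semma.MODIFY: CrispDm.DATA_PREPARATION,
    Semma.MODEL: CrispDm.MODELING,
    Semma.ASSESS: CrispDm.EVALUATION,
}


def uncovered_crisp_dm_phases():
    """Return the CRISP-DM phases that no SEMMA phase maps onto."""
    covered = set(SEMMA_TO_CRISP_DM.values())
    return [phase for phase in CrispDm if phase not in covered]


print([phase.name for phase in uncovered_crisp_dm_phases()])
# prints ['BUSINESS_UNDERSTANDING', 'DEPLOYMENT']
```

Under this assumed alignment, the two CRISP-DM phases left uncovered are the ones at the start and end of a project, which is where a broader framework adds the most to a SEMMA-style workflow.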