Properly evaluating the performance of machine learning models is crucial for identifying their strengths and weaknesses and for guiding the continuous fine-tuning that improves model quality. The metrics explored cover classification, object detection, and segmentation models: accuracy, precision, recall, F1-score, and the confusion matrix; IoU (Intersection over Union) and mAP (mean Average Precision); and pixel accuracy, mean IoU, the Dice coefficient, and pixel-wise cross-entropy. Each metric has its benefits and limitations, so choosing the right one is essential for making informed decisions about evaluating and improving AI models. Understanding these metrics helps developers select the performance evaluation method best suited to their specific use case.
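To make a few of these definitions concrete, here is a minimal sketch in plain Python. The function names (`classification_metrics`, `box_iou`, `mask_iou_and_dice`) and input conventions are assumptions chosen for illustration: binary labels are 0/1, boxes are `(x1, y1, x2, y2)` corner tuples, and masks are flat lists of 0/1 pixels. It is not taken from any particular library, and metrics such as mAP and pixel-wise cross-entropy are left out for brevity.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    # The four cells of the binary confusion matrix.
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def box_iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle, if there is one.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    intersection = max(0, ix2 - ix1) * max(0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union else 0.0


def mask_iou_and_dice(true_mask, pred_mask):
    """IoU and Dice coefficient for two flat binary masks of equal length."""
    intersection = sum(1 for t, p in zip(true_mask, pred_mask) if t and p)
    true_area = sum(1 for t in true_mask if t)
    pred_area = sum(1 for p in pred_mask if p)
    union = true_area + pred_area - intersection

    iou = intersection / union if union else 0.0
    total = true_area + pred_area
    dice = 2 * intersection / total if total else 0.0
    return iou, dice
```

Note how the Dice coefficient counts the intersection twice in its numerator, so for a partial overlap it is always higher than the IoU of the same pair of masks. This is one reason the two segmentation scores should not be compared directly.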