Structured-Then-Unstructured Pruning (STUN) is a two-phase approach for compressing Mixture-of-Experts (MoE) models: it first applies structured pruning to remove redundant experts, and then applies unstructured pruning to the weights inside each surviving expert. This ordering addresses the inefficiency and high computational cost of pruning very large MoE models such as Snowflake's Arctic, which has 128 experts. STUN shrinks the model while preserving performance, reaching high sparsity with little or no loss in accuracy even on demanding benchmarks such as GSM8K, and it outperforms both structured-only and unstructured-only pruning baselines. A key ingredient is exploiting the behavioral similarity between experts, which makes expert-level pruning decisions cheap and scalable (see the sketch below). The paper highlights STUN's generalizability to other MoE families, as well as hardware acceleration for models with unstructured sparsity, as promising directions for further improving memory access and processing efficiency.
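
To make the two-phase idea concrete, here is a minimal PyTorch sketch, not the paper's implementation. It assumes each expert is a `torch.nn.Module` and that a small calibration batch is available; the 0.9 similarity threshold is an arbitrary illustration, and simple magnitude pruning stands in for whatever unstructured criterion one prefers (e.g., Wanda or SparseGPT, which the actual method could plug in).

```python
import torch
import torch.nn.functional as F

def prune_moe_stun(experts, calib_x, keep_ratio=0.5, weight_sparsity=0.5):
    """Two-phase pruning sketch: drop behaviorally redundant experts,
    then zero out low-magnitude weights inside each surviving expert."""
    # Phase 1 (structured): compare experts by the similarity of their
    # outputs on a shared calibration batch.
    with torch.no_grad():
        outs = torch.stack([e(calib_x).flatten() for e in experts])  # [E, N*d]
        sim = F.cosine_similarity(outs.unsqueeze(1), outs.unsqueeze(0), dim=-1)
        sim.fill_diagonal_(-1.0)  # ignore self-similarity

    kept = []
    for idx in range(len(experts)):
        # Greedy rule: keep an expert only if it is not too similar to
        # any expert we have already kept (0.9 is an assumed threshold).
        if all(sim[idx, j] < 0.9 for j in kept):
            kept.append(idx)
    # Enforce the target expert count if the greedy pass kept too many.
    n_keep = max(1, int(len(experts) * keep_ratio))
    kept = kept[:n_keep]

    # Phase 2 (unstructured): magnitude-prune weights within kept experts.
    with torch.no_grad():
        for idx in kept:
            for p in experts[idx].parameters():
                k = int(p.numel() * weight_sparsity)
                if k == 0:
                    continue
                thresh = p.abs().flatten().kthvalue(k).values
                p.mul_((p.abs() > thresh).to(p.dtype))  # zero smallest weights
    return kept
```

In a real MoE layer the router would also need to be restricted to the kept expert indices, and the unstructured step would typically use a calibration-aware criterion rather than raw magnitudes; the sketch only illustrates why pruning experts first is cheap, since it needs just one forward pass per expert over the calibration batch.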