Run Gemma 4 on Intel® Arc™ GPUs Out-Of-the-Box

Blog post from HuggingFace

Post Details
Company: HuggingFace
Authors: Matrix Yao, Chendi Xue, FanZhao, Xinyu Chen, Alex Gu, Wuxun Zhang, Xinyi Li, jianan, Yi Wang, and Yintong Lu
Word Count: 1,495
Summary

Intel Arc GPUs, including the Intel Arc Pro B70/B65, are optimized for modern AI inference and offer increased memory capacity to simplify adoption. Intel prioritizes open-source AI frameworks such as PyTorch and Hugging Face Transformers so that new models work on Intel Xe GPUs from day zero. Gemma 4 uses several attention mechanisms and a highly optimized FusedMoE backend, both of which are supported on Intel hardware. Intel worked with the open-source community on kernel optimizations so that models like Gemma 4 run out of the box on Xe GPUs. The article also covers environment setup and shows how to run the model with vLLM and Hugging Face Transformers, demonstrating text generation, image captioning, and audio captioning in various configurations on Intel GPUs.
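
As a rough illustration of the Transformers path mentioned in the summary, the sketch below picks PyTorch's "xpu" device when an Intel GPU is visible and falls back to CPU otherwise. The helper names pick_device and generate are hypothetical and do not come from the original post. The sketch also assumes the checkpoint loads through the generic "text-generation" pipeline, which the summary does not confirm. The model ID is deliberately left as an argument.

```python
def pick_device() -> str:
    """Return "xpu" when PyTorch reports an Intel GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so no accelerator can be used.
        return "cpu"
    # torch.xpu is only present in PyTorch builds with Intel GPU support.
    xpu = getattr(torch, "xpu", None)
    if xpu is not None and xpu.is_available():
        return "xpu"
    return "cpu"


def generate(model_id: str, prompt: str, max_new_tokens: int = 64) -> str:
    """Run a short text generation on the selected device.

    Assumption: the checkpoint named by model_id works with the generic
    "text-generation" pipeline. model_id must be supplied by the caller.
    """
    from transformers import pipeline  # imported lazily; the dependency is heavy

    pipe = pipeline("text-generation", model=model_id, device=pick_device())
    outputs = pipe(prompt, max_new_tokens=max_new_tokens)
    return outputs[0]["generated_text"]


if __name__ == "__main__":
    print("Selected device:", pick_device())
```

Calling generate with the model ID given in the original post and a prompt would run on the Arc GPU when one is available. The CPU fallback keeps the same script usable on machines without Intel GPU support.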