Company
Date Published
Author
Pankaj Gupta, Philip Kiely
Word count
1021
Language
English
Hacker News points
2

Summary

FP8 is an 8-bit floating-point data format that enables more efficient model inference. Its larger dynamic range compared with INT8 makes it suitable for quantizing LLM activations, yielding performance gains without significant degradation of output quality.
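
To make the dynamic-range claim concrete, the following is a minimal sketch that compares the ratio between the largest magnitude and the smallest positive value each format can represent. The summary does not say which FP8 variant is meant, so the E4M3 and E5M2 encodings used here are an assumption, and the names `FORMATS` and `dynamic_range` are illustrative only.

```python
import math

# (largest finite magnitude, smallest positive representable value)
# Assumption: FP8 limits are those of the common E4M3 and E5M2 encodings;
# the summary above does not name a specific variant.
FORMATS = {
    "INT8": (127.0, 1.0),
    "FP8 E4M3": (448.0, 2.0 ** -9),     # 4 exponent bits, 3 mantissa bits
    "FP8 E5M2": (57344.0, 2.0 ** -16),  # 5 exponent bits, 2 mantissa bits
}


def dynamic_range(max_value: float, min_positive: float) -> float:
    """Ratio of the largest to the smallest nonzero representable magnitude."""
    return max_value / min_positive


if __name__ == "__main__":
    for name, (max_value, min_positive) in FORMATS.items():
        ratio = dynamic_range(max_value, min_positive)
        # log2 of the ratio gives the range in powers of two (octaves).
        print(f"{name:9s} ratio = {ratio:.0f}  (about 2^{math.log2(ratio):.1f})")
```

Both FP8 variants span far more orders of magnitude than INT8, which is why FP8 can absorb the outlier values typical of activations, at the cost of coarser spacing between large values.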