Company
Date Published
Author
Daniel Timbrell
Word count
494
Language
-
Hacker News points
None

Summary

Lakera has developed an implementation of OpenAI's CLIP model that runs without PyTorch, which makes it easier to deploy in production and on edge devices. CLIP, which scores how well images match text descriptions, typically depends on PyTorch for its three main components: the text tokenizer, the image preprocessor, and the model itself, which outputs cosine similarities between text and image embeddings. Lakera rewrote the text tokenizer in NumPy, built a custom image preprocessor, and exported the CLIP model to the ONNX format (.onnx), so that the more lightweight onnxruntime can replace PyTorch at inference time. The result is a CLIP pipeline that no longer needs PyTorch installed to run.
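
To make the PyTorch-free idea concrete, here is a minimal NumPy-only sketch of two of the pieces mentioned above: an image preprocessing step and the cosine similarity between embeddings. It is an illustration under stated assumptions, not Lakera's code. The function names (`center_crop`, `preprocess`, `cosine_similarity`) are hypothetical, the 224-pixel input size is assumed (it varies by CLIP variant), and the resize step that normally precedes the crop is omitted to keep the example dependency-free. The mean and standard deviation values are the normalization constants published with OpenAI's CLIP.

```python
import numpy as np

# Per-channel normalization constants published with OpenAI's CLIP.
CLIP_MEAN = np.array([0.48145466, 0.4578275, 0.40821073], dtype=np.float32)
CLIP_STD = np.array([0.26862954, 0.26130258, 0.27577711], dtype=np.float32)


def center_crop(image: np.ndarray, size: int) -> np.ndarray:
    """Crop the central size x size window from an H x W x C image."""
    h, w = image.shape[:2]
    if h < size or w < size:
        raise ValueError("image is smaller than the crop size")
    top = (h - size) // 2
    left = (w - size) // 2
    return image[top:top + size, left:left + size]


def preprocess(image: np.ndarray, size: int = 224) -> np.ndarray:
    """Convert a uint8 H x W x 3 RGB image to a 1 x 3 x size x size float32 batch.

    Assumption: the image has already been resized so its shorter side
    is at least `size`; resizing is left out of this sketch.
    """
    cropped = center_crop(image, size).astype(np.float32) / 255.0
    normalized = (cropped - CLIP_MEAN) / CLIP_STD
    # Channels-last to channels-first, plus a leading batch dimension.
    return np.ascontiguousarray(normalized.transpose(2, 0, 1)[np.newaxis, ...])


def cosine_similarity(image_emb: np.ndarray, text_emb: np.ndarray) -> np.ndarray:
    """Return an (n_images, n_texts) matrix of cosine similarities."""
    image_emb = image_emb / np.linalg.norm(image_emb, axis=-1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=-1, keepdims=True)
    return image_emb @ text_emb.T
```

In a full pipeline, the array returned by `preprocess` would be passed, together with tokenized text, to the exported model through onnxruntime; that step is not shown here because it depends on the input and output names of the particular .onnx export.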