| Juggernaut FLUX is live on DeepInfra! |
Oguz Vuruskaner |
Mar 25, 2025 |
349 |
- |
| Enhancing Open-Source LLMs with Function Calling Feature |
Pernekhan Utemuratov |
Jan 26, 2024 |
1025 |
- |
| Guaranteed JSON output on Open-Source LLMs. |
Patrick Reiter Horn |
Mar 08, 2024 |
624 |
- |
| How to use CivitAI LoRAs: 5-Minute AI Guide to Stunning Double Exposure Art |
Oguz Vuruskaner |
Jan 23, 2025 |
391 |
- |
| Introducing Tool Calling with LangChain, Search the Web with Tavily and Tool Calling Agents |
Oguz Vuruskaner |
Jul 05, 2024 |
583 |
- |
| FLUX.1-dev Guide: Mastering Text-to-Image AI Prompts for Stunning and Consistent Visuals |
Oguz Vuruskaner |
Sep 04, 2024 |
1276 |
- |
| How to deploy Databricks Dolly v2 12b, instruction tuned causal language model. |
Yessen Kanapin |
Apr 12, 2023 |
349 |
- |
| A Milestone on Our Journey Building Deep Infra and Scaling Open Source AI Infrastructure |
Yessen Kanapin |
Apr 22, 2025 |
589 |
- |
| Model Distillation Making AI Models Efficient |
Deep |
Apr 10, 2025 |
1426 |
- |
| Fork of Text Generation Inference. |
Nikola Borisov |
Aug 09, 2023 |
417 |
- |
| Getting Started |
Nikola Borisov |
Mar 02, 2023 |
278 |
- |
| Long Context models incoming |
Iskren Chernev |
Nov 21, 2023 |
628 |
- |
| The easiest way to build AI applications with Llama 2 LLMs. |
Nikola Borisov |
Aug 02, 2023 |
603 |
- |
| A short intro on running Stable Diffusion on DeepInfra |
Iskren |
Mar 08, 2023 |
218 |
- |
| Use OpenAI API clients with LLaMas |
Iskren Chernev |
Aug 28, 2023 |
343 |
- |
| Inference LoRA adapter model |
Askar Aitzhan |
Dec 06, 2024 |
459 |
- |
| Unleashing the Potential of AI for Exceptional Gaming Experiences |
Tsveta Gavanozova |
Nov 10, 2023 |
500 |
- |
| Chat with books using DeepInfra and LlamaIndex |
Oguz Vuruskaner |
Jun 07, 2024 |
565 |
- |
| Seed Anchoring and Parameter Tweaking with SDXL Turbo: Create Stunning Cubist Art |
Oguz Vuruskaner |
Sep 12, 2024 |
1233 |
- |
| Deploy Custom LLMs on DeepInfra |
Iskren Chernev |
Mar 01, 2024 |
276 |
- |
| Introducing GPU Instances: On-Demand GPU Compute for AI Workloads |
Deep |
Jun 09, 2025 |
792 |
- |
| How to use OpenAI Whisper with per-sentence and per-word timestamp segmentation using DeepInfra |
Yessen Kanapin |
Apr 05, 2023 |
323 |
- |
| Building a Voice Assistant with Whisper, LLM, and TTS |
Askar Aitzhan |
Sep 20, 2024 |
748 |
- |
| Search That Actually Works: A Guide to LLM Rerankers |
Deep |
Sep 10, 2025 |
2122 |
- |
| Lzlv model for roleplaying and creative work |
Nikola Borisov |
Nov 02, 2023 |
532 |
- |
| Compare Llama2 vs OpenAI models for FREE. |
Nikola Borisov |
Sep 28, 2023 |
406 |
- |
| Langchain improvements: async and streaming |
Iskren Chernev |
Oct 25, 2023 |
292 |
- |
| How to deploy google/flan-ul2 - simple. (open source ChatGPT alternative) |
Nikola Borisov |
Mar 17, 2023 |
495 |
- |
| Art That Talks Back: A Hands-On Tutorial on Talking Images |
Oguz Vuruskaner |
Mar 07, 2025 |
591 |
- |
| Deep Infra Launches Access to NVIDIA Nemotron Models for Vision, Retrieval, and AI Safety |
Yessen Kanapin |
Oct 28, 2025 |
814 |
- |
| How to deploy Databricks Dolly v2 12b, instruction tuned causal language model. |
Yessen Kanapin |
Apr 12, 2023 |
541 |
- |