HuggingFace on HN

1107 posts with 1+ points since 2022

Hacker News Posts
Title Points Comments Date
DeepSeek-v3.2: Pushing the frontier of open large language models [pdf] 978 -- 2025-12-01
Uncensor any LLM with abliteration 586 -- 2024-06-13
Show HN: Sweep, Open-weights 1.5B model for next-edit autocomplete 530 -- 2026-01-21
Deepseek R1-0528 451 -- 2025-05-28
Llama-3.3-70B-Instruct 425 -- 2024-12-06
Try Stable Diffusion's Img2Img Mode 415 -- 2022-08-29
Open-R1: an open reproduction of DeepSeek-R1 394 -- 2025-01-28
Smollm3: Smol, multilingual, long-context reasoner LLM 388 -- 2025-07-08
GLM-4.7-Flash 371 -- 2026-01-19
Nanonets-OCR-s – OCR model that transforms documents into structured markdown 361 -- 2025-06-16
A Replacement for BERT 348 -- 2024-12-19
MonadGPT – What would have happened if ChatGPT was invented in the … 323 -- 2023-11-24
Apertus 70B: Truly Open - Swiss LLM by ETH, EPFL and CSCS 319 -- 2025-09-02
DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning 263 -- 2025-12-01
The Smol Training Playbook: The Secrets to Building World-Class LLMs 262 -- 2025-10-30
LLM in a Flash: Efficient LLM Inference with Limited Memory 252 -- 2023-12-20
Microsoft Phi-2 model changes licence to MIT 240 -- 2024-01-06
Falcon 180B 238 -- 2023-09-06
OpenLLaMA 13B Released 229 -- 2023-06-18
Kokoro WebGPU: Real-time text-to-speech 100% locally in the browser 227 -- 2025-02-07
Hugging Face Releases Agents 214 -- 2023-05-10
Space secrets leak disclosure 197 -- 2024-06-01
BigCode Project Releases StarCoder: A 15B Code LLM 185 -- 2023-05-04
Best 7B LLM on leaderboards made by an amateur following a medium … 181 -- 2024-01-05
Stability.ai sent a take down request to Runway ML's SD v1.5 citing … 179 -- 2022-10-20
We raised $100M for open and collaborative machine learning 175 -- 2022-05-09
Llama 3 8B is almost as good as Wizard 2 8x22B 168 -- 2024-04-19
SantaCoder: A new 1.1B code model for generation and infilling 168 -- 2022-12-22
Nvidia releases NVLM 1.0 72B open weight model 167 -- 2024-10-02
Qwen3-4B-Thinking-2507 166 -- 2025-08-06
StackLlama: A hands-on guide to train LlaMa with RLHF 165 -- 2023-04-06
Explaining the SDXL Latent Space 163 -- 2024-02-05
BLOOM: The largest open multilingual language model 160 -- 2022-07-12
Show HN: Text-to-video model from scratch (2 brothers, 2 years, 2B params) 156 -- 2026-01-22
Hugging Face and Google partner for AI collaboration 152 -- 2024-01-25
Qwen3-235B-A22B-Thinking-2507 152 -- 2025-07-25
Show HN: Penny-1.7B Irish Penny Journal style transfer 149 -- 2025-06-02
Wordalle – Guess the prompt used to generate a set of images … 137 -- 2022-07-01
Mistral-8x7B-Chat 131 -- 2023-12-10
A CC-By Open-Source TTS Model with Voice Cloning 131 -- 2024-11-04
Qwen-Image-Layered: transparency and layer aware open diffusion model 130 -- 2025-12-19
FineWeb: Decanting the web for the finest text data at scale 127 -- 2024-06-02
Yi-34B-Chat 115 -- 2023-11-24
GPT-3.5 and Wolfram Alpha via LangChain 107 -- 2023-01-18
The Falcon has landed in the Hugging Face ecosystem 105 -- 2023-06-05
HuggingChat: Chat with Open Source Models 103 -- 2024-02-21
Hugging Face and AWS partner to make AI more accessible 102 -- 2023-02-21
HuggingFace Training Cluster as a Service 101 -- 2023-09-05
More than 80 AI models from Qualcomm 95 -- 2024-02-28
Segmind Stable Diffusion – A smaller version of Stable Diffusion XL 95 -- 2023-10-25
LLaMA-Pro-8B 94 -- 2024-01-06
HuggingChat 93 -- 2023-04-25
Yarn-Mistral-7B-128k 88 -- 2023-11-11
Qwen3 30B-A3B 87 -- 2025-07-30
Apple/OpenELM: Efficient Open-Source Family Language Models 82 -- 2024-04-24
Sparse LLM Inference on CPU: 75% fewer parameters 78 -- 2023-10-19
Pokemon GAN 77 -- 2022-02-14
YouTube-Commons: Audio transcripts of 2,063,066 YouTube videos, CC-By license 75 -- 2024-04-18
Switch Transformers C – 2048 experts (1.6T params for 3.1 TB) (2022) 73 -- 2023-11-20
Multimodal Neurons in Pretrained Text-Only Transformers 66 -- 2023-08-04
Show HN: Simply Reading Analog Gauges – GPT4, CogVLM Can't 66 -- 2024-01-22
Voxtral-Mini-3B-2507 – Open source speech understanding model 64 -- 2025-07-15
Open-sourcing 5,000hrs of self-driving dataset 63 -- 2025-03-11
HuggingChat – ChatGPT alternative with open source models 61 -- 2023-12-15
MSFT's WizardLM2 models have been taken down 58 -- 2024-04-16
OpenLLaMA 7B Training Completed to 1T Tokens 58 -- 2023-06-07
Phi-2 57 -- 2023-12-13
Dolphin-2_6-Phi-2 56 -- 2023-12-24
Alibaba releases 72B LLM with 32k context length 55 -- 2023-11-30
LiteLlama-460M-1T has 460M parameters trained with 1T tokens 54 -- 2024-01-07
Qwen Image 54 -- 2025-08-04
Fine-Tuning LLMs to 1.58bit 52 -- 2024-09-18
Train faster static embedding models with sentence transformers 52 -- 2025-01-15
Show HN: ChatToSTL – AI text-to-CAD for 3D printing 52 -- 2025-06-12
LLaMA 3 70B Llamafiles 51 -- 2024-04-19
Janus-Pro: Autoregressive framework unifying multimodal understanding&generation 49 -- 2025-01-27
DeepSeek v3 beats Claude sonnet 3.5 and way cheaper 48 -- 2024-12-26
Improving Parquet Dedupe on Hugging Face Hub 47 -- 2024-10-08
Open LLAMA 13B released, trained on 1T tokens 47 -- 2023-06-19
DALL·E Mini 46 -- 2022-04-11
Open-LLM performances are plateauing 46 -- 2024-06-29
The AI Research Residency Program 46 -- 2022-03-23
4-Bit Quantization and QLoRA 41 -- 2023-05-25
BLOOMChat, a 176B parameter, Multi-lingual, fine tuned chat 40 -- 2023-05-19
What's Going on with the Open LLM Leaderboard? 40 -- 2023-06-23
Kai-Fu Li's Yi-34B uses exactly Llama's architecture except for 2 tensor renamed 39 -- 2023-11-14
DeepSeek-R1-Distill-Qwen-1.5B Surpasses GPT-4o in certain benchmarks 39 -- 2025-01-20
Fully autonomous AI agents should not be developed 38 -- 2025-02-07
Zephyr 7B – Mistral Finetune that responds like ChatGPT 37 -- 2023-10-15
Whisper Jax: Transcribe a 1 hour of audio in under 15 seconds 36 -- 2023-04-22
Qwen3-235B-A22B-Instruct-2507 36 -- 2025-07-21
MistralLite by Amazon Web Services 34 -- 2023-11-01
Mixtral-8x22B on HuggingFace 33 -- 2024-04-10
The Ultra-Scale Playbook: Training LLMs on GPU Clusters 33 -- 2025-02-19
Qwen3-Coder-30B-A3B-Instruct 32 -- 2025-07-31
General OCR Theory: Towards OCR-2.0 via a Unified End-to-End Model 31 -- 2024-09-11
Zephyr 141B, a Mixtral 8x22B fine-tune, is now available in Hugging Chat 30 -- 2024-04-12
OpenFLUX.1 30 -- 2024-10-04
Reachy Mini – The Open-Source Robot for Today's and Tomorrow's AI Builders 30 -- 2025-07-09
Mistral 7B v0.2 29 -- 2024-03-31
Mixture of Experts Explained 29 -- 2023-12-11
TinyLlama at 2T of 3T 29 -- 2023-11-19
Video2Game: Real-Time, Interactive, Realistic Environment from a Single Video 28 -- 2024-04-16
Real-Time Latent Consistency Model 27 -- 2023-10-30
Language Modeling Is Compression 27 -- 2023-09-21
grok-2 on Hugging Face 27 -- 2025-08-23
Llama-3.2-3B-Instruct-uncensored 26 -- 2024-09-27
Pixel Art XL: Stable Diffusion XL for Pixel Art 26 -- 2023-08-03
UC Berkeley's open-source Vicuna LLM chatbot released new improved model weights 26 -- 2023-04-14
Llama can now see and run on your device – welcome Llama … 26 -- 2024-09-25
DeepSeek-v3.1 26 -- 2025-08-21
Llama 1.3B Trained on 200B Tokens for Commercial Use 25 -- 2023-04-28
New Phi-3.5 Models from Microsoft, including new MoE 25 -- 2024-08-20
LLM: Transformer Is Linear 25 -- 2024-05-24
DeepSeek-v3.1-Base 25 -- 2025-08-19
NousResearch/Nous-Hermes-2-Yi-34B 24 -- 2023-12-26
Accelerating Stable Diffusion XL Inference with Jax on Cloud TPU v5e 23 -- 2023-10-03
HuggingFace - Tencent launches Hunyuan Large which outperforms Llama 3.1 405B 23 -- 2024-11-05
Mistral Small 3.2 (24B-Instruct-2506) 23 -- 2025-06-20
DeepSeek-v3.1 23 -- 2025-08-19
Lineage Explorer for open source models – Hugging Face Space 22 -- 2024-01-18
Llama 22B: 13B V2 with 33B attention heads frankensteined on 22 -- 2023-08-18
Show HN: Fineweb-Edu-Fortified dataset: Fineweb-Edu deduped, embeddings included 22 -- 2024-08-14
Mistral-7B-OpenOrca. First 7B model to beat all other models <30B 21 -- 2023-10-02
Würstchen: Fast Diffusion for Image Generation 21 -- 2023-09-13
Llama 3.2 21 -- 2024-09-25
Kyutai 1.6B Streaming TTS 21 -- 2025-07-03
Qwen3 235B beats Claude on some code benchmarks 21 -- 2025-07-21
Code Generation with HuggingFace 20 -- 2022-06-07
Selene Mini: Open-sourced SOTA small language-model-as-a-judge 20 -- 2025-01-29
Ernie-ViLG better anime quality than Stable Diffusion 19 -- 2022-09-01
Fine-tune and deploy open LLMs as containers using AIKit - Part 1 19 -- 2024-06-06
makeMoE: Implement a Sparse Mixture of Experts LLM from Scratch 19 -- 2024-01-23
AMD and: Large Language Models Out-of-the-Box Acceleration with AMD GPU 19 -- 2023-12-13
The smallest VLM ever: 250M parameters 19 -- 2025-01-23
This Pokémon Does Not Exist: Using AI models to create fake cards … 18 -- 2022-03-22
HuggingFace to Replace Git LFS with Xet 18 -- 2024-08-23
GPT-NeoX 18 -- 2022-12-14
Fake Insects: a game where you have to identify AI-generated insects 18 -- 2024-08-17
Mixtral-8x22B-Instruct-v0.1 18 -- 2024-04-17
Stable Diffusion Multiplayer 18 -- 2022-10-30
Encrypted Large Language Models with Homomorphic Encryption 18 -- 2023-08-03
Hermes-2-Pro-Llama-3-8B 18 -- 2024-05-01
Orca 2: Teaching Small Language Models How to Reason 18 -- 2023-11-21
Deepseek V3-0324 18 -- 2025-03-24
Show HN: MiniSearch, a minimalist search engine with integrated browser-based AI 17 -- 2023-10-15
StableLM-2-12B 17 -- 2024-04-08
Gemini vs. GPT-4V: A Preliminary Comparison Through Qualitative Cases 17 -- 2023-12-28
Una-Cybertron-7B 17 -- 2023-12-08
GPT Baker lets you build your own open-source GPTs 17 -- 2023-11-23
Deploy Livebook (Elixir) Notebooks as Apps to Hugging Face Spaces 17 -- 2023-06-15
ChatRWKV 17 -- 2023-03-23
DeepSeek R1 17 -- 2025-01-20
Vector Search with DuckDB 17 -- 2025-02-26
DiffuCoder-7B-CpGRPO: A code generation LLM developed by Apple 17 -- 2025-07-04
NuExtract: A LLM for Structured Extraction 16 -- 2024-06-29
An Analysis of Chinese LLM Censorship and Bias with Qwen 2 Instruct 16 -- 2024-06-09
Phi-3 Weights Released 16 -- 2024-04-23
New medical LLM beats Med-PaLM-2, GPT-4 on MMLU benchmarks 16 -- 2024-07-31
Miqu 70B – possible leak of the mistral-medium LLM 16 -- 2024-01-29
New Stable Diffusion model trained on high quality Art 16 -- 2022-12-11
Qwen3 0.6B now on HuggingFace (quantized) 16 -- 2025-04-28
Ollama can run any GGUF Model on Hugging Face Hub now 15 -- 2024-10-16
Llama-3-70B-Instruct-Gradient-1048k 14 -- 2024-05-04
New finance LLM passed the CFA Level III exam 14 -- 2024-07-31
Airoboros-13B: 98% against GPT-3.5 14 -- 2023-05-22
Run Mistral 7B model using less than 4GB of memory on your … 14 -- 2024-07-23
Stable Diffusion 3 Medium Released 14 -- 2024-06-12
Pre-computed vector embeddings available on HuggingFace 14 -- 2024-01-22
TeapotLLM- an open-source <1B model for hallucination-resistant Q&A on a CPU 14 -- 2025-04-16
DeepSeek-Prover-V2-671B 14 -- 2025-04-30
DeepSeek-R1-0528 performance improvements 14 -- 2025-05-29
Create a GPT3 powered Q&A Chatbot for *any* GitHub repo by posting … 13 -- 2023-02-05
Yi-9B-200K 13 -- 2024-03-17
An Introduction to Vision-Language Modeling 13 -- 2024-05-28
Co-Doodle with Gemini 13 -- 2025-03-19
Attention Sinks in LLMs for endless fluency 12 -- 2023-10-09
FineWeb: 15T tokens of the finest data the web has to offer 12 -- 2024-04-21
Idefics: Open Access 60B multimodal model 12 -- 2023-08-22
Google AI just released Flan-T5 models 12 -- 2022-10-24
Language model can listen while speaking 12 -- 2024-08-07
ML for 3D Course on Hugging Face 12 -- 2024-05-16
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs 12 -- 2024-04-09
Command-R: open weights 35B params / 128k tokens context length model by … 12 -- 2024-03-11
StarCoder2 and The Stack v2: new code LLMs and dataset 12 -- 2024-02-28
Jamba-v0.1: An Apache 2.0 licensed 52B Mamba Transformer hybrid LLM base model 12 -- 2024-03-28
Stable difusion on multiplayer: Internet at it best 12 -- 2022-10-30
Open-source DeepResearch – Freeing our search agents 12 -- 2025-02-04
FUTO open-sources 1M row keyboard swipe dataset 12 -- 2025-04-04
HuggingFace Is Down 11 -- 2024-02-28
30B uncensored OSS model with no guardrails 11 -- 2023-11-07
The Stack: 3 TB of permissively licensed source code in 30 programming … 11 -- 2022-10-31
Experiments with Bitnet 1.5 (Ngmi) 11 -- 2024-03-23
Hierarchical Masked 3D Diffusion Model for Video Outpainting 11 -- 2023-09-06
FalconMamba 7B: The first attention-free and general-purpose pure Mamba model 11 -- 2024-08-13
NPC-Playground, a 3D playground to interact with LLM-powered NPCs 11 -- 2024-06-05
Open LLM Leaderboard 11 -- 2024-01-02
Shallow Feed-Forward Neural Networks as Alternative to Attention in Transformers 11 -- 2023-11-21
smolagents: A simple library to build AI agents 11 -- 2025-01-02
DeepSeek-TNG-R1T2-Chimera 11 -- 2025-07-02
CryptGPT: A Simple Approach to Privacy-Preserving LLMs Using Vigenere Cipher 10 -- 2024-06-15
Whisperfile 10 -- 2024-08-19
Llava Model for Video 10 -- 2024-05-16
Show HN: Encrypted Credit Card Approval Using Homomorphic Encryption 10 -- 2024-01-31
Vector embeddings model for medical literature 10 -- 2024-01-08
From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting 10 -- 2023-09-11
Origin of LLMs: An Evolutionary Tree and Graph for 15K Large Language … 10 -- 2023-07-20
Show HN: Image Filtering App Using Homomorphic Encryption 10 -- 2023-02-23
CMFNet: AI Image Deblurring 10 -- 2022-02-27
Show HN: Downloadable AI Musical Instruments 10 -- 2024-12-10
Phi-4 weights have been released under MIT license 10 -- 2025-01-08
Hugging Face to sell open-source robots thanks to Pollen Robotics acquisition 10 -- 2025-04-23
Open Source 1.7tb Dataset of What AI Crawlers Are Doing 10 -- 2025-07-03
Parquet Content-Defined Chunking 10 -- 2025-09-09
Wan2.2-S2V-14B – audio-driven cinematic video generation model 10 -- 2025-08-26
Not All Language Model Features Are Linear 9 -- 2024-05-25
Nvidia releases weights for Llama-3.1-Nemotron-70B-Instruct 9 -- 2024-10-16
Stable Diffusion XL Inpainting model released 9 -- 2023-09-01
Opentensor and Cerebras announce BTLM-3B-8K, a leading 3B param. language model 9 -- 2023-07-24
Perspectives for first principles prompt engineering 9 -- 2024-08-20
ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models 9 -- 2024-05-28
Argilla released Notux 8x7B - DPO fine-tune of Mixtral 8x7B 9 -- 2024-01-04
LLM Arena. Mistral-small best open model. Gemini Pro beaten by 2 open … 9 -- 2023-12-17
Meta-llama (Meta Llama 2) 9 -- 2023-07-18
Summary of the Tokenizers 9 -- 2023-02-07
Show HN: Sentiment Analysis on Encrypted Data with Homomorphic Encryption 9 -- 2022-11-21
RunwayML fine tuned Stable Diffusion 1.5 model 9 -- 2022-10-20
Mistral-Large-Instruct-2411 – advanced dense Large Language Model (LLM) 123B 9 -- 2024-11-18
MIT Researchers Unveil New Method to Improve LLM Inference Performance 9 -- 2024-10-04
Aryn/deformable-detr-DocLayNet – open-source Layout Model 9 -- 2024-07-31
AIMO (AI Math Olympiad) progress prize winning solution 9 -- 2024-07-10
Mistral-7B-v0.3 released on HuggingFace 9 -- 2024-05-22
Microsoft Phi-3 3.8B model with 128k Context 9 -- 2024-04-23
The Stack v2: a 3B files in 600 programming languages dataset 9 -- 2024-03-07
Spaces ZeroGPU: Dynamic GPU Allocation for Spaces 9 -- 2024-12-15
Show HN: A Transformer model that preserves logical equivalence 9 -- 2025-03-02
NousResearch/Nous-Hermes-2-Llama-2-70B 8 -- 2024-02-12
Gradio-Lite: Serverless Gradio Running in the Browser 8 -- 2023-10-25
Show HN: Parley: The RPG where you Negotiate with Bandits 8 -- 2023-04-26
Show HN: We made an encrypted DNA testing app using Homomorphic Encryption 8 -- 2024-10-02
NexusRaven-V2-13B 8 -- 2024-01-25
Generate 1 page comic by text 8 -- 2023-09-03
Drag Your GAN: Interactive Point-Based Manipulation on Generative Image Manifold 8 -- 2023-05-23
Open-source 70B model surpass GPT-4o and Claude 3.5 on Arena Hard 8 -- 2024-10-15
Llama 3.1 70B compressed by 6.4x using AQLM-PV, now released 8 -- 2024-09-17
Mistral AI Pixtral 8 -- 2024-09-11
Gradio Notebook – Generative AI Notebook Interface for Hugging Face Spaces 8 -- 2024-02-14
Show HN: Open-source model to chat with your documents/data 8 -- 2023-08-14
Yes, Transformers Are Effective for Time Series Forecasting (+ Autoformer) 8 -- 2023-06-25
Hugging Face OpenAssistant 8 -- 2023-06-24
Dataset of 35,316,999 HackerNews Posts and Comments (2006 – 2023) 8 -- 2023-04-24
Show HN: Athelas – Automagically Repair Broken Code 8 -- 2023-01-03
Scaling Test Time Compute with Open Models 8 -- 2024-12-16
Sesame CSM-1B: Open-Source Conversational Speech Model 8 -- 2025-03-14
DeepSeek-Prover-V2-671B 8 -- 2025-04-30
Model Context Protocol (MCP) Course 8 -- 2025-05-21
Tencent's Hunyuan Instruct 7B/4B/1.8B/0.5B new models have been released 8 -- 2025-08-04
MistralAI released a new Magistral Small 2509 8 -- 2025-09-17
Phi-3 Technical a Highly Capable Language Model Locally on Your Phone 7 -- 2024-04-23
TinyStories: How Small Can Language Models Be and Still Speak Coherent English? 7 -- 2023-05-16
Am I in the Stack? 7 -- 2024-03-20
Common Corpus: the largest public domain dataset for training LLMs 7 -- 2024-03-20
Introducing “Clerkie“: A LangChain Q&A bot for AI developers 7 -- 2023-01-18
Show HN: Step up your Midjourney AI images with this prompt autocomplete 7 -- 2022-09-10
Hugging Face launches Agents 2.0 7 -- 2024-05-13
OpenHermesPreferences: Dataset of ~1M AI preferences from teknium/OpenHermes-2.5 7 -- 2024-02-26
Microsoft's Orca 7B may violate OpenAI's Terms of Use 7 -- 2023-12-05
Stable Beluga 2 – Llama2 70B finetuned on an Orca style Dataset … 7 -- 2023-07-28
Databricks’ dolly-v2-12B, an instruction-following large language model 7 -- 2023-04-12
Cerebras releases its own open source GPT models (Apache 2.0 License) 7 -- 2023-03-28
Mini- Dust3r: A miniature version of dust3r running in a HuggingFace Space 7 -- 2024-05-16
1B+ words corpus of original texts and experimental post-OCR correction output 7 -- 2024-04-26
Show HN: Chess-LLM, using constrained-generation to force LLMs to battle it out 7 -- 2024-03-14
Grandmaster-Level Chess Without Search 7 -- 2024-02-08
Create a Web Interface for Your LLM in Python 7 -- 2024-01-23
Show HN: Interactively explore your Hugging Face dataset with one line of … 7 -- 2023-10-25
Show HN: DocQuery, an OSS tool for analyzing documents with LLMs 7 -- 2022-09-01
Hugging Face datasets and models for cybersecurity/sofwtare vulnerabilities 7 -- 2025-03-09
ByteDance/Dolphin on HuggingFace 7 -- 2025-05-19
Holo1.5: Foundational Models for Computer Use Agents 7 -- 2025-09-15
LFM2 WebGPU 7 -- 2025-08-06
OpenAI/GPT-OSS-120B · Hugging Face 7 -- 2025-08-05
CodeFusion: A Pre-Trained Diffusion Model for Code Generation 6 -- 2023-10-30
OpenChat 3.5: 7B model with comparable perf to ChatGPT 6 -- 2023-11-02
New leaderboard drop: Judge Arena 6 -- 2024-11-19
Phased Consistency Model 6 -- 2024-05-29
Generate Illusions with Stable Diffusion 6 -- 2023-09-16
Mann-E, an open source Equivalent of Midjourney reached its version 4.1.3 6 -- 2023-03-04
A Llama 70B finetune that has reflection baked into it's weights 6 -- 2024-09-05
Show HN: Understand politics by visualising manifesto embeddings 6 -- 2024-07-07
Mistral releases the v0.3 of its 7B LLM 6 -- 2024-05-22
Idefics2: A Powerful 8B Vision-Language Model for the Community 6 -- 2024-05-14
Show HN: Open-source LLM for data labeling 6 -- 2024-05-08
Dolphin-2.9-Llama3-8B 6 -- 2024-04-21
Introduction to 3D Gaussian Splatting 6 -- 2024-04-02
Qwen is a large language model series by Alibaba Cloud 6 -- 2023-09-27
Show HN: TCO Calculator to compare on-prem LLM deployment vs. OpenAI and … 6 -- 2023-08-21
Llama-2-70B-instruct-v2 6 -- 2023-08-03
Falcon 40B-Instruct GGML 6 -- 2023-06-15
RWKV – An RNN with the Advantages of a Transformer 6 -- 2023-05-15
Assisted Generation: a new direction toward low-latency text generation 6 -- 2023-05-11
Databricks Publishes a Version of Dolly LLM to Hugging Face 6 -- 2023-03-30
Hugging Face introduces Pull Requests and Discussions 6 -- 2022-05-25
Kokoro-TTS 6 -- 2025-01-13
Microsoft Phi 4 with R1 Reasoning 6 -- 2025-02-04
DeepSeek-R1 without CCP censorship 6 -- 2025-02-20
More Efficient Chain-of-Thought Reasoning Through Certainty Probing 6 -- 2025-02-18
SigLIP 2: A better multilingual vision language encoder 6 -- 2025-02-22
Qwen2.5-Omni Technical Report 6 -- 2025-03-30
Better than DeepSeek R1? MiniMax-M1:open-weight hybrid-attention reasoning model 6 -- 2025-06-16
Show HN: Agent Leaderboard 2.0 – Domain Specific edition 6 -- 2025-07-17
Apple releases FastVLM and MobileCLIP2 on HF, real-time video captioning 6 -- 2025-08-30
Show HN: We built a better reranker and open sourced it 6 -- 2025-08-27
Nvidia STT Parakeet v3 6 -- 2025-08-15
First 70B model released with all training epochs and data 6 -- 2025-09-12
Qwen3-Next series represents our next-generation foundation models 6 -- 2025-09-12
Qwen Image Edit - SOTA Open Weight Image Editing Model 6 -- 2025-08-18
Cybersecurity Instruction Tuned Model 6 -- 2025-08-05
TinyLlama a 1.1B Llama model trained on 3T tokens reaches 1.0 release 5 -- 2023-12-31
Gemma-2 2B beats GPT3.5 on Chatbot Arena 5 -- 2024-07-31
FineWeb-Edu: new 1.3T tokens web dataset 5 -- 2024-06-02
Wall Street Journal Hedcut Stable Diffusion Model 5 -- 2024-01-23
New Mixtral HQQ Quantzied 4-bit/2-bit configuration 5 -- 2023-12-18
Personal co-pilot with a fine-tuning and a VSCode extension 5 -- 2023-10-31
Segment Anything Model (Sam) in the Browser with Rust and WASM 5 -- 2023-09-16
SD-XL 1.0 Model Card 5 -- 2023-07-26
AI Policy: Open ML Considerations in the EU AI Act 5 -- 2023-07-26
Modified Version of Apache 2.0 License with Royalty Payments 5 -- 2023-05-26
Creating a Coding Assistant with StarCoder 5 -- 2023-05-10
CLIP Interrogator 5 -- 2022-10-22
Blip: Image Captioning and Visual Question Answering AI 5 -- 2022-02-26
Hertz-dev is an open-source model for full-duplex conversational audio 5 -- 2024-11-16
New Dataset: RedPajama Dynamic Topic Modeling, 100K Docs W Topic Heirarchies 5 -- 2024-11-11
Hugging Face launches HUGS: managed containers for on-premise model deployment 5 -- 2024-10-23
Janus-1.3B: Unifying Multimodal Understanding and Generation 5 -- 2024-10-18
Show HN: Arch-Function: 3B parameter LLM that beats GPT-4o on function calling 5 -- 2024-10-16
Model2Vec: Make sentence transformers 500x faster on CPU, 15x smaller 5 -- 2024-10-16
Whisper-Large-v3-Turbo 5 -- 2024-10-03
Show HN: Automatic chaptering – From raw transcripts to structured documents 5 -- 2024-09-09
TabReD: A Benchmark of Tabular Machine Learning In-the-Wild 5 -- 2024-07-04
Microsoft releases weights for Florence-2 vision model 5 -- 2024-06-19
Phi-3-medium-128k-instruct 5 -- 2024-05-22
Ferret-v2: An Improved Baseline for Referring and Grounding with LLMs 5 -- 2024-04-13
Gretel: Synthetic Text to SQL Dataset 5 -- 2024-04-04
Detecting performance and ethical vulnerabilities in popular Hugging Face models 5 -- 2024-03-21
Design2Code: How Far Are We from Automating Front-End Engineering? 5 -- 2024-03-10
Genie: Generative Interactive Environments 5 -- 2024-02-26
TTS Arena: Benchmarking TTS Models in the Wild 5 -- 2024-02-25
Cosmopedia: the largest synthetic dataset of textbooks generated by Mixtral 5 -- 2024-02-20
DeciLM-7B 5 -- 2023-12-12
Nash Learning from Human Feedback 5 -- 2023-12-05
Real-time image generation demo on Gradio 5 -- 2023-11-12
Convert a transformers model to Core ML 5 -- 2023-04-06
Wikipedia Txtai Embeddings Index 5 -- 2023-03-21
Show HN: Get the gist of anyone's Twitter feed 5 -- 2023-02-24
Illustrating RLHF that's critical for ChatGPT 5 -- 2022-12-09
Stable Diffusion Webapp 5 -- 2022-09-28
The World’s Largest Open Multilingual Language Model: Bloom 5 -- 2022-08-15
Wikipedia assistant directly answers your questions 5 -- 2022-02-15
Moonshine – open-source, real-time speech-to-text in the browser 5 -- 2024-12-19
Open R1: Update #2 5 -- 2025-02-11
Deepseek VL2 Small 5 -- 2025-02-08
Gemma 3 QAT (Quantized Aware Training) 3x less memory 5 -- 2025-04-03
DocumentAI with 256M Parameters 5 -- 2025-03-20
An open source common knowledge and context based Hallucination Detection Model 5 -- 2025-04-29
Mixture of Tunable Experts-DeepSeek R1 Behavior Modification at Inference Time 5 -- 2025-05-01
CircleGuardBench Leaderboard 5 -- 2025-05-07
Show HN: Raman-01 – A Pocket Physics Solver LLM 5 -- 2025-05-05
An MCP-powered agent in 50 lines of code 5 -- 2025-05-15
SWE-rebench: Over 21,000 Open Tasks for SWE LLMs 5 -- 2025-05-29
The Common Pile v0.1 5 -- 2025-06-06
You could have designed state of the art positional encoding 5 -- 2025-05-20
LLM Embeddings Explained: A Visual and Intuitive Guide 5 -- 2025-05-14
Show HN: KaniTTS – Open-source high-fidelity TTS with just 450M params 5 -- 2025-09-19
GLM 4.5 5 -- 2025-07-28
Gaia2 and Are: Empowering the Community to Evaluate Agents 5 -- 2025-09-22
VibeVoice: A Frontier Open-Source Text-to-Speech Model 5 -- 2025-08-26
Qwen2.5-Coder-3B Fine-Tuned for Triton Kernel Gen 5 -- 2025-08-03
Google's Bard surpassing GPT-4, SECOND SPOT on the leaderboard 4 -- 2024-01-26
Octopus V4: a graph of language models 4 -- 2024-05-02
Llama-3 8B Instruct 262k 4 -- 2024-04-26
CodeGemma – an official Google release for code LLMs 4 -- 2024-04-09
Solar 10.7B: Elevating AI, Effortlessly 4 -- 2023-12-27
WhiteRabbitNeo model series can be used for offensive/defensive cybersecurity 4 -- 2023-12-20
Eric Hartford releases uncensored dolphin-2.5-mixtral-8x7B 4 -- 2023-12-14
XTTS: New Generative model for Voice (weights released on HF) 4 -- 2023-09-15
Prompt Injection Detection Model 4 -- 2023-06-14
GPT-2 Output Detector 4 -- 2022-12-05
Apple Open-Sources LLM DCLM-7B 4 -- 2024-07-19
Open LLM Leaderboard v2 4 -- 2024-06-29
Florence 2, Microsoft OCR Modell 4 -- 2024-06-20
Apple OpenELM Instruct Models 4 -- 2024-04-24
Phi-3 Released 4 -- 2024-04-23
GemMoE: An 8x8 Mixture Of Experts based on Gemma 4 -- 2024-03-13
Pearl-3x7B, an xtraordinary Mixure of Experts (MoE) for data science 4 -- 2024-02-07
Introduction to State Space Models (SSM) 4 -- 2024-01-24
Distributed Inference and Fine-Tuning of Large Language Models over the Internet 4 -- 2023-12-17
Distil-Whisper: Distil-Small.en 4 -- 2023-12-14
2-bit and 4-bit versions of Mixtral 4 -- 2023-12-11
Nous-Capybara-34B-200k 4 -- 2023-11-14
An open-source and privacy-by-design Conversational AI in-browser 4 -- 2023-09-22
Large Language Models for Compiler Optimization 4 -- 2023-09-14
Gaussian viewer streaming splats in web browser 4 -- 2023-09-12
Puma: Secure Inference of LLaMA-7B in Five Minutes 4 -- 2023-07-25
FreeWilly2: New LLM from Stability AI 4 -- 2023-07-24
40B LLM wants to charge 10% royalty on revenue? 4 -- 2023-05-26
Falcon-40B 4 -- 2023-05-26
Fully Open Source LLM Chat App – Chat about the Transformers Docs 4 -- 2023-03-14
Karlo, the first open source DALL-E 2 replication is here 4 -- 2022-12-21
Show HN: Thought Leadership as a Service 4 -- 2022-06-09
HtmlRAG: HTML Is Better Than Plain Text for RAG Systems 4 -- 2024-11-06
Structured generation with Outlines, now in Rust 4 -- 2024-10-22
Llama 3.2 in the Browser with WebGPU 4 -- 2024-09-30
Multimodal TextImage Augmentation for Document Images 4 -- 2024-09-14
'Reflection 70B' AI model could be the answer to pesky LLM hallucinations 4 -- 2024-09-06
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers 4 -- 2024-08-14
FHE can be leveraged for LLMs such as ChatGPT in a privacy-preserving … 4 -- 2024-08-13
Introduction to Ggml 4 -- 2024-08-13
Google releases Gemma 2 2B, ShieldGemma and Gemma Scope 4 -- 2024-08-01
Gemma 2 2B Release 4 -- 2024-08-01
Extracting Concepts from LLMs: Anthropic's recent discoveries 4 -- 2024-06-08
EasyAnimate: End-to-end solution for high-resolution and long video generation 4 -- 2024-06-04
Grokked Transformers Are Implicit Reasoners 4 -- 2024-05-27
Paligemma: A versatile and lightweight vision-language model (VLM) 4 -- 2024-05-14
4M Context – Llama-3-8B-Instruct 4 -- 2024-05-09
ReFT: Representation Finetuning for Language Models 4 -- 2024-04-05
Embedding Quantization: 25-45x retrieval speedup, 32x or 4x less memory usage 4 -- 2024-03-22
Show HN: Chatbot Guardrails Arena 4 -- 2024-03-21
Quanto: A PyTorch Quantization Toolkit 4 -- 2024-03-18
On-device background removal with Transformers.js 4 -- 2024-02-07
SegMoE: Segmind Mixture of Diffusion Experts 4 -- 2024-02-05
NPHardEval leaderboard a benchmark for assessing the reasoning abilities of LLMs 4 -- 2024-02-03
HuggingChat Assistants: Open source models with custom instructions 4 -- 2024-02-02
TinyLlama Reaches 3T Checkpoint 4 -- 2023-12-28
Obsidian-3B 4 -- 2023-11-25
Yarn-Llama-2-70B-32k 4 -- 2023-11-20
SDXL in 4 steps with Latent Consistency LoRAs 4 -- 2023-11-09
Zephyr 7B 4 -- 2023-10-27
Apple/coreml-stable-diffusion-XL-base-iOS 4 -- 2023-09-30
DeepSpeed-Chat: Easy RLHF Training of ChatGPT-Like Models at All Scales 4 -- 2023-08-04
Deploy LLMs with Hugging Face Inference Endpoints 4 -- 2023-07-04
Instruct-Codegen: open-source instruction following codegen model 4 -- 2023-05-27
MPT-7B-StoryWriter-65k+: LLM for super long contexts (Apache 2.0) 4 -- 2023-05-05
BioGPT for Biomedical Scientific Discovery 4 -- 2023-02-07
Using LoRA for Efficient Stable Diffusion Fine-Tuning 4 -- 2023-01-26
From GPT2 to Stable Diffusion: Hugging Face Arrives to the Elixir Community 4 -- 2022-12-09
Stable Diffusion pre-loaded with 250 community textual inversion concepts 4 -- 2022-09-14
Overview of how Stable Diffusion works 4 -- 2022-08-27
Editing Videos by Editing Text 4 -- 2022-05-23
Latent Diffusion, open source alternative to DALL·E 2 4 -- 2022-04-13
From Files to Chunks: Improving HF Storage Efficiency 4 -- 2024-11-20
Show HN: Video Composition Tool Powered by Qwen2.5-Coder and FFmpeg 4 -- 2024-11-24
Show HN: LatComp – Compress your image into a small and reversible … 4 -- 2024-11-30
DeepSeek-V3-Base 4 -- 2024-12-25
Qwen 2.5 Max 4 -- 2025-01-28
Hugging Face open sources a web-browsing agent that uses VLMs 4 -- 2025-01-24
Deepseek R1 Zero 4 -- 2025-01-20
LLaSE-G1 A FOSS speech enhancement model 4 -- 2025-03-08
Qwen/QwQ-32B released on Hugging Face 4 -- 2025-03-06
Wan2.1-T2V-14B 4 -- 2025-02-25
The Curse of Depth in Large Language Models 4 -- 2025-02-13
Migrating Hugging Face off Git LFS and to a new storage system … 4 -- 2025-03-18
MoCha: Towards Movie-Grade Talking Character Synthesis 4 -- 2025-04-01
Qwen2.5-Omni-7B 4 -- 2025-03-26
Open R1's OlympicCoder beats Deepseek R1, models and underlying dataset released 4 -- 2025-03-25
Devin's First Open Source Model Beats O3 4 -- 2025-05-06
Ltxv-13B – high-quality videos in real-time 4 -- 2025-05-07
Show HN: HalluMix – A Benchmark for Real-World LLM Hallucination Detection 4 -- 2025-05-06
Higgs – Rapidly Compress LLMs Without Significant Loss of Quality 4 -- 2025-04-12
New virtual try on model family that seems to be SOTA 4 -- 2025-06-28
Gemma 3n available in the open-source ecosystem 4 -- 2025-06-26
Automated Discovery of High-Performance GPU Kernels with OpenEvolve 4 -- 2025-06-28
Jan-Nano-128k: Empowering deeper research through extended context understanding 4 -- 2025-06-25
Kimi-Dev-72B 4 -- 2025-07-13
Kimi K2: 1T total parameter open-source LLM by Moonshot AI 4 -- 2025-07-11
Mistral AI releases Devstral-Small-2507 4 -- 2025-07-10
A 337M RSS feed dataset 4 -- 2025-08-26
Trackio: A new experiment tracking library from Hugging Face 4 -- 2025-07-29
Show HN: Single-agent long-horizon reasoning within one LLM run 4 -- 2025-07-23
Tricks from OpenAI GPT-OSS you can use with transformers 4 -- 2025-09-11
Kimi-K2-Instruct-0905 4 -- 2025-09-05
OmniNeural – First NPU-Aware Multimodal Model 4 -- 2025-08-24
Gemma 3-270M 4 -- 2025-08-14
Pruned expert GPT-OSS 6.6B 4 -- 2025-08-13
UIGEN-X-32B-0727 Reasoning Only UI Generation Model 4 -- 2025-07-28
MiniLM-L6-v2 maps paragraphs to 384 dimension vector for clustering or search 3 -- 2023-03-21
Show HN: Turn Any Article into a Conversation-Like Podcast 3 -- 2024-05-22
Phi-1.5 (1.3B Outperforms Llama 2 7B) 3 -- 2023-09-12
GPT-2B-001 3 -- 2023-04-20
Open NotebookLM – Generate Podcasts from PDFs Using Open-Source AI 3 -- 2024-10-15
AI has a problem with objectifying women 3 -- 2024-05-28
Linus Torvalds Chat Bot 3 -- 2024-02-02
ChatQA: Building GPT-4 Level Conversational QA Models 3 -- 2024-01-19
10.7B Solar: Elevating Performance with Upstage Depth Up Scaling 3 -- 2023-12-18
Voice Chat with Mistral 7B 3 -- 2023-10-16
Hugging Face partner with AMD to accelerate state-of-the-art models 3 -- 2023-06-14
Frames: Factuality, Retrieval, and Reasoning MEasurement Set 3 -- 2024-10-01
Show HN: We just dropped a 8B alternative of OpenAI GPT-o1 and … 3 -- 2024-09-20
Chronos-T5 (Tiny) – pretrained time series forecasting models 3 -- 2024-08-14
HF for Legal, an open-source community on Hugging Face 3 -- 2024-07-01
LegalKit, French labeled datasets built for legal ML training 3 -- 2024-06-27
Nvidia releases ChatQA-1.5 in violation of Llama 3 license 3 -- 2024-05-02
Layer Skip: Enabling Early Exit Inference and Self-Speculative Decoding 3 -- 2024-04-26
Everyone seems to have forgotten about Gemma 3 -- 2024-04-25
Introducing the Open Chain of Thought Leaderboard 3 -- 2024-04-23
Google Gemma 1.1 2B and 7B instruct 3 -- 2024-04-06
Starcoder-2 3 -- 2024-02-28
DevPearl-2x7B, an xtraordinary Mixture of Experts (MoE) for development 3 -- 2024-02-09
Nous-Hermes-2-SOLAR-10.7B 3 -- 2024-01-02
Solar 10.7B 3 -- 2023-12-27
Transformer.js: Machine Learning for the Web 3 -- 2023-12-09
PixArt-α: Fast Training of Diffusion Transformer for Text-to-Image Synthesis 3 -- 2023-12-04
Laiyer AI Released Its Open Source Prompt Injection Model 3 -- 2023-11-29
LZMD: Lempel-Ziv Montecarlo Diffusion file format 3 -- 2023-11-29
Faster MusicGen Generation with Streaming 3 -- 2023-10-06
Llama 2 on Amazon SageMaker a Benchmark 3 -- 2023-09-26
LoRA Roulette 3 -- 2023-09-22
Open-source AI Discord bots with HuggingFace 3 -- 2023-08-17
StableBeluga-7B 3 -- 2023-07-29
MPT-30B – Apache 2.0 licensed LLM 3 -- 2023-07-22
Show HN: I created a first-of-its-kind open corpus of Australian law 3 -- 2023-06-26
Show HN: DocsGPT-7B – purpose optimised and finetuned model for documentation QA 3 -- 2023-06-16
Alpaca Dataset Translated into Polish 3 -- 2023-04-12
Bert 101 State of the Art NLP Model Explained 3 -- 2022-03-02
SemScore: Evaluating LLMs with Semantic Similarity 3 -- 2024-11-06
Meta released MobileLLM – 125M, 350M, 600M, 1B model checkpoints 3 -- 2024-10-31
Hugging Face Now Automatically Detects Leaked Secrets 3 -- 2024-09-05
Selective fine-tuning of Language Models with Spectrum 3 -- 2024-09-03
Idefics3: Open multimodal model based on Llama-3.1-8B 3 -- 2024-08-09
New Google Gemma 2 2B model 3 -- 2024-07-31
Fine-Tune Llama 3.1 Ultra-Efficiently with Unsloth 3 -- 2024-07-29
DiLoCo: Distributed Low-Communication Training of Language Models 3 -- 2024-07-26
The largest math dataset of Olympiad problems for training LLMs 3 -- 2024-07-21
SmolLM – Fast and Remarkably Powerful 3 -- 2024-07-16
Whisper WebGPU: Real-time in-browser speech recognition 3 -- 2024-06-08
UGI Leaderboard – Uncensored General Intelligence 3 -- 2024-06-07
Transformers Are SSMs: Generalized Models and Efficient Algorithms Through 3 -- 2024-06-04
Recovering 4D World from Monocular Video 3 -- 2024-05-29
LiteVAE: Lightweight and Efficient Variational Autoencoders for Diffusion Models 3 -- 2024-05-26
Advancing Theorem Proving in LLMs Through Large-Scale Synthetic Data 3 -- 2024-05-26
Phi-3 in-browser inference using WebGPU 3 -- 2024-05-08
Show HN: GPT Fine-Tune Formatter 3 -- 2024-05-07
InstantMesh: Efficient 3D Mesh Generation from a Single Image 3 -- 2024-04-15
Mixture of Finetuned and GPT4 Model 3 -- 2024-04-07
H2O-Danube2-1.8B-Chat 3 -- 2024-04-07
Yi-9B 3 -- 2024-04-05
Dolphin-2.8-mistral-7B-v02 3 -- 2024-04-03
Common Corpus – Start of the largest public domain dataset for training … 3 -- 2024-03-20
MoAI: Mixture of All Intelligence for Large Language and Vision Models 3 -- 2024-03-14
OpenChat-3.5-0106-Gemma 3 -- 2024-03-10
Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping 3 -- 2024-02-23
Microsoft's LongRoPE: Extending LLM Context Window Beyond 2M Tokens 3 -- 2024-02-22
Stable Diffusion XL Lightning 3 -- 2024-02-21
Enterprise Scenarios leaderboard evals the perf. of LLMs on enterprise use cases 3 -- 2024-02-03
Show HN: A lineage explorer for open source models and datasets 3 -- 2024-01-23
Aim – An Apple Collection 3 -- 2024-01-19
LLaVA-3B 3 -- 2024-01-01
Dolphin-2.6-Mistral-7B 3 -- 2023-12-29
MonadGPT 3 -- 2023-12-28
MiniMA-2-3B 3 -- 2023-12-27
WaveCoder: Widespread Versatile Enhanced Instruction Tuning with Refine Data Gen 3 -- 2023-12-26
StarVector: Generating Scalable Vector Graphics Code from Images 3 -- 2023-12-20
AITube - Youtube but everything is AI generated 3 -- 2023-12-15
Refact-1.6B 3 -- 2023-12-08
Llama-2-7B-chat-mlx for Apple’s new MLX framework 3 -- 2023-12-06
NeuralHermes-2.5-Mistral-7B 3 -- 2023-11-29
Tulu-2-Dpo-70B 3 -- 2023-11-21
Show HN: New Launch OrionStar-Yi-34B-Chat beats Llama2-70B and GPT-3.5-turbo 3 -- 2023-11-20
Nvidia nemotron-3-8B-base-4k 3 -- 2023-11-16
Optimizing LLMs in Production 3 -- 2023-11-15
HuggingFace Daily Papers 3 -- 2023-11-14
Make your llama generation time fly with AWS Inferentia2 3 -- 2023-11-11
Show HN: Face-Stylization – Create face styling with just 8 images 3 -- 2023-11-09
Document Question Answering 3 -- 2023-10-30
Apple's LLMs and other GenAI models on HuggingFace 3 -- 2023-10-19
Using HuggingFace to Train a GPT-2 Model for Music Generation 3 -- 2023-10-09
MotionGPT: Finetuned LLMs Are General-Purpose Motion Generators 3 -- 2023-09-19
Generative Image Dynamics 3 -- 2023-09-15
OpenHermes-13B based on Llama-2 3 -- 2023-09-07
Llama2.c LLM: ported to Rust and running in the browser 3 -- 2023-09-07
Accelerating Vision-Language Models: BridgeTower on Habana Gaudi2 3 -- 2023-09-01
Fine-tuned CodeLlama beats GPT-4 on HumanEval 3 -- 2023-08-27
LoRA the Explorer 3 -- 2023-08-17
Fine-tune Llama 2 with DPO 3 -- 2023-08-08
Show HN: Goat-7B LLM, a new SOTA among the open-source 7B models 3 -- 2023-07-25
How is ChatGPT's behavior changing over time? 3 -- 2023-07-19
Show HN: New control net model for AI art QRcode 3 -- 2023-06-27
Show HN: Bert-Based Classification Model for Google Local Listings 3 -- 2023-06-26
Mosaic ML: MPT-30B-Chat 3 -- 2023-06-25
Video Composer: Create videos using GPT-4 and FFmpeg 3 -- 2023-06-15
MusicGen from Meta on Hugging Face 3 -- 2023-06-09
OpenLLaMA 7B Released 3 -- 2023-06-07
WizardLM-30B 3 -- 2023-06-06
Can AI Code? 3 -- 2023-06-05
Constrained Text Generation with Transformers 3 -- 2023-05-22
StarCoder: A State-of-the-Art LLM for Code 3 -- 2023-05-05
Swift Diffusers: Fast Stable Diffusion for Mac 3 -- 2023-04-02
Fine-tuning 20B LLMs with RLHF on a 24GB consumer GPU 3 -- 2023-03-12
Parameter-Efficient Fine-Tuning Billion-Scale Models on Low-Resource Hardware 3 -- 2023-02-10
Finetuned Stable Diffusion: open, free, beautiful results near to Midjourney 3 -- 2022-12-28
Hugging Face Machine Learning Demos Are Now on ArXiv 3 -- 2022-11-17
Pony Diffusion 3 -- 2022-10-01
Show HN: Audio Intelligence Dashboard 3 -- 2022-09-26
Fast Bloom Inference with DeepSpeed and Accelerate 3 -- 2022-09-15
YOLOv6: Real-Time Object Detection Demo 3 -- 2022-07-15
An Introduction to Deep Reinforcement Learning 3 -- 2022-05-13
Transform natural language queries to vector search SQL 3 -- 2022-04-19
Single Image to 3D in the Browser 3 -- 2022-04-15
JPEG Artifacts Removal 3 -- 2022-04-12
Multimodal Augmentation of Generative Models Through Adapter-Based Finetuning 3 -- 2022-03-20
AI Line Drawing Generation 3 -- 2022-03-11
OCR Model Beats Captcha 3 -- 2022-02-23
Fairseq S2: Scalable Speech Synthesis 3 -- 2022-01-21
Dataset Card for 1M Bluesky Posts 3 -- 2024-11-27
New 2B vision language model that consumes the least memory 3 -- 2024-11-26
New synthetic dataset beating MSFT and mistral's SFT recipe 3 -- 2024-11-22
Show HN: MilkDropLM – generate presets for the MilkDrop music visualizer 3 -- 2024-12-06
Quantum+AI Qiskit Code Assistant Open Source model 3 -- 2024-11-27
informatiker/20-million-bluesky-posts 3 -- 2024-11-29
Automated GitHub Issue Creation Using Structured Generation 3 -- 2024-11-29
QwQ-32B-Preview 3 -- 2024-11-27
Welcome to the Falcon 3 Family of Open Models 3 -- 2024-12-17
Meta releases family of multimodal models that comprehend hour-long video 3 -- 2024-12-16
Finding Moroccan Arabic (Darija) in the Fineweb 2 Dataset 3 -- 2024-12-09
Timeline of AI model releases in 2024 3 -- 2025-01-01
Fine-Tune Deepseek-R1 with a Synthetic Reasoning Dataset 3 -- 2025-02-11
Hugging Face AI Agents Course 3 -- 2025-02-10
HuggingFace open reproduction of R1 data and training pipeline 3 -- 2025-01-27
DeepSeek-R1 on iPhone? (DeepSeek-R1-Distill-Qwen-1.5B-GGUF) 3 -- 2025-01-21
GEN3C: 3D-Informed World-Consistent Video 3 -- 2025-03-06
Microsoft Releases Phi-4-multimodal [pdf] 3 -- 2025-02-26
WanX open weight sota 14B video model release 3 -- 2025-02-25
Step-Audio-Chat: a 132B end-to-end speech-to-speech model 3 -- 2025-02-17
Show HN: First large scale evaluation of 4o Image Generation from OpenAI 3 -- 2025-03-27
EuroBERT: A High-Performance Multilingual Encoder Model 3 -- 2025-03-10
Training LLMs with GRPO and Interpreter Feedback Using WebAssembly 3 -- 2025-04-06
AgentRxiv: Towards Collaborative Autonomous Research 3 -- 2025-03-25
DeepSeek V3-0324 Posted to HuggingFace 3 -- 2025-03-24
Nvidia Isaac GR00T N1 is the first open foundation model for humanoid 3 -- 2025-03-21
VACE: All-in-One Video Creation and Editing from Alibaba 3 -- 2025-03-12
Drape1: Open-Source Scalable adapter for clothing generation 3 -- 2025-05-01
GLM-4-32B-0414: New MIT-licensed SOTA LLM from Zhipu AI 3 -- 2025-04-15
Xiaomi MiMo 3 -- 2025-04-30
Qwen3 235B (MoE with 128 experts) 3 -- 2025-04-28
Dia 1.6B – Nari Text-to-Speech Synthesis 3 -- 2025-04-24
Microsoft/MAI-DS-R1, DeepSeek R1 Post-Trained by Microsoft 3 -- 2025-04-18
Yambda-5B – Industrial-scale music recommendation dataset 3 -- 2025-06-04
Show HN: we released an open source, best-in-class medical reasoning model 3 -- 2025-05-13
Understanding MCP Evals: Why Evals Matter for MCP 3 -- 2025-06-06
Show HN: Ego-Dex Gradio App 3 -- 2025-06-03
Hugging Face Courses 3 -- 2025-05-27
Show HN: Tinker with Meta's "tokenizer-free" patcher 3 -- 2025-05-21
Radiology explainer demo 3 -- 2025-05-20
Memelang – a hybrid relational-graph query language 3 -- 2025-05-17
Hugging Face Collaborates with Proxima Fusion on ML for Stellarator Optimization 3 -- 2025-07-02
Largest in-person AV conversational dataset ever released 3 -- 2025-06-27
Kimina-Prover: Applying Test-time RL Search on Large Formal Reasoning Models 3 -- 2025-07-10
Show HN: 1.5B LLM routing model that aligns to preferences, not leaderboards 3 -- 2025-07-17
Mistral Releases Voxtral: Open Source Speech Understanding Models (3B and 24B) 3 -- 2025-07-15
CommaCarSegments: 3148 hours of raw CAN bus data from 230 different car … 3 -- 2025-07-10
AnyCoder creates a demo for Qwen Image Edit Plus in 10mins 3 -- 2025-09-22
I made WEBGEN-OSS-20B, a model that generates clean websites from your prompts 3 -- 2025-09-13
Reasoning Traces from QA Pairs 3 -- 2025-09-09
Welcome EmbeddingGemma, Google's new efficient embedding model 3 -- 2025-09-04
Output Schema for CodeAct AI Agents: From Trial-and-Error to Predictive Planning 3 -- 2025-08-31
WildChat-4.8M: 4.8M Real User–ChatGPT Conversations (Open Dataset) 3 -- 2025-08-11
Break the quadratic wall of Transformer attention: WERSA, paper+code open source 3 -- 2025-08-02
Qwen-Image-Edit-2509 3 -- 2025-09-22
AI Spreadsheet Benchmark [pdf] 3 -- 2025-09-22
FinePDFs Dataset 3 -- 2025-09-15
TildeOpen-30B: European LLM Focused on Underrepresented Languages 3 -- 2025-09-04
First vision language model built off Open AI GPT-OSS 3 -- 2025-08-26
Seed-OSS: open-source LLM models by ByteDance 3 -- 2025-08-22
From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA … 3 -- 2025-08-20
Jan-v1: Advanced Agentic Language Model 3 -- 2025-08-12
NextCoder by Microsoft — LLM performing on par with GPT-4o on complex … 3 -- 2025-08-08
OpenReasoning-Nemotron by Nvidia: state-of-the-art distilled reasoning models 3 -- 2025-08-08
Accelerate ND-Parallel: A Guide to Efficient Multi-GPU Training 3 -- 2025-08-08
Llama 3 8B Instruct quantized with GPTQ to fit in 10gb vRAM 2 -- 2024-04-19
Try Qwen2.5-Coder-32B on HuggingChat 2 -- 2024-11-12
An orthogonalized AI to introduce an unengaged melancholic style 2 -- 2024-06-13
Pearl-7B-slerp, an xtraordinary 7B model for maths 2 -- 2024-02-05
Duckdb-nsql: 7B parameter text-to-SQL model by MotherDuck and Numbers Station 2 -- 2024-01-28
7B model from Snorkel tops Alpaca Eval 2.0 leaderboard 2 -- 2024-01-24
Run Deepseek Coder LLM locally 2 -- 2023-12-03
Releasing Swift Transformers: Run On-Device LLMs in Apple Devices 2 -- 2023-08-08
Stable Diffusion Bias Explorer 2 -- 2022-11-09
LongVU – New Video LLM from Meta 2 -- 2024-10-24
Hacker News Comments Dataset 2 -- 2024-10-11
HuggingFace Accelerate 1.0.0 2 -- 2024-10-07
Mistral-Small-Instruct-2409 2 -- 2024-09-17
HuggingChat: Chat with Llama 3.1 (70B and 405B) 2 -- 2024-07-23
Ocean Biodiversity Information System on Hugging Face 2 -- 2024-07-21
CommonCanvas image generation from CC-licensed images – models, dataset released 2 -- 2024-06-07
Show HN: PodGen generate podcasts on any topic 2 -- 2024-06-01
Meteor: Mamba-Based Traversal of Rationale for Large Language and Vision Models 2 -- 2024-05-28
The Waifu Research Department 2 -- 2024-05-16
Yi-1.5 LLM Models Released 2 -- 2024-05-12
Fietje: An open and efficient LLM for Dutch 2 -- 2024-05-02
Simple Multimodal LLM from Scratch 2 -- 2024-04-23
Stability Releases Code Instruct 3B 2 -- 2024-04-02
Mistral 7B v0.2 2 -- 2024-04-01
PolarsBot, a New HuggingChat Assistant 2 -- 2024-03-25
Easy and low cost model training on HF "DGX cloud" 2 -- 2024-03-19
Pearl-7B-0211 LLM now exceeds 75 in the average score of the HF's … 2 -- 2024-02-19
LLMs can learn useful guidelines from their own mistakes 2 -- 2024-02-12
Pearl-7B-0210-dare now sits next to the best 7Bs on HF Leaderboard 2 -- 2024-02-11
Aanaphi-2 3B 2 -- 2024-02-09
Playground for Hugging Face Models 2 -- 2024-02-05
Hallucinations Leaderboard 2 -- 2024-01-29
Fine-tune Wav2Vec2-BERT for low resource speech recognition 2 -- 2024-01-23
InstantID Demo: Zero-Shot Identity-Preserving Generation in Seconds 2 -- 2024-01-22
Yayi2-30B-Llama 2 -- 2024-01-01
Mixtral_7Bx2_MoE 2 -- 2023-12-24
Universal AnglE Sentence Embedding: New SOTA on MTEB Leaderboard 2 -- 2023-12-05
Non-engineers guide: Train a LLaMA 2 chatbot 2 -- 2023-12-02
AutoTrain: (not just)LLM finetuning without code and infra 2 -- 2023-11-23
How do you think LLM inference on CPUs? 2 -- 2023-11-03
State-of-the-Art Ember embedding model for retrieval augmented generation 2 -- 2023-10-20
Large Language Models as Analogical Reasoners 2 -- 2023-10-05
QR Code Monster 2 -- 2023-10-02
CausalLM is not optimal for in-context learning 2 -- 2023-08-15
Count tokens used by GPT-4 and Llama for large texts (> 50k … 2 -- 2023-08-05
Apply ControlNet to a Video 2 -- 2023-08-01
Making real-time ML-powered web games with Transformers.js 2 -- 2023-07-05
LLaMA: Large Language Model Meta AI 2 -- 2023-03-17
Small Stable Diffusion 2 -- 2023-01-19
Dreambooth training UI for training a model for less than US$0.80 2 -- 2022-12-01
Stable Diffusion: Generating One Image a Second 2 -- 2022-10-15
VToonify Web Demo for Portrait Video Style Transfer 2 -- 2022-10-04
Pixtral-Large-Instruct-2411 2 -- 2024-11-18
FLUX.1-Dev LoRA Outfit Generator by TryOn Labs 2 -- 2024-11-06
Contextual Document Embeddings 2 -- 2024-11-01
Code a Simple RAG from Scratch – Hugging Face Community Article 2 -- 2024-10-30
OmniParser for Pure Vision Based GUI Agent 2 -- 2024-10-25
Hugs – Scale Your AI with Open Models 2 -- 2024-10-23
Wpaigpt-SQL-01: text-to-SQL model designed for WordPress and WordPress plugins 2 -- 2024-10-23
Pickle Scanning 2 -- 2024-10-23
New Video Generation Model: Allegro 2 -- 2024-10-22
TxT360 2 -- 2024-10-18
Dataset About Where 30k+ Startups Trend 2 -- 2024-10-18
Nvidia Nemotron 2 -- 2024-10-17
Fixing Gradient Accumulation 2 -- 2024-10-16
Animate-X: Universal Character Image Animation with Enhanced Motion 2 -- 2024-10-15
SOTA Open Source Text to Video Model 2 -- 2024-10-14
Exploring the Daily Papers Page on Hugging Face 2 -- 2024-09-24
Multilingual MMLU Dataset from OpenAI (OpenAI/Mmmlu) 2 -- 2024-09-23
Recreating o1 at Home with Role-Play LLMs 2 -- 2024-09-21
FineVideo: Annotated YouTube Dataset by HuggingFace 2 -- 2024-09-12
Remove Background by Text 2 -- 2024-09-12
Labeled Image generation using Meta Llama 3.5 2 -- 2024-08-31
Scaling robotics datasets with video encoding 2 -- 2024-08-30
New FashionCLIP and SigLIP Classification Demo 2 -- 2024-08-28
Mozilla/TriLM-Llamafile · Hugging Face 2 -- 2024-08-26
Play: How random can a human brain truly be? 2 -- 2024-08-24
FLUX.1 [Schnell] – a Hugging Face Space by black-forest-labs 2 -- 2024-08-21
Flux Dev 1 model that creates half_illustration images 2 -- 2024-08-21
LLMs as Image Generators with Canonical Codec Representations 2 -- 2024-08-19
Instant in-browser demo of SmolLM 2 -- 2024-08-18
Marqo-FashionCLIP: New Embedding Model for Fashion 2 -- 2024-08-14
A Large-Scale Multimodal Dataset with Multigranular Annotations for Medicine 2 -- 2024-08-07
Generate and Export Segmentation Masks Using Meta's SAMv2 2 -- 2024-07-31
HuggingChat: Chat with Llama 3.1 405B 2 -- 2024-07-25
Meta-Llama-3.1-405B 2 -- 2024-07-23
Apple's DCLM model shares data&training code with weights 2 -- 2024-07-20
Predicting Multiplication with GPT-2 2 -- 2024-07-20
Qwen2 Technical Report 2 -- 2024-07-16
Gemma-2-27B-it llamafile 2 -- 2024-07-03
OpenRAIL: Towards open and responsible AI licensing frameworks (2022) 2 -- 2024-07-03
New LLM Agent writing actions in Python code tops the GAIA agent … 2 -- 2024-07-01
Stable Diffusion 3 Medium Online Demo, Free 2 -- 2024-06-12
To Believe or Not to Believe Your LLM 2 -- 2024-06-11
Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-Modal LLMs 2 -- 2024-06-04
Map-Neo: Highly Capable and Transparent Bilingual Large Language Model Series 2 -- 2024-05-31
Training and Finetuning Embedding Models with Sentence Transformers v3 2 -- 2024-05-30
ChatTTS – open-source TTS model designed specifically for dialogue scenario 2 -- 2024-05-29
Matryoshka Multimodal Models 2 -- 2024-05-28
Aya 23: Open Weight Releases to Further Multilingual Progress 2 -- 2024-05-28
HuggingFace Hub Incident Post Mortem 2 -- 2024-05-24
Cohere Updates Weights for Aya 2 -- 2024-05-23
Hugging Face on AMD Instinct MI300 GPU 2 -- 2024-05-23
Show HN: Generate a Quiz from Any Url 2 -- 2024-05-17
Show HN: EmuBert – the first open encoder model for Australian law 2 -- 2024-05-14
New Yi 1.5 models under Apache 2.0 2 -- 2024-05-12
Building Cost-Efficient Enterprise RAG Applications 2 -- 2024-05-10
Google codegemma-1.1-7B-it 2 -- 2024-05-03
Introduction to Matryoshka Embedding Models 2 -- 2024-05-03
Iterative Reasoning Preference Optimization 2 -- 2024-05-02
GPT-2 2 -- 2024-05-01
Fine-tune Llama 3 with ORPO 2 -- 2024-04-23
In-browser text-to-music generation using musicgen-small 2 -- 2024-04-20
Compression Represents Intelligence Linearly 2 -- 2024-04-16
Bringing serverless GPU inference to Hugging Face users 2 -- 2024-04-16
From Words to Numbers: Your LLM Is a Capable Regressor 2 -- 2024-04-12
Zephyr-orpo-141B-A35B: Mixtral 8x22B fine-tune by HuggingFace 2 -- 2024-04-11
TinyTimeMixer: Open-source time series LLM by IBM 2 -- 2024-04-09
Visual Autoregressive Modeling: Scalable Image Generation W NextScale Prediction 2 -- 2024-04-05
Command R+ 2 -- 2024-04-04
Demo of Moondream2 vision language model running in browser 2 -- 2024-04-03
Mini-Jamba 2 -- 2024-04-01
Transformer-Lite: High-Efficiency Deployment of LLMs on Mobile Phone GPUs 2 -- 2024-04-01
The Era of 1-Bit LLMs: All Large Language Models Are in 1.58 … 2 -- 2024-03-25
Cosmopedia: How to create large-scale synthetic data for pre-training 2 -- 2024-03-21
Playground-v2.5-1024px-Aesthetic 2 -- 2024-03-16
Gemini 1.5: Unlocking multimodal understanding across tokens of context 2 -- 2024-03-15
Better RAG 1: Advanced Basics 2 -- 2024-03-15
Cerebrum 7B – Mistral fine-tune created specifically for reasoning tasks 2 -- 2024-03-13
LLM Red-Teaming Resistance Leaderboard 2 -- 2024-03-01
Show HN: Visualize how you split your document into chunks for RAG … 2 -- 2024-02-27
From OpenAI to Open LLMs with Messages API on Hugging Face 2 -- 2024-02-23
C4: colossal cleaned version of Common Crawl's web crawl corpus 2 -- 2024-02-21
Constitutional AI with Open LLMs 2 -- 2024-02-01
Show HN: 2x Faster Stable Diffusion Models on Hugging Face with Pruna … 2 -- 2024-01-31
AMUSEd: Efficient Text-to-Image Generation 2 -- 2024-01-29
Minillama – 4.1 MB LLM for testing 2 -- 2024-01-20
StableLM 2 Zephyr 1.6B 2 -- 2024-01-20
Local vector embeddings index for analyzing ArXiv papers 2 -- 2024-01-17
Stable Zero123 Model Weights get Released. Text to 3D and image to … 2 -- 2024-01-15
Make LLM Fine-Tuning 2x Faster with Unsloth and HuggingFace TRL 2 -- 2024-01-10
OpenChat-3.5 Update 0106: ChatGPT-level performances accessible locally 2 -- 2024-01-10
Revolutionizing AI with Audio Classification via Computer Vision 2 -- 2024-01-02
Chatglm3-6B-32k 2 -- 2023-12-29
DreaMoving: A Human Video Generation Framework Based on Diffusion Models 2 -- 2023-12-28
Dream-Talk: Realistic Audio-Driven Single Image Talking Face Generation 2 -- 2023-12-24
Time Is Encoded in the Weights of Finetuned Language Models 2 -- 2023-12-22
2023, Year of Open LLMs 2 -- 2023-12-19
Hugging Face releases Optimum-Nvidia to accelerate LLM inference 2 -- 2023-12-07
Open LLM Leaderboard: DROP deep dive 2 -- 2023-12-02
Starling-RM-7B-Alpha 2 -- 2023-11-27
Intel: neural-chat-7B-v3-1 2 -- 2023-11-16
Whisper Large v3 2 -- 2023-11-09
MonadGPT – OS ChatGPT-like for the 17th century 2 -- 2023-11-09
OpenHermes-2.5-Mistral-7B 2 -- 2023-11-08
Yi-34B, 76.3 on MMLU, Apache 2.0 2 -- 2023-11-04
Templates for Chat Models 2 -- 2023-10-17
HF Shopify Image Background Replacement 2 -- 2023-10-12
OpenWebMath, a dataset containing every math docs found on the internet 2 -- 2023-10-11
Paper Page – NExT-GPT: Any-to-Any Multimodal LLM 2 -- 2023-09-12
Using Machine Learning to Improve Language Metadata on the Hugging Face Hub 2 -- 2023-09-12
Open ASR Leaderboard 2 -- 2023-09-07
Show HN: A LLM pull request review tool [feedback wanted] 2 -- 2023-09-07
Technology Innovation Institute Releases Falcon 180B LLM 2 -- 2023-09-06
Hugging Face Tutorial for Unity RL Agents 2 -- 2023-08-31
Dolma: The Largest Open Dataset For Training Language Models 2 -- 2023-08-24
WizardMath: Empowering Math Reasoning for LLM via Reinforced Evol-Instruct 2 -- 2023-08-15
Hugging Face Launches Tools for Running LLMs on Apple Devices 2 -- 2023-08-09
Open sourcing OpenAI’s function calling 2 -- 2023-08-08
Autotrain – Create powerful AI models without code 2 -- 2023-07-30
Understanding Embeddings 2 -- 2023-07-28
Scaling TransNormer to 175B Parameters 2 -- 2023-07-28
Llama 2 is here – get it on Hugging Face 2 -- 2023-07-19
Building an AI WebTV 2 -- 2023-07-18
Open-Source Text Generation and LLM Ecosystem at Hugging Face 2 -- 2023-07-17
OpenOrca-Preview1 2 -- 2023-07-12
Large Language Models can complete complex non linguistic patterns in context 2 -- 2023-07-11
Whisper Web: Speech recognition in the web browser 2 -- 2023-07-10
Chat with Falcon-7B-instruct demo 2 -- 2023-07-08
OpenChat: Less is More for Open-source Models 2 -- 2023-07-06
Can foundation models label data like humans? 2 -- 2023-07-05
Are Text-to-image models biased? 2 -- 2023-07-03
Orca: Progressive Learning from Complex Explanation Traces of GPT-4 2 -- 2023-07-01
Can foundation models label data like humans? 2 -- 2023-06-30
A Synthetic Dataset of Bodies Exhibiting Detailed Lifelike Animated Motion 2 -- 2023-06-30
Hugging Face – Transformers Agents 4.30 with local agents 2 -- 2023-06-28
DragGan – Interactive Point-Based Manipulation on the Generative Image Manifold 2 -- 2023-06-26
QR Code Conditioned ControlNet Models for Stable Diffusion 1.5 and 2.1 2 -- 2023-06-16
Cluster and Visualise 100K Wines by Tasting Notes with T-SNE 2 -- 2023-06-11
Hugging Face and IBM partner on watsonx.ai, next-gen enterprise studio for AI 2 -- 2023-05-28
HuggingFace Demo: DragGAN 2 -- 2023-05-26
Audit shows that safetensors is safe and ready to become the default 2 -- 2023-05-23
A Dive into Text-to-Video Models 2 -- 2023-05-15
HuberChat, a Chatbot trained on HubermanLab podcast (OpenAI key required) 2 -- 2023-05-10
Demo: Code Completion with replit-code-v1-3B 2 -- 2023-05-03
RLHF – Hugging Face Course 2 -- 2023-04-27
Ekimetrics launches a “ChatGPT” dedicated to climate 2 -- 2023-04-07
Alpaca GarbageCollector – Curating high-quality data for open-source LLMs 2 -- 2023-04-04
Text2Video-Zero 2 -- 2023-03-26
Train your own ControlNet models with diffusers 2 -- 2023-03-24
Open source models for various Machine Learning tasks 2 -- 2023-03-08
Ultra Fast ControlNet with Hugging Face Diffusers 2 -- 2023-03-03
Using Stable Diffusion with Core ML on Apple Silicon 2 -- 2023-02-22
HuggingFace/Transformers-Stats 2 -- 2023-02-20
Playable Demo for MarioGPT: Open-Ended Text2Level Generation Through LLMs 2 -- 2023-02-18
Faster Training and Inference: Habana Gaudi -2 vs. Nvidia A100 80GB 2 -- 2023-02-16
Speech Synthesis, Recognition, and More with SpeechT5 2 -- 2023-02-09
Threat actors using HuggingFace to deliver malware 2 -- 2023-02-07
Generating Human Motion from Textual Descriptions (T2M-GPT) 2 -- 2023-01-31
AI for Game Development: 3D Asset Generation 2 -- 2023-01-20
Show HN: ML Q&A – Get answers to questions about ML frameworks 2 -- 2023-01-05
Probabilistic Time Series Forecasting with Transformers 2 -- 2022-12-02
Fine-Tune Whisper for Multilingual ASR with Transformers 2 -- 2022-11-23
Ask a question, YouTube and OpenAI Whisper will try to answer 2 -- 2022-10-28
Show HN: Ask YouTube – search for specific answers in videos 2 -- 2022-10-28
New Google big language model Flan-T5 available on HuggingFace 2 -- 2022-10-22
The Annotated Diffusion Model 2 -- 2022-09-13
Text2Human: Text-Driven Controllable Human Image Generation 2 -- 2022-08-04
Highly Accurate Dichotomous Image Segmentation 2 -- 2022-07-31
The Technology Behind BLOOM Training 2 -- 2022-07-23
BLOOM Language Model 2 -- 2022-07-04
GPT4-Chan – Conditions for Availability 2 -- 2022-06-24
Hugging Face Hub: discover and share ML models, datasets, and demos 2 -- 2022-06-01
Decision Transformers on Hugging Face 2 -- 2022-06-01
Mask Transfiner for High-Quality Instance Segmentation 2 -- 2022-04-17
MultiMAE: Multi-modal Multi-task Masked Autoencoders 2 -- 2022-04-16
Self-Distilled StyleGAN: Towards Generation from Internet Photos Gradio Demo 2 -- 2022-04-05
CVPR2022 Pastiche Master: Exemplar-Based High-Resolution Portrait Style Transfer 2 -- 2022-03-24
Show HN: HF-BERTopic – Transformer based topic modeling in the browser 2 -- 2022-02-02
Turn a Photo into an Animation 2 -- 2022-01-29
DeepPrivacy: GANs for Face Anonymization 2 -- 2022-01-24
Show HN: HN-KeyBERT: AI KeyPhrase extraction in the browser 2 -- 2022-01-24
Similarity search for current Hacker News front page titles 2 -- 2022-01-23
HuggingFace on Sheets 2 -- 2025-03-24
OpenGPT-X 2 -- 2024-11-26
Show HN: AI Hackathon_ Prize 20K USD '1-Min Creative Innovation with AI' 2 -- 2024-11-28
The Lichess database is now on Hugging Face 2 -- 2024-12-06
LLM Comparison/Test: 25 SOTA LLMs (Including QwQ) Through 59 MMLU-Pro CS Runs 2 -- 2024-12-05
Releasing: A dataset of two million Bluesky posts 2 -- 2024-11-27
Just launched MilkDropLM model using 32B parameters 2 -- 2024-12-20
FineMath: the best public math pre-training dataset 2 -- 2024-12-19
I-JEPA Hugging Face 2 -- 2024-12-09
FineWeb2 dataset: A sparkling update with 1000s of languages 2 -- 2024-12-08
Vdr-2B-multi-v1 a multilingual embedding model for visual document retrieval 2 -- 2025-01-10
Show HN: We collected detailed annotations for text-to-image generation 2 -- 2025-01-10
Hugging Face Smolagents 2 -- 2025-01-05
Hugging Face advocates for Code Agents: agents that write tool calls as … 2 -- 2025-01-02
ModernBERT: Encoder-only Transformer Model Strictly Improving on past work 2 -- 2025-01-01
Polish linguistic and cultural competency benchmark for LLMs 2 -- 2024-12-31
Flex.1-Alpha – A new modded Flux model that can properly handle being … 2 -- 2025-01-19
OpenAI o3 just scored 99.8% on CodeForces using brute-force 2 -- 2025-02-12
FinePersonas 2 -- 2025-02-10
#9: Does AI Remember? The Role of Memory in Agentic Workflows 2 -- 2025-02-03
Mistral-Small-24B-Base-2501 2 -- 2025-01-30
Generate Images, Chat with PDF in WebGPU via DeepSeek Janus Pro 1B 2 -- 2025-01-28
The state of open video generation models 2 -- 2025-01-28
Bespoke-Stratos-17k: Open Reasoning Dataset by Distilling DeepSeek-R1 2 -- 2025-01-27
DeepSeek-R1 WebGPU 2 -- 2025-01-22
FastRTC: The Real-Time Communication Library for Python 2 -- 2025-02-25
Show HN: Roast Any Website with AI 2 -- 2025-02-25
SWE-Lancer: Can LLMs Earn $1M from Real-World Freelance Software Engineering? 2 -- 2025-02-18
Desklib AI Detector Ranks No 1 on Raid Benchmark for AI Detection 2 -- 2025-02-17
Forget What You Know about LLMs Evaluations – LLMs Are Like a … 2 -- 2025-02-13
JFK Assassination Records Dataset on Hugging Face 2 -- 2025-04-09
Show HN: My progress towards building a robotics training dataset 2 -- 2025-03-18
HOGWILD! Inference – parallel LLM chain-of-thought with shared attention 2 -- 2025-04-09
Llama-4 Model-Based Agentic AI System HuggingFace Released 2 -- 2025-04-06
Llama 3.2 from-scratch implementation focused on code readability 2 -- 2025-04-01
deepsite 2 -- 2025-03-31
SuperBPE: Space Travel for Language Models 2 -- 2025-03-29
Gemma3 on Hugging Face 2 -- 2025-03-26
Open-source LLM beats OpenAI o1 and DeepSeek-R1 for PyTorch-to-Triton codegen 2 -- 2025-03-19
Cohere: Command A (111B Open Weights Model) 2 -- 2025-03-14
Open Dataset: Vehicle Accidents 2 -- 2025-03-13
Show HN: TTS Arena V2 2 -- 2025-05-02
WebThinker: Empowering Large Reasoning Models with Deep Research Capability 2 -- 2025-05-01
MamayLM: An Efficient Ukrainian LLM 2 -- 2025-04-23
Show HN: AEE – An Open-Source Engine That Evaluates Truth and Bias … 2 -- 2025-04-13
Magi-1: Autoregressive Video Generation at Scale 2 -- 2025-05-06
The 4 Things the Qwen-3's Chat Template Teaches Us 2 -- 2025-05-02
Show HN: A synthetic text dataset to train tiny language models on 2 -- 2025-05-01
Phi-4-Reasoning 2 -- 2025-05-01
FantasyTalking: Realistic Talking Portrait Generation 2 -- 2025-04-30
Neural Network Visualizer 2 -- 2025-04-29
The Bitter Lesson Learned from 2k Multilingual Benchmarks 2 -- 2025-04-23
ThinkFlow: The Revolutionary Platform That Gives LLMs the Power to Think 2 -- 2025-04-19
Microsoft BitNet 1.58bit LLM 2B4T released 2 -- 2025-04-16
SOTA Model in 8B Size? 2 -- 2025-05-29
TiRex Leads Gift Eval 2 -- 2025-06-02
How do AI political biases differ between English and French? 2 -- 2025-05-21
KernelLLM – Meta's new 8B SotA model 2 -- 2025-05-19
Wan: Open and Advanced Large-Scale Video Generative Models 2 -- 2025-05-14
Embedding Benchmark for Retrieval 2 -- 2025-06-11
MiniCPM4 – a series of open multimodal models for edge inference 2 -- 2025-06-10
The Qwen3 Embedding Model 2 -- 2025-06-06
Tiny Agents in Python: an MCP-powered agent in ~70 lines of code 2 -- 2025-05-23
Show HN: 2.4x faster baai/bge-M3 2 -- 2025-05-18
Vision Language Models (Better, Faster, Stronger) 2 -- 2025-05-13
Building and better understanding vision-language models (2024) 2 -- 2025-05-10
FLUX Kontext Dev Ultra Fast Live 2 -- 2025-06-26
Veena – open-source TTS for Indian Languages 2 -- 2025-06-25
Metalorian: Generate Heavy Metal-Binding Peptides with Diffusion Sampling 2 -- 2025-07-12
Kimi-K2-Base 2 -- 2025-07-11
Building the Hugging Face MCP Server 2 -- 2025-07-10
A Survey on Latent Reasoning 2 -- 2025-07-10
Skywork-R1V3-38B open-source multimodal reasoning model 2 -- 2025-07-08
HuggingChat is shutting down (for now) 2 -- 2025-07-04
Qwen3Guard: Real-Time Safety for Your Token Stream 2 -- 2025-09-24
K2-Think: A Parameter-Efficient Reasoning System 2 -- 2025-09-13
Environments Hub: Your Language Model needs better (open) environments to learn 2 -- 2025-09-05
Pivotal Token Search (PTS): Targeting Critical Decision Points in LLM Training 2 -- 2025-08-18
Voxtral WebGPU 2 -- 2025-07-25
Show HN: kulyk-uk-en and kulyk-en-uk 2 -- 2025-07-22
Show HN: KaniTTS – Ultra Fast and Expressive TTS Model 2 -- 2025-09-22
N-Atlas V1 2 -- 2025-09-21
Granite docling 258M: a small multimodal model for efficient document conversion 2 -- 2025-09-17
Statistical Methods in Generative AI 2 -- 2025-09-16
EmbeddingGemma is a 300M parameter, open embedding model from Google 2 -- 2025-09-05
Swiss AI Initiative 2 -- 2025-09-02
Apertus LLM 2 -- 2025-09-02
Hugging Face spreadsheet tool: AI Sheets 2 -- 2025-09-01
A Novel Pretrained Tokenizer-Free LLM Architecture 2 -- 2025-08-29
MiniCPM-V 4.5: GPT-4o Level MLLM for Image and Video Understanding on Your … 2 -- 2025-08-26
NASA and IBM release open source model on Hugging Face to predict … 2 -- 2025-08-20
Tokenizers 2 -- 2025-08-17
FormulaOne: A reasoning benchmark that all models score 0% on 2 -- 2025-08-14
dots.ocr: Multilingual Document Layout Parsing in a Single Vision-Language Model 2 -- 2025-08-06
Qwen3-30B-A3B-Thinking-2507 has been released 2 -- 2025-07-31
Intern-S1: A 241B parameter open-source MoE multimodal model 2 -- 2025-07-28
Creating custom kernels for the AMD MI300 2 -- 2025-07-25
Fast LoRA Inference for Flux with Diffusers and PEFT 2 -- 2025-07-24
Nvidia parakeet-tdt-0.6B-v2 2 -- 2025-07-22
How to Run a Hugging Face Model in Jax (Part 1) 2 -- 2025-07-20
Show HN: Chimera-QxD-BMM-Qwen2-l22_28-alphaqd-1.5B-f16 2 -- 2025-07-19
Show HN: Embedding model for PDF page retrieval 1 -- 2024-08-08
Nvidia Just Published ChatQA 1.5, a Llama3 QA/RAG Finetune 1 -- 2024-05-02
Show HN: Elon Musk's Tweet Classifier 1 -- 2022-04-30
Get Insulted by AI 1 -- 2024-02-25
Launch of F.ai Fuzer v0.1 on HuggingFace Space using Gradio 1 -- 2024-07-29
With LLMs we can create an open-source Library of Alexandria 1 -- 2023-09-28
Show HN: Find Your Celebrity Lookalike (With AI) 1 -- 2023-01-04
Stable Diffusion trained with “El Risitas” dataset 1 -- 2022-10-27
SmolLM2: The new, best, and open small language model 1 -- 2024-11-01
The Romulus model series has been released on Hugging Face 1 -- 2024-09-11
I added context data to the TruthfulQA dataset 1 -- 2024-08-10
Chinese AI Community: open-source Heatmap 1 -- 2024-07-31
Multi-token prediction models and baselines 1 -- 2024-07-04
Stupid Filter Corpus (2007) 1 -- 2024-05-24
MMLU-Pro: Advanced edition of MMLU & new Leaderboard 1 -- 2024-05-15
Ratchet and Phi 3 1 -- 2024-05-01
Snowflake Arctic Instruct Open LLM 1 -- 2024-04-24
LegalKit Retrieval, binary Search with int8 Rescoring through French legal codes 1 -- 2024-04-08
MANATEE(lm): Market Analysis based on language model architectures 1 -- 2024-03-20
Adding NVMe SSDs to Enable and Accelerate 100B Model Fine-Tuning on a … 1 -- 2024-03-13
Serverless Image Similarity with Upstash Vector and HuggingFace Spaces 1 -- 2024-02-02
Dutch Drug-Related Text Classification Model by NOS 1 -- 2024-01-25
Implement Fractional GPUs in Kubernetes to save up to 50% cost 1 -- 2024-01-22
The next person that says textual modalities gets it 1 -- 2024-01-10
LLaMA Pro: Progressive LLaMA with Block Expansion 1 -- 2024-01-05
DiffMorpher – Using Diffusion Models for Image Morphing 1 -- 2023-12-24
Tencent Announces AppAgent 1 -- 2023-12-22
How Do Prompt Injection Scanners Perform? A Benchmark 1 -- 2023-12-07
Show HN: ChatData – an open-source ChatGPT-like chatbot 1 -- 2023-11-29
3D Gaussian Splat Viewer (top item) 1 -- 2023-10-23
Who loves you Hacker News? 1 -- 2023-10-12
Curious about Causality and Generative Models? Check Out This Demo 1 -- 2023-07-26
Have You Tried AWS Inferentia2 for ML Deployments? 1 -- 2023-07-16
Open Source LLM Inference DLC 1 -- 2023-06-29
WizardCoder: Empowering Code Large Language Models with Evol-Instruct 1 -- 2023-06-15
Text Embedding Benchmark (MTEB) Leaderboard 1 -- 2023-02-20
Diffusion Models Live Event with Hugging Face 1 -- 2022-11-25
Train a language model with Megatron-LM and convert it to Transformers 1 -- 2022-09-13
Multilingual GPT model with 1.3B parameters trained on 25 languages 1 -- 2022-05-01
Hugging Face Model Comparator Space Builder 1 -- 2022-03-28
Halo: Open-Source Health Tracking with Wearables 1 -- 2024-11-20
Releasing the largest multilingual open pretraining dataset 1 -- 2024-11-14
Qwen 2.5 Coder: LLM model based on Qwen 2.5 architecture optimised for … 1 -- 2024-11-12
Providing Open Investment Data – 25 years of data 1 -- 2024-11-11
New Sota Text to Image 1 -- 2024-10-31
Stable Diffusion 3.5 Medium 1 -- 2024-10-29
Kolors Virtual Try-On in the Wild 1 -- 2024-10-28
Google Shopping 10M Dataset: One of the Largest for Multimodal Product Retrieval 1 -- 2024-10-23
Stable Diffusion 3.5-large released 1 -- 2024-10-22
Transformers.js v3: WebGPU Support, New Models and Tasks, and More 1 -- 2024-10-22
Allegro – New Open Source Text to Video Generator from Rhymes AI 1 -- 2024-10-22
Distilabel Synthetic Data Generator on Hugging Face 1 -- 2024-10-17
HF's Open LLM Leaderboard releases Comparator to drill down in LLM performance 1 -- 2024-10-17
Show HN: A dataset of all HN submission texts (2006-2024) in Markdown 1 -- 2024-10-13
Scaling AI-Based Data Processing with Hugging Face and Dask 1 -- 2024-10-10
LLMs Know More Than They Show 1 -- 2024-10-08
Document Similarity Search with ColPali 1 -- 2024-09-29
Prithvi WxC: Foundation Model for Weather and Climate 1 -- 2024-09-24
Show HN: Fusion-Guide: A Model for Generating Cot Reasoning and Guidance 1 -- 2024-09-24
HN-Style HuggingFace Daily Papers 1 -- 2024-09-22
Qwen2.5-Coder Technical Report 1 -- 2024-09-21
Introducing Community Tools on HuggingChat 1 -- 2024-09-20
InkubaLM-0.4B: Small language model for low-resource African Languages 1 -- 2024-08-29
Diffusion models are real time game engines 1 -- 2024-08-29
Everchanging Quest: Rogue-like game powered by LLMs 1 -- 2024-08-21
xLSTM Model Trained on Music 1 -- 2024-08-16
Qwen2-VL 1 -- 2024-08-14
Scaling LLM Test-Time Compute More Effective Than Scaling Model Parameters 1 -- 2024-08-07
Depth Compare – A Hugging Face space to compare different depth models 1 -- 2024-07-29
Insilico Medicine on Hugging Face 1 -- 2024-07-27
LAVE: Zero-Shot VQA Evaluation on Docmatix with LLMs 1 -- 2024-07-26
Spreadsheetllm: Encoding Spreadsheets for Large Language Models 1 -- 2024-07-24
Followgraph for Hugging Face 1 -- 2024-07-23
Show HN: Variable-length (up to 47s) stereo audio at 44.1kHz from text … 1 -- 2024-07-23
Scaling Diffusion Transformers to 16B Parameters 1 -- 2024-07-19
DeepSeek v2 Chat (0628) released 1 -- 2024-07-18
The Rise of Agentic Data Generation 1 -- 2024-07-15
Fast SD3 Medium 1 -- 2024-07-10
Agentic RAG: query reformulation and self-query 1 -- 2024-07-08
Meta LLM Compiler 1 -- 2024-06-29
Allegro-TI2V: an open source video generation model 1 -- 2024-11-27
PR Puppet Sora 1 -- 2024-11-27
Lightricks/LTX-Video – first real-time video generation model 1 -- 2024-11-23
PaliGemma 2 – New vision language models by Google 1 -- 2024-12-05
Open Source Developers Guide to the EU AI Act 1 -- 2024-12-03
LM Studio using models from Hugging Face 1 -- 2024-12-02
IC Light – Shade Generation Model 1 -- 2024-12-02
ModernBERT 1 -- 2024-12-20
Show HN: A ML powered text moderation model that outperforms Open AI 1 -- 2024-12-14
Help Us Rank the Best Background Removal Tools 1 -- 2024-12-11
I need your help to create brain-rot dataset 1 -- 2024-12-08
Phi-4 GGUF 1 -- 2024-12-14
HunyuanVideo and Diffusers Made Easy 1 -- 2024-12-11
Show HN: An Agentic AI dataset for deepfake detection 1 -- 2025-01-15
FP8 DeepSeek R1 Distilled LLMs for SGLang and VLLM 1 -- 2025-01-29