Hugging Face Hacker News

Posts by Month (78 total)

Hacker News Posts

Title	Points	Comments	Date
DeepSeek-v3.2: Pushing the frontier of open large language models [pdf]	978	--	2025-12-01
Uncensor any LLM with abliteration	586	--	2024-06-13
Show HN: Sweep, Open-weights 1.5B model for next-edit autocomplete	530	--	2026-01-21
Kimi K2.7-Code: open-source coding model with better token efficiency	455	--	2026-06-12
Deepseek R1-0528	451	--	2025-05-28
Llama-3.3-70B-Instruct	425	--	2024-12-06
Try Stable Diffusion's Img2Img Mode	415	--	2022-08-29
Show HN: Hacker News archive (47M+ items, 11.6GB) as Parquet, updated every …	399	--	2026-03-14
Open-R1: an open reproduction of DeepSeek-R1	394	--	2025-01-28
Smollm3: Smol, multilingual, long-context reasoner LLM	388	--	2025-07-08
GLM-4.7-Flash	371	--	2026-01-19
Nanonets-OCR-s – OCR model that transforms documents into structured markdown	361	--	2025-06-16
A Replacement for BERT	348	--	2024-12-19
MonadGPT – What would have happened if ChatGPT was invented in the …	323	--	2023-11-24
Apertus 70B: Truly Open - Swiss LLM by ETH, EPFL and CSCS	319	--	2025-09-02
DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning	263	--	2025-12-01
The Smol Training Playbook: The Secrets to Building World-Class LLMs	262	--	2025-10-30
LLM in a Flash: Efficient LLM Inference with Limited Memory	252	--	2023-12-20
Microsoft Phi-2 model changes licence to MIT	240	--	2024-01-06
Falcon 180B	238	--	2023-09-06
OpenLLaMA 13B Released	229	--	2023-06-18
Kokoro WebGPU: Real-time text-to-speech 100% locally in the browser	227	--	2025-02-07
Hugging Face Releases Agents	214	--	2023-05-10
Space secrets leak disclosure	197	--	2024-06-01
BigCode Project Releases StarCoder: A 15B Code LLM	185	--	2023-05-04
Best 7B LLM on leaderboards made by an amateur following a medium …	181	--	2024-01-05
Stability.ai sent a take down request to Runway ML's SD v1.5 citing …	179	--	2022-10-20
We raised $100M for open and collaborative machine learning	175	--	2022-05-09
Llama 3 8B is almost as good as Wizard 2 8x22B	168	--	2024-04-19
SantaCoder: A new 1.1B code model for generation and infilling	168	--	2022-12-22
Nvidia releases NVLM 1.0 72B open weight model	167	--	2024-10-02
Qwen3-4B-Thinking-2507	166	--	2025-08-06
StackLlama: A hands-on guide to train LlaMa with RLHF	165	--	2023-04-06
Explaining the SDXL Latent Space	163	--	2024-02-05
BLOOM: The largest open multilingual language model	160	--	2022-07-12
DeepSeek-V4: Towards Highly Efficient Million-Token Context Intelligence	159	--	2026-04-24
Show HN: Text-to-video model from scratch (2 brothers, 2 years, 2B params)	156	--	2026-01-22
Hugging Face and Google partner for AI collaboration	152	--	2024-01-25
Qwen3-235B-A22B-Thinking-2507	152	--	2025-07-25
Show HN: Penny-1.7B Irish Penny Journal style transfer	149	--	2025-06-02
Wordalle – Guess the prompt used to generate a set of images …	137	--	2022-07-01
Mistral-8x7B-Chat	131	--	2023-12-10
A CC-By Open-Source TTS Model with Voice Cloning	131	--	2024-11-04
Qwen-Image-Layered: transparency and layer aware open diffusion model	130	--	2025-12-19
FineWeb: Decanting the web for the finest text data at scale	127	--	2024-06-02
Yi-34B-Chat	115	--	2023-11-24
GPT-3.5 and Wolfram Alpha via LangChain	107	--	2023-01-18
The Falcon has landed in the Hugging Face ecosystem	105	--	2023-06-05
HuggingChat: Chat with Open Source Models	103	--	2024-02-21
Hugging Face and AWS partner to make AI more accessible	102	--	2023-02-21
HuggingFace Training Cluster as a Service	101	--	2023-09-05
More than 80 AI models from Qualcomm	95	--	2024-02-28
Segmind Stable Diffusion – A smaller version of Stable Diffusion XL	95	--	2023-10-25
LLaMA-Pro-8B	94	--	2024-01-06
HuggingChat	93	--	2023-04-25
Yarn-Mistral-7B-128k	88	--	2023-11-11
Qwen3 30B-A3B	87	--	2025-07-30
Apple/OpenELM: Efficient Open-Source Family Language Models	82	--	2024-04-24
Sparse LLM Inference on CPU: 75% fewer parameters	78	--	2023-10-19
Pokemon GAN	77	--	2022-02-14
YouTube-Commons: Audio transcripts of 2,063,066 YouTube videos, CC-By license	75	--	2024-04-18
Switch Transformers C – 2048 experts (1.6T params for 3.1 TB) (2022)	73	--	2023-11-20
Multimodal Neurons in Pretrained Text-Only Transformers	66	--	2023-08-04
Show HN: Simply Reading Analog Gauges – GPT4, CogVLM Can't	66	--	2024-01-22
Voxtral-Mini-3B-2507 – Open source speech understanding model	64	--	2025-07-15
Open-sourcing 5,000hrs of self-driving dataset	63	--	2025-03-11
HuggingChat – ChatGPT alternative with open source models	61	--	2023-12-15
MSFT's WizardLM2 models have been taken down	58	--	2024-04-16
OpenLLaMA 7B Training Completed to 1T Tokens	58	--	2023-06-07
Phi-2	57	--	2023-12-13
Dolphin-2_6-Phi-2	56	--	2023-12-24
Alibaba releases 72B LLM with 32k context length	55	--	2023-11-30
LiteLlama-460M-1T has 460M parameters trained with 1T tokens	54	--	2024-01-07
Qwen Image	54	--	2025-08-04
Fine-Tuning LLMs to 1.58bit	52	--	2024-09-18
Train faster static embedding models with sentence transformers	52	--	2025-01-15
Show HN: ChatToSTL – AI text-to-CAD for 3D printing	52	--	2025-06-12
LLaMA 3 70B Llamafiles	51	--	2024-04-19

Plushcap, by Matt Makai. 2021-2026.
