HuggingFace Hacker News

Posts by Month (447 total)

Hacker News Posts

Title	Points	Comments	Date
Uncensor any LLM with abliteration	586	--	2024-06-13
Llama-3.3-70B-Instruct	425	--	2024-12-06
A Replacement for BERT	348	--	2024-12-19
Microsoft Phi-2 model changes licence to MIT	240	--	2024-01-06
Space secrets leak disclosure	197	--	2024-06-01
Best 7B LLM on leaderboards made by an amateur following a medium …	181	--	2024-01-05
Llama 3 8B is almost as good as Wizard 2 8x22B	168	--	2024-04-19
Nvidia releases NVLM 1.0 72B open weight model	167	--	2024-10-02
Explaining the SDXL Latent Space	163	--	2024-02-05
Hugging Face and Google partner for AI collaboration	152	--	2024-01-25
A CC-By Open-Source TTS Model with Voice Cloning	131	--	2024-11-04
FineWeb: Decanting the web for the finest text data at scale	127	--	2024-06-02
HuggingChat: Chat with Open Source Models	103	--	2024-02-21
More than 80 AI models from Qualcomm	95	--	2024-02-28
LLaMA-Pro-8B	94	--	2024-01-06
Apple/OpenELM: Efficient Open-Source Family Language Models	82	--	2024-04-24
YouTube-Commons: Audio transcripts of 2,063,066 YouTube videos, CC-By license	75	--	2024-04-18
Show HN: Simply Reading Analog Gauges – GPT4, CogVLM Can't	66	--	2024-01-22
MSFT's WizardLM2 models have been taken down	58	--	2024-04-16
LiteLlama-460M-1T has 460M parameters trained with 1T tokens	54	--	2024-01-07
Fine-Tuning LLMs to 1.58bit	52	--	2024-09-18
LLaMA 3 70B Llamafiles	51	--	2024-04-19
DeepSeek v3 beats Claude sonnet 3.5 and way cheaper	48	--	2024-12-26
Improving Parquet Dedupe on Hugging Face Hub	47	--	2024-10-08
Open-LLM performances are plateauing	46	--	2024-06-29
Mixtral-8x22B on HuggingFace	33	--	2024-04-10
General OCR Theory: Towards OCR-2.0 via a Unified End-to-End Model	31	--	2024-09-11
Zephyr 141B, a Mixtral 8x22B fine-tune, is now available in Hugging Chat	30	--	2024-04-12
OpenFLUX.1	30	--	2024-10-04
Mistral 7B v0.2	29	--	2024-03-31
Video2Game: Real-Time, Interactive, Realistic Environment from a Single Video	28	--	2024-04-16
Llama-3.2-3B-Instruct-uncensored	26	--	2024-09-27
Llama can now see and run on your device – welcome Llama …	26	--	2024-09-25
New Phi-3.5 Models from Microsoft, including new MoE	25	--	2024-08-20
LLM: Transformer Is Linear	25	--	2024-05-24
HuggingFace - Tencent launches Hunyuan Large which outperforms Llama 3.1 405B	23	--	2024-11-05
Lineage Explorer for open source models – Hugging Face Space	22	--	2024-01-18
Show HN: Fineweb-Edu-Fortified dataset: Fineweb-Edu deduped, embeddings included	22	--	2024-08-14
Llama 3.2	21	--	2024-09-25
Fine-tune and deploy open LLMs as containers using AIKit - Part 1	19	--	2024-06-06
makeMoE: Implement a Sparse Mixture of Experts LLM from Scratch	19	--	2024-01-23
HuggingFace to Replace Git LFS with Xet	18	--	2024-08-23
Fake Insects: a game where you have to identify AI-generated insects	18	--	2024-08-17
Mixtral-8x22B-Instruct-v0.1	18	--	2024-04-17
Hermes-2-Pro-Llama-3-8B	18	--	2024-05-01
StableLM-2-12B	17	--	2024-04-08
NuExtract: A LLM for Structured Extraction	16	--	2024-06-29
An Analysis of Chinese LLM Censorship and Bias with Qwen 2 Instruct	16	--	2024-06-09
Phi-3 Weights Released	16	--	2024-04-23
New medical LLM beats Med-PaLM-2, GPT-4 on MMLU benchmarks	16	--	2024-07-31
Miqu 70B – possible leak of the mistral-medium LLM	16	--	2024-01-29
Ollama can run any GGUF Model on Hugging Face Hub now	15	--	2024-10-16
Llama-3-70B-Instruct-Gradient-1048k	14	--	2024-05-04
New finance LLM passed the CFA Level III exam	14	--	2024-07-31
Run Mistral 7B model using less than 4GB of memory on your …	14	--	2024-07-23
Stable Diffusion 3 Medium Released	14	--	2024-06-12
Pre-computed vector embeddings available on HuggingFace	14	--	2024-01-22
Yi-9B-200K	13	--	2024-03-17
An Introduction to Vision-Language Modeling	13	--	2024-05-28
FineWeb: 15T tokens of the finest data the web has to offer	12	--	2024-04-21
Language model can listen while speaking	12	--	2024-08-07
ML for 3D Course on Hugging Face	12	--	2024-05-16
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs	12	--	2024-04-09
Command-R: open weights 35B params / 128k tokens context length model by …	12	--	2024-03-11
StarCoder2 and The Stack v2: new code LLMs and dataset	12	--	2024-02-28
Jamba-v0.1: An Apache 2.0 licensed 52B Mamba Transformer hybrid LLM base model	12	--	2024-03-28
HuggingFace Is Down	11	--	2024-02-28
Experiments with Bitnet 1.5 (Ngmi)	11	--	2024-03-23
FalconMamba 7B: The first attention-free and general-purpose pure Mamba model	11	--	2024-08-13
NPC-Playground, a 3D playground to interact with LLM-powered NPCs	11	--	2024-06-05
Open LLM Leaderboard	11	--	2024-01-02
CryptGPT: A Simple Approach to Privacy-Preserving LLMs Using Vigenere Cipher	10	--	2024-06-15
Whisperfile	10	--	2024-08-19
Llava Model for Video	10	--	2024-05-16
Show HN: Encrypted Credit Card Approval Using Homomorphic Encryption	10	--	2024-01-31
Vector embeddings model for medical literature	10	--	2024-01-08
Show HN: Downloadable AI Musical Instruments	10	--	2024-12-10
Not All Language Model Features Are Linear	9	--	2024-05-25
Nvidia releases weights for Llama-3.1-Nemotron-70B-Instruct	9	--	2024-10-16
Perspectives for first principles prompt engineering	9	--	2024-08-20
ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models	9	--	2024-05-28
Argilla released Notux 8x7B - DPO fine-tune of Mixtral 8x7B	9	--	2024-01-04
Mistral-Large-Instruct-2411 – advanced dense Large Language Model (LLM) 123B	9	--	2024-11-18
MIT Researchers Unveil New Method to Improve LLM Inference Performance	9	--	2024-10-04
Aryn/deformable-detr-DocLayNet – open-source Layout Model	9	--	2024-07-31
AIMO (AI Math Olympiad) progress prize winning solution	9	--	2024-07-10
Mistral-7B-v0.3 released on HuggingFace	9	--	2024-05-22
Microsoft Phi-3 3.8B model with 128k Context	9	--	2024-04-23
The Stack v2: a 3B files in 600 programming languages dataset	9	--	2024-03-07
Spaces ZeroGPU: Dynamic GPU Allocation for Spaces	9	--	2024-12-15
NousResearch/Nous-Hermes-2-Llama-2-70B	8	--	2024-02-12
Show HN: We made an encrypted DNA testing app using Homomorphic Encryption	8	--	2024-10-02
NexusRaven-V2-13B	8	--	2024-01-25
Open-source 70B model surpass GPT-4o and Claude 3.5 on Arena Hard	8	--	2024-10-15
Llama 3.1 70B compressed by 6.4x using AQLM-PV, now released	8	--	2024-09-17
Mistral AI Pixtral	8	--	2024-09-11
Gradio Notebook – Generative AI Notebook Interface for Hugging Face Spaces	8	--	2024-02-14
Scaling Test Time Compute with Open Models	8	--	2024-12-16
Phi-3 Technical a Highly Capable Language Model Locally on Your Phone	7	--	2024-04-23
Am I in the Stack?	7	--	2024-03-20
Common Corpus: the largest public domain dataset for training LLMs	7	--	2024-03-20
Hugging Face launches Agents 2.0	7	--	2024-05-13
OpenHermesPreferences: Dataset of ~1M AI preferences from teknium/OpenHermes-2.5	7	--	2024-02-26
Mini- Dust3r: A miniature version of dust3r running in a HuggingFace Space	7	--	2024-05-16
1B+ words corpus of original texts and experimental post-OCR correction output	7	--	2024-04-26
Show HN: Chess-LLM, using constrained-generation to force LLMs to battle it out	7	--	2024-03-14
Grandmaster-Level Chess Without Search	7	--	2024-02-08
Create a Web Interface for Your LLM in Python	7	--	2024-01-23
New leaderboard drop: Judge Arena	6	--	2024-11-19
Phased Consistency Model	6	--	2024-05-29
A Llama 70B finetune that has reflection baked into it's weights	6	--	2024-09-05
Show HN: Understand politics by visualising manifesto embeddings	6	--	2024-07-07
Mistral releases the v0.3 of its 7B LLM	6	--	2024-05-22
Idefics2: A Powerful 8B Vision-Language Model for the Community	6	--	2024-05-14
Show HN: Open-source LLM for data labeling	6	--	2024-05-08
Dolphin-2.9-Llama3-8B	6	--	2024-04-21
Introduction to 3D Gaussian Splatting	6	--	2024-04-02
Gemma-2 2B beats GPT3.5 on Chatbot Arena	5	--	2024-07-31
FineWeb-Edu: new 1.3T tokens web dataset	5	--	2024-06-02
Wall Street Journal Hedcut Stable Diffusion Model	5	--	2024-01-23
Hertz-dev is an open-source model for full-duplex conversational audio	5	--	2024-11-16
New Dataset: RedPajama Dynamic Topic Modeling, 100K Docs W Topic Heirarchies	5	--	2024-11-11
Hugging Face launches HUGS: managed containers for on-premise model deployment	5	--	2024-10-23
Janus-1.3B: Unifying Multimodal Understanding and Generation	5	--	2024-10-18
Show HN: Arch-Function: 3B parameter LLM that beats GPT-4o on function calling	5	--	2024-10-16
Model2Vec: Make sentence transformers 500x faster on CPU, 15x smaller	5	--	2024-10-16
Whisper-Large-v3-Turbo	5	--	2024-10-03
Show HN: Automatic chaptering – From raw transcripts to structured documents	5	--	2024-09-09
TabReD: A Benchmark of Tabular Machine Learning In-the-Wild	5	--	2024-07-04
Microsoft releases weights for Florence-2 vision model	5	--	2024-06-19
Phi-3-medium-128k-instruct	5	--	2024-05-22
Ferret-v2: An Improved Baseline for Referring and Grounding with LLMs	5	--	2024-04-13
Gretel: Synthetic Text to SQL Dataset	5	--	2024-04-04
Detecting performance and ethical vulnerabilities in popular Hugging Face models	5	--	2024-03-21
Design2Code: How Far Are We from Automating Front-End Engineering?	5	--	2024-03-10
Genie: Generative Interactive Environments	5	--	2024-02-26
TTS Arena: Benchmarking TTS Models in the Wild	5	--	2024-02-25
Cosmopedia: the largest synthetic dataset of textbooks generated by Mixtral	5	--	2024-02-20
Moonshine – open-source, real-time speech-to-text in the browser	5	--	2024-12-19
Google's Bard surpassing GPT-4, SECOND SPOT on the leaderboard	4	--	2024-01-26
Octopus V4: a graph of language models	4	--	2024-05-02
Llama-3 8B Instruct 262k	4	--	2024-04-26
CodeGemma – an official Google release for code LLMs	4	--	2024-04-09
Apple Open-Sources LLM DCLM-7B	4	--	2024-07-19
Open LLM Leaderboard v2	4	--	2024-06-29
Florence 2, Microsoft OCR Modell	4	--	2024-06-20
Apple OpenELM Instruct Models	4	--	2024-04-24
Phi-3 Released	4	--	2024-04-23
GemMoE: An 8x8 Mixture Of Experts based on Gemma	4	--	2024-03-13
Pearl-3x7B, an xtraordinary Mixure of Experts (MoE) for data science	4	--	2024-02-07
Introduction to State Space Models (SSM)	4	--	2024-01-24
HtmlRAG: HTML Is Better Than Plain Text for RAG Systems	4	--	2024-11-06
Structured generation with Outlines, now in Rust	4	--	2024-10-22
Llama 3.2 in the Browser with WebGPU	4	--	2024-09-30
Multimodal TextImage Augmentation for Document Images	4	--	2024-09-14
'Reflection 70B' AI model could be the answer to pesky LLM hallucinations	4	--	2024-09-06
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers	4	--	2024-08-14
FHE can be leveraged for LLMs such as ChatGPT in a privacy-preserving …	4	--	2024-08-13
Introduction to Ggml	4	--	2024-08-13
Google releases Gemma 2 2B, ShieldGemma and Gemma Scope	4	--	2024-08-01
Gemma 2 2B Release	4	--	2024-08-01
Extracting Concepts from LLMs: Anthropic's recent discoveries	4	--	2024-06-08
EasyAnimate: End-to-end solution for high-resolution and long video generation	4	--	2024-06-04
Grokked Transformers Are Implicit Reasoners	4	--	2024-05-27
Paligemma: A versatile and lightweight vision-language model (VLM)	4	--	2024-05-14
4M Context – Llama-3-8B-Instruct	4	--	2024-05-09
ReFT: Representation Finetuning for Language Models	4	--	2024-04-05
Embedding Quantization: 25-45x retrieval speedup, 32x or 4x less memory usage	4	--	2024-03-22
Show HN: Chatbot Guardrails Arena	4	--	2024-03-21
Quanto: A PyTorch Quantization Toolkit	4	--	2024-03-18
On-device background removal with Transformers.js	4	--	2024-02-07
SegMoE: Segmind Mixture of Diffusion Experts	4	--	2024-02-05
NPHardEval leaderboard a benchmark for assessing the reasoning abilities of LLMs	4	--	2024-02-03
HuggingChat Assistants: Open source models with custom instructions	4	--	2024-02-02
From Files to Chunks: Improving HF Storage Efficiency	4	--	2024-11-20
Show HN: Video Composition Tool Powered by Qwen2.5-Coder and FFmpeg	4	--	2024-11-24
Show HN: LatComp – Compress your image into a small and reversible …	4	--	2024-11-30
DeepSeek-V3-Base	4	--	2024-12-25
Show HN: Turn Any Article into a Conversation-Like Podcast	3	--	2024-05-22
Open NotebookLM – Generate Podcasts from PDFs Using Open-Source AI	3	--	2024-10-15
AI has a problem with objectifying women	3	--	2024-05-28
Linus Torvalds Chat Bot	3	--	2024-02-02
ChatQA: Building GPT-4 Level Conversational QA Models	3	--	2024-01-19
Frames: Factuality, Retrieval, and Reasoning MEasurement Set	3	--	2024-10-01
Show HN: We just dropped a 8B alternative of OpenAI GPT-o1 and …	3	--	2024-09-20
Chronos-T5 (Tiny) – pretrained time series forecasting models	3	--	2024-08-14
HF for Legal, an open-source community on Hugging Face	3	--	2024-07-01
LegalKit, French labeled datasets built for legal ML training	3	--	2024-06-27
Nvidia releases ChatQA-1.5 in violation of Llama 3 license	3	--	2024-05-02
Layer Skip: Enabling Early Exit Inference and Self-Speculative Decoding	3	--	2024-04-26
Everyone seems to have forgotten about Gemma	3	--	2024-04-25
Introducing the Open Chain of Thought Leaderboard	3	--	2024-04-23
Google Gemma 1.1 2B and 7B instruct	3	--	2024-04-06
Starcoder-2	3	--	2024-02-28
DevPearl-2x7B, an xtraordinary Mixture of Experts (MoE) for development	3	--	2024-02-09
Nous-Hermes-2-SOLAR-10.7B	3	--	2024-01-02
SemScore: Evaluating LLMs with Semantic Similarity	3	--	2024-11-06
Meta released MobileLLM – 125M, 350M, 600M, 1B model checkpoints	3	--	2024-10-31
Hugging Face Now Automatically Detects Leaked Secrets	3	--	2024-09-05
Selective fine-tuning of Language Models with Spectrum	3	--	2024-09-03
Idefics3: Open multimodal model based on Llama-3.1-8B	3	--	2024-08-09
New Google Gemma 2 2B model	3	--	2024-07-31
Fine-Tune Llama 3.1 Ultra-Efficiently with Unsloth	3	--	2024-07-29
DiLoCo: Distributed Low-Communication Training of Language Models	3	--	2024-07-26
The largest math dataset of Olympiad problems for training LLMs	3	--	2024-07-21
SmolLM – Fast and Remarkably Powerful	3	--	2024-07-16
Whisper WebGPU: Real-time in-browser speech recognition	3	--	2024-06-08
UGI Leaderboard – Uncensored General Intelligence	3	--	2024-06-07
Transformers Are SSMs: Generalized Models and Efficient Algorithms Through	3	--	2024-06-04
Recovering 4D World from Monocular Video	3	--	2024-05-29
LiteVAE: Lightweight and Efficient Variational Autoencoders for Diffusion Models	3	--	2024-05-26
Advancing Theorem Proving in LLMs Through Large-Scale Synthetic Data	3	--	2024-05-26
Phi-3 in-browser inference using WebGPU	3	--	2024-05-08
Show HN: GPT Fine-Tune Formatter	3	--	2024-05-07
InstantMesh: Efficient 3D Mesh Generation from a Single Image	3	--	2024-04-15
Mixture of Finetuned and GPT4 Model	3	--	2024-04-07
H2O-Danube2-1.8B-Chat	3	--	2024-04-07
Yi-9B	3	--	2024-04-05
Dolphin-2.8-mistral-7B-v02	3	--	2024-04-03
Common Corpus – Start of the largest public domain dataset for training …	3	--	2024-03-20
MoAI: Mixture of All Intelligence for Large Language and Vision Models	3	--	2024-03-14
OpenChat-3.5-0106-Gemma	3	--	2024-03-10
Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping	3	--	2024-02-23
Microsoft's LongRoPE: Extending LLM Context Window Beyond 2M Tokens	3	--	2024-02-22
Stable Diffusion XL Lightning	3	--	2024-02-21
Enterprise Scenarios leaderboard evals the perf. of LLMs on enterprise use cases	3	--	2024-02-03
Show HN: A lineage explorer for open source models and datasets	3	--	2024-01-23
Aim – An Apple Collection	3	--	2024-01-19
LLaVA-3B	3	--	2024-01-01
Dataset Card for 1M Bluesky Posts	3	--	2024-11-27
New 2B vision language model that consumes the least memory	3	--	2024-11-26
New synthetic dataset beating MSFT and mistral's SFT recipe	3	--	2024-11-22
Show HN: MilkDropLM – generate presets for the MilkDrop music visualizer	3	--	2024-12-06
Quantum+AI Qiskit Code Assistant Open Source model	3	--	2024-11-27
informatiker/20-million-bluesky-posts	3	--	2024-11-29
Automated GitHub Issue Creation Using Structured Generation	3	--	2024-11-29
QwQ-32B-Preview	3	--	2024-11-27
Welcome to the Falcon 3 Family of Open Models	3	--	2024-12-17
Meta releases family of multimodal models that comprehend hour-long video	3	--	2024-12-16
Finding Moroccan Arabic (Darija) in the Fineweb 2 Dataset	3	--	2024-12-09
Llama 3 8B Instruct quantized with GPTQ to fit in 10gb vRAM	2	--	2024-04-19
Try Qwen2.5-Coder-32B on HuggingChat	2	--	2024-11-12
An orthogonalized AI to introduce an unengaged melancholic style	2	--	2024-06-13
Pearl-7B-slerp, an xtraordinary 7B model for maths	2	--	2024-02-05
Duckdb-nsql: 7B parameter text-to-SQL model by MotherDuck and Numbers Station	2	--	2024-01-28
7B model from Snorkel tops Alpaca Eval 2.0 leaderboard	2	--	2024-01-24
LongVU – New Video LLM from Meta	2	--	2024-10-24
Hacker News Comments Dataset	2	--	2024-10-11
HuggingFace Accelerate 1.0.0	2	--	2024-10-07
Mistral-Small-Instruct-2409	2	--	2024-09-17
HuggingChat: Chat with Llama 3.1 (70B and 405B)	2	--	2024-07-23
Ocean Biodiversity Information System on Hugging Face	2	--	2024-07-21
CommonCanvas image generation from CC-licensed images – models, dataset released	2	--	2024-06-07
Show HN: PodGen generate podcasts on any topic	2	--	2024-06-01
Meteor: Mamba-Based Traversal of Rationale for Large Language and Vision Models	2	--	2024-05-28
The Waifu Research Department	2	--	2024-05-16
Yi-1.5 LLM Models Released	2	--	2024-05-12
Fietje: An open and efficient LLM for Dutch	2	--	2024-05-02
Simple Multimodal LLM from Scratch	2	--	2024-04-23
Stability Releases Code Instruct 3B	2	--	2024-04-02
Mistral 7B v0.2	2	--	2024-04-01
PolarsBot, a New HuggingChat Assistant	2	--	2024-03-25
Easy and low cost model training on HF "DGX cloud"	2	--	2024-03-19
Pearl-7B-0211 LLM now exceeds 75 in the average score of the HF's …	2	--	2024-02-19
LLMs can learn useful guidelines from their own mistakes	2	--	2024-02-12
Pearl-7B-0210-dare now sits next to the best 7Bs on HF Leaderboard	2	--	2024-02-11
Aanaphi-2 3B	2	--	2024-02-09
Playground for Hugging Face Models	2	--	2024-02-05
Hallucinations Leaderboard	2	--	2024-01-29
Fine-tune Wav2Vec2-BERT for low resource speech recognition	2	--	2024-01-23
InstantID Demo: Zero-Shot Identity-Preserving Generation in Seconds	2	--	2024-01-22
Yayi2-30B-Llama	2	--	2024-01-01
Pixtral-Large-Instruct-2411	2	--	2024-11-18
FLUX.1-Dev LoRA Outfit Generator by TryOn Labs	2	--	2024-11-06
Contextual Document Embeddings	2	--	2024-11-01
Code a Simple RAG from Scratch – Hugging Face Community Article	2	--	2024-10-30
OmniParser for Pure Vision Based GUI Agent	2	--	2024-10-25
Hugs – Scale Your AI with Open Models	2	--	2024-10-23
Wpaigpt-SQL-01: text-to-SQL model designed for WordPress and WordPress plugins	2	--	2024-10-23
Pickle Scanning	2	--	2024-10-23
New Video Generation Model：Allegro	2	--	2024-10-22
TxT360	2	--	2024-10-18
Dataset About Where 30k+ Startups Trend	2	--	2024-10-18
Nvidia Nemotron	2	--	2024-10-17
Fixing Gradient Accumulation	2	--	2024-10-16
Animate-X: Universal Character Image Animation with Enhanced Motion	2	--	2024-10-15
SOTA Open Source Text to Video Model	2	--	2024-10-14
Exploring the Daily Papers Page on Hugging Face	2	--	2024-09-24
Multilingual MMLU Dataset from OpenAI (OpenAI/Mmmlu)	2	--	2024-09-23
Recreating o1 at Home with Role-Play LLMs	2	--	2024-09-21
FineVideo: Annotated YouTube Dataset by HuggingFace	2	--	2024-09-12
Remove Background by Text	2	--	2024-09-12
Labeled Image generation using Meta Llama 3.5	2	--	2024-08-31
Scaling robotics datasets with video encoding	2	--	2024-08-30
New FashionCLIP and SigLIP Classification Demo	2	--	2024-08-28
Mozilla/TriLM-Llamafile · Hugging Face	2	--	2024-08-26
Play: How random can a human brain truly be?	2	--	2024-08-24
FLUX.1 [Schnell] – a Hugging Face Space by black-forest-labs	2	--	2024-08-21
Flux Dev 1 model that creates half_illustration images	2	--	2024-08-21
LLMs as Image Generators with Canonical Codec Representations	2	--	2024-08-19
Instant in-browser demo of SmolLM	2	--	2024-08-18
Marqo-FashionCLIP: New Embedding Model for Fashion	2	--	2024-08-14
A Large-Scale Multimodal Dataset with Multigranular Annotations for Medicine	2	--	2024-08-07
Generate and Export Segmentation Masks Using Meta's SAMv2	2	--	2024-07-31
HuggingChat: Chat with Llama 3.1 405B	2	--	2024-07-25
Meta-Llama-3.1-405B	2	--	2024-07-23
Apple's DCLM model shares data&training code with weights	2	--	2024-07-20
Predicting Multiplication with GPT-2	2	--	2024-07-20
Qwen2 Technical Report	2	--	2024-07-16
Gemma-2-27B-it llamafile	2	--	2024-07-03
OpenRAIL: Towards open and responsible AI licensing frameworks (2022)	2	--	2024-07-03
New LLM Agent writing actions in Python code tops the GAIA agent …	2	--	2024-07-01
Stable Diffusion 3 Medium Online Demo, Free	2	--	2024-06-12
To Believe or Not to Believe Your LLM	2	--	2024-06-11
Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-Modal LLMs	2	--	2024-06-04
Map-Neo: Highly Capable and Transparent Bilingual Large Language Model Series	2	--	2024-05-31
Training and Finetuning Embedding Models with Sentence Transformers v3	2	--	2024-05-30
ChatTTS – open-source TTS model designed specifically for dialogue scenario	2	--	2024-05-29
Matryoshka Multimodal Models	2	--	2024-05-28
Aya 23: Open Weight Releases to Further Multilingual Progress	2	--	2024-05-28
HuggingFace Hub Incident Post Mortem	2	--	2024-05-24
Cohere Updates Weights for Aya	2	--	2024-05-23
Hugging Face on AMD Instinct MI300 GPU	2	--	2024-05-23
Show HN: Generate a Quiz from Any Url	2	--	2024-05-17
Show HN: EmuBert – the first open encoder model for Australian law	2	--	2024-05-14
New Yi 1.5 models under Apache 2.0	2	--	2024-05-12
Building Cost-Efficient Enterprise RAG Applications	2	--	2024-05-10
Google codegemma-1.1-7B-it	2	--	2024-05-03
Introduction to Matryoshka Embedding Models	2	--	2024-05-03
Iterative Reasoning Preference Optimization	2	--	2024-05-02
GPT-2	2	--	2024-05-01
Fine-tune Llama 3 with ORPO	2	--	2024-04-23
In-browser text-to-music generation using musicgen-small	2	--	2024-04-20
Compression Represents Intelligence Linearly	2	--	2024-04-16
Bringing serverless GPU inference to Hugging Face users	2	--	2024-04-16
From Words to Numbers: Your LLM Is a Capable Regressor	2	--	2024-04-12
Zephyr-orpo-141B-A35B: Mixtral 8x22B fine-tune by HuggingFace	2	--	2024-04-11
TinyTimeMixer: Open-source time series LLM by IBM	2	--	2024-04-09
Visual Autoregressive Modeling: Scalable Image Generation W NextScale Prediction	2	--	2024-04-05
Command R+	2	--	2024-04-04
Demo of Moondream2 vision language model running in browser	2	--	2024-04-03
Mini-Jamba	2	--	2024-04-01
Transformer-Lite: High-Efficiency Deployment of LLMs on Mobile Phone GPUs	2	--	2024-04-01
The Era of 1-Bit LLMs: All Large Language Models Are in 1.58 …	2	--	2024-03-25
Cosmopedia: How to create large-scale synthetic data for pre-training	2	--	2024-03-21
Playground-v2.5-1024px-Aesthetic	2	--	2024-03-16
Gemini 1.5: Unlocking multimodal understanding across tokens of context	2	--	2024-03-15
Better RAG 1: Advanced Basics	2	--	2024-03-15
Cerebrum 7B – Mistral fine-tune created specifically for reasoning tasks	2	--	2024-03-13
LLM Red-Teaming Resistance Leaderboard	2	--	2024-03-01
Show HN: Visualize how you split your document into chunks for RAG …	2	--	2024-02-27
From OpenAI to Open LLMs with Messages API on Hugging Face	2	--	2024-02-23
C4: colossal cleaned version of Common Crawl's web crawl corpus	2	--	2024-02-21
Constitutional AI with Open LLMs	2	--	2024-02-01
Show HN: 2x Faster Stable Diffusion Models on Hugging Face with Pruna …	2	--	2024-01-31
AMUSEd: Efficient Text-to-Image Generation	2	--	2024-01-29
Minillama – 4.1 MB LLM for testing	2	--	2024-01-20
StableLM 2 Zephyr 1.6B	2	--	2024-01-20
Local vector embeddings index for analyzing ArXiv papers	2	--	2024-01-17
Stable Zero123 Model Weights get Released. Text to 3D and image to …	2	--	2024-01-15
Make LLM Fine-Tuning 2x Faster with Unsloth and HuggingFace TRL	2	--	2024-01-10
OpenChat-3.5 Update 0106: ChatGPT-level performances accessible locally	2	--	2024-01-10
Revolutionizing AI with Audio Classification via Computer Vision	2	--	2024-01-02
OpenGPT-X	2	--	2024-11-26
Show HN: AI Hackathon_ Prize 20K USD '1-Min Creative Innovation with AI'	2	--	2024-11-28
The Lichess database is now on Hugging Face	2	--	2024-12-06
LLM Comparison/Test: 25 SOTA LLMs (Including QwQ) Through 59 MMLU-Pro CS Runs	2	--	2024-12-05
Releasing: A dataset of two million Bluesky posts	2	--	2024-11-27
Just launched MilkDropLM model using 32B parameters	2	--	2024-12-20
FineMath: the best public math pre-training dataset	2	--	2024-12-19
I-JEPA Hugginface	2	--	2024-12-09
FineWeb2 dataset: A sparkling update with 1000s of languages	2	--	2024-12-08
Polish linguistic and cultural competency benchmark for LLMs	2	--	2024-12-31
Show HN: Embedding model for PDF page retrieval	1	--	2024-08-08
Nvidia Just Published ChatQA 1.5, a Llama3 QA/RAG Finetune	1	--	2024-05-02
Get Insulted by AI	1	--	2024-02-25
Launch of F.ai Fuzer v0.1 on HuggingFace Space using Gradio	1	--	2024-07-29
SmolLM2: The new, best, and open small language model	1	--	2024-11-01
The Romulus model series has been released on Hugging Face	1	--	2024-09-11
I added context data to the TruthfulQA dataset	1	--	2024-08-10
Chinese AI Community: open-source Heatmap	1	--	2024-07-31
Multi-token prediction models and baselines	1	--	2024-07-04
Stupid Filter Corpus (2007)	1	--	2024-05-24
MMLU-Pro: Advanced edition of MMLU & new Leaderboard	1	--	2024-05-15
Ratchet and Phi 3	1	--	2024-05-01
Snowflake Arctic Instruct Open LLM	1	--	2024-04-24
LegalKit Retrieval, binary Search with int8 Rescoring through French legal codes	1	--	2024-04-08
MANATEE(lm): Market Analysis based on language model architectures	1	--	2024-03-20
Adding NVMe SSDs to Enable and Accelerate 100B Model Fine-Tuning on a …	1	--	2024-03-13
Serverless Image Similarity with Upstash Vector and HuggingFace Spaces	1	--	2024-02-02
Dutch Drug-Related Text Classification Model by NOS	1	--	2024-01-25
Implement Fractional GPUs in Kubernetes to save upto 50% cost	1	--	2024-01-22
The next person that says textual modalities gets it	1	--	2024-01-10
LLaMA Pro: Progressive LLaMA with Block Expansion	1	--	2024-01-05
Halo: Open-Source Health Tracking with Wearables	1	--	2024-11-20
Releasing the largest multilingual open pretraining dataset	1	--	2024-11-14
Qwen 2.5 Coder: LLM model based on Qwen 2.5 architecture optimised for …	1	--	2024-11-12
Providing Open Investment Data – 25 years of data	1	--	2024-11-11
New Sota Text to Image	1	--	2024-10-31
Stable Diffusion 3.5 Medium	1	--	2024-10-29
Kolors Virtual Try-On in the Wild	1	--	2024-10-28
Google Shopping 10M Dataset: One of the Largest for Multimodal Product Retrieval	1	--	2024-10-23
Stable Diffusion 3.5-large released	1	--	2024-10-22
Transformers.js v3: WebGPU Support, New Models and Tasks, and More	1	--	2024-10-22
Allegro – New Open Source Text to Video Generator from Rhymes AI	1	--	2024-10-22
Distilabel Synthetic Data Generator on Hugging Face	1	--	2024-10-17
HF's Open LLM Leaderboard releases Comparator to drill down in LLM performance	1	--	2024-10-17
Show HN: A dataset of all HN submission texts (2006-2024) in Markdown	1	--	2024-10-13
Scaling AI-Based Data Processing with Hugging Face and Dask	1	--	2024-10-10
LLMs Know More Than They Show	1	--	2024-10-08
Document Similarity Search with ColPali	1	--	2024-09-29
Prithvi WxC: Foundation Model for Weather and Climate	1	--	2024-09-24
Show HN: Fusion-Guide: A Model for Generating Cot Reasoning and Guidance	1	--	2024-09-24
HN-Style HuggingFace Daily Papers	1	--	2024-09-22
Qwen2.5-Coder Technical Report	1	--	2024-09-21
Introducing Community Tools on HuggingChat	1	--	2024-09-20
InkubaLM-0.4B: Small language model for low-resource African Languages	1	--	2024-08-29
Diffusion models are real time game engines	1	--	2024-08-29
Everchanging Quest: Rogue-like game powered by LLMs	1	--	2024-08-21
xLSTM Model Trained on Music	1	--	2024-08-16
Qwen2-VL	1	--	2024-08-14
Scaling LLM Test-Time Compute More Effective Than Scaling Model Parameters	1	--	2024-08-07
Depth Compare – A Hugging Face space to compare different depth models	1	--	2024-07-29
Insilico Medicine on Hugging Face	1	--	2024-07-27
LAVE: Zero-Shot VQA Evaluation on Docmatix with LLMs	1	--	2024-07-26
Spreadsheetllm: Encoding Spreadsheets for Large Language Models	1	--	2024-07-24
Followgraph for Hugging Face	1	--	2024-07-23
Show HN: Variable-length (up to 47s) stereo audio at 44.1kHz from text …	1	--	2024-07-23
Scaling Diffusion Transformers to 16B Parameters	1	--	2024-07-19
DeepSeek v2 Chat (0628) released	1	--	2024-07-18
The Rise of Agentic Data Generation	1	--	2024-07-15
Fast SD3 Medium	1	--	2024-07-10
Agentic RAG: query reformulation and self-query	1	--	2024-07-08
Meta LLM Compiler	1	--	2024-06-29
Allegro-TI2V: an open source video generation model	1	--	2024-11-27
PR Puppet Sora	1	--	2024-11-27
Lightricks/LTX-Video – first real-time video generation model	1	--	2024-11-23
PaliGemma 2 – New vision language models by Google	1	--	2024-12-05
Open Source Developers Guide to the EU AI Act	1	--	2024-12-03
LM Studio using models from Hugging Face	1	--	2024-12-02
IC Light – Shade Generation Model	1	--	2024-12-02
ModernBERT	1	--	2024-12-20
Show HN: A ML powered text moderation model that outperforms Open AI	1	--	2024-12-14
Help Us Rank the Best Background Removal Tools	1	--	2024-12-11
I need your help to create brain-rot dataset	1	--	2024-12-08
Phi-4 GGUF	1	--	2024-12-14
HunyuanVideo and Diffusers Made Easy	1	--	2024-12-11

Plushcap, by Matt Makai. 2021-2026.
