HuggingFace on HN

447 posts with 1+ points in 2024

Hacker News Posts
Title Points Comments Date
Uncensor any LLM with abliteration 586 -- 2024-06-13
Llama-3.3-70B-Instruct 425 -- 2024-12-06
A Replacement for BERT 348 -- 2024-12-19
Microsoft Phi-2 model changes licence to MIT 240 -- 2024-01-06
Space secrets leak disclosure 197 -- 2024-06-01
Best 7B LLM on leaderboards made by an amateur following a medium … 181 -- 2024-01-05
Llama 3 8B is almost as good as Wizard 2 8x22B 168 -- 2024-04-19
Nvidia releases NVLM 1.0 72B open weight model 167 -- 2024-10-02
Explaining the SDXL Latent Space 163 -- 2024-02-05
Hugging Face and Google partner for AI collaboration 152 -- 2024-01-25
A CC-By Open-Source TTS Model with Voice Cloning 131 -- 2024-11-04
FineWeb: Decanting the web for the finest text data at scale 127 -- 2024-06-02
HuggingChat: Chat with Open Source Models 103 -- 2024-02-21
More than 80 AI models from Qualcomm 95 -- 2024-02-28
LLaMA-Pro-8B 94 -- 2024-01-06
Apple/OpenELM: Efficient Open-Source Family Language Models 82 -- 2024-04-24
YouTube-Commons: Audio transcripts of 2,063,066 YouTube videos, CC-By license 75 -- 2024-04-18
Show HN: Simply Reading Analog Gauges – GPT4, CogVLM Can't 66 -- 2024-01-22
MSFT's WizardLM2 models have been taken down 58 -- 2024-04-16
LiteLlama-460M-1T has 460M parameters trained with 1T tokens 54 -- 2024-01-07
Fine-Tuning LLMs to 1.58bit 52 -- 2024-09-18
LLaMA 3 70B Llamafiles 51 -- 2024-04-19
DeepSeek v3 beats Claude sonnet 3.5 and way cheaper 48 -- 2024-12-26
Improving Parquet Dedupe on Hugging Face Hub 47 -- 2024-10-08
Open-LLM performances are plateauing 46 -- 2024-06-29
Mixtral-8x22B on HuggingFace 33 -- 2024-04-10
General OCR Theory: Towards OCR-2.0 via a Unified End-to-End Model 31 -- 2024-09-11
Zephyr 141B, a Mixtral 8x22B fine-tune, is now available in Hugging Chat 30 -- 2024-04-12
OpenFLUX.1 30 -- 2024-10-04
Mistral 7B v0.2 29 -- 2024-03-31
Video2Game: Real-Time, Interactive, Realistic Environment from a Single Video 28 -- 2024-04-16
Llama-3.2-3B-Instruct-uncensored 26 -- 2024-09-27
Llama can now see and run on your device – welcome Llama … 26 -- 2024-09-25
New Phi-3.5 Models from Microsoft, including new MoE 25 -- 2024-08-20
LLM: Transformer Is Linear 25 -- 2024-05-24
HuggingFace - Tencent launches Hunyuan Large which outperforms Llama 3.1 405B 23 -- 2024-11-05
Lineage Explorer for open source models – Hugging Face Space 22 -- 2024-01-18
Show HN: Fineweb-Edu-Fortified dataset: Fineweb-Edu deduped, embeddings included 22 -- 2024-08-14
Llama 3.2 21 -- 2024-09-25
Fine-tune and deploy open LLMs as containers using AIKit - Part 1 19 -- 2024-06-06
makeMoE: Implement a Sparse Mixture of Experts LLM from Scratch 19 -- 2024-01-23
HuggingFace to Replace Git LFS with Xet 18 -- 2024-08-23
Fake Insects: a game where you have to identify AI-generated insects 18 -- 2024-08-17
Mixtral-8x22B-Instruct-v0.1 18 -- 2024-04-17
Hermes-2-Pro-Llama-3-8B 18 -- 2024-05-01
StableLM-2-12B 17 -- 2024-04-08
NuExtract: A LLM for Structured Extraction 16 -- 2024-06-29
An Analysis of Chinese LLM Censorship and Bias with Qwen 2 Instruct 16 -- 2024-06-09
Phi-3 Weights Released 16 -- 2024-04-23
New medical LLM beats Med-PaLM-2, GPT-4 on MMLU benchmarks 16 -- 2024-07-31
Miqu 70B – possible leak of the mistral-medium LLM 16 -- 2024-01-29
Ollama can run any GGUF Model on Hugging Face Hub now 15 -- 2024-10-16
Llama-3-70B-Instruct-Gradient-1048k 14 -- 2024-05-04
New finance LLM passed the CFA Level III exam 14 -- 2024-07-31
Run Mistral 7B model using less than 4GB of memory on your … 14 -- 2024-07-23
Stable Diffusion 3 Medium Released 14 -- 2024-06-12
Pre-computed vector embeddings available on HuggingFace 14 -- 2024-01-22
Yi-9B-200K 13 -- 2024-03-17
An Introduction to Vision-Language Modeling 13 -- 2024-05-28
FineWeb: 15T tokens of the finest data the web has to offer 12 -- 2024-04-21
Language model can listen while speaking 12 -- 2024-08-07
ML for 3D Course on Hugging Face 12 -- 2024-05-16
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs 12 -- 2024-04-09
Command-R: open weights 35B params / 128k tokens context length model by … 12 -- 2024-03-11
StarCoder2 and The Stack v2: new code LLMs and dataset 12 -- 2024-02-28
Jamba-v0.1: An Apache 2.0 licensed 52B Mamba Transformer hybrid LLM base model 12 -- 2024-03-28
HuggingFace Is Down 11 -- 2024-02-28
Experiments with Bitnet 1.5 (Ngmi) 11 -- 2024-03-23
FalconMamba 7B: The first attention-free and general-purpose pure Mamba model 11 -- 2024-08-13
NPC-Playground, a 3D playground to interact with LLM-powered NPCs 11 -- 2024-06-05
Open LLM Leaderboard 11 -- 2024-01-02
CryptGPT: A Simple Approach to Privacy-Preserving LLMs Using Vigenere Cipher 10 -- 2024-06-15
Whisperfile 10 -- 2024-08-19
Llava Model for Video 10 -- 2024-05-16
Show HN: Encrypted Credit Card Approval Using Homomorphic Encryption 10 -- 2024-01-31
Vector embeddings model for medical literature 10 -- 2024-01-08
Show HN: Downloadable AI Musical Instruments 10 -- 2024-12-10
Not All Language Model Features Are Linear 9 -- 2024-05-25
Nvidia releases weights for Llama-3.1-Nemotron-70B-Instruct 9 -- 2024-10-16
Perspectives for first principles prompt engineering 9 -- 2024-08-20
ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models 9 -- 2024-05-28
Argilla released Notux 8x7B - DPO fine-tune of Mixtral 8x7B 9 -- 2024-01-04
Mistral-Large-Instruct-2411 – advanced dense Large Language Model (LLM) 123B 9 -- 2024-11-18
MIT Researchers Unveil New Method to Improve LLM Inference Performance 9 -- 2024-10-04
Aryn/deformable-detr-DocLayNet – open-source Layout Model 9 -- 2024-07-31
AIMO (AI Math Olympiad) progress prize winning solution 9 -- 2024-07-10
Mistral-7B-v0.3 released on HuggingFace 9 -- 2024-05-22
Microsoft Phi-3 3.8B model with 128k Context 9 -- 2024-04-23
The Stack v2: a 3B files in 600 programming languages dataset 9 -- 2024-03-07
Spaces ZeroGPU: Dynamic GPU Allocation for Spaces 9 -- 2024-12-15
NousResearch/Nous-Hermes-2-Llama-2-70B 8 -- 2024-02-12
Show HN: We made an encrypted DNA testing app using Homomorphic Encryption 8 -- 2024-10-02
NexusRaven-V2-13B 8 -- 2024-01-25
Open-source 70B model surpass GPT-4o and Claude 3.5 on Arena Hard 8 -- 2024-10-15
Llama 3.1 70B compressed by 6.4x using AQLM-PV, now released 8 -- 2024-09-17
Mistral AI Pixtral 8 -- 2024-09-11
Gradio Notebook – Generative AI Notebook Interface for Hugging Face Spaces 8 -- 2024-02-14
Scaling Test Time Compute with Open Models 8 -- 2024-12-16
Phi-3 Technical a Highly Capable Language Model Locally on Your Phone 7 -- 2024-04-23
Am I in the Stack? 7 -- 2024-03-20
Common Corpus: the largest public domain dataset for training LLMs 7 -- 2024-03-20
Hugging Face launches Agents 2.0 7 -- 2024-05-13
OpenHermesPreferences: Dataset of ~1M AI preferences from teknium/OpenHermes-2.5 7 -- 2024-02-26
Mini- Dust3r: A miniature version of dust3r running in a HuggingFace Space 7 -- 2024-05-16
1B+ words corpus of original texts and experimental post-OCR correction output 7 -- 2024-04-26
Show HN: Chess-LLM, using constrained-generation to force LLMs to battle it out 7 -- 2024-03-14
Grandmaster-Level Chess Without Search 7 -- 2024-02-08
Create a Web Interface for Your LLM in Python 7 -- 2024-01-23
New leaderboard drop: Judge Arena 6 -- 2024-11-19
Phased Consistency Model 6 -- 2024-05-29
A Llama 70B finetune that has reflection baked into it's weights 6 -- 2024-09-05
Show HN: Understand politics by visualising manifesto embeddings 6 -- 2024-07-07
Mistral releases the v0.3 of its 7B LLM 6 -- 2024-05-22
Idefics2: A Powerful 8B Vision-Language Model for the Community 6 -- 2024-05-14
Show HN: Open-source LLM for data labeling 6 -- 2024-05-08
Dolphin-2.9-Llama3-8B 6 -- 2024-04-21
Introduction to 3D Gaussian Splatting 6 -- 2024-04-02
Gemma-2 2B beats GPT3.5 on Chatbot Arena 5 -- 2024-07-31
FineWeb-Edu: new 1.3T tokens web dataset 5 -- 2024-06-02
Wall Street Journal Hedcut Stable Diffusion Model 5 -- 2024-01-23
Hertz-dev is an open-source model for full-duplex conversational audio 5 -- 2024-11-16
New Dataset: RedPajama Dynamic Topic Modeling, 100K Docs W Topic Heirarchies 5 -- 2024-11-11
Hugging Face launches HUGS: managed containers for on-premise model deployment 5 -- 2024-10-23
Janus-1.3B: Unifying Multimodal Understanding and Generation 5 -- 2024-10-18
Show HN: Arch-Function: 3B parameter LLM that beats GPT-4o on function calling 5 -- 2024-10-16
Model2Vec: Make sentence transformers 500x faster on CPU, 15x smaller 5 -- 2024-10-16
Whisper-Large-v3-Turbo 5 -- 2024-10-03
Show HN: Automatic chaptering – From raw transcripts to structured documents 5 -- 2024-09-09
TabReD: A Benchmark of Tabular Machine Learning In-the-Wild 5 -- 2024-07-04
Microsoft releases weights for Florence-2 vision model 5 -- 2024-06-19
Phi-3-medium-128k-instruct 5 -- 2024-05-22
Ferret-v2: An Improved Baseline for Referring and Grounding with LLMs 5 -- 2024-04-13
Gretel: Synthetic Text to SQL Dataset 5 -- 2024-04-04
Detecting performance and ethical vulnerabilities in popular Hugging Face models 5 -- 2024-03-21
Design2Code: How Far Are We from Automating Front-End Engineering? 5 -- 2024-03-10
Genie: Generative Interactive Environments 5 -- 2024-02-26
TTS Arena: Benchmarking TTS Models in the Wild 5 -- 2024-02-25
Cosmopedia: the largest synthetic dataset of textbooks generated by Mixtral 5 -- 2024-02-20
Moonshine – open-source, real-time speech-to-text in the browser 5 -- 2024-12-19
Google's Bard surpassing GPT-4, SECOND SPOT on the leaderboard 4 -- 2024-01-26
Octopus V4: a graph of language models 4 -- 2024-05-02
Llama-3 8B Instruct 262k 4 -- 2024-04-26
CodeGemma – an official Google release for code LLMs 4 -- 2024-04-09
Apple Open-Sources LLM DCLM-7B 4 -- 2024-07-19
Open LLM Leaderboard v2 4 -- 2024-06-29
Florence 2, Microsoft OCR Modell 4 -- 2024-06-20
Apple OpenELM Instruct Models 4 -- 2024-04-24
Phi-3 Released 4 -- 2024-04-23
GemMoE: An 8x8 Mixture Of Experts based on Gemma 4 -- 2024-03-13
Pearl-3x7B, an xtraordinary Mixure of Experts (MoE) for data science 4 -- 2024-02-07
Introduction to State Space Models (SSM) 4 -- 2024-01-24
HtmlRAG: HTML Is Better Than Plain Text for RAG Systems 4 -- 2024-11-06
Structured generation with Outlines, now in Rust 4 -- 2024-10-22
Llama 3.2 in the Browser with WebGPU 4 -- 2024-09-30
Multimodal TextImage Augmentation for Document Images 4 -- 2024-09-14
'Reflection 70B' AI model could be the answer to pesky LLM hallucinations 4 -- 2024-09-06
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers 4 -- 2024-08-14
FHE can be leveraged for LLMs such as ChatGPT in a privacy-preserving … 4 -- 2024-08-13
Introduction to Ggml 4 -- 2024-08-13
Google releases Gemma 2 2B, ShieldGemma and Gemma Scope 4 -- 2024-08-01
Gemma 2 2B Release 4 -- 2024-08-01
Extracting Concepts from LLMs: Anthropic's recent discoveries 4 -- 2024-06-08
EasyAnimate: End-to-end solution for high-resolution and long video generation 4 -- 2024-06-04
Grokked Transformers Are Implicit Reasoners 4 -- 2024-05-27
Paligemma: A versatile and lightweight vision-language model (VLM) 4 -- 2024-05-14
4M Context – Llama-3-8B-Instruct 4 -- 2024-05-09
ReFT: Representation Finetuning for Language Models 4 -- 2024-04-05
Embedding Quantization: 25-45x retrieval speedup, 32x or 4x less memory usage 4 -- 2024-03-22
Show HN: Chatbot Guardrails Arena 4 -- 2024-03-21
Quanto: A PyTorch Quantization Toolkit 4 -- 2024-03-18
On-device background removal with Transformers.js 4 -- 2024-02-07
SegMoE: Segmind Mixture of Diffusion Experts 4 -- 2024-02-05
NPHardEval leaderboard a benchmark for assessing the reasoning abilities of LLMs 4 -- 2024-02-03
HuggingChat Assistants: Open source models with custom instructions 4 -- 2024-02-02
From Files to Chunks: Improving HF Storage Efficiency 4 -- 2024-11-20
Show HN: Video Composition Tool Powered by Qwen2.5-Coder and FFmpeg 4 -- 2024-11-24
Show HN: LatComp – Compress your image into a small and reversible … 4 -- 2024-11-30
DeepSeek-V3-Base 4 -- 2024-12-25
Show HN: Turn Any Article into a Conversation-Like Podcast 3 -- 2024-05-22
Open NotebookLM – Generate Podcasts from PDFs Using Open-Source AI 3 -- 2024-10-15
AI has a problem with objectifying women 3 -- 2024-05-28
Linus Torvalds Chat Bot 3 -- 2024-02-02
ChatQA: Building GPT-4 Level Conversational QA Models 3 -- 2024-01-19
Frames: Factuality, Retrieval, and Reasoning MEasurement Set 3 -- 2024-10-01
Show HN: We just dropped a 8B alternative of OpenAI GPT-o1 and … 3 -- 2024-09-20
Chronos-T5 (Tiny) – pretrained time series forecasting models 3 -- 2024-08-14
HF for Legal, an open-source community on Hugging Face 3 -- 2024-07-01
LegalKit, French labeled datasets built for legal ML training 3 -- 2024-06-27
Nvidia releases ChatQA-1.5 in violation of Llama 3 license 3 -- 2024-05-02
Layer Skip: Enabling Early Exit Inference and Self-Speculative Decoding 3 -- 2024-04-26
Everyone seems to have forgotten about Gemma 3 -- 2024-04-25
Introducing the Open Chain of Thought Leaderboard 3 -- 2024-04-23
Google Gemma 1.1 2B and 7B instruct 3 -- 2024-04-06
Starcoder-2 3 -- 2024-02-28
DevPearl-2x7B, an xtraordinary Mixture of Experts (MoE) for development 3 -- 2024-02-09
Nous-Hermes-2-SOLAR-10.7B 3 -- 2024-01-02
SemScore: Evaluating LLMs with Semantic Similarity 3 -- 2024-11-06
Meta released MobileLLM – 125M, 350M, 600M, 1B model checkpoints 3 -- 2024-10-31
Hugging Face Now Automatically Detects Leaked Secrets 3 -- 2024-09-05
Selective fine-tuning of Language Models with Spectrum 3 -- 2024-09-03
Idefics3: Open multimodal model based on Llama-3.1-8B 3 -- 2024-08-09
New Google Gemma 2 2B model 3 -- 2024-07-31
Fine-Tune Llama 3.1 Ultra-Efficiently with Unsloth 3 -- 2024-07-29
DiLoCo: Distributed Low-Communication Training of Language Models 3 -- 2024-07-26
The largest math dataset of Olympiad problems for training LLMs 3 -- 2024-07-21
SmolLM – Fast and Remarkably Powerful 3 -- 2024-07-16
Whisper WebGPU: Real-time in-browser speech recognition 3 -- 2024-06-08
UGI Leaderboard – Uncensored General Intelligence 3 -- 2024-06-07
Transformers Are SSMs: Generalized Models and Efficient Algorithms Through 3 -- 2024-06-04
Recovering 4D World from Monocular Video 3 -- 2024-05-29
LiteVAE: Lightweight and Efficient Variational Autoencoders for Diffusion Models 3 -- 2024-05-26
Advancing Theorem Proving in LLMs Through Large-Scale Synthetic Data 3 -- 2024-05-26
Phi-3 in-browser inference using WebGPU 3 -- 2024-05-08
Show HN: GPT Fine-Tune Formatter 3 -- 2024-05-07
InstantMesh: Efficient 3D Mesh Generation from a Single Image 3 -- 2024-04-15
Mixture of Finetuned and GPT4 Model 3 -- 2024-04-07
H2O-Danube2-1.8B-Chat 3 -- 2024-04-07
Yi-9B 3 -- 2024-04-05
Dolphin-2.8-mistral-7B-v02 3 -- 2024-04-03
Common Corpus – Start of the largest public domain dataset for training … 3 -- 2024-03-20
MoAI: Mixture of All Intelligence for Large Language and Vision Models 3 -- 2024-03-14
OpenChat-3.5-0106-Gemma 3 -- 2024-03-10
Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping 3 -- 2024-02-23
Microsoft's LongRoPE: Extending LLM Context Window Beyond 2M Tokens 3 -- 2024-02-22
Stable Diffusion XL Lightning 3 -- 2024-02-21
Enterprise Scenarios leaderboard evals the perf. of LLMs on enterprise use cases 3 -- 2024-02-03
Show HN: A lineage explorer for open source models and datasets 3 -- 2024-01-23
Aim – An Apple Collection 3 -- 2024-01-19
LLaVA-3B 3 -- 2024-01-01
Dataset Card for 1M Bluesky Posts 3 -- 2024-11-27
New 2B vision language model that consumes the least memory 3 -- 2024-11-26
New synthetic dataset beating MSFT and mistral's SFT recipe 3 -- 2024-11-22
Show HN: MilkDropLM – generate presets for the MilkDrop music visualizer 3 -- 2024-12-06
Quantum+AI Qiskit Code Assistant Open Source model 3 -- 2024-11-27
informatiker/20-million-bluesky-posts 3 -- 2024-11-29
Automated GitHub Issue Creation Using Structured Generation 3 -- 2024-11-29
QwQ-32B-Preview 3 -- 2024-11-27
Welcome to the Falcon 3 Family of Open Models 3 -- 2024-12-17
Meta releases family of multimodal models that comprehend hour-long video 3 -- 2024-12-16
Finding Moroccan Arabic (Darija) in the Fineweb 2 Dataset 3 -- 2024-12-09
Llama 3 8B Instruct quantized with GPTQ to fit in 10gb vRAM 2 -- 2024-04-19
Try Qwen2.5-Coder-32B on HuggingChat 2 -- 2024-11-12
An orthogonalized AI to introduce an unengaged melancholic style 2 -- 2024-06-13
Pearl-7B-slerp, an xtraordinary 7B model for maths 2 -- 2024-02-05
Duckdb-nsql: 7B parameter text-to-SQL model by MotherDuck and Numbers Station 2 -- 2024-01-28
7B model from Snorkel tops Alpaca Eval 2.0 leaderboard 2 -- 2024-01-24
LongVU – New Video LLM from Meta 2 -- 2024-10-24
Hacker News Comments Dataset 2 -- 2024-10-11
HuggingFace Accelerate 1.0.0 2 -- 2024-10-07
Mistral-Small-Instruct-2409 2 -- 2024-09-17
HuggingChat: Chat with Llama 3.1 (70B and 405B) 2 -- 2024-07-23
Ocean Biodiversity Information System on Hugging Face 2 -- 2024-07-21
CommonCanvas image generation from CC-licensed images – models, dataset released 2 -- 2024-06-07
Show HN: PodGen generate podcasts on any topic 2 -- 2024-06-01
Meteor: Mamba-Based Traversal of Rationale for Large Language and Vision Models 2 -- 2024-05-28
The Waifu Research Department 2 -- 2024-05-16
Yi-1.5 LLM Models Released 2 -- 2024-05-12
Fietje: An open and efficient LLM for Dutch 2 -- 2024-05-02
Simple Multimodal LLM from Scratch 2 -- 2024-04-23
Stability Releases Code Instruct 3B 2 -- 2024-04-02
Mistral 7B v0.2 2 -- 2024-04-01
PolarsBot, a New HuggingChat Assistant 2 -- 2024-03-25
Easy and low cost model training on HF "DGX cloud" 2 -- 2024-03-19
Pearl-7B-0211 LLM now exceeds 75 in the average score of the HF's … 2 -- 2024-02-19
LLMs can learn useful guidelines from their own mistakes 2 -- 2024-02-12
Pearl-7B-0210-dare now sits next to the best 7Bs on HF Leaderboard 2 -- 2024-02-11
Aanaphi-2 3B 2 -- 2024-02-09
Playground for Hugging Face Models 2 -- 2024-02-05
Hallucinations Leaderboard 2 -- 2024-01-29
Fine-tune Wav2Vec2-BERT for low resource speech recognition 2 -- 2024-01-23
InstantID Demo: Zero-Shot Identity-Preserving Generation in Seconds 2 -- 2024-01-22
Yayi2-30B-Llama 2 -- 2024-01-01
Pixtral-Large-Instruct-2411 2 -- 2024-11-18
FLUX.1-Dev LoRA Outfit Generator by TryOn Labs 2 -- 2024-11-06
Contextual Document Embeddings 2 -- 2024-11-01
Code a Simple RAG from Scratch – Hugging Face Community Article 2 -- 2024-10-30
OmniParser for Pure Vision Based GUI Agent 2 -- 2024-10-25
Hugs – Scale Your AI with Open Models 2 -- 2024-10-23
Wpaigpt-SQL-01: text-to-SQL model designed for WordPress and WordPress plugins 2 -- 2024-10-23
Pickle Scanning 2 -- 2024-10-23
New Video Generation Model:Allegro 2 -- 2024-10-22
TxT360 2 -- 2024-10-18
Dataset About Where 30k+ Startups Trend 2 -- 2024-10-18
Nvidia Nemotron 2 -- 2024-10-17
Fixing Gradient Accumulation 2 -- 2024-10-16
Animate-X: Universal Character Image Animation with Enhanced Motion 2 -- 2024-10-15
SOTA Open Source Text to Video Model 2 -- 2024-10-14
Exploring the Daily Papers Page on Hugging Face 2 -- 2024-09-24
Multilingual MMLU Dataset from OpenAI (OpenAI/Mmmlu) 2 -- 2024-09-23
Recreating o1 at Home with Role-Play LLMs 2 -- 2024-09-21
FineVideo: Annotated YouTube Dataset by HuggingFace 2 -- 2024-09-12
Remove Background by Text 2 -- 2024-09-12
Labeled Image generation using Meta Llama 3.5 2 -- 2024-08-31
Scaling robotics datasets with video encoding 2 -- 2024-08-30
New FashionCLIP and SigLIP Classification Demo 2 -- 2024-08-28
Mozilla/TriLM-Llamafile · Hugging Face 2 -- 2024-08-26
Play: How random can a human brain truly be? 2 -- 2024-08-24
FLUX.1 [Schnell] – a Hugging Face Space by black-forest-labs 2 -- 2024-08-21
Flux Dev 1 model that creates half_illustration images 2 -- 2024-08-21
LLMs as Image Generators with Canonical Codec Representations 2 -- 2024-08-19
Instant in-browser demo of SmolLM 2 -- 2024-08-18
Marqo-FashionCLIP: New Embedding Model for Fashion 2 -- 2024-08-14
A Large-Scale Multimodal Dataset with Multigranular Annotations for Medicine 2 -- 2024-08-07
Generate and Export Segmentation Masks Using Meta's SAMv2 2 -- 2024-07-31
HuggingChat: Chat with Llama 3.1 405B 2 -- 2024-07-25
Meta-Llama-3.1-405B 2 -- 2024-07-23
Apple's DCLM model shares data&training code with weights 2 -- 2024-07-20
Predicting Multiplication with GPT-2 2 -- 2024-07-20
Qwen2 Technical Report 2 -- 2024-07-16
Gemma-2-27B-it llamafile 2 -- 2024-07-03
OpenRAIL: Towards open and responsible AI licensing frameworks (2022) 2 -- 2024-07-03
New LLM Agent writing actions in Python code tops the GAIA agent … 2 -- 2024-07-01
Stable Diffusion 3 Medium Online Demo, Free 2 -- 2024-06-12
To Believe or Not to Believe Your LLM 2 -- 2024-06-11
Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-Modal LLMs 2 -- 2024-06-04
Map-Neo: Highly Capable and Transparent Bilingual Large Language Model Series 2 -- 2024-05-31
Training and Finetuning Embedding Models with Sentence Transformers v3 2 -- 2024-05-30
ChatTTS – open-source TTS model designed specifically for dialogue scenario 2 -- 2024-05-29
Matryoshka Multimodal Models 2 -- 2024-05-28
Aya 23: Open Weight Releases to Further Multilingual Progress 2 -- 2024-05-28
HuggingFace Hub Incident Post Mortem 2 -- 2024-05-24
Cohere Updates Weights for Aya 2 -- 2024-05-23
Hugging Face on AMD Instinct MI300 GPU 2 -- 2024-05-23
Show HN: Generate a Quiz from Any Url 2 -- 2024-05-17
Show HN: EmuBert – the first open encoder model for Australian law 2 -- 2024-05-14
New Yi 1.5 models under Apache 2.0 2 -- 2024-05-12
Building Cost-Efficient Enterprise RAG Applications 2 -- 2024-05-10
Google codegemma-1.1-7B-it 2 -- 2024-05-03
Introduction to Matryoshka Embedding Models 2 -- 2024-05-03
Iterative Reasoning Preference Optimization 2 -- 2024-05-02
GPT-2 2 -- 2024-05-01
Fine-tune Llama 3 with ORPO 2 -- 2024-04-23
In-browser text-to-music generation using musicgen-small 2 -- 2024-04-20
Compression Represents Intelligence Linearly 2 -- 2024-04-16
Bringing serverless GPU inference to Hugging Face users 2 -- 2024-04-16
From Words to Numbers: Your LLM Is a Capable Regressor 2 -- 2024-04-12
Zephyr-orpo-141B-A35B: Mixtral 8x22B fine-tune by HuggingFace 2 -- 2024-04-11
TinyTimeMixer: Open-source time series LLM by IBM 2 -- 2024-04-09
Visual Autoregressive Modeling: Scalable Image Generation W NextScale Prediction 2 -- 2024-04-05
Command R+ 2 -- 2024-04-04
Demo of Moondream2 vision language model running in browser 2 -- 2024-04-03
Mini-Jamba 2 -- 2024-04-01
Transformer-Lite: High-Efficiency Deployment of LLMs on Mobile Phone GPUs 2 -- 2024-04-01
The Era of 1-Bit LLMs: All Large Language Models Are in 1.58 … 2 -- 2024-03-25
Cosmopedia: How to create large-scale synthetic data for pre-training 2 -- 2024-03-21
Playground-v2.5-1024px-Aesthetic 2 -- 2024-03-16
Gemini 1.5: Unlocking multimodal understanding across tokens of context 2 -- 2024-03-15
Better RAG 1: Advanced Basics 2 -- 2024-03-15
Cerebrum 7B – Mistral fine-tune created specifically for reasoning tasks 2 -- 2024-03-13
LLM Red-Teaming Resistance Leaderboard 2 -- 2024-03-01
Show HN: Visualize how you split your document into chunks for RAG … 2 -- 2024-02-27
From OpenAI to Open LLMs with Messages API on Hugging Face 2 -- 2024-02-23
C4: colossal cleaned version of Common Crawl's web crawl corpus 2 -- 2024-02-21
Constitutional AI with Open LLMs 2 -- 2024-02-01
Show HN: 2x Faster Stable Diffusion Models on Hugging Face with Pruna … 2 -- 2024-01-31
AMUSEd: Efficient Text-to-Image Generation 2 -- 2024-01-29
Minillama – 4.1 MB LLM for testing 2 -- 2024-01-20
StableLM 2 Zephyr 1.6B 2 -- 2024-01-20
Local vector embeddings index for analyzing ArXiv papers 2 -- 2024-01-17
Stable Zero123 Model Weights get Released. Text to 3D and image to … 2 -- 2024-01-15
Make LLM Fine-Tuning 2x Faster with Unsloth and HuggingFace TRL 2 -- 2024-01-10
OpenChat-3.5 Update 0106: ChatGPT-level performances accessible locally 2 -- 2024-01-10
Revolutionizing AI with Audio Classification via Computer Vision 2 -- 2024-01-02
OpenGPT-X 2 -- 2024-11-26
Show HN: AI Hackathon_ Prize 20K USD '1-Min Creative Innovation with AI' 2 -- 2024-11-28
The Lichess database is now on Hugging Face 2 -- 2024-12-06
LLM Comparison/Test: 25 SOTA LLMs (Including QwQ) Through 59 MMLU-Pro CS Runs 2 -- 2024-12-05
Releasing: A dataset of two million Bluesky posts 2 -- 2024-11-27
Just launched MilkDropLM model using 32B parameters 2 -- 2024-12-20
FineMath: the best public math pre-training dataset 2 -- 2024-12-19
I-JEPA Hugginface 2 -- 2024-12-09
FineWeb2 dataset: A sparkling update with 1000s of languages 2 -- 2024-12-08
Polish linguistic and cultural competency benchmark for LLMs 2 -- 2024-12-31
Show HN: Embedding model for PDF page retrieval 1 -- 2024-08-08
Nvidia Just Published ChatQA 1.5, a Llama3 QA/RAG Finetune 1 -- 2024-05-02
Get Insulted by AI 1 -- 2024-02-25
Launch of F.ai Fuzer v0.1 on HuggingFace Space using Gradio 1 -- 2024-07-29
SmolLM2: The new, best, and open small language model 1 -- 2024-11-01
The Romulus model series has been released on Hugging Face 1 -- 2024-09-11
I added context data to the TruthfulQA dataset 1 -- 2024-08-10
Chinese AI Community: open-source Heatmap 1 -- 2024-07-31
Multi-token prediction models and baselines 1 -- 2024-07-04
Stupid Filter Corpus (2007) 1 -- 2024-05-24
MMLU-Pro: Advanced edition of MMLU & new Leaderboard 1 -- 2024-05-15
Ratchet and Phi 3 1 -- 2024-05-01
Snowflake Arctic Instruct Open LLM 1 -- 2024-04-24
LegalKit Retrieval, binary Search with int8 Rescoring through French legal codes 1 -- 2024-04-08
MANATEE(lm): Market Analysis based on language model architectures 1 -- 2024-03-20
Adding NVMe SSDs to Enable and Accelerate 100B Model Fine-Tuning on a … 1 -- 2024-03-13
Serverless Image Similarity with Upstash Vector and HuggingFace Spaces 1 -- 2024-02-02
Dutch Drug-Related Text Classification Model by NOS 1 -- 2024-01-25
Implement Fractional GPUs in Kubernetes to save upto 50% cost 1 -- 2024-01-22
The next person that says textual modalities gets it 1 -- 2024-01-10
LLaMA Pro: Progressive LLaMA with Block Expansion 1 -- 2024-01-05
Halo: Open-Source Health Tracking with Wearables 1 -- 2024-11-20
Releasing the largest multilingual open pretraining dataset 1 -- 2024-11-14
Qwen 2.5 Coder: LLM model based on Qwen 2.5 architecture optimised for … 1 -- 2024-11-12
Providing Open Investment Data – 25 years of data 1 -- 2024-11-11
New Sota Text to Image 1 -- 2024-10-31
Stable Diffusion 3.5 Medium 1 -- 2024-10-29
Kolors Virtual Try-On in the Wild 1 -- 2024-10-28
Google Shopping 10M Dataset: One of the Largest for Multimodal Product Retrieval 1 -- 2024-10-23
Stable Diffusion 3.5-large released 1 -- 2024-10-22
Transformers.js v3: WebGPU Support, New Models and Tasks, and More 1 -- 2024-10-22
Allegro – New Open Source Text to Video Generator from Rhymes AI 1 -- 2024-10-22
Distilabel Synthetic Data Generator on Hugging Face 1 -- 2024-10-17
HF's Open LLM Leaderboard releases Comparator to drill down in LLM performance 1 -- 2024-10-17
Show HN: A dataset of all HN submission texts (2006-2024) in Markdown 1 -- 2024-10-13
Scaling AI-Based Data Processing with Hugging Face and Dask 1 -- 2024-10-10
LLMs Know More Than They Show 1 -- 2024-10-08
Document Similarity Search with ColPali 1 -- 2024-09-29
Prithvi WxC: Foundation Model for Weather and Climate 1 -- 2024-09-24
Show HN: Fusion-Guide: A Model for Generating Cot Reasoning and Guidance 1 -- 2024-09-24
HN-Style HuggingFace Daily Papers 1 -- 2024-09-22
Qwen2.5-Coder Technical Report 1 -- 2024-09-21
Introducing Community Tools on HuggingChat 1 -- 2024-09-20
InkubaLM-0.4B: Small language model for low-resource African Languages 1 -- 2024-08-29
Diffusion models are real time game engines 1 -- 2024-08-29
Everchanging Quest: Rogue-like game powered by LLMs 1 -- 2024-08-21
xLSTM Model Trained on Music 1 -- 2024-08-16
Qwen2-VL 1 -- 2024-08-14
Scaling LLM Test-Time Compute More Effective Than Scaling Model Parameters 1 -- 2024-08-07
Depth Compare – A Hugging Face space to compare different depth models 1 -- 2024-07-29
Insilico Medicine on Hugging Face 1 -- 2024-07-27
LAVE: Zero-Shot VQA Evaluation on Docmatix with LLMs 1 -- 2024-07-26
Spreadsheetllm: Encoding Spreadsheets for Large Language Models 1 -- 2024-07-24
Followgraph for Hugging Face 1 -- 2024-07-23
Show HN: Variable-length (up to 47s) stereo audio at 44.1kHz from text … 1 -- 2024-07-23
Scaling Diffusion Transformers to 16B Parameters 1 -- 2024-07-19
DeepSeek v2 Chat (0628) released 1 -- 2024-07-18
The Rise of Agentic Data Generation 1 -- 2024-07-15
Fast SD3 Medium 1 -- 2024-07-10
Agentic RAG: query reformulation and self-query 1 -- 2024-07-08
Meta LLM Compiler 1 -- 2024-06-29
Allegro-TI2V: an open source video generation model 1 -- 2024-11-27
PR Puppet Sora 1 -- 2024-11-27
Lightricks/LTX-Video – first real-time video generation model 1 -- 2024-11-23
PaliGemma 2 – New vision language models by Google 1 -- 2024-12-05
Open Source Developers Guide to the EU AI Act 1 -- 2024-12-03
LM Studio using models from Hugging Face 1 -- 2024-12-02
IC Light – Shade Generation Model 1 -- 2024-12-02
ModernBERT 1 -- 2024-12-20
Show HN: A ML powered text moderation model that outperforms Open AI 1 -- 2024-12-14
Help Us Rank the Best Background Removal Tools 1 -- 2024-12-11
I need your help to create brain-rot dataset 1 -- 2024-12-08
Phi-4 GGUF 1 -- 2024-12-14
HunyuanVideo and Diffusers Made Easy 1 -- 2024-12-11