HuggingFace Hacker News

Posts by Month (297 total)

Hacker News Posts

Title	Points	Comments	Date
MonadGPT – What would have happened if ChatGPT was invented in the …	323	--	2023-11-24
LLM in a Flash: Efficient LLM Inference with Limited Memory	252	--	2023-12-20
Falcon 180B	238	--	2023-09-06
OpenLLaMA 13B Released	229	--	2023-06-18
Hugging Face Releases Agents	214	--	2023-05-10
BigCode Project Releases StarCoder: A 15B Code LLM	185	--	2023-05-04
StackLlama: A hands-on guide to train LlaMa with RLHF	165	--	2023-04-06
Mistral-8x7B-Chat	131	--	2023-12-10
Yi-34B-Chat	115	--	2023-11-24
GPT-3.5 and Wolfram Alpha via LangChain	107	--	2023-01-18
The Falcon has landed in the Hugging Face ecosystem	105	--	2023-06-05
Hugging Face and AWS partner to make AI more accessible	102	--	2023-02-21
HuggingFace Training Cluster as a Service	101	--	2023-09-05
Segmind Stable Diffusion – A smaller version of Stable Diffusion XL	95	--	2023-10-25
HuggingChat	93	--	2023-04-25
Yarn-Mistral-7B-128k	88	--	2023-11-11
Sparse LLM Inference on CPU: 75% fewer parameters	78	--	2023-10-19
Switch Transformers C – 2048 experts (1.6T params for 3.1 TB) (2022)	73	--	2023-11-20
Multimodal Neurons in Pretrained Text-Only Transformers	66	--	2023-08-04
HuggingChat – ChatGPT alternative with open source models	61	--	2023-12-15
OpenLLaMA 7B Training Completed to 1T Tokens	58	--	2023-06-07
Phi-2	57	--	2023-12-13
Dolphin-2_6-Phi-2	56	--	2023-12-24
Alibaba releases 72B LLM with 32k context length	55	--	2023-11-30
Open LLAMA 13B released, trained on 1T tokens	47	--	2023-06-19
4-Bit Quantization and QLoRA	41	--	2023-05-25
BLOOMChat, a 176B parameter, Multi-lingual, fine tuned chat	40	--	2023-05-19
What's Going on with the Open LLM Leaderboard?	40	--	2023-06-23
Kai-Fu Li's Yi-34B uses exactly Llama's architecture except for 2 tensor renamed	39	--	2023-11-14
Zephyr 7B – Mistral Finetune that responds like ChatGPT	37	--	2023-10-15
Whisper Jax: Transcribe a 1 hour of audio in under 15 seconds	36	--	2023-04-22
MistralLite by Amazon Web Services	34	--	2023-11-01
Mixture of Experts Explained	29	--	2023-12-11
TinyLlama at 2T of 3T	29	--	2023-11-19
Real-Time Latent Consistency Model	27	--	2023-10-30
Language Modeling Is Compression	27	--	2023-09-21
Pixel Art XL: Stable Diffusion XL for Pixel Art	26	--	2023-08-03
UC Berkeley's open-source Vicuna LLM chatbot released new improved model weights	26	--	2023-04-14
Llama 1.3B Trained on 200B Tokens for Commercial Use	25	--	2023-04-28
NousResearch/Nous-Hermes-2-Yi-34B	24	--	2023-12-26
Accelerating Stable Diffusion XL Inference with Jax on Cloud TPU v5e	23	--	2023-10-03
Llama 22B: 13B V2 with 33B attention heads frankensteined on	22	--	2023-08-18
Mistral-7B-OpenOrca. First 7B model to beat all other models <30B	21	--	2023-10-02
Würstchen: Fast Diffusion for Image Generation	21	--	2023-09-13
AMD and Hugging Face: Large Language Models Out-of-the-Box Acceleration with AMD GPU	19	--	2023-12-13
Encrypted Large Language Models with Homomorphic Encryption	18	--	2023-08-03
Orca 2: Teaching Small Language Models How to Reason	18	--	2023-11-21
Show HN: MiniSearch, a minimalist search engine with integrated browser-based AI	17	--	2023-10-15
Gemini vs. GPT-4V: A Preliminary Comparison Through Qualitative Cases	17	--	2023-12-28
Una-Cybertron-7B	17	--	2023-12-08
GPT Baker lets you build your own open-source GPTs	17	--	2023-11-23
Deploy Livebook (Elixir) Notebooks as Apps to Hugging Face Spaces	17	--	2023-06-15
ChatRWKV	17	--	2023-03-23
Airoboros-13B: 98% against GPT-3.5	14	--	2023-05-22
Create a GPT3 powered Q&A Chatbot for any GitHub repo by posting …	13	--	2023-02-05
Attention Sinks in LLMs for endless fluency	12	--	2023-10-09
Idefics: Open Access 60B multimodal model	12	--	2023-08-22
30B uncensored OSS model with no guardrails	11	--	2023-11-07
Hierarchical Masked 3D Diffusion Model for Video Outpainting	11	--	2023-09-06
Shallow Feed-Forward Neural Networks as Alternative to Attention in Transformers	11	--	2023-11-21
From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting	10	--	2023-09-11
Origin of LLMs: An Evolutionary Tree and Graph for 15K Large Language …	10	--	2023-07-20
Show HN: Image Filtering App Using Homomorphic Encryption	10	--	2023-02-23
Stable Diffusion XL Inpainting model released	9	--	2023-09-01
Opentensor and Cerebras announce BTLM-3B-8K, a leading 3B param. language model	9	--	2023-07-24
LLM Arena. Mistral-small best open model. Gemini Pro beaten by 2 open …	9	--	2023-12-17
Meta-llama (Meta Llama 2)	9	--	2023-07-18
Summary of the Tokenizers	9	--	2023-02-07
Gradio-Lite: Serverless Gradio Running in the Browser	8	--	2023-10-25
Show HN: Parley: The RPG where you Negotiate with Bandits	8	--	2023-04-26
Generate 1 page comic by text	8	--	2023-09-03
Drag Your GAN: Interactive Point-Based Manipulation on Generative Image Manifold	8	--	2023-05-23
Show HN: Open-source model to chat with your documents/data	8	--	2023-08-14
Yes, Transformers Are Effective for Time Series Forecasting (+ Autoformer)	8	--	2023-06-25
Hugging Face OpenAssistant	8	--	2023-06-24
Dataset of 35,316,999 HackerNews Posts and Comments (2006 – 2023)	8	--	2023-04-24
Show HN: Athelas – Automagically Repair Broken Code	8	--	2023-01-03
TinyStories: How Small Can Language Models Be and Still Speak Coherent English?	7	--	2023-05-16
Introducing “Clerkie”: A LangChain Q&A bot for AI developers	7	--	2023-01-18
Microsoft's Orca 7B may violate OpenAI's Terms of Use	7	--	2023-12-05
Stable Beluga 2 – Llama2 70B finetuned on an Orca style Dataset …	7	--	2023-07-28
Databricks’ dolly-v2-12B, an instruction-following large language model	7	--	2023-04-12
Cerebras releases its own open source GPT models (Apache 2.0 License)	7	--	2023-03-28
Show HN: Interactively explore your Hugging Face dataset with one line of …	7	--	2023-10-25
CodeFusion: A Pre-Trained Diffusion Model for Code Generation	6	--	2023-10-30
OpenChat 3.5: 7B model with comparable perf to ChatGPT	6	--	2023-11-02
Generate Illusions with Stable Diffusion	6	--	2023-09-16
Mann-E, an open source Equivalent of Midjourney reached its version 4.1.3	6	--	2023-03-04
Qwen is a large language model series by Alibaba Cloud	6	--	2023-09-27
Show HN: TCO Calculator to compare on-prem LLM deployment vs. OpenAI and …	6	--	2023-08-21
Llama-2-70B-instruct-v2	6	--	2023-08-03
Falcon 40B-Instruct GGML	6	--	2023-06-15
RWKV – An RNN with the Advantages of a Transformer	6	--	2023-05-15
Assisted Generation: a new direction toward low-latency text generation	6	--	2023-05-11
Databricks Publishes a Version of Dolly LLM to Hugging Face	6	--	2023-03-30
TinyLlama a 1.1B Llama model trained on 3T tokens reaches 1.0 release	5	--	2023-12-31
New Mixtral HQQ Quantized 4-bit/2-bit configuration	5	--	2023-12-18
Personal co-pilot with a fine-tuning and a VSCode extension	5	--	2023-10-31
Segment Anything Model (Sam) in the Browser with Rust and WASM	5	--	2023-09-16
SD-XL 1.0 Model Card	5	--	2023-07-26
AI Policy: Open ML Considerations in the EU AI Act	5	--	2023-07-26
Modified Version of Apache 2.0 License with Royalty Payments	5	--	2023-05-26
Creating a Coding Assistant with StarCoder	5	--	2023-05-10
DeciLM-7B	5	--	2023-12-12
Nash Learning from Human Feedback	5	--	2023-12-05
Real-time image generation demo on Gradio	5	--	2023-11-12
Convert a transformers model to Core ML	5	--	2023-04-06
Wikipedia Txtai Embeddings Index	5	--	2023-03-21
Show HN: Get the gist of anyone's Twitter feed	5	--	2023-02-24
Solar 10.7B: Elevating AI, Effortlessly	4	--	2023-12-27
WhiteRabbitNeo model series can be used for offensive/defensive cybersecurity	4	--	2023-12-20
Eric Hartford releases uncensored dolphin-2.5-mixtral-8x7B	4	--	2023-12-14
XTTS: New Generative model for Voice (weights released on HF)	4	--	2023-09-15
Prompt Injection Detection Model	4	--	2023-06-14
Distributed Inference and Fine-Tuning of Large Language Models over the Internet	4	--	2023-12-17
Distil-Whisper: Distil-Small.en	4	--	2023-12-14
2-bit and 4-bit versions of Mixtral	4	--	2023-12-11
Nous-Capybara-34B-200k	4	--	2023-11-14
An open-source and privacy-by-design Conversational AI in-browser	4	--	2023-09-22
Large Language Models for Compiler Optimization	4	--	2023-09-14
Gaussian viewer streaming splats in web browser	4	--	2023-09-12
Puma: Secure Inference of LLaMA-7B in Five Minutes	4	--	2023-07-25
FreeWilly2: New LLM from Stability AI	4	--	2023-07-24
40B LLM wants to charge 10% royalty on revenue?	4	--	2023-05-26
Falcon-40B	4	--	2023-05-26
Fully Open Source LLM Chat App – Chat about the Transformers Docs	4	--	2023-03-14
TinyLlama Reaches 3T Checkpoint	4	--	2023-12-28
Obsidian-3B	4	--	2023-11-25
Yarn-Llama-2-70B-32k	4	--	2023-11-20
SDXL in 4 steps with Latent Consistency LoRAs	4	--	2023-11-09
Zephyr 7B	4	--	2023-10-27
Apple/coreml-stable-diffusion-XL-base-iOS	4	--	2023-09-30
DeepSpeed-Chat: Easy RLHF Training of ChatGPT-Like Models at All Scales	4	--	2023-08-04
Deploy LLMs with Hugging Face Inference Endpoints	4	--	2023-07-04
Instruct-Codegen: open-source instruction following codegen model	4	--	2023-05-27
MPT-7B-StoryWriter-65k+: LLM for super long contexts (Apache 2.0)	4	--	2023-05-05
BioGPT for Biomedical Scientific Discovery	4	--	2023-02-07
Using LoRA for Efficient Stable Diffusion Fine-Tuning	4	--	2023-01-26
MiniLM-L6-v2 maps paragraphs to 384 dimension vector for clustering or search	3	--	2023-03-21
Phi-1.5 (1.3B Outperforms Llama 2 7B)	3	--	2023-09-12
GPT-2B-001	3	--	2023-04-20
10.7B Solar: Elevating Performance with Upstage Depth Up Scaling	3	--	2023-12-18
Voice Chat with Mistral 7B	3	--	2023-10-16
Hugging Face partner with AMD to accelerate state-of-the-art models	3	--	2023-06-14
Solar 10.7B	3	--	2023-12-27
Transformer.js: Machine Learning for the Web	3	--	2023-12-09
PixArt-α: Fast Training of Diffusion Transformer for Text-to-Image Synthesis	3	--	2023-12-04
Laiyer AI Released Its Open Source Prompt Injection Model	3	--	2023-11-29
LZMD: Lempel-Ziv Montecarlo Diffusion file format	3	--	2023-11-29
Faster MusicGen Generation with Streaming	3	--	2023-10-06
Llama 2 on Amazon SageMaker a Benchmark	3	--	2023-09-26
LoRA Roulette	3	--	2023-09-22
Open-source AI Discord bots with HuggingFace	3	--	2023-08-17
StableBeluga-7B	3	--	2023-07-29
MPT-30B – Apache 2.0 licensed LLM	3	--	2023-07-22
Show HN: I created a first-of-its-kind open corpus of Australian law	3	--	2023-06-26
Show HN: DocsGPT-7B – purpose optimised and finetuned model for documentation QA	3	--	2023-06-16
Alpaca Dataset Translated into Polish	3	--	2023-04-12
Dolphin-2.6-Mistral-7B	3	--	2023-12-29
MonadGPT	3	--	2023-12-28
MiniMA-2-3B	3	--	2023-12-27
WaveCoder: Widespread Versatile Enhanced Instruction Tuning with Refine Data Gen	3	--	2023-12-26
StarVector: Generating Scalable Vector Graphics Code from Images	3	--	2023-12-20
AITube - Youtube but everything is AI generated	3	--	2023-12-15
Refact-1.6B	3	--	2023-12-08
Llama-2-7B-chat-mlx for Apple’s new MLX framework	3	--	2023-12-06
NeuralHermes-2.5-Mistral-7B	3	--	2023-11-29
Tulu-2-Dpo-70B	3	--	2023-11-21
Show HN: New Launch OrionStar-Yi-34B-Chat beats Llama2-70B and GPT-3.5-turbo	3	--	2023-11-20
Nvidia nemotron-3-8B-base-4k	3	--	2023-11-16
Optimizing LLMs in Production	3	--	2023-11-15
HuggingFace Daily Papers	3	--	2023-11-14
Make your llama generation time fly with AWS Inferentia2	3	--	2023-11-11
Show HN: Face-Stylization – Create face styling with just 8 images	3	--	2023-11-09
Document Question Answering	3	--	2023-10-30
Apple's LLMs and other GenAI models on HuggingFace	3	--	2023-10-19
Using HuggingFace to Train a GPT-2 Model for Music Generation	3	--	2023-10-09
MotionGPT: Finetuned LLMs Are General-Purpose Motion Generators	3	--	2023-09-19
Generative Image Dynamics	3	--	2023-09-15
OpenHermes-13B based on Llama-2	3	--	2023-09-07
Llama2.c LLM: ported to Rust and running in the browser	3	--	2023-09-07
Accelerating Vision-Language Models: BridgeTower on Habana Gaudi2	3	--	2023-09-01
Fine-tuned CodeLlama beats GPT-4 on HumanEval	3	--	2023-08-27
LoRA the Explorer	3	--	2023-08-17
Fine-tune Llama 2 with DPO	3	--	2023-08-08
Show HN: Goat-7B LLM, a new SOTA among the open-source 7B models	3	--	2023-07-25
How is ChatGPT's behavior changing over time?	3	--	2023-07-19
Show HN: New control net model for AI art QRcode	3	--	2023-06-27
Show HN: Bert-Based Classification Model for Google Local Listings	3	--	2023-06-26
Mosaic ML: MPT-30B-Chat	3	--	2023-06-25
Video Composer: Create videos using GPT-4 and FFmpeg	3	--	2023-06-15
MusicGen from Meta on Hugging Face	3	--	2023-06-09
OpenLLaMA 7B Released	3	--	2023-06-07
WizardLM-30B	3	--	2023-06-06
Can AI Code?	3	--	2023-06-05
Constrained Text Generation with Transformers	3	--	2023-05-22
StarCoder: A State-of-the-Art LLM for Code	3	--	2023-05-05
Swift Diffusers: Fast Stable Diffusion for Mac	3	--	2023-04-02
Fine-tuning 20B LLMs with RLHF on a 24GB consumer GPU	3	--	2023-03-12
Parameter-Efficient Fine-Tuning Billion-Scale Models on Low-Resource Hardware	3	--	2023-02-10
Run Deepseek Coder LLM locally	2	--	2023-12-03
Releasing Swift Transformers: Run On-Device LLMs in Apple Devices	2	--	2023-08-08
Mixtral_7Bx2_MoE	2	--	2023-12-24
Universal AnglE Sentence Embedding: New SOTA on MTEB Leaderboard	2	--	2023-12-05
Non-engineers guide: Train a LLaMA 2 chatbot	2	--	2023-12-02
AutoTrain: (not just)LLM finetuning without code and infra	2	--	2023-11-23
How do you think LLM inference on CPUs?	2	--	2023-11-03
State-of-the-Art Ember embedding model for retrieval augmented generation	2	--	2023-10-20
Large Language Models as Analogical Reasoners	2	--	2023-10-05
QR Code Monster	2	--	2023-10-02
CausalLM is not optimal for in-context learning	2	--	2023-08-15
Count tokens used by GPT-4 and Llama for large texts (> 50k …	2	--	2023-08-05
Apply ControlNet to a Video	2	--	2023-08-01
Making real-time ML-powered web games with Transformers.js	2	--	2023-07-05
LLaMA: Large Language Model Meta AI	2	--	2023-03-17
Small Stable Diffusion	2	--	2023-01-19
Chatglm3-6B-32k	2	--	2023-12-29
DreaMoving: A Human Video Generation Framework Based on Diffusion Models	2	--	2023-12-28
Dream-Talk: Realistic Audio-Driven Single Image Talking Face Generation	2	--	2023-12-24
Time Is Encoded in the Weights of Finetuned Language Models	2	--	2023-12-22
2023, Year of Open LLMs	2	--	2023-12-19
Hugging Face releases Optimum-Nvidia to accelerate LLM inference	2	--	2023-12-07
Open LLM Leaderboard: DROP deep dive	2	--	2023-12-02
Starling-RM-7B-Alpha	2	--	2023-11-27
Intel: neural-chat-7B-v3-1	2	--	2023-11-16
Whisper Large v3	2	--	2023-11-09
MonadGPT – OS ChatGPT-like for the 17th century	2	--	2023-11-09
OpenHermes-2.5-Mistral-7B	2	--	2023-11-08
Yi-34B, 76.3 on MMLU, Apache 2.0	2	--	2023-11-04
Templates for Chat Models	2	--	2023-10-17
HF Shopify Image Background Replacement	2	--	2023-10-12
OpenWebMath, a dataset containing every math docs found on the internet	2	--	2023-10-11
Paper Page – NExT-GPT: Any-to-Any Multimodal LLM	2	--	2023-09-12
Using Machine Learning to Improve Language Metadata on the Hugging Face Hub	2	--	2023-09-12
Open ASR Leaderboard	2	--	2023-09-07
Show HN: A LLM pull request review tool [feedback wanted]	2	--	2023-09-07
Technology Innovation Institute Releases Falcon 180B LLM	2	--	2023-09-06
Hugging Face Tutorial for Unity RL Agents	2	--	2023-08-31
Dolma: The Largest Open Dataset For Training Language Models	2	--	2023-08-24
WizardMath: Empowering Math Reasoning for LLM via Reinforced Evol-Instruct	2	--	2023-08-15
Hugging Face Launches Tools for Running LLMs on Apple Devices	2	--	2023-08-09
Open sourcing OpenAI’s function calling	2	--	2023-08-08
Autotrain – Create powerful AI models without code	2	--	2023-07-30
Understanding Embeddings	2	--	2023-07-28
Scaling TransNormer to 175B Parameters	2	--	2023-07-28
Llama 2 is here – get it on Hugging Face	2	--	2023-07-19
Building an AI WebTV	2	--	2023-07-18
Open-Source Text Generation and LLM Ecosystem at Hugging Face	2	--	2023-07-17
OpenOrca-Preview1	2	--	2023-07-12
Large Language Models can complete complex non linguistic patterns in context	2	--	2023-07-11
Whisper Web: Speech recognition in the web browser	2	--	2023-07-10
Chat with Falcon-7B-instruct demo	2	--	2023-07-08
OpenChat: Less is More for Open-source Models	2	--	2023-07-06
Can foundation models label data like humans?	2	--	2023-07-05
Are Text-to-image models biased?	2	--	2023-07-03
Orca: Progressive Learning from Complex Explanation Traces of GPT-4	2	--	2023-07-01
Can foundation models label data like humans?	2	--	2023-06-30
A Synthetic Dataset of Bodies Exhibiting Detailed Lifelike Animated Motion	2	--	2023-06-30
Hugging Face – Transformers Agents 4.30 with local agents	2	--	2023-06-28
DragGan – Interactive Point-Based Manipulation on the Generative Image Manifold	2	--	2023-06-26
QR Code Conditioned ControlNet Models for Stable Diffusion 1.5 and 2.1	2	--	2023-06-16
Cluster and Visualise 100K Wines by Tasting Notes with T-SNE	2	--	2023-06-11
Hugging Face and IBM partner on watsonx.ai, next-gen enterprise studio for AI	2	--	2023-05-28
HuggingFace Demo: DragGAN	2	--	2023-05-26
Audit shows that safetensors is safe and ready to become the default	2	--	2023-05-23
A Dive into Text-to-Video Models	2	--	2023-05-15
HuberChat, a Chatbot trained on HubermanLab podcast (OpenAI key required)	2	--	2023-05-10
Demo: Code Completion with replit-code-v1-3B	2	--	2023-05-03
RLHF – Hugging Face Course	2	--	2023-04-27
Ekimetrics launches a “ChatGPT” dedicated to climate	2	--	2023-04-07
Alpaca GarbageCollector – Curating high-quality data for open-source LLMs	2	--	2023-04-04
Text2Video-Zero	2	--	2023-03-26
Train your own ControlNet models with diffusers	2	--	2023-03-24
Open source models for various Machine Learning tasks	2	--	2023-03-08
Ultra Fast ControlNet with Hugging Face Diffusers	2	--	2023-03-03
Using Stable Diffusion with Core ML on Apple Silicon	2	--	2023-02-22
HuggingFace/Transformers-Stats	2	--	2023-02-20
Playable Demo for MarioGPT: Open-Ended Text2Level Generation Through LLMs	2	--	2023-02-18
Faster Training and Inference: Habana Gaudi -2 vs. Nvidia A100 80GB	2	--	2023-02-16
Speech Synthesis, Recognition, and More with SpeechT5	2	--	2023-02-09
Threat actors using HuggingFace to deliver malware	2	--	2023-02-07
Generating Human Motion from Textual Descriptions (T2M-GPT)	2	--	2023-01-31
AI for Game Development: 3D Asset Generation	2	--	2023-01-20
Show HN: ML Q&A – Get answers to questions about ML frameworks	2	--	2023-01-05
With LLMs we can create an open-source Library of Alexandria	1	--	2023-09-28
Show HN: Find Your Celebrity Lookalike (With AI)	1	--	2023-01-04
DiffMorpher – Using Diffusion Models for Image Morphing	1	--	2023-12-24
Tencent Announces AppAgent	1	--	2023-12-22
How Do Prompt Injection Scanners Perform? A Benchmark	1	--	2023-12-07
Show HN: ChatData – an open-source ChatGPT-like chatbot	1	--	2023-11-29
3D Gaussian Splat Viewer (top item)	1	--	2023-10-23
Who loves you Hacker News?	1	--	2023-10-12
Curious about Causality and Generative Models? Check Out This Demo	1	--	2023-07-26
Have You Tried AWS Inferentia2 for ML Deployments?	1	--	2023-07-16
Open Source LLM Inference DLC	1	--	2023-06-29
WizardCoder: Empowering Code Large Language Models with Evol-Instruct	1	--	2023-06-15
Text Embedding Benchmark (MTEB) Leaderboard	1	--	2023-02-20

Plushcap, by Matt Makai. 2021-2026.
