HuggingFace on HN

297 posts with 1+ points in 2023

Hacker News Posts
Title Points Comments Date
MonadGPT – What would have happened if ChatGPT was invented in the … 323 -- 2023-11-24
LLM in a Flash: Efficient LLM Inference with Limited Memory 252 -- 2023-12-20
Falcon 180B 238 -- 2023-09-06
OpenLLaMA 13B Released 229 -- 2023-06-18
Hugging Face Releases Agents 214 -- 2023-05-10
BigCode Project Releases StarCoder: A 15B Code LLM 185 -- 2023-05-04
StackLlama: A hands-on guide to train LlaMa with RLHF 165 -- 2023-04-06
Mistral-8x7B-Chat 131 -- 2023-12-10
Yi-34B-Chat 115 -- 2023-11-24
GPT-3.5 and Wolfram Alpha via LangChain 107 -- 2023-01-18
The Falcon has landed in the Hugging Face ecosystem 105 -- 2023-06-05
Hugging Face and AWS partner to make AI more accessible 102 -- 2023-02-21
HuggingFace Training Cluster as a Service 101 -- 2023-09-05
Segmind Stable Diffusion – A smaller version of Stable Diffusion XL 95 -- 2023-10-25
HuggingChat 93 -- 2023-04-25
Yarn-Mistral-7B-128k 88 -- 2023-11-11
Sparse LLM Inference on CPU: 75% fewer parameters 78 -- 2023-10-19
Switch Transformers C – 2048 experts (1.6T params for 3.1 TB) (2022) 73 -- 2023-11-20
Multimodal Neurons in Pretrained Text-Only Transformers 66 -- 2023-08-04
HuggingChat – ChatGPT alternative with open source models 61 -- 2023-12-15
OpenLLaMA 7B Training Completed to 1T Tokens 58 -- 2023-06-07
Phi-2 57 -- 2023-12-13
Dolphin-2_6-Phi-2 56 -- 2023-12-24
Alibaba releases 72B LLM with 32k context length 55 -- 2023-11-30
Open LLAMA 13B released, trained on 1T tokens 47 -- 2023-06-19
4-Bit Quantization and QLoRA 41 -- 2023-05-25
BLOOMChat, a 176B parameter, Multi-lingual, fine tuned chat 40 -- 2023-05-19
What's Going on with the Open LLM Leaderboard? 40 -- 2023-06-23
Kai-Fu Li's Yi-34B uses exactly Llama's architecture except for 2 tensor renamed 39 -- 2023-11-14
Zephyr 7B – Mistral Finetune that responds like ChatGPT 37 -- 2023-10-15
Whisper Jax: Transcribe a 1 hour of audio in under 15 seconds 36 -- 2023-04-22
MistralLite by Amazon Web Services 34 -- 2023-11-01
Mixture of Experts Explained 29 -- 2023-12-11
TinyLlama at 2T of 3T 29 -- 2023-11-19
Real-Time Latent Consistency Model 27 -- 2023-10-30
Language Modeling Is Compression 27 -- 2023-09-21
Pixel Art XL: Stable Diffusion XL for Pixel Art 26 -- 2023-08-03
UC Berkeley's open-source Vicuna LLM chatbot released new improved model weights 26 -- 2023-04-14
Llama 1.3B Trained on 200B Tokens for Commercial Use 25 -- 2023-04-28
NousResearch/Nous-Hermes-2-Yi-34B 24 -- 2023-12-26
Accelerating Stable Diffusion XL Inference with Jax on Cloud TPU v5e 23 -- 2023-10-03
Llama 22B: 13B V2 with 33B attention heads frankensteined on 22 -- 2023-08-18
Mistral-7B-OpenOrca. First 7B model to beat all other models <30B 21 -- 2023-10-02
Würstchen: Fast Diffusion for Image Generation 21 -- 2023-09-13
AMD and: Large Language Models Out-of-the-Box Acceleration with AMD GPU 19 -- 2023-12-13
Encrypted Large Language Models with Homomorphic Encryption 18 -- 2023-08-03
Orca 2: Teaching Small Language Models How to Reason 18 -- 2023-11-21
Show HN: MiniSearch, a minimalist search engine with integrated browser-based AI 17 -- 2023-10-15
Gemini vs. GPT-4V: A Preliminary Comparison Through Qualitative Cases 17 -- 2023-12-28
Una-Cybertron-7B 17 -- 2023-12-08
GPT Baker lets you build your own open-source GPTs 17 -- 2023-11-23
Deploy Livebook (Elixir) Notebooks as Apps to Hugging Face Spaces 17 -- 2023-06-15
ChatRWKV 17 -- 2023-03-23
Airoboros-13B: 98% against GPT-3.5 14 -- 2023-05-22
Create a GPT3 powered Q&A Chatbot for *any* GitHub repo by posting … 13 -- 2023-02-05
Attention Sinks in LLMs for endless fluency 12 -- 2023-10-09
Idefics: Open Access 60B multimodal model 12 -- 2023-08-22
30B uncensored OSS model with no guardrails 11 -- 2023-11-07
Hierarchical Masked 3D Diffusion Model for Video Outpainting 11 -- 2023-09-06
Shallow Feed-Forward Neural Networks as Alternative to Attention in Transformers 11 -- 2023-11-21
From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting 10 -- 2023-09-11
Origin of LLMs: An Evolutionary Tree and Graph for 15K Large Language … 10 -- 2023-07-20
Show HN: Image Filtering App Using Homomorphic Encryption 10 -- 2023-02-23
Stable Diffusion XL Inpainting model released 9 -- 2023-09-01
Opentensor and Cerebras announce BTLM-3B-8K, a leading 3B param. language model 9 -- 2023-07-24
LLM Arena. Mistral-small best open model. Gemini Pro beaten by 2 open … 9 -- 2023-12-17
Meta-llama (Meta Llama 2) 9 -- 2023-07-18
Summary of the Tokenizers 9 -- 2023-02-07
Gradio-Lite: Serverless Gradio Running in the Browser 8 -- 2023-10-25
Show HN: Parley: The RPG where you Negotiate with Bandits 8 -- 2023-04-26
Generate 1 page comic by text 8 -- 2023-09-03
Drag Your GAN: Interactive Point-Based Manipulation on Generative Image Manifold 8 -- 2023-05-23
Show HN: Open-source model to chat with your documents/data 8 -- 2023-08-14
Yes, Transformers Are Effective for Time Series Forecasting (+ Autoformer) 8 -- 2023-06-25
Hugging Face OpenAssistant 8 -- 2023-06-24
Dataset of 35,316,999 HackerNews Posts and Comments (2006 – 2023) 8 -- 2023-04-24
Show HN: Athelas – Automagically Repair Broken Code 8 -- 2023-01-03
TinyStories: How Small Can Language Models Be and Still Speak Coherent English? 7 -- 2023-05-16
Introducing “Clerkie”: A LangChain Q&A bot for AI developers 7 -- 2023-01-18
Microsoft's Orca 7B may violate OpenAI's Terms of Use 7 -- 2023-12-05
Stable Beluga 2 – Llama2 70B finetuned on an Orca style Dataset … 7 -- 2023-07-28
Databricks’ dolly-v2-12B, an instruction-following large language model 7 -- 2023-04-12
Cerebras releases its own open source GPT models (Apache 2.0 License) 7 -- 2023-03-28
Show HN: Interactively explore your Hugging Face dataset with one line of … 7 -- 2023-10-25
CodeFusion: A Pre-Trained Diffusion Model for Code Generation 6 -- 2023-10-30
OpenChat 3.5: 7B model with comparable perf to ChatGPT 6 -- 2023-11-02
Generate Illusions with Stable Diffusion 6 -- 2023-09-16
Mann-E, an open source Equivalent of Midjourney reached its version 4.1.3 6 -- 2023-03-04
Qwen is a large language model series by Alibaba Cloud 6 -- 2023-09-27
Show HN: TCO Calculator to compare on-prem LLM deployment vs. OpenAI and … 6 -- 2023-08-21
Llama-2-70B-instruct-v2 6 -- 2023-08-03
Falcon 40B-Instruct GGML 6 -- 2023-06-15
RWKV – An RNN with the Advantages of a Transformer 6 -- 2023-05-15
Assisted Generation: a new direction toward low-latency text generation 6 -- 2023-05-11
Databricks Publishes a Version of Dolly LLM to Hugging Face 6 -- 2023-03-30
TinyLlama a 1.1B Llama model trained on 3T tokens reaches 1.0 release 5 -- 2023-12-31
New Mixtral HQQ Quantized 4-bit/2-bit configuration 5 -- 2023-12-18
Personal co-pilot with a fine-tuning and a VSCode extension 5 -- 2023-10-31
Segment Anything Model (Sam) in the Browser with Rust and WASM 5 -- 2023-09-16
SD-XL 1.0 Model Card 5 -- 2023-07-26
AI Policy: Open ML Considerations in the EU AI Act 5 -- 2023-07-26
Modified Version of Apache 2.0 License with Royalty Payments 5 -- 2023-05-26
Creating a Coding Assistant with StarCoder 5 -- 2023-05-10
DeciLM-7B 5 -- 2023-12-12
Nash Learning from Human Feedback 5 -- 2023-12-05
Real-time image generation demo on Gradio 5 -- 2023-11-12
Convert a transformers model to Core ML 5 -- 2023-04-06
Wikipedia Txtai Embeddings Index 5 -- 2023-03-21
Show HN: Get the gist of anyone's Twitter feed 5 -- 2023-02-24
Solar 10.7B: Elevating AI, Effortlessly 4 -- 2023-12-27
WhiteRabbitNeo model series can be used for offensive/defensive cybersecurity 4 -- 2023-12-20
Eric Hartford releases uncensored dolphin-2.5-mixtral-8x7B 4 -- 2023-12-14
XTTS: New Generative model for Voice (weights released on HF) 4 -- 2023-09-15
Prompt Injection Detection Model 4 -- 2023-06-14
Distributed Inference and Fine-Tuning of Large Language Models over the Internet 4 -- 2023-12-17
Distil-Whisper: Distil-Small.en 4 -- 2023-12-14
2-bit and 4-bit versions of Mixtral 4 -- 2023-12-11
Nous-Capybara-34B-200k 4 -- 2023-11-14
An open-source and privacy-by-design Conversational AI in-browser 4 -- 2023-09-22
Large Language Models for Compiler Optimization 4 -- 2023-09-14
Gaussian viewer streaming splats in web browser 4 -- 2023-09-12
Puma: Secure Inference of LLaMA-7B in Five Minutes 4 -- 2023-07-25
FreeWilly2: New LLM from Stability AI 4 -- 2023-07-24
40B LLM wants to charge 10% royalty on revenue? 4 -- 2023-05-26
Falcon-40B 4 -- 2023-05-26
Fully Open Source LLM Chat App – Chat about the Transformers Docs 4 -- 2023-03-14
TinyLlama Reaches 3T Checkpoint 4 -- 2023-12-28
Obsidian-3B 4 -- 2023-11-25
Yarn-Llama-2-70B-32k 4 -- 2023-11-20
SDXL in 4 steps with Latent Consistency LoRAs 4 -- 2023-11-09
Zephyr 7B 4 -- 2023-10-27
Apple/coreml-stable-diffusion-XL-base-iOS 4 -- 2023-09-30
DeepSpeed-Chat: Easy RLHF Training of ChatGPT-Like Models at All Scales 4 -- 2023-08-04
Deploy LLMs with Hugging Face Inference Endpoints 4 -- 2023-07-04
Instruct-Codegen: open-source instruction following codegen model 4 -- 2023-05-27
MPT-7B-StoryWriter-65k+: LLM for super long contexts (Apache 2.0) 4 -- 2023-05-05
BioGPT for Biomedical Scientific Discovery 4 -- 2023-02-07
Using LoRA for Efficient Stable Diffusion Fine-Tuning 4 -- 2023-01-26
MiniLM-L6-v2 maps paragraphs to 384 dimension vector for clustering or search 3 -- 2023-03-21
Phi-1.5 (1.3B Outperforms Llama 2 7B) 3 -- 2023-09-12
GPT-2B-001 3 -- 2023-04-20
10.7B Solar: Elevating Performance with Upstage Depth Up Scaling 3 -- 2023-12-18
Voice Chat with Mistral 7B 3 -- 2023-10-16
Hugging Face partner with AMD to accelerate state-of-the-art models 3 -- 2023-06-14
Solar 10.7B 3 -- 2023-12-27
Transformer.js: Machine Learning for the Web 3 -- 2023-12-09
PixArt-α: Fast Training of Diffusion Transformer for Text-to-Image Synthesis 3 -- 2023-12-04
Laiyer AI Released Its Open Source Prompt Injection Model 3 -- 2023-11-29
LZMD: Lempel-Ziv Montecarlo Diffusion file format 3 -- 2023-11-29
Faster MusicGen Generation with Streaming 3 -- 2023-10-06
Llama 2 on Amazon SageMaker a Benchmark 3 -- 2023-09-26
LoRA Roulette 3 -- 2023-09-22
Open-source AI Discord bots with HuggingFace 3 -- 2023-08-17
StableBeluga-7B 3 -- 2023-07-29
MPT-30B – Apache 2.0 licensed LLM 3 -- 2023-07-22
Show HN: I created a first-of-its-kind open corpus of Australian law 3 -- 2023-06-26
Show HN: DocsGPT-7B – purpose optimised and finetuned model for documentation QA 3 -- 2023-06-16
Alpaca Dataset Translated into Polish 3 -- 2023-04-12
Dolphin-2.6-Mistral-7B 3 -- 2023-12-29
MonadGPT 3 -- 2023-12-28
MiniMA-2-3B 3 -- 2023-12-27
WaveCoder: Widespread Versatile Enhanced Instruction Tuning with Refine Data Gen 3 -- 2023-12-26
StarVector: Generating Scalable Vector Graphics Code from Images 3 -- 2023-12-20
AITube - Youtube but everything is AI generated 3 -- 2023-12-15
Refact-1.6B 3 -- 2023-12-08
Llama-2-7B-chat-mlx for Apple’s new MLX framework 3 -- 2023-12-06
NeuralHermes-2.5-Mistral-7B 3 -- 2023-11-29
Tulu-2-Dpo-70B 3 -- 2023-11-21
Show HN: New Launch OrionStar-Yi-34B-Chat beats Llama2-70B and GPT-3.5-turbo 3 -- 2023-11-20
Nvidia nemotron-3-8B-base-4k 3 -- 2023-11-16
Optimizing LLMs in Production 3 -- 2023-11-15
HuggingFace Daily Papers 3 -- 2023-11-14
Make your llama generation time fly with AWS Inferentia2 3 -- 2023-11-11
Show HN: Face-Stylization – Create face styling with just 8 images 3 -- 2023-11-09
Document Question Answering 3 -- 2023-10-30
Apple's LLMs and other GenAI models on HuggingFace 3 -- 2023-10-19
Using HuggingFace to Train a GPT-2 Model for Music Generation 3 -- 2023-10-09
MotionGPT: Finetuned LLMs Are General-Purpose Motion Generators 3 -- 2023-09-19
Generative Image Dynamics 3 -- 2023-09-15
OpenHermes-13B based on Llama-2 3 -- 2023-09-07
Llama2.c LLM: ported to Rust and running in the browser 3 -- 2023-09-07
Accelerating Vision-Language Models: BridgeTower on Habana Gaudi2 3 -- 2023-09-01
Fine-tuned CodeLlama beats GPT-4 on HumanEval 3 -- 2023-08-27
LoRA the Explorer 3 -- 2023-08-17
Fine-tune Llama 2 with DPO 3 -- 2023-08-08
Show HN: Goat-7B LLM, a new SOTA among the open-source 7B models 3 -- 2023-07-25
How is ChatGPT's behavior changing over time? 3 -- 2023-07-19
Show HN: New control net model for AI art QRcode 3 -- 2023-06-27
Show HN: Bert-Based Classification Model for Google Local Listings 3 -- 2023-06-26
Mosaic ML: MPT-30B-Chat 3 -- 2023-06-25
Video Composer: Create videos using GPT-4 and FFmpeg 3 -- 2023-06-15
MusicGen from Meta on Hugging Face 3 -- 2023-06-09
OpenLLaMA 7B Released 3 -- 2023-06-07
WizardLM-30B 3 -- 2023-06-06
Can AI Code? 3 -- 2023-06-05
Constrained Text Generation with Transformers 3 -- 2023-05-22
StarCoder: A State-of-the-Art LLM for Code 3 -- 2023-05-05
Swift Diffusers: Fast Stable Diffusion for Mac 3 -- 2023-04-02
Fine-tuning 20B LLMs with RLHF on a 24GB consumer GPU 3 -- 2023-03-12
Parameter-Efficient Fine-Tuning Billion-Scale Models on Low-Resource Hardware 3 -- 2023-02-10
Run Deepseek Coder LLM locally 2 -- 2023-12-03
Releasing Swift Transformers: Run On-Device LLMs in Apple Devices 2 -- 2023-08-08
Mixtral_7Bx2_MoE 2 -- 2023-12-24
Universal AnglE Sentence Embedding: New SOTA on MTEB Leaderboard 2 -- 2023-12-05
Non-engineers guide: Train a LLaMA 2 chatbot 2 -- 2023-12-02
AutoTrain: (not just)LLM finetuning without code and infra 2 -- 2023-11-23
How do you think LLM inference on CPUs? 2 -- 2023-11-03
State-of-the-Art Ember embedding model for retrieval augmented generation 2 -- 2023-10-20
Large Language Models as Analogical Reasoners 2 -- 2023-10-05
QR Code Monster 2 -- 2023-10-02
CausalLM is not optimal for in-context learning 2 -- 2023-08-15
Count tokens used by GPT-4 and Llama for large texts (> 50k … 2 -- 2023-08-05
Apply ControlNet to a Video 2 -- 2023-08-01
Making real-time ML-powered web games with Transformers.js 2 -- 2023-07-05
LLaMA: Large Language Model Meta AI 2 -- 2023-03-17
Small Stable Diffusion 2 -- 2023-01-19
Chatglm3-6B-32k 2 -- 2023-12-29
DreaMoving: A Human Video Generation Framework Based on Diffusion Models 2 -- 2023-12-28
Dream-Talk: Realistic Audio-Driven Single Image Talking Face Generation 2 -- 2023-12-24
Time Is Encoded in the Weights of Finetuned Language Models 2 -- 2023-12-22
2023, Year of Open LLMs 2 -- 2023-12-19
Hugging Face releases Optimum-Nvidia to accelerate LLM inference 2 -- 2023-12-07
Open LLM Leaderboard: DROP deep dive 2 -- 2023-12-02
Starling-RM-7B-Alpha 2 -- 2023-11-27
Intel: neural-chat-7B-v3-1 2 -- 2023-11-16
Whisper Large v3 2 -- 2023-11-09
MonadGPT – OS ChatGPT-like for the 17th century 2 -- 2023-11-09
OpenHermes-2.5-Mistral-7B 2 -- 2023-11-08
Yi-34B, 76.3 on MMLU, Apache 2.0 2 -- 2023-11-04
Templates for Chat Models 2 -- 2023-10-17
HF Shopify Image Background Replacement 2 -- 2023-10-12
OpenWebMath, a dataset containing every math docs found on the internet 2 -- 2023-10-11
Paper Page – NExT-GPT: Any-to-Any Multimodal LLM 2 -- 2023-09-12
Using Machine Learning to Improve Language Metadata on the Hugging Face Hub 2 -- 2023-09-12
Open ASR Leaderboard 2 -- 2023-09-07
Show HN: A LLM pull request review tool [feedback wanted] 2 -- 2023-09-07
Technology Innovation Institute Releases Falcon 180B LLM 2 -- 2023-09-06
Hugging Face Tutorial for Unity RL Agents 2 -- 2023-08-31
Dolma: The Largest Open Dataset For Training Language Models 2 -- 2023-08-24
WizardMath: Empowering Math Reasoning for LLM via Reinforced Evol-Instruct 2 -- 2023-08-15
Hugging Face Launches Tools for Running LLMs on Apple Devices 2 -- 2023-08-09
Open sourcing OpenAI’s function calling 2 -- 2023-08-08
Autotrain – Create powerful AI models without code 2 -- 2023-07-30
Understanding Embeddings 2 -- 2023-07-28
Scaling TransNormer to 175B Parameters 2 -- 2023-07-28
Llama 2 is here – get it on Hugging Face 2 -- 2023-07-19
Building an AI WebTV 2 -- 2023-07-18
Open-Source Text Generation and LLM Ecosystem at Hugging Face 2 -- 2023-07-17
OpenOrca-Preview1 2 -- 2023-07-12
Large Language Models can complete complex non linguistic patterns in context 2 -- 2023-07-11
Whisper Web: Speech recognition in the web browser 2 -- 2023-07-10
Chat with Falcon-7B-instruct demo 2 -- 2023-07-08
OpenChat: Less is More for Open-source Models 2 -- 2023-07-06
Can foundation models label data like humans? 2 -- 2023-07-05
Are Text-to-image models biased? 2 -- 2023-07-03
Orca: Progressive Learning from Complex Explanation Traces of GPT-4 2 -- 2023-07-01
Can foundation models label data like humans? 2 -- 2023-06-30
A Synthetic Dataset of Bodies Exhibiting Detailed Lifelike Animated Motion 2 -- 2023-06-30
Hugging Face – Transformers Agents 4.30 with local agents 2 -- 2023-06-28
DragGan – Interactive Point-Based Manipulation on the Generative Image Manifold 2 -- 2023-06-26
QR Code Conditioned ControlNet Models for Stable Diffusion 1.5 and 2.1 2 -- 2023-06-16
Cluster and Visualise 100K Wines by Tasting Notes with T-SNE 2 -- 2023-06-11
Hugging Face and IBM partner on watsonx.ai, next-gen enterprise studio for AI 2 -- 2023-05-28
HuggingFace Demo: DragGAN 2 -- 2023-05-26
Audit shows that safetensors is safe and ready to become the default 2 -- 2023-05-23
A Dive into Text-to-Video Models 2 -- 2023-05-15
HuberChat, a Chatbot trained on HubermanLab podcast (OpenAI key required) 2 -- 2023-05-10
Demo: Code Completion with replit-code-v1-3B 2 -- 2023-05-03
RLHF – Hugging Face Course 2 -- 2023-04-27
Ekimetrics launches a “ChatGPT” dedicated to climate 2 -- 2023-04-07
Alpaca GarbageCollector – Curating high-quality data for open-source LLMs 2 -- 2023-04-04
Text2Video-Zero 2 -- 2023-03-26
Train your own ControlNet models with diffusers 2 -- 2023-03-24
Open source models for various Machine Learning tasks 2 -- 2023-03-08
Ultra Fast ControlNet with Hugging Face Diffusers 2 -- 2023-03-03
Using Stable Diffusion with Core ML on Apple Silicon 2 -- 2023-02-22
HuggingFace/Transformers-Stats 2 -- 2023-02-20
Playable Demo for MarioGPT: Open-Ended Text2Level Generation Through LLMs 2 -- 2023-02-18
Faster Training and Inference: Habana Gaudi -2 vs. Nvidia A100 80GB 2 -- 2023-02-16
Speech Synthesis, Recognition, and More with SpeechT5 2 -- 2023-02-09
Threat actors using HuggingFace to deliver malware 2 -- 2023-02-07
Generating Human Motion from Textual Descriptions (T2M-GPT) 2 -- 2023-01-31
AI for Game Development: 3D Asset Generation 2 -- 2023-01-20
Show HN: ML Q&A – Get answers to questions about ML frameworks 2 -- 2023-01-05
With LLMs we can create an open-source Library of Alexandria 1 -- 2023-09-28
Show HN: Find Your Celebrity Lookalike (With AI) 1 -- 2023-01-04
DiffMorpher – Using Diffusion Models for Image Morphing 1 -- 2023-12-24
Tencent Announces AppAgent 1 -- 2023-12-22
How Do Prompt Injection Scanners Perform? A Benchmark 1 -- 2023-12-07
Show HN: ChatData – an open-source ChatGPT-like chatbot 1 -- 2023-11-29
3D Gaussian Splat Viewer (top item) 1 -- 2023-10-23
Who loves you Hacker News? 1 -- 2023-10-12
Curious about Causality and Generative Models? Check Out This Demo 1 -- 2023-07-26
Have You Tried AWS Inferentia2 for ML Deployments? 1 -- 2023-07-16
Open Source LLM Inference DLC 1 -- 2023-06-29
WizardCoder: Empowering Code Large Language Models with Evol-Instruct 1 -- 2023-06-15
Text Embedding Benchmark (MTEB) Leaderboard 1 -- 2023-02-20