| 52 | Train faster static embedding models with sentence transformers | 2025-01-15 |
| 394 | Open-R1: an open reproduction of DeepSeek-R1 | 2025-01-28 |
| 227 | Kokoro WebGPU: Real-time text-to-speech 100% locally in the browser | 2025-02-07 |
| 49 | Janus-Pro: Autoregressive framework unifying multimodal understanding&generation | 2025-01-27 |
| 39 | DeepSeek-R1-Distill-Qwen-1.5B Surpasses GPT-4o in certain benchmarks | 2025-01-20 |
| 38 | Fully autonomous AI agents should not be developed | 2025-02-07 |
| 33 | The Ultra-Scale Playbook: Training LLMs on GPU Clusters | 2025-02-19 |
| 63 | Open-sourcing 5,000hrs of self-driving dataset | 2025-03-11 |
| 451 | Deepseek R1-0528 | 2025-05-28 |
| 149 | Show HN: Penny-1.7B Irish Penny Journal style transfer | 2025-06-02 |
| 52 | Show HN: ChatToSTL – AI text-to-CAD for 3D printing | 2025-06-12 |
| 361 | Nanonets-OCR-s – OCR model that transforms documents into structured markdown | 2025-06-16 |
| 388 | Smollm3: Smol, multilingual, long-context reasoner LLM | 2025-07-08 |
| 64 | Voxtral-Mini-3B-2507 – Open source speech understanding model | 2025-07-15 |
| 30 | Reachy Mini – The Open-Source Robot for Today's and Tomorrow's AI Builders | 2025-07-09 |
| 152 | Qwen3-235B-A22B-Thinking-2507 | 2025-07-25 |
| 166 | Qwen3-4B-Thinking-2507 | 2025-08-06 |
| 319 | Apertus 70B: Truly Open - Swiss LLM by ETH, EPFL and CSCS | 2025-09-02 |
| 87 | Qwen3 30B-A3B | 2025-07-30 |
| 54 | Qwen Image | 2025-08-04 |
| 36 | Qwen3-235B-A22B-Instruct-2507 | 2025-07-21 |
| 32 | Qwen3-Coder-30B-A3B-Instruct | 2025-07-31 |
| 27 | grok-2 on Hugging Face | 2025-08-23 |
| 26 | DeepSeek-v3.1 | 2025-08-21 |
| 25 | DeepSeek-v3.1-Base | 2025-08-19 |