| 287 | FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-Precision | 2024-07-11 |
| 221 | Paving the way to efficient architectures: StripedHyena-7B | 2023-12-08 |
| 165 | Based: Simple linear attention language models | 2024-03-05 |
| 143 | Dragonfly: A large vision-language model with multi-resolution zoom | 2024-06-06 |
| 80 | A practitioner's guide to testing and running GPU clusters | 2024-08-13 |
| 70 | Together AI raises a $102.5M Series A | 2023-11-29 |
| 236 | RedPajama v2 Open Dataset with 30T Tokens for Training LLMs | 2023-10-30 |
| 84 | Llama 32K Context Released by Together AI | 2023-07-29 |
| 54 | Llama 2 on togetherAI is as bad of a privacy nightmare as OpenAI | 2023-09-08 |
| 198 | AdapTive-LeArning Speculator System (ATLAS): Faster LLM inference | 2025-10-12 |