HN Points | HN Title (links to original post) | Submitted Date |
---|---|---|
217 | Using reinforcement learning and $4.80 of GPU time to find the best HN post | 2024-10-28 |
199 | Using GRPO to Beat o1, o3-mini and R1 at “Temporal Clue” | 2025-03-06 |
81 | Show HN: RULER – Easily apply RL to any agent | 2025-07-11 |