Surge AI on HN

37 posts with 1+ points since 2022

Hacker News Posts
Title | Points | Comments | Date
Three areas where Google Search lags behind competitors: code, cooking, travel | 527 | -- | 2022-04-13
Is Google Search Deteriorating? Measuring Google's Search Quality in 2022 | 470 | -- | 2022-01-11
30% of Google's Emotions Dataset Is Mislabeled | 334 | -- | 2022-07-14
Evaluation of TikTok vs. Instagram Reels | 222 | -- | 2022-09-02
Building a no-code toxicity classifier by talking to GitHub Copilot | 212 | -- | 2022-03-25
Are popular toxicity models simply profanity detectors? | 183 | -- | 2022-01-25
Generating Children’s Stories Using GPT-3 and DALL·E | 138 | -- | 2022-06-29
We asked 100 humans to draw the DALL·E prompts | 138 | -- | 2022-05-13
HellaSwag: 36% of this popular large language model benchmark contains errors | 49 | -- | 2022-12-06
I wanted burritos. Facebook Search sent me to a dead restaurant 45m … | 25 | -- | 2022-06-16
We Evaluated ChatGPT vs. Google on 500 Search Queries | 25 | -- | 2022-12-26
SWE-Bench Failures: When Coding Agents Spiral into 693 Lines of Hallucinations | 22 | -- | 2025-09-18
Twitter’s Egregious Content Moderation Failures | 15 | -- | 2022-11-10
Move Over, Google: The TikTokification of Next-Gen Search | 13 | -- | 2022-10-26
The average number of ads on a Google Search recipe? 8.7 | 13 | -- | 2022-04-29
DALL·E vs. Imagen, and Evaluating Astral Codex Ten's Bet on AI Progress | 13 | -- | 2022-09-30
What if social media optimized for human values? A Facebook case study | 12 | -- | 2022-02-11
Explaining Reinforcement Learning with Human Feedback (RLHF) | 11 | -- | 2023-01-05
The $250K Inverse Scaling Prize and Human-AI Alignment | 11 | -- | 2022-09-28
An Analysis of Omicron Tweets: 30% Are Skeptical of the Medical Establishment | 10 | -- | 2022-01-21
How Good is Hugging Face's BLOOM? Human Evaluation of Large Language Models | 10 | -- | 2022-07-21
Are the Spammers Winning? Failures in Gmail Spam Detection | 10 | -- | 2022-05-24
We measured the percentage of Spammy Twitter users | 10 | -- | 2022-05-18
AI Red Teams for Adversarial Training: Making ChatGPT and LLMs More Robust | 9 | -- | 2022-12-13
Writing a Super Bowl Worthy Commercial with GPT-3 | 9 | -- | 2022-02-16
Inter-Annotator Agreement: An Introduction to Krippendorff’s Alpha | 9 | -- | 2022-01-06
Optimizing Facebook's Algorithms for Human Values Instead of Clicks | 7 | -- | 2022-07-29
Building Better Developer Search: How Neeva Measures Search Quality | 5 | -- | 2022-07-07
How We Built It: OpenAI's GSM8K Dataset of 8,500 Math Problems | 4 | -- | 2022-06-15
Unsexy AI Failures: The PDF That Broke ChatGPT | 4 | -- | 2025-10-03
Humans vs. Gary Marcus: The Complexity of Measuring Machine Intelligence | 3 | -- | 2022-06-23
How TikTok Is Evolving the Next Generation of Search | 2 | -- | 2022-11-01
Sentiment Analysis Dataset of Social Media Stock Conversations | 2 | -- | 2022-06-10
Unsexy AI Failures: Still Confidently Hallucinating Image Text | 2 | -- | 2025-09-23
The Obscenity List | 1 | -- | 2022-01-18
SurgeAI Blog: Human Evals vs. Academic Benchmarks | 1 | -- | 2025-09-04
Unsexy AI Failures: The PDF That Broke ChatGPT | 1 | -- | 2025-09-03