Humans vs. Gary Marcus vs. Slate Star Codex: When is an AI failure actually a failure?
Blog post from Surge AI
The discussion between Scott of Slate Star Codex and Gary Marcus centers on the capabilities and limitations of large language models like GPT-3, particularly whether they show genuine intelligence and commonsense reasoning. Marcus argues that these models lack true understanding, citing examples where GPT-3 fails to reason logically, such as suggesting that a lawyer might wear a bathing suit to court. Scott counters that what Marcus sees as failures may instead reflect the model's attempt to mimic human creativity or humor. The debate extends to how humans would perform on similar tasks: their answers vary and sometimes match GPT-3's, which suggests that judging an AI's intelligence can be subjective and context-dependent. This raises broader questions about the criteria for judging AI capabilities and about whether AI models can demonstrate qualities akin to human creativity.