The article compares two prominent AI models, OpenAI o4-mini and DeepSeek R1, on their ability to detect hard-to-find bugs across multiple programming languages. The evaluation dataset consists of 210 programs spanning several domains and languages, each seeded with a realistic bug. DeepSeek R1 detects more of these bugs overall, with its lead most pronounced in TypeScript, Go, and Rust, where it handles concurrency issues and logically complex code particularly well. OpenAI o4-mini, in contrast, performs better in Python, where its pattern-recognition strengths are more effective. The study attributes DeepSeek R1's advantage to its architecture and training methods, which help it catch the subtle logic errors and concurrency issues that often elude simpler detection methods. The sketch below illustrates the kind of concurrency bug involved.
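To make the "difficult-to-identify" framing concrete, here is a minimal, hypothetical Go sketch of the sort of concurrency bug such an evaluation targets. It is not drawn from the 210-program dataset; the program and its names are illustrative assumptions. The code compiles and runs, but the unsynchronized counter increment is a data race, so the final total is nondeterministic and usually wrong.

```go
// Hypothetical example of a subtle concurrency bug: the shared counter is
// incremented from many goroutines without synchronization, so increments
// are silently lost.
package main

import (
	"fmt"
	"sync"
)

func main() {
	var wg sync.WaitGroup
	counter := 0 // shared state, accessed without any locking or atomics

	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			counter++ // data race: read-modify-write is not atomic
		}()
	}

	wg.Wait()
	// Frequently prints a value below 1000; `go run -race` reports the race.
	fmt.Println("counter =", counter)
}
```

A bug like this passes compilation and often passes casual testing, which is why it serves as a useful stand-in for the concurrency issues the article says DeepSeek R1 is comparatively strong at flagging.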