Claude vs ChatGPT vs Gemini for Coding: Testing Results
TL;DR: I ran the same 5 coding tasks through Claude Opus 4.6, OpenAI Codex CLI (gpt-5.3-codex), Google Gemini 2.5 Flash (sorry, I did not have easy access to the newer Gemini models, but Gemma 4 was tested!), and two open-source models I ran locally: Gemma 4 31B and Qwen 3.5 35B. Claude’s code was the most production-ready. Codex and Qwen tied for best code reviewer. Gemini was the cheapest. The open-source models scored A-, closing in on the paid tier.