Can LLMs Audit Smart Contracts? Benchmarking Claude Opus 4.7, GPT-5.5, and Gemini 3.1 Pro
I gave 56 known-vulnerable Solidity smart contracts to three frontier LLMs — Claude Opus 4.7, GPT-5.5, and Gemini 3.1 Pro — and asked each one to find the bugs. 168 API calls, ~$5, and a couple of surprises later, here is what the data says. Claude finds the most bugs (98.2%). GPT-5.5 localizes the
Original source: Dev.to