tech · MEDIUM · 2026-05-06 22:23 UTC

Can LLMs Audit Smart Contracts? Benchmarking Claude Opus 4.7, GPT-5.5, and Gemini 3.1 Pro

I gave 56 known-vulnerable Solidity smart contracts to three frontier LLMs (Claude Opus 4.7, GPT-5.5, and Gemini 3.1 Pro) and asked each one to find the bugs. After 168 API calls, about $5, and a couple of surprises, here is what the data says. Claude finds the most bugs (98.2%). GPT-5.5 localizes the
