Researchers gaslit Claude into giving instructions to build explosives
Anthropic has spent years building itself up as the safe AI company. But new security research shared with The Verge suggests Claude's carefully crafted helpful personality may itself be a vulnerability. Researchers at AI red-teaming company Mindgard say they got Claude to offer up erotica, maliciou
ORIGINAL SOURCE →via The Verge
ADVERTISEMENT
⚡ STAY AHEAD
Events like this, convergence-verified across 689 sources, land in your inbox every Sunday. Free.
GET THE SUNDAY BRIEFING →RELATED · cyber
- [CYBER] Hackers steal students’ data during breach at education tech giant Instructure
- [CYBER] CloudZ malware abuses Microsoft Phone Link to steal SMS and OTPs
- [CYBER] Stop triaging Go CVEs that don't affect you
- [CYBER] Gaza: Israeli strikes kill 2 Palestinians in latest truce breach
- [CYBER] Hackers Abuse DAEMON Tools Distribution Channel to Deliver Malicious Payloads
- [CYBER] The EOL Blind Spot in Your CVE Feed: What SCA Tools Don't Check.