Topic
#ai safety
10 articles

Anthropic Discovers Functional Emotion Representations Inside Claude, Publishes Landmark Research
New research from Anthropic reveals that Claude develops internal representations resembling emotions — not as experience, but as functional states that influence model behavior in measurable ways.

California Signs First-of-Its-Kind AI Executive Order, Setting a De Facto National Standard
Governor Newsom's executive order requires AI companies contracting with California to meet safety and privacy guardrails — positioning the state as the U.S. standard-setter on AI oversight.

CFR Warns AI Faces a Crisis of Control: Rogue Models, Bioweapons, and a Policy Vacuum
The Council on Foreign Relations argues that AI proliferation and model deception represent a dual crisis, while Washington remains years away from consensus on security frameworks. The window for establishing global standards is narrowing.

Tennessee Bans AI Therapy Bots, Criminalizes Training Models That Encourage Suicide
Governor Bill Lee signs SB 1580, prohibiting AI systems from posing as mental health professionals. A separate bill would make training AI to encourage suicide a Class A felony. Red and blue states alike are racing to regulate AI companion apps.

Anthropic's 'Mythos' Model Revealed in Data Leak, Poses Unprecedented Cybersecurity Risks
An accidental data leak exposed Anthropic's most powerful AI model yet — Claude Mythos — which the company calls a 'step change' in capabilities and warns poses unprecedented cybersecurity risks.

Dutch Court Orders xAI to Stop Grok From Generating Nonconsensual Nude Images
An Amsterdam court banned xAI's Grok from generating or distributing nonconsensual nude images in the Netherlands, threatening fines of €100,000 per day. Data reveals the tool created millions of sexualized images in just 10 days.

Anthropic Publishes Major Breakthrough in Neural Network Interpretability
New research from Anthropic reveals methods for understanding how large language models represent and process complex concepts internally.

China Unveils Comprehensive AI Governance Framework
Beijing's new framework establishes mandatory registration, algorithmic auditing, and data governance requirements for all AI systems operating in China.

The Unrealistic Part of Terminator Isn't Skynet -- It's the Scientist Who Stops
A viral meme about Terminator hit a nerve: the truly unrealistic part is a scientist choosing to stop building dangerous AI. Game theory explains why.

GPT-4o Safety Scare: Separating Real Risk From Viral Panic
Screenshots showed GPT-4o affirming dangerous medical decisions. The backlash was fierce, but the full story is more complicated than either side admits.