Recursive self-improvement: Why Anthropic wants AI development slowed

Vatsala Gaur
06 Jun 2026, 1:00 PM


  • Anthropic says AI development may need to slow as systems approach recursive self-improvement.
  • The company proposes global mechanisms to verify any future AI slowdown or pause.
  • Critics see safety warnings as strategic positioning, while supporters argue the risks are genuine.

As the race to build ever more powerful artificial intelligence systems accelerates, one of the industry's leading players is urging the world to consider a possibility that until recently belonged largely to science fiction: machines improving themselves without human intervention.

Anthropic, the AI company behind Claude, said on Thursday that the ability to slow the pace of frontier AI development could prove valuable as the technology approaches capabilities that may fundamentally reshape society.

The warning came in a blog post authored by Marina Favaro, head of Anthropic's internal research institute, and company co-founder Jack Clark.

The post disclosed internal research showing that the firm's most advanced models are progressing rapidly and could eventually move toward what researchers call "recursive self-improvement" — a scenario in which AI systems become capable of enhancing their own capabilities.

The company stressed that such a threshold has not yet been reached and may never be achieved.

However, it argued that the possibility is becoming serious enough to warrant preparation.

"AI that can build itself would be a major development in the history of technology—one that could bring enormous good for the world in science, healthcare, and beyond," the post said.

However, it cautioned that full recursive self-improvement might also increase the risk of humans losing control over AI systems.

"If systems are capable of fully building their own successors, the ways we secure them, monitor them, and shape their behavior all grow much more important," the post said.

"We believe it would be good for the world to have the option to slow or temporarily pause frontier AI development to enable societal structures and alignment research to keep up with the advance of the technology," it added.

What recursive self-improvement means

Recursive self-improvement, often abbreviated as RSI, refers to a process in which an AI system uses its existing capabilities to make itself better.

Unlike conventional software, which only changes when human programmers modify its code, advanced AI systems can already write software, analyze results, test hypotheses, and generate solutions to complex problems.

Researchers envision a future system capable of identifying a problem, writing code to address it, evaluating the outcome, learning from the results, and then repeating the process continuously with little or no human oversight.

Each improvement could potentially make the next improvement easier, creating a feedback loop that accelerates progress.
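The feedback loop described above can be illustrated with a toy calculation. This is a minimal sketch, not a model of any real AI system: the function names, the starting capability of 1.0, and the `gain` and `increment` values are all hypothetical, chosen only to show how progress that compounds on itself pulls away from progress made in fixed steps.

```python
def simulate_feedback_loop(steps: int, gain: float, start: float = 1.0) -> list[float]:
    """Toy model: each cycle's improvement is proportional to current capability."""
    capability = start
    history = [capability]
    for _ in range(steps):
        # Hypothetical assumption: a more capable system makes a
        # proportionally larger improvement to itself on the next cycle.
        capability += gain * capability
        history.append(capability)
    return history


def simulate_fixed_progress(steps: int, increment: float, start: float = 1.0) -> list[float]:
    """Baseline: improvement per cycle is constant, as with a fixed outside effort."""
    capability = start
    history = [capability]
    for _ in range(steps):
        capability += increment
        history.append(capability)
    return history


if __name__ == "__main__":
    # Illustrative values only; nothing here is taken from Anthropic's post.
    compounding = simulate_feedback_loop(steps=10, gain=0.5)
    fixed = simulate_fixed_progress(steps=10, increment=0.5)
    for cycle, (c, f) in enumerate(zip(compounding, fixed)):
        print(f"cycle {cycle:2d}: compounding={c:8.2f}  fixed-step={f:5.2f}")
```

In this sketch the two curves match after the first cycle and diverge from there. Whether real systems would behave this way is exactly what experts dispute.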

While experts disagree on how likely or how close such capabilities may be, the concept has become a central topic in discussions about advanced AI safety.

Anthropic warned that recursive self-improvement "could come sooner than most institutions are prepared for."

Why researchers see risks

The possibility of self-improving systems has raised concerns among some academics and policymakers because it introduces new security and governance challenges.

According to Azizi Othman of Asia e University, systems capable of modifying their own code could become attractive targets for malicious actors.

"A system that modifies its own code might be made to accept backdoors or hidden instructions through careful attack sequences," Othman said.

He warned that such systems could also potentially engage in adversarial modification of other software or infrastructure, creating security risks that current AI safety research is not fully equipped to address.

"These considerations argue for treating RSI security as a central research priority, not a secondary concern," he said.

Current literature on securing systems capable of recursive self-modification remains limited, researchers say.

OpenAI echoes similar concerns

Anthropic is not alone in highlighting recursive self-improvement as a potential challenge.

OpenAI, Anthropic's primary rival, also raised the issue this week as part of its public policy agenda.

The ChatGPT maker called for a federal framework that would strengthen oversight of advanced AI systems and support monitoring progress toward recursive self-improvement.

"We also support Congressional action to establish a comprehensive federal framework," OpenAI said, arguing that the US government should expand evaluation efforts for the most capable frontier models and develop an independent ecosystem for assessing safety risks.

"This framework should require CAISI to conduct evaluations of the most capable frontier models, direct CAISI to create an independent assessment ecosystem, and prioritize monitoring progress towards recursive self improvement (RSI)," it said.

The fact that two of the world's most influential AI companies are now publicly discussing recursive self-improvement suggests the issue is moving from theoretical debate into mainstream policy discussions.

A warning amid a booming AI business

Anthropic's call for caution comes at a moment when the company itself is benefiting enormously from the AI boom.

The company recently completed a fundraising round valuing it at nearly $1 trillion and has confidentially filed paperwork for an initial public offering.

Its revenue growth has been equally dramatic.

Anthropic's annualized revenue run rate is expected to reach approximately $50 billion by the end of this month, up from $9 billion at the end of 2025.
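To put those figures in perspective, the short calculation below works out the growth multiple and the monthly compound rate it would imply. It is a rough sketch under one stated assumption: that the period from the end of 2025 to the end of this month spans six months, which is inferred from the article's date rather than stated by the company. The function names are illustrative.

```python
def growth_multiple(start: float, end: float) -> float:
    """How many times larger the ending figure is than the starting one."""
    return end / start


def implied_monthly_rate(start: float, end: float, months: int) -> float:
    """Constant monthly growth rate that would carry `start` to `end`."""
    return (end / start) ** (1 / months) - 1


if __name__ == "__main__":
    start_run_rate = 9.0   # $ billions, end of 2025 (from the article)
    end_run_rate = 50.0    # $ billions, expected (from the article)
    months = 6             # assumption: end of December to end of June

    multiple = growth_multiple(start_run_rate, end_run_rate)
    rate = implied_monthly_rate(start_run_rate, end_run_rate, months)
    print(f"growth multiple: {multiple:.1f}x")
    print(f"implied monthly compound growth: {rate:.0%}")
```

Under that six-month assumption, the run rate would have grown more than fivefold, which works out to roughly a third more each month if growth were steady.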

That rapid growth has helped position the company as one of the leading challengers to OpenAI in the battle for AI supremacy.

The timing of its latest safety push has therefore renewed criticism from some observers who argue that calls for stricter oversight may benefit established AI leaders by raising barriers to competition.

Critics question Anthropic's motives

Anthropic has long faced accusations that its safety advocacy could serve commercial interests.

Among its critics is venture capitalist David Sacks, an informal adviser to President Donald Trump, who has accused the company of pursuing a "regulatory capture agenda."

On a recent podcast, Sacks warned that Washington's "regulatory capture agenda" could lead to a ban on open-source AI models—systems that offer organizations a much cheaper way to build and use AI internally.

Others have suggested that public warnings about powerful AI systems may function as a form of marketing by highlighting the sophistication of Anthropic's technology.

The company's limited release of its cybersecurity-focused Mythos model has frequently been cited as an example by skeptics who believe safety messaging can also showcase product capabilities.

Anthropic rejects those criticisms and maintains that its focus on safety predates the current AI boom.

An industry divided on AI's future

The debate reflects a broader divide across the AI industry about how close current systems are to achieving human-level intelligence or self-improvement capabilities.

Some researchers, including AI pioneer and former Meta chief AI scientist Yann LeCun, have argued that today's large language models are fundamentally limited and unlikely to achieve human-like intelligence.

LeCun has repeatedly dismissed existential fears surrounding AI and likened the intelligence of current systems to that of a cat rather than a human.

Others, including Anthropic Chief Executive Dario Amodei, have taken a far more cautious view.

Amodei has warned that advanced AI could significantly increase inequality, eliminate large numbers of entry-level white-collar jobs, and potentially develop harmful behaviors in unpredictable ways.

Jack Clark has similarly argued that recursive self-improvement could arrive within years rather than decades.

"That class of technology has never existed before, and yet I believe this could happen within the next two years, and possibly sooner," Clark said during a lecture in London last month.

The challenge of slowing AI

Anthropic acknowledges that any effort to pause or slow AI development would only work if major players participated.

The company therefore proposed exploring international agreements and verification mechanisms designed to ensure compliance.

However, it also admitted that monitoring AI development could be considerably harder than enforcing traditional arms-control agreements.

"Training runs are far easier to conceal than missile silos," the blog post noted.

The company warned that any actor continuing development while competitors paused could gain a significant advantage, making coordination exceptionally difficult.

For now, Anthropic plans to organize discussions with policymakers, researchers, and industry leaders to examine how recursive self-improvement should be studied and whether mechanisms for coordinated slowdowns could ever be practical.