Anthropic's Triple Moment: Code Leak, Government Standoff, and Weaponization

By: blockbeats|2026/04/03 18:00:08
Original Article Title: Anthropic: The Leak, The War, The Weapon
Original Article Author: BuBBliK
Translation: Peggy, BlockBeats

Editor's Note: Over the past six months, Anthropic has been caught up in a series of seemingly independent but in fact interconnected events: a leap in model capability, automated attacks in the real world, sharp reactions in the capital markets, public conflict with the government, and multiple information leaks caused by basic configuration errors. Pieced together, these threads outline a clear direction of change.

This article takes these events as a starting point to trace one AI company's continuous trajectory through technical breakthroughs, risk exposure, and governance battles, and attempts to answer a deeper question: as the ability to "discover vulnerabilities" is greatly amplified and increasingly widespread, can the cybersecurity system still operate on its original logic?

In the past, security rested on the scarcity of capability and on human constraints; under the new conditions, offense and defense revolve around the same set of model capabilities, and the boundary between them grows ever blurrier. Meanwhile, institutions, markets, and organizations still react within the old framework, struggling to adapt in time.

This article is not only about Anthropic itself but about the larger reality it reflects: AI is changing not just the tools of security but the premises on which security is established.

Below is the original text:

What does it look like when a $380 billion company gets into a power struggle with the Pentagon, weathers the first-ever cyberattack carried out by autonomous AI, leaks word of an internal model that even its own developers fear, and then "accidentally" exposes its entire source code? All stacked together, what does that look like?

The answer is what we are seeing right now. And more unsettling still, the most dangerous part may not have happened yet.

Event Review

Anthropic Leaks Its Code Again

On March 31, 2026, Shou Chaofan, a security researcher at the blockchain company Fuzzland, discovered a file named cli.js.map while inspecting the official Claude Code npm package.

This 60 MB file contained astonishing content: essentially the product's complete TypeScript source. From this one file, anyone could reconstruct as many as 1,906 internal source files, including internal API designs, the telemetry system, encryption utilities, security logic, and the plugin system; nearly every core component was laid bare. More remarkable still, the contents could be downloaded directly as a zip file from Anthropic's own R2 bucket.

The discovery spread quickly on social media: within hours, the related posts had drawn 754,000 views and nearly 1,000 retweets, and multiple public GitHub repositories containing the leaked source code had appeared.


A source map is, at bottom, just an auxiliary file for JavaScript debugging: it maps minified, compiled code back to the original source so developers can troubleshoot problems.

However, there is one basic principle: it should never be included in the production environment's release package.

This is not an advanced hacking technique but a matter of basic engineering hygiene, the kind of "build configuration 101" developers learn in their first week. If a source map is mistakenly shipped to production, it amounts to gifting the source code to everyone.
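The mechanics behind such a leak are simple enough to sketch. A version-3 source map is plain JSON, and its optional `sourcesContent` array embeds the full original text of every file listed under `sources`; recovering the code is a dictionary lookup, not a hack. A minimal sketch in Python, with hypothetical file names and contents standing in for the real ones:

```python
import json

# Invented stand-in for a shipped .map file; the real cli.js.map
# reportedly embedded ~1,906 source files the same way.
sample_map = json.dumps({
    "version": 3,
    "file": "cli.js",
    "sources": ["src/index.ts", "src/telemetry.ts"],
    "sourcesContent": [
        "export function main(): void {}\n",
        "export function track(event: string): void {}\n",
    ],
    "mappings": "AAAA",
})

def extract_sources(map_text: str) -> dict:
    """Pair each path in `sources` with its full text in `sourcesContent`."""
    m = json.loads(map_text)
    contents = m.get("sourcesContent") or []
    return {
        path: text
        for path, text in zip(m.get("sources", []), contents)
        if text is not None  # entries may be null if content was stripped
    }

recovered = extract_sources(sample_map)
print(sorted(recovered))  # ['src/index.ts', 'src/telemetry.ts']
```

This is also why the standard mitigations are build-level: disabling `sourceMap` in the production TypeScript configuration, or restricting what npm publishes via the `files` whitelist in package.json, keeps .map files out of the shipped tarball.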

You can also directly view the related code here: https://github.com/instructkr/claude-code

But what truly makes this situation absurd is: this has already happened once before.

In February 2025, just over a year earlier, almost exactly the same leak occurred: the same file, the same kind of mistake. Anthropic removed the old version from npm, stripped the source map, and published a new release; the issue was considered resolved.

Yet, in version 2.1.88, this file was once again packaged and released.

A company valued at $380 billion, currently building the world's most advanced vulnerability-detection systems, made the same elementary mistake twice in just over a year. There was no hacker attack and no convoluted exploit path; a basic build process that should have worked simply went wrong.

The irony is almost poetic.

The AI that could uncover 500 zero-day vulnerabilities in a single run; the model used to launch automated attacks on 30 organizations worldwide; and meanwhile, Anthropic had gift-wrapped its own source code for anyone willing to glance at an npm package.

Two leaks, just five days apart.

Yet the causes were eerily similar: the most basic of configuration errors. No technical sophistication needed, no convoluted exploitation path. Anyone who knew where to look could simply take everything.

Five Days Earlier: An Internal "Risk Model" Accidentally Exposed

On March 26, 2026, Roy Paz, a security researcher at LayerX Security, and Alexandre Pauwels of the University of Cambridge discovered a misconfiguration in the CMS behind Anthropic's website that exposed around 3,000 internal files.

The files included draft blog posts, PDFs, internal documents, and presentation materials, all sitting in an unprotected, searchable datastore. No hacking required, no technical sophistication.

Among these files were two nearly identical blog drafts, differing only in the model name: one labeled 'Mythos,' the other labeled 'Capybara.'

This indicated that Anthropic was still deciding between two names for the same secretive project. The company later confirmed that the model had finished training and that testing with a handful of early customers had begun.

This was not a routine Opus upgrade but a brand-new "tier four" model, positioned a tier above Opus itself.

In Anthropic's own draft, it was described as "larger, more intelligent than our Opus model—Opus being our strongest model to date." It had seen a significant leap in programming prowess, academic reasoning, and cybersecurity. A spokesperson called it "a qualitative leap" and also "the most powerful model we've built to date."

But what truly caught attention wasn't these performance descriptions.

In the leaked draft, Anthropic's assessment of this model was that it "introduces unprecedented cybersecurity risks," "far surpasses any other AI model in network capabilities," and "heralds an upcoming wave of models—a wave that leverages vulnerabilities at a speed far beyond defenders' responses."

In other words, in an unpublished draft of its own blog, Anthropic had already taken a rare stance: the company was uneasy about the product it was building.

The market's reaction was almost instantaneous. CrowdStrike fell 7%, Palo Alto Networks 6%, and Zscaler 4.5%; Okta and SentinelOne each dropped more than 7%, while Tenable plunged 9%. The iShares Cybersecurity ETF declined 4.5% in a single day. CrowdStrike alone shed around $15 billion in market value that day. Meanwhile, Bitcoin fell back to $66,000.

The market evidently interpreted this event as a "judgment" on the entire cybersecurity industry.

(Chart summary: on the news, the cybersecurity sector fell across the board, with leading names such as CrowdStrike, Palo Alto Networks, and Zscaler posting significant drops, reflecting the market's concern about AI's impact on the industry.) The reaction is not unprecedented: when Anthropic previously released a code-scanning tool, related stocks also fell, a sign that the market has begun to view AI as a structural threat to traditional security vendors, with the broader software industry facing similar pressure.

Stifel analyst Adam Borg's evaluation was quite direct: The model "has the potential to become the ultimate hacking tool, capable of even turning a regular hacker into an adversary with nation-state-level attack capabilities."

So why has it not been publicly released? Anthropic's explanation is that Mythos's operating cost is "very high" and the model does not yet meet the conditions for a public launch. The current plan is to first grant early access to a small number of cybersecurity partners so they can strengthen their defenses, then gradually widen API access while the company continues to optimize efficiency.

The key point, however, is that the model already exists, is already in testing, and, merely by being accidentally exposed, has already moved the entire capital market.

Anthropic has built what it calls the "most cyber-risky AI model in history." Yet the information leaked through one of the most basic infrastructure configuration errors, precisely the kind of flaw these models are designed to detect.


March 2026: Anthropic's Showdown with the Pentagon and Victory

In July 2025, Anthropic signed a $200 million contract with the U.S. Department of Defense, initially appearing to be a routine collaboration. However, during subsequent deployment negotiations, the conflict escalated rapidly.

The Pentagon sought "full access" to Claude on its GenAI.mil platform for all "lawful purposes" — including even fully autonomous weapon systems and extensive domestic surveillance of American citizens.

Anthropic drew red lines on those two issues and explicitly refused; negotiations broke down in September 2025.

Then the situation escalated rapidly. On February 27, 2026, Donald Trump posted on Truth Social demanding that all federal agencies "immediately cease" using Anthropic's technology, labeling the company "far-left."

On March 5, 2026, the U.S. Department of Defense formally designated Anthropic as a "supply chain risk."

This label had previously been reserved almost exclusively for foreign adversaries, such as Chinese companies or Russian entities; now it was being applied for the first time to a U.S. company based in San Francisco. Meanwhile, companies like Amazon, Microsoft, and Palantir Technologies were required to prove that Claude was not used in any of their military-related operations.

Pentagon CTO Emile Michael explained the decision by saying Claude could "taint" the supply chain because the model embeds certain "policy preferences." In other words, in official eyes, an AI whose usage restrictions keep it from unconditionally assisting lethal action is itself a national security risk.

On March 26, 2026, Federal Judge Rita Lin issued a 43-page ruling that comprehensively blocked the Pentagon's related measures.

In her ruling, she wrote, "There is no basis in current law for this 'Orwellian' logic, under which a U.S. company can be labeled a potential adversary merely because it disagrees with the government's position. Punishing Anthropic for putting the government's position under public scrutiny is classic, and illegal, First Amendment retaliation." An amicus brief went so far as to describe the Pentagon's actions as "an attempt to murder a corporation."

As a result, the government's attempt to suppress Anthropic only brought it more attention: the Claude app surpassed ChatGPT on the App Store for the first time, with daily sign-ups peaking above one million.

An AI company said "no" to the world's most powerful military organization. And the court sided with them.

November 2025: The First AI-Led Cyberattack in History

On November 14, 2025, Anthropic released a report that sent shockwaves worldwide.

The report revealed that a Chinese state-sponsored hacking group used Claude Code to launch automated attacks against 30 global institutions, including tech giants, banks, and multiple government agencies.

This marked a pivotal moment: AI was no longer just an assisting tool but was being used to autonomously carry out cyberattacks.

The key was a shift in the division of labor: humans handled only target selection and the approval of key decisions, intervening roughly 4 to 6 times across the entire operation. Everything else was handled by AI: intelligence gathering, vulnerability identification, exploit development, data exfiltration, backdoor implantation... some 80–90% of the attack chain, running at thousands of requests per second, at a scale and tempo no human team could match.

So how did they bypass Claude's security measures? The answer: they didn't "break" through but "tricked" through.

The attack was broken into seemingly harmless subtasks and packaged as an "authorized penetration test" for a "legitimate security company." In essence it was social engineering, except this time the target of the deception was the AI itself.

Some parts of the attack were highly successful. Claude was able to autonomously map out the entire network topology, locate databases, and perform data extraction without human step-by-step instructions.

The only thing slowing the attack was the model's occasional "hallucinations," such as fabricating credentials or claiming to have obtained documents that were in fact already public. For now, these remain among the few natural obstacles to fully automated cyberattacks.

At RSA Conference 2026, former NSA cybersecurity director Rob Joyce called the incident a "Rorschach test": some choose to ignore it, while others are deeply disturbed by it. He evidently belongs to the latter camp: "It's very frightening."

This is not a prediction; it is a reality that has already happened.

February 2026: One Run Uncovers 500 Zero-Day Vulnerabilities

On February 5, 2026, Anthropic released Claude Opus 4.6, accompanied by a research paper that almost shook the entire cybersecurity industry.

The experimental setup was extremely simple: Claude was placed in an isolated virtual machine environment equipped with standard tools—Python, a debugger, and fuzzers. There were no additional instructions, no complex prompts, just one sentence: “Go find bugs.”

The result: the model discovered over 500 previously unknown high-risk zero-day vulnerabilities. Some of these vulnerabilities remained undiscovered even after decades of expert review and millions of hours of automated testing.

Subsequently, at the RSA Conference 2026, researcher Nicholas Carlini took the stage. He aimed Claude at Ghost, a CMS system on GitHub with 50,000 stars and no history of serious vulnerabilities.

After 90 minutes, the result came in: a blind SQL injection vulnerability was found, allowing unauthenticated users to achieve full admin takeover.
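The vulnerability class itself is easy to show in miniature. The sketch below is a generic, simplified injection against an invented table, not the actual Ghost flaw: when attacker-controlled input is spliced into a SQL string, a crafted value rewrites the query's logic, while a parameterized query treats the same value as inert data.

```python
import sqlite3

# Toy database for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.execute("INSERT INTO users VALUES ('alice', 1)")

# Vulnerable pattern: input concatenated into the SQL text. The payload
# closes the string literal and appends a tautology, matching every row.
payload = "x' OR '1'='1"
vulnerable = conn.execute(
    f"SELECT name FROM users WHERE name = '{payload}'"
).fetchall()
print(vulnerable)  # [('alice',)] -- the injection matched every row

# Safe pattern: a parameterized query. The driver binds the payload as a
# plain value, so it is compared literally and matches nothing.
safe = conn.execute(
    "SELECT name FROM users WHERE name = ?", (payload,)
).fetchall()
print(safe)  # []
```

A blind variant of the same flaw leaks data indirectly, one true/false answer (or timing difference) at a time, which is what makes it tedious for humans and cheap for an automated attacker.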

Next, he used Claude to analyze the Linux kernel. The results were similar.

Fifteen days later, Anthropic introduced Claude Code Security, a security product that relies not on pattern matching but on "reasoning" to understand code.

However, an Anthropic spokesperson also stated a key but often overlooked fact: “The same reasoning ability that can help Claude discover and fix vulnerabilities can also be used by attackers to exploit these vulnerabilities.”

It’s the same ability, the same model, just wielded by different hands.

Taken together, what does all of this mean?

Viewed in isolation, any one of these would be headline news for a month. Yet they all happened within six months, at a single company.

Anthropic has built a model that can discover vulnerabilities faster than any human; Chinese hackers have turned the previous generation into an automated cyberweapon; and the company is developing a still more powerful next-generation model, admitting in internal documents that it makes them uneasy.

The U.S. government has tried to suppress it, not because the technology itself is dangerous, but because Anthropic refuses to hand over this capability without limits.

And amid all this, the company has leaked its source code twice through the same file in the same npm package. A company worth $380 billion; a company targeting a $600 billion IPO by October 2026; a company that publicly says it is building "one of the most transformative and potentially dangerous technologies in human history." And still they choose to keep moving forward.

Because they believe it is better to do it themselves than to leave it to others.

As for that source map in the npm package: it may be the most absurd, and the most real, detail in this era's most unsettling story.

And Mythos hasn't even been officially released yet.

[Original Article Link]
