Finding a hole in the software you rely on now costs an attacker 17 cents

ID: NB-L066
TYPE: .log
LOGGED: 2026-07-04 18:05 WEST
READING: 4 min read
STATUS: PUBLISHED
CLEARANCE: PUBLIC

An independent test put a free Chinese AI model, one that anyone can download, up against Anthropic's flagship at finding vulnerabilities in code. The Chinese model won, at roughly 17 cents per flaw it found.

The benchmark is the least interesting part of this story. What matters is the contrast. The United States spent months building, and then partly walking back, export controls on its most capable AI models, meant to keep the most dangerous capability out of adversaries' hands. Meanwhile, a Chinese lab published a model for free that beat the American flagship at finding flaws in code, under a licence that lets anyone run it on their own machine. You cannot embargo a file that everyone can download. And whoever wants to attack you just got the tool for nothing.

A model no one can recall

The model is called GLM-5.2, from Zhipu AI, a Chinese company that operates under the brand Z.ai. It was released on June 13 under an MIT licence, which in practice means open-weight: anyone can download the entire model, run it offline, and modify it without asking permission. It has roughly 750 billion parameters and a context window that reaches one million tokens.

The proof came from Semgrep, a software security firm that set it loose on access-control flaws, the kind of bug that lets one user see another's data just by changing a number in the address. With every model given the same prompt and no other help, GLM-5.2 scored 39% and came in ahead of the Claude configurations tested, at about 17 cents per flaw found. In Semgrep's own words, “the best open-weight option beat Claude Opus 4.8,” which landed at 28%. Zhipu goes further and claims parity with Claude Mythos, one of the American models the US government moved to restrict over cybersecurity concerns. This is where you hit the brakes. The independent evidence supports a narrow claim: the model is solid at assisted vulnerability analysis. It does not support the full parity in building attacks that the headline promises.

There is one more thing the test did not show, worth saying so that no one sells panic. Semgrep's own defensive pipeline, a machine built on purpose around several models, still beats raw GLM-5.2, scoring between 53% and 61% against the raw model's 39%. The sky did not fall. What moved was the floor: finding flaws at the level of the best models stopped being an expensive privilege and became a free download, and that cannot be undone.

Why this weighs more on the defender

The asymmetry is in who gains more from the same tool. The attacker gets the version with no brakes: they run the model on their own machine, strip out the safety guardrails that would block a malicious request, fine-tune it against their target, and work with no provider watching. When someone abuses a cloud-hosted AI, there is a trail: the provider detects it, cuts access, and keeps logs. An open model running offline breaks that chain of custody. There is no provider, no log, no visibility. The defender, on the other hand, does not download a finished solution. They have to build the machine around the model, the way Semgrep did, and that costs time, engineering, and money. The attacker downloads capability; the defender has to manufacture it.

In Portugal this lands just as the bar has been raised. On June 22, the National Cybersecurity Centre published the regulation that implements the country's new Legal Framework for Cybersecurity, the transposition of the European NIS2 directive. It sets three levels of compliance and a set of minimum measures that become mandatory for entities classed as essential, important, or relevant public bodies. The new law assumes what this test confirms: the organisations holding the country up can no longer start from the belief that the adversary is poorly equipped.

How to prepare for a better-armed adversary

Against a tool like this, defence does not change in nature; it changes in urgency. What helps:

Assume parity. Do not count on the attacker not having the tool. It is free, so assume they have it.
Shrink the patch window. The time between a flaw going public and being exploited is compressing, and patching fast has gone from good practice to survival.
Invest in the machine, not just the model. It was a system built around models, not a single model, that won the test. Your defence needs to be a process too, not a purchase.
Close the obvious first. The flaw GLM-5.2 hunts best is one that a good security test already catches: badly built access control. Much of what these tools find is still hygiene left undone.
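
To make "badly built access control" concrete, here is a minimal sketch of the bug described earlier, where changing a number in the address exposes someone else's data, next to the ownership check that closes it. Everything in it is hypothetical and for illustration only: the Invoice record, the in-memory INVOICES store, and both function names are assumptions, not code from Semgrep's test or from any real application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    invoice_id: int
    owner_id: int
    total_cents: int


# Hypothetical in-memory store standing in for a database table.
INVOICES = {
    101: Invoice(invoice_id=101, owner_id=1, total_cents=4_200),
    102: Invoice(invoice_id=102, owner_id=2, total_cents=9_900),
}


class Forbidden(Exception):
    """Raised when the caller may not see the requested record."""


def get_invoice_vulnerable(current_user_id: int, invoice_id: int) -> Invoice:
    # Flawed on purpose: the ID from the request is trusted and ownership
    # is never checked, so user 1 can read invoice 102 by changing the number.
    return INVOICES[invoice_id]


def get_invoice_checked(current_user_id: int, invoice_id: int) -> Invoice:
    invoice = INVOICES.get(invoice_id)
    # Answer "missing" and "not yours" the same way, so that valid IDs
    # cannot be discovered by trial and error.
    if invoice is None or invoice.owner_id != current_user_id:
        raise Forbidden(f"invoice {invoice_id} is not available to this user")
    return invoice
```

Calling get_invoice_vulnerable(1, 102) hands user 1 the invoice that belongs to user 2, while get_invoice_checked(1, 102) raises Forbidden. The fix is one comparison, which is why this class of flaw counts as hygiene: the check is cheap to write and easy to forget on one endpoint out of many.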

The ability to find the holes in your systems is now a free file that no one can call back. The only variable left is whether the defender moves as fast as the attacker who just downloaded it.

Sources: Semgrep, Dark Reading.

#StaySafe
🙏🖖