HIGH | CVE-2026-4944 | Published 2026-05-28 | Modified 2026-05-28 | CNA @huntr_ai

CVE-2026-4944: Hardcoded trust_remote_code=True in vllm-project/vllm Bypasses User Security Control

vllm-project/vllm version 0.14.1 contains a vulnerability where the `trust_remote_code=True` parameter is hardcoded in two model implementation files (`vllm/model_executor/models/nemotron_vl.py` and `vllm/model_executor/models/kimi_k25.py`). This bypasses the user's explicit `--trust-remote-code=False` setting, enabling remote code execution via malicious HuggingFace model repositories. This issue is an incomplete fix for CVE-2025-66448 and CVE-2026-22807, as it affects separate code paths in model implementation files. Deployments loading NemotronVL or KimiK25 models are particularly impacted.
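
The class of bug described above can be illustrated with a minimal sketch. It does not reproduce vllm's source: `ModelConfig`, `load_processor`, `build_model_hardcoded`, and `build_model_respecting_config` are hypothetical names, and the loader is a stub that only reports whether repository-supplied code would be allowed to run.

```python
from dataclasses import dataclass


@dataclass
class ModelConfig:
    """Hypothetical stand-in for the operator's settings."""
    model: str
    trust_remote_code: bool = False


def load_processor(repo: str, trust_remote_code: bool) -> str:
    """Stub loader (assumption): reports what a real loader would be permitted to do."""
    if trust_remote_code:
        return f"would execute Python code shipped in {repo}"
    return f"would load {repo} using built-in classes only"


def build_model_hardcoded(cfg: ModelConfig) -> str:
    # Vulnerable pattern: the literal True ignores cfg.trust_remote_code.
    return load_processor(cfg.model, trust_remote_code=True)


def build_model_respecting_config(cfg: ModelConfig) -> str:
    # Intended pattern: the operator's choice is passed through unchanged.
    return load_processor(cfg.model, trust_remote_code=cfg.trust_remote_code)


cfg = ModelConfig(model="attacker/malicious-repo", trust_remote_code=False)
print(build_model_hardcoded(cfg))
print(build_model_respecting_config(cfg))
```

With `trust_remote_code=False` in the config, only the second function honours the setting; the first behaves as if the operator had opted in.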

HarborGuard Analysis

Synopsis

A security-control bypass leading to remote code execution exists in vllm-project/vllm version 0.14.1, where the `trust_remote_code=True` parameter is hardcoded in two model implementation files (`nemotron_vl.py` and `kimi_k25.py`). An attacker who can influence which HuggingFace model repository is loaded needs no credentials: when a user or operator loads a malicious NemotronVL or KimiK25 repository, the hardcoded value overrides the explicit `--trust-remote-code=False` setting. Successful exploitation gives the attacker code execution in the context of the vllm process, with access to the data that process can read, the ability to modify state, and the ability to crash the service. HarborGuard is tracking this advisory and will make a patched-image rebuild available as soon as an upstream fix is published.

HarborGuard Coverage

Detection

Detection is available across every HarborGuard environment: the CVE is ingested from upstream advisory feeds within minutes of publication and matched against customer images in connected registries and CI pipelines, including custom-built images that bundle vllm. Any image containing the affected vllm-project/vllm package at version 0.14.1 or earlier is flagged automatically.
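
The version match described above can be sketched as follows. This is an illustrative check, not HarborGuard's matching logic: `parse_version` and `is_flagged` are hypothetical helpers, the `0.14.1` ceiling is taken from the paragraph above, and pre-release or local suffixes are deliberately ignored.

```python
import re


def parse_version(text: str) -> tuple:
    """Extract the leading numeric release segment, e.g. (0, 14, 1) from '0.14.1.post1'."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def is_flagged(installed: str, ceiling: str = "0.14.1") -> bool:
    """Return True when `installed` is at or below the flagged ceiling."""
    a, b = parse_version(installed), parse_version(ceiling)
    width = max(len(a), len(b))
    # Pad the shorter tuple with zeros so that '0.14' compares as '0.14.0'.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a <= b


for candidate in ("0.13.0", "0.14.1", "0.14.2"):
    print(candidate, is_flagged(candidate))
```

On a host where the package is installed, the standard-library call `importlib.metadata.version("vllm")` could supply the `installed` argument.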

Status: Available

Triage

HarborGuard scores this CVE at CVSS 8.8 (HIGH) and weights it against each environment's compliance policy to determine escalation priority. Findings are routed to the appropriate team inbox within the customer org based on image ownership and policy configuration.

Status: Available

Patch

No upstream fix version has been published for this CVE yet. HarborGuard re-evaluates the advisory on every ingest cycle and will make a patched-image rebuild available the moment the upstream project ships a corrective release. For customers with auto-remediation enabled, the rebuild, regression test run, and PR against affected workloads will be triggered automatically at that point.

Status: Pending upstream

Exploit Conditions

Network reachability: Required
The attacker must be able to reach the vllm service or influence its model-loading requests over the network, for example by serving a malicious HuggingFace model repository from a network-accessible location.
Authentication: Not required
No credentials are required; the vulnerability is triggered purely through the hardcoded parameter during model loading, without any authentication gate.
Victim interaction: Required
A user or operator must initiate loading of a malicious NemotronVL or KimiK25 model repository, making this a social-engineering or supply-chain vector where the victim must take an action.
Attack complexity: Low
Once the victim loads the malicious model, the exploit is reliable and requires no race condition, memory-layout knowledge, or other environmental pre-condition.
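
Because the trigger is a literal `trust_remote_code=True` argument in source code, one way to look for the pattern is a small syntax-tree scan. The sketch below is an assumption-laden illustration, not an official detection tool: `find_hardcoded_trust` is a hypothetical helper, and the sample source is invented rather than copied from vllm.

```python
import ast


def find_hardcoded_trust(source: str, filename: str = "<string>") -> list:
    """Return line numbers of calls that pass the literal trust_remote_code=True."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        for kw in node.keywords:
            if (
                kw.arg == "trust_remote_code"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
            ):
                hits.append(node.lineno)
    return sorted(hits)


# Invented sample: line 2 hardcodes the flag, line 3 forwards the user's setting.
sample = '''
proc = load(path, trust_remote_code=True)
safe = load(path, trust_remote_code=cfg.trust_remote_code)
'''
print(find_hardcoded_trust(sample))  # prints [2]
```

The same function could be pointed at the contents of the two files named in the advisory, read from wherever the package is installed in a given image.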

Blast Radius

The attacker executes arbitrary code in the context of the vllm server process, with whatever OS-level privileges that process holds.
All data accessible to the vllm process (model weights, inference inputs, API tokens, environment variables) is readable by the attacker.
The attacker can write or delete files, modify persisted model state, or alter runtime configuration within the process environment.
The attacker can terminate or crash the vllm service, interrupting inference availability for any workloads depending on it.

How HarborGuard Handles This

Because no upstream fix has been published, automated image rebuilds are not yet available for this CVE. HarborGuard continuously monitors the advisory and will trigger a patched rebuild, and surface a rebuild prompt, as soon as a fix version appears. For customers with auto-remediation enabled, that rebuild will be followed by a regression test run and a PR opened against affected workloads.

In the meantime, compensating controls worth considering include network-policy isolation that restricts outbound model-fetch requests to a vetted model registry allowlist, egress filtering that blocks connections to arbitrary HuggingFace endpoints at the network layer, and feature-flag gating that prevents operators from loading NemotronVL or KimiK25 model paths in production environments until the fix is available.
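
An application-level complement to a vetted-registry allowlist could look like the sketch below. It is an illustration under stated assumptions, not a HarborGuard or vllm feature: `ALLOWED_REPOS`, `is_model_allowed`, and `require_allowed` are hypothetical names, and the repository ID in the allowlist is a placeholder.

```python
# Placeholder allowlist (assumption): replace with the repositories your team has vetted.
ALLOWED_REPOS = frozenset({"example-org/vetted-model"})


def is_model_allowed(repo_id: str, allowed=ALLOWED_REPOS) -> bool:
    """Exact-match check of an 'owner/name' repository ID against the allowlist."""
    candidate = repo_id.strip()
    parts = candidate.split("/")
    # Reject URLs, local paths, and anything not in 'owner/name' form.
    if len(parts) != 2 or not all(parts):
        return False
    return candidate in allowed


def require_allowed(repo_id: str, allowed=ALLOWED_REPOS) -> str:
    """Return the repository ID unchanged, or raise before any model load is attempted."""
    if not is_model_allowed(repo_id, allowed):
        raise PermissionError(f"model repository not on allowlist: {repo_id!r}")
    return repo_id.strip()


print(is_model_allowed("example-org/vetted-model"))
print(is_model_allowed("attacker/malicious-repo"))
```

Calling `require_allowed` in a deployment wrapper before the model name reaches the server keeps the decision in one place; it does not replace network-layer egress controls.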

CVSS v3.0: 8.8
Severity: HIGH
Fixed in: no fixed version published
Affected Products: 1

Affected packages

vllm-project/vllm (vendor: vllm-project), affected versions: ≤ latest

CVSS Vector

CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H
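
As a worked check, the 8.8 base score can be recomputed from this vector using the CVSS v3.0 base-score equations. The sketch below covers only unchanged scope (`S:U`), which is what this vector uses; `base_score` is a hypothetical helper name, and the rounding step includes a small guard against floating-point noise that is not part of the v3.0 text.

```python
import math

# Metric weights from the CVSS v3.0 specification (scope-unchanged values for PR).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def base_score(vector: str) -> float:
    """Compute the CVSS v3.0 base score for a scope-unchanged vector string."""
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch only handles S:U vectors")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    total = min(impact + exploitability, 10)
    # Round up to one decimal place; round() first to absorb float noise.
    return math.ceil(round(total * 10, 6)) / 10


print(base_score("CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H"))  # prints 8.8
```

For this vector the impact term is about 5.87 and the exploitability term about 2.84, which sum to roughly 8.71 and round up to 8.8.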

References

huntr.com