How is CVE-2026-5497 exploited?

Network reachability (required): The vulnerable endpoint is exposed through the OpenAI-compatible HTTP API, so an attacker must be able to reach the service over the network.
Authentication (not required): The chat completions API endpoint accepts requests without any authentication credential, so no account or token is needed.
Victim interaction (not required): The attack is fully server-side; no user action or social engineering is needed beyond submitting the crafted API request.
Attack complexity (low): Exploitation is reliable and has no preconditions: a single well-formed HTTP request carrying a video/jpeg data URL packed with thousands of comma-separated base64 JPEG frames is sufficient to trigger the memory exhaustion.

What is the impact of CVE-2026-5497?

Crashes the vLLM inference server process due to total memory exhaustion, taking down all inference capacity on that host.
Denies service to every client depending on the affected vLLM instance for the duration of the outage or until the process is restarted.
No confidential data is read and no stored data is modified; impact is limited to availability.

HIGH | CVE-2026-5497 | Published 2026-06-11 | Modified 2026-06-11 | CNA @huntr_ai

CVE-2026-5497: Unbounded Frame Count in video/jpeg Base64 Data URL Processing Leads to OOM DoS in vllm-project/vllm

vLLM versions from 0.8.0 up to, but not including, 0.19.0 are vulnerable to an Out-of-Memory (OOM) Denial of Service (DoS) attack due to unbounded frame count processing in the `VideoMediaIO.load_base64()` method. When processing `video/jpeg` data URLs, the method splits the base64 data string on commas to extract individual JPEG frames without enforcing a frame count limit. An attacker can exploit this by crafting a single API request containing thousands of comma-separated base64-encoded JPEG frames in a data URL, causing the server to decode all frames into memory and crash due to excessive memory consumption. This vulnerability is reachable via the OpenAI-compatible chat completions API and does not require authentication.
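The missing bound described above can be illustrated with a small sketch. This is not vLLM's code and not the upstream patch: `split_frames_bounded` and `MAX_FRAMES` are hypothetical names, and the limit of 32 is an arbitrary value chosen for illustration. The idea is simply to count the comma separators before decoding anything, so an oversized payload is rejected without allocating decoded frame buffers.

```python
import base64

MAX_FRAMES = 32  # hypothetical cap, chosen only for illustration


def split_frames_bounded(payload: str, max_frames: int = MAX_FRAMES) -> list[bytes]:
    """Decode comma-separated base64 frames, refusing oversized inputs.

    `payload` is the part of a video/jpeg data URL after "base64,".
    """
    # Count separators first: this is a cheap scan of the string and
    # avoids decoding any frame when the request is over the limit.
    frame_count = payload.count(",") + 1
    if frame_count > max_frames:
        raise ValueError(f"{frame_count} frames exceeds limit of {max_frames}")
    return [base64.b64decode(part) for part in payload.split(",")]
```

A caller would catch the `ValueError` and return a client error instead of letting the decode proceed.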

CVSS v3.0: 7.5
Severity: HIGH
Fixed in: 0.19.0
Affected Products: 1

HarborGuard Analysis

Synopsis

An out-of-memory denial-of-service vulnerability affects vLLM (vllm-project/vllm) versions 0.8.0 through 0.18.x. The flaw is in the VideoMediaIO.load_base64() method, which splits comma-separated base64 JPEG frames from a video/jpeg data URL without enforcing any upper limit on frame count. An attacker can send a single crafted HTTP request to the OpenAI-compatible chat completions API containing thousands of JPEG frames, causing the server to exhaust available memory and crash. A patched-image rebuild at version 0.19.0 is available on HarborGuard for affected environments.
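For inventory scripts, the affected range stated in the synopsis (0.8.0 through 0.18.x, fixed in 0.19.0) can be checked with a few lines of Python. This is a minimal sketch: `parse_release` and `is_affected` are hypothetical helper names, and the parser assumes plain dotted release strings such as `0.18.1`; pre-release or local version suffixes are out of scope.

```python
def parse_release(version: str) -> tuple[int, int, int]:
    """Parse a plain dotted release string into a 3-part tuple.

    Missing components are padded with zeros, so "0.19" == (0, 19, 0).
    """
    parts = [int(p) for p in version.split(".")][:3]
    parts += [0] * (3 - len(parts))
    return (parts[0], parts[1], parts[2])


def is_affected(version: str) -> bool:
    """True if the version falls in the range 0.8.0 <= v < 0.19.0."""
    return (0, 8, 0) <= parse_release(version) < (0, 19, 0)
```

Padding to three components matters because Python compares a shorter tuple as smaller than a longer one with the same prefix, which would otherwise misclassify `0.19` as affected.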

HarborGuard Coverage

Detection

Detection is available across every HarborGuard environment: the CVE is ingested from upstream advisory feeds within minutes of publication and matched against customer images in connected registries and CI pipelines, including custom-built images derived from vLLM base layers.

Status: Available

Triage

HarborGuard scores this CVE at 7.5 HIGH using the recorded CVSS v3.0 vector and weights it against each environment's active compliance policy, routing findings to the appropriate team inbox within the customer org.

Status: Available

Patch

A patched-image rebuild pinned to vLLM 0.19.0 is available on HarborGuard for any environment running an affected version. For customers who opt into auto-remediation, HarborGuard triggers a rebuild, runs a regression test suite against the new image, and opens a pull request against affected workloads.

Status: Available

Exploit Conditions

Network reachability: Required
The vulnerable endpoint is exposed through the OpenAI-compatible HTTP API, so an attacker must be able to reach the service over the network.
Authentication: Not required
The chat completions API endpoint accepts requests without any authentication credential, so no account or token is needed.
Victim interaction: Not required
The attack is fully server-side; no user action or social engineering is needed beyond submitting the crafted API request.
Attack complexity: Low
Exploitation is reliable and has no preconditions: a single well-formed HTTP request carrying a video/jpeg data URL packed with thousands of comma-separated base64 JPEG frames is sufficient to trigger the memory exhaustion.
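Because a single request is enough, one defensive option is to inspect request bodies in a gateway or middleware before they reach the inference server. The sketch below is an assumption-laden illustration, not a HarborGuard or vLLM feature: the function names are hypothetical, the default limit of 32 is arbitrary, and it assumes video inputs arrive as content parts of type `video_url` with the data URL under `video_url.url`.

```python
from typing import Any

DATA_URL_PREFIX = "data:video/jpeg;base64,"


def count_video_frames(url: Any) -> int:
    """Return the number of comma-separated frames in a video/jpeg data URL.

    Returns 0 for anything that is not such a data URL.
    """
    if not isinstance(url, str) or not url.startswith(DATA_URL_PREFIX):
        return 0
    return url[len(DATA_URL_PREFIX):].count(",") + 1


def request_exceeds_frame_limit(body: dict[str, Any], max_frames: int = 32) -> bool:
    """Scan a parsed chat completions body for oversized video/jpeg parts."""
    for message in body.get("messages", []):
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain-string content carries no media parts
        for part in content:
            if not isinstance(part, dict) or part.get("type") != "video_url":
                continue
            url = part.get("video_url", {}).get("url", "")
            if count_video_frames(url) > max_frames:
                return True
    return False
```

A gateway using this check would also need its own request-size limit, since the body has to be parsed before it can be inspected.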

Blast Radius

Crashes the vLLM inference server process due to total memory exhaustion, taking down all inference capacity on that host.
Denies service to every client depending on the affected vLLM instance for the duration of the outage or until the process is restarted.
No confidential data is read and no stored data is modified; impact is limited to availability.

How HarborGuard Handles This

Available on HarborGuard: images running vLLM 0.8.0 through 0.18.x are flagged automatically when the CVE is ingested, typically within minutes of publication. For customers who opt into auto-remediation, HarborGuard rebuilds the image at version 0.19.0, runs a regression test pass, and opens a pull request against affected workloads; median time from CVE publication to a merged patch PR for high-severity issues is around 90 minutes in environments with auto-remediation enabled. Where compliance policy requires manual approval, the rebuilt image and associated scan diff are staged and routed to the designated approver inbox.

As a compensating control prior to patching, customers can apply a network policy that restricts access to the vLLM API to trusted internal CIDRs only, which limits who can send oversized multi-frame payloads.
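The CIDR restriction mentioned as a compensating control is normally enforced in a network policy or firewall, but the underlying check can be sketched with Python's standard `ipaddress` module. The CIDR list and the `is_trusted` helper below are hypothetical placeholders; real deployments would substitute their own internal ranges.

```python
import ipaddress

# Hypothetical trusted ranges, for illustration only.
TRUSTED_CIDRS = ["10.0.0.0/8", "192.168.0.0/16"]


def is_trusted(client_ip: str, cidrs: list[str] = TRUSTED_CIDRS) -> bool:
    """Return True if client_ip falls inside any of the trusted CIDR blocks."""
    addr = ipaddress.ip_address(client_ip)
    # Membership tests across IP versions (IPv6 address, IPv4 network)
    # evaluate to False rather than raising.
    return any(addr in ipaddress.ip_network(cidr) for cidr in cidrs)
```

This narrows the set of possible senders but does not remove the flaw: any client inside a trusted range can still trigger it until the image is patched.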

Fix available

0.19.0

Affected packages

vllm-project / vllm-project/vllm
Versions < 0.19.0 (lower bound unspecified)

CVSS Vector

CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H
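The 7.5 score can be reproduced from this vector using the base-score equations and metric weights from the CVSS v3.0 specification. The sketch below is a compact calculator for base metrics only; the function and constant names are illustrative, and temporal and environmental metrics are not handled.

```python
import math

# Metric weights from the CVSS v3.0 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in CVSS v3.0."""
    return math.ceil(value * 10) / 10


def base_score(vector: str) -> float:
    """Compute the CVSS v3.0 base score from a vector string."""
    # Skip the "CVSS:3.0" prefix and map each metric to its value.
    metrics = dict(item.split(":") for item in vector.split("/")[1:])
    changed = metrics["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[metrics["PR"]]

    iss = 1 - (1 - CIA[metrics["C"]]) * (1 - CIA[metrics["I"]]) * (1 - CIA[metrics["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = 8.22 * AV[metrics["AV"]] * AC[metrics["AC"]] * pr * UI[metrics["UI"]]

    if impact <= 0:
        return 0.0
    if changed:
        return roundup(min(1.08 * (impact + exploitability), 10))
    return roundup(min(impact + exploitability, 10))
```

For this CVE the impact sub-score is about 3.60 and the exploitability sub-score about 3.89; their sum of roughly 7.48 rounds up to 7.5, which falls in the HIGH band.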
