CRITICAL | CVE-2026-45311 | CNA: GitHub_M

CVE-2026-45311: CodeWhale: run_tests Tool Enables RCE via Malicious Repository Without Approval

CodeWhale is a DeepSeek + MiMo coding agent for the terminal. In versions from 0.3.0 up to, but not including, 0.8.23, the run_tests tool executes cargo test in the workspace with ApprovalRequirement::Auto, meaning it runs without any user approval prompt. cargo test compiles and executes arbitrary code: test binaries, build.rs build scripts, and proc macros. Auto-approving test execution is a deliberate design choice, but it creates an inconsistency in the security boundary: in a malicious repository, test code can execute arbitrary shell commands, exfiltrate credentials, or establish persistence with zero approval. The attack is amplified by AGENTS.md (auto-loaded into the system prompt), which can instruct the model to run tests proactively at session start. This vulnerability is fixed in 0.8.23.
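
The inconsistency described above can be illustrated with a minimal sketch. This is not CodeWhale's code (the project is written in Rust); it is a Python model of a per-tool approval table. Only the pairing of run_tests with an auto-approval setting comes from the advisory. The "shell" tool, the table name, and the needs_prompt helper are hypothetical.

```python
from enum import Enum


class ApprovalRequirement(Enum):
    AUTO = "auto"          # tool runs with no user prompt
    REQUIRED = "required"  # user must confirm before the tool runs


# Hypothetical tool table. Only "run_tests -> AUTO" reflects the advisory;
# the "shell" entry is an assumed example of a gated tool.
TOOL_APPROVAL = {
    "run_tests": ApprovalRequirement.AUTO,
    "shell": ApprovalRequirement.REQUIRED,
}


def needs_prompt(tool: str) -> bool:
    """Return True when the user would be asked before the tool runs."""
    # Unknown tools default to the safe side and require approval.
    requirement = TOOL_APPROVAL.get(tool, ApprovalRequirement.REQUIRED)
    return requirement is ApprovalRequirement.REQUIRED


if __name__ == "__main__":
    for tool in ("shell", "run_tests"):
        print(tool, "prompts user:", needs_prompt(tool))
```

In this model both tools end up executing repository-controlled code, yet only one of them is gated by a prompt, which is the gap the advisory describes.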

HarborGuard Analysis

Synopsis

A remote code execution vulnerability in CodeWhale, a terminal-based AI coding agent, affects versions from 0.3.0 up to, but not including, 0.8.23. The run_tests tool executes cargo test with automatic approval (no user prompt), so a malicious repository can embed arbitrary code in test binaries, build scripts, or proc macros that runs as soon as the agent invokes the tool. A victim must open the malicious repository with CodeWhale, but once they do, an attacker achieves full code execution on the host with no further barriers. The advisory description lists 0.8.23 as the fixed version; HarborGuard is tracking the advisory for patch availability.

HarborGuard Coverage

Detection

Detection of CVE-2026-45311 is available across every HarborGuard environment: the CVE is ingested from upstream advisory feeds within minutes of publication and matched against all customer images, including custom-built images that bundle the CodeWhale binary. Any image containing a CodeWhale version in the affected range (0.3.0 up to, but not including, 0.8.23) is flagged immediately.
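
The range match itself is simple to reason about. The following is a minimal sketch of a check against the published range (>= 0.3.0, < 0.8.23), not HarborGuard's matcher. The function names are hypothetical, and it assumes plain numeric versions with no pre-release or build suffixes.

```python
AFFECTED_FROM = (0, 3, 0)   # first affected release, inclusive
FIXED_IN = (0, 8, 23)       # first fixed release, exclusive upper bound


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '0.8.22' or 'v0.8.22' into (0, 8, 22) for numeric comparison."""
    return tuple(int(part) for part in text.strip().lstrip("v").split("."))


def is_affected(version: str) -> bool:
    """True when the version falls inside >= 0.3.0, < 0.8.23."""
    return AFFECTED_FROM <= parse_version(version) < FIXED_IN


if __name__ == "__main__":
    for candidate in ("0.2.9", "0.3.0", "0.8.22", "0.8.23"):
        print(candidate, is_affected(candidate))
```

Comparing integer tuples rather than strings matters here: a string comparison would rank "0.10.0" below "0.8.23" and wrongly flag it as affected.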

Status: Available

Triage

Triage is available with the full CVSS v3.1 score of 9.6 (Critical), and each finding can be weighted against per-environment compliance policies to determine priority and routing. Alerts are directed to the appropriate team inbox within each customer organization based on configured ownership rules.

Status: Available

Patch

HarborGuard re-checks the advisory on every ingest cycle and makes a patched-image rebuild available once the fix version (0.8.23, per the advisory description) has been released upstream and ingested. Until then, customers can apply compensating controls through HarborGuard policy rules, such as flagging or blocking deployment of any image containing affected CodeWhale versions.

Status: Pending upstream

Exploit Conditions

  • Network reachability: Required

    The attacker delivers the malicious repository over the network; the victim must fetch or open a repository from a remote source, so any network-reachable repository host can serve as the delivery channel.

  • Authentication: Not required

    No credentials or account privileges are needed; any unauthenticated party can publish or share a malicious repository.

  • Victim interaction: Required

    The victim must open the malicious repository with CodeWhale, making social engineering (e.g., sharing a repo link) the required delivery mechanism.

  • Attack complexity: Low

    Exploitation is reliable and condition-free once the repository is opened; no race conditions, memory layout dependencies, or special environmental factors apply.

Blast Radius

  • Arbitrary shell commands can execute on the victim's host, inside the terminal session and with the victim's user privileges.
  • Test code or build scripts can read and exfiltrate credentials, SSH keys, API tokens, or environment variables stored on the machine.
  • An attacker can establish persistence by writing files, installing cron jobs, or modifying shell profiles on the host.
  • The AGENTS.md mechanism allows the malicious repo to instruct the AI model to trigger test execution automatically at session start, removing even the implicit friction of a manual command.

How HarborGuard Handles This

Available on HarborGuard: affected images containing CodeWhale versions from 0.3.0 up to, but not including, 0.8.23 are identified automatically as each customer's registry and pipeline images are scanned. HarborGuard monitors the advisory on every ingest cycle and surfaces a patched-image rebuild opportunity once the fix version (0.8.23, per the advisory description) has been ingested. For customers with auto-remediation enabled, the rebuild, regression run, and PR flow then trigger automatically against affected workloads. Until affected images are rebuilt, recommended compensating controls include blocking deployment of images containing affected CodeWhale versions via HarborGuard policy rules, restricting the agent to trusted internal repositories through network egress filtering, and auditing any AGENTS.md files in repositories the agent is permitted to access.
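
The AGENTS.md audit mentioned above can be approximated with a small pre-flight scan. The sketch below is a heuristic, not a HarborGuard feature: it flags AGENTS.md lines that mention running tests, plus build.rs files and proc-macro crates, since those are the code paths the advisory says cargo test will execute. The function name, the regular expression, and the finding messages are all assumptions, and a match means "review this", not "malicious".

```python
import re
from pathlib import Path

# Assumed phrases that suggest the agent is being told to run tests.
TEST_TRIGGER = re.compile(r"run_tests|cargo\s+test|run\s+(the\s+)?tests", re.IGNORECASE)
PROC_MACRO = re.compile(r"^\s*proc-macro\s*=\s*true", re.MULTILINE)


def audit_repository(root: str) -> list[str]:
    """Return human-readable findings for files worth reviewing before running tests."""
    findings = []
    base = Path(root)
    for path in sorted(base.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(base).as_posix()
        if path.name == "AGENTS.md":
            text = path.read_text(encoding="utf-8", errors="replace")
            for number, line in enumerate(text.splitlines(), 1):
                if TEST_TRIGGER.search(line):
                    findings.append(f"{rel}:{number}: instructs the agent to run tests")
        elif path.name == "build.rs":
            findings.append(f"{rel}: build script may run during cargo test")
        elif path.name == "Cargo.toml":
            text = path.read_text(encoding="utf-8", errors="replace")
            if PROC_MACRO.search(text):
                findings.append(f"{rel}: proc-macro crate runs code at compile time")
    return findings


if __name__ == "__main__":
    for finding in audit_repository("."):
        print(finding)
```

A scan like this only narrows down what to read. Ordinary test files can carry the same payload, so it complements the deployment and egress controls above and does not replace them.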


Metrics

CVSS v3.1
9.6
Severity
CRITICAL
Fixed in
0.8.23
Affected Products
1
Affected packages
  • Hmbown / CodeWhale
    >= 0.3.0, < 0.8.23
CVSS Vector
CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:H
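
The 9.6 score can be reproduced from the vector above. The sketch below implements the CVSS v3.1 base score equations with the metric weights from the FIRST specification. It handles base metrics only (no temporal or environmental metrics), and the function and table names are illustrative.

```python
import math

AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
# Privileges Required is weighted differently when scope changes.
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value: float) -> float:
    """CVSS v3.1 Roundup: smallest one-decimal number >= value, float-safe."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS v3.1 base score from a vector string."""
    # Skip the leading "CVSS:3.1" prefix and map metric -> value.
    metrics = dict(item.split(":") for item in vector.split("/")[1:])
    changed = metrics["S"] == "C"

    iss = 1 - (1 - CIA[metrics["C"]]) * (1 - CIA[metrics["I"]]) * (1 - CIA[metrics["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss

    pr = (PR_CHANGED if changed else PR_UNCHANGED)[metrics["PR"]]
    exploitability = 8.22 * AV[metrics["AV"]] * AC[metrics["AC"]] * pr * UI[metrics["UI"]]

    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))


if __name__ == "__main__":
    print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:H"))  # 9.6
```

For this vector the intermediate values are roughly 6.05 for impact and 2.84 for exploitability; the scope-changed multiplier of 1.08 brings the sum to about 9.59, which rounds up to 9.6.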