{
    "document": {
        "category": "csaf_base",
        "csaf_version": "2.0",
        "distribution": {
            "tlp": {
                "label": "WHITE"
            }
        },
        "lang": "en",
        "notes": [
            {
                "category": "legal_disclaimer",
                "text": "The Netherlands Cyber Security Center (henceforth: NCSC-NL) maintains this portal to enhance access to its information and vulnerabilities. The use of this information is subject to the following terms and conditions:\n\nThe vulnerabilities disclosed in this portal are gathered by NCSC-NL from a variety of open sources, which the user can retrieve from other platforms. NCSC-NL makes every reasonable effort to ensure that the content of this portal is kept up to date, and that it is accurate and complete. Nevertheless, NCSC-NL cannot entirely rule out the possibility of errors, and therefore cannot give any warranty in respect of its completeness, accuracy or real-time keeping up-to-date. NCSC-NL does not control nor guarantee the accuracy, relevance, timeliness or completeness of information obtained from these external sources. The vulnerabilities disclosed in this portal are intended solely for the convenience of professional parties to take appropriate measures to manage the risks posed to the cybersecurity. No rights can be derived from the information provided therein.\n\nNCSC-NL and the Kingdom of the Netherlands assume no legal liability or responsibility for any damage resulting from either the use or inability of use of the vulnerabilities disclosed in this portal. This includes damage resulting from the inaccuracy or incompleteness of the information contained in it.\nThe information on this page is subject to Dutch law. All disputes related to or arising from the use of this portal regarding the disclosure of vulnerabilities will be submitted to the competent court in The Hague. This choice of means also applies to the court in summary proceedings."
            }
        ],
        "publisher": {
            "category": "coordinator",
            "contact_details": "cert@ncsc.nl",
            "name": "National Cyber Security Centre",
            "namespace": "https://www.ncsc.nl/"
        },
        "title": "CVE-2026-34756",
        "tracking": {
            "current_release_date": "2026-04-03T16:12:51.748581Z",
            "generator": {
                "date": "2026-02-17T15:00:00Z",
                "engine": {
                    "name": "V.E.L.M.A",
                    "version": "1.7"
                }
            },
            "id": "CVE-2026-34756",
            "initial_release_date": "2026-04-03T16:05:31.296909Z",
            "revision_history": [
                {
                    "date": "2026-04-03T16:05:31.296909Z",
                    "number": "1",
                    "summary": "CVE created.| Source created.| CVE status created. (valid)| Description created for source.| CVSS created.| References created (4).| CWES updated (1)."
                },
                {
                    "date": "2026-04-03T16:05:35.418337Z",
                    "number": "2",
                    "summary": "NCSC Score created."
                }
            ],
            "status": "interim",
            "version": "2"
        }
    },
    "vulnerabilities": [
        {
            "cve": "CVE-2026-34756",
            "cwe": {
                "id": "CWE-770",
                "name": "Allocation of Resources Without Limits or Throttling"
            },
            "notes": [
                {
                    "category": "description",
                    "text": "### Summary\nA Denial of Service vulnerability exists in the vLLM OpenAI-compatible API server. Due to the lack of upper-bound validation on the `n` parameter in the `ChatCompletionRequest` and `CompletionRequest` Pydantic models, an unauthenticated attacker can send a single HTTP request with an astronomically large `n` value. This completely blocks the Python `asyncio` event loop and causes immediate Out-Of-Memory crashes by allocating millions of request object copies in the heap before the request even reaches the scheduling queue.\n\n### Details\nThe root cause of this vulnerability lies in the missing upper bound checks across the request parsing and asynchronous scheduling layers:\n\n1. **Protocol Layer:**\n   In `vllm/entrypoints/openai/chat_completion/protocol.py`, the `n` parameter is defined simply as an integer without any `pydantic.Field` constraints for an upper bound.\n```python\nclass ChatCompletionRequest(OpenAIBaseModel):\n    # Ordered by official OpenAI API documentation\n    # https://platform.openai.com/docs/api/reference/chat/create\n    messages: list[ChatCompletionMessageParam]\n    model: str | None = None\n    frequency_penalty: float | None = 0.0\n    logit_bias: dict[str, float] | None = None\n    logprobs: bool | None = False\n    top_logprobs: int | None = 0\n    max_tokens: int | None = Field(\n        default=None,\n        deprecated=\"max_tokens is deprecated in favor of \"\n        \"the max_completion_tokens field\",\n    )\n    max_completion_tokens: int | None = None\n    n: int | None = 1\n    presence_penalty: float | None = 0.0\n```\n\n2. 
**SamplingParams Layer (Incomplete Validation):**\n   When the API request is converted to internal `SamplingParams` in `vllm/sampling_params.py`, the `_verify_args` method only checks the lower bound (`self.n < 1`), entirely omitting an upper-bound check.\n```python\n    def _verify_args(self) -> None:\n        if not isinstance(self.n, int):\n            raise ValueError(f\"n must be an int, but is of type {type(self.n)}\")\n        if self.n < 1:\n            raise ValueError(f\"n must be at least 1, got {self.n}.\")\n```\n\n3. **Engine Layer (The OOM Trigger):**\n   When the malicious request reaches the core engine (`vllm/v1/engine/async_llm.py`), the engine attempts to fan out the request `n` times to generate identical independent sequences within a synchronous loop.\n```python\n        # Fan out child requests (for n>1).\n        parent_request = ParentRequest(request)\n        for idx in range(parent_params.n):\n            request_id, child_params = parent_request.get_child_info(idx)\n            child_request = request if idx == parent_params.n - 1 else copy(request)\n            child_request.request_id = request_id\n            child_request.sampling_params = child_params\n            await self._add_request(\n                child_request, prompt_text, parent_request, idx, queue\n            )\n        return queue\n```\n   Because Python's `asyncio` runs on a single thread and event loop, this monolithic `for`-loop monopolizes the CPU thread. The server stops responding to all other connections (including liveness probes). 
Simultaneously, the memory allocator is overwhelmed by cloning millions of request object instances via `copy(request)`, driving the host's Resident Set Size (RSS) up by gigabytes per second until the OS `OOM-killer` terminates the vLLM process.\n\n### Impact\n**Vulnerability Type:** Resource Exhaustion / Denial of Service\n\n**Impacted Parties:**\n- Any individual or organization hosting a public-facing vLLM API server (`vllm.entrypoints.openai.api_server`), which happens to be the primary entrypoint for OpenAI-compatible setups.\n- SaaS / AI-as-a-Service platforms acting as reverse proxies sitting in front of vLLM without strict HTTP body payload validation or rate limiting.\n\nBecause this vulnerability exploits the control plane rather than the data plane, an unauthenticated remote attacker can achieve a high success rate in taking down production inference hosts with a single HTTP request. This effectively circumvents any hardware-level capacity planning and conventional bandwidth stress limitations.",
                    "title": "github - https://api.github.com/advisories/GHSA-3mwp-wvh9-7528"
                },
                {
                    "category": "other",
                    "text": "3.6",
                    "title": "NCSC Score"
                }
            ],
            "references": [
                {
                    "category": "external",
                    "summary": "Source - github",
                    "url": "https://api.github.com/advisories/GHSA-3mwp-wvh9-7528"
                },
                {
                    "category": "external",
                    "summary": "Reference - github",
                    "url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-3mwp-wvh9-7528"
                },
                {
                    "category": "external",
                    "summary": "Reference - github",
                    "url": "https://github.com/vllm-project/vllm/pull/37952"
                },
                {
                    "category": "external",
                    "summary": "Reference - github",
                    "url": "https://github.com/vllm-project/vllm/commit/b111f8a61f100fdca08706f41f29ef3548de7380"
                },
                {
                    "category": "external",
                    "summary": "Reference - github",
                    "url": "https://github.com/advisories/GHSA-3mwp-wvh9-7528"
                }
            ],
            "title": "CVE-2026-34756"
        }
    ]
}