diff --git a/threat-model/Chicory_ThreatModel_20251124_182935.md b/threat-model/Chicory_ThreatModel_20251124_182935.md new file mode 100644 index 000000000..ad04e0999 --- /dev/null +++ b/threat-model/Chicory_ThreatModel_20251124_182935.md @@ -0,0 +1,240 @@ +--- +title: "Threat Model: Chicory" +date: 2025-11-24 18:29:35 +geometry: + - margin=1in +--- + + +**Model:** gemini-2.5-pro + +## Executive Summary + +The threat model for the Chicory WebAssembly runtime identifies several critical and high-risk threats primarily related to the processing of untrusted Wasm modules. The most severe vulnerabilities stem from potential sandbox escapes via compiler exploitation, cache poisoning of the build-time compiled code, and path traversal attacks through the WASI implementation. Denial-of-Service risks are also prominent, with both the parser and compiler being susceptible to resource exhaustion from maliciously crafted inputs (compiler bombs). Key mitigation strategies must focus on enforcing strict resource limits, implementing integrity verification for the AOT cache, hardening the WASI filesystem sandboxing to be secure by default, and adding a bytecode verification step after compilation. + +## Assumptions + +- The runtime lacks built-in path traversal protection; users must provide a virtual filesystem for sandboxing. +- WASI modules do not inherit host environment variables by default, requiring explicit configuration. +- The runtime compiler lacks internal resource limits, making it susceptible to denial-of-service attacks. +- Users are responsible for implementing resource limits on the compiler to prevent denial-of-service attacks. +- The Wasm sandbox is the sole defense against malicious code execution from a compromised code cache. +- Memory access bounds checking is consistently handled by the Java virtual machine (JVM) across all execution engines. +- The directory cache for compiled code does not perform integrity checks, allowing for cache poisoning attacks. 
+- Security events are raised as exceptions, requiring users to implement their own logging for compromise recording. +- The system trusts the compiler's output, as there is no separate verification of the generated JVM bytecode. +- The experimental command-line interface (CLI) is not designed for production use and may have insecure defaults. + +## Assets + +| Asset ID | Classification | Description | Owner | +| --------- | -------------- | ----------------------------------------------------------------------------- | ------------------- | +| asset-001 | Internal | Wasm Parser: Responsible for parsing .wasm binary files. | Chicory Maintainers | +| asset-002 | Confidential | Build-time Compiler: Translates Wasm bytecode to JVM bytecode. | Chicory Maintainers | +| asset-003 | Internal | Interpreter Runtime: Executes Wasm instructions on the JVM. | Chicory Maintainers | +| asset-004 | Confidential | WASI Preview 1 Implementation: Provides system call interface. | Chicory Maintainers | +| asset-005 | Restricted | Directory Cache: Filesystem-based cache for compiled modules. | User/Deployer | +| asset-006 | Public | Untrusted Wasm Module: External WebAssembly code provided by the user. | End User | +| asset-007 | Restricted | Host Filesystem: The underlying filesystem of the host machine. | Host Environment | +| asset-008 | Restricted | Compiled JVM Bytecode: Output of the compiler, stored in memory or cache. | Chicory Runtime | +| asset-009 | Internal | Experimental CLI: Command-line tool for running Chicory. | Chicory Maintainers | +| asset-010 | Confidential | Wasm Linear Memory: The memory space allocated for a Wasm instance. | Chicory Runtime | + +## Trust Boundaries + +| Boundary ID | Type | Description | +| ----------------- | ------- | --------------------------------------------------------------------------------- | +| trustboundary-001 | process | Host/Guest Boundary: Separates trusted Chicory runtime from untrusted Wasm guest. 
| +| trustboundary-002 | process | Filesystem Boundary: Between the running JVM and the host filesystem. | +| trustboundary-003 | process | Compilation Boundary: Between the Wasm module and the generated JVM bytecode. | + +## Data Flows + +| Source | Destination | Protocol | Description | Encryption | Trust Boundary | +| --------- | ----------- | -------- | ----------------------------------------------------------------------- | ----------- | ---------------- | +| End User | asset-001 | File I/O | User provides an untrusted Wasm file for execution. | Unencrypted | Crosses Boundary | +| asset-002 | asset-005 | File I/O | Compiler writes generated bytecode to the directory cache. | Unencrypted | Crosses Boundary | +| asset-005 | asset-003 | File I/O | Runtime loads compiled bytecode from the directory cache for execution. | Unencrypted | Crosses Boundary | +| asset-006 | asset-007 | WASI | Wasm guest code makes a WASI call to access the host filesystem. | Unencrypted | Crosses Boundary | + +## Identified Threats + +| # | Threat | Component | STRIDE | Severity | Risk Score | +| -- | ------------------------------------------------------ | ----------------------------- | ---------------------- | -------- | ---------- | +| 1 | Compiler Bomb Denial of Service | Compiler | Denial of Service | high | 64/100 | +| 2 | Compiler Cache Poisoning leading to Sandbox Escape | Directory Cache | Tampering | critical | 80/100 | +| 3 | Path Traversal Arbitrary File Access | WASI Preview 1 Implementation | Elevation of Privilege | high | 72/100 | +| 4 | Denial of Service via Infinite Loop | Interpreter | Denial of Service | high | 63/100 | +| 5 | Sandbox Escape via Compiler Bug | Compiler | Elevation of Privilege | high | 60/100 | +| 6 | Denial of Service via Parser Exhaustion | Wasm Parser | Denial of Service | medium | 36/100 | +| 7 | Vulnerable Host Function Implementation | Interpreter | Elevation of Privilege | high | 70/100 | +| 8 | Insecure Defaults in CLI leading to Host 
Compromise | Experimental CLI | Elevation of Privilege | high | 72/100 | +| 9 | Lack of Immutable Compromise Recording | Runtime | Repudiation | medium | 45/100 | +| 10 | Information Disclosure from Insecure Cache Permissions | Directory Cache | Information Disclosure | medium | 30/100 | + +### Threat Details + +#### 1. Compiler Bomb Denial of Service + +The documentation explicitly states the AOT compiler lacks internal resource limits, making it vulnerable to resource exhaustion from maliciously crafted Wasm modules. + +**Component:** AOT Compiler | **STRIDE Category:** Denial of Service | **Risk Severity:** High + +**Risk Score:** 8/10 Likelihood * 8/10 Impact = 64/100 Overall + +**Likelihood Rationale:** Crafting a Wasm module with deep nesting or complex structures to trigger exponential behavior in a compiler is a known technique. Likelihood is high. + +**Impact Rationale:** A successful attack would cause the host application thread to hang or crash due to resource exhaustion, resulting in a denial of service for the application. + +**Current Mitigation:** There are no current mitigations. The user is responsible for implementing resource limits for Wasm module parsing and compilation. + +**Recommended Mitigation:** Implement a configurable 'compilation budget' (e.g., time limit, instruction count limit, recursion depth limit) within the compiler. Terminate compilation if the budget is exceeded. For robust protection, execute the compilation in a separate thread or process with OS-level resource constraints (cgroups, job objects). + +#### 2. Cache Poisoning leading to Sandbox Escape + +The documentation states the directory cache does not perform integrity checks. An attacker with filesystem access can modify the cached JVM bytecode, leading to arbitrary code execution.
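As a rough illustration of the kind of integrity check the directory cache currently lacks, the following Java sketch stores a SHA-256 digest next to each cached artifact and verifies it before the bytes are handed on. The class `CacheIntegrity`, its methods, and the `.sha256` sidecar file are hypothetical names for illustration and are not part of Chicory's API. A sidecar file in the same directory only detects accidental corruption; against an attacker who can write to the cache, the digest would have to live somewhere the attacker cannot modify, or be replaced by a keyed MAC or signature.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Chicory: digest-checked cache reads.
public final class CacheIntegrity {
    private CacheIntegrity() {}

    /** Returns the lowercase hex SHA-256 digest of the given bytes. */
    public static String sha256Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }

    /** Writes the artifact and a sidecar file holding its digest. */
    public static void store(Path artifact, byte[] bytecode) throws IOException {
        Files.write(artifact, bytecode);
        Files.write(sidecar(artifact), sha256Hex(bytecode).getBytes(StandardCharsets.US_ASCII));
    }

    /** Reads the artifact back and throws if its digest does not match the stored one. */
    public static byte[] loadVerified(Path artifact) throws IOException {
        byte[] bytecode = Files.readAllBytes(artifact);
        byte[] expected = Files.readAllBytes(sidecar(artifact));
        byte[] actual = sha256Hex(bytecode).getBytes(StandardCharsets.US_ASCII);
        // MessageDigest.isEqual compares the two digests in constant time.
        if (!MessageDigest.isEqual(expected, actual)) {
            throw new SecurityException("cache entry failed integrity check: " + artifact);
        }
        return bytecode;
    }

    // Assumption for this sketch: the digest lives in "<name>.sha256" beside the artifact.
    private static Path sidecar(Path artifact) {
        return artifact.resolveSibling(artifact.getFileName() + ".sha256");
    }
}
```

A caller would use `store` when the compiler writes an entry and `loadVerified` before class loading, treating a `SecurityException` as a cache miss that forces recompilation.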
+ +**Component:** Directory Cache | **STRIDE Category:** Tampering | **Risk Severity:** Critical + +**Risk Score:** 8/10 Likelihood * 10/10 Impact = 80/100 Overall + +**Likelihood Rationale:** Requires local filesystem access, but the vector is direct and the assumption confirms the lack of protection. High likelihood if local access is gained. + +**Impact Rationale:** Allows arbitrary code execution with the full privileges of the JVM, completely bypassing the Wasm sandbox. This can lead to total host compromise. + +**Current Mitigation:** None. The cache is assumed to be insecure and is in the `experimental` phase. + +**Recommended Mitigation:** Implement integrity verification for cached artifacts. At write time, compute a cryptographic hash (e.g., SHA-256) of the bytecode and store it securely. At read time, re-compute the hash and verify it against the stored value before class loading. Alternatively, sign the bytecode with a private key and verify with a public key. Apply strict file permissions (read/write only by the owner) to the cache directory. + +#### 3. Path Traversal Arbitrary File Access + +The documentation states the runtime lacks built-in path traversal protection, placing the burden of sandboxing on the user. A misconfiguration can expose the host filesystem. + +**Component:** WASI Preview 1 Implementation | **STRIDE Category:** Elevation of Privilege | **Risk Severity:** High + +**Risk Score:** 8/10 Likelihood * 9/10 Impact = 72/100 Overall + +**Likelihood Rationale:** Path traversal is a classic and well-understood vulnerability. Misconfiguration by users is a common source of security issues. + +**Impact Rationale:** An attacker could read sensitive files (e.g., /etc/passwd, ~/.ssh/id_rsa) or write malicious files (e.g., backdoors, cron jobs) on the host system, leading to information disclosure or full compromise.
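The traversal described here comes down to a guest-supplied path resolving to a location outside its pre-opened root. A minimal Java sketch of a containment check follows; `PathContainment.resolveInside` is a hypothetical helper for illustration, not Chicory's WASI implementation. It normalizes paths lexically only, so on a real filesystem symbolic links would need separate handling (for example via `Path.toRealPath`).

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Chicory: keeps resolved paths inside a root.
public final class PathContainment {
    private PathContainment() {}

    /**
     * Resolves a guest-supplied path against a pre-opened root and rejects
     * any result that falls outside that root.
     */
    public static Path resolveInside(Path root, String guestPath) {
        Path base = root.toAbsolutePath().normalize();
        // normalize() collapses "." and ".." lexically; it does not follow symlinks.
        Path candidate = base.resolve(guestPath).normalize();
        // Path.startsWith compares whole name elements, so "/sandbox2" is not inside "/sandbox".
        if (!candidate.startsWith(base)) {
            throw new SecurityException("path escapes pre-opened directory: " + guestPath);
        }
        return candidate;
    }

    public static void main(String[] args) {
        Path root = Paths.get("/sandbox");
        System.out.println(resolveInside(root, "data/file.txt"));
        try {
            resolveInside(root, "../etc/passwd");
        } catch (SecurityException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Absolute guest paths are covered by the same check, because `resolve` returns an absolute argument unchanged and the `startsWith` test then fails unless the path is already under the root.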
+ +**Current Mitigation:** The user is responsible for providing a sandboxed virtual filesystem such as [ZeroFs](https://github.com/roastedroot/zerofs) or [JimFs](https://github.com/google/jimfs). The default configuration is not guaranteed to be secure. + +**Recommended Mitigation:** Harden the WASI filesystem implementation to be secure by default. It must canonicalize all paths and strictly enforce that file access remains within pre-opened directory boundaries. Disallow '..' path elements after canonicalization. The default behavior should use an in-memory virtual filesystem (like jimfs, used in tests) instead of exposing the host filesystem. + +#### 4. Denial of Service via Infinite Loop + +A Wasm module can contain an unconditional branch back to the beginning of a loop, which will execute indefinitely and consume 100% of a CPU core. + +**Component:** Interpreter Runtime | **STRIDE Category:** Denial of Service | **Risk Severity:** High + +**Risk Score:** 9/10 Likelihood * 7/10 Impact = 63/100 Overall + +**Likelihood Rationale:** Crafting a Wasm module with an infinite loop is trivial. + +**Impact Rationale:** The thread executing the Wasm module will become unresponsive, leading to a denial of service for that part of the application. In a thread-per-request model, this can exhaust the application's thread pool. + +**Current Mitigation:** Execution can be interrupted via standard Java thread interruption, but this needs to be implemented by users. + +**Recommended Mitigation:** Implement a 'fuel' or 'tick' system. Each instruction (or block of instructions) consumes a certain amount of fuel. If the fuel tank is empty, the execution traps. The initial amount of fuel should be configurable by the host. Note that the proposed mitigation is unacceptably costly from a performance standpoint. + +#### 5.
Sandbox Escape via Compiler Bug + +The documentation states the system trusts the compiler's output and does not perform separate verification of the generated JVM bytecode. A vulnerability in the compiler could be exploited to generate malicious bytecode. + +**Component:** Build-time Compiler | **STRIDE Category:** Elevation of Privilege | **Risk Severity:** High + +**Risk Score:** 6/10 Likelihood * 10/10 Impact = 60/100 Overall + +**Likelihood Rationale:** This is a complex attack requiring deep knowledge of the compiler's internals, so likelihood is moderate. However, such vulnerabilities are a known class of bug in language runtimes. + +**Impact Rationale:** A successful exploit would allow the Wasm module to break out of its sandbox and execute arbitrary code with the permissions of the JVM process, leading to host compromise. + +**Current Mitigation:** None. The compiler's output is trusted implicitly. + +**Recommended Mitigation:** Implement a post-compilation bytecode verification pass. This verifier should ensure the generated code adheres to strict sandboxing rules (e.g., does not call unauthorized host APIs, does not use `sun.misc.Unsafe`, only performs sandboxed memory access). Follow the principle of least privilege in the compiler's code generation logic. This is currently covered on a "sample basis" by the approval tests. + +#### 6. Denial of Service via Parser Exhaustion + +The parser is the first component to process untrusted input. A maliciously crafted Wasm file could exploit edge cases in the parsing logic, causing excessive memory allocation or CPU usage. + +**Component:** Wasm Parser | **STRIDE Category:** Denial of Service | **Risk Severity:** Medium + +**Risk Score:** 6/10 Likelihood * 6/10 Impact = 36/100 Overall + +**Likelihood Rationale:** Fuzzing often reveals DoS vulnerabilities in parsers. Likelihood is moderate.
+ +**Impact Rationale:** An application attempting to load a malicious Wasm file could crash with an OutOfMemoryError or become unresponsive. + +**Current Mitigation:** Standard try-catch blocks may exist, but specific resource limits on parsing are not mentioned. + +**Recommended Mitigation:** Conduct extensive fuzz testing (e.g., with wasm-smith) on the parser. Implement pre-parsing checks, such as a maximum file size limit. Within the parser, add safeguards against allocating excessively large data structures based on sizes read from the file. + +#### 7. Vulnerable Host Function Implementation + +Host functions are a bridge between untrusted Wasm and the trusted host. An insecurely written host function can be a vector for sandbox escape. + +**Component:** Interpreter Runtime | **STRIDE Category:** Elevation of Privilege | **Risk Severity:** High + +**Risk Score:** 7/10 Likelihood * 10/10 Impact = 70/100 Overall + +**Likelihood Rationale:** Developers frequently make mistakes when dealing with trust boundaries. It is highly likely that a user will write a vulnerable host function. + +**Impact Rationale:** A vulnerable host function could lead to arbitrary code execution, file system access, or network access, depending on its functionality. + +**Current Mitigation:** The security of host functions is the responsibility of the user implementing them. + +**Recommended Mitigation:** Provide extensive documentation with security best practices for writing host functions. Emphasize the need for strict input validation on all data received from Wasm memory. The annotation processor for host functions could be enhanced to inject validation logic or taint-tracking checks. + +#### 8. Insecure Defaults in CLI leading to Host Compromise + +The CLI is marked as `experimental` and may have insecure defaults. The code reveals it can `inheritSystem()` for WASI, potentially exposing the host filesystem without adequate sandboxing.
+ +**Component:** Experimental CLI | **STRIDE Category:** Elevation of Privilege | **Risk Severity:** High + +**Risk Score:** 8/10 Likelihood * 9/10 Impact = 72/100 Overall + +**Likelihood Rationale:** A user is very likely to use a simple flag like `--wasi` without understanding the full security implications, making misuse easy. + +**Impact Rationale:** Running a malicious Wasm file via the CLI with insecure defaults could grant it arbitrary read/write access to the user's files, leading to data theft or system compromise. + +**Current Mitigation:** The tool is documented as `experimental` and not for production use. + +**Recommended Mitigation:** Change the CLI's default WASI configuration to be fully sandboxed (no filesystem access). Require explicit, verbose flags like `--wasi-dir :` to grant access, and print a prominent security warning when such flags are used. This follows the principle of Secure by Default. + +#### 9. Lack of Immutable Compromise Recording + +The documentation states that security events are raised as exceptions and users are responsible for their own logging. This means there is no built-in, tamper-evident audit trail. + +**Component:** Runtime | **STRIDE Category:** Repudiation | **Risk Severity:** Medium + +**Risk Score:** 9/10 Likelihood * 5/10 Impact = 45/100 Overall + +**Likelihood Rationale:** This is explicitly stated as the current design. Without user action, no reliable security logging will occur. + +**Impact Rationale:** In the event of a compromise or attempted compromise, the lack of a reliable audit trail makes it difficult or impossible to perform forensic analysis, determine the scope of the breach, or attribute the malicious action. + +**Current Mitigation:** None. Logging is user-dependent. + +**Recommended Mitigation:** Implement a dedicated, structured security logging interface (e.g., `SecurityLogger`).
Critical events like traps, sandbox violations, and WASI calls to sensitive paths should be logged through this interface in a standardized format (like JSON) to a secure, append-only destination. This facilitates integration with SIEMs and provides a reliable audit trail for forensics. + +#### 10. Information Disclosure from Insecure Cache Permissions + +The AOT cache is stored on the local filesystem. If file permissions are not sufficiently restrictive, other unprivileged users on the same machine could read the cached bytecode. + +**Component:** Directory Cache | **STRIDE Category:** Information Disclosure | **Risk Severity:** Medium + +**Risk Score:** 6/10 Likelihood * 5/10 Impact = 30/100 Overall + +**Likelihood Rationale:** Multi-tenant systems or developer machines often have multiple users. Insecure default file permissions are a common issue. + +**Impact Rationale:** An attacker could read the compiled JVM bytecode from the cache, allowing them to reverse-engineer proprietary business logic contained within the original Wasm module. + +**Current Mitigation:** Filesystem permissions are managed by the host OS and the user deploying the application. + +**Recommended Mitigation:** When creating the cache directory, explicitly set restrictive file permissions (e.g., `rwx------` or `700` on UNIX-like systems) to ensure only the owner can read, write, and execute files within it. Document this as a security best practice for deployment. + +## Quality Assessment + +- Input Data Quality: 8/10 - High-quality input from full codebase and explicit security assumptions. +- Model Confidence: 9/10 - High confidence due to clear assumptions and direct codebase analysis.
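The owner-only permissions recommended for the cache directory in threat 10 could be applied at creation time roughly as in the Java sketch below. `CacheDirectory.createOwnerOnly` is a hypothetical helper for illustration, not an existing Chicory API, and it only handles filesystems that expose POSIX permissions; other platforms would need ACL-based handling that is not shown.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hypothetical helper, not part of Chicory: creates a cache directory with mode 700.
public final class CacheDirectory {
    private CacheDirectory() {}

    /** Creates the directory if needed and restricts it to its owner (rwx------). */
    public static Path createOwnerOnly(Path dir) throws IOException {
        boolean posix = dir.getFileSystem().supportedFileAttributeViews().contains("posix");
        if (!posix) {
            // Assumption: non-POSIX filesystems are out of scope for this sketch.
            return Files.createDirectories(dir);
        }
        Set<PosixFilePermission> ownerOnly = PosixFilePermissions.fromString("rwx------");
        Files.createDirectories(dir, PosixFilePermissions.asFileAttribute(ownerOnly));
        // Re-apply explicitly: createDirectories leaves an existing directory's mode unchanged.
        Files.setPosixFilePermissions(dir, ownerOnly);
        return dir;
    }
}
```

Setting the mode when the directory is created, rather than afterwards, avoids a window in which the directory exists with default permissions.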
diff --git a/threat-model/Readme.md b/threat-model/Readme.md new file mode 100644 index 000000000..8ea57b358 --- /dev/null +++ b/threat-model/Readme.md @@ -0,0 +1,28 @@ +# Automated Threat Modelling Results + +In this folder you can find the results produced by an automated threat-modelling tool (`rapidinsights`). +These outputs have been reviewed and processed by humans, and should be used to derive clear, actionable security improvements. + +Please treat these results as guidance for identifying risks, prioritizing mitigations, and strengthening the overall security of the system when integrating Chicory in a project. + +## Generation + +### Install repomix + +```bash +npm install -g repomix +``` + +Alternatively: `brew install repomix` + +### Generate project documentation + +```bash +repomix +``` + +### Interactively analyze with RapidInsights + +```bash +rapidinsights tui --model gemini-2.5-pro repomix-output.xml +```