
fix(sdk): add bounds check in extract_description to prevent IndexError #13429

Open
lh1564803535-code wants to merge 3 commits into kubeflow:master from lh1564803535-code:fix/extract-description-bounds-check

Conversation

@lh1564803535-code

Summary

Adds bounds checking to extract_description() in ComponentSpec.from_yaml_documents() to prevent an IndexError (CWE-125: Out-of-bounds Read) when parsing malformed or crafted component YAML.

Problem

The extract_description() function in sdk/python/kfp/dsl/structures.py has two unguarded index accesses:

  1. component_yaml.splitlines()[index_of_heading] - crashes if the YAML has fewer than 3 lines
  2. while comments[index][:len(...)] - crashes if the multi-line description loop runs past the end of the file

Any system using the KFP SDK to parse untrusted YAML (CI/CD pipelines, shared component repos, multi-tenant pipeline services) can be crashed with a crafted input.
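The two unguarded accesses listed above can be reproduced outside KFP with a small stand-in. This is a minimal sketch, not the actual source of structures.py: the function name extract_description_unguarded, the heading string, and the continuation prefix are assumptions chosen to mirror the description in this PR.

```python
def extract_description_unguarded(component_yaml: str):
    """Hypothetical stand-in mirroring the two unguarded index accesses."""
    heading = '# Description: '             # assumed heading marker
    continuation_prefix = '#' + ' ' * 13    # assumed continuation marker
    index_of_heading = 2                    # heading expected on the third line

    if heading not in component_yaml:
        return None

    comments = component_yaml.splitlines()

    # Unguarded access 1: raises IndexError when there are fewer than 3 lines.
    description = comments[index_of_heading][len(heading):]

    index = index_of_heading + 1
    # Unguarded access 2: raises IndexError once index reaches len(comments),
    # which happens whenever the description block runs to the end of input.
    while comments[index][:len(continuation_prefix)] == continuation_prefix:
        description += '\n' + comments[index][len(continuation_prefix) + 1:]
        index += 1
    return description
```

In this sketch the second access also fails when the heading sits on the last line with no continuation at all, because the loop condition is evaluated once before any bounds are known.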

Reproducer (from issue #13420)

from kfp.components import structures

malicious_yaml = "line0\nline1\n Description: some text\n              continued text"
spec = structures.ComponentSpec.from_yaml_documents(malicious_yaml)
# IndexError: list index out of range

Fix

  • Cache splitlines() result once as lines to avoid redundant splitting
  • Guard index_of_heading access with len(lines) > index_of_heading
  • Add index < len(lines) to the while loop condition
  • Replace comments variable with lines for consistency
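The four changes above could look roughly like the following. This is a sketch under stated assumptions rather than the diff in this PR: the function name extract_description_guarded and the marker strings are hypothetical, and returning None for input that is too short is an assumed choice, since the PR description does not say what the patched function returns in that case.

```python
from typing import Optional


def extract_description_guarded(component_yaml: str) -> Optional[str]:
    """Hypothetical stand-in showing both bounds checks."""
    heading = '# Description: '             # assumed heading marker
    continuation_prefix = '#' + ' ' * 13    # assumed continuation marker
    index_of_heading = 2

    if heading not in component_yaml:
        return None

    # Split once and reuse the result instead of calling splitlines() repeatedly.
    lines = component_yaml.splitlines()

    # Guard 1: the heading line must exist before it is indexed.
    if len(lines) <= index_of_heading:
        return None  # assumption: treat too-short input as "no description"

    description = lines[index_of_heading][len(heading):]

    index = index_of_heading + 1
    # Guard 2: stop at the end of the input instead of indexing past it.
    while (index < len(lines) and
           lines[index][:len(continuation_prefix)] == continuation_prefix):
        description += '\n' + lines[index][len(continuation_prefix) + 1:]
        index += 1
    return description
```

Placing index < len(lines) first in the condition matters: Python's `and` short-circuits, so lines[index] is never evaluated once the index is out of range.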

Tests

Added TestExtractDescriptionBoundsCheck with 5 test cases covering:

All 56 tests in structures_test.py pass (51 existing + 5 new).

Fixes #13420

Add bounds checking for index_of_heading and the while loop index
in extract_description() within ComponentSpec.from_yaml_documents().

Previously, crafted YAML input could cause an IndexError (CWE-125)
when the '# Description:' heading was present but the YAML had fewer
lines than index_of_heading, or when a multi-line description
continued past the end of the file.

Fixes kubeflow#13420

Signed-off-by: lh1564803535-code <lh1564803535@gmail.com>
Copilot AI review requested due to automatic review settings May 23, 2026 09:14
@google-oss-prow google-oss-prow Bot requested a review from droctothorpe May 23, 2026 09:14
@google-oss-prow

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign mprahl for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@google-oss-prow

Hi @lh1564803535-code. Thanks for your PR.

I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.


Copilot AI left a comment


Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR hardens YAML description extraction against malformed inputs that previously could raise IndexError, and adds regression tests to ensure these cases don’t crash.

Changes:

  • Add bounds checks in extract_description() to prevent out-of-range indexing.
  • Add unit tests covering multiple malformed YAML shapes reported to trigger IndexError.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File | Description
sdk/python/kfp/dsl/structures.py | Adds defensive bounds checks while parsing multi-line description comments.
sdk/python/kfp/dsl/structures_test.py | Adds regression tests ensuring malformed YAML does not raise IndexError.

Comment thread sdk/python/kfp/dsl/structures_test.py Outdated

def test_short_yaml_with_description_heading_no_crash(self):
    """YAML with '# Description:' but fewer than 3 lines should not crash."""
    malicious_yaml = "line0\nline1\n Description: some text"
Comment thread sdk/python/kfp/dsl/structures_test.py Outdated
Comment on lines +1166 to +1170
try:
    structures.ComponentSpec.from_yaml_documents(malicious_yaml)
except (ValueError, KeyError, Exception) as e:
    self.assertNotIsInstance(e, IndexError,
        "IndexError should not be raised on short YAML input")
Comment on lines +868 to 872
while (index < len(lines)
       and lines[index][:len(multi_line_description_prefix)
                        ] == multi_line_description_prefix):
    description += '\n' + lines[index][
        len(multi_line_description_prefix) + 1:]
- Use startswith() instead of slice equality for readability
- Fix test assertions: use mock to test extract_description directly
- Add assert_no_index_error helper to reduce test duplication
- Ensure tests actually exercise the bounds-checking code path

Addresses copilot review comments on kubeflow#13429

Signed-off-by: lh1564803535-code <lh1564803535@gmail.com>
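The follow-up changes named in this commit message (startswith() in place of slice equality, and an assert_no_index_error helper) might be sketched as below. Everything here is illustrative: the stand-in function, the test class name TestExtractDescriptionSketch, the marker strings, and the three inputs are assumptions, not the tests added in this PR. No mocking is needed in the sketch because the stand-in is a top-level function that can be called directly.

```python
import unittest
from typing import Optional


def extract_description(component_yaml: str) -> Optional[str]:
    """Hypothetical stand-in; uses startswith() for the prefix check."""
    heading = '# Description: '             # assumed heading marker
    continuation_prefix = '#' + ' ' * 13    # assumed continuation marker
    index_of_heading = 2

    lines = component_yaml.splitlines()
    if heading not in component_yaml or len(lines) <= index_of_heading:
        return None

    description = lines[index_of_heading][len(heading):]
    index = index_of_heading + 1
    while index < len(lines) and lines[index].startswith(continuation_prefix):
        description += '\n' + lines[index][len(continuation_prefix) + 1:]
        index += 1
    return description


class TestExtractDescriptionSketch(unittest.TestCase):

    def assert_no_index_error(self, text: str) -> Optional[str]:
        """Call the function directly and fail only on IndexError."""
        try:
            return extract_description(text)
        except IndexError as e:
            self.fail(f'IndexError raised: {e}')

    def test_fewer_than_three_lines(self):
        self.assertIsNone(self.assert_no_index_error('# Description: only'))

    def test_heading_on_last_line(self):
        result = self.assert_no_index_error('a\nb\n# Description: text')
        self.assertEqual(result, 'text')

    def test_continuation_runs_to_end_of_input(self):
        text = 'a\nb\n# Description: first\n#' + ' ' * 14 + 'second'
        self.assertEqual(self.assert_no_index_error(text), 'first\nsecond')
```

Unlike catching a broad Exception and asserting on its type, the helper calls the function under test directly, so each input reaches the heading check and the loop condition. Saved to a file, the sketch can be run with python -m unittest.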
Comment thread sdk/python/kfp/dsl/structures.py
Development

Successfully merging this pull request may close these issues.

[bug] Out-of-bounds IndexError in extract_description when parsing crafted component YAML

3 participants