fix(sdk): add bounds check in extract_description to prevent IndexError#13429
fix(sdk): add bounds check in extract_description to prevent IndexError#13429lh1564803535-code wants to merge 3 commits into
Conversation
Add bounds checking for index_of_heading and the while loop index in extract_description() within ComponentSpec.from_yaml_documents(). Previously, crafted YAML input could cause an IndexError (CWE-125) when the '# Description:' heading was present but the YAML had fewer lines than index_of_heading, or when a multi-line description continued past the end of the file. Fixes kubeflow#13420 Signed-off-by: lh1564803535-code <lh1564803535@gmail.com>
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Hi @lh1564803535-code. Thanks for your PR. I'm waiting for a kubeflow member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR hardens YAML description extraction against malformed inputs that previously could raise IndexError, and adds regression tests to ensure these cases don’t crash.
Changes:
- Add bounds checks in
extract_description()to prevent out-of-range indexing. - Add unit tests covering multiple malformed YAML shapes reported to trigger
IndexError.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| sdk/python/kfp/dsl/structures.py | Adds defensive bounds checks while parsing multi-line description comments. |
| sdk/python/kfp/dsl/structures_test.py | Adds regression tests ensuring malformed YAML does not raise IndexError. |
|
|
||
| def test_short_yaml_with_description_heading_no_crash(self): | ||
| """YAML with '# Description:' but fewer than 3 lines should not crash.""" | ||
| malicious_yaml = "line0\nline1\n Description: some text" |
| try: | ||
| structures.ComponentSpec.n(malicious_yaml) | ||
| except (ValueError, KeyError, Exception) as e: | ||
| self.assertNotIsInstance(e, IndexError, | ||
| "IndexError should not be raised on short YAML input") |
| while (index < len(lines) | ||
| and lines[index][:len(multi_line_description_prefix) | ||
| ] == multi_line_description_prefix): | ||
| description += '\n' + lines[index][ | ||
| len(multi_line_description_prefix) + 1:] |
- Use startswith() instead of slice equality for readability - Fix test assertions: use mock to test extract_description directly - Add assert_no_index_error helper to reduce test duplication - Ensure tests actually exercise the bounds-checking code path Addresses copilot review comments on kubeflow#13429 Signed-off-by: lh1564803535-code <lh1564803535@gmail.com>
Summary
Adds bounds checking to
extract_description()inComponentSpec.from_yaml_documents()to prevent anIndexError(CWE-125: Out-of-bounds access) when parsing malformed or crafted component YAML.Problem
The
extract_description()function insdk/python/kfp/dsl/structures.pyhas two unguarded index accesses:component_yaml.splitlines()[index_of_heading]- crashes if the YAML has fewer than 3 lineswhile comments[index][:len(...)]- crashes if the multi-line description loop runs past the end of the fileAny system using the KFP SDK to parse untrusted YAML (CI/CD pipelines, shared component repos, multi-tenant pipeline services) can be crashed with a crafted input.
Reproducer (from issue #13420)
Fix
splitlines()result once aslinesto avoid redundant splittingindex_of_headingaccess withlen(lines) > index_of_headingindex < len(lines)to the while loop conditioncommentsvariable withlinesfor consistencyTests
Added
TestExtractDescriptionBoundsCheckwith 5 test cases covering:All 56 tests in
structures_test.pypass (51 existing + 5 new).Fixes #13420