-
Notifications
You must be signed in to change notification settings - Fork 23
fix: preserve URL query parameters in storage flash for signed URLs #435
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from 4 commits
a63c8cc
95b57fc
28e2314
15d3321
8ff6181
9ce59ff
331787b
e5b532c
98cc722
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -10,15 +10,15 @@ | |
| from concurrent.futures import CancelledError | ||
| from contextlib import contextmanager | ||
| from dataclasses import dataclass | ||
| from pathlib import Path, PosixPath | ||
| from pathlib import Path | ||
| from queue import Queue | ||
| from urllib.parse import urlparse | ||
|
|
||
| import click | ||
| import pexpect | ||
| import requests | ||
| from jumpstarter_driver_composite.client import CompositeClient | ||
| from jumpstarter_driver_opendal.client import FlasherClient, OpendalClient, operator_for_path | ||
| from jumpstarter_driver_opendal.client import FlasherClient, OpendalClient, clean_filename, operator_for_path | ||
| from jumpstarter_driver_opendal.common import PathBuf | ||
| from jumpstarter_driver_pyserial.client import Console | ||
| from opendal import Metadata, Operator | ||
|
|
@@ -167,10 +167,14 @@ def flash( # noqa: C901 | |
| "http", root="/", endpoint=f"{parsed.scheme}://{parsed.netloc}", token=bearer_token | ||
| ) | ||
| operator_scheme = "http" | ||
| path = Path(parsed.path) | ||
| # Preserve query parameters so that signed URLs | ||
| # (e.g. CloudFront with ?Expires=...&Signature=...) work correctly. | ||
| path = parsed.path | ||
| if parsed.query: | ||
| path = f"{path}?{parsed.query}" | ||
| else: | ||
| path, operator, operator_scheme = operator_for_path(path) | ||
| image_url = self.http.get_url() + "/" + path.name | ||
| image_url = self.http.get_url() + "/" + self._filename(path) | ||
|
|
||
| # start counting time for the flash operation | ||
| start_time = time.time() | ||
|
|
@@ -968,7 +972,7 @@ def _transfer_bg_thread( | |
| """ | ||
| self.logger.info(f"Writing image to storage in the background: {src_path}") | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [HIGH] Signed URL credentials leaked in log output. After this PR, self.logger.info(f"Writing image to storage in the background: {src_path}")The Suggested fix: log only the clean filename instead: self.logger.info(f"Writing image to storage in the background: {self._filename(src_path)}")AI-generated, human reviewed
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed in 8ff6181. The
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This was already addressed in the current revision. The log line now uses self._filename(src_path) which strips query parameters via clean_filename().
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed in commit 8ff6181. The log line now uses
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good security catch. Fixed in commit 8ff6181 -- the log line now uses self._filename(src_path) instead of the raw src_path, so only the clean filename is logged and no query parameters (containing Signature/Key-Pair-Id) are exposed.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good catch on the security issue. Fixed -- the log line now uses
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good catch on the credential leak. Fixed -- filename = self._filename(src_path)
self.logger.info(f"Writing image to storage in the background: {filename}")This strips all query parameters (including
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed in commit
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed. The log line now uses |
||
| try: | ||
| filename = Path(src_path).name if isinstance(src_path, (str, os.PathLike)) else src_path.name | ||
| filename = self._filename(src_path) | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [HIGH] Signed URL credentials persisted in metadata JSON file (see The full metadata_dict = {"path": str(src_path)}This is serialized to JSON and written to exporter storage as Suggested fix: strip query parameters before storing: metadata_dict = {"path": clean_filename(src_path) if isinstance(src_path, str) and "?" in src_path else str(src_path)}AI-generated, human reviewed
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed in 8ff6181. The metadata dictionary now uses
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This was already addressed in the current revision. The metadata dict now uses clean_filename(src_path) to strip query parameters before persisting.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed in commit 8ff6181. The metadata dict now stores
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Agreed -- storing signed URL credentials in the metadata file was a security issue. Fixed in commit 8ff6181 by using clean_filename(src_path) for the metadata "path" field, which strips all query parameters before persisting.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good catch. Fixed --
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good catch. Fixed --
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed in commit
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed. The metadata dict now stores |
||
|
|
||
| if src_operator_scheme == "fs": | ||
| file_hash = self._sha256_file(src_operator, src_path) | ||
|
|
@@ -1088,8 +1092,8 @@ def dump( | |
| raise NotImplementedError("Dump is not implemented for this driver yet") | ||
|
|
||
| def _filename(self, path: PathBuf) -> str: | ||
| """Extract filename from url or path""" | ||
| if path.startswith("oci://"): | ||
| """Extract filename from url or path, stripping any query parameters""" | ||
| if isinstance(path, str) and path.startswith("oci://"): | ||
| oci_path = path[6:] # Remove "oci://" prefix | ||
| if ":" in oci_path: | ||
| repository, tag = oci_path.rsplit(":", 1) | ||
|
|
@@ -1098,10 +1102,8 @@ def _filename(self, path: PathBuf) -> str: | |
| else: | ||
| repo_name = oci_path.split("/")[-1] if "/" in oci_path else oci_path | ||
| return repo_name | ||
| elif path.startswith(("http://", "https://")): | ||
| return urlparse(path).path.split("/")[-1] | ||
| else: | ||
| return Path(path).name | ||
| return clean_filename(path) | ||
|
raballew marked this conversation as resolved.
|
||
|
|
||
| def _upload_artifact(self, storage, path: PathBuf, operator: Operator): | ||
| """Upload artifact to storage""" | ||
|
|
@@ -1636,17 +1638,12 @@ def _get_decompression_command(filename_or_url) -> str: | |
| Determine the appropriate decompression command based on file extension | ||
|
|
||
| Args: | ||
| filename (str): Name of the file to check | ||
| filename_or_url (str): Name of the file or URL to check | ||
|
|
||
| Returns: | ||
| str: Decompression command ('zcat', 'xzcat', or 'cat' for uncompressed) | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [LOW] The docstring says Suggested fix: update the docstring to match the actual return value. AI-generated, human reviewed
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed in 8ff6181. The docstring now correctly states the return values:
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed in 8ff6181. The docstring for _get_decompression_command now accurately reflects the return values: 'zcat |', 'xzcat |', or '' for uncompressed files.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed in 8ff6181. The docstring for
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed in 8ff6181. The docstring now correctly states the return values:
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed in e5b532c. The docstring now reads
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed in e5b532c. The
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed in e5b532c. The docstring now accurately reflects the return values:
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed. The docstring now reads
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Updated the docstring to accurately reflect the return values: |
||
| """ | ||
| if type(filename_or_url) is PosixPath: | ||
| filename = filename_or_url.name | ||
| elif filename_or_url.startswith(("http://", "https://")): | ||
| filename = urlparse(filename_or_url).path.split("/")[-1] | ||
|
|
||
| filename = filename.lower() | ||
| filename = clean_filename(filename_or_url).lower() | ||
| if filename.endswith((".gz", ".gzip")): | ||
| return "zcat |" | ||
| elif filename.endswith(".xz"): | ||
|
Comment on lines
+1635
to
1638
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [MEDIUM] The implementation handles Suggested fix: add a AI-generated, human reviewed
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Acknowledged. Adding
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good catch on the missing .zst extension, but this is a pre-existing gap that was present before this PR. Adding .zst support would change the scope of this fix beyond the original issue (preserving URL query parameters for signed URLs). I would suggest tracking this as a separate issue/PR to keep changes focused.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good catch. Adding
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added .zst support in commit e5b532c. The _get_decompression_command() function now includes a .zst branch returning "zstdcat |", and test_decompression_command_with_query_params covers .zst for both PosixPath, HTTP URL, and query-parameter paths.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good point. Added
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Agreed. A
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Follow-up: after further consideration,
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added in commit
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added the |
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -428,6 +428,144 @@ def test_categorize_exception_preserves_cause_for_wrapped_exceptions(): | |
| assert "File not found" in str(result) | ||
|
|
||
|
|
||
| def test_filename_strips_query_params_from_url_path(): | ||
| """Test _filename strips query parameters from paths with signed URL params""" | ||
| client = MockFlasherClient() | ||
|
|
||
| # Full HTTP URL | ||
| assert client._filename("https://cdn.example.com/images/image.raw.xz") == "image.raw.xz" | ||
|
|
||
| # Full HTTP URL with query parameters (e.g. CloudFront signed URL) | ||
| assert ( | ||
| client._filename("https://cdn.example.com/images/image.raw.xz?Expires=123&Signature=abc&Key-Pair-Id=xyz") | ||
| == "image.raw.xz" | ||
| ) | ||
|
|
||
| # Path string with query parameters (as returned by operator_for_path after fix) | ||
| assert client._filename("/images/image.raw.xz?Expires=123&Signature=abc") == "image.raw.xz" | ||
|
|
||
| # Plain path without query parameters | ||
| assert client._filename("/images/image.raw.xz") == "image.raw.xz" | ||
|
|
||
| # OCI path | ||
| assert client._filename("oci://quay.io/org/myimage:latest") == "myimage-latest" | ||
|
|
||
|
|
||
| def test_decompression_command_with_query_params(): | ||
| """Test _get_decompression_command handles paths with query parameters""" | ||
| from pathlib import PosixPath | ||
|
|
||
| from .client import _get_decompression_command | ||
|
|
||
| # Standard PosixPath | ||
| assert _get_decompression_command(PosixPath("/images/image.raw.xz")) == "xzcat |" | ||
| assert _get_decompression_command(PosixPath("/images/image.raw.gz")) == "zcat |" | ||
| assert _get_decompression_command(PosixPath("/images/image.raw")) == "" | ||
|
|
||
| # Full HTTP URL | ||
| assert _get_decompression_command("https://cdn.example.com/images/image.raw.xz") == "xzcat |" | ||
|
|
||
| # String path with query parameters (as returned by operator_for_path for signed URLs) | ||
| assert _get_decompression_command("/images/image.raw.xz?Expires=123&Signature=abc") == "xzcat |" | ||
| assert _get_decompression_command("/images/image.raw.gz?Expires=123") == "zcat |" | ||
| assert _get_decompression_command("/images/image.raw?Expires=123") == "" | ||
|
|
||
|
|
||
| def test_flash_signed_url_preserves_query_params(): | ||
| """Test that flash with a signed HTTP URL preserves query parameters for image_url""" | ||
| client = MockFlasherClient() | ||
|
|
||
| class DummyService: | ||
| def __init__(self): | ||
| self.storage = object() | ||
|
|
||
| def start(self): | ||
| pass | ||
|
|
||
| def stop(self): | ||
| pass | ||
|
|
||
| def get_url(self): | ||
| return "http://exporter" | ||
|
|
||
| client.http = DummyService() | ||
| client.tftp = DummyService() | ||
| client.call = lambda *args, **kwargs: None | ||
|
|
||
| captured = {} | ||
|
|
||
| def capture_perform(*args): | ||
| captured["image_url"] = args[3] | ||
| captured["should_download_to_httpd"] = args[4] | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [LOW] Flash integration tests use brittle positional arg capture. Both flash integration tests mock Suggested fix: use named parameters instead: def capture_perform(partition, block_device, path, image_url, should_download_to_httpd, *rest):AI-generated, human reviewed
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed in 9ce59ff. The last remaining test using positional arg capture (
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed in 8ff6181. Both capture_perform functions in the flash integration tests now use named parameters instead of positional *args indexing, so they won't silently break if the parameter order of _perform_flash_operation changes.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed in 9ce59ff. The
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed in 15d3321. The
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed in 9ce59ff. Both flash integration tests now use named parameters in the
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed in commit
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed. All |
||
|
|
||
| client._perform_flash_operation = capture_perform | ||
|
|
||
| # Direct HTTP URL with query params (no force_exporter_http) should preserve full URL | ||
| signed_url = "https://cdn.example.com/images/image.raw.xz?Expires=123&Signature=abc&Key-Pair-Id=xyz" | ||
| client.flash(signed_url, method="fls", fls_version="") | ||
|
|
||
| assert captured["image_url"] == signed_url | ||
| assert captured["should_download_to_httpd"] is False | ||
|
raballew marked this conversation as resolved.
raballew marked this conversation as resolved.
|
||
|
|
||
|
|
||
| def test_flash_bearer_token_signed_url_preserves_query_params(): | ||
| """Test that flash with force_exporter_http=True and bearer token preserves query params. | ||
|
|
||
| When a signed URL is used with a bearer token, the flash() method enters the | ||
| bearer token code path (lines 162-174 in client.py) which reconstructs the path | ||
| from parsed.path + '?' + parsed.query. This test verifies query params are preserved | ||
| and the path passed to the storage thread is correct. | ||
| """ | ||
| client = MockFlasherClient() | ||
|
|
||
| class DummyService: | ||
| def __init__(self): | ||
| self.storage = object() | ||
|
|
||
| def start(self): | ||
| pass | ||
|
|
||
| def stop(self): | ||
| pass | ||
|
|
||
| def get_url(self): | ||
| return "http://exporter" | ||
|
|
||
| def get_host(self): | ||
| return "127.0.0.1" | ||
|
|
||
| client.http = DummyService() | ||
| client.tftp = DummyService() | ||
| client.call = lambda *args, **kwargs: None | ||
|
|
||
| captured = {} | ||
|
|
||
| def capture_perform(*args): | ||
| captured["path"] = args[2] | ||
| captured["image_url"] = args[3] | ||
| captured["should_download_to_httpd"] = args[4] | ||
|
|
||
| client._perform_flash_operation = capture_perform | ||
| # Mock the background transfer thread to prevent it from actually running | ||
| client._transfer_bg_thread = lambda *args, **kwargs: None | ||
|
|
||
| signed_url = "https://cdn.example.com/images/image.raw.xz?Expires=123&Signature=abc&Key-Pair-Id=xyz" | ||
| client.flash( | ||
| signed_url, | ||
| force_exporter_http=True, | ||
| bearer_token="test-token-123", | ||
| method="fls", | ||
| fls_version="", | ||
| ) | ||
|
|
||
| # With force_exporter_http=True and bearer_token, should download to httpd | ||
| assert captured["should_download_to_httpd"] is True | ||
| # The path should have query params preserved (reconstructed from parsed.path + '?' + parsed.query) | ||
| assert captured["path"] == "/images/image.raw.xz?Expires=123&Signature=abc&Key-Pair-Id=xyz" | ||
| # The image_url should point to the exporter with the clean filename (no query params) | ||
| assert captured["image_url"] == "http://exporter/image.raw.xz" | ||
|
|
||
|
|
||
| def test_resolve_flash_parameters(): | ||
| """Test flash parameter resolution for single file, partitions, and error cases""" | ||
| client = MockFlasherClient() | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -44,6 +44,21 @@ async def aclose(self): | |
| pass | ||
|
|
||
|
|
||
| def clean_filename(path: PathBuf) -> str: | ||
| """Extract a clean filename from a path or URL, stripping query parameters. | ||
|
|
||
| This handles paths returned by operator_for_path() which may contain | ||
| query parameters for signed URLs (e.g. /path/to/image.raw.xz?Expires=...&Signature=...). | ||
| """ | ||
| path_str = str(path) | ||
| if path_str.startswith(("http://", "https://")): | ||
| return urlparse(path_str).path.split("/")[-1] | ||
| name = Path(path_str).name | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [HIGH] The non-URL branch calls Example: Suggested fix: strip the query string before passing to def clean_filename(path: PathBuf) -> str:
path_str = str(path)
if path_str.startswith(("http://", "https://")):
return urlparse(path_str).path.split("/")[-1]
if "?" in path_str:
path_str = path_str.split("?", 1)[0]
return Path(path_str).nameAI-generated, human reviewed
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed in 8ff6181. The
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks for flagging this case. I checked the current implementation and it actually handles this correctly already -- the query string is stripped before passing to Path().name. So for the example path, the split on '?' produces '/images/image.raw.xz' first, then Path().name returns 'image.raw.xz'. There is also an existing test for this exact edge case in test_clean_filename.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed in commit 8ff6181. The non-URL branch now strips query parameters before calling
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Great catch -- this was a real bug. Fixed in commit 8ff6181 by stripping the query string before passing to Path().name, exactly as you suggested. The clean_filename() function now does: if "?" in path_str:
path_str = path_str.split("?", 1)[0]
return Path(path_str).nameAlso added a test case for this edge case:
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Great catch. Fixed by stripping query parameters before passing to
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good catch. This is now fixed --
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good catch. Fixed in commit
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good find. Fixed exactly as suggested -- |
||
| if "?" in name: | ||
| name = name.split("?", 1)[0] | ||
| return name | ||
|
raballew marked this conversation as resolved.
Outdated
|
||
|
|
||
|
|
||
| def operator_for_path(path: PathBuf) -> tuple[PathBuf, Operator, str]: | ||
| """Create an operator for the given path | ||
| Return a tuple of: | ||
|
|
@@ -54,7 +69,13 @@ def operator_for_path(path: PathBuf) -> tuple[PathBuf, Operator, str]: | |
| if type(path) is str and path.startswith(("http://", "https://")): | ||
| parsed_url = urlparse(path) | ||
| operator = Operator("http", root="/", endpoint=f"{parsed_url.scheme}://{parsed_url.netloc}") | ||
| return Path(parsed_url.path), operator, "http" | ||
| # Preserve query parameters in the path so that signed URLs | ||
| # (e.g. CloudFront URLs with ?Expires=...&Signature=...&Key-Pair-Id=...) | ||
| # are fetched correctly by the OpenDAL HTTP operator. | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [MEDIUM] Inline comments could be replaced with self-documenting code. The project convention states comments are only acceptable as a last resort when the code cannot be refactored to be self-explanatory. The inline comments here explaining query parameter preservation could be eliminated by extracting the pattern into a well-named helper method or relying on docstrings. AI-generated, human reviewed
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed in 8ff6181. Replaced the inline comments with self-documenting code: extracted
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The inline comments explain why query parameters need to be preserved (signed URLs with authentication tokens), which is not immediately obvious from the code alone. The helper functions (clean_filename, path_with_query) are self-documenting for what they do, but the comments provide context on why they exist. Happy to remove them if the team feels strongly about it.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Addressed in commit 8ff6181 by extracting the
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Addressed by extracting the logic into well-named helper functions: clean_filename() and path_with_query(), each with docstrings that explain their purpose. The inline comments in operator_for_path have been replaced with the function's docstring explaining the signed URL behavior.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Agreed. Replaced the inline comments with the self-documenting
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The inline comments explaining query parameter preservation have been mostly replaced by self-documenting code: the helper functions
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Addressed. The inline comments were replaced by extracting the
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Addressed by moving the explanation into the docstrings of |
||
| op_path = parsed_url.path | ||
| if parsed_url.query: | ||
| op_path = f"{op_path}?{parsed_url.query}" | ||
| return op_path, operator, "http" | ||
|
raballew marked this conversation as resolved.
Outdated
|
||
| else: | ||
| return Path(path).resolve(), Operator("fs", root="/"), "fs" | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [LOW] The HTTP branch now returns AI-generated, human reviewed
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Acknowledged. The type inconsistency (
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Acknowledged. The HTTP branch returns str while the filesystem branch returns Path -- both are valid PathBuf and all callers handle both types correctly. Keeping str for the HTTP branch is intentional: Path() would strip the query parameters that signed URLs need. The type annotation PathBuf = str | os.PathLike already covers this, so no change needed here.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Acknowledged. The HTTP branch now returns
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Acknowledged. The docstring for
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Acknowledged. The return type annotation
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Acknowledged. The return type difference is intentional -- the HTTP branch returns
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Acknowledged. The HTTP branch returns
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Acknowledged. The type signature uses
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Acknowledged in the updated |
||
|
|
||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -5,7 +5,7 @@ | |
|
|
||
| import click | ||
| from jumpstarter_driver_composite.client import CompositeClient | ||
| from jumpstarter_driver_opendal.client import FlasherClient, operator_for_path | ||
| from jumpstarter_driver_opendal.client import FlasherClient, clean_filename, operator_for_path | ||
| from jumpstarter_driver_power.client import PowerClient | ||
| from opendal import Operator | ||
|
|
||
|
|
@@ -39,7 +39,7 @@ def _upload_file_if_needed(self, file_path: str, operator: Operator | None = Non | |
| path_buf = Path(file_path) | ||
| operator_scheme = "unknown" | ||
|
|
||
| filename = Path(path_buf).name | ||
| filename = clean_filename(path_buf) | ||
|
raballew marked this conversation as resolved.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. [MEDIUM] No test coverage for ridesx This PR changes this line from Suggested fix: add a test verifying AI-generated, human reviewed
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Fixed in 8ff6181. Added
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is already addressed -- test_upload_file_if_needed_strips_query_params in client_test.py verifies clean_filename works correctly with signed URL paths including query parameters with unencoded slashes.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added test_upload_file_if_needed_strips_query_params in ridesx client_test.py, which verifies that clean_filename() correctly strips query parameters from signed URL paths including the edge case with unencoded slashes in the Signature parameter.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Agreed. A test
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added in
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added |
||
|
|
||
| if self._should_upload_file(self.storage, filename, path_buf, operator, operator_scheme): | ||
| if operator_scheme == "http": | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[MEDIUM] Inline comments could be replaced with self-documenting code.
Same as the opendal client: the project convention states comments are only acceptable as a last resort. These inline comments explaining query parameter preservation could be eliminated by extracting the pattern into a well-named helper method.
AI-generated, human reviewed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in 8ff6181. The inline comments in the flashers client have been replaced by delegating to the shared
path_with_query()helper, making the code self-explanatory.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in 8ff6181. Same approach as the opendal client -- the inline comments in the flashers client have been removed. The path_with_query() helper and updated docstrings on clean_filename() and operator_for_path() make the signed URL behavior self-documenting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in 8ff6181 alongside the opendal client. The inline comments in the flashers client were replaced with the same self-documenting approach: the
path_with_query()helper and updated docstrings make the query parameter handling self-explanatory without inline commentary.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in 8ff6181. Replaced the inline comments in the flashers client with self-documenting code, same as the opendal client. Both now use the
path_with_query()helper andclean_filename()with descriptive docstrings, eliminating the need for inline comments.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in 8ff6181 along with the opendal client instance. Both inline comment blocks have been replaced by the self-documenting
path_with_query()helper.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in 8ff6181. The inline comments in the flashers client have been replaced with self-documenting code, mirroring the changes made in the opendal client. The query parameter handling is now done through the
path_with_query()helper with a descriptive docstring, and theclean_filename()helper provides the filename extraction logic. Both are self-explanatory without inline comments.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in 9ce59ff. The inline comments in the flashers client were replaced with the self-documenting
path_with_query()helper, matching the same refactoring done in the opendal client.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed. Same as the opendal client -- the inline comments were replaced by the
path_with_query()helper, making the code self-documenting.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The inline comment in the bearer token path was replaced with the
path_with_query()helper, which is self-documenting through its name and docstring.