[Torchvision API] Input metadata#6364
Conversation
Signed-off-by: Marek Dabek <mdabek@nvidia.com>
Signed-off-by: Marek Dabek <mdabek@nvidia.com>
Signed-off-by: Marek Dabek <mdabek@nvidia.com>
Signed-off-by: Marek Dabek <mdabek@nvidia.com>
Signed-off-by: Marek Dabek <mdabek@nvidia.com>
Signed-off-by: Marek Dabek <mdabek@nvidia.com>
Signed-off-by: Marek Dabek <mdabek@nvidia.com>
Signed-off-by: Marek Dabek <mdabek@nvidia.com>
Signed-off-by: Marek Dabek <mdabek@nvidia.com>
|
@greptileai review |
|
| Filename | Overview |
|---|---|
| dali/python/nvidia/dali/experimental/torchvision/v2/functional/image_metadata.py | New file implementing get_image_size and get_dimensions; logic correctly mirrors torchvision for PIL and tensor inputs; minor: error messages missing trailing periods. |
| dali/python/nvidia/dali/experimental/torchvision/v2/randomcrop.py | New RandomCrop operator; _randint receives a negative max_value when image is smaller than crop size and pad_if_needed=False, producing undefined DALI pipeline behavior with no actionable error message. |
| dali/python/nvidia/dali/experimental/torchvision/v2/functional/crop.py | New functional crop wrapper; layout-axis mapping is correct; docstring is minimal and omits parameters/return type. |
| dali/test/python/torchvision/test_tv_randomcrop.py | New tests for RandomCrop; correctness verified against torchvision; assert_raises calls lack glob= patterns; large commented-out TODO block must be removed before merging. |
| dali/test/python/torchvision/test_tv_crop.py | New tests for functional crop; validates against torchvision across device/mode/layout; assert_raises calls lack glob= patterns. |
| dali/test/python/torchvision/test_tv_image_metadata.py | New tests for get_image_size and get_dimensions; good coverage of PIL modes, tensor ranks, GPU, and torchvision compatibility; error-path tests present. |
Sequence Diagram
sequenceDiagram
participant User
participant crop_fn as functional.crop
participant RandomCrop
participant ndd_slice as ndd.slice / fn.slice
User->>crop_fn: crop(inpt, top, left, height, width)
crop_fn->>crop_fn: _verify_crop_coordinate(top, left)
crop_fn->>RandomCrop: "verify_args(size=(height,width), ...)"
crop_fn->>ndd_slice: "slice(inpt, (top,left), (height,width), axes=...)"
ndd_slice-->>User: cropped tensor/batch
User->>RandomCrop: __call__(inpt)
RandomCrop->>RandomCrop: preprocess_data → get_HWC_from_layout_pipeline
RandomCrop->>RandomCrop: _kernel(in_h, in_w, _, tensor)
alt needs_padding
RandomCrop->>fn.slice: pad with out_of_bounds_policy
fn.slice-->>RandomCrop: padded tensor
end
RandomCrop->>RandomCrop: _randint(max_top), _randint(max_left)
RandomCrop->>fn.slice: slice at random offset
fn.slice-->>User: cropped tensor
Comments Outside Diff (2)
-
dali/test/python/torchvision/test_tv_randomcrop.py, line 1084-1103 (link)[Bug] Commented-out test code with TODO should not be merged
A large block of commented-out test functions wrapped in a docstring is left behind with
# TODO: Fill using dictionary pattern is currently not supported. Per project policy, TODOs must be resolved (or tracked with an issue reference) before merging, and dead code should not be preserved in source — git history serves that purpose. Either remove this block entirely and open a tracked issue, or implement the functionality before landing. -
dali/test/python/torchvision/test_tv_randomcrop.py, line 1157-1158 (link)[Style]
assert_raisescalls lack a message-pattern globProject convention requires
assert_raises(ExcType, glob="<pattern>")so that a test can't pass if an unrelated path raises the same exception type with a meaningless message. This pattern appears throughouttest_tv_randomcrop.py(e.g.test_random_crop_invalid_size,test_random_crop_invalid_padding,test_random_crop_invalid_fill) and throughouttest_tv_crop.py(test_crop_invalid_input_type,test_crop_invalid_output_size,test_crop_invalid_coordinates). Addingglob=guards that the right error is raised for the right reason.Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Reviews (1): Last reviewed commit: "Image information Torchvision's function..." | Re-trigger Greptile
| max_top = fn.cast(in_h, dtype=dali.types.INT32) - crop_h | ||
| max_left = fn.cast(in_w, dtype=dali.types.INT32) - crop_w | ||
|
|
||
| top = RandomCrop._randint(max_top) | ||
| left = RandomCrop._randint(max_left) | ||
|
|
||
| return fn.slice( | ||
| tensor, | ||
| fn.stack(left, top), | ||
| fn.stack(crop_w, crop_h), | ||
| device=self.device, | ||
| axis_names="WH", | ||
| ) |
There was a problem hiding this comment.
[Bug] Undefined behavior when crop size exceeds image size and
pad_if_needed=False
When pad_if_needed=False (the default) and no explicit padding is supplied, self.needs_padding is False so the padding block is skipped. If the input image is smaller than the requested crop, max_top = in_h - crop_h becomes negative. _randint then builds range_end = max_value + 1 ≤ 0, passing fn.random.uniform(range=[0, ≤0]) — an invalid range — to DALI. Torchvision raises a clear ValueError in this case; DALI silently produces garbage or a pipeline crash with no actionable message. A guard like if max_top < 0 or max_left < 0: raise ValueError(...) should be added before _randint is called.
| if inpt.ndim < 2: | ||
| raise TypeError( | ||
| f"get_image_size requires a tensor with at least 2 dimensions, got {inpt.ndim}" | ||
| ) | ||
| return [inpt.shape[-1], inpt.shape[-2]] # [W, H] | ||
| raise TypeError(f"Unsupported input type: {type(inpt)}") | ||
|
|
||
|
|
||
| def get_dimensions(inpt: Image.Image | torch.Tensor) -> List[int]: |
There was a problem hiding this comment.
[Style] Error messages should end with a period
Both error messages are missing a trailing period, violating the project convention that error messages must read as complete sentences.
| if inpt.ndim < 2: | |
| raise TypeError( | |
| f"get_image_size requires a tensor with at least 2 dimensions, got {inpt.ndim}" | |
| ) | |
| return [inpt.shape[-1], inpt.shape[-2]] # [W, H] | |
| raise TypeError(f"Unsupported input type: {type(inpt)}") | |
| def get_dimensions(inpt: Image.Image | torch.Tensor) -> List[int]: | |
| if inpt.ndim < 2: | |
| raise TypeError( | |
| f"get_image_size requires a tensor with at least 2 dimensions, got {inpt.ndim}." | |
| ) | |
| return [inpt.shape[-1], inpt.shape[-2]] # [W, H] | |
| raise TypeError(f"Unsupported input type: {type(inpt)}.") | |
| def get_dimensions(inpt: Image.Image | torch.Tensor) -> List[int]: |
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
| if inpt.ndim < 2: | ||
| raise TypeError( | ||
| f"get_dimensions requires a tensor with at least 2 dimensions, got {inpt.ndim}" | ||
| ) | ||
| if inpt.ndim == 2: |
There was a problem hiding this comment.
[Style]
get_dimensions error messages also missing trailing period
Same sentence-ending convention issue in get_dimensions.
| if inpt.ndim < 2: | |
| raise TypeError( | |
| f"get_dimensions requires a tensor with at least 2 dimensions, got {inpt.ndim}" | |
| ) | |
| if inpt.ndim == 2: | |
| if inpt.ndim < 2: | |
| raise TypeError( | |
| f"get_dimensions requires a tensor with at least 2 dimensions, got {inpt.ndim}." | |
| ) | |
| if inpt.ndim == 2: |
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Category:
New feature
Description:
Torchvision functional API operator to get metadata of input:
Additional information:
Affected modules and functionalities:
Key points relevant for the review:
Tests:
Checklist
Documentation
DALI team only
Requirements
REQ IDs: N/A
JIRA TASK: N/A