Add shared fusion infrastructure and QuantFusionPass (#19724)
ethansfng wants to merge 2 commits into
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19724
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure, 2 Unrelated Failures
As of commit f91d610 with merge base ec76470:
NEW FAILURE - The following job has failed:
BROKEN TRUNK - The following jobs failed but were present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@ethansfng has exported this pull request. If you are a Meta employee, you can view the originating Diff in D105728137.
Summary: Add infrastructure for per-pattern `fuse()` methods on Cadence `QuantizationPattern`:
- Add `anchor_ops()` (default: `tuple(partition_types())`) and `fuse()` (default: `None`) to the `QuantizationPattern` base class
- Add shared fusion helpers: `_get_dequant`, `_find_quant_user`, `_insert_fused_op`, `_maybe_route_depthwise_conv1d`, `_fuse_conv`, `_fuse_linear`, `_fuse_matmul`
- Add `QuantFusionPass` to `compiler_funcs.py`: a shared executor that iterates patterns, matches `anchor_ops()`, and calls `fuse()`, with debug logging and dead code elimination

Differential Revision: D105728137
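For readers who have not opened the diff, the control flow described in the summary can be sketched roughly as follows. This is a toy, standard-library-only illustration, not the code in this PR: `Node`, `Pattern`, `LinearPattern`, `run_fusion_pass`, and `eliminate_dead_code` are hypothetical stand-ins, ops are plain strings rather than real operator overloads, and "default: `None`" is assumed here to mean that the base `fuse()` returns `None` to decline fusion.

```python
from __future__ import annotations

import logging
from dataclasses import dataclass, field
from typing import Optional

logger = logging.getLogger(__name__)


@dataclass(eq=False)  # identity-based equality, like real graph nodes
class Node:
    """Toy graph node (hypothetical; not torch.fx.Node)."""
    op: str
    inputs: list[Node] = field(default_factory=list)


class Pattern:
    """Toy stand-in for QuantizationPattern with the two hooks from the summary."""

    def partition_types(self) -> list[str]:
        return []

    def anchor_ops(self) -> tuple[str, ...]:
        # Default named in the summary: tuple(partition_types()).
        return tuple(self.partition_types())

    def fuse(self, graph: list[Node], anchor: Node) -> Optional[Node]:
        # Assumed meaning of "default: None": the base class never fuses.
        return None


class LinearPattern(Pattern):
    """Hypothetical pattern: dequant -> linear -> quant becomes quantized_linear."""

    def partition_types(self) -> list[str]:
        return ["linear"]

    def fuse(self, graph: list[Node], anchor: Node) -> Optional[Node]:
        # Require dequantized inputs and a quant node consuming the anchor.
        if not anchor.inputs or any(i.op != "dequant" for i in anchor.inputs):
            return None
        quant = next((n for n in graph if n.op == "quant" and anchor in n.inputs), None)
        if quant is None:
            return None
        # The fused op reads the quantized tensors that fed the dequant nodes.
        fused = Node("quantized_linear", [d.inputs[0] for d in anchor.inputs])
        graph.insert(graph.index(quant), fused)
        # Reroute every user of the quant node to the fused node.
        for n in graph:
            n.inputs = [fused if i is quant else i for i in n.inputs]
        return fused


def eliminate_dead_code(graph: list[Node], outputs: list[Node]) -> None:
    """Drop every node that is not reachable from the outputs."""
    live: set[int] = set()
    stack = list(outputs)
    while stack:
        n = stack.pop()
        if id(n) not in live:
            live.add(id(n))
            stack.extend(n.inputs)
    graph[:] = [n for n in graph if id(n) in live]


def run_fusion_pass(graph: list[Node], patterns: list[Pattern], outputs: list[Node]) -> int:
    """Iterate patterns, match anchors, call fuse(), then clean up dead nodes."""
    fused_count = 0
    for pattern in patterns:
        anchors = pattern.anchor_ops()
        for node in list(graph):  # snapshot, since fuse() mutates the graph
            if node.op in anchors and pattern.fuse(graph, node) is not None:
                logger.debug("%s fused at %s", type(pattern).__name__, node.op)
                fused_count += 1
    eliminate_dead_code(graph, outputs)
    return fused_count
```

In this sketch, a pattern that keeps the base-class `fuse()` is simply skipped, and the dequant, linear, and quant nodes left orphaned by a successful fusion are removed by the final cleanup step.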
639d11e to 7bfa849
7bfa849 to 4f270a7
4f270a7 to c257454
…19743)

Summary: torchao's `convert_pt2e` adds `out_dtype` kwargs to dequant nodes for bf16 models. `cadence::dequantize_per_tensor` doesn't support this kwarg (it hardcodes float32 output), so `ReplacePT2DequantWithCadenceDequantPass` crashes when it forwards kwargs blindly to the cadence op.

Strip `out_dtype` from kwargs before creating the cadence dequant node, and insert an `aten.to.dtype` cast after it to preserve the original output dtype semantics.

Differential Revision: D105630451
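The kwarg handling described in this commit can be illustrated with a small sketch. It is not the actual pass: `split_out_dtype` and `plan_replacement` are hypothetical helpers, dtypes and op names are plain strings, and skipping the cast when the requested dtype is already float32 is an assumption of this sketch rather than something the summary states.

```python
from typing import Any, Optional


def split_out_dtype(kwargs: dict[str, Any]) -> tuple[dict[str, Any], Optional[str]]:
    """Return kwargs without `out_dtype`, plus the stripped value (or None)."""
    clean = {k: v for k, v in kwargs.items() if k != "out_dtype"}
    return clean, kwargs.get("out_dtype")


def plan_replacement(kwargs: dict[str, Any]) -> list[tuple[str, dict[str, Any]]]:
    """List the (op, kwargs) pairs to emit in place of one PT2 dequant node."""
    clean, out_dtype = split_out_dtype(kwargs)
    # The cadence op never receives `out_dtype`; it always produces float32.
    steps = [("cadence.dequantize_per_tensor", clean)]
    # Assumption: a cast is only needed when the requested dtype differs.
    if out_dtype is not None and out_dtype != "float32":
        steps.append(("aten.to.dtype", {"dtype": out_dtype}))
    return steps
```

For a bf16 model, `plan_replacement({"out_dtype": "bfloat16"})` yields the cadence dequant followed by a cast back to `bfloat16`, while a node without the kwarg yields the cadence dequant alone.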
c257454 to f91d610