[SPARK-57023][SQL] DecimalAggregates: peel widened Cast on Min/Max#56078
Open
yaooqinn wants to merge 1 commit into
Open
[SPARK-57023][SQL] DecimalAggregates: peel widened Cast on Min/Max#56078yaooqinn wants to merge 1 commit into
yaooqinn wants to merge 1 commit into
Conversation
### What changes were proposed in this pull request? Extends `DecimalAggregates` with two new arms that hoist a scale-preserving widening `Cast` out of `Min` / `Max` so the aggregate runs on the narrower inner `Decimal`. The peel reuses the existing `WidenedDecimalChild` extractor (introduced for SUM/AVG in SPARK-56627 and tightened in SPARK-56983) — same scale, same-or-larger precision, `CheckOverflow` guard — so the eligibility surface is identical to the already-shipped SUM/AVG fast paths. `Min` / `Max` pick an existing row's value (no arithmetic), so a widening Cast is bit-identical whether applied per-row before the aggregate or once on the aggregate's result. The outer `Cast` preserves the pre-rewrite result dataType (`Min.dataType == child.dataType`). To keep the rule's tree-pattern pruning tight, this PR also adds `MIN` / `MAX` to `TreePatterns` and registers them as `nodePatterns` on the `Min` / `Max` aggregate expressions. ### Why are the changes needed? Widened decimal Cast is a common shape (e.g. `Min/Max(CAST(d AS dec(p',s)))` emitted by type coercion for set ops, UNION-by-name, view schemas). Without the peel the per-row Cast runs inside the aggregate hot path. ### Does this PR introduce _any_ user-facing change? No. Same result type, same eval semantics, same NULL semantics. ### How was this patch tested? Six new oracle tests in `DecimalAggregatesSuite`: - `MIN(CAST(dec(7,2) AS dec(12,2)))` peels via widened-Cast fast path - `MAX(CAST(dec(7,2) AS dec(12,2)))` peels via widened-Cast fast path - `MIN(CAST(dec(7,2) AS dec(12,4)))` does NOT peel (scale change) - `MIN(CAST(dec(17,2) AS dec(10,2)))` does NOT peel (narrowing) - `MIN/MAX(CheckOverflow)` does NOT peel (CheckOverflow guard) - `MinBy/MaxBy/MaxMinByK` with widened-Cast value do NOT peel (rule pattern matches only Min/Max) All 43 tests in DecimalAggregatesSuite pass. ### Was this patch authored or co-authored using generative AI tooling? No.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What changes were proposed in this pull request?
Extend
DecimalAggregatesto peel a scale-preserving wideningCastaroundMin/Maxarguments, mirroring the existing SUM/AVG widened-Cast arms landed via SPARK-56983.When the input is
Min(Cast(inner: dec(p, s), dec(p', s)))(orMax(...)) withp' >= pand noCheckOverflowwrapper, the rule rewrites toCast(Min(inner), dec(p', s))(and likewise forMax). MIN/MAX are pointwise on a totally-ordered domain, so under same-scale widening the rewrite is value-equivalent and NULL-preserving (see design §D6 self-Q&A).Both arms reuse the same-package
WidenedDecimalChildextractor introduced for SUM/AVG, which refuses to unwrapCheckOverflowand enforces the sames == s',p' >= pguard.TreePatterns.MIN/TreePatterns.MAXare added and registered onMin/Max;DecimalAggregates'scontainsAnyPatternpruning is widened to(SUM, AVERAGE, MIN, MAX). No new rule, no new file — three arms cohabit one object.Why are the changes needed?
The SUM/AVG arms recover the long-backed fast path when BI tools generate
SUM(CAST(small_dec AS larger_dec)). The MIN/MAX case is the natural sibling: same widening pattern, but currently no peel arm exists, so each aggregated row pays a per-rowDecimal.changePrecisioncall insideCast(Cast.scala:1074-1082) even though the outer Cast could be applied once to the partition extremum instead.MIN/MAX are pointwise on a totally-ordered domain and immune to both the SUM overflow boundary (SPARK-56983) and the AVG SPARK-37024 Double-regime gate, so the equivalence is unconditional within the
WidenedDecimalChildguard domain (R1b Lemma 1 — design §D6 self-Q&A).The saving is per-row
changePrecisionelimination on the aggregate input — ceiling −0.39% ~ −6.02% JDK-progressive (JDK 25 strongest) per the GHA 3 JDK × 16 case matrix below — and the patch is essentially free: three lines of extractor reuse, no new rule, no new file.Does this PR introduce any user-facing change?
No.
How was this patch tested?
DecimalAggregatesSuiteextended with 6 SPARK-57023 oracle cases — 2 peel positives (Min/Maxover widening Cast) and 4 negatives (scale-changing Cast, narrowing Cast,CheckOverflow-wrapped,MinBy/MaxBy/MaxMinByK). Suite: 43/43.Full
catalyst/test: 9341 tests / 353 suites, 0 failed, 5 ignored, 670 s.TPCDSV1_4PlanStabilitySuite+TPCDSV1_4PlanStabilityWithStatsSuite: no golden change onapache/master. Existence-test onefb7beab826recorded 0 trigger across the 130 TPC-DS v1.4 + v2.7.0 queries (see investigation0002-baseline-run-results.md, positive-control verified).DecimalAggregatesBenchmark: extended with new MIN section (C1-C4) and MAX section (D1-D4) mirroring the existing SUM/AVG sections. Full 8-case × 3 JDK matrix run on GitHub Actions standardubuntu-22.04runners (AMD EPYC 7763 64-Core, 10M rows × 5 iters);DecimalAggregatesBenchmark{-,-jdk21-,-jdk25-}results.txtregenerated and committed. Headline cells (Best ms peel off / peel on / Δ%):Pattern: ceiling −0.39% ~ −6.02% JDK-progressive (JDK 25 strongest, JDK 17 weakest) across the 24 readings; no negative-delta (regression) cell. The saving is per-row
Decimal.changePrecisionelimination on the aggregate input — design0002-design-minmax-fastpath.md§D5.1 declares this a legal micro-only ship contract. A pre-GHA local sbt sanity (Section C MIN, JDK 17, 10M × 5) recorded −1.5% to −2.0%; the GHA EPYC numbers above supersede it.Note on Section A/B
results.txtrefreshThe sibling SPARK-56627 .scala edited Section B2/B4 cases to
p'=11but did not regenerateDecimalAggregatesBenchmark-results.txt, so the committed text was stale for the new shape. This PR regenerates the whole file under the canonical EPYC 7763 GHA runner (replacing the prior EPYC 9V74 numbers in Section A/B), and the same regeneration produces the JDK 21 / JDK 25 companion files. Section A/B refresh is mechanical housekeeping triggered by the regeneration, not part of the MIN/MAX peel logic — flagged here for reviewer transparency.Why no TPC-DS results?
The existence-test on
apache/masterefb7beab826walked optimized plans across the full 130-query TPC-DS v1.4 + v2.7.0 corpus and recorded 0 queries triggering theMin(Cast(...))/Max(Cast(...))widening pattern (see investigation0002-baseline-run-results.md, positive-control verified). The MIN/MAX peel is justified by semantic equivalence + micro-level pattern coverage, not by TPC-DS revenue. Design0002-design-minmax-fastpath.md§D5.1 declares this as a legal micro-only ship contract.Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Opus 4.7