Issues: symptom classification + owner-grouping (engine)#803
nadaverell wants to merge 3 commits into
Adds a pure, deterministic classifier (Category, with a fixed Category→Group rollup) over the signal radar already emits — Source + Kind + Reason + crash context — and wires it into Compose so every /api/issues and MCP `issues` row carries `category` + `category_group`. Both are server-emitted labels (the UI renders the rollup without its own category→group map) and both are exposed as CEL filter bindings. `unknown` is first-class: categories whose detectors don't exist yet, plus CronJob / Job / CAPI / PVC-Lost / Node-Cordoned, fall through to it rather than being force-fit into a neat bucket.
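A minimal sketch of the classify-then-roll-up shape described above, assuming a plain lookup table keyed on the signal tuple. `Category`, `Group` and `categoryGroup` are names from this PR; the `Signal` struct, the `reasonCategory` table, the `Source` value and the specific category and group values are hypothetical and only illustrate the `unknown` fall-through.

```go
package main

import "fmt"

type Category string
type Group string

const (
	CategoryImagePull Category = "image_pull" // hypothetical value
	CategoryUnknown   Category = "unknown"

	GroupStartup Group = "startup" // hypothetical value
	GroupUnknown Group = "unknown"
)

// Signal is a hypothetical stand-in for the (Source, Kind, Reason) tuple.
type Signal struct {
	Source, Kind, Reason string
}

// reasonCategory is an illustrative table, not radar's real vocabulary.
var reasonCategory = map[Signal]Category{
	{Source: "problem", Kind: "Pod", Reason: "ImagePullBackOff"}: CategoryImagePull,
}

// categoryGroup is the fixed Category -> Group rollup.
var categoryGroup = map[Category]Group{
	CategoryImagePull: GroupStartup,
	CategoryUnknown:   GroupUnknown,
}

// Classify is pure: the same signal always yields the same labels, and
// anything the tables do not cover falls through to unknown.
func Classify(s Signal) (Category, Group) {
	cat, ok := reasonCategory[s]
	if !ok {
		cat = CategoryUnknown
	}
	grp, ok := categoryGroup[cat]
	if !ok {
		grp = GroupUnknown
	}
	return cat, grp
}

func main() {
	fmt.Println(Classify(Signal{Source: "problem", Kind: "Pod", Reason: "ImagePullBackOff"}))
	fmt.Println(Classify(Signal{Source: "problem", Kind: "CronJob", Reason: "Failed"}))
}
```

Keeping both steps as table lookups is one way to make the classifier trivially table-testable; the real engine may also consult crash context, which this sketch omits.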
Every issue now carries three additive identity fields:
- Owner: the topmost stable controller of a Pod problem (Pod→Deployment, not the intermediate ReplicaSet), resolved at detection time via the existing topOwnerForPod and carried on k8s.Problem alongside the RestartCount/LastTerminatedReason crash context.
- GroupingScope: workload|service|pvc|ingress|node|unknown, the subject's coarse bucket (drives the future UI section, part of the ID).
- ID: deterministic cluster-local hash(scope, subject key, category), identical for every member row that rolls up to the same subject+category. The hub namespaces it by cluster_id for global uniqueness.

Subject = the topmost owner when one was resolved (member pods key on their workload), else the resource itself. resourceKey reuses pkg/audit.ResourceKey so issue grouping and audit deep-links share one key format rather than drifting.

Purely additive: rows are not yet collapsed; the shared ID is the handle the collapse fold keys on (next slice). No consumer contract changes.
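The identity rule can be sketched as follows. The commit message only says `hash(scope, subject key, category)`; the choice of SHA-256, the 16-hex-character truncation, the NUL separator and the `Kind/namespace/name` key strings are all assumptions made for illustration, as are the function names `issueID` and `subjectKey`.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// subjectKey picks the topmost owner when one was resolved, else the
// resource itself (hypothetical helper; key format is assumed).
func subjectKey(ownerKey, resourceKey string) string {
	if ownerKey != "" {
		return ownerKey
	}
	return resourceKey
}

// issueID hashes (scope, subject key, category). The NUL separator keeps
// ("a", "bc") and ("ab", "c") from colliding. SHA-256 is an assumption.
func issueID(scope, subject, category string) string {
	sum := sha256.Sum256([]byte(strings.Join([]string{scope, subject, category}, "\x00")))
	return hex.EncodeToString(sum[:8])
}

func main() {
	owner := "Deployment/default/web"
	// Two member pods of the same Deployment share one ID.
	a := issueID("workload", subjectKey(owner, "Pod/default/web-abc"), "image_pull")
	b := issueID("workload", subjectKey(owner, "Pod/default/web-def"), "image_pull")
	fmt.Println(a == b)
	// An ownerless pod keys on itself.
	fmt.Println(issueID("workload", subjectKey("", "Pod/default/solo"), "image_pull"))
}
```

Because the ID depends only on the subject and category, it stays stable across pod restarts and ReplicaSet rollovers, which is what lets a later fold key on it.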
GroupIssues collapses the flat evidence rows into the public operational
model — one row per shared id (subject+category). A Deployment whose 3 pods
all ImagePullBackOff is one issue with affected:{pods:3} + bounded member
refs, not three rows.
- /api/issues + MCP issues return grouped rows by default; the cap now
counts issue groups, not replica fan-out.
- /api/issues?view=flat returns the raw pre-fold evidence rows for
debugging ("what folded into this group?"). MCP stays grouped-only —
agents use get_resource/get_events for raw state.
- Compose() stays flat internally, so summarycontext's per-resource index
is unchanged; Filters.Grouped gates the fold.
- Representative rules (deterministic): severity = max member, category =
shared, subject = topmost owner, reason/message/crash-context from the
worst member, age = oldest onset, last_seen = newest, members sorted +
capped at 10 with members_truncated past that.
Table-tested — grouping bugs are trust bugs; every consumer inherits them.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 4b11cd6. Configure here.
    GroupScaling      Group = "scaling"
    GroupSecurity     Group = "security"
    GroupControlPlane Group = "control_plane"
    GroupApplication  Group = "application"
Unused GroupApplication constant is dead code
Low Severity
GroupApplication is declared as a Group constant but no category in the categoryGroup map maps to it, and it's not referenced anywhere else in the codebase. Unlike the forward-declared categories (e.g. CategoryDNSFailure) which at least have entries in categoryGroup, this group constant has zero consumers — making it truly dead code rather than a planned placeholder.
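One way to catch this class of drift mechanically is a check that every declared group is the target of at least one category. The sketch below reuses the four `Group` constants visible in the diff; the `allGroups` slice, the `unusedGroups` helper and the category names in `categoryGroup` are hypothetical placeholders, not the PR's actual table.

```go
package main

import "fmt"

type Category string
type Group string

const (
	GroupScaling      Group = "scaling"
	GroupSecurity     Group = "security"
	GroupControlPlane Group = "control_plane"
	GroupApplication  Group = "application"
)

// allGroups must be kept in sync with the constants above (hypothetical).
var allGroups = []Group{GroupScaling, GroupSecurity, GroupControlPlane, GroupApplication}

// categoryGroup here uses made-up category names for illustration only.
var categoryGroup = map[Category]Group{
	"example_scaling":       GroupScaling,
	"example_security":      GroupSecurity,
	"example_control_plane": GroupControlPlane,
}

// unusedGroups returns declared groups that no category rolls up to,
// in declaration order.
func unusedGroups(all []Group, m map[Category]Group) []Group {
	used := make(map[Group]bool, len(m))
	for _, g := range m {
		used[g] = true
	}
	var out []Group
	for _, g := range all {
		if !used[g] {
			out = append(out, g)
		}
	}
	return out
}

func main() {
	fmt.Println(unusedGroups(allGroups, categoryGroup))
}
```

Run as a table test, a non-empty result would flag a constant like `GroupApplication` before review does.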
Subsumed into #811 — the classification engine (the three |


Builds the Radar Issues engine: every operational issue is classified by symptom category, gets a stable identity, and is grouped under its owning workload, so `/api/issues` and the MCP `issues` tool emit a triage queue, not a per-object feed. Pure/deterministic, table-tested, MCP-first.

What changed (3 commits)
- `(Source, Kind, Reason)→category` classifier (~25 categories → 11 groups, `unknown` first-class), wired into `Compose`. Every row carries `category` + `category_group`, both server-emitted labels and CEL filter bindings. Grounded in radar's actual reason vocabulary.
- … `grouping_scope` and a deterministic cluster-local `id = hash(scope, subject key, category)`. `resourceKey` reuses `pkg/audit.ResourceKey` so issues and audit deep-links share one key format.
- `GroupIssues` folds the flat evidence rows into the public model: one row per `id` with `affected` counts + bounded member refs. Grouped by default on `/api/issues` + MCP; the cap now counts issue groups, not replica fan-out.

Notes for review
- `?view=flat` on `/api/issues` returns the raw pre-fold rows for debugging ("what folded into this group?"). MCP stays grouped-only; agents use `get_resource`/`get_events` for raw state.
- `Compose()` stays flat internally, so `summarycontext`'s per-resource index is unchanged; a `Filters.Grouped` flag gates the fold.
- … `members_truncated`.
- … `unknown` deliberately (CronJob/Job/CAPI/PVC-Lost/Node-Cordoned, and categories whose detectors don't exist yet).

Pairs with skyhook-dev/radar-hub#52 (forwards the new fields through the fleet pivot). SPA grouped `IssuesView` is a follow-up.

🤖 Generated with Claude Code