fix(customresourcestate): log when configured CRD is not installed in cluster #2903
maksimp13 wants to merge 1 commit
    resolvedSet /* GVKPs */, err := discovererInstance.ResolveGVKToGVKPs(schema.GroupVersionKind(resource.GroupVersionKind))
    if err != nil {
        klog.ErrorS(err, "failed to resolve GVK", "gvk", resource.GroupVersionKind)
    } else if len(resolvedSet) == 0 {
The change looks good to me — adding visibility for silently skipped CRDs is a nice improvement.
One note: this PR adds a log line, but issue #2354 reports that /metrics goes down entirely when non-existent CRDs are listed. A log doesn't fix that behavior, so Fixes #2354 might be inaccurate.
Consider changing it to Ref #2354 unless the endpoint issue has already been resolved separately.
/cc @rexagod
/lgtm
    if err != nil {
        klog.ErrorS(err, "failed to resolve GVK", "gvk", resource.GroupVersionKind)
    } else if len(resolvedSet) == 0 {
        klog.InfoS("CRD for resource is not installed in the cluster, skipping", "gvk", resource.GroupVersionKind)
How often does this log? Only at startup of ksm, or regularly?
The log fires once per CRD informer event (add/update/delete), not on every poll tick. generateMetrics() only runs when WasUpdated == true - set by the informer on any CRD change, reset after each run. So it runs once at startup, then only if the cluster CRD state changes.
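The gating described in this reply can be illustrated with a minimal sketch. The names `crdWatcher`, `onCRDEvent`, and `poll` are hypothetical and are not taken from the kube-state-metrics source; the sketch only models a flag that the informer sets and the poll loop clears, and it assumes that startup counts as the first update.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// crdWatcher is a hypothetical stand-in for the discoverer: informer
// callbacks set the flag, and the poll loop clears it after regenerating.
type crdWatcher struct {
	wasUpdated atomic.Bool
	runs       int // number of times regeneration actually ran
}

// onCRDEvent models an informer add/update/delete callback.
func (w *crdWatcher) onCRDEvent() { w.wasUpdated.Store(true) }

// poll models one tick of the polling loop. Regeneration (and therefore any
// "skipping" log line) happens only if a CRD event arrived since the last run.
func (w *crdWatcher) poll() bool {
	// CompareAndSwap clears the flag and reports whether it was set.
	if !w.wasUpdated.CompareAndSwap(true, false) {
		return false
	}
	w.runs++
	return true
}

func main() {
	w := &crdWatcher{}
	w.onCRDEvent() // assumption: startup behaves like a first update
	for tick := 0; tick < 3; tick++ {
		regenerated := w.poll()
		fmt.Printf("tick %d regenerated=%v\n", tick, regenerated)
	}
	w.onCRDEvent() // a CRD is added, updated, or deleted
	regenerated := w.poll()
	fmt.Printf("after CRD change regenerated=%v total runs=%d\n", regenerated, w.runs)
}
```

In this sketch, several events that arrive between two ticks coalesce into a single run; whether the real implementation behaves the same way is not stated in the thread.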
Pull request overview
This PR improves observability for Custom Resource State (CRS) config processing by logging when a configured resource can’t be resolved via CRD discovery, helping users understand why metrics aren’t being produced for that resource.
Changes:
- Add an `InfoS` log when `ResolveGVKToGVKPs` returns an empty resolution set for a configured GVK.
- Preserve existing error logging behavior when GVK resolution fails.
    if err != nil {
        klog.ErrorS(err, "failed to resolve GVK", "gvk", resource.GroupVersionKind)
    } else if len(resolvedSet) == 0 {
        klog.InfoS("CRD for resource is not installed in the cluster, skipping", "gvk", resource.GroupVersionKind)
len(resolvedSet) == 0 can also happen when the configured G/V/K doesn’t match what’s in discovery (e.g., wrong version or kind), not only when the CRD is missing. Consider rewording this log to “no matching CRD found for configured GVK, skipping” (or similar) so it stays accurate.
Suggested change:

    - klog.InfoS("CRD for resource is not installed in the cluster, skipping", "gvk", resource.GroupVersionKind)
    + klog.InfoS("no matching CRD found for configured GVK, skipping", "gvk", resource.GroupVersionKind)
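To make the reviewer's point concrete, here is a small sketch of the three outcomes a resolver of this shape can produce. The types `gvk` and `discovered` and the functions `resolve` and `classify` are hypothetical simplifications, not the real `ResolveGVKToGVKPs` or `schema.GroupVersionKind`; the sketch only assumes that an empty result with a nil error means "nothing matched".

```go
package main

import (
	"errors"
	"fmt"
)

// gvk is a simplified, hypothetical stand-in for schema.GroupVersionKind.
type gvk struct{ Group, Version, Kind string }

// discovered is a hypothetical in-memory view of what discovery returned.
type discovered map[gvk]bool

var errDiscovery = errors.New("discovery unavailable")

// resolve mimics a resolver that returns matches plus an error.
// An empty result with a nil error means "nothing matched", for any reason.
func (d discovered) resolve(want gvk) ([]gvk, error) {
	if d == nil {
		return nil, errDiscovery
	}
	var out []gvk
	if d[want] {
		out = append(out, want)
	}
	return out, nil
}

// classify reports which of the three branches a configured GVK falls into.
func classify(d discovered, want gvk) string {
	resolved, err := d.resolve(want)
	switch {
	case err != nil:
		return "error"
	case len(resolved) == 0:
		return "no matching CRD found"
	default:
		return "resolved"
	}
}

func main() {
	cluster := discovered{{Group: "example.com", Version: "v1", Kind: "Foo"}: true}
	fmt.Println(classify(cluster, gvk{"example.com", "v1", "Foo"})) // exact match
	fmt.Println(classify(cluster, gvk{"example.com", "v2", "Foo"})) // CRD installed, wrong version
	fmt.Println(classify(cluster, gvk{"other.io", "v1", "Bar"}))    // CRD not installed at all
	fmt.Println(classify(nil, gvk{"example.com", "v1", "Foo"}))     // resolution itself failed
}
```

The second and third calls land in the same branch, which is why a message that only mentions a missing CRD can mislead when the actual cause is a mistyped version or kind.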
What happened
When a resource is listed in the custom resource state config but its CRD is not installed in the cluster, `ResolveGVKToGVKPs` returns an empty list with no error. Previously this was silently skipped with no indication to the user.

What this PR does
Adds an `InfoS` log when a configured CRD is not found in the cluster, so users understand why metrics are not being collected for that resource.

How to test
- Create a custom resource state config that lists a resource whose CRD is not installed in the cluster
- Run kube-state-metrics with `--custom-resource-state-config-file` pointing to that config
- Check the logs for `CRD for resource is not installed in the cluster, skipping`

Ref #2354