Skip to content

fix(customresourcestate): log when configured CRD is not installed in cluster#2903

Open
maksimp13 wants to merge 1 commit into
kubernetes:mainfrom
maksimp13:fix/log-missing-crd-issue-2354
Open

fix(customresourcestate): log when configured CRD is not installed in cluster#2903
maksimp13 wants to merge 1 commit into
kubernetes:mainfrom
maksimp13:fix/log-missing-crd-issue-2354

Conversation

@maksimp13
Copy link
Copy Markdown

@maksimp13 maksimp13 commented Mar 19, 2026

What happened

When a resource is listed in the custom resource state config but its CRD is not installed in the cluster, ResolveGVKToGVKPs returns an empty list with no error. Previously this was silently skipped with no indication to the user.

What this PR does

Adds an InfoS log when a configured CRD is not found in the cluster, so users understand why metrics are not being collected for that resource.

How to test

  1. Create a CRS config referencing a CRD that is not installed in the cluster
  2. Run KSM with --custom-resource-state-config-file pointing to that config
  3. Observe the log: CRD for resource is not installed in the cluster, skipping

Ref #2354

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Mar 19, 2026
@k8s-ci-robot
Copy link
Copy Markdown
Contributor

This issue is currently awaiting triage.

If kube-state-metrics contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot
Copy link
Copy Markdown
Contributor

Welcome @maksimp13!

It looks like this is your first PR to kubernetes/kube-state-metrics 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/kube-state-metrics has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot requested review from mrueg and nmn3m March 19, 2026 20:16
@github-project-automation github-project-automation Bot moved this to Needs Triage in SIG Instrumentation Mar 19, 2026
@k8s-ci-robot k8s-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Mar 19, 2026
resolvedSet /* GVKPs */, err := discovererInstance.ResolveGVKToGVKPs(schema.GroupVersionKind(resource.GroupVersionKind))
if err != nil {
klog.ErrorS(err, "failed to resolve GVK", "gvk", resource.GroupVersionKind)
} else if len(resolvedSet) == 0 {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks good to me — adding visibility for silently skipped CRDs is a nice improvement.

One note: this PR adds a log line, but issue #2354 reports that /metrics goes down entirely when non-existent CRDs are listed. A log doesn't fix that behavior, so Fixes #2354 might be inaccurate.
Consider changing it to Ref #2354 unless the endpoint issue has already been resolved separately.

/cc @rexagod

Copy link
Copy Markdown
Author

@maksimp13 maksimp13 Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm.. I wasn't able to reproduce the /metrics crash locally - it seems it was already fixed in a previous versions. The endpoint returns 200 OK even when a configured CRD is not installed in the cluster.

Updated Fixes #2354 → Ref #2354.

@k8s-ci-robot k8s-ci-robot requested a review from rexagod March 20, 2026 02:31
@nmn3m
Copy link
Copy Markdown
Member

nmn3m commented Mar 20, 2026

/lgtm
/assign @mrueg

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 20, 2026
@k8s-ci-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: maksimp13, nmn3m
Once this PR has been reviewed and has the lgtm label, please assign rexagod for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

if err != nil {
klog.ErrorS(err, "failed to resolve GVK", "gvk", resource.GroupVersionKind)
} else if len(resolvedSet) == 0 {
klog.InfoS("CRD for resource is not installed in the cluster, skipping", "gvk", resource.GroupVersionKind)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How often does this log? Only at the start up of ksm or regularly?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The log fires once per CRD informer event (add/update/delete), not on every poll tick. generateMetrics() only runs when WasUpdated == true - set by the informer on any CRD change, reset after each run. So it runs once at startup, then only if the cluster CRD state changes.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves observability for Custom Resource State (CRS) config processing by logging when a configured resource can’t be resolved via CRD discovery, helping users understand why metrics aren’t being produced for that resource.

Changes:

  • Add an InfoS log when ResolveGVKToGVKPs returns an empty resolution set for a configured GVK.
  • Preserve existing error logging behavior when GVK resolution fails.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

if err != nil {
klog.ErrorS(err, "failed to resolve GVK", "gvk", resource.GroupVersionKind)
} else if len(resolvedSet) == 0 {
klog.InfoS("CRD for resource is not installed in the cluster, skipping", "gvk", resource.GroupVersionKind)
Copy link

Copilot AI Apr 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

len(resolvedSet) == 0 can also happen when the configured G/V/K doesn’t match what’s in discovery (e.g., wrong version or kind), not only when the CRD is missing. Consider rewording this log to “no matching CRD found for configured GVK, skipping” (or similar) so it stays accurate.

Suggested change
klog.InfoS("CRD for resource is not installed in the cluster, skipping", "gvk", resource.GroupVersionKind)
klog.InfoS("no matching CRD found for configured GVK, skipping", "gvk", resource.GroupVersionKind)

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@maksimp13 can you address this comment?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.

Projects

Status: Needs Triage

Development

Successfully merging this pull request may close these issues.

5 participants