Remove cpu limit for rayservice e2e test #4859
Changes from all commits
7ff6508
2b815be
d8248a1
e795b3b
5a93c83
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -43,7 +43,6 @@ spec: | |
| cpu: "1" | ||
| memory: 1G | ||
| limits: | ||
| cpu: "1" | ||
|
Cursor Bugbot (Medium Severity): Worker CPU limit not removed in autoscaling YAML. The CPU limit ( Reviewed by Cursor Bugbot for commit 5a93c83.
Author (Contributor): see #4859 (comment) |
||
| memory: 2G | ||
| ports: | ||
| - containerPort: 6379 | ||
|
|
||


Why add PearStand?
According to Ray’s official core spec, actors default to `num_cpus=1` for scheduling if not explicitly specified.

Because `PearStand` was defined in the graph but omitted in our `serveConfigV2`, it didn't get any custom `ray_actor_options`, so Ray automatically assigned it the default 1 CPU token.

Previously, this was masked because our head node had `limits.cpu: 2` (which made KubeRay pass `--num-cpus=2` to Ray). Now that we removed the limit, KubeRay falls back to using `requests.cpu: 1`. With only 1 total CPU token available in Ray, PearStand's default 1-CPU demand broke the budget and caused the scheduling failure.

Adding `PearStand` here with `num_cpus=0.1` explicitly overrides Ray's 1-CPU default and aligns it with other deployments.

See the controller log showing `PearStand` failed to schedule with only 0.4 CPU available:
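
As a side note on the fix itself, the override could look roughly like the fragment below. This is a sketch and not the PR's actual diff: only the `PearStand` name and `num_cpus: 0.1` are taken from this thread, and the entry is assumed to sit in the `deployments` list of an application inside `serveConfigV2`.

```yaml
# Hypothetical excerpt from an application's deployments list in serveConfigV2.
# Only the deployment name and the 0.1 value come from the discussion above.
deployments:
  - name: PearStand
    ray_actor_options:
      # Without this entry, the deployment's actor requests Ray's default of 1 CPU,
      # which cannot be scheduled when only 0.4 CPU is left on the head node.
      num_cpus: 0.1
```

With the explicit value, the deployment requests 0.1 CPU and fits within the 0.4 CPU that remained available when scheduling failed.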