Skip to content

perf(pm): add resolver demand mainloop#3043

Draft
elrrrrrrr wants to merge 1 commit into
perf/pm-split-resolver-warm-cache-configfrom
perf/pm-split-resolver-demand-mainloop
Draft

perf(pm): add resolver demand mainloop#3043
elrrrrrrr wants to merge 1 commit into
perf/pm-split-resolver-warm-cache-configfrom
perf/pm-split-resolver-demand-mainloop

Conversation

@elrrrrrrr
Copy link
Copy Markdown
Contributor

@elrrrrrrr elrrrrrrr commented May 21, 2026

Summary

  • replace preload+sequential registry resolution with a demand-driven BFS manifest mainloop
  • keep manifest inflight/cache/waiter state in the resolver loop instead of spawned provider jobs
  • prioritize demand manifest jobs over transitive prefetch work
  • export resolver-owned version manifests back into the project cache
  • stop PM API from pre-populating UnifiedRegistry memory cache; warm cache now flows through resolver config

Review Notes

This is the remaining core scheduling change in the resolver stack. I tried splitting it further along semver demand first / full-manifest later, but that creates a bad intermediate state: semver and npmjs would use different cache ownership paths, and project-cache output would come from two sources. The current boundary keeps the invariant intact: once this PR lands, resolver mainloop owns BFS, inflight dedupe, warm-cache reads, and project-cache writes for both semver and full-manifest registries.

Suggested review entry points:

  • ManifestState and FetchQueues: mainloop-owned cache, waiter, and priority state
  • run_main_loop_bfs: BFS scheduling and demand-vs-prefetch behavior
  • build_deps_with_config_output + service/api.rs: project-cache ownership transfer

Validation

  • cargo fmt
  • cargo check -p utoo-ruborist
  • cargo test -p utoo-ruborist resolver::builder::tests::test_build -- --nocapture
  • cargo test -p utoo-ruborist resolver::builder::tests::test_resolve_high_level_api -- --nocapture
  • cargo test -p utoo-ruborist service::api::tests::test_build_deps_options_creation -- --nocapture
  • cargo test -p utoo-ruborist --lib
  • cargo clippy --all-targets -- -D warnings --no-deps

Split Plan

Part of the resolver stack split from source PR #3028. Follow-up PRs keep tests, obsolete preload removal, registry cache ownership cleanup, and docs separate so this PR contains only the behavior switch.

@elrrrrrrr elrrrrrrr added A-Pkg Manager Area: Package Manager benchmark Run pm-bench on PR labels May 21, 2026
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the dependency resolver from a two-phase (preload and build) approach to a demand-driven BFS resolution strategy. The new implementation manages breadth-first traversal, manifest caching, and inflight de-duplication within the builder, while provider tasks handle specific manifest jobs. Feedback identifies a critical missing trait bound for RegistryClient that will cause compilation errors. Additionally, there are performance concerns regarding $O(N)$ operations within the main resolver loop for tracking demand jobs and active prefetches, as well as potential stalling of the main loop when processing non-registry dependencies synchronously.

Comment on lines +1358 to +1362
where
R: ManifestProvider,
R::Error: Send,
E: EventReceiver,
{
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The run_main_loop_bfs function (and its callers like build_deps_with_config_output) is missing the RegistryClient trait bound for R. This will cause a compilation error because process_dependency (called at line 1408) and handle_resolved_registry_manifest (called at line 1440) both require R: RegistryClient. Additionally, registry_error (called at line 1435) requires R::Error: From<RegistryError>, which is a requirement of the RegistryClient trait.

where
    R: ManifestProvider + RegistryClient,
    R::Error: Send,
    E: EventReceiver,

Comment on lines +854 to +862
let prefetch_concurrency = if self
.queued
.values()
.any(|priority| *priority == FetchPriority::Demand)
{
prefetch_concurrency
} else {
usize::MAX
};
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The check self.queued.values().any(|priority| *priority == FetchPriority::Demand) is an $O(N)$ operation where $N$ is the number of queued jobs. Since this is called inside pump_fetches which is invoked frequently within the resolver main loop, this will lead to $O(N^2)$ performance degradation on large dependency graphs. It is better to maintain a simple counter for queued demand jobs.

Comment on lines +1408 to +1410
let processed = process_dependency(graph, registry, parent, &edge, config)
.await
.map_err(|inner| chain_err(graph, parent, &edge, inner))?;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Processing non-registry dependencies (git, http, file) via a blocking .await inside the while let Some(...) = level_pending.pop_front() loop will stall the entire resolver main loop. While this dependency is being resolved, no other manifest fetches in the fetches stream will be polled, effectively serializing resolution for these types and potentially leaving network/CPU resources underutilized. Consider scheduling these as ManifestJob variants or using a non-blocking approach to keep the main loop pumping.

Comment on lines +887 to +892
fn active_prefetches(&self) -> usize {
self.active
.values()
.filter(|priority| **priority == FetchPriority::Prefetch)
.count()
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The active_prefetches method iterates over all active jobs ($O(M)$), which is called in a loop inside pump_fetches. While $M$ is capped by concurrency, maintaining a simple counter would be more efficient and idiomatic for a high-performance resolver loop.

@elrrrrrrr elrrrrrrr force-pushed the perf/pm-split-resolver-demand-mainloop branch from 54b55da to 2948997 Compare May 21, 2026 19:21
@elrrrrrrr elrrrrrrr force-pushed the perf/pm-split-resolver-demand-mainloop branch from 2948997 to c4361f4 Compare May 21, 2026 22:32
@elrrrrrrr elrrrrrrr force-pushed the perf/pm-split-resolver-warm-cache-config branch 2 times, most recently from c1d84fd to 17611a9 Compare May 21, 2026 23:09
@elrrrrrrr elrrrrrrr force-pushed the perf/pm-split-resolver-demand-mainloop branch from c4361f4 to 13bb300 Compare May 21, 2026 23:09
@elrrrrrrr elrrrrrrr force-pushed the perf/pm-split-resolver-warm-cache-config branch from 17611a9 to 32ef04c Compare May 21, 2026 23:39
@elrrrrrrr elrrrrrrr force-pushed the perf/pm-split-resolver-demand-mainloop branch from 13bb300 to 1efa749 Compare May 21, 2026 23:39
@github-actions
Copy link
Copy Markdown

📊 pm-bench-phases · 57a1d24 · linux (ubuntu-latest)

Workflow run — ant-design

PMs: utoo (this branch) · utoo-npm (latest published) · bun (latest)

npmjs.org

p0_full_cold

PM wall ±σ user sys RSS pgMinor
bun 8.84s 0.09s 10.28s 9.97s 712M 339.4K
utoo-next 8.04s 0.11s 10.44s 12.18s 982M 125.3K
utoo-npm 8.49s 1.13s 10.73s 12.51s 970M 126.1K
utoo 8.90s 1.17s 11.24s 12.49s 960M 148.1K
PM vCtx iCtx netRX netTX cache node_mod lock
bun 13.9K 17.7K 1.19G 6M 1.86G 1.75G 1M
utoo-next 107.1K 71.8K 1.16G 5M 1.71G 1.70G 2M
utoo-npm 134.6K 94.5K 1.16G 5M 1.71G 1.70G 2M
utoo 114.2K 66.9K 1.08G 5M 1.71G 1.70G 2M

p1_resolve

PM wall ±σ user sys RSS pgMinor
bun 1.95s 0.04s 4.01s 1.00s 511M 157.1K
utoo-next 2.84s 0.04s 5.33s 1.65s 612M 82.1K
utoo-npm 3.02s 0.06s 5.52s 2.00s 605M 77.7K
utoo 2.49s 0.19s 5.97s 1.60s 642M 118.4K
PM vCtx iCtx netRX netTX cache node_mod lock
bun 8.1K 4.5K 202M 3M 107M - 1M
utoo-next 45.6K 66.6K 199M 2M 7M 3M 2M
utoo-npm 69.3K 88.7K 199M 2M 7M 3M 2M
utoo 14.3K 18.9K 202M 3M 7M 3M 2M

p3_cold_install

PM wall ±σ user sys RSS pgMinor
bun 6.82s 0.42s 6.28s 9.73s 610M 206.3K
utoo-next 6.04s 0.40s 4.95s 10.72s 467M 61.5K
utoo-npm 8.14s 1.89s 5.15s 11.18s 482M 60.6K
utoo 5.90s 0.24s 4.87s 10.62s 473M 61.9K
PM vCtx iCtx netRX netTX cache node_mod lock
bun 4.4K 6.3K 1018M 3M 1.76G 1.76G 1M
utoo-next 92.6K 47.1K 988M 2M 1.70G 1.70G 2M
utoo-npm 124.4K 51.8K 988M 3M 1.70G 1.70G 2M
utoo 87.0K 51.0K 987M 2M 1.70G 1.70G 2M

p4_warm_link

PM wall ±σ user sys RSS pgMinor
bun 3.39s 0.04s 0.17s 2.42s 134M 31.5K
utoo-next 2.30s 0.21s 0.49s 3.79s 80M 18.4K
utoo-npm 2.19s 0.06s 0.49s 3.78s 81M 18.9K
utoo 2.22s 0.13s 0.51s 3.79s 80M 18.6K
PM vCtx iCtx netRX netTX cache node_mod lock
bun 262 20 29K 37K 1.87G 1.74G 1M
utoo-next 41.9K 19.1K 306K 8K 1.70G 1.70G 2M
utoo-npm 43.0K 20.4K 306K 11K 1.70G 1.70G 2M
utoo 43.3K 19.9K 306K 26K 1.71G 1.70G 2M

npmmirror.com: no output captured.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-Pkg Manager Area: Package Manager benchmark Run pm-bench on PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant