perf(pm): source install scheduler architecture by elrrrrrrr · Pull Request #3029 · utooland/utoo

elrrrrrrr · 2026-05-21T17:34:43Z

Summary

Draft source PR for the installer half of #2948. This is not the final review unit; it is the source branch used to verify the full installer stack in one place.

This source branch now matches the current installer split-stack top (perf/pm-split-installer-scheduler-core), so the split PRs should compose back to this diff exactly.

Covers From #2948

Primitive downloader cache operations.
Sync cloner primitive for rayon workers.
InstallScheduler-owned download/extract/clone inflight and waiter state.
Fresh-lock install scheduling.
Stale-lock resolver event scheduling.
Pipeline worker/receiver cleanup.
Clone/download OnceMap fallback removal from the install hot path.
HTTP tarball install docs updated for scheduler-owned cache resolution.

Actual Split Stack

perf(pm): split downloader cache primitives #3032 perf(pm): split downloader cache primitives (+91 / -38)
perf(pm): add sync cloner primitive #3033 perf(pm): add sync cloner primitive (+225 / -79)
perf(pm): schedule fresh lock installs #3051 perf(pm): schedule fresh lock installs (+490 / -5)
test(pm): cover install scheduler core #3052 test(pm): cover install scheduler core (+94)
perf(pm): route stale installs through scheduler #3050 perf(pm): route stale installs through scheduler (+220 / -270)
perf(pm): remove obsolete install pipeline #3037 perf(pm): remove obsolete install pipeline (-258)

Notes

The scheduler core is split so each PR has a clear review focus: primitives, fresh-lock scheduling, tests, stale-lock event wiring, and final deletion of the old pipeline module. The production PRs avoid dead-code staging and do not use clippy allow escapes.

Validation

Validated at installer split-stack top:

cargo fmt
cargo check -p utoo-pm
cargo test -p utoo-pm service::install_scheduler -- --nocapture (4 passed)
cargo clippy --all-targets -- -D warnings --no-deps

pack-napi warns locally because next.js is a symlink in this worktree; clippy exits successfully.

gemini-code-assist

Code Review

This pull request introduces a new InstallScheduler to manage concurrent package downloads and extractions, replacing the previous pipeline-based implementation. It also increases the manifest fetch concurrency limit for non-semver registries to 256. Feedback focuses on improving memory efficiency and robustness: specifically, tightening backpressure in the download queue to prevent excessive buffering, removing inappropriate panic recovery logic in clone workers, and restoring bounded channels and streaming serialization in the manifest store to prevent memory growth and unnecessary allocations.

gemini-code-assist · 2026-05-21T17:37:40Z

+
+    fn pump_downloads(&mut self) {
+        while self.download_active.len() < self.download_limit
+            && self.extract_backlog() < self.download_limit


The backpressure check self.extract_backlog() < self.download_limit may allow an excessively large number of downloaded tarballs to be buffered in memory if download_limit is high (e.g., 256 for non-semver registries). Since extraction is limited by extract_limit (max 8), the backlog should be constrained by a smaller value relative to the extraction capacity to prevent high memory usage.

Suggested change

&& self.extract_backlog() < self.download_limit

&& self.extract_backlog() < self.extract_limit * 2

gemini-code-assist · 2026-05-21T17:37:40Z

+                let result = std::panic::catch_unwind(std::panic::AssertUnwindSafe(|| {
+                    clone_package_from_cache_sync(
+                        &job.spec.package.name,
+                        &job.spec.package.version,
+                        &job.spec.package.tarball_url,
+                        &job.cache_path,
+                        &job.spec.target,
+                    )
+                    .map_err(|e| format!("{e:#}"))
+                }))
+                .unwrap_or_else(|_| Err("install clone worker panicked".to_string()));


The use of std::panic::catch_unwind here violates the general rule regarding panic handling. Panics should be treated as unrecoverable bugs rather than being caught and converted into errors. If a clone worker panics, it should be allowed to propagate or cause the process to terminate.

let result = clone_package_from_cache_sync( &job.spec.package.name, &job.spec.package.version, &job.spec.package.tarball_url, &job.cache_path, &job.spec.target, ) .map_err(|e| format!("{e:#}"));

References

Do not implement recovery logic for panics. Panics should be treated as unrecoverable bugs that need to be fixed, not as transient, recoverable errors.

gemini-code-assist · 2026-05-21T17:37:40Z

 use std::path::{Path, PathBuf};
 use std::sync::Arc;
-use std::sync::mpsc::{self, SyncSender, TrySendError};
+use std::sync::mpsc::{self, Sender};


Reverting the use of SyncSender and TrySendError is necessary to restore the bounded channel and opportunistic write policy for the manifest store.

Suggested change

use std::sync::mpsc::{self, Sender};

use std::sync::mpsc::{self, SyncSender, TrySendError};

gemini-code-assist · 2026-05-21T17:37:40Z


 struct ManifestWriter {
-    tx: SyncSender<ManifestWriteJob>,
+    tx: Sender<ManifestWriteJob>,


The manifest writer should use a SyncSender to support bounded channel operations and backpressure.

Suggested change

tx: Sender<ManifestWriteJob>,

tx: SyncSender<ManifestWriteJob>,

gemini-code-assist · 2026-05-21T17:37:40Z

 impl ManifestWriter {
    fn spawn() -> Self {
-        let (tx, rx) = mpsc::sync_channel(MANIFEST_WRITE_QUEUE_CAPACITY);
+        let (tx, rx) = mpsc::channel();


Changing the manifest writer channel to an unbounded channel removes protection against memory growth if disk I/O stalls. Restoring a bounded sync_channel allows the store to maintain its opportunistic write policy and drop writes when the backlog is too large.

Suggested change

let (tx, rx) = mpsc::channel();

let (tx, rx) = mpsc::sync_channel(1024);

gemini-code-assist · 2026-05-21T17:37:40Z

+        if self.tx.send(job).is_err() {
+            tracing::debug!("Manifest store writer stopped before accepting write");
        }


The manifest writer should use try_send to maintain its opportunistic nature, dropping writes if the queue is full rather than blocking the resolver or allowing unbounded memory growth.

match self.tx.try_send(job) { Ok(()) => {} Err(TrySendError::Full(_)) => { tracing::debug!("Manifest store writer queue full; dropping cache write"); } Err(TrySendError::Disconnected(_)) => { tracing::debug!("Manifest store writer stopped before accepting write"); } }

gemini-code-assist · 2026-05-21T17:37:40Z

+    let bytes = match serde_json::to_vec(value) {
+        Ok(b) => b,
+        Err(e) => {
+            tracing::debug!("Failed to serialize {path:?}: {e}");
+            return;
+        }
+    };
+    match std::fs::write(path, &bytes) {
        Ok(()) => {}
-        Err(e) if e.kind() == ErrorKind::NotFound => {
+        Err(e) if e.kind() == std::io::ErrorKind::NotFound => {
            if let Some(parent) = path.parent()
-                && let Err(e) = fs::create_dir_all(parent)
+                && let Err(e) = std::fs::create_dir_all(parent)
            {
                tracing::debug!("Failed to create {parent:?}: {e}");
                return;
            }
-            if let Err(e) = write_compact_sync(path, value) {
+            if let Err(e) = std::fs::write(path, &bytes) {
                tracing::debug!("Failed to write {path:?}: {e}");
            }
        }


Using serde_json::to_vec followed by std::fs::write allocates a full Vec<u8> for the serialized manifest, which can be large. Restoring the use of BufWriter and serde_json::to_writer is more memory-efficient as it streams the data directly to the file.

let write_to_file = |p: &std::path::Path| -> std::io::Result<()> { let file = std::fs::File::create(p)?; let mut writer = std::io::BufWriter::new(file); serde_json::to_writer(&mut writer, value).map_err(|e| std::io::Error::new(std::io::ErrorKind::Other, e))?; writer.flush() }; match write_to_file(path) { Ok(()) => {} Err(e) if e.kind() == std::io::ErrorKind::NotFound => { if let Some(parent) = path.parent() && let Err(e) = std::fs::create_dir_all(parent) { tracing::debug!("Failed to create {parent:?}: {e}"); return; } if let Err(e) = write_to_file(path) { tracing::debug!("Failed to write {path:?}: {e}"); } } Err(e) => tracing::debug!("Failed to write {path:?}: {e}"), }

github-actions · 2026-05-22T01:19:04Z

📊 pm-bench-phases · `578d515` · linux (`ubuntu-latest`)

Workflow run — ant-design

PMs: utoo (this branch) · utoo-npm (latest published) · bun (latest)

npmjs.org

p0_full_cold

PM	wall	±σ	user	sys	RSS	pgMinor
bun	9.29s	0.56s	10.42s	10.34s	770M	336.7K
utoo-next	8.03s	0.11s	10.46s	12.24s	970M	125.5K
utoo-npm	9.69s	1.19s	10.90s	13.00s	979M	119.6K
utoo	8.66s	1.81s	10.57s	12.41s	998M	116.8K

PM	vCtx	iCtx	netRX	netTX	cache	node_mod	lock
bun	13.7K	17.9K	1.19G	6M	1.86G	1.75G	1M
utoo-next	108.3K	80.4K	1.16G	5M	1.71G	1.70G	2M
utoo-npm	148.6K	104.1K	1.16G	5M	1.71G	1.70G	2M
utoo	122.7K	77.3K	1.16G	5M	1.71G	1.70G	2M

p1_resolve

PM	wall	±σ	user	sys	RSS	pgMinor
bun	2.04s	0.14s	4.07s	1.08s	513M	171.0K
utoo-next	2.92s	0.01s	5.30s	1.67s	611M	80.3K
utoo-npm	3.23s	0.36s	5.51s	2.05s	614M	79.1K
utoo	2.92s	0.10s	5.28s	1.75s	609M	87.6K

PM	vCtx	iCtx	netRX	netTX	cache	node_mod	lock
bun	8.1K	4.6K	204M	3M	108M	-	1M
utoo-next	45.2K	67.7K	200M	2M	7M	3M	2M
utoo-npm	70.4K	87.6K	200M	2M	7M	3M	2M
utoo	46.3K	67.8K	200M	2M	7M	3M	2M

p3_cold_install

PM	wall	±σ	user	sys	RSS	pgMinor
bun	7.12s	0.10s	6.31s	10.11s	625M	199.9K
utoo-next	7.37s	1.72s	5.04s	11.32s	502M	64.3K
utoo-npm	9.63s	3.21s	5.25s	11.67s	549M	71.0K
utoo	5.74s	0.20s	4.93s	10.87s	456M	59.9K

PM	vCtx	iCtx	netRX	netTX	cache	node_mod	lock
bun	4.5K	6.8K	1019M	3M	1.76G	1.76G	1M
utoo-next	113.9K	50.6K	989M	3M	1.70G	1.70G	2M
utoo-npm	136.6K	49.3K	990M	4M	1.70G	1.70G	2M
utoo	85.6K	55.0K	989M	3M	1.70G	1.70G	2M

p4_warm_link

PM	wall	±σ	user	sys	RSS	pgMinor
bun	3.53s	0.18s	0.20s	2.38s	135M	32.6K
utoo-next	2.21s	0.10s	0.49s	3.79s	78M	17.8K
utoo-npm	2.28s	0.01s	0.47s	3.79s	79M	18.4K
utoo	2.11s	0.15s	0.41s	3.48s	52M	11.6K

PM	vCtx	iCtx	netRX	netTX	cache	node_mod	lock
bun	277	21	5M	18K	1.91G	1.75G	1M
utoo-next	42.2K	19.7K	5K	5K	1.70G	1.70G	2M
utoo-npm	41.0K	18.6K	5K	5K	1.70G	1.70G	2M
utoo	20.8K	13.4K	6K	11K	1.71G	1.70G	2M

npmmirror.com: no output captured.

elrrrrrrr added A-Pkg Manager Area: Package Manager benchmark Run pm-bench on PR labels May 21, 2026

gemini-code-assist Bot reviewed May 21, 2026

View reviewed changes

elrrrrrrr force-pushed the perf/pm-source-installer branch from 8d7b025 to 1b9becd Compare May 21, 2026 17:43

perf(pm): split downloader cache primitives

c466f90

This was referenced May 21, 2026

perf(pm): split downloader cache primitives #3032

Draft

perf(pm): add sync cloner primitive #3033

Draft

perf(pm): remove obsolete install pipeline #3037

Draft

elrrrrrrr force-pushed the perf/pm-source-installer branch from 1b9becd to 22beccd Compare May 21, 2026 20:05

fix(ci): stabilize pm view and web worker checks

90a5182

elrrrrrrr force-pushed the perf/pm-source-installer branch from 22beccd to d826277 Compare May 21, 2026 22:32

ci(pm): make defender setup best effort

c4ebecd

elrrrrrrr force-pushed the perf/pm-source-installer branch from d826277 to 0964dc0 Compare May 21, 2026 23:09

elrrrrrrr added 6 commits May 22, 2026 07:37

ci(pm): make bench cargo cache best effort

47a95c3

perf(pm): add sync cloner primitive

0607957

perf(pm): schedule fresh lock installs

170a95f

test(pm): cover install scheduler core

d279722

perf(pm): add install scheduler wiring

61578f1

perf(pm): route install work through scheduler

c244b67

elrrrrrrr force-pushed the perf/pm-source-installer branch from 0964dc0 to c244b67 Compare May 21, 2026 23:39

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(pm): source install scheduler architecture#3029

perf(pm): source install scheduler architecture#3029
elrrrrrrr wants to merge 9 commits into
nextfrom
perf/pm-source-installer

elrrrrrrr commented May 21, 2026 •

edited

Loading

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot May 21, 2026

Uh oh!

gemini-code-assist Bot May 21, 2026

Uh oh!

gemini-code-assist Bot May 21, 2026

Uh oh!

gemini-code-assist Bot May 21, 2026

Uh oh!

gemini-code-assist Bot May 21, 2026

Uh oh!

gemini-code-assist Bot May 21, 2026

Uh oh!

gemini-code-assist Bot May 21, 2026

Uh oh!

github-actions Bot commented May 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

	&& self.extract_backlog() < self.download_limit
	&& self.extract_backlog() < self.extract_limit * 2

	use std::sync::mpsc::{self, Sender};
	use std::sync::mpsc::{self, SyncSender, TrySendError};

	tx: Sender<ManifestWriteJob>,
	tx: SyncSender<ManifestWriteJob>,

	let (tx, rx) = mpsc::channel();
	let (tx, rx) = mpsc::sync_channel(1024);

Conversation

elrrrrrrr commented May 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Covers From #2948

Actual Split Stack

Notes

Validation

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 21, 2026

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented May 22, 2026

📊 pm-bench-phases · 578d515 · linux (ubuntu-latest)

npmjs.org

p0_full_cold

p1_resolve

p3_cold_install

p4_warm_link

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

elrrrrrrr commented May 21, 2026 •

edited

Loading

📊 pm-bench-phases · `578d515` · linux (`ubuntu-latest`)