perf(#11088): paginate _users queries in purging getRoles()#11089

Open
YASHSHARMAOFFICIALLY wants to merge 2 commits into
medic:master from
YASHSHARMAOFFICIALLY:11088-paginate-getroles-purging

Conversation

@YASHSHARMAOFFICIALLY
Contributor

Summary

  • Paginate getRoles() in sentinel/src/lib/purging.js to process the _users database in batches of 1000 documents instead of loading every user document into memory at once.
  • Add a test verifying pagination behavior across multiple batches.

Fixes #11088

Context

getRoles() calls db.users.allDocs({ include_docs: true }) without a limit, so every document in the _users database is fetched and held in memory at once. In deployments with thousands of community health workers, this creates a significant memory spike during each purging cycle. The rest of the purging module already paginates large data operations (e.g., batchedContactsPurge, purgeUnallocatedRecords); getRoles() was the remaining outlier.

Changes

sentinel/src/lib/purging.js

  • Added USERS_BATCH_SIZE = 1000 constant.
  • Rewrote getRoles() to iterate through _users in paginated batches using startkey and skip for cursor progression.
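
For readers who want to see the shape of the pagination, here is a minimal sketch. It is not the code from this PR. It assumes a PouchDB-style `allDocs()` that accepts `include_docs`, `limit`, `startkey`, and `skip`, and it assumes each batch after the first resumes at the last document ID seen with `skip: 1`. The function `collectRoleSets` and the injected `db` argument are hypothetical names used only for illustration, and the offline-role filtering done by the real getRoles() is left out.

```javascript
// Illustrative sketch only, not the implementation in this PR.
// `db` is any object exposing a PouchDB-style allDocs() (assumption).
const USERS_BATCH_SIZE = 1000;

// Fetch one page of users. After the first page, resume at the last id seen
// and skip that row so it is not processed twice (assumed cursor scheme).
const fetchUsersBatch = (db, startKey, batchSize) => {
  const opts = { include_docs: true, limit: batchSize };
  if (startKey) {
    opts.startkey = startKey;
    opts.skip = 1;
  }
  return db.allDocs(opts);
};

// Hypothetical helper: walks all users page by page and returns the distinct
// role combinations, each as a sorted array.
const collectRoleSets = async (db, batchSize = USERS_BATCH_SIZE) => {
  const roleSets = new Map();
  let startKey;
  let rows;
  do {
    ({ rows } = await fetchUsersBatch(db, startKey, batchSize));
    for (const { doc } of rows) {
      if (!doc || !Array.isArray(doc.roles) || !doc.roles.length) {
        continue;
      }
      // Sort and dedupe so ['a', 'b'] and ['b', 'a'] share one key.
      const sorted = [...new Set(doc.roles)].sort();
      roleSets.set(JSON.stringify(sorted), sorted);
    }
    if (rows.length) {
      startKey = rows[rows.length - 1].id;
    }
  } while (rows.length === batchSize); // a short page means the end was reached
  return [...roleSets.values()];
};
```

With this scheme only one batch of user documents is held in memory at a time. The cost is one extra, empty request when the number of users is an exact multiple of the batch size.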

sentinel/tests/unit/lib/purging.spec.js

  • Updated existing assertion to match the new limit parameter.
  • Added test case "should paginate through users in batches", which verifies multi-batch iteration, the query parameters used for each batch, and role deduplication across batches.

Test plan

  • Existing getRoles tests pass with updated assertions.
  • New pagination test validates correct batching behavior.
  • Full purging test suite passes (75 tests).

getRoles() previously loaded the entire _users database into memory
with a single allDocs({ include_docs: true }) call. In deployments
with thousands of community health workers this creates unnecessary
memory pressure during each purging cycle.

This change paginates the query in batches of 1000 documents,
consistent with how the rest of the purging module handles large
datasets.
… in getRoles

Extract fetchUsersBatch and getOfflineRoles helpers from getRoles to
reduce nesting and cognitive complexity flagged by SonarCloud.
Development

Successfully merging this pull request may close these issues.

getRoles() in purging loads entire _users database into memory without pagination