perf(api): reduce DB load in scan hot loop by 13x#11249
Conversation
- Pre-resolve Resources and ResourceTags in bulk before the per-finding loop. - Replace `Resource.upsert_or_delete_tags` with deferred `ResourceTagMapping.bulk_create(ignore_conflicts=True)` at end of batch (eliminates the per-mapping `SELECT FOR UPDATE`). - Wrap the entire micro-batch in a single `rls_transaction` (was 2N); deadlock retry now per-batch. - Populate `Finding.resource_regions/services/types` on INSERT, dropping the post-INSERT `bulk_update`. - Raise `SCAN_DB_BATCH_SIZE` from 500 to 1000. - Add `update_fields=[...]` to `Scan`/`Provider` saves; throttle progress saves to 1% delta or 10s. - Preserve findings with empty `resource_uid` (IaC scans, some Azure/GCP/K8s). Measured (3000 findings per micro-batch): - Wall-clock 20.8s -> 1.57s (13.2x) - COMMITs 6003 -> 2 - SELECT FOR UPDATE on resource_tag_mappings 15000 -> 0 No schema changes. No migrations. Micro-batches are now atomic: errors previously masked by per-finding SAVEPOINTs may surface in logs.
|
✅ Conflict Markers Resolved All conflict markers have been successfully resolved in this pull request. |
🔒 Container Security ScanImage: 📊 Vulnerability Summary
12 package(s) affected
|
|
✅ All necessary |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #11249 +/- ##
==========================================
- Coverage 93.97% 93.95% -0.02%
==========================================
Files 237 237
Lines 34829 34877 +48
==========================================
+ Hits 32729 32770 +41
- Misses 2100 2107 +7
Flags with carried forward coverage won't be shown. Click here to find out more.
🚀 New features to boost your workflow:
|
23e53a2 to
8623129
Compare
| for m in created_tag_mappings: | ||
| if m.pk is not None: |
There was a problem hiding this comment.
In created_tag_mappings you are using ignore_conflicts=True.
That means that the database does not return the IDs of the inserted rows, so we won't enter to this If right?
There was a problem hiding this comment.
Good catch. Fixed with a pre-SELECT of existing (resource_id, tag_id) pairs, so updated_at is bumped only on resources that actually gain a mapping
| inserted = sum(1 for m in created_mappings if m.pk) | ||
| if inserted != len(mappings_to_create): | ||
| logger.error( | ||
| f"scan {scan_instance.id}: expected " | ||
| f"{len(mappings_to_create)} ResourceFindingMapping rows, " | ||
| f"inserted {inserted}. Rolling back micro-batch." | ||
| ) |
There was a problem hiding this comment.
Because the other comment, if the pk is None, inserted will be 0.
We will get excpect N rows, inserted 0.
There was a problem hiding this comment.
You're right: ignore_conflicts=True does not populate pk, so this branch fires on every successful batch. It's pre-existing from #10724 though, keeping it as-is to scope this PR to the perf rewrite and opening a separate fix for the silent-failure detection
Context
The per-finding loop in
tasks/jobs/scan.pywas the dominant bottleneck ofperform_prowler_scan: for each finding, it issued multipleSELECT FOR UPDATEagainstresource_tag_mappings, opened a transaction per item, and re-fetchedResource/ResourceTagrows that were already known. For large scans this produced thousands of round-trips and lock contention, slowing the hot loop and increasing DB load.This PR rewrites the micro-batch path to be set-oriented and atomic, with no schema changes.
Description
Changes in
api/src/backend/tasks/jobs/scan.py:ResourceandResourceTagrows in bulk before the per-finding loop, instead of per item.Resource.upsert_or_delete_tags(which issuedSELECT FOR UPDATEper mapping) with deferredResourceTagMapping.bulk_create(ignore_conflicts=True)executed once at the end of the batch.rls_transaction(was 2N). Deadlock retry now operates at the batch level.Finding.resource_regions,resource_servicesandresource_typesdirectly on INSERT, removing the post-INSERTbulk_updatepass.SCAN_DB_BATCH_SIZEfrom 500 to 1000.update_fields=[...]toScan/Providersaves to avoid full-row writes.resource_uid(IaC scans, some Azure/GCP/K8s findings).No schema changes. No migrations. Behaviorally, the micro-batch is now atomic: errors that were previously masked by per-finding SAVEPOINTs may now surface in logs (the batch is retried on deadlock).
Measured impact (3000 findings per micro-batch)
COMMITcountSELECT FOR UPDATEonresource_tag_mappingsSteps to review
api/src/backend/tasks/jobs/scan.pyend-to-end; the change is concentrated in a single file.Resource/ResourceTaglookups previously done inside the loop.ResourceTagMapping.bulk_create(ignore_conflicts=True)correctly replaces the previous per-mapping upsert path (idempotent on retry).rls_transactionper micro-batch instead of per finding; confirm the deadlock retry path still re-runs the whole batch safely.resource_uidfindings (IaC, some Azure/GCP/K8s) are kept and stored.Checklist
Community Checklist
SDK/CLI
API
License
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.