perf: prefetch next ListObjects page while consumer iterates current page#236
perf: prefetch next ListObjects page while consumer iterates current page#236jiuker wants to merge 7 commits into
Conversation
…page perf: prefetch next ListObjects page while consumer iterates current page
|
Warning Review limit reached
More reviews will be available in 18 minutes. Learn how PR review limits work. To continue reviewing without waiting, enable usage-based billing in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits. 🚦 How do rate limits work?CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability. For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Plus Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthrough
ChangesAsync prefetch for ListObjects pagination
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 6
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@include/miniocpp/client.h`:
- Around line 60-62: The boolean conversion in ListObjectsResult is missing
const qualification, which breaks use on const instances. Update the explicit
operator bool() in the relevant class to be const again, or add a const overload
that preserves the same semantics, while keeping the existing behavior around
Populate() and itr_ intact.
- Line 43: The ListObjectsResult copying logic is unsafe because itr_ remains
tied to the source resp_.contents and the shared one-shot future can be consumed
incorrectly by postfix or user copies. Fix this in ListObjectsResult by either
making the type move-only and adjusting postfix operator++/other copy paths
accordingly, or implement a real copy that rebases the iterator onto the new
resp_.contents and uses a shared_future-style prefetch mechanism so copies do
not lose the prefetch or dereference the wrong container.
In `@include/miniocpp/response.h`:
- Around line 301-304: Avoid default-copying ListObjectsResponse because its
view members can stay bound to another instance’s owned_ storage and dangle
after ListObjectsResult::operator++(int) returns by value. Update
ListObjectsResponse so copy construction and copy assignment are either deleted
or explicitly rebuild owned_ and rebind all dependent views (including
Item::etag and any URL-decoded fields) from the copied data; keep move
operations if they remain safe.
In `@src/client.cc`:
- Around line 197-209: The comments around the prefetch handling in client.cc
are redundant because they merely restate the code in the surrounding block.
Remove the obvious narration comments in the section that uses prefetch_future_
and initializes itr_, keeping the logic in the existing methods unchanged and
leaving only comments that add non-obvious context.
- Around line 171-188: The region resolved in the object listing flow is only
stored in a local variable and never passed along, so the
ListObjectsV1/ListObjectsV2/ListObjectVersions requests keep using an empty or
stale region. Update the request preparation in this branch to assign the
resolved value to next_args.region before constructing the ListObjects*Args
objects, using the existing GetRegion and next_args flow so all subsequent calls
carry the correct region.
- Around line 167-193: The async prefetch lambda in client.cc is capturing the
owning object via this, which can become dangling if the result outlives the
origin. Update the std::async lambda inside the prefetch setup to capture a
Client* (or equivalent stable client handle) instead of this, and use that
pointer for GetRegion, ListObjectVersions, ListObjectsV1, and ListObjectsV2
calls. Keep the ListObjectsResult prefetch logic unchanged otherwise, but ensure
the lambda no longer depends on the lifetime of the parent result object.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: fb5290d1-4478-4ff3-b6da-93f5e324036a
📒 Files selected for processing (3)
include/miniocpp/client.hinclude/miniocpp/response.hsrc/client.cc
apply suggestion
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/client.cc (2)
171-192: 🎯 Functional Correctness | 🟡 Minor | ⚡ Quick winPreserve the original
GetRegionfailure.The failure path drops the actual
GetRegionResponseand replaces it with a generic error, so callers lose the original status/message.Proposed fix
- std::string region; - if (GetRegionResponse resp = - client->GetRegion(next_args.bucket, next_args.region)) { - next_args.region = resp.region; + GetRegionResponse region_resp = + client->GetRegion(next_args.bucket, next_args.region); + if (region_resp) { + next_args.region = region_resp.region; if (next_args.recursive) { next_args.delimiter = ""; } else if (next_args.delimiter.empty()) { next_args.delimiter = "/"; } @@ return client->ListObjectsV2(ListObjectsV2Args(next_args)); } } - return ListObjectsResponse( - error::make<ListObjectsResponse>("unable to get region")); + return ListObjectsResponse(region_resp);🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/client.cc` around lines 171 - 192, The region lookup failure in the client listing flow is being collapsed into a generic error, which discards the original GetRegionResponse details. Update the GetRegion handling in client.cc so the failure path preserves and returns the original response/status from client->GetRegion(next_args.bucket, next_args.region) instead of constructing a new generic ListObjectsResponse error. Keep the existing success path intact for ListObjectVersions, ListObjectsV1, and ListObjectsV2.
167-200: 🩺 Stability & Availability | 🟠 Major | ⚡ Quick winHandle async failures in
ListObjectsResultLaunch failures fromstd::async(std::launch::async, ...)throwstd::system_errorat the call site, and exceptions from the prefetch task are rethrown byfuture::get()inPopulate(). These bypass theListObjectsResponseerror path; convert both into response errors instead.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/client.cc` around lines 167 - 200, Handle async failures in ListObjectsResult by converting both std::async launch errors and task exceptions into ListObjectsResponse errors instead of letting them escape. In the prefetch setup around prefetch_future_ creation, wrap the std::async call and the lambda body in error handling that returns a failing ListObjectsResponse via error::make rather than throwing std::system_error or other exceptions. In ListObjectsResult::Populate(), catch exceptions from future::get() and convert them to resp_ error responses so all failures flow through the existing response error path.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@include/miniocpp/client.h`:
- Around line 54-57: Keep ListObjectsResult source-compatible for synchronous
callers by avoiding the API-breaking removal of copy semantics and the postfix
increment contract. Update ListObjectsResult so callers can still copy or
snapshot results safely, and preserve operator++(int) behavior expected by
existing code that uses result++. If the change must remain breaking, explicitly
document it as an API change and adjust the affected ListObjectsResult methods
accordingly.
---
Outside diff comments:
In `@src/client.cc`:
- Around line 171-192: The region lookup failure in the client listing flow is
being collapsed into a generic error, which discards the original
GetRegionResponse details. Update the GetRegion handling in client.cc so the
failure path preserves and returns the original response/status from
client->GetRegion(next_args.bucket, next_args.region) instead of constructing a
new generic ListObjectsResponse error. Keep the existing success path intact for
ListObjectVersions, ListObjectsV1, and ListObjectsV2.
- Around line 167-200: Handle async failures in ListObjectsResult by converting
both std::async launch errors and task exceptions into ListObjectsResponse
errors instead of letting them escape. In the prefetch setup around
prefetch_future_ creation, wrap the std::async call and the lambda body in error
handling that returns a failing ListObjectsResponse via error::make rather than
throwing std::system_error or other exceptions. In
ListObjectsResult::Populate(), catch exceptions from future::get() and convert
them to resp_ error responses so all failures flow through the existing response
error path.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 8a90d3eb-c294-4320-add0-ea6dab3aa9a9
📒 Files selected for processing (3)
include/miniocpp/client.hinclude/miniocpp/response.hsrc/client.cc
apply suggestoin
apply suggestion
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (1)
include/miniocpp/client.h (1)
43-43: 🎯 Functional Correctness | 🟠 Major | 🏗️ Heavy liftDo not share a one-shot
std::futureacross copyable results.
ListObjectsResultis copyable again, but copies and postfixoperator++still share the samestd::future; the first copy that callsPopulate()consumes it, leaving the other copy unable to fetch the prefetched page. Use a copy-safe shared state, such as a future ofshared_ptr<ListObjectsResponse>or another mechanism where each copied result can materialize the same prefetched page.#!/bin/bash set -euo pipefail sed -n '36,79p' include/miniocpp/client.h sed -n '211,231p' src/client.ccAlso applies to: 54-55, 74-78
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@include/miniocpp/client.h` at line 43, ListObjectsResult is copyable again, but prefetch_future_ still holds a one-shot future that gets consumed by the first copy calling Populate(), so copied results and postfix operator++ can no longer fetch the same prefetched page. Update the ListObjectsResult/Populate flow in client.h and src/client.cc to use copy-safe shared state instead of std::future<ListObjectsResponse>, such as a shared_ptr-backed prefetched response or a future of shared_ptr<ListObjectsResponse>, so every copy can materialize the same page.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/client.cc`:
- Around line 156-164: ListObjectsResult::UpdatePaginationArgs is using
include_versions to decide when to advance version-listing pagination, but
StartPrefetch routes version listing based on version_id_marker, so both paths
must use the same predicate. Update UpdatePaginationArgs to advance key/version
markers whenever the request is in version-listing mode, matching the logic in
StartPrefetch and the ListObjectVersions flow, and apply the same change
anywhere else the pagination update logic is duplicated.
- Around line 142-153: Initialize itr_ before calling StartPrefetch() in both
ListObjectsResult constructors so const operator bool() does not read an
uninitialized iterator. Set itr_ to resp_->contents.end() after resp_ is created
in each constructor, and keep the existing prefetch flow unchanged. Refer to
ListObjectsResult and StartPrefetch when updating the constructor initialization
path.
---
Duplicate comments:
In `@include/miniocpp/client.h`:
- Line 43: ListObjectsResult is copyable again, but prefetch_future_ still holds
a one-shot future that gets consumed by the first copy calling Populate(), so
copied results and postfix operator++ can no longer fetch the same prefetched
page. Update the ListObjectsResult/Populate flow in client.h and src/client.cc
to use copy-safe shared state instead of std::future<ListObjectsResponse>, such
as a shared_ptr-backed prefetched response or a future of
shared_ptr<ListObjectsResponse>, so every copy can materialize the same page.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 6c110e9d-d654-4bd1-9b9b-e27061e1cee1
📒 Files selected for processing (2)
include/miniocpp/client.hsrc/client.cc
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/client.cc (1)
142-155: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winPopulate the first page before returning from synchronous
ListObjects.These constructors now return with
itr_pinned to the empty placeholder response. That makesconst ListObjectsResult result = client.ListObjects(args); if (result)reportfalseeven when the listing has objects, because the const bool overload cannot triggerPopulate(). Start the prefetch, then consume the first page before returning;Populate()will still schedule the next-page prefetch when truncated.Proposed fix
resp_ = std::make_shared<ListObjectsResponse>(); itr_ = resp_->contents.end(); StartPrefetch(); + Populate(); } @@ resp_ = std::make_shared<ListObjectsResponse>(); itr_ = resp_->contents.end(); StartPrefetch(); + Populate(); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/client.cc` around lines 142 - 155, The synchronous ListObjects path leaves ListObjectsResult initialized with only the empty placeholder response, so a const result can evaluate false even when objects exist. Update the ListObjectsResult constructors in src/client.cc so that after StartPrefetch() they also populate the first page before returning, using Populate() (or equivalent first-page consumption) while preserving the next-page prefetch behavior when the response is truncated. Keep the fix localized to the ListObjectsResult construction flow and the bool/operator logic it feeds.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@src/client.cc`:
- Around line 142-155: The synchronous ListObjects path leaves ListObjectsResult
initialized with only the empty placeholder response, so a const result can
evaluate false even when objects exist. Update the ListObjectsResult
constructors in src/client.cc so that after StartPrefetch() they also populate
the first page before returning, using Populate() (or equivalent first-page
consumption) while preserving the next-page prefetch behavior when the response
is truncated. Keep the fix localized to the ListObjectsResult construction flow
and the bool/operator logic it feeds.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 031dc2df-57b8-4a0c-9baa-a10e14e5cbd3
📒 Files selected for processing (2)
include/miniocpp/client.hsrc/client.cc
apply suggestion
fix #218
Summary by CodeRabbit
ListObjectsResultiterator validity behavior (truthiness and increment timing).