Skip to content

Prove safety of RawBlock and support multi-threaded usages#3331

Open
adri326 wants to merge 4 commits into
mthom:masterfrom
adri326:rawblock-safety
Open

Prove safety of RawBlock and support multi-threaded usages#3331
adri326 wants to merge 4 commits into
mthom:masterfrom
adri326:rawblock-safety

Conversation

@adri326
Copy link
Copy Markdown
Contributor

@adri326 adri326 commented May 10, 2026

This pull request is a re-do of #2736, which is currently limited to only RawBlock, with the goal to later expand this work to other structures.

RawBlock is used by AtomTable, F64Table and Stack and aims to provide a table of raw bytes that can be quickly appended-to.
My goal here was to properly encapsulate it to make reasoning about its safety possible, do the safety proof and lastly add the ability to make RawBlock: Sync (which is a pre-requisite to run scryer-prolog on multiple threads in the future).

Performance should be neutral: the main change is the switch from storing (base, top, ptr) to (base, capacity, ptr), which eliminates some usages of unsafe. Benchmarks are within the noise threshold.

@adri326
Copy link
Copy Markdown
Contributor Author

adri326 commented May 10, 2026

Ah darn it I wasn't rebased

@triska
Copy link
Copy Markdown
Contributor

triska commented May 10, 2026

Maybe not rebased, but still ... based!

@adri326 adri326 force-pushed the rawblock-safety branch from edb8afd to 8584772 Compare May 10, 2026 20:21
@adri326
Copy link
Copy Markdown
Contributor Author

adri326 commented May 10, 2026

@mthom I came across this while rebasing, I'd be curious to know what caused the race and how I could test to see if I didn't indirectly fix it:

dc05526

adri326 added 4 commits May 10, 2026 22:24
Direct accesses to `base` are replaced with dedicated methods with
explicit safety requirements.
This is the first step towards enabling multithreading on AtomTable.

For now RawBlock will default to using Cell, which yields a byte-
equivalent compiled output.

Also adds an `atomic` feature, which, when enabled, will make RawBlock
use an AtomicPtr instead, ensuring that it implements `Sync`.
After calling `grow()`, the new head would jump to `old_capacity` rather
than staying to the same offset.

In practice this only loses a few bytes at most.
@adri326 adri326 force-pushed the rawblock-safety branch from 8584772 to f395d55 Compare May 10, 2026 20:38
@triska
Copy link
Copy Markdown
Contributor

triska commented May 10, 2026

The impression I get from this work is that this may be the first time we benefit notably from using Rust, in the sense that this may be a very distinct advantage of Rust, with a hint of quite extreme potential for future safety guarantees. Thank you so much for your work on this!

Comment thread src/raw_block.rs
Comment on lines +253 to +257
// SAFETY:
// - Definition: `self.base` contains `self.allocated()` bytes
// - Invariant: `self.used_bytes() < self.allocated()`
unsafe {
self.base.copy_to(new_block.base.cast_mut(), used_bytes);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would state the safety note slightly differently and I think we can use the non overlapping variant of copy_to here

Suggested change
// SAFETY:
// - Definition: `self.base` contains `self.allocated()` bytes
// - Invariant: `self.used_bytes() < self.allocated()`
unsafe {
self.base.copy_to(new_block.base.cast_mut(), used_bytes);
// SAFETY:
// - Definition: `self.base` is valid for `self.capacity()` bytes
// - Definition: `new_block.base` is valid for `self.capacity() * 2` bytes
// - Invariant: `self.used_bytes() <= self.capacity()`
// - new_block.base and self.base belong to separate allocation and as such don't overlap
unsafe {
self.base.copy_to_nonoverlapping(new_block.base.cast_mut(), used_bytes);

What safety preconditions must the caller of grow_new uphold?

  • no concurrent writes to pointers returned by alloc/get/get_unchecked?

Comment thread src/raw_block.rs
}
/// ## Safety
///
/// `ptr` is a valid pointer be obtained from [`RawBlock::get()`] or [`RawBlock::alloc()`].
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While technically valid I don't think it ever makes sense to call this with a pointer obtained from get as you need the offset to call get in the first place.

Could alloc maybe just return both the pointer and its offset so that we don't need this function?
I would assume most callers of alloc to write to the pointer,then get the offset and from then on only work with the offset.

Could get/get_unchecked/alloc maybe return a wrapper type that has a lifetime linking it to the RawBlock and never exposes the pointer directly? Would that allow us to make grow safe?

@mthom
Copy link
Copy Markdown
Owner

mthom commented May 24, 2026

Thanks for this. I just returned from Italy and am still recovering but I will reply to your question soon @adri326

@mthom
Copy link
Copy Markdown
Owner

mthom commented May 26, 2026

@adri326 Regarding the race condition in offset_table.rs: it occurs because it's possible for two threads, each with the same (in the sense of Eq) value value, to reach the let offset = ... line after the mutex on concurrent_tbl.indirection_tbl is released so that two distinct F64Offset values are generated for the same f64 in the offset table.

Values of type F64Offset are meant to correspond uniquely to the f64 values they reference so that indices containing keys of type HeapCellValue can work correctly (for indexing instructions), which is only sensible if each f64 value stored in the offset table appears in the table exactly once, and hence has a unique offset. This is a race condition in the sense that a race between two threads to write the same f64 to the table can break the invariant by generating two distinct offsets for it.

In my in-progress multi-arg indexing implementation work, I've changed the indexing instructions to use the hash tables of the hashbrown crate so that HeapCellValue can go back to not having a Hash instance, removing the need to maintain the invariant.

So to finally the question, no, I don't believe your changes in this PR fix the race condition, but soon there will be no need for the indirection table anyway. For now the race condition could be fixed by locking a mutex at the beginning of that entire match section, actually. I bristled at that originally because it serializes the write operation, and so isn't very concurrent, but as I said, soon it will be a non-issue.

@mthom
Copy link
Copy Markdown
Owner

mthom commented May 26, 2026

Looks like this is merge ready unless there are further comments.

@Skgland
Copy link
Copy Markdown
Contributor

Skgland commented May 26, 2026

Looks like this is merge ready unless there are further comments.

I don't think my review comments #3331 (comment) and #3331 (comment) have been addressed.

Though I wouldn't consider that a blocker as

  • missing safety comment on grow_new -> pre-existing
  • further API changes/improvements -> can be a follow up PR
  • wording change of safety comment in grow_new -> sufficiently minor/due to suggestions to use copy_to_nonoverlapping

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants