Prove safety of RawBlock and support multi-threaded usages#3331
adri326 wants to merge 4 commits into
Ah darn it I wasn't rebased

Maybe not rebased, but still ... based!
Direct accesses to `base` are replaced with dedicated methods with explicit safety requirements.
This is the first step towards enabling multithreading on AtomTable. For now RawBlock will default to using Cell, which yields a byte-equivalent compiled output. Also adds an `atomic` feature, which, when enabled, will make RawBlock use an AtomicPtr instead, ensuring that it implements `Sync`.
After calling `grow()`, the new head would jump to `old_capacity` rather than staying at the same offset. In practice this only loses a few bytes at most.
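The `Cell` versus `AtomicPtr` choice described above can be illustrated with a minimal sketch. The names `CellHead`, `AtomicHead` and `assert_sync` are hypothetical and are not taken from this PR; the sketch only shows why swapping the storage type changes whether the containing type can be `Sync`.

```rust
use std::cell::Cell;
use std::sync::atomic::{AtomicPtr, Ordering};

/// Hypothetical single-threaded head pointer.
/// `Cell` is `!Sync`, so any struct holding this field is `!Sync` too.
pub struct CellHead(Cell<*mut u8>);

/// Hypothetical multi-threaded head pointer.
/// `AtomicPtr<u8>` is `Send + Sync`.
pub struct AtomicHead(AtomicPtr<u8>);

impl CellHead {
    pub fn new(ptr: *mut u8) -> Self {
        Self(Cell::new(ptr))
    }
    pub fn load(&self) -> *mut u8 {
        self.0.get()
    }
    pub fn store(&self, ptr: *mut u8) {
        self.0.set(ptr)
    }
}

impl AtomicHead {
    pub fn new(ptr: *mut u8) -> Self {
        Self(AtomicPtr::new(ptr))
    }
    pub fn load(&self) -> *mut u8 {
        // Acquire pairs with the Release store below.
        self.0.load(Ordering::Acquire)
    }
    pub fn store(&self, ptr: *mut u8) {
        self.0.store(ptr, Ordering::Release)
    }
}

/// Compiles only when `T: Sync`; calling it is a compile-time check.
pub fn assert_sync<T: Sync>() {}
```

Calling `assert_sync::<AtomicHead>()` compiles, while `assert_sync::<CellHead>()` is rejected by the compiler. A real block type that also stores a raw `base` pointer would still need its own safety argument before it can be marked `Sync`.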

The impression I get from this work is that this may be the first time we benefit notably from using Rust, in the sense that this may be a very distinct advantage of Rust, with a hint of quite extreme potential for future safety guarantees. Thank you so much for your work on this!
// SAFETY:
// - Definition: `self.base` contains `self.allocated()` bytes
// - Invariant: `self.used_bytes() < self.allocated()`
unsafe {
    self.base.copy_to(new_block.base.cast_mut(), used_bytes);
I would state the safety note slightly differently, and I think we can use the non-overlapping variant of `copy_to` here:
-// SAFETY:
-// - Definition: `self.base` contains `self.allocated()` bytes
-// - Invariant: `self.used_bytes() < self.allocated()`
-unsafe {
-    self.base.copy_to(new_block.base.cast_mut(), used_bytes);
+// SAFETY:
+// - Definition: `self.base` is valid for `self.capacity()` bytes
+// - Definition: `new_block.base` is valid for `self.capacity() * 2` bytes
+// - Invariant: `self.used_bytes() <= self.capacity()`
+// - new_block.base and self.base belong to separate allocation and as such don't overlap
+unsafe {
+    self.base.copy_to_nonoverlapping(new_block.base.cast_mut(), used_bytes);
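As a rough illustration of the suggested safety argument, here is a self-contained sketch of a growable byte block that doubles its capacity and copies with `copy_to_nonoverlapping`. The `Block` type and its fields are assumptions made for this example and do not reproduce `RawBlock`; the point is only that two separate allocations cannot overlap, and that the used offset is carried over unchanged after growing.

```rust
use std::alloc::{alloc, dealloc, Layout};

/// Hypothetical stand-in for a growable byte block (not the PR's `RawBlock`).
pub struct Block {
    base: *mut u8,
    capacity: usize,
    used: usize,
}

impl Block {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "zero-sized allocations are not supported here");
        let layout = Layout::array::<u8>(capacity).unwrap();
        // SAFETY: `layout` has a non-zero size.
        let base = unsafe { alloc(layout) };
        assert!(!base.is_null(), "allocation failed");
        Block { base, capacity, used: 0 }
    }

    pub fn capacity(&self) -> usize {
        self.capacity
    }

    pub fn push(&mut self, byte: u8) {
        if self.used == self.capacity {
            self.grow();
        }
        // SAFETY: `self.used < self.capacity`, so the write is in bounds.
        unsafe { self.base.add(self.used).write(byte) };
        self.used += 1;
    }

    fn grow(&mut self) {
        let mut new_block = Block::new(self.capacity * 2);
        // SAFETY:
        // - `self.base` is valid for `self.capacity` bytes
        // - `new_block.base` is valid for `self.capacity * 2` bytes
        // - `self.used <= self.capacity`
        // - the two pointers come from separate allocations, so they cannot overlap
        unsafe { self.base.copy_to_nonoverlapping(new_block.base, self.used) };
        // Keep the head at the same offset rather than jumping to the old capacity.
        new_block.used = self.used;
        // The old block is dropped here, which frees the old allocation.
        *self = new_block;
    }

    pub fn as_slice(&self) -> &[u8] {
        // SAFETY: the first `self.used` bytes have been initialized by `push`.
        unsafe { std::slice::from_raw_parts(self.base, self.used) }
    }
}

impl Drop for Block {
    fn drop(&mut self) {
        let layout = Layout::array::<u8>(self.capacity).unwrap();
        // SAFETY: `self.base` was allocated with exactly this layout.
        unsafe { dealloc(self.base, layout) };
    }
}
```

Because `grow` takes `&mut self` in this sketch, the borrow checker already rules out concurrent access; a version that grows through a shared reference would need additional preconditions on the caller.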
What safety preconditions must the caller of `grow_new` uphold?
- no concurrent writes to pointers returned by `alloc`/`get`/`get_unchecked`?
}
/// ## Safety
///
/// `ptr` is a valid pointer obtained from [`RawBlock::get()`] or [`RawBlock::alloc()`].
While technically valid, I don't think it ever makes sense to call this with a pointer obtained from `get`, as you need the offset to call `get` in the first place.
Could `alloc` maybe just return both the pointer and its offset so that we don't need this function?
I would expect most callers of `alloc` to write to the pointer, then get the offset, and from then on only work with the offset.
Could `get`/`get_unchecked`/`alloc` maybe return a wrapper type that has a lifetime linking it to the `RawBlock` and never exposes the pointer directly? Would that allow us to make `grow` safe?
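To make the idea in this comment concrete, here is a small sketch of an API where `alloc` hands back the offset together with a borrowed slice, so no raw pointer escapes. `OffsetBlock` is a hypothetical type backed by a `Vec<u8>` purely for illustration; it is not how `RawBlock` is implemented, and it sidesteps the shared-reference growth that the PR has to deal with.

```rust
/// Hypothetical offset-based block; backed by `Vec<u8>` for simplicity.
pub struct OffsetBlock {
    bytes: Vec<u8>,
}

impl OffsetBlock {
    pub fn new() -> Self {
        OffsetBlock { bytes: Vec::new() }
    }

    /// Reserves `len` zeroed bytes and returns their offset together with a
    /// slice borrowed from `self`. The borrow ends before the next `alloc`,
    /// so a stale slice can never be used after the storage has moved.
    pub fn alloc(&mut self, len: usize) -> (usize, &mut [u8]) {
        let offset = self.bytes.len();
        self.bytes.resize(offset + len, 0);
        (offset, &mut self.bytes[offset..])
    }

    /// Looks up `len` bytes at `offset`, returning `None` when out of bounds.
    pub fn get(&self, offset: usize, len: usize) -> Option<&[u8]> {
        let end = offset.checked_add(len)?;
        self.bytes.get(offset..end)
    }
}
```

A caller writes through the returned slice once, keeps only the offset, and goes through `get` afterwards. Whether a lifetime-bound wrapper like this can be retrofitted onto a block that grows behind a shared reference is exactly the open question raised above.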

Thanks for this. I just returned from Italy and am still recovering but I will reply to your question soon @adri326

@adri326 Regarding the race condition in offset_table.rs: it occurs because it's possible for two threads, each with the same (in the sense of Values of type In my in-progress multi-arg indexing implementation work, I've changed the indexing instructions to use the hash tables of the hashbrown crate so that So, to finally answer the question: no, I don't believe your changes in this PR fix the race condition, but soon there will be no need for the indirection table anyway. For now, the race condition could actually be fixed by locking a mutex at the beginning of that entire match section. I bristled at that originally because it serializes the write operation, and so isn't very concurrent, but as I said, soon it will be a non-issue.

Looks like this is merge ready unless there are further comments.
I don't think my review comments #3331 (comment) and #3331 (comment) have been addressed. Though I wouldn't consider that a blocker as
This pull request is a re-do of #2736, which is currently limited to only `RawBlock`, with the goal to later expand this work to other structures.

`RawBlock` is used by `AtomTable`, `F64Table` and `Stack` and aims to provide a table of raw bytes that can be quickly appended to.

My goal here was to properly encapsulate it to make reasoning about its safety possible, do the safety proof, and lastly add the ability to make `RawBlock: Sync` (which is a prerequisite to run scryer-prolog on multiple threads in the future).

Performance should be neutral: the main change is the switch from storing `(base, top, ptr)` to `(base, capacity, ptr)`, which eliminates some usages of `unsafe`. Benchmarks are within the noise threshold.