Skip to content

Conversation

clabby
Copy link
Collaborator

@clabby clabby commented Jul 27, 2025

Overview

Introduces a new store module that contains a simple, unauthenticated Store K/V DB.

closes #1206

@patrick-ogrady
Copy link
Contributor

👀 👀 👀

@clabby clabby marked this pull request as draft July 28, 2025 00:05
@patrick-ogrady patrick-ogrady requested a review from danlaine July 28, 2025 00:08
@clabby clabby force-pushed the cl/store branch 2 times, most recently from e742649 to 0022902 Compare July 28, 2025 00:09
@clabby clabby marked this pull request as ready for review July 28, 2025 00:11
Copy link
Contributor

@patrick-ogrady patrick-ogrady left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My original ambition with this issue (#1206) was to actually create a totally unrelated implementation to adb (with variable-sized operations)...something like archive but "mutable" (and where each key wasn't associated with some "index").

This PR has made me second guess that original effort given how straightforward your abstraction seems to integrate into the adbs (especially because it is a "known useful" pattern rather than just something I thought may be useful).

The code you've provided looks pretty solid (I just left comments primarily around how it is presented).

Thanks for putting this up! We'll get back to you tomorrow with direction on what route we prefer 👍

@roberto-bayardo
Copy link
Collaborator

If decoupled from ADB, we don't need to support by-index lookups, and could therefore support variable length items without the additional log structure used by adb::immutable.

I'm leaning towards keeping this "non authenticated" store independent of ADB and giving it variable-length record capability to make it more flexible?

@patrick-ogrady
Copy link
Contributor

If decoupled from ADB, we don't need to support by-index lookups, and could therefore support variable length items without the additional log structure used by adb::immutable.

I'm leaning towards keeping this "non authenticated" store independent of ADB and giving it variable-length record capability to make it more flexible?

Agreed. Let's keep store as a standalone thing for now (at least until we get further along the adb optimization path). We should support variable length as well (you can look at archive::prunable for an example of how this is done with journal::variable and Index).

@clabby
Copy link
Collaborator Author

clabby commented Jul 29, 2025

Sounds good - thanks y'all 😄

I should have some time mid-week to revisit.

@clabby
Copy link
Collaborator Author

clabby commented Aug 2, 2025

Rewrote + simplified the store - now allows for variable-length values and is more separate from the adb in functionality.

Copy link
Contributor

@patrick-ogrady patrick-ogrady left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your "historical functionality" introduced some unintended behavior (based on a smal misunderstanding of how Index works).

If you revert those, I think it should be pretty close!

Adding some sort of historical query pattern is an interesting idea (and as you are likely well-aware is commonly used to power archive state stores these days) but have typically found that to require some sort of "range query" functionality that isn't present in any of our DBs (so far 😉 ).

@clabby
Copy link
Collaborator Author

clabby commented Aug 3, 2025

Thanks @patrick-ogrady - good catch! I fooled myself on the underlying behavior of Index here with test_store_historical_access; The test only works out as-is due to using the same key, but "the "nth" version of some key is really the nth version of some super set of keys with the same prefix" (#1327 (comment)) bears its teeth when you assign a few values to a key that shares a prefix with the one used to saturate values as you suggested.

If you revert those, I think it should be pretty close!

The code for the historical retrieval + allowing the index to expand with the number of ops was added in 5be4491, and prior to that change (a3a3e55), the store does handle conflicts within the cursor correctly (as suggested in #1327 (comment).)

The downside to that version is that it does not support a form of pruning inactive operations in the log, which I attempted to give purpose to in 5be4491 by adding the option to access historical values that the key has taken on.

Would you prefer to keep it simple, and just revert to a3a3e55, or is there more functionality you'd like to see here (i.e. pruning)?

@patrick-ogrady
Copy link
Contributor

Thanks @patrick-ogrady - good catch! I fooled myself on the underlying behavior of Index here with test_store_historical_access; The test only works out as-is due to using the same key, but "the "nth" version of some key is really the nth version of some super set of keys with the same prefix" (#1327 (comment)) bears its teeth when you assign a few values to a key that shares a prefix with the one used to saturate values as you suggested.

👍

The code for the historical retrieval + allowing the index to expand with the number of ops was added in 5be4491, and prior to that change (a3a3e55), the store does handle conflicts within the cursor correctly (as suggested in #1327 (comment)). The downside to that version is that it does not support a form of pruning inactive operations in the log, which I attempted to give purpose to in 5be4491 by adding the option to access historical values that the key has taken on. Would you prefer to keep it simple, and just revert to a3a3e55, or is there more functionality you'd like to see here (i.e. pruning)?

Like with adb::any, we should maintain an "inactivity floor" and rewrite keys to the tip of the log (reading from the base of the log at the rate new keys are added to bound the "junk" on disk):

/// Raise the inactivity floor by exactly `max_steps` steps, followed by applying a commit
/// operation. Each step either advances over an inactive operation, or re-applies an active
/// operation to the tip and then advances over it.
///
/// This method does not change the state of the db's snapshot, but it always changes the root
/// since it applies at least one operation.
pub(super) async fn raise_inactivity_floor(&mut self, max_steps: u64) -> Result<(), Error> {
for _ in 0..max_steps {
if self.inactivity_floor_loc == self.op_count() {
break;
}
let op = self.log.read(self.inactivity_floor_loc).await?;
self.move_op_if_active(op, self.inactivity_floor_loc)
.await?;
self.inactivity_floor_loc += 1;
}
self.apply_op(Fixed::Commit(self.inactivity_floor_loc))
.await?;
Ok(())
}
/// Prune historical operations that are > `pruning_delay` steps behind the inactivity floor.
/// This does not affect the db's root or current snapshot.
pub(super) async fn prune_inactive(&mut self) -> Result<(), Error> {
let Some(oldest_retained_loc) = self.log.oldest_retained_pos().await? else {
return Ok(());
};
// Calculate the target pruning position: inactivity_floor_loc - pruning_delay
let target_prune_loc = self.inactivity_floor_loc.saturating_sub(self.pruning_delay);
let ops_to_prune = target_prune_loc.saturating_sub(oldest_retained_loc);
if ops_to_prune == 0 {
return Ok(());
}
debug!(ops_to_prune, target_prune_loc, "pruning inactive ops");
// Prune the MMR, whose pruning boundary serves as the "source of truth" for proving.
let prune_to_pos = leaf_num_to_pos(target_prune_loc);
self.ops
.prune_to_pos(&mut self.hasher, prune_to_pos)
.await?;
// Because the log's pruning boundary will be blob-size aligned, we cannot use it as a
// source of truth for the min provable element.
self.log.prune(target_prune_loc).await?;
Ok(())
}

You may definitely feel you are duplicating a good bit of logic and (as discussed previously) I'm ok with that for this early phase. We are planning to do a "deduplication effort" once the core storage primitives are complete (and prefer keeping them un-entangled for faster experimentation).

Hope that helps!

@clabby
Copy link
Collaborator Author

clabby commented Aug 4, 2025

Reverted the changes in 5be4491, and added pruning logic to the store akin to adb::any's.

@clabby clabby force-pushed the cl/store branch 2 times, most recently from 71ac6a6 to aa42c41 Compare August 4, 2025 19:53
Copy link
Collaborator Author

@clabby clabby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added deletion support + fixed an issue with conflicting keys in the snapshot building process, one last todo.

clabby added 4 commits August 5, 2025 11:01
Introduces a new module in `commonware-storage` that provides an unauthenticated key-value store, supporting variable-sized values.
Copy link
Contributor

@patrick-ogrady patrick-ogrady left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

~5 minutes away from merge!

@clabby clabby requested a review from patrick-ogrady August 6, 2025 18:22
patrick-ogrady
patrick-ogrady previously approved these changes Aug 6, 2025
@patrick-ogrady
Copy link
Contributor

🚀 🚀 🚀

patrick-ogrady
patrick-ogrady previously approved these changes Aug 6, 2025
@patrick-ogrady patrick-ogrady merged commit 1d7dc6b into commonwarexyz:main Aug 6, 2025
32 checks passed
Copy link

codecov bot commented Aug 6, 2025

Codecov Report

❌ Patch coverage is 96.86347% with 17 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.32%. Comparing base (18e6449) to head (84af2f4).
⚠️ Report is 5 commits behind head on main.

Files with missing lines Patch % Lines
storage/src/store/mod.rs 95.96% 17 Missing ⚠️
@@            Coverage Diff             @@
##             main    #1327      +/-   ##
==========================================
+ Coverage   91.24%   91.32%   +0.08%     
==========================================
  Files         262      265       +3     
  Lines       65310    66018     +708     
==========================================
+ Hits        59590    60291     +701     
- Misses       5720     5727       +7     
Files with missing lines Coverage Δ
storage/src/adb/any/mod.rs 98.90% <ø> (ø)
storage/src/adb/operation.rs 88.06% <ø> (ø)
storage/src/store/operation.rs 100.00% <100.00%> (ø)
storage/src/store/mod.rs 95.96% <95.96%> (ø)

... and 6 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 18e6449...84af2f4. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[storage] Implement store

3 participants