Skip to content

Conversation

@Davidson-Souza
Copy link
Member

What is the purpose of this pull request?

  • Bug fix
  • Documentation update
  • New feature
  • Test
  • Other:

Which crates are being modified?

  • floresta-chain
  • floresta-cli
  • floresta-common
  • floresta-compact-filters
  • floresta-electrum
  • floresta-watch-only
  • floresta-wire
  • floresta
  • florestad
  • Other:

Description

This commit adds a simple, XOR-based check for integrity for the
flat_chain_store. If we find it to be corrupted, we reindex to find how
much of it is lost, and try to continue from there.

There's no error-correction logic in this commit, because this tends to
bring complexity and require more space. Since modern file-systems are
so good at keeping things from corrupting, I don't think there's that
much need to make our own error correction code here.

Notes to the reviewers

Depends on #251

@Davidson-Souza Davidson-Souza added the enhancement New feature or request label May 12, 2025
@Davidson-Souza Davidson-Souza marked this pull request as ready for review May 16, 2025 15:36
@Davidson-Souza Davidson-Souza changed the title [WIP/RFC]flat-chain-store: Detect file corruption and reindex if needed. flat-chain-store: Detect file corruption and reindex if needed. May 19, 2025
Copy link
Collaborator

@jaoleal jaoleal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, Just some documentation nits/suggestions

@Davidson-Souza
Copy link
Member Author

Rebased and addressed comments

Copy link
Contributor

@JoseSK999 JoseSK999 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK.

I'm just not sure XORing is a sufficiently effective way, since the bits can easily "cancel out" and do not get propagated like a proper hash. Apparently the CRC-32 and especially the XXH3-64 are quite fast and robust. A SHA-256 or BLAKE could be nice for tamper resistance but doesn't seem a realistic threat idk. Wdyt?

@Davidson-Souza
Copy link
Member Author

Davidson-Souza commented May 22, 2025

Concept ACK.

I'm just not sure XORing is a sufficiently effective way, since the bits can easily "cancel out" and do not get propagated like a proper hash. Apparently the CRC-32 and especially the XXH3-64 are quite fast and robust. A SHA-256 or BLAKE could be nice for tamper resistance but doesn't seem a realistic threat idk. Wdyt?

It depends on what are we protecting against. My goal here is to catch sporadic, random file corruptions. Something like "you've killed the machine in the middle of a flush" or "your fs is broken". I'm not considering that someone has access to your disk and can choose specific mutations to make the checksum pass. The only goal here was: it can detect random anomalies in a satisfactory way AND is really fast.

Do you see need for something more robust?

Edit: I also have some code for snapshoting the database, so it could be a best measure against more catastrophic cases

@JoseSK999
Copy link
Contributor

JoseSK999 commented May 22, 2025

It depends on what are we protecting against. My goal here is to catch sporadic, random file corruptions. Something like "you've killed the machine in the middle of a flush" or "your fs is broken". I'm not considering that someone has access to your disk and can choose specific mutations to make the checksum pass. The only goal here was: it can detect random anomalies in a satisfactory way AND is really fast.

Yeah, we don't need cryptographic hashes for that, but maybe we should use any of the two non-crypto hashes I mentioned (or any other). I think the XOR approach is much less effective than using any of these, and the performance should be acceptable.

From what I'm reading the XXH3 seems extremely fast and uses 64 bits with a lot of robustness. Tradeoff here is that we add another dependency.

@Davidson-Souza
Copy link
Member Author

Pushed a version using CRC-32. The crate I found for xxHash has MSRV of 1.81 :/

I'll benchmark the both of them, tho. Results soon (TM)

@JoseSK999
Copy link
Contributor

JoseSK999 commented May 22, 2025

Pushed a version using CRC-32. The crate I found for xxHash has MSRV of 1.81 :/

I'll benchmark the both of them, tho. Results soon (TM)

Nice! The most downloaded crate for it seems to be twox-hash, idk if that is the one you found.

@Davidson-Souza
Copy link
Member Author

Pushed a version using CRC-32. The crate I found for xxHash has MSRV of 1.81 :/
I'll benchmark the both of them, tho. Results soon (TM)

Nice! The most downloaded crate for it seems to be twox-hash, idk if that is the one you found.

That one! Their Cargo.toml says 1.81

@Davidson-Souza
Copy link
Member Author

Did a little benchmarking, here are the results:

Algorithm Min Avg Max
xxh64 154.32 ms 155.39 ms 156.47 ms
crc32 215.77 ms 219.41 ms 223.35 ms
xor 230.03 ms 230.88 ms 231.94 ms

xxh64 is a clear winner. This is actually not that surprising. Being word-aligned usually helps a lot when doing hashing, with utreexo we use SHA512/256 because it tends to perform better than SHA256 on 64 bits machines.

Here's the benchmark code:

fn flat_store_compute_checksum(
    c: &mut Criterion,
) {
    use floresta_chain::{pruned_utreexo::{flat_chain_store::{FlatChainStore, FlatChainStoreConfig}, ChainStore}, DiskBlockHeader};
    
    let test_id = rand::random::<u64>();
    let config = FlatChainStoreConfig::new(
        format!("./tmp-db/{test_id}/"),
    );

    let store = FlatChainStore::new(config).unwrap();
    let blocks = read_blocks_txt();

    blocks.iter().enumerate().for_each(|(height, block)| {
        let header = DiskBlockHeader::FullyValid(block.header, height as u32);
        store.save_header(&header).unwrap();

        store.update_block_index(height as u32, block.block_hash()).unwrap();
    });

    c.bench_function("flat_store_compute_checksum", |b| {
        b.iter(|| {
            let _ = store.benchmark_compute_checksum();
        });
    });
}

@Davidson-Souza Davidson-Souza force-pushed the checksum branch 2 times, most recently from ac91b5c to b11a474 Compare May 26, 2025 17:44
@JoseSK999
Copy link
Contributor

Wow, I didn’t expect the simple XOR approach to be the slowest

Copy link
Contributor

@JoseSK999 JoseSK999 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK b11a474

This commit adds a simple, xxHash-based check for integrity for the
flat_chain_store. If we find it to be corrupted, we reindex to find how
much of it is lost, and try to continue from there.

There's no error-correction logic in this commit, because this tends to
bring complexity and require more space. Since modern file-systems are
so good at keeping things from corrupting, I don't think there's that
much need to make our own error correction code here.
@Davidson-Souza
Copy link
Member Author

Updated with @lucad70's suggestion.

Copy link
Contributor

@JoseSK999 JoseSK999 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-ACK 992916a

@Davidson-Souza Davidson-Souza merged commit ce1b56f into vinteumorg:master May 27, 2025
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants