-
Notifications
You must be signed in to change notification settings - Fork 70
flat-chain-store: Detect file corruption and reindex if needed. #479
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, Just some documentation nits/suggestions
|
Rebased and addressed comments |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK.
I'm just not sure XORing is a sufficiently effective way, since the bits can easily "cancel out" and do not get propagated like a proper hash. Apparently the CRC-32 and especially the XXH3-64 are quite fast and robust. A SHA-256 or BLAKE could be nice for tamper resistance but doesn't seem a realistic threat idk. Wdyt?
It depends on what are we protecting against. My goal here is to catch sporadic, random file corruptions. Something like "you've killed the machine in the middle of a flush" or "your fs is broken". I'm not considering that someone has access to your disk and can choose specific mutations to make the checksum pass. The only goal here was: it can detect random anomalies in a satisfactory way AND is really fast. Do you see need for something more robust? Edit: I also have some code for snapshoting the database, so it could be a best measure against more catastrophic cases |
Yeah, we don't need cryptographic hashes for that, but maybe we should use any of the two non-crypto hashes I mentioned (or any other). I think the XOR approach is much less effective than using any of these, and the performance should be acceptable. From what I'm reading the |
|
Pushed a version using CRC-32. The crate I found for xxHash has MSRV of 1.81 :/ I'll benchmark the both of them, tho. Results soon (TM) |
Nice! The most downloaded crate for it seems to be |
That one! Their Cargo.toml says |
|
Did a little benchmarking, here are the results:
xxh64 is a clear winner. This is actually not that surprising. Being word-aligned usually helps a lot when doing hashing, with utreexo we use SHA512/256 because it tends to perform better than SHA256 on 64 bits machines. Here's the benchmark code: fn flat_store_compute_checksum(
c: &mut Criterion,
) {
use floresta_chain::{pruned_utreexo::{flat_chain_store::{FlatChainStore, FlatChainStoreConfig}, ChainStore}, DiskBlockHeader};
let test_id = rand::random::<u64>();
let config = FlatChainStoreConfig::new(
format!("./tmp-db/{test_id}/"),
);
let store = FlatChainStore::new(config).unwrap();
let blocks = read_blocks_txt();
blocks.iter().enumerate().for_each(|(height, block)| {
let header = DiskBlockHeader::FullyValid(block.header, height as u32);
store.save_header(&header).unwrap();
store.update_block_index(height as u32, block.block_hash()).unwrap();
});
c.bench_function("flat_store_compute_checksum", |b| {
b.iter(|| {
let _ = store.benchmark_compute_checksum();
});
});
} |
ac91b5c to
b11a474
Compare
|
Wow, I didn’t expect the simple XOR approach to be the slowest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK b11a474
This commit adds a simple, xxHash-based check for integrity for the flat_chain_store. If we find it to be corrupted, we reindex to find how much of it is lost, and try to continue from there. There's no error-correction logic in this commit, because this tends to bring complexity and require more space. Since modern file-systems are so good at keeping things from corrupting, I don't think there's that much need to make our own error correction code here.
|
Updated with @lucad70's suggestion. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re-ACK 992916a
What is the purpose of this pull request?
Which crates are being modified?
Description
This commit adds a simple, XOR-based check for integrity for the
flat_chain_store. If we find it to be corrupted, we reindex to find how
much of it is lost, and try to continue from there.
There's no error-correction logic in this commit, because this tends to
bring complexity and require more space. Since modern file-systems are
so good at keeping things from corrupting, I don't think there's that
much need to make our own error correction code here.
Notes to the reviewers
Depends on #251