Physical Backup currently validates a snapshot using file metadata only.
We need stronger validation.
However, full-file checksums such as SHA-256 are not practical:
- Reading the whole file causes high IO usage.
- Cryptographic checksumming is slow on large files (5~10GB).
Solution:
- Randomly pick segments of each file and checksum those, so that the same segments can be validated later. A 1MB read per 500MB of file can be the target. (We still need to do some math on how reliable this is.)
- Use a non-cryptographic algorithm instead: https://xxhash.com/
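
A minimal sketch of the idea above, in Python (the language is an assumption, as are all names here: `sampled_checksum`, `detection_probability`, `CHUNK_SIZE`, `SAMPLE_ONE_IN`, and the `seed` argument are hypothetical). It uses `xxh64` from the third-party `xxhash` package and, only so the sketch runs without it, falls back to `hashlib.blake2b`. The seed has to be stored with the checksum so the same segments are read again at validation time.

```python
import os
import random

try:
    import xxhash  # third-party package: pip install xxhash

    def _new_hasher():
        return xxhash.xxh64()
except ImportError:
    import hashlib

    def _new_hasher():
        # Fallback for this sketch only; digests differ from xxh64.
        return hashlib.blake2b(digest_size=8)

CHUNK_SIZE = 4096      # bytes per sampled segment (assumed value)
SAMPLE_ONE_IN = 500    # read ~1 chunk in 500, i.e. ~1MB per 500MB


def sampled_checksum(path, seed):
    """Checksum a pseudo-random subset of fixed-size chunks of a file."""
    size = os.path.getsize(path)
    total_chunks = max(1, -(-size // CHUNK_SIZE))  # ceiling division
    wanted = max(1, total_chunks // SAMPLE_ONE_IN)
    # Same seed + same size -> same chunk indexes on a later run.
    rng = random.Random(f"{seed}:{size}")
    indexes = sorted(rng.sample(range(total_chunks), wanted))

    hasher = _new_hasher()
    hasher.update(str(size).encode())  # size changes always alter the digest
    with open(path, "rb") as f:
        for index in indexes:
            f.seek(index * CHUNK_SIZE)
            hasher.update(f.read(CHUNK_SIZE))
    return hasher.hexdigest()


def detection_probability(corrupt_fraction, sampled_chunks):
    """Approximate chance that at least one sampled chunk is corrupted.

    Treats each sampled chunk as an independent draw, which is a close
    approximation when far fewer chunks are sampled than exist.
    """
    return 1 - (1 - corrupt_fraction) ** sampled_chunks
```

For a 500MB file with 4KB chunks this reads 256 of 128000 chunks. Under the approximation in `detection_probability`, corruption spread over 1% of the chunks is caught roughly 92% of the time, while a single corrupted chunk is caught only about 0.2% of the time. So sampling mainly guards against widespread damage, and truncation is covered by the size check. Since the exact output of `random.sample` is not guaranteed to stay identical across Python versions, storing the chosen offsets instead of the seed may be the safer choice.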