fix!: change how client heads work to fix ChunkNotFound #465
Conversation
General approach looks good. Keeping more things alive prevents things from getting garbage collected behind our backs. Isn't there still a problem with …? I'm also worried about us clobbering the Client/ClientGroup write in case there are overlapping refreshes/persists. Let's add asserts that the invariants we had at the start didn't change before we do any writes.
Also, this is a BREAKING CHANGE, so update the commit title.
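To make the suggestion concrete, here is a minimal sketch of that kind of precondition assert. Every name in it (the stand-in client shape, readClient/writeClient, sameHashes) is illustrative rather than Replicache's actual API:

```ts
// Sketch only, not the real Replicache code: re-read the client record inside
// the write transaction and check that the fields we are about to overwrite
// still match what we observed when the refresh/persist started, so an
// overlapping operation cannot be silently clobbered.
type Hash = string;

type ClientRecord = {
  readonly refreshHashes: readonly Hash[];
  readonly persistHash: Hash | null;
};

function sameHashes(a: readonly Hash[], b: readonly Hash[]): boolean {
  return a.length === b.length && a.every((h, i) => h === b[i]);
}

async function writeClientIfUnchanged(
  readClient: () => Promise<ClientRecord | undefined>,
  writeClient: (c: ClientRecord) => Promise<void>,
  expected: ClientRecord,
  updated: ClientRecord,
): Promise<void> {
  const current = await readClient();
  if (
    !current ||
    current.persistHash !== expected.persistHash ||
    !sameHashes(current.refreshHashes, expected.refreshHashes)
  ) {
    throw new Error('client changed concurrently; aborting write');
  }
  await writeClient(updated);
}
```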
I believe the call to …
Can you be more specific about which invariants you think we should assert?
Yes, I think this actually requires a format version change, because the client group gc code asserts that all the clients in the client map are of the version it expects (so the old code asserts V5). I was thinking about making this code laxer so that it just checks that the client has a client group id (all that it needs), but then decided it's just safer to do format version changes when we change things like this, instead of reasoning about the backwards compat all over the place. I think it's best if we can minimize the places where we have to consider old db formats to: 1. idb collect and 2. mutation recovery.
const result: RefreshResult =
  await memdag.withSuspendedSourceCacheEvictsAndDeletes(async () => {
    const perdagWriteResult:
      | [
Maybe use a named type here too?
const refreshHashesSet = new Set<Hash>();
client.refreshHashes.forEach(h => refreshHashesSet.add(h));
Suggested change:
- const refreshHashesSet = new Set<Hash>();
- client.refreshHashes.forEach(h => refreshHashesSet.add(h));
+ const refreshHashesSet = new Set(client.refreshHashes);
  clientGroupID: 'client-group-1',
});
- const client2 = makeClientV5({
+ const client2 = makeClientV6({
This test now has one v5 client and 2 v6 clients... That seems good but maybe copy the test and have a test where all the clients are v6?
- await assertClientHeadHash(
-   perdag,
-   clientID,
+ await assertRefreshHashes(perdag, clientID, [
Why isn't the hash removed from the refreshHashes at the end?
  }
  return undefined;
}
await setRefreshHashes([result.newPerdagClientHeadHash]);
How do we ensure we do not clobber another refresh?
packages/replicache/src/replicache-mutation-recovery-dd31.test.ts (outdated review thread, resolved)
And a few other code review fixes
to make it clearer where the transactions are
It is confusing to have chunksPersisted on the LazyStore. If it takes a write lock it is better to keep this on the write implementation.
Use the same write transaction for clarity
In persist we have: …
Similar things in refresh.
 * Acquires a write lock on the store.
 */
chunksPersisted(chunkHashes: Iterable<Hash>): Promise<void> {
  return withWrite(this, () => {
Having this write in here is problematic because it hides transactions.
    }
  }
  this._sourceChunksCache.persisted(chunksToCache);
});
Missing commit. In this case the commit would be a no-op, but it is still confusing.
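A rough sketch of the direction these comments point at: keep chunksPersisted on the write implementation rather than on LazyStore, so the caller's own withWrite owns the transaction and commits it explicitly. Everything here besides the names quoted in the diff (chunksPersisted, withWrite, _sourceChunksCache.persisted) is a stand-in, not Replicache's actual API:

```ts
// Sketch only: LazyWrite, SourceChunksCache, and commit are stand-ins for the
// real classes. The point is that the caller opens the write transaction and
// commits it explicitly, instead of chunksPersisted hiding one internally.
type Hash = string;

interface SourceChunksCache {
  persisted(chunkHashes: Iterable<Hash>): void;
}

class LazyWrite {
  constructor(private readonly _sourceChunksCache: SourceChunksCache) {}

  // No withWrite here: we are already inside the caller's write transaction.
  chunksPersisted(chunkHashes: Iterable<Hash>): void {
    this._sourceChunksCache.persisted(chunkHashes);
  }

  async commit(): Promise<void> {
    // Flush pending changes; a no-op in this sketch, but keeping the explicit
    // commit avoids the "missing commit" confusion called out above.
  }
}

// Caller side: one explicit transaction, one explicit commit.
async function markChunksPersisted(
  withWrite: <R>(fn: (w: LazyWrite) => Promise<R>) => Promise<R>,
  persistedHashes: Iterable<Hash>,
): Promise<void> {
  await withWrite(async w => {
    w.chunksPersisted(persistedHashes);
    await w.commit();
  });
}
```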
- const [newMemdagMutations, memdagBaseSnapshot] = await withRead(
-   memdag,
-   async memdagRead => {
+ const [newMemdagMutations, memdagBaseSnapshot, gatheredChunks] =
I moved things around to remove a memdag read transaction here
}

let memdagBaseSnapshotPersisted = false;
await withWrite(perdag, async perdagWrite => {
refactor to fold the if into the transaction and reuse code.
For refresh:
What if we have two clients, C1 and C2, and we end up with the following sequence: …
It looks like TXA can overlap without problems. I'm convinced that refresh is fine... I will try the same exercise with persist too.
Actually, I think we will have issues if TXC-C2 happens before TXB-C1.
});
- if (memdagBaseSnapshotPersisted) {
-   await memdag.chunksPersisted(gatheredChunks.keys());
+ await withWrite(memdag, async perdagWrite => {
This should be named memdagWrite (it is a write on the memdag, not the perdag).
done
  // If increase the format version we need to decide how to deal with this
  // logic.
- assert(db.replicacheFormatVersion === REPLICACHE_FORMAT_VERSION_DD31);
+ assert(db.replicacheFormatVersion >= REPLICACHE_FORMAT_VERSION_DD31);
This should be assert(db.replicacheFormatVersion === REPLICACHE_FORMAT_VERSION_DD31 || db.replicacheFormatVersion === REPLICACHE_FORMAT_VERSION_V6), so that it fails the next time we try to update the format version and we reassess whether this logic still works.
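Spelled out, the suggested assert looks like this (constant names as used in the diff above):

```ts
// Strict equality against the known versions (rather than >=) means this
// assert fires the next time the format version is bumped, forcing a
// re-assessment of whether this logic still holds.
assert(
  db.replicacheFormatVersion === REPLICACHE_FORMAT_VERSION_DD31 ||
    db.replicacheFormatVersion === REPLICACHE_FORMAT_VERSION_V6,
);
```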
done
  name: 'a',
  lastOpenedTimestampMS: 0,
- replicacheFormatVersion: REPLICACHE_FORMAT_VERSION_DD31,
+ replicacheFormatVersion: REPLICACHE_FORMAT_VERSION_V6,
We should also have a test collecting an idb with format version 5. Since the check actually reads the db's perdag to see if it has pending mutations, I think we should use a perdag snapshot.
Actually, I don't think this is necessary, as the algorithm only looks at the client groups, and this change only affects the client map.
'initClientGroupGC starts 5 min interval that collects client groups that are not referred to by any clients and have no pending mutations',
() => {
  test('v5 and v6', async () => {
    const client1 = makeClientV5({
Since we are bumping the format version, the client map should never contain a mixture of v5 and v6. We just need the test with all v6 clients, and the assert in client-group-gc.ts can be updated from assertClientV5OrV6 to just assertClientV6.
done
);
assertSnapshotCommitDD31(memdagBaseSnapshot);

type PerdagWriteResult = [
type PerdagWriteResult = [
  perdagClientGroupHeadHash: Hash,
  perdagClientGroupBaseSnapshot: db.Commit<db.SnapshotMetaDD31>,
  perdagLmid: number,
  gatheredChunks: ReadonlyMap<Hash, ChunkWithSize>,
  refreshHashesForRevert: readonly Hash[]
];
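Worth noting: labeled tuple elements in TypeScript are purely documentation; they don't change the runtime value or how call sites destructure it, so adopting the named type is a readability-only change.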
done
Instead of a headHash and tempRefreshHash, we now have refreshHashes and persistHash.
refreshHashes
refreshHashes is a set of hashes that replaces headHash and tempRefreshHash. This change addresses a problem that occurs if the second perdag write tx in refresh fails, leaving headHash not updated and tempRefreshHash not cleaned up.
refreshHashes contains:
Upon refresh completing and successfully running its clean-up step, this set will contain a single hash: the hash of the last commit this client refreshed.
persistHash
persistHash is new and most directly resolves the observed ChunkNotFound issue. It is the hash of the last snapshot commit persisted by this client to this client's client group, or null if the client has never persisted a snapshot.
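A sketch of the client record shape this describes; the field names come from the description above, while the exact types are assumptions:

```ts
type Hash = string; // stand-in for the real Hash type

// Before (format version 5): a single head plus a temporary refresh head.
type ClientV5 = {
  readonly headHash: Hash;
  readonly tempRefreshHash: Hash | null;
  readonly clientGroupID: string;
};

// After (format version 6): the set of hashes kept alive for refresh, plus the
// hash of the last snapshot commit this client persisted to its client group.
type ClientV6 = {
  readonly refreshHashes: readonly Hash[];
  readonly persistHash: Hash | null;
  readonly clientGroupID: string;
};
```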
In persist, when we write the memdag's base snapshot to the perdag, we tell the memdag via a call to chunksPersisted that it can move those chunks from its _memOnlyChunks (where chunks cannot be evicted) to its _sourceChunksCache (where chunks can be evicted). However, the only thing retaining these chunks in the perdag is the ClientGroup's head hash, which can be updated by another client. The new persistHash will retain these chunks in the case that another client updates the ClientGroup's head hash.

Fixes #434 #30
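A condensed sketch of that flow. The ordering mirrors the description; the dependency names (writeBaseSnapshotToPerdag, setPersistHash, memdagChunksPersisted) are invented for the sketch and are not Replicache's actual API:

```ts
// Sketch only: the shape of the persist flow, not the real implementation.
type Hash = string;

interface PersistDeps {
  // Copies the memdag base snapshot's chunks into the perdag.
  writeBaseSnapshotToPerdag(): Promise<{
    baseSnapshotHash: Hash;
    gatheredChunks: ReadonlySet<Hash>;
  }>;
  // Records persistHash on this client in the perdag.
  setPersistHash(h: Hash): Promise<void>;
  // Moves chunks from _memOnlyChunks (never evicted) to _sourceChunksCache
  // (evictable) inside the memdag.
  memdagChunksPersisted(hashes: Iterable<Hash>): Promise<void>;
}

async function persistSketch(deps: PersistDeps): Promise<void> {
  // 1. Write the base snapshot's chunks to the perdag.
  const {baseSnapshotHash, gatheredChunks} =
    await deps.writeBaseSnapshotToPerdag();

  // 2. Record persistHash so the chunks stay reachable in the perdag even if
  //    another client updates the ClientGroup's head hash.
  await deps.setPersistHash(baseSnapshotHash);

  // 3. Only now tell the memdag these chunks are persisted, allowing them to
  //    become evictable there.
  await deps.memdagChunksPersisted(gatheredChunks);
}
```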
BREAKING CHANGE: Updates the format version from 5 to 6 because the structure of persisted Client(s) in the ClientMap has changed.