@mkalinin mkalinin commented Jul 1, 2022

This change is proposed after the following situation occurred on MSF7:

  • CL was syncing and sent an fcU to EL
  • EL didn't respond, and CL timed out after waiting 8s for the response
    • The exact reason why EL didn't respond doesn't matter; in this particular case EL had started to sync and didn't respond with SYNCING
  • CL didn't retry the call later on, so processing of the beacon block was left unfinished, which bricked the sync process
  • The node got stuck in a deadlock: CL stopped sending anything to EL because it needed a signal from EL to proceed

To keep progressing in this scenario, CL could i) resend the fcU, ii) reprocess the entire beacon block, or do something else. These reactions are largely implementation-specific. The intention of the proposed change is therefore to recommend retrying a call after a timeout when doing so may be crucial for CL to keep progressing, which ultimately depends on the implementation.
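
For illustration only, here is a minimal Python sketch of the "retry after timeout" reaction, assuming a CL that wraps each EL call in a helper. The names `call_with_retry`, `make_flaky_el`, `max_attempts` and `retry_delay_s` are hypothetical and not part of any client or of the spec, the response dict is a simplified stand-in for a real fcU response, and only the 8s timeout comes from the situation described above. How many times to retry and what to do when retries are exhausted remain implementation-specific.

```python
import time
from typing import Any, Callable, Dict, Tuple

# The CL in the incident above gave up after 8s of waiting for the EL.
FCU_TIMEOUT_S = 8.0


def call_with_retry(
    send: Callable[[float], Any],
    timeout_s: float = FCU_TIMEOUT_S,
    max_attempts: int = 3,        # hypothetical policy knob
    retry_delay_s: float = 1.0,   # hypothetical policy knob
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call send(timeout_s); if it raises TimeoutError, wait and call again.

    Re-raises the last TimeoutError once max_attempts calls have timed out,
    so the caller can fall back to another reaction (e.g. reprocessing the
    beacon block).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return send(timeout_s)
        except TimeoutError:
            if attempt == max_attempts:
                raise
            sleep(retry_delay_s)


def make_flaky_el(timeouts_before_reply: int) -> Tuple[Callable[[float], Any], Dict[str, int]]:
    """Hypothetical stand-in for an EL that times out a few times, then replies."""
    calls = {"n": 0}

    def send(timeout_s: float) -> Any:
        calls["n"] += 1
        if calls["n"] <= timeouts_before_reply:
            raise TimeoutError("EL did not respond within %.0fs" % timeout_s)
        # Simplified response; a real fcU response carries more fields.
        return {"status": "SYNCING"}

    return send, calls
```

With `send, calls = make_flaky_el(2)`, a call to `call_with_retry(send)` times out twice and then returns the EL's reply on the third attempt, instead of leaving block processing unfinished after the first timeout.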

h/t @djrtwo for review
cc @parithosh

@lightclient lightclient added the A-engine Area: for future consideration label Jul 11, 2022
@djrtwo djrtwo merged commit 0b965fb into ethereum:main Jul 12, 2022