Reduce unnecessary data column downloads during backfill for full nodes #8308

@jimmygchen

Description

Full nodes currently download and validate more data columns than necessary when
backfilling finalized blocks. Backfilling exists to serve historical data to the
network, not to provide security guarantees, so the extra downloads waste bandwidth
and compute.

Present Behaviour

During data availability sampling for new blocks, full nodes custodying 4 columns are
required to sample 8 columns total. The 4 additional columns are downloaded, validated,
and discarded without being stored. This behavior is necessary for verifying data
availability of non-finalized blocks.

However, the same behavior currently applies during backfilling of finalized blocks,
where nodes download and validate the extra columns even though they won't be stored.

Proposed Changes

During backfilling of finalized blocks, full nodes should download, validate, and
store only the data columns they custody (e.g., 4 columns) instead of sampling the
full 8 columns. The 4 additional sampled columns are discarded anyway, and finalized
blocks do not need a fresh data availability check, so skipping them saves bandwidth
and computational resources without weakening security.
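
The proposed selection rule can be sketched as below. This is a minimal illustration, not Lighthouse's implementation: `ColumnIndex`, `FetchContext`, and `columns_to_request` are hypothetical names, and the column indices are arbitrary example values.

```rust
use std::collections::BTreeSet;

/// Hypothetical column index type (an assumption, not Lighthouse's actual type).
type ColumnIndex = u64;

/// Hypothetical enum describing why a block's columns are being fetched.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum FetchContext {
    /// Non-finalized block: data availability is verified by sampling.
    Sampling,
    /// Finalized block fetched during backfill: only stored columns are needed.
    Backfill,
}

/// Returns the set of columns to request for a block.
/// `custody` holds the columns the node stores; `extra_samples` holds the
/// additional columns that are validated and then discarded.
fn columns_to_request(
    custody: &BTreeSet<ColumnIndex>,
    extra_samples: &BTreeSet<ColumnIndex>,
    context: FetchContext,
) -> BTreeSet<ColumnIndex> {
    match context {
        // Present behaviour for new blocks: custody columns plus extra samples.
        FetchContext::Sampling => custody.union(extra_samples).copied().collect(),
        // Proposed behaviour for backfill: custody columns only.
        FetchContext::Backfill => custody.clone(),
    }
}

fn main() {
    // Arbitrary example indices: 4 custody columns and 4 extra sampled columns.
    let custody: BTreeSet<ColumnIndex> = [3u64, 17, 42, 99].iter().copied().collect();
    let extra: BTreeSet<ColumnIndex> = [5u64, 28, 64, 120].iter().copied().collect();

    let sampling = columns_to_request(&custody, &extra, FetchContext::Sampling);
    let backfill = columns_to_request(&custody, &extra, FetchContext::Backfill);

    println!(
        "sampling: {} columns, backfill: {} columns",
        sampling.len(),
        backfill.len()
    );
}
```

With the 4-custody, 8-sample figures from this issue, the backfill request drops from 8 columns to 4 per block, halving the column download for a full node.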

Original R&D Discord thread here.

Metadata

Labels

optimization: Something to make Lighthouse run more efficiently.
v8.1.0: Post-Fulu release
