Commit 787f4a1

feat: Add Consensus Peer Height Metrics (#5517)
---

<img width="1132" height="345" alt="Screenshot 2025-12-04 at 3 48 42 PM" src="https://github.com/user-attachments/assets/f6a93751-389f-4911-8c14-a8b389268545" />

#### PR checklist

- [ ] Tests written/updated
- [ ] Changelog entry added in `.changelog` (we use [unclog](https://github.com/informalsystems/unclog) to manage our changelog)
- [ ] Updated relevant documentation (`docs/` or `spec/`) and code comments

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> Introduce a per-peer consensus height metric and emit updates on NewRoundStep via the reactor.
>
> - **Consensus**:
>   - **Metrics**: Add `PeerHeight` gauge (`metrics.go`, `metrics.gen.go`) labeled by `peer_id` to track each peer's current consensus height.
>   - **Reactor**: On `NewRoundStepMessage`, enqueue to `statsMsgQueue` and update `Metrics.PeerHeight` per peer in `peerStatsRoutine`; minor refactor in switch variable naming.
> - **Changelog**: Add improvement entry for publishing peer height metric; minor formatting fixes.
>
> <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit f1d7bd9. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->
2 parents 7b2b241 + f1d7bd9 commit 787f4a1

4 files changed, 26 additions and 9 deletions
CHANGELOG.md

Lines changed: 10 additions & 7 deletions
@@ -11,7 +11,7 @@
 
 ### IMPROVEMENTS
 
-- `[mempool]` perf(mempool/cache): Optimize LRUTxCache.Remove to reduce lock contention and map access
+- `[mempool]` perf(mempool/cache): Optimize LRUTxCache.Remove to reduce lock contention and map access
   ([\#5244](https://github.com/cometbft/cometbft/pull/5244))
 - `[e2e]` add support for testing different keytypes, including BLS
   ([\#3513](https://github.com/cometbft/cometbft/pull/3513))
@@ -63,7 +63,7 @@
   ([\#2988](https://github.com/cometbft/cometbft/issues/2988))
 - `[types]` Significantly speedup types.MakePartSet and types.AddPart, which are used in creating a block proposal
   ([\#3117](https://github.com/cometbft/cometbft/issues/3117))
-- `[types] Make a new method `GetByAddressMut` for `ValSet`, which does not copy the returned validator.
+- `[types] Make a new method`GetByAddressMut` for `ValSet`, which does not copy the returned validator.
   ([\#3119](https://github.com/cometbft/cometbft/issues/3119))
 - `[consensus]` Make Vote messages only take one peerstate mutex
   ([\#3156](https://github.com/cometbft/cometbft/issues/3156))
@@ -82,6 +82,8 @@
 - `[consensus]` Make mempool updates asynchronous from consensus Commit's,
   reducing latency for reaching consensus timeouts.
   ([#3008](https://github.com/cometbft/cometbft/pull/3008))
+- [consensus] Add peer height metric publication to the consensus reactor's peer state.
+  ([#5517](https://github.com/cometbft/cometbft/pull/5517))
 
 ### BUG-FIXES
 
@@ -148,6 +150,7 @@ encouraged to upgrade as soon as possible.
 *December 20 2024*
 
 This release:
+
 - fixes a bug that caused a node produce errors caused by the sending of next PEX requests too soon.
   As a consequence of this incorrect behavior a node would be marked as BAD.
 - Adds a proper description of `ExtendedVoteInfo` and `VoteInfo` in the spec.
@@ -604,7 +607,7 @@ gossip.
 This release includes the second part of ABCI++, called ABCI 2.0.
 ABCI 2.0 introduces ABCI methods `ExtendVote` and `VerifyVoteExtension`.
 These new methods allow the application to add data (opaque to CometBFT),
-called _vote extensions_ to precommit votes sent by validators.
+called *vote extensions* to precommit votes sent by validators.
 These vote extensions are made available to the proposer(s) of the next height.
 Additionally, ABCI 2.0 coalesces `BeginBlock`, `DeliverTx`, and `EndBlock`
 into one method, `FinalizeBlock`, whose `Request*` and `Response*`
@@ -688,7 +691,7 @@ for people who forked CometBFT and interact directly with the indexers kvstore.
 - `[light]` Fixed an edge case where a light client would panic when attempting
   to query a node that (1) has started from a non-zero height and (2) does
   not yet have any data. The light client will now, correctly, not panic
-  _and_ keep the node in its list of providers in the same way it would if
+  *and* keep the node in its list of providers in the same way it would if
   it queried a node starting from height zero that does not yet have data
   ([\#575](https://github.com/cometbft/cometbft/issues/575))
 - `[abci]` Restore the snake_case naming in JSON serialization of
@@ -853,7 +856,7 @@ See below for more details.
   [\#77](https://github.com/cometbft/cometbft/pull/77). @jmalicevic
   ([\#382](https://github.com/cometbft/cometbft/pull/382))
 - `[consensus]` ([\#386](https://github.com/cometbft/cometbft/pull/386)) Short-term fix for the case when `needProofBlock` cannot find previous block meta by defaulting to the creation of a new proof block. (@adizere)
-  - Special thanks to the [Vega.xyz](https://vega.xyz/) team, and in particular to Zohar (@ze97286), for reporting the problem and working with us to get to a fix.
+  - Special thanks to the [Vega.xyz](https://vega.xyz/) team, and in particular to Zohar (@ze97286), for reporting the problem and working with us to get to a fix.
 - `[docker]` enable cross platform build using docker buildx
   ([\#9073](https://github.com/tendermint/tendermint/pull/9073))
 - `[consensus]` fix round number of `enterPropose`
@@ -942,7 +945,7 @@ to this release!
 - `[consensus]` Short-term fix for the case when `needProofBlock` cannot find
   previous block meta by defaulting to the creation of a new proof block.
   ([\#386](https://github.com/cometbft/cometbft/pull/386): @adizere)
-  - Special thanks to the [Vega.xyz](https://vega.xyz/) team, and in particular
+  - Special thanks to the [Vega.xyz](https://vega.xyz/) team, and in particular
     to Zohar (@ze97286), for reporting the problem and working with us to get to
     a fix.
 - `[p2p]` Correctly use non-blocking `TrySendEnvelope` method when attempting to
@@ -965,7 +968,7 @@ to this release!
 ### FEATURES
 
 - `[rpc]` Add `match_event` query parameter to indicate to the RPC that it
-  should match events _within_ attributes, not only within a height
+  should match events *within* attributes, not only within a height
   ([tendermint/tendermint\#9759](https://github.com/tendermint/tendermint/pull/9759))
 
 ### IMPROVEMENTS

consensus/metrics.gen.go

Lines changed: 7 additions & 0 deletions
Some generated files are not rendered by default.

consensus/metrics.go

Lines changed: 5 additions & 0 deletions
@@ -143,6 +143,11 @@ type Metrics struct {
 	// correspond to earlier heights and rounds than this node is currently
 	// in.
 	LateVotes metrics.Counter `metrics_labels:"vote_type"`
+
+	// PeerHeight is the consensus reactor's view of what height their peers are currently on.
+	// It is reported with a separate tag for every peer we are connected to, and updated when their height updates
+	// in our consensus state.
+	PeerHeight metrics.Gauge `metrics_labels:"peer_id"`
 }
 
 func (m *Metrics) MarkProposalProcessed(accepted bool) {
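
For readers unfamiliar with labeled gauges, the following is a minimal sketch of the behavior the `PeerHeight` field relies on: one value per `peer_id`, overwritten on each update. It uses only the standard library; `peerHeightGauge`, `newPeerHeightGauge`, `Set`, and `Get` are hypothetical names for illustration and are not the go-kit or CometBFT types used in this commit.

```go
package main

import (
	"fmt"
	"sync"
)

// peerHeightGauge is a hypothetical, stdlib-only stand-in for a gauge
// labeled by peer_id. It is NOT the metrics.Gauge type from the commit.
type peerHeightGauge struct {
	mu      sync.Mutex
	heights map[string]float64
}

func newPeerHeightGauge() *peerHeightGauge {
	return &peerHeightGauge{heights: make(map[string]float64)}
}

// Set records the latest height for one peer, overwriting the previous value.
func (g *peerHeightGauge) Set(peerID string, height int64) {
	g.mu.Lock()
	defer g.mu.Unlock()
	g.heights[peerID] = float64(height)
}

// Get returns the last height recorded for a peer, if any.
func (g *peerHeightGauge) Get(peerID string) (float64, bool) {
	g.mu.Lock()
	defer g.mu.Unlock()
	h, ok := g.heights[peerID]
	return h, ok
}

func main() {
	g := newPeerHeightGauge()
	g.Set("peer-a", 100)
	g.Set("peer-b", 98)
	g.Set("peer-a", 101) // a gauge keeps only the latest value per label
	h, _ := g.Get("peer-a")
	fmt.Println(h)
}
```

Unlike a counter, a gauge may move in either direction, which suits a value that is simply replaced with whatever height the peer last reported.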

consensus/reactor.go

Lines changed: 4 additions & 2 deletions
@@ -276,6 +276,7 @@ func (conR *Reactor) Receive(e p2p.Envelope) {
 				return
 			}
 			ps.ApplyNewRoundStepMessage(msg)
+			conR.conS.statsMsgQueue <- msgInfo{msg, e.Src.ID()}
 		case *NewValidBlockMessage:
 			ps.ApplyNewValidBlockMessage(msg)
 		case *HasVoteMessage:
@@ -984,7 +985,7 @@ func (conR *Reactor) peerStatsRoutine() {
 			if !ok {
 				panic(fmt.Sprintf("Peer %v has no state", peer))
 			}
-			switch msg.Msg.(type) {
+			switch concreteMsg := msg.Msg.(type) {
 			case *VoteMessage:
 				if numVotes := ps.RecordVote(); numVotes%votesToContributeToBecomeGoodPeer == 0 {
 					conR.Switch.MarkPeerAsGood(peer)
@@ -993,6 +994,8 @@ func (conR *Reactor) peerStatsRoutine() {
 				if numParts := ps.RecordBlockPart(); numParts%blocksToContributeToBecomeGoodPeer == 0 {
 					conR.Switch.MarkPeerAsGood(peer)
 				}
+			case *NewRoundStepMessage:
+				conR.Metrics.PeerHeight.With("peer_id", string(msg.PeerID)).Set(float64(concreteMsg.Height))
 			}
 		case <-conR.conS.Quit():
 			return
@@ -1669,7 +1672,6 @@ func (m *NewRoundStepMessage) ValidateHeight(initialHeight int64) error {
 			Field:  "Height",
 			Reason: fmt.Sprintf("%v should be lower than initial height %v", m.Height, initialHeight),
 		}
-
 	}
 
 	if m.Height == initialHeight && m.LastCommitRound != -1 {
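
The reactor change follows a producer/consumer pattern: `Receive` enqueues the message with its sender's ID, and `peerStatsRoutine` binds the concrete message in a type switch so it can read `Height`. A minimal sketch of that pattern, under simplifying assumptions, is shown here. The types `statsMsg`, `newRoundStep`, and `vote` and the function `consumeStats` are hypothetical stand-ins, and a plain map replaces the real gauge.

```go
package main

import "fmt"

// Hypothetical stand-ins for the consensus message types.
type newRoundStep struct{ Height int64 }
type vote struct{}

// statsMsg pairs a message with the ID of the peer that sent it,
// loosely mirroring the msgInfo value enqueued in the diff.
type statsMsg struct {
	Msg    interface{}
	PeerID string
}

// consumeStats drains the queue until it is closed and returns the last
// height reported by each peer. The map stands in for the labeled gauge.
func consumeStats(queue <-chan statsMsg) map[string]float64 {
	heights := make(map[string]float64)
	for m := range queue {
		// Binding the switch variable gives typed access to the message,
		// which is why the diff renames the switch to use concreteMsg.
		switch concrete := m.Msg.(type) {
		case *newRoundStep:
			heights[m.PeerID] = float64(concrete.Height)
		case *vote:
			// Vote accounting is omitted in this sketch.
		}
	}
	return heights
}

func main() {
	q := make(chan statsMsg, 3)
	q <- statsMsg{Msg: &newRoundStep{Height: 7}, PeerID: "peer-a"}
	q <- statsMsg{Msg: &vote{}, PeerID: "peer-a"}
	q <- statsMsg{Msg: &newRoundStep{Height: 8}, PeerID: "peer-a"}
	close(q)
	fmt.Println(consumeStats(q))
}
```

Routing the update through the queue keeps metric bookkeeping out of the message-handling path, at the cost of the metric lagging slightly behind the peer state.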
