Measure bloom filter AppGossip hit rate #4085

yacovm · 2025-07-14T23:03:28Z

Why this should be merged

It is useful to measure how effective and efficient our bloom-filter-based gossip pull mechanism is.

How this works

This commit adds a metric to measure the bloom filter hit rate % for the gossip pull queries.
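To make the idea concrete, here is a minimal sketch of how hits and misses could be counted for one pull query. It is illustrative only and is not the code in this PR: `countHits` and the `has` callback are hypothetical names, and a map-backed set stands in for the bloom filter. It also assumes that a "hit" means the requester's filter already contains the item.

```go
package main

import "fmt"

// countHits reports how many items the filter already contains (hits) and how
// many it does not (misses). The has callback is a hypothetical stand-in for a
// bloom filter membership check.
func countHits(items []string, has func(string) bool) (hits, misses uint64) {
	for _, item := range items {
		if has(item) {
			hits++
		} else {
			misses++
		}
	}
	return hits, misses
}

func main() {
	// A map-backed set stands in for the bloom filter. A real bloom filter can
	// also return false positives, which this sketch would count as hits.
	known := map[string]struct{}{"tx1": {}, "tx3": {}}
	has := func(id string) bool {
		_, ok := known[id]
		return ok
	}

	hits, misses := countHits([]string{"tx1", "tx2", "tx3", "tx4"}, has)
	fmt.Printf("hits=%d misses=%d\n", hits, misses)
}
```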

How this was tested

Updated TestGossiperGossip to reflect the new metric.
Ran a modified avalanchego node with Prometheus and Grafana installed and observed the following:

Need to be documented in RELEASES.md?

No.

Copilot

Pull Request Overview

This PR adds instrumentation to measure the effectiveness of the bloom filter used in the gossip pull mechanism.

- Track and compute bloom filter hits and misses in AppRequest
- Introduce a bloomFilterHitRate histogram in the metrics and register it
- Extend unit tests to cover the new histogram and the computeBloomFilterHitPercentage function

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| network/p2p/gossip/handler.go | Count filter hits/misses and compute hit percentage |
| network/p2p/gossip/gossip.go | Add `bloomFilterHitRate` histogram and register it |
| network/p2p/gossip/gossip_test.go | Update tests to assert metric observation and new logic |

Comments suppressed due to low confidence (3)

network/p2p/gossip/gossip.go:115

[nitpick] Metric name bloomfilter_hit_rate is inconsistent with the project’s snake_case convention; consider renaming to bloom_filter_hit_rate or prefixing with gossip_ to match other metric names.

			Name:      "bloomfilter_hit_rate",

network/p2p/gossip/gossip_test.go:633

[nitpick] Consider adding tests for the error branches in computeBloomFilterHitPercentage (e.g., when safemath.Add or Mul overflows) to verify that ok is false in those cases.

func TestComputeBloomFilterHitPercentage(t *testing.T) {

network/p2p/gossip/handler.go:18

The computeBloomFilterHitPercentage function uses zap.Uint64 and zap.Error but go.uber.org/zap is not imported, causing a compile error. Add import "go.uber.org/zap" or switch to the existing logging API.

	safemath "github.com/ava-labs/avalanchego/utils/math"

This commit adds a metric to measure the bloom filter hit rate % for the gossip pull queries. Signed-off-by: Yacov Manevich <yacov.manevich@avalabs.org>

joshua-kim · 2025-07-15T16:12:55Z

network/p2p/gossip/handler.go

+	total, err := safemath.Add(hits, misses)
+	if err != nil {
+		log.Warn("failed to calculate total hits and misses",
+			zap.Uint64("hits", hits),
+			zap.Uint64("misses", misses),
+			zap.Error(err),
+		)
+		return 0, false
+	}
+
+	hitsOneHundred, err := safemath.Mul(hits, 100)
+	if err != nil {
+		log.Warn("failed to calculate hit ratio",
+			zap.Uint64("hits", hits),
+			zap.Uint64("misses", misses),
+			zap.Error(err),
+		)
+		return 0, false
+	}


I wonder if we can get rid of these overflow checks... since the bloom filter is []byte (which can be at most of size max int) could we just emit a metric on both the hits + length of the filter? We could figure out the hit/miss % in dashboards with metric math.

> I wonder if we can get rid of these overflow checks... since the bloom filter is []byte (which can be at most of size max int) could we just emit a metric on both the hits + length of the filter? We could figure out the hit/miss % in dashboards with metric math.

I am not sure what you mean by the length of the filter. The bloom filter has a constant size, so its length is always the same and is independent of the elements that it can "contain".

I guess you meant the size of the mempool, since the maximum number of hits we can get is the size of the mempool, not the size of the bloom filter? Sure, I can do that; that's a good point.

Unfortunately, we don't have a Size() on the set interface, so I will try to code it without safemath.

Changed the code to not use safemath.
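For readers following along, one possible way to compute the percentage without overflow checks is to do the arithmetic in floating point. This is a sketch under that assumption, not necessarily what the updated commit does; `hitPercentage` is a hypothetical name.

```go
package main

import "fmt"

// hitPercentage returns hits as a percentage of hits+misses.
// The counters are converted to float64 before they are added, so the sum
// cannot wrap around the way a uint64 sum could. The second return value is
// false when there were no lookups at all, in which case no rate is defined.
func hitPercentage(hits, misses uint64) (float64, bool) {
	total := float64(hits) + float64(misses)
	if total == 0 {
		return 0, false
	}
	return 100 * float64(hits) / total, true
}

func main() {
	if pct, ok := hitPercentage(3, 1); ok {
		fmt.Printf("hit rate: %.1f%%\n", pct)
	}
	if _, ok := hitPercentage(0, 0); !ok {
		fmt.Println("no lookups, nothing to observe")
	}
}
```

The trade-off is precision: float64 cannot represent every integer above 2^53 exactly, which should not matter for a percentage metric.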

joshua-kim · 2025-07-15T16:14:29Z

network/p2p/gossip/gossip_test.go

+
+			testHistogram := &testHistogram{
+				Histogram: metrics.bloomFilterHitRate,
+			}
+			metrics.bloomFilterHitRate = testHistogram
+


This looks a bit hacky to me. If we want to assert that the metrics have the values we expect, would it make sense to use prometheus/testutil here?

testutil.ToFloat64 doesn't work with histograms. Do you have a suggestion?
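For context, the wrapper approach in the diff above can be sketched in isolation as follows. This is a simplified illustration rather than the test code in this PR: `observer`, `recordingObserver` and `noopObserver` are hypothetical names, and the one-method `observer` interface stands in for the histogram's `Observe` method.

```go
package main

import "fmt"

// observer is a hypothetical, minimal stand-in for the Observe method of a
// histogram metric.
type observer interface {
	Observe(float64)
}

// recordingObserver wraps another observer, remembers every observed value,
// and then forwards the value to the wrapped observer if one is set.
type recordingObserver struct {
	observer
	observed []float64
}

// Observe overrides the embedded method so that tests can inspect the values.
func (r *recordingObserver) Observe(v float64) {
	r.observed = append(r.observed, v)
	if r.observer != nil {
		r.observer.Observe(v)
	}
}

// noopObserver discards observations; it plays the role of the real metric.
type noopObserver struct{}

func (noopObserver) Observe(float64) {}

func main() {
	rec := &recordingObserver{observer: noopObserver{}}

	// Code under test only sees the observer interface.
	var hist observer = rec
	hist.Observe(75)
	hist.Observe(100)

	fmt.Println(rec.observed)
}
```

A test can then assert on `rec.observed` directly instead of reading values back out of the metrics registry.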

Copilot AI review requested due to automatic review settings July 14, 2025 23:03

yacovm requested a review from joshua-kim as a code owner July 14, 2025 23:03

github-project-automation bot added this to avalanchego Jul 14, 2025

Copilot AI reviewed Jul 14, 2025

Measure bloom filter AppGossip hit rate

e89b59e

This commit adds a metric to measure the bloom filter hit rate % for the gossip pull queries. Signed-off-by: Yacov Manevich <yacov.manevich@avalabs.org>

yacovm force-pushed the bloom_hit_rate branch from aa23054 to e89b59e Compare July 15, 2025 12:44

yacovm marked this pull request as draft July 15, 2025 16:13

joshua-kim reviewed Jul 15, 2025

Address code review comments

b21c5a5

Signed-off-by: Yacov Manevich <yacov.manevich@avalabs.org>

yacovm force-pushed the bloom_hit_rate branch from c49be06 to b21c5a5 Compare July 15, 2025 18:33

yacovm marked this pull request as ready for review July 15, 2025 18:36

Merge branch 'master' into bloom_hit_rate

a87154f
