Skip to content

Measure bloom filter AppGossip hit rate #4085

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

yacovm
Copy link
Contributor

@yacovm yacovm commented Jul 14, 2025

Why this should be merged

It can be insightful to measure how effective and efficient is our gossip pull mechanism that is based on bloom filters.

How this works

This commit adds a metric to measure the bloom filter hit rate % for the gossip pull queries.

How this was tested

  • Updated TestGossiperGossip accordingly to reflect the metrics
  • Ran a modified avalanchego node and installed prometheus and grafana and observed the following:
image

Need to be documented in RELEASES.md?

No.

@Copilot Copilot AI review requested due to automatic review settings July 14, 2025 23:03
@yacovm yacovm requested a review from joshua-kim as a code owner July 14, 2025 23:03
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds instrumentation to measure the effectiveness of the bloom filter used in the gossip pull mechanism.

  • Track and compute bloom filter hits and misses in AppRequest
  • Introduce a bloomFilterHitRate histogram in the metrics and register it
  • Extend unit tests to cover the new histogram and the computeBloomFilterHitPercentage function

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
network/p2p/gossip/handler.go Count filter hits/misses and compute hit percentage
network/p2p/gossip/gossip.go Add bloomFilterHitRate histogram and register it
network/p2p/gossip/gossip_test.go Update tests to assert metric observation and new logic
Comments suppressed due to low confidence (3)

network/p2p/gossip/gossip.go:115

  • [nitpick] Metric name bloomfilter_hit_rate is inconsistent with the project’s snake_case convention; consider renaming to bloom_filter_hit_rate or prefixing with gossip_ to match other metric names.
			Name:      "bloomfilter_hit_rate",

network/p2p/gossip/gossip_test.go:633

  • [nitpick] Consider adding tests for the error branches in computeBloomFilterHitPercentage (e.g., when safemath.Add or Mul overflows) to verify that ok is false in those cases.
func TestComputeBloomFilterHitPercentage(t *testing.T) {

network/p2p/gossip/handler.go:18

  • The computeBloomFilterHitPercentage function uses zap.Uint64 and zap.Error but go.uber.org/zap is not imported, causing a compile error. Add import "go.uber.org/zap" or switch to the existing logging API.
	safemath "github.com/ava-labs/avalanchego/utils/math"

This commit adds a metric to measure the bloom filter hit rate % for the gossip pull queries.

Signed-off-by: Yacov Manevich <yacov.manevich@avalabs.org>
@yacovm yacovm marked this pull request as draft July 15, 2025 16:13
Comment on lines 140 to 148
total, err := safemath.Add(hits, misses)
if err != nil {
log.Warn("failed to calculate total hits and misses",
zap.Uint64("hits", hits),
zap.Uint64("misses", misses),
zap.Error(err),
)
return 0, false
}

hitsOneHundred, err := safemath.Mul(hits, 100)
if err != nil {
log.Warn("failed to calculate hit ratio",
zap.Uint64("hits", hits),
zap.Uint64("misses", misses),
zap.Error(err),
)
return 0, false
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we can get rid of these overflow checks... since the bloom filter is []byte (which can be at most of size max int) could we just emit a metric on both the hits + length of the filter? We could figure out the hit/miss % in dashboards with metric math.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we can get rid of these overflow checks... since the bloom filter is []byte (which can be at most of size max int) could we just emit a metric on both the hits + length of the filter? We could figure out the hit/miss % in dashboards with metric math.

I am not sure what you mean by the length of the filter. The bloom filter is constant size and its length is always the same, but yet independent of the elements that it can "contain".

I guess you meant the size of the mempool? Since the maximum number of hits we can get is the size of the mempool, not the bloom filter. Sure I can do that, that's a good point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately we don't have a Size() on the set interface so I will try to code it without the safemath.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the code to not use safemath.

Comment on lines +134 to +139

testHistogram := &testHistogram{
Histogram: metrics.bloomFilterHitRate,
}
metrics.bloomFilterHitRate = testHistogram

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks a bit hacky to me, if we want to do an assert that the metrics are the values we're expecting would it make sense for us to consider using prometheus/testutil here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testutil.ToFloat64 doesn't work with histograms. Do you have a suggestion?

Signed-off-by: Yacov Manevich <yacov.manevich@avalabs.org>
@yacovm yacovm marked this pull request as ready for review July 15, 2025 18:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

2 participants