Improved Gateway HTTP request metrics #8441

Closed
gmasgras opened this issue Sep 16, 2021 · 2 comments · Fixed by #8443

Comments

@gmasgras

gmasgras commented Sep 16, 2021

Wishlist

from @gmasgras

  • Merging feat: register first block metric by default #8332 means that we can now track time to first block 🎉 However, only the sum and count metrics are exported. It would be very useful to also export this metric as a histogram (e.g. unixfs_get_latency_seconds_bucket) so that we can calculate percentiles (p90, p95) and get a better understanding of outliers. A good set of buckets could be 100ms, 500ms, 1s, 2s, 3s, 5s, 8s, 13s.
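
To illustrate why the bucket series matters: sum and count alone only give a mean, while cumulative buckets allow a percentile estimate by interpolating inside the bucket that contains the target rank (similar in spirit to what PromQL's histogram_quantile does). The sketch below is stdlib-only Go; the `Histogram` type, its methods, and the sample values are hypothetical and are not the Prometheus client or the go-ipfs implementation. It assumes positive bucket bounds in seconds.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// proposedBuckets holds the upper bounds, in seconds, suggested in this issue.
var proposedBuckets = []float64{0.1, 0.5, 1, 2, 3, 5, 8, 13}

// Histogram is a hypothetical, stdlib-only stand-in for a cumulative
// Prometheus-style histogram. It is not safe for concurrent use.
type Histogram struct {
	bounds []float64 // sorted upper bounds, in seconds
	counts []uint64  // counts[i] = observations <= bounds[i]; last entry is +Inf
	sum    float64   // sum of all observed values
}

// NewHistogram copies and sorts the given upper bounds.
func NewHistogram(bounds []float64) *Histogram {
	b := append([]float64(nil), bounds...)
	sort.Float64s(b)
	return &Histogram{bounds: b, counts: make([]uint64, len(b)+1)}
}

// Observe records one sample in every bucket whose upper bound covers it.
func (h *Histogram) Observe(seconds float64) {
	h.sum += seconds
	for i := sort.SearchFloat64s(h.bounds, seconds); i < len(h.counts); i++ {
		h.counts[i]++
	}
}

// Count returns the total number of observations (the +Inf bucket).
func (h *Histogram) Count() uint64 { return h.counts[len(h.counts)-1] }

// Mean is the only statistic that sum and count alone can provide.
func (h *Histogram) Mean() float64 { return h.sum / float64(h.Count()) }

// Quantile estimates the q-quantile (0 <= q <= 1) by linear interpolation
// within the first bucket whose cumulative count reaches the target rank.
func (h *Histogram) Quantile(q float64) float64 {
	total := h.Count()
	if total == 0 || len(h.bounds) == 0 {
		return math.NaN()
	}
	rank := q * float64(total)
	i := sort.Search(len(h.counts), func(j int) bool {
		return float64(h.counts[j]) >= rank
	})
	if i >= len(h.bounds) {
		// The rank falls in the +Inf bucket: report the highest finite bound.
		return h.bounds[len(h.bounds)-1]
	}
	lower, below := 0.0, 0.0 // the first bucket is assumed to start at 0
	if i > 0 {
		lower, below = h.bounds[i-1], float64(h.counts[i-1])
	}
	inBucket := float64(h.counts[i]) - below
	if inBucket == 0 {
		return lower
	}
	return lower + (h.bounds[i]-lower)*(rank-below)/inBucket
}

func main() {
	h := NewHistogram(proposedBuckets)
	// Made-up samples: mostly fast responses plus a slow tail.
	for i := 0; i < 16; i++ {
		h.Observe(0.05)
	}
	for i := 0; i < 3; i++ {
		h.Observe(0.7)
	}
	h.Observe(10)
	fmt.Printf("mean=%.3fs p50=%.3fs p90=%.3fs p99=%.3fs\n",
		h.Mean(), h.Quantile(0.5), h.Quantile(0.9), h.Quantile(0.99))
}
```

With these made-up samples the mean is pulled up by a single slow request, while the percentile estimates separate the fast majority from the tail. The estimates are only as fine as the bucket layout allows.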

from @lidel

  • Merging Gateway support for /ipfs/{cid}?format=car|raw|... #8234 will introduce response types other than Unixfs (block, car), and we are also planning native support for dag-json and dag-cbor on Gateways
    • This means we need solid metrics that are format agnostic (e.g. time to first block)
      • gw_first_root_block_get_latency_seconds with time to return the root block – /ipfs/{cid}, /ipns/{example.com} (low utility: we care more about the first content block)
      • gw_first_content_block_get_latency_seconds with time to first content block of specific resource – /ipfs/{cid}/some/file.jpg
      • histogram
    • nice to have: metrics per response type
      • unixfs file
      • generated directory listing
      • raw block
      • car stream
      • (future) dag-json, dag-cbor
      • request count and time to generate the full response may be useful to track per type
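
A rough sketch of the "metrics per response type" idea, assuming one label value per response kind and a request count plus cumulative response time tracked for each. Everything here is hypothetical: the `ResponseType` names, the `Classify` mapping from the `?format=` value, and the `PerTypeMetrics` type are for illustration only and do not reflect how the gateway actually decides the response type or exports metrics.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// ResponseType is a hypothetical label value; the names are illustrative only.
type ResponseType string

const (
	UnixFSFile ResponseType = "unixfs-file"
	DirListing ResponseType = "dir-listing"
	RawBlock   ResponseType = "raw-block"
	CarStream  ResponseType = "car-stream"
)

// Classify maps a ?format= query value and a directory flag to a label.
// The mapping is an assumption for illustration, not the gateway's logic.
func Classify(format string, isDir bool) ResponseType {
	switch format {
	case "car":
		return CarStream
	case "raw":
		return RawBlock
	}
	if isDir {
		return DirListing
	}
	return UnixFSFile
}

// typeStats holds the two values suggested above for each response type.
type typeStats struct {
	count uint64
	total time.Duration
}

// PerTypeMetrics tracks request count and cumulative response time per type.
type PerTypeMetrics struct {
	mu    sync.Mutex
	stats map[ResponseType]typeStats
}

// NewPerTypeMetrics returns an empty, ready-to-use collector.
func NewPerTypeMetrics() *PerTypeMetrics {
	return &PerTypeMetrics{stats: make(map[ResponseType]typeStats)}
}

// Observe records one completed response of type t that took d to generate.
func (m *PerTypeMetrics) Observe(t ResponseType, d time.Duration) {
	m.mu.Lock()
	defer m.mu.Unlock()
	s := m.stats[t]
	s.count++
	s.total += d
	m.stats[t] = s
}

// Get returns the request count and total response time recorded for t.
func (m *PerTypeMetrics) Get(t ResponseType) (uint64, time.Duration) {
	m.mu.Lock()
	defer m.mu.Unlock()
	s := m.stats[t]
	return s.count, s.total
}

func main() {
	m := NewPerTypeMetrics()
	// Made-up durations, for illustration only.
	m.Observe(Classify("car", false), 2*time.Second)
	m.Observe(Classify("car", false), 4*time.Second)
	m.Observe(Classify("", false), 150*time.Millisecond)
	m.Observe(Classify("", true), 40*time.Millisecond)
	for _, t := range []ResponseType{UnixFSFile, DirListing, RawBlock, CarStream} {
		count, total := m.Get(t)
		fmt.Printf("%-12s count=%d total=%s\n", t, count, total)
	}
}
```

In a real exporter the per-type split would more likely be a label on a single metric than a separate map, but the bookkeeping is the same.
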
@gmasgras gmasgras added the kind/enhancement A net-new feature or improvement to an existing feature label Sep 16, 2021
@BigLep BigLep added this to the go-ipfs 0.11 milestone Sep 22, 2021
@BigLep BigLep modified the milestones: go-ipfs 0.11, go-ipfs 0.13 Nov 23, 2021
@BigLep BigLep modified the milestones: go-ipfs 0.13, Best Effort Track Mar 3, 2022
@lidel lidel changed the title Export unixfs_get_latency_seconds as histogram Improved Gateway metrics Mar 8, 2022
@lidel lidel assigned lidel and unassigned aschmahmann Mar 8, 2022
lidel added a commit that referenced this issue Mar 8, 2022
Include block and car in unixfs_get_latency_seconds for now,
so we keep basic visibility into gateway behavior until better metrics
are added by #8441
@lidel lidel changed the title Improved Gateway metrics Improved Gateway HTTP request metrics Mar 10, 2022
@thibmeu
Contributor

thibmeu commented Mar 17, 2022

Regarding the set of buckets, what is the reasoning behind these values?
I feel like the granularity in the seconds unit is important, while there is very little information for blocks that are retrieved quickly.

I would suggest 50ms, 100ms, 250ms, 500ms, 1s, 2s, 5s, 10s. This is still arbitrary, but I feel it could capture more information.
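
A quick way to compare the two layouts, as a stdlib-only Go sketch: count how many bucket boundaries sit below one second, and see how a handful of samples spread across each layout. The helper names and the sample latencies are made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// Upper bounds in seconds for the two layouts discussed in this thread.
var (
	originalBuckets  = []float64{0.1, 0.5, 1, 2, 3, 5, 8, 13}
	suggestedBuckets = []float64{0.05, 0.1, 0.25, 0.5, 1, 2, 5, 10}
)

// boundsBelow counts upper bounds strictly below limit; bounds must be sorted.
func boundsBelow(bounds []float64, limit float64) int {
	return sort.SearchFloat64s(bounds, limit)
}

// perBucket returns non-cumulative counts per bucket for sorted bounds.
// The final entry collects samples above the highest bound.
func perBucket(bounds, samples []float64) []int {
	counts := make([]int, len(bounds)+1)
	for _, s := range samples {
		counts[sort.SearchFloat64s(bounds, s)]++
	}
	return counts
}

func main() {
	// Made-up latencies in seconds, for illustration only.
	samples := []float64{0.03, 0.08, 0.2, 0.4, 1.5, 4, 9}

	fmt.Println("sub-second bounds, original: ", boundsBelow(originalBuckets, 1))
	fmt.Println("sub-second bounds, suggested:", boundsBelow(suggestedBuckets, 1))
	fmt.Println("original: ", perBucket(originalBuckets, samples))
	fmt.Println("suggested:", perBucket(suggestedBuckets, samples))
}
```

The original layout has two boundaries below one second and the suggested one has four, so fast retrievals are told apart more finely at the cost of coarser steps in the multi-second range.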

@gmasgras
Author

My reasoning for most of the values is that it's common to see buckets using the Fibonacci sequence. 30s and 60s are personal preference + they align with some of our timeouts on the gateways.
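
For reference, the whole-second part of the original proposal (1s, 2s, 3s, 5s, 8s, 13s) can be generated rather than hard-coded. A minimal Go sketch, where `fibonacciBuckets` is a hypothetical helper and any non-Fibonacci bounds would still need to be added by hand:

```go
package main

import "fmt"

// fibonacciBuckets returns n upper bounds, in seconds, following the
// Fibonacci sequence starting at 1, 2. It is a hypothetical helper.
func fibonacciBuckets(n int) []float64 {
	if n <= 0 {
		return nil
	}
	bounds := make([]float64, 0, n)
	a, b := 1.0, 2.0
	for i := 0; i < n; i++ {
		bounds = append(bounds, a)
		a, b = b, a+b
	}
	return bounds
}

func main() {
	fmt.Println(fibonacciBuckets(6)) // prints [1 2 3 5 8 13]
}
```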


5 participants