Is Prometheus suitable for monitoring individual media streams in SRS? #3141
Take a look at the "Use labels" guidance provided by Prometheus.
Its point is to use labels instead of creating a separate metric for each stream. Typically, there are only a few dozen metrics, not hundreds or thousands.
If you want to categorize metrics, use labels rather than defining a separate metric for each category.
I don't know whether you are actually hitting performance issues with Prometheus. How many machines do you have? How many streams? How are the metrics defined? How are the labels defined?
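
To make the labels suggestion concrete, here is a minimal sketch that renders one gauge in the Prometheus text exposition format, with each stream distinguished by labels rather than by its own metric name. The metric name `stream_video_bitrate_kbps`, the label names, and the sample values are assumptions chosen for illustration; they are not taken from the SRS exporter.

```python
def escape_label_value(value):
    """Escape backslash, double quote and newline, as the text format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_gauge(name, help_text, samples):
    """Render one gauge family; `samples` is a list of (labels_dict, value) pairs."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        # Sort label keys so the output is stable between scrapes.
        label_str = ",".join(
            f'{key}="{escape_label_value(val)}"' for key, val in sorted(labels.items())
        )
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical streams: one metric name, many label combinations.
bitrate_samples = [
    ({"vhost": "__defaultVhost__", "app": "live", "stream": "camera1"}, 2500.0),
    ({"vhost": "__defaultVhost__", "app": "live", "stream": "camera2"}, 1800.0),
]

exposition = render_gauge(
    "stream_video_bitrate_kbps", "Video bitrate per stream.", bitrate_samples
)
print(exposition)
```

Every distinct label combination still becomes its own time series, so the number of streams drives the series count even though there is only one metric name.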
|
This scenario:
In this scenario, when aggregating data using Grafana, there may be performance issues when matching and filtering through different labels. For example, querying all data for a specific stream using the stream ID label, querying all streams within a certain time period, querying all streams with poor network conditions, or querying streams with frequent reconnections from the streaming source.
|
Prometheus metrics are generally suited to aggregation. Per-stream facts such as start time and end time are not a good fit for Prometheus; they belong in a log system like ELK or in APM/Trace. After processing and filtering in those systems, the results can still be displayed through Grafana. For more details, you can refer to the article Metrics, tracing, and logging.
Generally speaking, Prometheus belongs to the Metrics category: it is used for alerting and aggregates many metrics, so the data stored in Prometheus should stay relatively small. For example, if streams are having problems, the alert should collect each stream's error count and aggregate it into the number of normal and abnormal streams across the whole network.
Querying the streams within a specific time period, or analyzing streams with poor network conditions, is more of a task for data analysis tools like ELK or APM. These tools are part of the operations system, which should not rely solely on alerts, Prometheus, or metrics. Relying solely on metrics leads to excessive usage and high system load, resulting in slow queries.
Add me on WeChat to chat? We are currently designing the official SRS exporter and welcome your participation.
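
As a rough sketch of the aggregation described above, the snippet below collapses per-stream error counts into just two series, normal and abnormal, before anything is exported. The stream names, the `ERROR_THRESHOLD` cut-off, and the `streams_total` metric name are all hypothetical.

```python
from collections import Counter

# Hypothetical per-stream error counts gathered over one scrape interval.
stream_errors = {
    "live/camera1": 0,
    "live/camera2": 7,
    "live/camera3": 1,
    "live/camera4": 0,
}

ERROR_THRESHOLD = 5  # assumed cut-off; tune it for your deployment


def classify_streams(errors_by_stream, threshold):
    """Count streams as 'normal' or 'abnormal' by comparing errors to a threshold."""
    states = Counter({"normal": 0, "abnormal": 0})
    for error_count in errors_by_stream.values():
        states["abnormal" if error_count >= threshold else "normal"] += 1
    return dict(states)


summary = classify_streams(stream_errors, ERROR_THRESHOLD)
for state, count in summary.items():
    # Two series in total, no matter how many streams exist.
    print(f'streams_total{{state="{state}"}} {count}')
```

The per-stream detail is lost on purpose here; it would go to the log or trace system, while Prometheus only sees the two aggregate series.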
|
In general, unless you are dealing with on the order of a hundred thousand streams or a million plays, Prometheus is fully capable. SRS now supports a Prometheus exporter, and we will continue to add new metrics. Please refer to #2899.
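
To check where a given deployment sits relative to that scale, a back-of-the-envelope estimate can help. The sketch below uses the rough capacity formula from the Prometheus storage documentation (retention time × ingested samples per second × bytes per sample). The input numbers and the default of 2 bytes per sample are assumptions, not measurements from SRS.

```python
def estimate_load(streams, metrics_per_stream, scrape_interval_s,
                  retention_days, bytes_per_sample=2.0):
    """Roughly estimate active series, ingest rate and disk use for per-stream metrics.

    Assumes every stream exposes the same number of series and that a sample
    costs about `bytes_per_sample` bytes on disk after compression.
    """
    active_series = streams * metrics_per_stream
    samples_per_second = active_series / scrape_interval_s
    retention_seconds = retention_days * 24 * 3600
    disk_bytes = retention_seconds * samples_per_second * bytes_per_sample
    return active_series, samples_per_second, disk_bytes


# Hypothetical deployment: 1000 streams, 10 series each, 15 s scrapes, 15 days kept.
series, rate, disk = estimate_load(
    streams=1000, metrics_per_stream=10, scrape_interval_s=15, retention_days=15
)
print(f"{series} series, {rate:.0f} samples/s, {disk / 1e9:.2f} GB")
```

Streams that come and go with fresh IDs add new series over time, so the real series count can be noticeably higher than this static estimate.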
|
Is there a conclusion yet?
|
Update: In about 99% of use cases, which covers virtually all scenarios, Prometheus can support stream-level monitoring data. SRS's metrics will continue to improve over time.
|
Our scenario is to use Prometheus to collect statistics such as bitrate and fps for each video stream in real time. We have deployed our own Prometheus instance, which is limited by the storage capacity of a single machine, and we use Grafana for visualization.
In practice, we found that the data volume is too large and Prometheus easily runs into performance bottlenecks. We would like to discuss whether Prometheus is only suitable for collecting aggregate information across all streams, and not for monitoring the status of each individual stream.