Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL Server input plugin: Add maximum log file size #4382

Closed
matthenning opened this issue Jul 5, 2018 · 10 comments · Fixed by #6869
Closed

SQL Server input plugin: Add maximum log file size #4382

matthenning opened this issue Jul 5, 2018 · 10 comments · Fixed by #6869
Labels
area/sqlserver feature request Requests for new plugin and for new features to existing plugins
Milestone

Comments

@matthenning
Copy link

matthenning commented Jul 5, 2018

Feature Request

Proposal:

To monitor SQL Server log file usage you require the file size, the maximum file size and the file usage.
The query could look like this:

WITH fs
AS
(
    SELECT database_id, type, size * 8 / 1024 size, max_size * 8 / 1024 max_size
    FROM sys.master_files
    WHERE database_id NOT IN (3,1,4)
)
SELECT
    name,
    (SELECT SUM(size) FROM fs WHERE type = 0 AND fs.database_id = db.database_id) DataFileSizeMB,
    (SELECT SUM(max_size) FROM fs WHERE type = 1 AND fs.database_id = db.database_id) LogFileSizeMB
FROM sys.databases db
WHERE database_id NOT IN (3,1,4) 

Current behavior:

Currently the SQL Server Telegraf plugin only collects the file size and the file usage.

Desired behavior:

The maximum file size is also collected

Use case:

I want to monitor SQL Server log file usage with Telegraf, InfluxDB and Kapacitor.
If the current file usage is 100% of the current file size there is no reason to trigger an alert as long as the maximum file size is not yet reached. The maximum size isn't collected by the SQL Server plugin though, so I have the source this metric from somewhere else.
It would be great to have this included in the Telegraf plugin.

@glinton
Copy link
Contributor

glinton commented Jul 5, 2018

What alert triggers have you configured in kapacitor? It seems like you just need to set a size[s] that you'd like to be alerted for (by kapacitor) when telegraf reports that the file has reached that size. I'm thinking this isn't a thing for telegraf.

@danielnelson
Copy link
Contributor

I think this would just make the Kapacitor alerting based on the values configured in the Database. @matthenning Are you using query_version = 2?

@matthenning
Copy link
Author

@danielnelson Yes, query_version = 2 ist set.

@glinton I agree that it's not a performance metric per se but defining a static value in the Kapacitor task would require the DBA to open a ticket with the monitoring team every time the max file size is adjusted. And because the value can be adjusted frequently or even by some kind of automated process I think collecting it would be the way to go.

@danielnelson
Copy link
Contributor

@m82labs Is is another issue I would appreciate it if you could review, does it seem like a good idea and does it fit well with the existing philosophy of the new format? Things that you don't feel like they are good fits could go into a input similar to postgresql_extensible, we already have several submissions for something like this which are currently just waiting on us.

@m82labs
Copy link
Contributor

m82labs commented Jul 12, 2018

I tend to agree with @matthenning on this one. It is not a perf metric, but if you monitor a lot of database servers or a lot of databases with varying log file sizes, something like this could be handy.

When I re-worked this plugin it was with the goal of decreasing the amount of time it took for me and my team to troubleshoot issues (perf issues, space issues, etc). I never imagined using any of these metrics for alerting purposes so I think the design might reflect that.

I think we could fit this metric in with the other perf counters by "faking it". Just adding the data to the table variable used in the query that gets perf counters, and coming up with a good counter and object name for this. I already do this to increase the amount of resource governor data the plugin pulls in: https://github.com/influxdata/telegraf/blob/master/plugins/inputs/sqlserver/sqlserver.go#L516

@danielnelson danielnelson added feature request Requests for new plugin and for new features to existing plugins and removed need more info labels Jul 12, 2018
@nwneisen
Copy link
Contributor

I started taking a look at this and got metrics going into InfluxDB.

I wasn't sure what the differences are between query_version = 1 and query_version = 2 and I can't seem to find any documentation from Microsoft online. I only added a v2 query in the PR.

V2 queries were also fairly consistent about using SET DEADLOCK_PRIORITY -10; while V1 queries are each setting a number of different properties. I can go through and check whether each property is appropriate unless there are specific properties desired for Telegraf queries.

@nwneisen
Copy link
Contributor

I realized after opening a PR that the query version information was in the README. My only question now would be if a query should be added for V1 as well.

@danielnelson
Copy link
Contributor

I should clarify this in the README and configuration but query_version = 1 is deprecated and we will remove it in Telegraf 2.0. So, no further changes to it.

@nwneisen
Copy link
Contributor

I wondered but thought I would check

@matthenning
Copy link
Author

Thank you!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/sqlserver feature request Requests for new plugin and for new features to existing plugins
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants