Problem parsing timestamps in CSV parser #5824

Closed
graemeyeo opened this issue May 9, 2019 · 0 comments · Fixed by #5826

graemeyeo commented May 9, 2019

Relevant telegraf.conf:

[[inputs.file]]
  files = ["data2s.csv"]
  data_format = "csv"
  csv_header_row_count = 0
  csv_column_names = ["pid","time","msg"]
  csv_timestamp_column = "time"
  csv_timestamp_format = "unix"

System info:

Telegraf 1.10.3
Ubuntu 16.04 LTS

Steps to reproduce:

The CSV contains Unix timestamps whose fractional seconds have a large number of decimal places.

Input csv file:

2108,1551129661.95456123352050781250,0105000008000000cb2e392df52c9c2d
2108,1551129662.14315605163574218750,0405000008000000527bec7ab47a20bf
2108,1551129662.14344882965087890625,02050000080000006a2f312ed62ca02d

Expected behavior:

Timestamps are parsed as Unix epoch seconds with a fractional part (seconds.fractional_seconds), down to nanosecond precision.

Actual behavior:

The timestamps were parsed without any error message, but were incorrect once imported into the database.

Additional info:

It turned out that the ParseTimestampWithLocation function in internal/internal.go was receiving a timestamp string with an exponent, e.g. 1.5515752371385763e+09. This confused the part of the function that splits the string into timeInt and timeFractional.
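
To illustrate why an exponent string can slip through without an error, here is a minimal sketch. It assumes a simplified splitter, `splitUnix`, which is a hypothetical stand-in and not the actual code in internal/internal.go: everything before the first "." is taken as whole seconds and the first nine digits after it as nanoseconds.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// splitUnix is a hypothetical, simplified splitter (not Telegraf's code).
// It treats the text before the first "." as whole seconds and the first
// nine characters after it as nanoseconds.
func splitUnix(s string) (time.Time, error) {
	whole, frac := s, ""
	if i := strings.Index(s, "."); i >= 0 {
		whole, frac = s[:i], s[i+1:]
	}
	sec, err := strconv.ParseInt(whole, 10, 64)
	if err != nil {
		return time.Time{}, err
	}
	// Truncate or zero-pad the fraction to exactly nine digits.
	if len(frac) > 9 {
		frac = frac[:9]
	}
	frac += strings.Repeat("0", 9-len(frac))
	nsec, err := strconv.ParseInt(frac, 10, 64)
	if err != nil {
		return time.Time{}, err
	}
	return time.Unix(sec, nsec).UTC(), nil
}

func main() {
	// Plain decimal form of the example value: parsed as intended.
	good, _ := splitUnix("1551575237.1385763")
	// Exponent form: "1" becomes the seconds and the leading mantissa
	// digits become the nanoseconds, with no error reported.
	bad, err := splitUnix("1.5515752371385763e+09")
	fmt.Println(good.Year(), bad.Year(), err) // 2019 1970 <nil>
}
```

With this kind of splitter the exponent form yields a time one second after the epoch rather than an error, which matches the "no error, wrong value" symptom described above.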

Fix:

The fix was to change one line in plugins/parsers/csv/parser.go inside the function parseTimestamp. I changed the line:
tStr := fmt.Sprintf("%v", recordFields[timestampColumn])
to
tStr := fmt.Sprintf("%.9f", recordFields[timestampColumn])

This retains up to nanosecond precision and prevents Sprintf from emitting an exponent.
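
A small sketch of the difference between the two verbs, using the first timestamp from the sample CSV as a float64. The helper names `formatBoth` and `hasExponent` are made up for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// formatBoth returns the same value formatted with %v and with %.9f.
func formatBoth(field interface{}) (string, string) {
	return fmt.Sprintf("%v", field), fmt.Sprintf("%.9f", field)
}

// hasExponent reports whether s uses exponent notation.
func hasExponent(s string) bool {
	return strings.ContainsAny(s, "eE")
}

func main() {
	// First timestamp from the sample CSV, stored as a float64.
	var field interface{} = 1551129661.95456123352050781250
	v, f := formatBoth(field)
	fmt.Println(v, hasExponent(v)) // %v on a float64 behaves like %g: exponent form
	fmt.Println(f, hasExponent(f)) // %.9f: fixed-point with nine decimals
}
```

For a float64, %v is equivalent to %g, which switches to exponent notation for values of this magnitude; %.9f always prints fixed-point digits.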

EDIT:
In my haste, I forgot to test this with integer timestamps. Of course, forcing Sprintf to format the value as a float causes integer timestamps to be formatted incorrectly. The final solution is slightly longer:

import "reflect"

...

var tStr string
k := reflect.ValueOf(recordFields[timestampColumn]).Kind()
if k == reflect.Float32 || k == reflect.Float64 {
    tStr = fmt.Sprintf("%.9f", recordFields[timestampColumn])
} else {
    tStr = fmt.Sprintf("%d", recordFields[timestampColumn])
}
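
The same idea can also be sketched with a type switch instead of reflect. This is only an illustration and not necessarily what was merged: `formatTimestampField` is a hypothetical helper, and the default branch, which passes any other type (for example a string field) through %v, is an assumption on my part.

```go
package main

import "fmt"

// formatTimestampField is a hypothetical helper, not Telegraf's code.
// Floats are printed in fixed-point form with nanosecond precision,
// integers as plain decimals, and anything else is passed through %v
// (an assumption made for this sketch).
func formatTimestampField(v interface{}) string {
	switch t := v.(type) {
	case float32, float64:
		return fmt.Sprintf("%.9f", t)
	case int, int8, int16, int32, int64,
		uint, uint8, uint16, uint32, uint64:
		return fmt.Sprintf("%d", t)
	default:
		return fmt.Sprintf("%v", t)
	}
}

func main() {
	var f interface{} = 1551129661.5
	var n interface{} = int64(1551129661)

	fmt.Println(formatTimestampField(f)) // 1551129661.500000000
	fmt.Println(formatTimestampField(n)) // 1551129661

	// Why the one-line %.9f fix breaks integers: fmt reports a bad verb.
	fmt.Println(fmt.Sprintf("%.9f", n)) // %!f(int64=1551129661)
}
```

The last line shows the failure mode mentioned in the EDIT: applying %.9f to an integer does not produce a number at all, only fmt's bad-verb marker.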

danielnelson self-assigned this May 9, 2019
danielnelson added the "bug" (unexpected problem or unintended behavior) label May 9, 2019
danielnelson added this to the 1.10.4 milestone May 9, 2019