Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add JSON output format for output #684

Closed
wants to merge 1 commit into from

Conversation

ominiet
Copy link

@ominiet ominiet commented Aug 23, 2024

This introduces JSON output format. I based this MR on one that was existing from a few years back under the label, #68

It follows the same pattern, but rather than exposing some extra keys that suited a particular use case, it prints the fields out as they appear in the LintProblem class with the filepath added in to each record.

@coveralls
Copy link

coveralls commented Aug 23, 2024

Coverage Status

coverage: 99.825%. remained the same
when pulling 4e3e9bf on ominiet:master
into 99fb29a on adrienverge:master.

@ominiet
Copy link
Author

ominiet commented Aug 23, 2024

updated commit to satisfy linters

Copy link
Owner

@adrienverge adrienverge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello Oscar, thanks for this contribution!

I see a problem: the format this outputs is not valid JSON:

{"file": "a.yaml", "level": "error", "line": 2, "column": 4, "desc": "trailing spaces", "rule": "trailing-spaces"}
{"file": "a.yaml", "level": "error", "line": 3, "column": 4, "desc": "no new line character at the end of file", "rule": "new-line-at-end-of-file"}

… while it should be:

[
  {"file": "a.yaml", "level": "error", "line": 2, "column": 4, "desc": "trailing spaces", "rule": "trailing-spaces"},
  {"file": "a.yaml", "level": "error", "line": 3, "column": 4, "desc": "no new line character at the end of file", "rule": "new-line-at-end-of-file"}
]

Fortunately a valid format seems achievable easily: you can define and call Format.json_pre_all() and Format.json_post_all() to print [\n and \n]\n at start and end, and use first to know whether to print ,\n (see the other formats, they use first).

yamllint/cli.py Outdated
Comment on lines 93 to 101
line = {
"file": filename,
"level": problem.level,
"line": problem.line,
"column": problem.column,
"desc": problem.desc,
"rule": problem.rule
}
return json.dumps(line)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I propose these changes:

  • Return directly (like in Format.parsable() above).
  • Change desc to description.
  • Use single quotes for consistency with the rest of the code.
Suggested change
line = {
"file": filename,
"level": problem.level,
"line": problem.line,
"column": problem.column,
"desc": problem.desc,
"rule": problem.rule
}
return json.dumps(line)
return json.dumps({
'file': filename,
'level': problem.level,
'line': problem.line,
'column': problem.column,
'description': problem.desc,
'rule': problem.rule,
})

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took this suggestion. Given there are other suggestions, I will squash this commit in with the other suggestions at the end as well.

Copy link
Author

@ominiet ominiet Aug 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

regarding the json list structure. I am a bit confused by the suggestion, given that we would actually want ,\n for every entry in the list except the last entry.

What would your suggestion be for that? Given my limited python knowledge, it may require an update to the for loop to use the built-in enumerate() to maintain a current index and check if the current index is the last to know when we have arrived at the last element.

Something like the below:

size = len(problems)
for index, problem in enumerate(problems):
  # do some formatting for every other element
  if index == (size -1):
    # do some formatting for the last element

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. I have updated the PR to include the suggestions of json_pre_all and json_post_all, but I'm struggling a bit with printing the ,\n for all except the last element given that it uses a generator and not an actual list.

I have tried casting to a list to get a length reference and current index, but received many errors in most test cases as a result. Please provide any guidance you may have.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following up. The most recent revision seems to work as I have built it, but it is probably not the best implementation

tests/test_cli.py Outdated Show resolved Hide resolved
@ominiet ominiet force-pushed the master branch 4 times, most recently from 9e6297a to 6beddbe Compare August 26, 2024 18:48
@adrienverge
Copy link
Owner

Hello Oscar, thanks for following up and squashing.

Sorry to be direct but I lack time:

  • Please use single quotes, I see new ones got in since my last message.
  • I think my proposal was simpler, and avoids using size, enumerate, index, etc.:
    print('[', end='')  # actually more readable that print(Format.json_pre_all(), end='')?
    
    for problem in problems:
        if first:
            first = False
        else:
            print(',', end='')
        print('\n  {"file": "foo.bar"}', end='')
    
    print('\n]')
    [
      {"file": "foo.bar"},
      {"file": "foo.bar"},
      {"file": "foo.bar"}
    ]

Once it's good to go you can squash again so it's ready to merge 👍

@ominiet
Copy link
Author

ominiet commented Aug 27, 2024

No need for apology, Adrien. Your suggestion is far simpler and more readable. It was me who didn't understand it the first time.

I have applied the suggested code and removed any double quotes I may have added that are not related to validating the json format. Thanks for you time reviewing the change request.

@ominiet
Copy link
Author

ominiet commented Aug 27, 2024

It occurred to me that this implementation only works for a single yaml file. When providing a directory and allowing the tool to find multiple files, the opening and closing braces are added once for every file, rather than once around all "problems"

Here is a quick example output when scanning multiple files where only one has an issue:

[
][
][
  {"file": "./my.yaml", "level": "warning", "line": 1, "column": 1, "description": "missing document start \"---\"", "rule": "document-start"},
  {"file": "./my.yaml", "level": "error", "line": 2, "column": 12, "description": "no new line character at the end of file", "rule": "new-line-at-end-of-file"}
]
[
][
][
][
][
][
][
]

I'd expect to see a single json array as output.

I'm going to close this merge request and re-open when the implementation works a bit better

@ominiet ominiet closed this Aug 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants