Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updating SARIF2004 #1995

Merged
merged 12 commits into from
Jul 20, 2020
Merged
Show file tree
Hide file tree
Changes from 8 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
24 changes: 24 additions & 0 deletions src/Sarif.Multitool/Extensions.cs
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
using System;

namespace Microsoft.CodeAnalysis.Sarif.Multitool
{
public static class Extensions
{
public static bool RefersToDriver(this ToolComponentReference toolComponent, string driverGuid)
{
if (toolComponent.Index == -1)
{
if (toolComponent.Guid == null)
{
return true;
}
else
{
return toolComponent.Guid.Equals(driverGuid, StringComparison.OrdinalIgnoreCase);
}
}

return false;
}
}
}
27 changes: 27 additions & 0 deletions src/Sarif.Multitool/Rules/RuleResources.Designer.cs

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

17 changes: 16 additions & 1 deletion src/Sarif.Multitool/Rules/RuleResources.resx
Original file line number Diff line number Diff line change
Expand Up @@ -272,7 +272,13 @@ Many tools follow a conventional format for the 'reportingDescriptor.id' propert

In several parts of a SARIF log file, a subset of information about an object appears in one place, and the full information describing all such objects appears in an array elsewhere in the log file. For example, each 'result' object has a 'ruleId' property that identifies the rule that was violated. Elsewhere in the log file, the array 'run.tool.driver.rules' contains additional information about the rules. But if the elements of the 'rules' array contained no information about the rules beyond their ids, then there might be no reason to include the 'rules' array at all, and the log file could be made smaller simply by omitting it. In some scenarios (for example, when assessing compliance with policy), the 'rules' array might be used to record the full set of rules that were evaluated. In such a scenario, the 'rules' array should be retained even if it contains only id information.

Similarly, most 'result' objects contain at least one 'artifactLocation' object. Elsewhere in the log file, the array 'run.artifacts' contains additional information about the artifacts that were analyzed. But if the elements of the 'artifacts' array contained not information about the artifacts beyond their locations, then there might be no reason to include the 'artifacts' array at all, and again the log file could be made smaller by omitting it. In some scenarios (for example, when assessing compliance with policy), the 'artifacts' array might be used to record the full set of artifacts that were analyzed. In such a scenario, the 'artifacts' array should be retained even if it contains only location information.</value>
Similarly, most 'result' objects contain at least one 'artifactLocation' object. Elsewhere in the log file, the array 'run.artifacts' contains additional information about the artifacts that were analyzed. But if the elements of the 'artifacts' array contained not information about the artifacts beyond their locations, then there might be no reason to include the 'artifacts' array at all, and again the log file could be made smaller by omitting it. In some scenarios (for example, when assessing compliance with policy), the 'artifacts' array might be used to record the full set of artifacts that were analyzed. In such a scenario, the 'artifacts' array should be retained even if it contains only location information.

In addition to the avoiding unnecessary arrays, there are other ways to optimize the size of SARIF log files.

Prefer the result object properties 'ruleId' and 'ruleIndex' to the nested object-valued property 'result.rule', unless the rule comes from a tool component other than the driver (in which case only 'result.rule' can accurately point to the metadata for the rule). The 'ruleId' and 'ruleIndex' properties are shorter and just as clear.

Do not specify the result object's 'analysisTarget' property unless it differs from the result location. The canonical scenario for using 'result.analysisTarget' is a C/C++ language analyzer that is instructed to analyze example.c, and detects a result in the included file example.h. In this case, 'analysisTarget' is example.c, and the result location is in example.h.</value>
</data>
<data name="SARIF2004_OptimizeFileSize_Warning_EliminateLocationOnlyArtifacts_Text" xml:space="preserve">
<value>{0}: The 'artifacts' array contains no information beyond the locations of the artifacts. Removing this array might reduce the log file size without losing information. In some scenarios (for example, when assessing compliance with policy), the 'artifacts' array might be used to record the full set of artifacts that were analyzed. In such a scenario, the 'artifacts' array should be retained even if it contains only location information.</value>
Expand Down Expand Up @@ -360,4 +366,13 @@ This is part of a set of authoring practices that make your rule messages more r
<data name="SARIF2007_ExpressPathsRelativeToRepoRoot_Warning_ExpressResultLocationsRelativeToMappedTo_Text" xml:space="preserve">
<value>{0}: This result location does not provide any of the 'uriBaseId' values that specify repository locations: '{1}'. As a result, it will not be possible to determine the location of the file containing this result relative to the root of the repository that contains it.</value>
</data>
<data name="SARIF2004_OptimizeFileSize_Warning_AvoidDuplicativeAnalysisTarget_Text" xml:space="preserve">
<value>{0}: The 'analysisTarget' property '{1}' is unnecessary because it is the same as the result location. Remove the 'analysisTarget' property.</value>
Copy link

@ghost ghost Jul 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{0}: [](start = 11, length = 5)

"In the result at '{0}', the 'analysisTarget' property '{1}' is unnecessary..." #Closed

</data>
<data name="SARIF2004_OptimizeFileSize_Warning_AvoidDuplicativeResultRuleInformation_Text" xml:space="preserve">
<value>{0}: This result specifies both 'result.ruleId' and 'result.rule'. Prefer 'result.ruleId' because it is shorter and just as clear.</value>
</data>
<data name="SARIF2004_OptimizeFileSize_Warning_PreferRuleId_Text" xml:space="preserve">
<value>{0}: This result uses the 'rule' property to specify the rule metadata, but the 'ruleId' property suffices because the rule is defined by 'tool.driver'. Prefer 'result.ruleId' because it is shorter and just as clear.</value>
</data>
</root>
89 changes: 88 additions & 1 deletion src/Sarif.Multitool/Rules/SARIF2004.OptimizeFileSize.cs
Original file line number Diff line number Diff line change
Expand Up @@ -41,20 +41,107 @@ public class OptimizeFileSize : SarifValidationSkimmerBase
/// some scenarios (for example, when assessing compliance with policy), the 'artifacts' array
/// might be used to record the full set of artifacts that were analyzed. In such a scenario,
/// the 'artifacts' array should be retained even if it contains only location information.
///
/// In addition to the avoiding unnecessary arrays, there are other ways to optimize the
/// size of SARIF log files.
///
/// Prefer the result object properties 'ruleId' and 'ruleIndex' to the nested object-valued
/// property 'result.rule', unless the rule comes from a tool component other than the driver
/// (in which case only 'result.rule' can accurately point to the metadata for the rule).
/// The 'ruleId' and 'ruleIndex' properties are shorter and just as clear.
///
/// Do not specify the result object's 'analysisTarget' property unless it differs from the
/// result location. The canonical scenario for using 'result.analysisTarget' is a C/C++ language
/// analyzer that is instructed to analyze example.c, and detects a result in the included file
/// example.h. In this case, 'analysisTarget' is example.c, and the result location is in example.h.
/// </summary>
public override MultiformatMessageString FullDescription => new MultiformatMessageString { Text = RuleResources.SARIF2004_OptimizeFileSize_FullDescription_Text };

protected override IEnumerable<string> MessageResourceNames => new string[] {
nameof(RuleResources.SARIF2004_OptimizeFileSize_Warning_AvoidDuplicativeAnalysisTarget_Text),
nameof(RuleResources.SARIF2004_OptimizeFileSize_Warning_AvoidDuplicativeResultRuleInformation_Text),
nameof(RuleResources.SARIF2004_OptimizeFileSize_Warning_EliminateLocationOnlyArtifacts_Text),
nameof(RuleResources.SARIF2004_OptimizeFileSize_Warning_EliminateIdOnlyRules_Text)
nameof(RuleResources.SARIF2004_OptimizeFileSize_Warning_EliminateIdOnlyRules_Text),
nameof(RuleResources.SARIF2004_OptimizeFileSize_Warning_PreferRuleId_Text)
};

public override FailureLevel DefaultLevel => FailureLevel.Warning;

private string driverGuid;

protected override void Analyze(Run run, string runPointer)
{
AnalyzeLocationOnlyArtifacts(run, runPointer);
AnalyzeIdOnlyRules(run, runPointer);

this.driverGuid = run.Tool.Driver?.Guid;
}

protected override void Analyze(Result result, string resultPointer)
{
ReportUnnecessaryAnalysisTarget(result, resultPointer);
ReportRuleDuplication(result, resultPointer);
}

private void ReportRuleDuplication(Result result, string resultPointer)
{
if (result.Rule != null)
{
if (result.Rule.ToolComponent != null)
{
if (result.Rule.ToolComponent.RefersToDriver(this.driverGuid))
Copy link

@ghost ghost Jul 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can shorten this:

if (result.Rule.ToolComponent?.RefersToDriver(this.driverGuid) == true)

Note that you need the explicit comparison == true because the null-coalescing operator ?. causes the left-hand side to have the type Nullable<bool> -- and you can't do an implicit logical test on a Nullable<bool>. #Closed

{
// {0}: This result uses the 'rule' property to specify the rule
// metadata, but the 'ruleId' property suffices because the rule
// is defined by 'tool.driver'. Prefer 'result.ruleId' because it
// is shorter and just as clear.
Copy link

@ghost ghost Jul 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New message:

The result at '{0}' uses the 'rule' property to specify the violated rule, but this is not necessary because the rule is defined by 'tool.driver'. Use the 'ruleId' and 'ruleIndex' instead, because they are shorter and just as clear. #Closed

LogResult(
Copy link

@ghost ghost Jul 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LogResult [](start = 24, length = 9)

This logic is backwards. Take a look at the notes I attached to the chat. The idea is:

If toolComponent is present, and if it does not refer to the driver -- which means that it has to be present! -- then result.ruleId and result.ruleIndex should not also be present.

Here's what the notes said:

  if (result.Rule.ToolComponent != null && !result.Rule.ToolComponent.RefersToDriver(this.currentDriverGuid))
  {
    // We HAD to use result.Rule (because otherwise we couldn't have specified the extension component).
    if either result.ruleId or result.ruleIndex is present
        // Complain, because we should have put them in one place: under result.Rule along with the toolComponent.
        "This result provides the 'rule' property, but it also provides either or both of 'ruleId' and 'ruleIndex'. If 'rule' is present, then use 'rule.id' instead of 'ruleId', and use 'rule.index' instead of 'ruleIndex'."
  }
``` #Closed

resultPointer,
nameof(RuleResources.SARIF2004_OptimizeFileSize_Warning_PreferRuleId_Text));
}
else
{
if (!string.IsNullOrWhiteSpace(result.RuleId) || result.RuleIndex >= 0)
{
// {0}: This result specifies both 'result.ruleId' and 'result.rule'.
// Prefer 'result.ruleId' because it is shorter and just as clear.
Copy link

@ghost ghost Jul 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will provide new strings. #Closed

Copy link

@ghost ghost Jul 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New message:

The result at '{0}' uses the 'rule' property to specify the violated rule, so it is not necessary also to specify 'ruleId' or 'ruleIndex'. Remove the 'ruleId' and 'ruleIndex' properties.


In reply to: 455809589 [](ancestors = 455809589)

LogResult(
resultPointer,
nameof(RuleResources.SARIF2004_OptimizeFileSize_Warning_AvoidDuplicativeResultRuleInformation_Text));
}
}
}
else
{
// {0}: This result uses the 'rule' property to specify the rule
// metadata, but the 'ruleId' property suffices because the rule
// is defined by 'tool.driver'. Prefer 'result.ruleId' because it
// is shorter and just as clear.
Copy link

@ghost ghost Jul 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same message as on Line 94 above ("The result at '{0}' uses the 'rule' property to specify the violated rule, but this is not necessary because ..."). #Closed

LogResult(
resultPointer,
nameof(RuleResources.SARIF2004_OptimizeFileSize_Warning_PreferRuleId_Text));
}
}
}

private void ReportUnnecessaryAnalysisTarget(Result result, string resultPointer)
{
if (result.Locations != null && result.AnalysisTarget != null)
{
foreach (Location location in result.Locations)
{
if (location?.PhysicalLocation?.ArtifactLocation?.Uri == result.AnalysisTarget.Uri)
{
// {0}: The 'analysisTarget' property '{1}' is unnecessary because it is the same as
// the result location. Remove the 'analysisTarget' property.
Copy link

@ghost ghost Jul 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the [](start = 27, length = 3)

The 'analysisTarget' property '{1}' at '{0}' is unnecessary because it is the same as the result location. Remove the 'analysisTarget' property. #Closed

LogResult(
resultPointer.AtProperty(SarifPropertyName.AnalysisTarget),
nameof(RuleResources.SARIF2004_OptimizeFileSize_Warning_AvoidDuplicativeAnalysisTarget_Text),
result.AnalysisTarget.Uri.OriginalString);
Copy link

@ghost ghost Jul 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

result [](start = 28, length = 6)

I would break after logging the first one. If you discover that "analysisTarget isn't necessary because it's the same as location 0", there's no need to also report "analysisTarget isn't necessary because it's the same as location 1". #Closed

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was marked Resolved but it wasn't actually fixed. I think you tried to fix it with the return statements above, but that's not what I meant.


In reply to: 455252796 [](ancestors = 455252796)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i see! i though it was in another place.


In reply to: 455344867 [](ancestors = 455344867,455252796)

break;
}
}
}
}

private void AnalyzeLocationOnlyArtifacts(Run run, string runPointer)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -121,15 +121,18 @@ public void SARIF1009_IndexPropertiesMustBeConsistentWithArrays_Valid()

[Fact]
public void SARIF1009_IndexPropertiesMustBeConsistentWithArrays_Invalid()
=> RunTest(MakeInvalidTestFileName(RuleId.IndexPropertiesMustBeConsistentWithArrays, nameof(RuleId.IndexPropertiesMustBeConsistentWithArrays)));
=> RunTest(MakeInvalidTestFileName(RuleId.IndexPropertiesMustBeConsistentWithArrays, nameof(RuleId.IndexPropertiesMustBeConsistentWithArrays)),
parameter: new TestParameters(configFileName: "disable2004.configuration.xml"));

[Fact]
public void SARIF1010_RuleIdMustBeConsistent_Valid()
=> RunTest(MakeValidTestFileName(RuleId.RuleIdMustBeConsistent, nameof(RuleId.RuleIdMustBeConsistent)));
=> RunTest(MakeValidTestFileName(RuleId.RuleIdMustBeConsistent, nameof(RuleId.RuleIdMustBeConsistent)),
parameter: new TestParameters(configFileName: "disable2004.configuration.xml"));

[Fact]
public void SARIF1010_RuleIdMustBeConsistent_Invalid()
=> RunTest(MakeInvalidTestFileName(RuleId.RuleIdMustBeConsistent, nameof(RuleId.RuleIdMustBeConsistent)));
=> RunTest(MakeInvalidTestFileName(RuleId.RuleIdMustBeConsistent, nameof(RuleId.RuleIdMustBeConsistent)),
parameter: new TestParameters(configFileName: "disable2004.configuration.xml"));

[Fact]
public void SARIF1011_ReferenceFinalSchema_Valid()
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -355,6 +355,9 @@
<None Update="default.configuration.xml">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</None>
<None Update="disable2004.configuration.xml">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</None>
<None Update="disable2011.configuration.xml">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</None>
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -365,7 +365,7 @@
"arguments": [
"runs[0].results[0].analysisTarget",
"%SRCROOT%",
"file:///C:/src/file.c"
"file:///C:/src/file.h"
]
},
"locations": [
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,14 @@
"id": "WRN999.RuleExplicitlyDisabled"
}
},
{
"message": {
"text": "Rule 'SARIF2004' was explicitly disabled by the user. As result, this tool run cannot be used for compliance or other auditing processes that require a comprehensive analysis."
},
"descriptor": {
"id": "WRN999.RuleExplicitlyDisabled"
}
},
{
"message": {
"text": "Rule 'SARIF2006' was explicitly disabled by the user. As result, this tool run cannot be used for compliance or other auditing processes that require a comprehensive analysis."
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,14 @@
"id": "WRN999.RuleExplicitlyDisabled"
}
},
{
"message": {
"text": "Rule 'SARIF2004' was explicitly disabled by the user. As result, this tool run cannot be used for compliance or other auditing processes that require a comprehensive analysis."
},
"descriptor": {
"id": "WRN999.RuleExplicitlyDisabled"
}
},
{
"message": {
"text": "Rule 'SARIF2006' was explicitly disabled by the user. As result, this tool run cannot be used for compliance or other auditing processes that require a comprehensive analysis."
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,14 @@
"id": "WRN999.RuleExplicitlyDisabled"
}
},
{
"message": {
"text": "Rule 'SARIF2004' was explicitly disabled by the user. As result, this tool run cannot be used for compliance or other auditing processes that require a comprehensive analysis."
},
"descriptor": {
"id": "WRN999.RuleExplicitlyDisabled"
}
},
{
"message": {
"text": "Rule 'SARIF2006' was explicitly disabled by the user. As result, this tool run cannot be used for compliance or other auditing processes that require a comprehensive analysis."
Expand Down
Loading