Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Update queries #3172

Merged
merged 15 commits into from
Aug 7, 2023
Merged

feat: Update queries #3172

merged 15 commits into from
Aug 7, 2023

Conversation

Rexbeast2
Copy link
Contributor

This PR is to be only considered after closing #3147.

Copy link
Contributor

@anthonyharrison anthonyharrison left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to also add some tests to validate that the metrics table has been populated correctly.

@@ -118,5 +118,5 @@ def parse_epss_data(self, file_path=None):
# Parse the data from the remaining rows
for row in reader:
cve_id, epss_score, epss_percentile = row[:3]
parsed_data.append((cve_id, "EPSS", epss_score, epss_percentile))
parsed_data.append((cve_id, 1, epss_score, epss_percentile))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rexbeast2 You should find the value of the EPSS metric by searching for the EPSS string in the metrics table rather than assume it is '1'

cursor.execute(query, [cve.get("CVSS_version")])
# Fetch all the results of the query and use 'map' to extract only the 'metrics_name' from the result
metric = list(map(lambda x: x[0], cursor.fetchall()))
# Since the query is expected to return a single result, extract the first item from the list and store it in 'metric'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably add some debug to check this assumption


for cve in severity_data:
# Check no None values
if not bool(cve.get("score")):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't cve["score"] = cve.get("score","unknown") work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, for this. I tried to keep it as similar as possible to populate_severity.

Copy link
Contributor

@terriko terriko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reminder: because I enabled branch protection I won't be able to merge this until all tests are passing. It looks like you'll need to update this branch to get the last one I had to disable to get main passing again. (Plus it looks like Anthony's got other feedback that needs incorporation.)

@Rexbeast2
Copy link
Contributor Author

Rexbeast2 commented Jul 25, 2023

@terriko As far as I can see, the test cases are failing due to test/test_csv2cve. Not because of any code change that I made. Can you please verify that?
@anthonyharrison if you see any more improvements, please let me know.

@terriko
Copy link
Contributor

terriko commented Jul 26, 2023

@terriko As far as I can see, the test cases are failing due to test/test_csv2cve. Not because of any code change that I made.

test_csv2cve isn't failing on main, and the way it's failing makes it look like it's not finding some data (it looks like the assert is expecting to find more than 60 of something, and instead it's finding 1). If I had to guess, you're accidentally clobbering some data that it needs to return correct results. Probably something in the get_cves() code?

Copy link
Contributor

@terriko terriko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you've got a merge conflict to resolve now that I've merged #3147 . I'll leave that to you, I also put in some inline comments about docstrings that you might as well do at the same time. We haven't been super consistent about docstrings across the project but you've got a couple of places where you've already written the string as a comment so we might as well just format it as a docstring! If you're bored while you're waiting for tests on this to run you could go back and look at your previously added functions for some easy refactoring commits.

@@ -552,6 +554,56 @@ def populate_affected(self, affected_data, cursor, data_source):
except Exception as e:
LOGGER.info(f"Unable to insert data for {data_source} - {e}")

def metric_finder(self, cursor, cve):
# SQL query to retrieve the metrics_name based on the metrics_id
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is helpful; maybe we could turn it into a docstring so it can be parsed and used as such?

Suggested change
# SQL query to retrieve the metrics_name based on the metrics_id
''' SQL query to retrieve the metrics_name based on the metrics_id '''

Short explainer on docstrings in case you're not familiar with them: https://www.programiz.com/python-programming/docstrings

More generally: cvedb is lacking in a lot of docstrings but it would be nice to have them in new functions if you remember. I'll flag a few more new functions in this PR for you. Sorry I didn't think to do it with earlier PRs!

)
return metric

def populate_cve_metrics(self, severity_data, cursor):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good place for a docstring!

@@ -567,6 +619,12 @@ def populate_metrics(self):
self.connection.commit()
self.db_close()

def populate_epss(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another good place for a docstring. I feel like this short function is mostly self-evident to people who know the acronyms but could be difficult for someone who doesn't. So this is a nice opportunity to expand the acronym and have a really short explainer about what it's for. Something like "Add Exploit Prediction Scoring System (EPSS) data to help users evaluate risks"

@@ -99,6 +101,14 @@ async def download_epss_data(self):
except aiohttp.ClientError as e:
self.LOGGER.error(f"An error occurred during downloading epss {e}")

def EPSS_id_finder(self, cursor):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another good place for a docstring.

test/test_source_epss.py Show resolved Hide resolved
@codecov-commenter
Copy link

codecov-commenter commented Aug 4, 2023

Codecov Report

Merging #3172 (7853193) into main (ae66713) will decrease coverage by 0.79%.
Report is 1 commits behind head on main.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main    #3172      +/-   ##
==========================================
- Coverage   81.28%   80.50%   -0.79%     
==========================================
  Files         716      716              
  Lines       11126    11163      +37     
  Branches     1495     1497       +2     
==========================================
- Hits         9044     8987      -57     
- Misses       1693     1776      +83     
- Partials      389      400      +11     
Flag Coverage Δ
longtests 80.50% <100.00%> (+5.11%) ⬆️
win-longtests ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
cve_bin_tool/cve_scanner.py 85.26% <100.00%> (+1.54%) ⬆️
cve_bin_tool/cvedb.py 62.24% <100.00%> (+0.15%) ⬆️
cve_bin_tool/data_sources/epss_source.py 72.15% <100.00%> (+2.28%) ⬆️
cve_bin_tool/util.py 79.66% <100.00%> (+0.17%) ⬆️
test/test_source_epss.py 100.00% <100.00%> (ø)

... and 20 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

Copy link
Contributor

@terriko terriko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some formatting fixes for docstrings. I'm so glad you managed to resolve the test issue!

Comment on lines 458 to 461
"""
EPSS uses metrics table to get the EPSS metric id.
It can't be ran before creation of metrics table.
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one actually doesn't need to be a string, it can be a comment. Docstrings are a special case of a string appearing immediately after a def function definition (at least for functions) but anything else inside the function can probably be a comment.

Suggested change
"""
EPSS uses metrics table to get the EPSS metric id.
It can't be ran before creation of metrics table.
"""
# EPSS uses metrics table to get the EPSS metric id.
# It can't be ran before creation of metrics table.

Comment on lines 537 to 539
"""Adds data into CVE metrics table"""

def populate_cve_metrics(self, severity_data, cursor):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Adds data into CVE metrics table"""
def populate_cve_metrics(self, severity_data, cursor):
def populate_cve_metrics(self, severity_data, cursor):
""" Adds data into CVE metrics table """

Docstring goes after def here

Comment on lines 592 to 594
"""Adding data to metric table."""

def populate_metrics(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Adding data to metric table."""
def populate_metrics(self):
def populate_metrics(self):
""" Adding data to metric table. """

Same deal. Beware, I'm doing suggestions from the web interface so my indentation may be incorrect if you try to just merge these in.

Comment on lines 609 to 611
"""Add EPSS data into the database"""

def populate_epss(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Add EPSS data into the database"""
def populate_epss(self):
def populate_epss(self):
""" Add EPSS data into the database """

Same thing

Comment on lines 105 to 107
"""Search for metric id in EPSS table"""

def EPSS_id_finder(self, cursor):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Search for metric id in EPSS table"""
def EPSS_id_finder(self, cursor):
def EPSS_id_finder(self, cursor):
"""Search for metric id in EPSS table"""

Comment on lines 115 to 117
"""Parse epss data from the file path given and return the parse data"""

def parse_epss_data(self, file_path=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Parse epss data from the file path given and return the parse data"""
def parse_epss_data(self, file_path=None):
def parse_epss_data(self, file_path=None):
""" Parse epss data from the file path given and return the parse data """

Copy link
Contributor

@terriko terriko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, looks like those docstrings are handled. Let's get this merged so you're set for the next piece. Thank you!

@terriko terriko merged commit 06b55f7 into intel:main Aug 7, 2023
21 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants