-
Notifications
You must be signed in to change notification settings - Fork 456
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
remove string.c from the package. #655
Conversation
Codecov Report
@@ Coverage Diff @@
## master #655 +/- ##
==========================================
+ Coverage 78.50% 78.64% +0.14%
==========================================
Files 63 63
Lines 2549 2538 -11
Branches 498 500 +2
==========================================
- Hits 2001 1996 -5
+ Misses 406 402 -4
+ Partials 142 140 -2
Continue to review full report at Codecov.
|
Surprisingly, new string parser is as fast as string command during first run and in subsequent run new python parser is faster than strings command. test_binutils
strings command: 0.0014863650000052075
string.parse: 0.0014164240000127393
#########
test_curl
strings command: 0.0018305610000197703
string.parse: 0.0010337909999407202
#########
test_kerberos
strings command: 0.0015554930000689637
string.parse: 0.0011313310000105048
#########
Here is the script I used for benchmarking def _parse_test(self, filename):
"""Helper function to parse a binary file and check whether
the given string is in the parsed result"""
self.strings.filename = os.path.join(BINARIES_PATH, filename)
with tempfile.TemporaryFile() as f:
t1 = perf_counter()
subprocess.call(["strings", self.strings.filename], stdout=f)
binutils_strings = f.readlines()
t2 = perf_counter()
print(f"strings command: {t2 - t1}")
t1 = perf_counter()
ours = self.strings.parse().split("\n")
t2 = perf_counter()
print(f"string.parse: {t2 - t1}")
for theirs in binutils_strings:
self.assertIn(theirs.decode("utf-8"), ours) Should we remove subprocess? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dist/cve_bin_tool-1.0-py3.7.egg
should be removed from this.
Looking good!
subprocess is slow because of the overhead of system call. current python solution is very fast due to various improvement like set to lookup printable character, use of join and list instead of string concatenation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dist/cve_bin_tool-1.0-py3.7.egg
is still here. Also let's make sure we have performance numbers for the final implementation before we hit merge on this.
python implementation is taking half the time of subprocess string subprocess string: 0.004477346000385296
python string: 0.002734930999849894
subprocess string: 0.005265122999844607
python string: 0.002693021000141016 |
Okay looks good. It is faster than For example here are the numbers I saw with 10,000 runs.
Let's back out the benchmarking code and then we'll merge this. Thanks! |
Thanks! Nice work! |
Removed string.c from the package and also removed usage of it from setup script and string module.