How to deal with test data that might trigger antivirus engines #215

Open
samiraguiar opened this issue Nov 8, 2017 · 12 comments · Fixed by #217

Comments

@samiraguiar
Contributor

samiraguiar commented Nov 8, 2017

Following up on the question first raised in #201, I'd like to discuss it further.

I checked my sample RTF against VirusTotal and, although it is harmless, it triggers around 8 engines (due to heuristic checks). I tried changing it a bit, but the result was the same. I also believe we will eventually face similar situations, since we will need to simulate real malware in order to create better unit tests.

I've come up with three possible solutions (after talking to people from Intra2net):

  • Encrypt or base64-encode the test data and decrypt/decode when running each test. Some utils.py file in the test folder would help here.
  • Move the code to a secondary repository which contains only the unit tests and reference it as a submodule so Travis can clone it when checking PRs and commits.
  • Each test creates its own test data before running, but this might get complicated and hard to maintain when dealing with complex cases.
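
To make the first option concrete, here is a minimal sketch of what such a helper might look like. The function names `encode_sample` and `decode_sample` are hypothetical and not part of oletools; the sketch only shows the round trip, not where the encoded data would be stored.

```python
import base64


def encode_sample(raw: bytes) -> str:
    """Return the sample as ASCII text so the raw bytes never sit in the repo."""
    return base64.b64encode(raw).decode("ascii")


def decode_sample(encoded: str) -> bytes:
    """Recover the original sample bytes right before a test uses them."""
    return base64.b64decode(encoded)
```

A test would call `decode_sample()` on the stored text and pass the resulting bytes to the code under test.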

Any other suggestions?

@decalage2
Owner

I think the best approach is to zip the test files, encrypted with a known password such as "infected". It is then easy to decrypt them in memory from the test scripts with Python's zipfile module. This is better than Base64 encoding or similar, which antivirus engines may decode automatically. And if some AV engines also try the "infected" password automatically, we can use a different one such as "infected-test".
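
A minimal sketch of the in-memory decryption, assuming a hypothetical helper name `read_zipped_sample`. Note that the standard library's zipfile module can read archives protected with the legacy ZIP encryption but cannot create encrypted archives, so the zip itself would have to be produced with an external tool.

```python
import zipfile


def read_zipped_sample(zip_source, member, password=b"infected"):
    """Return the bytes of `member` without writing anything to disk.

    `zip_source` may be a path or a binary file-like object.
    The password must be given as bytes, not str.
    """
    with zipfile.ZipFile(zip_source) as archive:
        return archive.read(member, pwd=password)
```

The returned bytes can be wrapped in `io.BytesIO` if the code under test expects a file-like object.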

@onefuncman

I think the submodule approach is best, or another method to prevent a basic pip install from triggering this issue.

@bacar

bacar commented Oct 16, 2019

At a minimum, is it possible to avoid distributing the test data with pip install oletools? (i.e. can the issue of "how to distribute the test data without triggering scanners" be separated from the idea of "can we avoid distributing the test data entirely for those who don't need it"?)

I nearly caused an "incident" at work due to triggering virus scanners - I was merely after playing with olevba for its VBA manipulation ability...

@decalage2
Owner

Now that PR #217 is merged, we need to check which test files trigger antivirus detection, zip them with the correct password 'infected-test', and change the corresponding test scripts.

@decalage2 decalage2 reopened this Oct 18, 2019
@decalage2
Owner

decalage2 commented Oct 18, 2019

At least the following files are detected by Windows Defender (as Exploit:O97M/DDEDownloader!rfn):

  • tests\test-data\msodde\dde-in-word2003.xml
  • tests\test-data\msodde\dde-test-from-office2016.doc
  • tests\test-data\ooxml\dde-in-word2003.xml

This one as Exploit:O97M/DDEDownloader.C:

  • tests\test-data\msodde\dde-in-word2007.xml

decalage2 added a commit that referenced this issue Nov 29, 2019
…alerts (temporary workaround for #398), corresponding test files are now zipped with password 'infected-test' (for #215)
@dominik-chilla-dv

Yesterday, we wanted to start using oletools. Unfortunately, we also ran into AV incidents while building Docker images in a Jenkins pipeline. Our web proxy denied access to the pip repo with the following content scanner claim:

oletools-0.55.1\tests\test-data\msodde\dde-test.docx 
--> word/document.xml <<< Contains HEUR/Downloader.DDE suspicious code

After that, I downloaded the zip package on a private system and uploaded it to virustotal.com, with dramatic results:

https://www.virustotal.com/gui/file/edea57914c4040e7d0d64cfd88c84355d4305548d761d476fbac21ee26b25d8d/detection

We would very much appreciate it if you could fix this ASAP. As long as this situation persists, we won't be able to use oletools in our environment. Encrypting the tests inside the zip package or moving them out into a submodule both seem to be good compromises.

Thanks in advance!

@decalage2
Owner

As has happened several times in the past, users are reporting errors and antivirus alerts when installing oletools. This is because some test files are incorrectly detected as malicious by some antivirus engines.
For example, the test file "dde-test-encrypt-standardpassword.xls" is now detected by Comodo AV.
Potential solutions:

@decalage2 decalage2 modified the milestones: oletools 0.55, oletools 0.56 Oct 5, 2020
@dominik-chilla-dv

Nice, thanks for continuing working on it ;)

@christian-intra2net
Contributor

First of all: I am so sorry to be partially responsible for these problems. I created most of these "malicious" files and added them, not knowing that they would be distributed via pip. My understanding was that test-files do not get distributed.

My thoughts on this:

  • Reverting stuff that has been in the package for a while tends to get nastier than expected
  • We could adapt the build process to not release source-packages but release-packages with pip, which would then not include test data (thanks a lot @samiraguiar for finding that out)
  • Why would we have to do in-memory scanning when we encrypt the files in some sort of container (like zip)? Could we not unzip to tempfile.gettempdir() and scan the files there?
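
The temporary-file variant from the last point could be sketched as below. The context manager name `extracted_sample` is hypothetical, and the default password is simply the one mentioned earlier in this thread.

```python
import contextlib
import tempfile
import zipfile


@contextlib.contextmanager
def extracted_sample(zip_path, member, password=b"infected-test"):
    """Extract `member` into a fresh directory under the system temp dir.

    Yields the path of the extracted file; the directory and its
    contents are removed again when the `with` block ends.
    """
    # TemporaryDirectory() creates its directory below tempfile.gettempdir()
    with tempfile.TemporaryDirectory() as tmp_dir:
        with zipfile.ZipFile(zip_path) as archive:
            extracted = archive.extract(member, path=tmp_dir, pwd=password)
        yield extracted
```

A test would wrap its scan in `with extracted_sample(...) as path:` so the decrypted file only exists on disk for the duration of that test.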

@decalage2
Owner

I agree that if the packages released on PyPI did not include the test files, it would be the simplest solution. Indeed, it looks like one way to do it is to generate wheel distributions instead of source distributions (same for olefile: decalage2/olefile#140). So I will change that in the next release.
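
Whether a wheel actually leaves out the test data depends on the packaging configuration, so it may be worth checking a built archive before uploading it. A minimal sketch, assuming a hypothetical helper `bundled_test_files` and assuming the test data lives under a `tests/` directory as in the paths reported above:

```python
import zipfile


def bundled_test_files(wheel_source, marker="tests/"):
    """List archive members that live under a `tests/` directory.

    A wheel is a zip archive, so zipfile can inspect it directly.
    `wheel_source` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(wheel_source) as wheel:
        return [
            name
            for name in wheel.namelist()
            if name.startswith(marker) or "/" + marker in name
        ]
```

An empty list would indicate that no file under a `tests/` directory ended up in the archive.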

As for in-memory scanning, I think it's better than using temporary files on disk, because temporary files often trigger antivirus engines when using oletools on Windows. I would like to be able to scan a file stored in a zip with password "infected" without creating a temporary file on disk. Some time ago I started to develop a module to handle files on disk or in memory transparently. I will release it to make this easier in oletools.
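
The module mentioned here is not shown in this thread. As a rough illustration of the idea only, a helper that treats paths and in-memory data alike might look like this (the name `open_binary` is hypothetical):

```python
import io
import os


def open_binary(source):
    """Return a readable binary file object for a path or for raw bytes.

    bytes/bytearray are treated as file *content* and wrapped in memory;
    str and os.PathLike values are treated as paths and opened from disk.
    """
    if isinstance(source, (bytes, bytearray)):
        return io.BytesIO(bytes(source))
    if isinstance(source, (str, os.PathLike)):
        return open(source, "rb")
    raise TypeError("expected a path or bytes, got %r" % type(source))
```

Both return values support `read()`, `seek()` and the `with` statement, so calling code does not need to know where the data came from.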

@christian-intra2net
Contributor

Just stumbled over quite a few unittests that are still disabled and useless because of this issue. Any plans on this?

christian-intra2net added a commit to christian-intra2net/oletools that referenced this issue Nov 25, 2022
Some samples triggered antivirus engines, issues decalage2#215 and decalage2#217 ended with
the agreement to encapsulate problematic samples in encrypted zip
containers and decrypt them on-the-fly. Initial support for this was added
but that did not cover 5 tests. Create on-the-fly decryption for these
tests as well and re-enable them.
decalage2 added a commit that referenced this issue Nov 27, 2022
…sue215

tests: Re-enable samples skipped because of #215
@luxaritas

luxaritas commented Mar 26, 2024

Apologies for the necro, but I wanted to note that I'm seeing a number of other test files triggering antivirus (in my case, AWS GuardDuty scans were flagging the pip cache for oletools). In particular, running a number of the xls* files through VirusTotal shows flags, e.g. autostart-encrypt-standardpassword.xls, encrypted.xls, and excel4_sample_macro.xls.
