Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support a pass through mode for modification time of DataEntries in ZipWriter #121

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

netomi
Copy link

@netomi netomi commented Nov 8, 2023

Right now, the ZipWriter by default will set the modification time of any written DataEntry to the specified modificationTime that has been passed as constructor argument.

However for certain use-cases it would be desirable to retain the original modification time of DataEntries that are being processed.

This PR adds support for retaining the modification time if it is available. The behavior can be controlled by setting the modification time of a ZipWriter to -1, which will instruct the ZipWriter to use the modification time of each DataEntry if available, otherwise resort to Instant.now() which is a reasonable default imho.

The PR also adds a JarSigner example to showcase how proguard-core can be used as a replacement for the jarsigner tool.

In a subsequent PR I will also add support for timestamping in the SignedJarWriter.

Please let me know if a change like that would be of interest for the library.

@netomi
Copy link
Author

netomi commented Nov 22, 2023

Let me know if there is any interest in integrating this PR so I can work on another one wrt timestamping of signed jars. Otherwise there is no point ofc.

@mrjameshamilton
Copy link
Collaborator

Let me know if there is any interest in integrating this PR so I can work on another one wrt timestamping of signed jars. Otherwise there is no point ofc.

Sorry for the delay! The changes seem to make sense; we just need to run some extra internal tests to make sure nothing breaks.

@mrjameshamilton
Copy link
Collaborator

The changes passed the internal testing.

@netomi could you add some unit test(s) to automatically test these changes in ZipWriter behaviour in ProGuardCORE? i.e. test that the pass-through of the timestamp works.

@netomi
Copy link
Author

netomi commented Nov 26, 2023

I added a test case, but while doing so I found a roadblock that I was not aware of before.

The test case captures a case that does not pass atm. It's because the modification time in a zip archive is stored historically in dos format. Now the modification time is converted to dos format and encoded in 4 bytes. These 4 bytes can not capture a dos format completely, only even seconds can be encoded correctly with 4 bytes (and milliseconds are missing completely). The remaining data plus milliseconds is encoded in bits > 32, and thus get lost in the zip format. Now there is an extra field that allows to capture additional information, including a modification time with more precision. So when using the ZipOutputStream of the java jdk the modification time is stored in the local file header with limited precision and additionally stored in the extra field with higher precision. Now when Proguard rewrites the jar, it reads the correct time from the extra field, but then stores the time with limited precision only in the local file header as it does not support the extra field.

This leads to the case where modification times with odd seconds are 1 second off after processing with ProGuard. To solve that gracefully ProGuard-Core would need to support the extra field as well.

After realizing this, my original idea of using Proguard-Core as jarsigner replacement will probably not work as too much work will have to go into that to be as close to jarsigner as possible.

@mrjameshamilton
Copy link
Collaborator

I added a test case, but while doing so I found a roadblock that I was not aware of before.

The test case captures a case that does not pass atm. It's because the modification time in a zip archive is stored historically in dos format. Now the modification time is converted to dos format and encoded in 4 bytes. These 4 bytes can not capture a dos format completely, only even seconds can be encoded correctly with 4 bytes (and milliseconds are missing completely). The remaining data plus milliseconds is encoded in bits > 32, and thus get lost in the zip format. Now there is an extra field that allows to capture additional information, including a modification time with more precision. So when using the ZipOutputStream of the java jdk the modification time is stored in the local file header with limited precision and additionally stored in the extra field with higher precision. Now when Proguard rewrites the jar, it reads the correct time from the extra field, but then stores the time with limited precision only in the local file header as it does not support the extra field.

This leads to the case where modification times with odd seconds are 1 second off after processing with ProGuard. To solve that gracefully ProGuard-Core would need to support the extra field as well.

After realizing this, my original idea of using Proguard-Core as jarsigner replacement will probably not work as too much work will have to go into that to be as close to jarsigner as possible.

That's in interesting find! That means then that this PR is not so useful right now without further changes required?

@netomi
Copy link
Author

netomi commented Nov 29, 2023

I will look into ways to support writing the extra field as well. Also on a second thought, there should be a mechanism to indicate if the data entry has been modified upon writing to change the modification time to now instead of passing through. Not sure though yet how to best achieve that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants