Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bug] Attachment body parts should separately parse Content-Disposition and ContentID, possible resulting in an downloadable attachment that is also embedded #491

Closed
roy20021 opened this issue Jan 21, 2024 · 9 comments

Comments

@roy20021
Copy link

roy20021 commented Jan 21, 2024

Hello,

I am trying to use simple-java-mail for parsing and showing mails, however I am finding out an issue regards parts that should be treated as CID instead of Attachments.

More precisely, the library is not handling at all the different multipart subtype (mainly: mixed, alternative and related) and relies on the content-disposition header to determine if the part should be a CID or an attachment.
As per my trials, at least a message sent via gmail, the eml might be composed of multipart/related (https://www.ietf.org/rfc/rfc2387.txt ) and the CID parts declared with disposition 'attachment'.
As per https://www.ietf.org/rfc/rfc2183.txt (2.1, 2.2 paragraph) the disposition meaning is about automatic or manual actions from the user.

Example:
Example.zip

There is an image "doclife.jpg" that should appear in the cidMap (but it is not).

I propose to handle that more loosely: roy20021@eed9bf9

I haven't tested yet the change since #490 is a blocker.
Please provide a formatter (I am using eclipse) I should use if necessary.

@bbottema
Copy link
Owner

bbottema commented Jan 21, 2024

Hello roy, thank you for contributing. I haven't had much time to dive into this, but it's difficult to see what you changed exactly, with all the formatting changes. Would it be possible to split out the formatting from the actual change?

Somewhat related to #179

@roy20021
Copy link
Author

Can you apply a specific formatter to all the code and reference it into the repo? It will be extremely easier to contribute.
As an example: https://github.com/google/google-java-format

I describe below my changes:

  1. made MimeDataSource implements DataSource
  2. at MimeMessageParser changed the parseMimePartTree method in order to assume every attachment as a CID
  3. at MimeMessageParser renamed moveInvalidEmbeddedResourcesToAttachments to moveNonEmbeddedResourcesToAttachments changing its logic in order to iterate over all cids and move to attachment if not into the html content

@bbottema
Copy link
Owner

bbottema commented Jan 27, 2024

Thank you for the suggestion, but I don't want to format my whole code base right now.

I did a little bit of research, but it seems there is no clear behaviour for this. Sure, resources should be used as embedded based on CID, but it is not clear these same resources should not also be downloadable attachments. Meaning, a resource can both be an embedded image and a downloadable file.

I never realized this, but this seems to be the reason why I sometimes get emails in gmail where many of the embedded images are also listed as downloadable files. Apparently, gmail does this commonly, but this is not universal behaviour and the RFC's don't explicitly define this use case. It's more of a by-product due to the flexible nature of the RFC's.

What are your thoughts on this?

@bbottema
Copy link
Owner

I'm going to treat the handling of Content-Disposition separate from any embedding/CID references. This means I'm going to assume the resource should both be embedded and and attachment which in the case of Simple Java Mail means, the resources occurs in two list at the same time (embeddedImages and attachments).

@roy20021
Copy link
Author

I'm going to treat the handling of Content-Disposition separate from any embedding/CID references. This means I'm going to assume the resource should both be embedded and and attachment which in the case of Simple Java Mail means, the resources occurs in two list at the same time (embeddedImages and attachments).

Great, I think the same.

Are you going to take care of the changes? If you want me doing any changes let me know.

@bbottema
Copy link
Owner

I'm already having a crack at it 👍

bbottema pushed a commit that referenced this issue Feb 12, 2024
…t" and a ContentID for embedding it in HTML
@bbottema
Copy link
Owner

I'll release the fix tonight (my time)

@bbottema bbottema changed the title Attachment or CID shouldn't be disposition-driven Attachment pody parts should separately parse Content-Disposition and ContentID, possible resulting in an downloadable attachment that is also embedded Feb 13, 2024
@bbottema bbottema changed the title Attachment pody parts should separately parse Content-Disposition and ContentID, possible resulting in an downloadable attachment that is also embedded [bug] Attachment pody parts should separately parse Content-Disposition and ContentID, possible resulting in an downloadable attachment that is also embedded Feb 13, 2024
@bbottema bbottema changed the title [bug] Attachment pody parts should separately parse Content-Disposition and ContentID, possible resulting in an downloadable attachment that is also embedded [bug] Attachment body parts should separately parse Content-Disposition and ContentID, possible resulting in an downloadable attachment that is also embedded Feb 13, 2024
@bbottema
Copy link
Owner

Released in 8.6.3. If you could verify it works as you expect, that would be very helpful.

@bbottema bbottema added this to the 8.6.3 milestone Feb 13, 2024
@roy20021
Copy link
Author

roy20021 commented Mar 4, 2024

@bbottema I confirm it works as expected. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants