Add support for custom CA certificates #392

Merged Aug 2, 2023 (1 commit)

Conversation

rassie (Contributor) commented May 19, 2023

This PR adds the ability to add custom CA certificates to the Java truststore.

This capability used to exist in the old openjdk images but was removed from the eclipse-temurin images. This PR adds an entrypoint to the eclipse-temurin images that adds CA certificates from the /certificates path inside the image to the system certificate store and then replaces the JRE's truststore with that store. This only happens if the USE_SYSTEM_CA_CERTS environment variable is set, so by default no action is taken.
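A minimal sketch of the opt-in gate described above. This is not the entrypoint script from this PR: the function names, the CERT_DIR override, the file extensions, and the choice to treat an empty value as "not set" are all assumptions made for illustration.

```shell
# Hypothetical sketch; not the entrypoint script from this PR.

# Succeeds only when the opt-in variable is set to a non-empty value.
# Assumption: an empty value counts as "not set".
ca_certs_opt_in() {
    [ -n "${USE_SYSTEM_CA_CERTS:-}" ]
}

# Print one path per line for each *.crt / *.pem file in the mount directory.
# Assumption: the extensions are illustrative only.
list_custom_certs() {
    dir=${1:-/certificates}
    for f in "$dir"/*.crt "$dir"/*.pem; do
        [ -f "$f" ] && printf '%s\n' "$f"
    done
    return 0
}

# Describe what the entrypoint would do before handing over to the command.
plan_entrypoint() {
    if ca_certs_opt_in; then
        count=$(list_custom_certs "${CERT_DIR:-/certificates}" | wc -l | tr -d ' ')
        echo "import: $count file(s)"
    else
        echo "skip"
    fi
}

# A real entrypoint would finish by replacing itself with the command: exec "$@"
```

The point of the gate is that the default path does nothing at all, so images that never set the variable keep the untouched truststore.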

Important caveat (needs further discussion!)

This PR deviates from the discussion in #293 in one important respect. When discussing this feature, I made an assumption about the inner workings of the openjdk images that turned out to be incorrect. I thought that the openjdk image merged the system certificate store and the JRE's truststore, which was option c) in our discussion. However, it turns out that the system store was converted to truststore format and then replaced the JRE's store completely. This is option b) in the #293 discussion, which we dismissed as nonsensical.

While implementing this PR, I took the previously existing process in the openjdk images as my blueprint, which means I actually implemented b) instead of c). I would like to argue in favor of keeping this implementation, and I'm fully aware that this decision will need to be discussed further.

My arguments in favor are:

  • This is what the openjdk images used to implement; reinstating that functionality was the whole point of #293 (Support the use system cacerts as an option)
  • Merging two certificate truststores is certainly possible, but far more complicated than replacing one with the other
  • The differences between the system and JRE truststores are minimal (I can provide an evaluation if needed)
  • The feature is opt-in, unlike in the openjdk images, so the option to use the untampered truststore still exists and is the default
  • Merging truststores can still be implemented later, alongside replacing, behind an additional opt-in
  • The discussion in #293 (Support the use system cacerts as an option) gravitated naturally to option b) without us noticing
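For comparison, merging (option c) could be sketched with `keytool -importkeystore`, which copies the entries of one keystore into another. This is not part of this PR; the function name and the KEYTOOL override are hypothetical, and `changeit` is assumed as the store password because the test script in this description uses it.

```shell
# Hypothetical sketch of option c); this PR implements option b) instead.
# Copies every entry of the source truststore into the destination truststore.
# KEYTOOL can be overridden (for example with "echo") for a dry run.
merge_truststores() {
    src=$1
    dest=$2
    pass=${3:-changeit}
    "${KEYTOOL:-keytool}" -importkeystore -noprompt \
        -srckeystore "$src" -srcstorepass "$pass" \
        -destkeystore "$dest" -deststorepass "$pass"
}

# Dry-run usage with placeholder file names:
#   KEYTOOL=echo merge_truststores system-cacerts.jks jre-cacerts.jks
```

How clashing aliases should be handled is the part that makes merging more involved than replacing, and the sketch deliberately leaves that question open.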

Differences from openjdk images

Even though this PR is intended to re-introduce functionality previously available in openjdk images, there are important differences in the implementation and usage:

  • openjdk images added hooks to OS's certificate store update functionality (update-ca-trust / update-ca-certificates), but never actually updated the store on image start. This PR makes sure that truststores are updated in the entrypoint if the opt-in variable is set
  • openjdk images did not provide a dedicated directory for additional certificates, the user was expected to add them to /usr/share/pki/ca-trust-source/anchors/ or /usr/local/share/ca-certificates/, depending on the underlying OS. This PR provides a /certificates directory, which is a stable mount point for CA certificates.

Basically, with the openjdk images it was necessary to patch the entrypoint to include update-ca-certificates, whereas the eclipse-temurin images only require an opt-in environment variable to be set.
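The OS-dependent part mentioned above can be sketched as a lookup from the available update tool to the matching anchor directory. The function name is hypothetical and the check order is an assumption; the tool names and directories are the ones listed in this description.

```shell
# Hypothetical helper: print the anchor directory that matches the CA update
# tool found on PATH, or fail if neither tool is present.
ca_anchor_dir() {
    if command -v update-ca-trust >/dev/null 2>&1; then
        echo /usr/share/pki/ca-trust-source/anchors
    elif command -v update-ca-certificates >/dev/null 2>&1; then
        echo /usr/local/share/ca-certificates
    else
        return 1
    fi
}

# Usage: dest=$(ca_anchor_dir) || echo "no CA update tool found" >&2
```

A fixed /certificates mount point hides this difference from the user, who no longer needs to know which directory the underlying OS expects.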

Documentation

The documentation is missing completely, because I don't know (yet) where to put it.

Short itemized quick-start guide:

  • Mount custom CA certificates at /certificates/ inside the container
  • Activate certificate processing by setting the USE_SYSTEM_CA_CERTS environment variable to any value.
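The two quick-start steps can be combined into a single `docker run` call, sketched below. The wrapper function and the DOCKER override are hypothetical, and the image name in the usage comment is only a placeholder; whether a given published tag contains this entrypoint is not something this sketch can guarantee.

```shell
# Hypothetical wrapper: mount a local certificate directory at /certificates/
# and opt in to certificate processing, then run the given command.
# DOCKER can be overridden (for example with "echo") for a dry run.
run_with_custom_cas() {
    cert_dir=$1
    image=$2
    shift 2
    "${DOCKER:-docker}" run --rm \
        -v "$cert_dir:/certificates/" \
        -e USE_SYSTEM_CA_CERTS=1 \
        "$image" "$@"
}

# Usage (image name is a placeholder):
#   run_with_custom_cas "$(pwd)/certs" eclipse-temurin:17 \
#       keytool -list -cacerts -storepass changeit -alias dockerbuilder
```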

Testing

This patch has been tested semi-manually with the following Bash script, executed from the repository root:

#!/bin/bash

# shellcheck source=common_functions.sh
source ./common_functions.sh

mkdir -p certs

openssl req -nodes -new -x509 -subj "/DC=Termurin/CN=DockerBuilder" -keyout certs/server.key -out certs/server.crt >&/dev/null

check() {
    if [ "$1" = "$2" ]; then
        # output green unicode checkmark
        echo -n -e "\xE2\x9C\x94"
    else
        # output red unicode cross
        echo -n -e "\xE2\x9C\x98"
    fi
}

for ver in ${supported_versions}; do

    find ${ver} -iname 'Dockerfile.*' -printf "$ver/%P\n" | grep -v "windows" | while read -r dockerfile; do
        NAME=$(dirname $dockerfile)
        docker build -t "temurin/$NAME" -f $dockerfile "$NAME" >&/dev/null

        CMD1=date
        if [ "$ver" = "8" ]; then
            CACERTS=/opt/java/openjdk/lib/security/cacerts
            CACERTS2=/opt/java/openjdk/jre/lib/security/cacerts

            CMD2=(sh -c "keytool -list -keystore $CACERTS -storepass changeit -alias dockerbuilder || keytool -list -keystore $CACERTS2 -storepass changeit -alias dockerbuilder")
        else
            CMD2=(keytool -list -cacerts -storepass changeit -alias dockerbuilder)
        fi

        echo -n "$NAME: "

        docker run "temurin/$NAME" $CMD1 >/dev/null
        check $? 0
        docker run "temurin/$NAME" "${CMD2[@]}" >&/dev/null
        check $? 1

        docker run -v $(pwd)/certs:/certificates/ "temurin/$NAME" $CMD1 >/dev/null
        check $? 0
        docker run -v $(pwd)/certs:/certificates/ "temurin/$NAME" "${CMD2[@]}" >&/dev/null
        check $? 1

        docker run -v $(pwd)/certs:/certificates/ -e USE_SYSTEM_CA_CERTS=1 "temurin/$NAME" $CMD1 >&/dev/null
        check $? 0
        docker run -v $(pwd)/certs:/certificates/ -e USE_SYSTEM_CA_CERTS=1 "temurin/$NAME" "${CMD2[@]}" >&/dev/null
        check $? 0
        echo
    done
done

This script generates a certificate, builds all images, and tests that the entrypoint does not fail:

  • if nothing is added
  • if the certificate is mounted, but opt-in is not set
  • if the certificate is mounted and opt-in is set

The script checks both that a normal command (in this case date) can be executed and that the certificate is in the JRE's truststore only when it is expected to be there (when the certificate is mounted and the opt-in is set).

The following output is produced by the test script, reporting a full success:

8/jre/ubi/ubi9-minimal: ✔✔✔✔✔✔
8/jre/centos: ✔✔✔✔✔✔
8/jre/alpine: ✔✔✔✔✔✔
8/jre/ubuntu/focal: ✔✔✔✔✔✔
8/jre/ubuntu/jammy: ✔✔✔✔✔✔
8/jdk/ubi/ubi9-minimal: ✔✔✔✔✔✔
8/jdk/centos: ✔✔✔✔✔✔
8/jdk/alpine: ✔✔✔✔✔✔
8/jdk/ubuntu/focal: ✔✔✔✔✔✔
8/jdk/ubuntu/jammy: ✔✔✔✔✔✔
11/jre/ubi/ubi9-minimal: ✔✔✔✔✔✔
11/jre/centos: ✔✔✔✔✔✔
11/jre/alpine: ✔✔✔✔✔✔
11/jre/ubuntu/focal: ✔✔✔✔✔✔
11/jre/ubuntu/jammy: ✔✔✔✔✔✔
11/jdk/ubi/ubi9-minimal: ✔✔✔✔✔✔
11/jdk/centos: ✔✔✔✔✔✔
11/jdk/alpine: ✔✔✔✔✔✔
11/jdk/ubuntu/focal: ✔✔✔✔✔✔
11/jdk/ubuntu/jammy: ✔✔✔✔✔✔
17/jre/ubi/ubi9-minimal: ✔✔✔✔✔✔
17/jre/centos: ✔✔✔✔✔✔
17/jre/alpine: ✔✔✔✔✔✔
17/jre/ubuntu/focal: ✔✔✔✔✔✔
17/jre/ubuntu/jammy: ✔✔✔✔✔✔
17/jdk/ubi/ubi9-minimal: ✔✔✔✔✔✔
17/jdk/centos: ✔✔✔✔✔✔
17/jdk/alpine: ✔✔✔✔✔✔
17/jdk/ubuntu/focal: ✔✔✔✔✔✔
17/jdk/ubuntu/jammy: ✔✔✔✔✔✔
20/jre/ubi/ubi9-minimal: ✔✔✔✔✔✔
20/jre/alpine: ✔✔✔✔✔✔
20/jre/ubuntu/jammy: ✔✔✔✔✔✔
20/jdk/ubi/ubi9-minimal: ✔✔✔✔✔✔
20/jdk/alpine: ✔✔✔✔✔✔
20/jdk/ubuntu/jammy: ✔✔✔✔✔✔

OS support

This PR explicitly excludes Windows support, mostly because I lack the expertise and an actual Windows installation to develop and test it. Additionally, it's unclear whether openjdk included CA certificate support for Windows.

TODO / Help needed

I need some guidance on the following items:

  • Where can I add some documentation for this feature?
  • Should the opt-in environment variable's naming be adjusted to meet some criteria?
  • How should the test script be integrated?
  • How should the caveat about merging certificate stores be resolved? (see above)

Fixes: #293

jerboaa commented May 19, 2023

Please be sure to sign the ECA.

rassie (Contributor Author) commented May 19, 2023

Already on it

gaeljw commented May 19, 2023

As the original requester of the feature, I'm fine with the choice to replace the certs rather than merge them, because merging can be done beforehand if needed, and it's probably better to let the user choose the "merging strategy" (I'm not that familiar with this stuff, but I've heard that there are several ways to do it).

karianna (Contributor) commented

@rassie Can we set this to ready for review now?

rassie marked this pull request as ready for review May 30, 2023 05:58

rassie (Contributor Author) commented May 30, 2023

@rassie Can we set this to ready for review now?

From my point of view, of course. I had expected a bit more discussion of the TODOs, but I suppose that can be done as part of the review.

karianna (Contributor) commented

@jerboaa Over to you for the first pass I think :-)

jerboaa commented May 30, 2023

It's on my list of things to look at but likely not before end of next week. Sorry.

jerboaa left a comment

This seems alright to me from a conceptual perspective. Please integrate the test script into the repo and use it from the PR tester workflow as an additional test. Otherwise, I'd defer to @gdams for a review as he's more familiar with the scripting.

dockerfile_functions.sh (review thread, resolved)

rassie (Contributor Author) commented Jun 12, 2023

Please integrate the test script into the repo and use it from the PR tester workflow as an additional test.

I'll try to figure out how that works, but if I have some questions, is there someone specific I could ask, either here or on Adoptium Slack?

.test/config.sh (outdated review thread, resolved)

jerboaa commented Jun 13, 2023

Paging @gdams for input on the test strategy. Thanks!

rassie (Contributor Author) commented Jun 13, 2023

@jerboaa so it seems the test is being found and executed, so we've got the infrastructure part right. I'll look over the script error later today or tomorrow.

rassie (Contributor Author) commented Jun 14, 2023

It seems the tests need to be in the tests subdirectory. Let's try this again...

rassie (Contributor Author) commented Jun 14, 2023

Another fix for JRE/JDK 8 detection. The fix for Windows exclusion is still pending.

rassie (Contributor Author) commented Jun 14, 2023

I think this should be it. 🤞

jerboaa left a comment

This seems fine to me. Thanks for the contribution!

karianna (Contributor) commented

The code changes LGTM. I'll ask our (MSFT) infra and security folks to take a look at this as well. Sorry for the hold-up, but this is pretty critical stuff, and more eyes are good :-)

rassie (Contributor Author) commented Jun 15, 2023

@jerboaa on the topic of documentation: should I be updating https://github.com/docker-library/docs/blob/master/eclipse-temurin/content.md in some way?

jerboaa commented Jun 15, 2023

@rassie Yes that seems suitable. But only once this PR is merged and builds have been produced. Feel free to produce a draft PR there, though. We'd also welcome a blog post here: https://github.com/adoptium/adoptium.net/tree/main/content

Thanks!

gdams (Member) commented Jun 15, 2023

I'll do a thorough review today, also tagging @tianon and @yosifkit as they may have some comments on the proposed implementation

@gdams

This comment was marked as outdated.

rassie (Contributor Author) commented Aug 2, 2023

@gdams Thanks, fingers crossed!

tellison (Contributor) commented Aug 2, 2023

@rassie the PMC wishes to pass on thanks for your work on this new feature. We realise that it has been open for a long time, and recognise that it has been well thought through and provides thorough local testing. This is a great new capability for the official Temurin images and as gdams noted, once this has been validated on one Java version we will roll it out to all.

Thanks again 👍

rassie added 3 commits to rassie/docs that referenced this pull request Aug 10, 2023
yosifkit added a commit to infosiftr/tomcat that referenced this pull request Aug 11, 2023
The upstream entrypoint is `sh` and so loses dotted environment variables; let's prevent that from happening by just skipping it, as the `tomcat` images are not reliant on its functionality. See docker-library/docs#2338 and adoptium/containers#392 for info about what it provides.

Fixes docker-library#302 which is a recurrence of docker-library#77

QuintenQVD0 commented Aug 14, 2023

This PR broke all images that depend on this one, as we have our own entrypoint and run as a low-privileged user. thx

rassie (Contributor Author) commented Aug 14, 2023

This PR broke all images that depend on this one, as we have our own entrypoint and run as a low-privileged user. thx :(

Care to elaborate? What exactly is breaking when you override the entrypoint? I think I tested that use case when developing and it worked fine, IIRC. I would fix it ASAP once I understand the problem correctly.

QuintenQVD0 commented

We have fixed it, but your entrypoint.sh is root-owned and we cannot overwrite it directly, as our images are run by a low-privileged user called container. Example:

https://github.com/parkervcp/yolks/tree/master/java/11

The base image is yours and you now have some logic in its entrypoint. We do not override that entrypoint, because we use CMD and copy in our own entrypoint script. When we start our containers, it tries to run your entrypoint, but that file is root-owned, so it fails. We got around it, but we wanted to let you know, since not every use of this as a base image will fit everyone.

rassie (Contributor Author) commented Aug 14, 2023

We have fixed it, but your entrypoint.sh is root-owned and we cannot overwrite it directly, as our images are run by a low-privileged user called container. Example:

Oh, so we have a clash of filenames, i.e. we've introduced /entrypoint.sh, which you have been using until now? It's rather unfortunate, but I can understand how it came to be, and it is certainly unexpected (both for user and developer), I agree.

The base image is yours and you now have some logic in its entrypoint. We do not override that entrypoint, because we use CMD and copy in our own entrypoint script. When we start our containers, it tries to run your entrypoint, but that file is root-owned, so it fails. We got around it, but we wanted to let you know, since not every use of this as a base image will fit everyone.

I've tried running eclipse-temurin with an unprivileged user and had no problems calling the entrypoint script. Can you show me how it fails?

In general, thank you for the feedback. Is there anything you'd have us do differently with this feature?

gdams (Member) commented Aug 14, 2023

@rassie I would suggest renaming our custom entrypoint.sh script as it's the most likely name to cause conflict? What do you think?

dashford commented

Hi @rassie - have you any thoughts on this issue posted here? #415

We're running into issues on some of our applications where env variables with a - in them have stopped returning data. We think it's because of the change in shell: the entrypoint script is now defined in the Dockerfile and sets the shell to /usr/bin/sh rather than the default shell for ubuntu:22.04.

rassie (Contributor Author) commented Aug 14, 2023

@rassie I would suggest renaming our custom entrypoint.sh script as it's the most likely name to cause conflict? What do you think?

That would be one of the easier solutions; however, it's a breaking change, even though it's only been a couple of weeks. Either way, someone from the project will need to decide how to proceed.

gdams (Member) commented Aug 14, 2023

@rassie I would suggest renaming our custom entrypoint.sh script as it's the most likely name to cause conflict? What do you think?

That would be one of the easier solutions; however, it's a breaking change, even though it's only been a couple of weeks. Either way, someone from the project will need to decide how to proceed.

Since we've already created a breaking change, I'm going to back this out from the upstream images (for now) so we can work on a fix here.

rassie (Contributor Author) commented Aug 14, 2023

Hi @rassie - have you any thoughts on this issue posted here? #415

We're running into issues on some of our applications where env variables with a - in them have stopped returning data. We think it's because of the change in shell: the entrypoint script is now defined in the Dockerfile and sets the shell to /usr/bin/sh rather than the default shell for ubuntu:22.04.

From that issue:

The new entrypoint from #392 is sh on Ubuntu and Alpine images and so loses variables. Please change all the entrypoint scripts to use bash

I have no problem with changing to bash; however, sh has been the common ground for all Linux derivatives because of Alpine, which doesn't have bash installed by default. Should we just install bash on Alpine and be done with it?
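A small probe can show which shells lose such variables, as reported in #415. This is a sketch with hypothetical function names: it sets a variable through `env`, starts the shell under test, and checks whether a child process still sees the variable. The result depends on which shell `sh` resolves to on a given image, so no output is claimed here.

```shell
# Hypothetical probe: succeeds if a variable called NAME is still visible to a
# child process after passing through the shell SHELL_BIN.
survives_shell() {
    shell_bin=$1
    name=$2
    env "$name=probe" "$shell_bin" -c 'env' | grep -qxF "$name=probe"
}

# Print a short report for each shell that is installed.
report_shells() {
    for shell_bin in sh bash; do
        command -v "$shell_bin" >/dev/null 2>&1 || continue
        for name in PLAIN_NAME dotted.name dashed-name; do
            if survives_shell "$shell_bin" "$name"; then
                echo "$shell_bin keeps $name"
            else
                echo "$shell_bin drops $name"
            fi
        done
    done
}

# Usage: report_shells
```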

rassie (Contributor Author) commented Aug 14, 2023

@gdams Thanks, gives us a bit of time. I'll prepare a PR with moved entrypoint script (somewhere with little chance of name clash) and with bash running the entrypoint, correct?

gdams (Member) commented Aug 14, 2023

@gdams Thanks, gives us a bit of time. I'll prepare a PR with moved entrypoint script (somewhere with little chance of name clash) and with bash running the entrypoint, correct?

yes please

QuintenQVD0 commented

Thank you for your fast and great response. This is the error:
environment/docker: failed to start container: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "/entrypoint.sh": permission denied: unknown

The container is forced to start as a low-privileged user called "container", and that file is indeed a naming conflict, so it tries to execute your entrypoint, which is root-owned, or at least conflicts with ours, as our CMD entry specifies the same file.
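One way to narrow down a "permission denied" on an entrypoint is to check whether the file is executable by users other than its owner and group. The helper below is a hypothetical diagnostic, not something from this PR, and the thread does not confirm that file mode was the actual cause here.

```shell
# Hypothetical diagnostic: succeeds if the "others" execute bit is set on the
# given file, judged from the tenth character of the `ls -ld` mode string.
world_executable() {
    case $(ls -ld -- "$1" 2>/dev/null) in
        ?????????[xt]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage inside a container:
#   world_executable /entrypoint.sh || echo "not executable by other users"
```

If a derived image copies its own script to the same path with a restrictive mode, a check like this makes the clash visible before the container is started as an unprivileged user.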

Development

Successfully merging this pull request may close these issues.

Support the use system cacerts as an option
9 participants