Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Master enable ostree containers #4561

Merged

Conversation

jkonecny12
Copy link
Member

@jkonecny12 jkonecny12 commented Feb 16, 2023

RPM OSTree is getting new functionality to use containers as the base
image. Thanks to that it's possible to use standard container
repositories.

This source enables to use these repositories.

Depends on

Related: rhbz#2125655

Please check that your PR follows these rules:

  • Code conventions. tl;dr: Follow PEP8, wrap at 100 chars, and provide docstrings.

  • Commit message conventions. tl;dr: Heading, empty line, longer explanations, all wrapped manually. If in doubt, write a longer commit message with more details.

  • Release notes and docs if the PR adds something major or changes existing behavior.
    If you don't know how, ask us for help!

  • Add better coverage. The rpm-ostree payload needs tests to cover new functionality.

  • Change osname to stateroot. If the changes are accepted here Enable container support for ostreesetup command pykickstart/pykickstart#434 .

  • Add skopeo. The ostree container deployment is using skopeo tool to download image. However, it's not part of Lorax. Will add this as Anaconda dependency.

  • Add kickstart test. We already have kickstart test for ostreesetup would be great to add one also for the new ostreecontainer command.

@jkonecny12 jkonecny12 added f38 Fedora 38 f39 labels Feb 16, 2023
Comment on lines 21 to 45
def containers_needs_network(url):
"""Detect if the container registry URL needs network.

:return: True or False
"""
if not url:
return False
# Only docker:// and registry:// (could be omitted) should be used for remote installations
# however, you can use 'localhost' in these situations to connect local resources.
if url.startswith("docker://localhost") or \
url.startswith("registry://localhost") or \
url.startswith("localhost"):
return False

# Not connecting localhost (above check) anything else in docker:// or registry:// should
# require network.
if url.startswith("docker://") or \
url.startswith("registry://") or \
"://" not in url:
return True

return False
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cgwalters What do you think about this solution for network detection?

Copy link
Contributor

@cgwalters cgwalters Feb 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, seems fine to start. It'd be better probably to have this be a shared library function somewhere, we could even do something like ostree container url-requires-network <imgref> or perhaps skopeo image-requires-network <imgref>.

But I'm not really concerned about this to start, I think the 99% cases are:

  • netinst ISO pulling from a remote registry
  • "Full" ISO pulling from embedded content i.e. oci:/path/to/embedded/image as is done for desktop installers today, and is also how the Fedora CoreOS and derivative Live ISO works.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great idea about the shared version. Please ping us in case it's implemented.

I think you are correct but I rather play safe in Anaconda about these.

@jkonecny12 jkonecny12 added the blocked Don't merge this pull request! label Feb 16, 2023
@jkonecny12 jkonecny12 force-pushed the master-enable-ostree-containers branch from 1bfbeae to 6047d4b Compare February 16, 2023 15:26
Copy link
Contributor

@cgwalters cgwalters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks so much for working on this!

Comment on lines 21 to 45
def containers_needs_network(url):
"""Detect if the container registry URL needs network.

:return: True or False
"""
if not url:
return False
# Only docker:// and registry:// (could be omitted) should be used for remote installations
# however, you can use 'localhost' in these situations to connect local resources.
if url.startswith("docker://localhost") or \
url.startswith("registry://localhost") or \
url.startswith("localhost"):
return False

# Not connecting localhost (above check) anything else in docker:// or registry:// should
# require network.
if url.startswith("docker://") or \
url.startswith("registry://") or \
"://" not in url:
return True

return False
Copy link
Contributor

@cgwalters cgwalters Feb 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, seems fine to start. It'd be better probably to have this be a shared library function somewhere, we could even do something like ostree container url-requires-network <imgref> or perhaps skopeo image-requires-network <imgref>.

But I'm not really concerned about this to start, I think the 99% cases are:

  • netinst ISO pulling from a remote registry
  • "Full" ISO pulling from embedded content i.e. oci:/path/to/embedded/image as is done for desktop installers today, and is also how the Fedora CoreOS and derivative Live ISO works.

data.osname = "fedora-coreos"
data.remote = "fcos-28"
data.url = "quay.io/fedora/coreos:stable"
data._signature_verification_enabled = False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, the FCOS image actually does have ostree-based GPG signatures, and verifying those does work.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are just test values, we don't run any ostree command here.

"""Test the configuration property."""
data = RPMOSTreeContainerConfigurationData()
data.osname = "fedora-coreos"
data.remote = "fcos-28"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be fedora most likely; we're just using the ostree remote name as a way to find the GPG verification setup.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The value is not important in tests :).

@@ -580,7 +599,11 @@ def run(self):
# We use the UNTRUSTED flag as a way to help verification for potentially
# corrupted ISOs - this way ostree will validate the checksum of objects as
# it writes data.
pull_opts = {'refs': Variant('as', [ref]), 'flags': Variant('i', OSTree.RepoPullFlags.UNTRUSTED)}
if self._is_container:
pull_opts = {'flags': Variant('i', OSTree.RepoPullFlags.UNTRUSTED)}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is going to work. It needs to instead fork off ostree container image pull if that's the goal (to separate pulling from deployment).

(Which i think may complicate things as that isn't part of the C library today exposed via pygobject, so progress reporting is more complex)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to know more about this. I honestly not sure why we are pulling the remote then remove remote and do the deployment. What is the difference between deployment and pull?

Honestly, I don't think I care if it's done by one command or two. Less is better for us probably?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main reason to distinguish between the pulling and deployment is that pulling from the network is something for which one commonly wants e.g. a progress bar in a GUI environment.

The deployment phase is usually fast enough that it doesn't need it.

But indeed, today in Anaconda we could just skip this separate pull phase and rely on the deploy code automatically doing a pull if the image isn't present.

This whole discussion is really exactly the semantics of e.g.:

$ podman run busybox echo hello

If you don't have the busybox image present, podman will pull it by default. But it also means that if the image pull fails (due to network flakes) the command will fail. And so in many environments there's going to be a separation between pulling (and retries) versus the run/deployment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the deployment is enough for this task we can probably just read command output to get progress reporting? Or for first version don't complicate things and instead just show "deploying". Could be improved later.

Should we leave the pull for ostree repository and bypass that just for container or there is no benefit in doing that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, worst case anaconda in the GUI could display a terminal with the status.

(That said, what will be glaring in this model is the fact that the ostree stack today is not translated, while Anaconda last I looked is...)

@jkonecny12
Copy link
Member Author

Found another blocker for this feature. We don't have skopeo in the installation environment so skopeo needs to be added to Lorax.

@jkonecny12 jkonecny12 force-pushed the master-enable-ostree-containers branch from 6047d4b to 59d526b Compare February 17, 2023 23:00
@jkonecny12
Copy link
Member Author

jkonecny12 commented Feb 17, 2023

UPDATED:

  • Add release note
  • Lot of fixes in code
  • Lot of fixes in tests
  • Add tests for rpm_ostree payload with container source

Test coverage should be fine after these changes. Now I need to make it working in real world.

@jkonecny12 jkonecny12 force-pushed the master-enable-ostree-containers branch from 59d526b to e80c0aa Compare February 20, 2023 09:13
@jkonecny12
Copy link
Member Author

Rebased.

@jkonecny12
Copy link
Member Author

Add missing skopeo dependency.

Copy link
Contributor

@poncovka poncovka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise, it looks good to me. Thanks!

anaconda.spec.in Outdated
@@ -102,6 +102,10 @@ Requires: python3-systemd
Requires: python3-productmd
Requires: python3-dasbus >= %{dasbusver}
Requires: flatpak-libs
# depndencies for rpm-ostree payload module
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a typo.

:Summary: Add support for OSTree native containers (#2125655)

:Description:
Fedora is adding a new enhance the (rpm-)ostree stack to natively support OCI/Docker
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a new enhancement for ?

if not self._data.signature_verification_enabled:
remote_options['gpg-verify'] = Variant('b', False)
else:
if not self._data.gpg_verification_enabled:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would hide these details in new methods of this task. This logic can be moved to a new method called something like _is_gpg_verify_enabled that will return True or False. Then you can set up the remote options based on one check.

if self._data.is_container():
ref = self._data.url
else:
ref = RpmOstree.varsubst_basearch(self._data.ref)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. The ref object can be returned by a new method.

if source:
return source

return self._get_source(SourceType.RPM_OSTREE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method can be simplified:

return self._get_source(SourceType.RPM_OSTREE_CONTAINER) or self._get_source(SourceType.RPM_OSTREE)

@jkonecny12 jkonecny12 force-pushed the master-enable-ostree-containers branch from a0d9c6f to 9f44e70 Compare February 23, 2023 22:02
@jkonecny12
Copy link
Member Author

jkonecny12 commented Feb 23, 2023

UPDATED:

  • More fixes in tests
  • Add FIXME for network requirement detection because format is not what was expected
  • Implement notes from @poncovka above. Thanks!

@cgwalters I've build ISO with the skope new requirement. However, I'm not able to make that work because I'm still getting message:
error: Performing deployment: Creating importer: Failed to spawn skopeo: No such file or directory (os error 2)

Unfortunately I'm not sure what happens and I'm not able to find more verbose output. When I use skope inspect it seems to be working just fine. Seems that I need deeper input from someone who knows the code a bit...
Here is the ISO and kickstart test if you want to play with it: https://jkonecny.fedorapeople.org/redhat/bugs/2125655/

To test this, run the ISO and add inst.ks=http;.... which is path to the kickstart file. When the installation fails just switch to terminal and you can see in journal the failed command and try it manually. If you need any assistance please ping me.

cgwalters added a commit to cgwalters/rpm-ostree that referenced this pull request Feb 25, 2023
Otherwise, minimal environments like Anaconda might omit this.
xref rhinstaller/anaconda#4561 (comment)
@cgwalters
Copy link
Contributor

Hi, I think this is because the Anaconda environment is missing setpriv. See https://github.com/ostreedev/ostree-rs-ext/blob/d70eb3f5ff26a44fed4cf76416cfbacd6c2e9a2e/lib/src/isolation.rs#L30

Filed coreos/rpm-ostree#4317

# FIXME: this is not useful because the URL seems to request reference-scheme
# https://github.com/ostreedev/ostree-rs-ext/blob/b93946a1c512661ed4026914ff48c7c4ef25d18a/lib/src/container/mod.rs#L167
# self.module.configuration.url = "localhost/my/path"
# assert self.module.network_required is False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it won't work, please drop it from the final version.

@jkonecny12
Copy link
Member Author

Hi, I think this is because the Anaconda environment is missing setpriv. See https://github.com/ostreedev/ostree-rs-ext/blob/d70eb3f5ff26a44fed4cf76416cfbacd6c2e9a2e/lib/src/isolation.rs#L30

Filed coreos/rpm-ostree#4317

Would it be possible to also add dependency for skopeo to ostree? It's not that way right now and we don't use skopeo directly to have it as dependency.

@jkonecny12
Copy link
Member Author

jkonecny12 commented Feb 28, 2023

Just testing the setpriv. It's part of the util-linux package which is cleaned by Lorax cleanups.

I'll make PR also to Lorax if this will work correctly.

@cgwalters
Copy link
Contributor

cgwalters commented Feb 28, 2023

Ah, yes. The space savings from filtering packages like util-linux really can't be worth the maintenance overhead of the bugs (like this) that it introduces; at least for any Anaconda image which contains a GUI.

@jkonecny12
Copy link
Member Author

Ah, yes. The space savings from filtering packages like util-linux really can't be worth the maintenance overhead of the bugs (like this) that it introduces; at least for any Anaconda image which contains a GUI.

Agree and we don't plan to keep that with Image Builder.


I was able to move a bit. By using KS command:

ostreecontainer --url=ostree-unverified-registry:quay.io/fedora/fedora-coreos:stable

and changing the lorax. I was able to get to this error where Anaconda complains about missing /mnt/sysroot/boot/efi/EFI/fedora/grub.cfg Which I think is because fedora-coreos used for testing has a different way of bootloader solution which Anaconda didn't expect and tries to write our configuration for grub. @cgwalters is that the case or is it because of bootc changes?

anaconda.spec.in Outdated
@@ -102,6 +102,10 @@ Requires: python3-systemd
Requires: python3-productmd
Requires: python3-dasbus >= %{dasbusver}
Requires: flatpak-libs
# dependencies for rpm-ostree payload module
Requires: rpm-ostree

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see where rpm-ostree and skopeo are used. Is it transparently called by another command?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, today the command ostree container is actually forwarded to rpm-ostree. (This is confusing I know, but it's to avoid duplicating the Rust code for containers...)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ostree container image deploy calls skopeo and I think also rpm-ostree is called by ostree. I took that dependency from Lorax. Might be worth checking why it's not ostree...

@jkonecny12
Copy link
Member Author

jkonecny12 commented Mar 3, 2023

@cgwalters could you please help me with the grub error above? I would like to have a working version.

Anaconda tries to write to grub.cfg file and we are getting this:

15:49:30,918 WARNING org.fedoraproject.Anaconda.Modules.Storage:FileNotFoundError: [Errno 2] No such file or directory: '/mnt/sysroot/boot/efi/EFI/fedora/grub.cfg'

@jkonecny12 jkonecny12 force-pushed the master-enable-ostree-containers branch from 9f44e70 to 3fc7480 Compare March 3, 2023 11:29
@jkonecny12
Copy link
Member Author

jkonecny12 commented Mar 3, 2023

UPDATED:

  • Removed container network detection function. We need a better solution but that could be added later.

@cgwalters
Copy link
Contributor

I was able to get to this error where Anaconda complains about missing /mnt/sysroot/boot/efi/EFI/fedora/grub.cfg Which I think is because fedora-coreos used for testing has a different way of bootloader solution which Anaconda didn't expect and tries to write our configuration for grub. @cgwalters is that the case or is it because of bootc changes?

Yes. This is also related to e.g. https://pagure.io/workstation-ostree-config/pull-request/344

What I'd actually like to accomplish here is to have anaconda switch to using bootc install-to-filesystem, which is code that just landed pretty recently. And specifically related to this, bootc (unlike ostree) takes over installing the bootloader, and hence it owns today generating grub.cfg.

(I know this is a bigger change because it hence needs to take over more of what anaconda is doing today with the bootloader...so for now, it does make sense to continue to work to enable the pure-ostree-container logic)

Specifically instead of trying with the fedora-coreos image, well, I was hoping to get official desktop container images pushed (see e.g. https://pagure.io/releng/issue/11047 and it looks like that's super close and needs one more fix) but anyways you can try for now the images built here https://github.com/cgwalters/sync-fedora-ostree-containers

specifically e.g. ghcr.io/cgwalters/fedora-silverblue:37 etc.

@cgwalters
Copy link
Contributor

Also if you need to make changes to that image to test stuff, you can now use any container buildsystem (e.g. a Dockerfile) to derive a custom image and inject directories or grub configs as needed.

@jkonecny12 jkonecny12 force-pushed the master-enable-ostree-containers branch from 77de815 to 7939f76 Compare March 16, 2023 11:35
@jkonecny12
Copy link
Member Author

jkonecny12 commented Mar 16, 2023

Fix the broken indentation.

@jkonecny12
Copy link
Member Author

/kickstart-test --testtype smoke

@jkonecny12
Copy link
Member Author

/kickstart-test --testtype payload

Copy link
Contributor

@poncovka poncovka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@jkonecny12
Copy link
Member Author

HMC installation failed on timetout -> shouldn't be related

Copy link
Contributor

@VladimirSlavik VladimirSlavik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the code really wrongly formed. A lot of the if is_container() spaghetti could be avoided, starting with better data class field naming and common ancestors, and this problem then flows all the way from the data/config to the task's run().

I won't nack this because it's something that people ask for and is running late. BUT, I won't ack this either; I do not want to maintain this, the right time to do it right is now, by the author. It will work, but it feels write-only.

Comment on lines +74 to +93
def _get_stateroot(data):
"""Get stateroot.

The OSTree renamed old osname to stateroot for containers.

:param data: OSTree source structure
:return str: stateroot or osname value based on source
"""
if data.is_container():
# osname was renamed to stateroot so let's use the new name
if data.stateroot:
return data.stateroot
else:
# The stateroot doesn't have to be defined
# https://github.com/ostreedev/ostree-rs-ext/pull/462/files
# However, it's working just for a subset of calls now.
# TODO: Remove this when all ostree commands undestarstands this
return "default"
else:
return data.osname
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of this code simply does not have to exist. I think you should have renamed one of these fields so that they match. Especially when the first thing "stateroot" docstring says is that it's the osname...

Comment on lines +105 to +108
if data.is_container():
return data.signature_verification_enabled
else:
return data.gpg_verification_enabled
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto here. If it's the same thing, just call it the same. The docstrings also say that it's just gpg now, so why did we have to invent a new name?

Comment on lines +66 to +71
# Variable substitute the ref: https://pagure.io/atomic-wg/issue/299
if data.is_container():
# we don't have ref with container; there are not multiple references in one container
return data.url
else:
return RpmOstree.varsubst_basearch(data.ref)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment is above both calls, so perhaps you should substitute also for the url?

)

log.info("ostree admin deploy starting")
if self._data.is_container():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part and below feel as if you forgot that polymorphism exists?

docs/release-notes/ostree-native-containers-support.rst Outdated Show resolved Hide resolved
Comment on lines -539 to +530
data = self._get_data()
data = _make_config_data()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You forgot to remove the actual function after it's not called here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Fixed.

Comment on lines +36 to +39
@staticmethod
def is_container():
"""Is this native container source?"""
return False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You extended this class so it's the same as the new class... but then made it different anyway?

@jkonecny12 jkonecny12 force-pushed the master-enable-ostree-containers branch from 7939f76 to 021a5b0 Compare March 16, 2023 16:09
@jkonecny12 jkonecny12 removed the blocked Don't merge this pull request! label Mar 16, 2023
@jkonecny12
Copy link
Member Author

Fixed typo in RN and leftover testing method mentioned by @VladimirSlavik .

For other point we agreed to do it after merging this PR.

@jkonecny12
Copy link
Member Author

/kickstart-test --testtype smoke

@cgwalters
Copy link
Contributor

A lot of the if is_container() spaghetti could be avoided

In the future though, I think we can remove the non-container ostree backend.

@jkonecny12 jkonecny12 force-pushed the master-enable-ostree-containers branch from 021a5b0 to 251dc7a Compare March 17, 2023 16:18
@jkonecny12
Copy link
Member Author

New ISO with new pykickstart (reason for failures here) on Cockpit side is prepared cockpit-project/bots#4530. However, it has an issue blocking for merge. Hopefully, we will be able to resolve that on Monday.

@jkonecny12
Copy link
Member Author

/kickstart-test --testtype smoke

1 similar comment
@jkonecny12
Copy link
Member Author

/kickstart-test --testtype smoke

@VladimirSlavik
Copy link
Contributor

Probably you want to rebase and drop the pykickstart bump, it's at next version already.

RPM OSTree is getting new functionality to use containers as the base
image. Thanks to that it's possible to use standard container
repositories.

This source enables to use these repositories.

Related: rhbz#2125655
We have a new RPMOStreeContainer source now which enables us to use
containers as installation source. In this commit we adapt the current
rpm_ostree to be able to consume this new container source.

Add a lot of rpm ostree tasks tests for the container source.

Resolves: rhbz#2125655
There is already existing top level function to create the same, so
let's re-use it and remove duplication.
Our intention is to move dependencies specific for Anaconda execution
from Lorax into Anaconda. Let's do that for rpm-ostree because we need
to also add `skopeo` project to be able to download container images.

Require rpm-ostree version which contains:
- ostreedev/ostree-rs-ext#464 (simplified
syntax)
- ostreedev/ostree-rs-ext#462 (stateroot is not
mandatory)

Related: rhbz#2125655
Based on my discussion and recommendation on the bug, the 'registry'
transport should be the only one which needs network to work. Other ways
of transport should be used for local management.

Related: rhbz#2125655
@jkonecny12 jkonecny12 force-pushed the master-enable-ostree-containers branch from 251dc7a to 377f54a Compare March 21, 2023 08:55
@jkonecny12
Copy link
Member Author

/kickstart-test --testtype smoke

@jkonecny12 jkonecny12 merged commit f0ece1f into rhinstaller:master Mar 21, 2023
@rvykydal rvykydal mentioned this pull request Jan 12, 2024
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
f38 Fedora 38 f39
Development

Successfully merging this pull request may close these issues.

6 participants