Refactor for deploying a Xenial buildfarm. #146
Conversation
In an offline discussion earlier this week I decided that testing with and supporting a single-node buildfarm was not going to be a goal for this PR. The refactoring I've done so far should make it much more doable, but right now I want to focus on getting the official buildfarm tested and then migrated to Xenial. Once that's complete, circling back to support a single-node configuration can be done without time pressure. It's still something we want to do, as it greatly eases running local or alternate buildfarms on smaller setups.
I also found and fixed some bugs related to the systemd <-> initscript interaction for the agent profile, which was blocking further work on the repo profile.
This PR is completely untested on Trusty and I don't currently see a reason to try to achieve Trusty compatibility. Plugin dependency management and versioning is still a royal pain, so there will be some post-setup manual work to do. As this is my first time migrating a ROS buildfarm, I'll be looking for help preparing a test and migration checklist.
👍 Supporting only Xenial is absolutely fine.
As a first step that might be fine. But in order to use this for CI testing it needs to be fully automated at some point.
I guess the "best" test is setting up a test farm, letting it build e.g. all of Lunar, and comparing that the jobs which pass on the live farm also pass on the test farm.
Do you know how this was dealt with previously? The Puppet module for Jenkins doesn't resolve plugin dependencies because only the latest (non-LTS, even) plugin versions have dependency metadata provided via the Jenkins wiki. There is an effort underway at rtyler/jpm to get a sophisticated package manager for Jenkins plugins off the ground, but it's also in very early stages and doesn't yet support LTS. EDIT: I should point out that this step is currently necessary using the master configs on Trusty; it's not a regression caused by the refactor / updates. It's definitely something I tried to resolve, but there doesn't seem to be a solution beyond pinning every plugin version in the config and then manually adding each plugin's recursive dependencies to the Puppet configs, which seems like a lot of work to redo whenever we update plugin versions.
The Puppet files explicitly installed the dependencies. Those lines usually had a comment mentioning which other plugin required them. While that is certainly more effort, it avoids any manual work for deploying the machines. Once a better solution is available we should certainly change to that, to avoid having to list the dependencies explicitly.
cc50aca spikes out a script that reads a current Jenkins instance and dumps the currently installed plugins as an include-able Puppet module. Each plugin has a require block for its dependencies rather than a comment naming its dependents, since that's the format of the data from Jenkins, but we could easily invert that if we think it's preferable.
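For readers unfamiliar with the approach, a script along these lines could look roughly like the following. This is an illustrative sketch, not the contents of cc50aca: the `pluginManager/api/json?depth=1` endpoint and the `shortName`/`version`/`dependencies` fields come from Jenkins' plugin-manager JSON API, while the exact Puppet resource format (`jenkins::plugin` with a `require` array) is an assumption about the output.

```python
# Sketch: dump the plugins installed on a live Jenkins master as a Puppet
# manifest where each plugin resource requires its (non-optional) dependencies.
import json
from urllib.request import urlopen


def render_puppet(plugins):
    """Render a list of plugin dicts as jenkins::plugin resources."""
    lines = []
    for p in sorted(plugins, key=lambda p: p["shortName"]):
        deps = [d["shortName"] for d in p.get("dependencies", [])
                if not d.get("optional")]
        lines.append("jenkins::plugin { '%s':" % p["shortName"])
        lines.append("  version => '%s'," % p["version"])
        if deps:
            requires = ", ".join("Jenkins::Plugin['%s']" % d for d in deps)
            lines.append("  require => [%s]," % requires)
        lines.append("}")
    return "\n".join(lines)


def dump_installed_plugins(jenkins_url):
    """Fetch the installed-plugin list from a running Jenkins instance."""
    with urlopen(jenkins_url + "/pluginManager/api/json?depth=1") as resp:
        data = json.load(resp)
    return render_puppet(data["plugins"])
```

Inverting the relationship (comments naming dependents instead of require blocks for dependencies) would only change `render_puppet`, which is why either format is cheap to produce from the same data.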
Thanks for all this good work. I'm looking to set up a small (4-5 machines) build farm, do you suggest trying to use this branch or sticking to trusty for now? |
If you're familiar with the ROS buildfarm's design (I wasn't when I started this project 😁), plan to keep your farm around for a while, and don't mind a few speedbumps early in the setup process, then the Xenial branch is probably suitable. I think the prime advantage of using Xenial is Java 8 for the Jenkins master. Since the packages themselves are built in containers, the host OS has less impact on available dependencies.

If you're going to be making modifications to the Puppet code, the refactor here is intended to make it much more grok-able and customizable, but there's still work to be done documenting recommended ways to hook into and layer this code into your own Puppet scripts. I particularly need to document the hiera keys.

The Trusty code has been configuring agent machines consistently for the current farm, though the plugin versions and dependencies are out of sync with what's currently deployed on the farm (which can be resolved by updating plugins in the Jenkins UI). If you want to take advantage of that proofing, or your farm needs to match the currently deployed ROS buildfarm as faithfully as possible, then the code in master is the place to start, with the caveat that we are in early testing for a Xenial-based buildfarm and want to migrate the current farm in the coming months.

Be sure to browse some of the recent issues on this repository, particularly #149, as there are some problems with both the current master and the current xenial branch.
c14135d will need to be incorporated here. |
I didn't see any Java arguments in the patch but I might have just missed them. What |
For now, the Jenkins Java arguments are still part of the hiera config rather than the Puppet code. This is how they were maintained previously, and I was changing so much else that I wanted to leave anything I didn't have an explicit reason to change as it was. I still need to take time to read through the entire referenced blog post. It looks like a lot of the flags they're using are tailored toward logging and diagnostics. Is there going to be a disk utilization (storage and throughput) consideration if we just enable all of those? Are there tuning flags we should adopt directly, without the logging, for now?
That's why I didn't see them in the diff 😉 Thanks for the pointer.
I wouldn't blindly enable all the arguments mentioned in the blog post. Since we didn't have Java 8 I wasn't able to try those but the config file on the existing Jenkins machine has the options I thought would be most valuable in a commented line:
Let's move the JVM tuning discussion to the config repo in order to keep the discussion closer to the code.
I have changed the target branch of this pull request to point to a new |
Xenial uses systemd so we'll need to change how the explicit upstart code is used.
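To make the upstart-to-systemd change concrete, a replacement unit for an agent process would be a file of roughly this shape. This is purely illustrative; the unit name, user, paths, and master URL are all assumptions, not what this repository ships.

```ini
# Hypothetical /etc/systemd/system/jenkins-agent.service
[Unit]
Description=Jenkins build agent
After=network.target

[Service]
User=jenkins-agent
ExecStart=/usr/bin/java -jar /home/jenkins-agent/slave.jar \
    -jnlpUrl http://master.example.com/computer/agent1/slave-agent.jnlp
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Unlike an upstart job in /etc/init, this would be enabled with `systemctl enable jenkins-agent` and managed through `systemctl`, which is why the explicit upstart code can't be reused as-is on Xenial.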
* Remove plugin declaration (already declared?)
* Specify full Python version.
* Try alternate pip installation allocation.
* Explicitly declare curl package.
I put this in just to test that it resolved the issue, now I want to justify its presence.
The 2.0 Docker Python library removes untagged ('<none>:<none>') images from the RepoTags image attribute, so this should test for an empty list rather than an explicit none tag. https://github.com/docker/docker-py/blob/2.4.2/docker/models/images.py#L41-L44 The remove_image function was also moved and should be accessed through the client's `images` attribute.
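A compatibility check covering both library generations can be kept in one small helper. This is a sketch of the idea described above, not the script's actual code; the function name is made up, and it also tolerates a missing (None) RepoTags value, which is an extra defensive assumption.

```python
# Sketch: detect an untagged image across docker-py versions.
# docker-py < 2.0 reported untagged images as RepoTags == ['<none>:<none>'];
# the 2.x "docker" SDK strips those entries, leaving an empty list.
def is_untagged(repo_tags):
    """Return True if an image's RepoTags value marks it as untagged."""
    return not repo_tags or repo_tags == ['<none>:<none>']
```

With the 2.x SDK, removal itself then goes through the client's images collection, e.g. `client.images.remove(image_id)` instead of the old `client.remove_image(...)`.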
docker-py appears to be an abandoned copy of the docker PyPI package, stuck at 1.10.6. An issue with the script on Xenial prompted us to update the dependency.
I think we'll want to start using the Jenkins CLI for managed configuration of plugins. It's the recommended config management solution for multiple plugins and if I can figure out how to use the Jenkins puppet module to run Groovy script files rather than inline scripts we'll be even better off. This change resolves ros-infrastructure#148
There's more documentation QA to be done but I think I'll do that as part of a broader docs sweep after adding a solo-machine buildfarm.
At some point this should be cleaned up, factored out, and released as a separate puppet module. Can't be bothered now though.
Refactor for deploying a Xenial buildfarm.
This is a heavy refactor of the buildfarm deployment puppet.
The purpose of the refactor is to
Outstanding work
Allow a single node to have multiple roles (make repo, master and slave non-colliding #42) (deferred)
Issues resolved by this PR
Outside the scope of this PR is completing the move to Puppet 4. Some of the modules we're using don't yet explicitly support Puppet 4, and Xenial ships with the recently EOL'd 3.8.5, which is what this currently targets. We are using the "future" parser, which is shared by Puppet 3 and 4. I've also incorporated a puppet-lint config, and I think it'd be a nice idea to get puppet-lint and
puppet parser validate
running through Travis CI or something.

This PR is quite large and moves a lot of stuff around. When it comes time to review I'll point out a few highlights and areas I think need the most attention.