feat(aws): Improving AWS EC2 instance types API integration and caching, feat(aws): Adding archi type to images API and caching #5609

pdk27 · 2021-12-21T21:52:40Z

Changes:

Improving AWS EC2 instance types API integration and caching:
- Replaced the AWS EC2 pricing docs integration with the AWS EC2 describe-instance-types API. The pricing docs integration predates the API. Instance types retrieved via the API are up-to-date and include other associated metadata.
- Added instance type information to the cache.
Adding architecture type to images API and caching

Clouddriver API response before-after:

Before:

After:

Use cases / reasoning for the changes:

Deck - filtering out incompatible instance types
Deck - display instance type info in instance type selector drop down and support filtering by metadata (PR to follow)
See draft PR and demo in Improvements to AMI and instance type validations, custom instance type selector deck#9793

Instructions to be included in release (preview) notes:

~~As part of upgrading to a Spinnaker version that includes these changes, repopulate caches by running hal deploy apply --flush-infrastructure-caches.~~
See feat(aws): Improving AWS EC2 instance types API integration and caching, feat(aws): Adding archi type to images API and caching #5609 (comment)

pdk27 · 2021-12-21T21:54:01Z

Reviewers, please also review the Instructions to be included in release (preview) notes in PR overview.

mattgogerly · 2021-12-21T23:24:03Z

Haven't had a chance to look at the code yet, but for the release note - not everybody deploys Spinnaker using Halyard. Are there instructions for other installation methods (i.e. what is that flag actually doing?)

pdk27 · 2021-12-23T17:20:50Z

Thanks for the callout @mattgogerly. That is where I need help :)
I believe hal deploy apply --flush-infrastructure-caches refreshes the caches. Someone familiar with it can add more details. I will post about this PR and ask for help in the Slack channel too.

feat(aws): Adding archi type to images API and caching spinnaker/spinnaker#5989

...rc/main/groovy/com/netflix/spinnaker/clouddriver/aws/provider/agent/ImageCachingAgent.groovy

mattgogerly · 2022-01-20T20:38:35Z

All I can find is that it wipes the Redis instance, but that would only work if you're using the built-in Redis. Why is it necessary to do that?

deverton · 2022-01-24T00:18:48Z

...ava/com/netflix/spinnaker/clouddriver/aws/provider/agent/AmazonInstanceTypeCachingAgent.java

+                    attributes.put("account", account.getName());
+                    attributes.put("region", region);
+                    attributes.put("name", i.getInstanceType());
+                    attributes.put("defaultVCpus", i.getVCpuInfo().getDefaultVCpus());


Can getVCpuInfo() or getMemoryInfo() ever return null? There's a null check for getInstanceStorageInfo() below so wanted to be sure these were different.

pdk27 · 2022-04-04

getVCpuInfo() and getMemoryInfo() can never be null. getInstanceStorageInfo() can return null for instance types that don't support instance storage, for example EBS-only types. Here are some examples - https://aws.amazon.com/ec2/instance-types/
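
For readers following this thread, here is a minimal sketch of the pattern being discussed: the sub-objects described as always present are read directly, while the optional storage info is guarded by a null check. The nested classes are hypothetical stand-ins for the AWS SDK result types, and the attribute key instanceStorageTotalSizeGb and all sample values are illustrative assumptions, not necessarily what the PR uses.

```java
import java.util.HashMap;
import java.util.Map;

class InstanceTypeAttributesSketch {
  // Hypothetical stand-ins for the AWS SDK result types; only the fields used here.
  static final class VCpuInfo {
    final Integer defaultVCpus;
    VCpuInfo(Integer defaultVCpus) { this.defaultVCpus = defaultVCpus; }
  }

  static final class StorageInfo {
    final Long totalSizeInGB;
    StorageInfo(Long totalSizeInGB) { this.totalSizeInGB = totalSizeInGB; }
  }

  static final class InstanceTypeInfo {
    final String instanceType;
    final VCpuInfo vCpuInfo;               // described in the thread as never null
    final StorageInfo instanceStorageInfo; // may be null, e.g. for EBS-only types

    InstanceTypeInfo(String instanceType, VCpuInfo vCpuInfo, StorageInfo instanceStorageInfo) {
      this.instanceType = instanceType;
      this.vCpuInfo = vCpuInfo;
      this.instanceStorageInfo = instanceStorageInfo;
    }
  }

  static Map<String, Object> toAttributes(String account, String region, InstanceTypeInfo i) {
    Map<String, Object> attributes = new HashMap<>();
    attributes.put("account", account);
    attributes.put("region", region);
    attributes.put("name", i.instanceType);
    attributes.put("defaultVCpus", i.vCpuInfo.defaultVCpus);
    // Optional sub-object: only add the attribute when the info is present.
    if (i.instanceStorageInfo != null) {
      attributes.put("instanceStorageTotalSizeGb", i.instanceStorageInfo.totalSizeInGB);
    }
    return attributes;
  }

  public static void main(String[] args) {
    // Illustrative values only.
    InstanceTypeInfo noStorage = new InstanceTypeInfo("example.type", new VCpuInfo(2), null);
    System.out.println(toAttributes("test-account", "us-east-1", noStorage));
  }
}
```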

deverton · 2022-01-24T00:23:47Z

...ava/com/netflix/spinnaker/clouddriver/aws/provider/agent/AmazonInstanceTypeCachingAgent.java

+                    }
+
+                    if (i.getNetworkInfo() != null) {
+                      attributes.put("ipv6Supported", i.getNetworkInfo().getIpv6Supported());


Would it be better to have this attribute not be optional? It's always a bit clunky consuming these values from the map when you have to do something like if (attributes.get("key") != null && attributes.get("key") == true) or equivalent.

pdk27 · 2022-04-04

Here, i is an InstanceTypeInfo result object returned by the AWS DescribeInstanceTypes API, and networkInfo is an optional field. Also, not all instance types support IPv6, hence the null check. Hope this clarifies things.
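
On the consuming side, one way to keep the optional attribute from being clunky is to compare against Boolean.TRUE, which treats a missing key, a null value, and false the same way. This is a minimal sketch, assuming the attributes are held in a Map<String, Object>; the helper name isIpv6Supported is hypothetical and not part of the PR.

```java
import java.util.HashMap;
import java.util.Map;

class OptionalAttributeSketch {
  // Returns true only when the attribute is present and equal to Boolean.TRUE.
  // A missing key or a null value yields false without an explicit null check.
  static boolean isIpv6Supported(Map<String, Object> attributes) {
    return Boolean.TRUE.equals(attributes.get("ipv6Supported"));
  }

  public static void main(String[] args) {
    Map<String, Object> attributes = new HashMap<>();
    System.out.println(isIpv6Supported(attributes)); // key absent

    attributes.put("ipv6Supported", Boolean.TRUE);
    System.out.println(isIpv6Supported(attributes)); // key present and true
  }
}
```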

link108 · 2022-03-09T18:03:45Z

Hi @pdk27, just wanted to follow up on this PR, I saw the clouddriver feature was merged (#5610) and want to ensure this gets in as well 👍

pdk27 · 2022-03-28T22:30:12Z

Thanks for the heads-up @link108. Will address the comments here soon.

pdk27 · 2022-04-04T23:37:21Z

@mattgogerly
It's needed to refresh the cache so that the features in this PR can be used.

mattgogerly · 2022-04-05T14:20:38Z

Does the cache not get updated by normal clouddriver caching? The equivalent to the hal command for SQL/non-Halyard would be to wipe the database, at least of AWS data, which isn't ideal operationally.

pdk27 · 2022-04-05T15:32:09Z

Gotcha! That doesn't sound ideal. As part of testing these changes, I couldn't see the new cache fields without flushing caches in my setup. Any ideas on how to confirm if flushing caches can be avoided? I will try a few things on my setup and update here.

pdk27 · 2022-04-05T18:19:57Z

I was able to verify in my setup that flushing caches is required in order to see the changes to the cached data. Maybe that is too extreme? Do you know if there is a softer option, like a refresh? I will reach out on Slack again to see if anyone has a solution - https://spinnakerteam.slack.com/archives/C091CCWRJ/p1649183109750109

Without running hal deploy apply --flush-infrastructure-caches, the images API response just didn't include the new architecture field and the response looked identical to one from before the change.
I think the question here is about cache refresh instructions in general (no matter what the change to the cache is or how the cache is set up), and it doesn't need to block this PR from getting merged. I can follow up and add a note to the release notes. Do you agree?

spinnaker/spinnaker#5989

mattgogerly · 2022-04-07T08:07:40Z

@pdk27

Hm, that seems weird (or a bug?) Tagging @german-muzquiz who might be more familiar with cache data schema updates

german-muzquiz · 2022-04-13T14:55:16Z

I'm not too familiar with the AWS provider, but as a general principle the cache in clouddriver should be able to automatically rebuild itself if the underlying datastore is deleted (sql or redis), which according to the comments is working fine.

Then what I'm hearing is that the cache for instance types only gets populated once and then never gets updated. Probably that was expected in the beginning since instance types is not something that changes regularly, but then I think ideally this PR needs to automatically refresh that cache. If there are no caching agents for instance types, maybe refresh it upon startup or use something like last_modified fields to know that it should be refreshed?
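
One way to read the last_modified suggestion, as a minimal sketch only: store a timestamp alongside the cached entry and refresh when it is missing or older than some threshold. The names needsRefresh and maxAge are hypothetical, and nothing here reflects how clouddriver or this PR actually handles refreshes.

```java
import java.time.Duration;
import java.time.Instant;

class CacheStalenessSketch {
  // Decide whether a cached entry should be refreshed.
  // lastModified == null is treated as "never populated".
  static boolean needsRefresh(Instant lastModified, Instant now, Duration maxAge) {
    if (lastModified == null) {
      return true;
    }
    return Duration.between(lastModified, now).compareTo(maxAge) > 0;
  }

  public static void main(String[] args) {
    Instant now = Instant.now();
    Duration maxAge = Duration.ofHours(24); // illustrative threshold
    System.out.println(needsRefresh(null, now, maxAge));
    System.out.println(needsRefresh(now.minus(Duration.ofHours(1)), now, maxAge));
    System.out.println(needsRefresh(now.minus(Duration.ofHours(48)), now, maxAge));
  }
}
```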

pdk27 · 2022-04-25T21:48:25Z

Thanks for the pointer @german-muzquiz. I will dig into this further and get back to you.

pdk27 · 2022-04-27T20:23:46Z

@german-muzquiz
To give you some brief context, there are two API-related changes in this PR: the instance types API and the images API.

The images API has two kinds of queries:
(1) find by name, like http://localhost:8084/images/find?q=hello*
(2) find by ID, like http://localhost:8084/images/find?q=ami-23751835
This PR adds a new attribute, architecture, to both images APIs. I can see the new field in the API response for (2) without flushing the caches, but not in the response for (1), which comes from namedImages, built in ImageCachingAgent. As @mattgogerly pointed out, flushing caches is not ideal for some cache setups, because for SQL/non-Halyard installations it means wiping the database.

My observations:

Flushing infrastructure caches via hal deploy apply --service-names clouddriver --flush-infrastructure-caches is the only way I could get the public images in the caches to reload with the new architecture field. Re-deploying clouddriver didn't help.

Some questions we need help with:

Is this a problem in all types of cache setups? (I could verify the behavior only for Halyard + Redis cache)
Are there other ways to reload the images cache in other setups, without wiping the whole cache?
How are new attributes added to cached data in general?
With the given findings, how do we proceed?

german-muzquiz · 2022-05-18T20:06:31Z

@pdk27 I don't see anything wrong in the code that would cause the architecture field to be missing from http://localhost:8084/images/find?q=hello without flushing caches. The ImageCachingAgent should be rebuilding the cache each time for both named images and AMI images.

My suggestion is to investigate as follows:

1. Start with an empty Redis.
2. Run clouddriver without your changes, so that images are initially cached without the architecture field.
3. Verify in Redis that images don't have the architecture field. This can be done through redis-cli with a command like this:

   mget "com.netflix.spinnaker.clouddriver.aws.provider.AwsProvider:namedImages:attributes:aws:namedImages:{account}:{image name}"

   Replace {account} and {image name} with their respective values.
4. Verify that the request to clouddriver http://localhost:7002/aws/images/find?q={image name} doesn't include the architecture field.
5. Stop clouddriver and then run it with your changes, without flushing the caches.
6. Check Redis again; this time the architecture field should have been stored.
7. Check the response of the clouddriver REST API again; this time the architecture field should be included.

The best approach is not to have to delete the cache (Redis or SQL) for this change to work, since the images cache is rebuilt every 30 seconds anyway.
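
To make the placeholders in the steps above concrete, here is a small sketch that fills in {account} and {image name} for the Redis key and the clouddriver URL quoted in this comment. The class and method names are hypothetical, the sample account and image name are made up, and the key layout is taken as-is from the comment rather than verified against clouddriver.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class NamedImageLookupSketch {
  // Key prefix as quoted in the comment above.
  static final String KEY_PREFIX =
      "com.netflix.spinnaker.clouddriver.aws.provider.AwsProvider"
          + ":namedImages:attributes:aws:namedImages:";

  // Redis key to pass to mget for a given account and image name.
  static String attributesKey(String account, String imageName) {
    return KEY_PREFIX + account + ":" + imageName;
  }

  // Clouddriver find-images URL, assuming a local clouddriver on port 7002.
  static String findUrl(String imageName) {
    return "http://localhost:7002/aws/images/find?q="
        + URLEncoder.encode(imageName, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    // Hypothetical sample values.
    System.out.println("mget \"" + attributesKey("my-account", "hello-image") + "\"");
    System.out.println(findUrl("hello-image"));
  }
}
```

Running the printed mget command before and after deploying the change, without flushing in between, is the comparison the steps describe.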

pdk27 · 2022-08-22T19:50:39Z

Apologies for the delay in following up on this.

@mattgogerly @dbyron-sf I tried the instructions provided by @german-muzquiz and it worked, i.e. I didn't need to flush infrastructure caches to see the architecture type in the response.

There is no other pending action from my side. Can you please help me identify the right reviewers for this PR?

...ava/com/netflix/spinnaker/clouddriver/aws/provider/agent/AmazonInstanceTypeCachingAgent.java

pdk27 · 2022-08-24T20:02:03Z

Here is a demo of the testing done:

Spinnaker-imagesApi-demo-reduced.mov

pdk27 · 2022-08-24T20:49:19Z

@dbyron-sf ✅ Updated release preview PR.

pdk27 requested review from ajordens, aravindmd and jeyrschabu as code owners December 21, 2021 21:52

pdk27 requested a review from cfieber December 21, 2021 22:02

pdk27 requested a review from mattgogerly December 23, 2021 17:21

pdk27 mentioned this pull request Dec 23, 2021

Improvements to AMI and instance type validations, custom instance type selector spinnaker/deck#9793

Merged

feat(aws): Improving AWS EC2 instance types API integration and caching

bfa9ce6

feat(aws): Adding archi type to images API and caching spinnaker/spinnaker#5989

pdk27 force-pushed the aws-ec2-instance-types-integration branch from 7a4cbbb to bfa9ce6 Compare January 10, 2022 17:40

pdk27 requested review from dbyron-sf and removed request for ajordens and aravindmd January 11, 2022 16:41

dbyron-sf reviewed Jan 20, 2022

...rc/main/groovy/com/netflix/spinnaker/clouddriver/aws/provider/agent/ImageCachingAgent.groovy

deverton reviewed Jan 24, 2022

feat(aws): Adding archi type to images API(find by image ID)

1d3b82e

spinnaker/spinnaker#5989

pdk27 force-pushed the aws-ec2-instance-types-integration branch from 7073938 to 1d3b82e Compare April 5, 2022 18:50

pdk27 removed request for jeyrschabu and cfieber August 22, 2022 19:51

dbyron-sf reviewed Aug 22, 2022

...ava/com/netflix/spinnaker/clouddriver/aws/provider/agent/AmazonInstanceTypeCachingAgent.java (outdated)

nemesisOsorio reviewed Aug 24, 2022

...ava/com/netflix/spinnaker/clouddriver/aws/provider/agent/AmazonInstanceTypeCachingAgent.java (outdated)

PR feedback

dedb325

pdk27 mentioned this pull request Aug 24, 2022

Update next release preview spinnaker/spinnaker.io#230

Merged

dbyron-sf approved these changes Aug 24, 2022

dbyron-sf added the ready to merge Approved and ready for a merge label Aug 24, 2022

mergify bot added the auto merged Merged automatically by a bot label Aug 24, 2022

Merge branch 'master' into aws-ec2-instance-types-integration

990d451

mergify bot merged commit a9b2de9 into spinnaker:master Aug 24, 2022

spinnakerbot added the target-release/1.29 label Aug 24, 2022

nemesisOsorio mentioned this pull request Nov 11, 2022

REQUEST: New Approver status for nemesisOsorio spinnaker/governance#318

Open

6 tasks
