Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix missing AWS region in S3 operations #9546

Merged
merged 42 commits into from
Mar 27, 2024

Conversation

radeusgd
Copy link
Member

@radeusgd radeusgd commented Mar 26, 2024

Pull Request Description

  • Closes Missing region in AWS setup (e.g. datalink) #9284
  • Now our tests run without the default AWS_ config, thus ensuring that the tested setups work in a clean environment.
  • After all, more complicated logic was needed for buckets access - apparently the AWS SDK only allows for some operations on buckets to happen if the client is connected to the correct region. Thus detection of bucket regions had to be implemented.
  • Added AWS_Region widget based on autoscoping.
  • Fixed AWS_Credential.profile_names crashing if no AWS config was found. Now it returns no profiles if not found. Added a regression test.

Important Notes

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

  • The documentation has been updated, if necessary.
  • Screenshots/screencasts have been attached, if there are any visual changes. For interactive or animated visual changes, a screencast is preferred.
  • All code follows the
    Scala,
    Java,
    and
    Rust
    style guides. In case you are using a language not listed above, follow the Rust style guide.
  • All code has been tested:
    • Unit tests have been written where possible.
    • If GUI codebase was changed, the GUI was tested when built using ./run ide build.

@radeusgd radeusgd added the CI: No changelog needed Do not require a changelog entry for this PR. label Mar 26, 2024
@radeusgd radeusgd self-assigned this Mar 26, 2024
Comment on lines 44 to 99
if (locationConstraint == BucketLocationConstraint.UNKNOWN_TO_SDK_VERSION) {
Logger.getLogger("S3-BucketLocator").fine("AWS returned an unknown location constraint.");
return null;
}

var inferredRegion = Region.of(locationConstraint.toString());
boolean isKnown = Region.regions().contains(inferredRegion);
if (isKnown) {
return inferredRegion;
} else {
Logger.getLogger("S3-BucketLocator").fine("AWS returned a location constraint that cannot be mapped to a known region: " + locationConstraint);
return null;
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that sometimes getBucketLocation may not be able to find out the true bucket location.

In such case, we will fall back to the region set in credentials which seems our best bet - as the user can always amend that region to be right. If the region is wrong, the user will be getting failures like: The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint. (message coming from AWS SDK)

Copy link
Member Author

@radeusgd radeusgd Mar 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The AWS docs seem to suggest using HeadBucket instead of GetBucketLocation but I could not get it to work, because ironically HeadBucket fails with the above error:

((S3_Error.Error 'The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.' 'PermanentRedirect'))

even on the same configuration that allows GetBucketLocation to work...

The patch was
Index: std-bits/aws/src/main/java/org/enso/aws/BucketLocator.java
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/std-bits/aws/src/main/java/org/enso/aws/BucketLocator.java b/std-bits/aws/src/main/java/org/enso/aws/BucketLocator.java
--- a/std-bits/aws/src/main/java/org/enso/aws/BucketLocator.java	(revision fc80d4d118f4e2797a281b95ba96ffae68afe26a)
+++ b/std-bits/aws/src/main/java/org/enso/aws/BucketLocator.java	(date 1711457551132)
@@ -1,9 +1,10 @@
 package org.enso.aws;
 
 import software.amazon.awssdk.regions.Region;
-import software.amazon.awssdk.services.s3.model.BucketLocationConstraint;
+import software.amazon.awssdk.services.s3.model.HeadBucketResponse;
 
 import java.util.HashMap;
+import java.util.Optional;
 import java.util.logging.Logger;
 
 /**
@@ -35,25 +36,9 @@
     DefaultRegionProvider defaultRegionProvider = new DefaultRegionProvider(null, Region.US_EAST_1);
     var clientBuilder = new ClientBuilder(associatedCredential, defaultRegionProvider.getRegion());
     try (var client = clientBuilder.buildGlobalS3Client()) {
-      BucketLocationConstraint locationConstraint = client.getBucketLocation(builder -> builder.bucket(bucketName)).locationConstraint();
-      if (locationConstraint == null) {
-        // Weird edge case: documentation says that buckets in region us-east-1 return null
-        return Region.US_EAST_1;
-      }
-
-      if (locationConstraint == BucketLocationConstraint.UNKNOWN_TO_SDK_VERSION) {
-        Logger.getLogger("S3-BucketLocator").fine("AWS returned an unknown location constraint.");
-        return null;
-      }
-
-      var inferredRegion = Region.of(locationConstraint.toString());
-      boolean isKnown = Region.regions().contains(inferredRegion);
-      if (isKnown) {
-        return inferredRegion;
-      } else {
-        Logger.getLogger("S3-BucketLocator").fine("AWS returned a location constraint that cannot be mapped to a known region: " + locationConstraint);
-        return null;
-      }
+      HeadBucketResponse response = client.headBucket(builder -> builder.bucket(bucketName));
+      Optional<String> regionId = response.sdkHttpResponse().firstMatchingHeader("x-amz-bucket-region");
+      return regionId.map(Region::of).orElse(null);
     } catch (Exception e) {
       Logger.getLogger("S3-BucketLocator").fine("Failed to locate bucket " + bucketName + ": " + e.getMessage());
       return null;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After all I figured out we can extract the region from the S3Exception that happens when HeadBucket fails, so I switched to this strategy.

I thought to keep the legacy strategy as a fallback just in case - if the headers ever disappear from the exception for example.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OTOH I'm wondering if that's not introducing unnecessary maintenance burden.

What do you'all think? Shall I keep the fallback, or shall we remove it, knowing that we can add it if we ever run into issues with the current way of fetching regions?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think any harm leaving the fallback - my guess is it will never be fired but not a big burden and nice to have the fallback if response format changes.

Copy link
Contributor

@somebody1234 somebody1234 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably remove the index.html symlink that points to a nonexistent file (causing issues when formatting)

@radeusgd radeusgd force-pushed the wip/radeusgd/9284-fix-missing-aws-region branch from fc80d4d to e139f12 Compare March 26, 2024 14:06
@radeusgd radeusgd force-pushed the wip/radeusgd/9284-fix-missing-aws-region branch from 7799140 to 08f6da2 Compare March 26, 2024 15:24
@radeusgd
Copy link
Member Author

After troubles with the AWS_Region widget, I went for using autoscoping and it seems to work :)

image

Copy link
Member

@jdunkerley jdunkerley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good - couple of minor suggestions.

@radeusgd radeusgd added the CI: Ready to merge This PR is eligible for automatic merge label Mar 26, 2024
# Conflicts:
#	app/ide-desktop/lib/dashboard/src/index.html
@mergify mergify bot merged commit f2d6079 into develop Mar 27, 2024
40 checks passed
@mergify mergify bot deleted the wip/radeusgd/9284-fix-missing-aws-region branch March 27, 2024 12:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI: No changelog needed Do not require a changelog entry for this PR. CI: Ready to merge This PR is eligible for automatic merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Missing region in AWS setup (e.g. datalink)
5 participants