[feature] provide consistent results for aws ec2 import-image and provide separate errors to allow triage #808

Open
PaulCharlton opened this issue Aug 19, 2024 · 11 comments
Labels: ec2, feature-request (New feature or request), service-api (This issue pertains to the AWS API)

@PaulCharlton
PaulCharlton commented Aug 19, 2024

Describe the bug

this works:

aws ec2 import-image --region us-east-1 --role-name VMImportRole002 --disk-containers file://.disks.json --dry-run

An error occurred (DryRunOperation) when calling the ImportImage operation: Request would have succeeded, but DryRun flag is set

and then, this fails:

aws ec2 import-image --region us-east-1 --role-name VMImportRole002 --disk-containers file://.disks.json

An error occurred (InvalidParameter) when calling the ImportImage operation: The service role VMImportRole002 provided does not exist or does not have sufficient permissions

Failure after dry run success is inconsistent with existing documentation.

context:

aws ec2 import-image --version
aws-cli/2.17.25 Python/3.11.9 Darwin/23.6.0 source/arm64

Expected Behavior

--dry-run should fail if the subsequent call without --dry-run is going to fail.

The error response is too ambiguous: An error occurred (InvalidParameter) when calling the ImportImage operation: The service role VMImportRole002 provided does not exist or does not have sufficient permissions

The error response should indicate the precise nature of the error, such as:

  1. cannot use trust policy due to mismatched principal
  2. missing permission GetObject for S3 bucket access
  3. specified role 'vmimport' does not exist
  4. vmimport role does not have appropriate trust relationship with user running the command
  5. STS is not enabled for your account in the target region
  6. "sufficient permissions" is an inadequate response; a suitable one would be "permission s3:GetObject is required"
  7. ...

In reviewing reports of aws ec2 import-image errors on various Internet forums, there are literally a dozen root causes which can produce the single error above.
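
Until the service reports these causes separately, some of them can be probed one at a time from the client side. The following is a rough sketch, not an official procedure. The names import_image_preflight and trust_policy_names_vmie are hypothetical, the trust-policy check is a plain text match rather than a real policy evaluation, and head-object tests the calling user's access to the object, not the import role's.

```shell
# Returns 0 when the trust policy JSON on stdin names the VM Import/Export
# service principal. This is a text match only, not a policy evaluation.
trust_policy_names_vmie() {
  grep -q '"vmie\.amazonaws\.com"'
}

# Hypothetical preflight: run each check on its own so that a failure
# points at one cause instead of a single combined error.
# Arguments: role name, bucket, object key, region.
import_image_preflight() {
  local role="${1}" bucket="${2}" key="${3}" region="${4}"

  # Is STS reachable for the caller in the target region?
  aws sts get-caller-identity --region "${region}" >/dev/null \
    || { echo "FAIL: STS call failed in ${region}"; return 1; }

  # Does the role exist (and can the caller read it)?
  aws iam get-role --role-name "${role}" >/dev/null \
    || { echo "FAIL: role ${role} not found or not readable"; return 1; }

  # Does the trust policy mention the vmie service principal?
  aws iam get-role --role-name "${role}" \
      --query 'Role.AssumeRolePolicyDocument' --output json \
    | trust_policy_names_vmie \
    || { echo "FAIL: trust policy does not name vmie.amazonaws.com"; return 1; }

  # Is the uploaded object visible to the calling user?
  aws s3api head-object --bucket "${bucket}" --key "${key}" >/dev/null \
    || { echo "FAIL: cannot read s3://${bucket}/${key} as the calling user"; return 1; }

  echo "OK: preflight checks passed"
}
```

With the values from this report it could be called as import_image_preflight VMImportRole002 rawimages002 x86_64/mbr_volume.vmdk us-east-1. Passing all checks does not guarantee that import-image succeeds; it only rules out the causes probed here.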

Current Behavior

aws ec2 import-image should work if --dry-run succeeds [this is what the documentation states]

aws ec2 import-image help shows:

       --dry-run | --no-dry-run (boolean)
          Checks whether you have the required permissions for the action,
          without actually making the request, and provides an error response.
          If you have the required permissions, the error response is
          DryRunOperation . Otherwise, it is UnauthorizedOperation .
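
Read literally, the help text above only promises a check of the caller's permission for the action, so a DryRunOperation result may say nothing about the role or the S3 object. Below is a minimal sketch for telling the outcomes apart in a script. It assumes the CLI keeps the "An error occurred (CODE) when calling ..." wording quoted in this issue, and the function names aws_error_code and dry_run_verdict are hypothetical.

```shell
# Extract the error code from an AWS CLI error line such as
# "An error occurred (InvalidParameter) when calling the ImportImage operation: ..."
# Prints nothing when no such line is present.
aws_error_code() {
  printf '%s\n' "${1}" \
    | sed -n 's/^.*An error occurred (\([A-Za-z0-9_.]*\)) when calling.*$/\1/p' \
    | head -n 1
}

# Map the error text of a --dry-run call to a short verdict.
dry_run_verdict() {
  case "$(aws_error_code "${1}")" in
    DryRunOperation)       echo "permitted" ;;
    UnauthorizedOperation) echo "denied" ;;
    "")                    echo "no-error-code" ;;
    *)                     echo "other-failure" ;;
  esac
}

# Possible usage (not executed here):
#   err="$(aws ec2 import-image ... --dry-run 2>&1 >/dev/null)"
#   dry_run_verdict "${err}"
```

A "permitted" verdict here only reflects the dry-run response; as this issue shows, the real call can still fail with InvalidParameter.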

Reproduction Steps

declare -rx S3_REGION="${S3_REGION:-us-east-1}"
declare -rx S3_ACCOUNT_ID="${S3_ACCOUNT_ID:-}"
declare -rx S3_BUCKET_NAME="${S3_BUCKET_NAME:-rawimages002}"
declare -rx S3_VM_IMPORT_POLICY_NAME="${S3_VM_IMPORT_POLICY_NAME:-VMImportPolicy002}"
declare -rx S3_VM_IMPORT_ROLE="${S3_VM_IMPORT_ROLE:-VMImportRole002}"
  provider_arch='x86_64'
  boot_type='bios'
  export_name='mbr_volume.vmdk'

  aws_create_vm_import_role
  aws_put_vm_import_role_policy

  # bucket already exists
  aws s3 cp \
    --region "${S3_REGION}" \
    ".results/${provider_arch}/${export_name}" \
    "s3://${S3_BUCKET_NAME}/${provider_arch}/${export_name}"

  aws_ec2_import_image "${provider_arch}" "${boot_type}" "${export_name}"

# Start the import, passing the disk-container JSON via process substitution.
aws_ec2_import_image() {
  local -r provider_arch="${1}"
  local -r boot_type="${2}"
  local -r export_name="${3}"
  aws ec2 import-image \
    --region "${S3_REGION}" \
    --role-name "${S3_VM_IMPORT_ROLE}" \
    --disk-containers "file://"<(aws_disk_containers "${provider_arch}" "${boot_type}" "${export_name}")
}

aws_disk_containers() {
  local -r provider_arch="${1}"
  local -r boot_type="${2}"
  local -r export_name="${3}"
  cat <<CONTAINER_JSON
[ 
  { 
    "Description": "Image for ${provider_arch} with ${boot_type}",
    "Format": "vmdk",
    "UserBucket": {
      "S3Bucket": "${S3_BUCKET_NAME}",
      "S3Key": "${provider_arch}/${export_name}"
    }
  }
]
CONTAINER_JSON
}

# Create the import role with its trust policy if it does not already exist.
aws_create_vm_import_role() {
  local role_arn
  role_arn=$(aws iam get-role --role-name "${S3_VM_IMPORT_ROLE}" --query 'Role.Arn' --output text 2>/dev/null)
  if [ -z "${role_arn}" ]; then
    aws iam create-role \
      --region "${S3_REGION}" \
      --role-name "${S3_VM_IMPORT_ROLE}" \
      --assume-role-policy-document "$(aws_vm_import_role_trust_policy)"
  fi
}

aws_vm_import_role_trust_policy() {
cat <<TRUST_POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
       "Principal": {
          "Service": "vmie.amazonaws.com"
       },
       "Action": "sts:AssumeRole",
       "Condition": {
          "StringEquals":{
             "sts:Externalid": "${S3_VM_IMPORT_ROLE}"
          }
       }
    },
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    },
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::${S3_ACCOUNT_ID}:root"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
TRUST_POLICY
}

# Attach the inline permissions policy to the import role.
aws_put_vm_import_role_policy() {
  aws iam put-role-policy \
    --region "${S3_REGION}" \
    --role-name "${S3_VM_IMPORT_ROLE}" \
    --policy-name "${S3_VM_IMPORT_POLICY_NAME}" \
    --policy-document "file://"<(aws_vm_import_role_policy)
}

# Emit the permissions policy document for the import role.
aws_vm_import_role_policy() {
  cat <<ROLE_POLICY
{   
   "Version":"2012-10-17",
   "Statement":[
      {
         "Effect": "Allow",
         "Action": [
            "s3:GetBucketLocation",
            "s3:GetObject",
            "s3:ListBucket"
         ],
         "Resource": [
            "arn:aws:s3:::${S3_BUCKET_NAME}",
            "arn:aws:s3:::${S3_BUCKET_NAME}/*"
         ]
      },
      {
         "Effect": "Allow",
         "Action": [
            "s3:GetBucketLocation",
            "s3:GetObject",
            "s3:ListBucket",
            "s3:PutObject",
            "s3:GetBucketAcl"
         ],
         "Resource": [
            "arn:aws:s3:::export-bucket",
            "arn:aws:s3:::export-bucket/*"
         ]
      },
      {     
         "Effect": "Allow",
         "Action": [
            "ec2:ModifySnapshotAttribute",
            "ec2:CopySnapshot",
            "ec2:RegisterImage",
            "ec2:Describe*"
         ],
         "Resource": "*"
      },
      {
        "Effect": "Allow",
        "Action": [
          "kms:CreateGrant",
          "kms:Decrypt",
          "kms:DescribeKey",
          "kms:Encrypt",
          "kms:GenerateDataKey*",
          "kms:ReEncrypt*"
        ],
        "Resource": "*"
      },
      {
        "Effect": "Allow",
        "Action": [
          "license-manager:GetLicenseConfiguration",
          "license-manager:UpdateLicenseSpecificationsForResource",
          "license-manager:ListLicenseSpecificationsForResource"
        ],
        "Resource": "*"
      }
   ]
}
ROLE_POLICY
}

Possible Solution

A suitable message would be, for example, "permission s3:GetObject is required".

The error response should indicate the precise nature of the error, such as:

  1. cannot use trust policy due to mismatched principal
  2. missing permission GetObject for S3 bucket access
  3. specified role 'vmimport' does not exist
  4. vmimport role does not have appropriate trust relationship with user running the command
  5. STS is not enabled for your account in the target region
  6. "sufficient permissions" is an inadequate response; a suitable one would be "permission s3:GetObject is required"
  7. ...

In reviewing reports of aws ec2 import-image errors on various Internet forums, there are literally a dozen root causes which can produce the single error above.

PS: CloudTrail logs are also not showing the specific failed operation.

Additional Information/Context

aws ec2 import-image --version
aws-cli/2.17.25 Python/3.11.9 Darwin/23.6.0 source/arm64

CLI version used

aws-cli/2.17.25 Python/3.11.9 Darwin/23.6.0 source/arm64

Environment details (OS name and version, etc.)

Darwin 14.5

@PaulCharlton added the bug and needs-triage labels Aug 19, 2024
@tim-finnigan
tim-finnigan commented Aug 19, 2024

Thanks for reaching out. The issue you described is with the EC2 ImportImage API / EC2 error codes rather than with the AWS CLI directly. We can reach out to the EC2 team with the request to improve the error messages here (ref: P149339833). I'll transfer this to our cross-SDK repository for tracking, since the issue involves a service API that is used across AWS SDKs in addition to the CLI.

Also there is a related troubleshooting guide: https://docs.aws.amazon.com/vm-import/latest/userguide/vmimport-troubleshooting.html#import-image-errors

So there are several possible causes of that error, and the error message could potentially make that clearer.

@tim-finnigan transferred this issue from aws/aws-cli Aug 19, 2024
@tim-finnigan self-assigned this Aug 19, 2024
@tim-finnigan added the feature-request, service-api, and ec2 labels and removed the bug and needs-triage labels Aug 19, 2024
@PaulCharlton

Thanks @tim-finnigan. I already went through that troubleshooting guide in detail before I posted here. Something else is going on. One big clue is that the role itself has never been accessed, which means that the InvalidParameter error is being thrown before the import role is ever assumed.
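
One way to check whether a role has been used is the RoleLastUsed field returned by aws iam get-role. This is a small sketch, assuming the caller is allowed to read the role. The helper name role_last_used is hypothetical, and IAM reports last-used data only for a limited tracking period, so an empty value is a hint rather than proof.

```shell
# Print the timestamp at which the given role was last used.
# With --output text the CLI normally prints "None" when the field is empty.
role_last_used() {
  aws iam get-role --role-name "${1}" \
    --query 'Role.RoleLastUsed.LastUsedDate' --output text
}

# Possible usage (not executed here):
#   role_last_used VMImportRole002
```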

@PaulCharlton
PaulCharlton commented Aug 20, 2024

Status Update

  1. unresolved
    1. more specific errors from SDK call
    2. prove that the awscli json payload is correct
  2. resolved
    1. immediate problem of being unable to use import-image. In the past, the "vmimport" role was auto-provisioned and managed by AWS on first use of the API. This was deprecated in favor of the account owner creating a new role. Even more recently, in addition to the AWS service being granted "sts:AssumeRole" in the role's trust policy, the API caller must have the "iam:PassRole" permission, and the API caller MUST NOT be the account "root" user.

This knowledge regarding "iam:PassRole" does not appear in any online triage guide for "import-image" that I have found. I discovered it by granting the import USER "iam:*" as allowed actions, which made things work, then paring that grant down to find WHICH IAM action was responsible.
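
For anyone hitting the same wall, the grant described above can be sketched in the same heredoc style as the reproduction script. This is an illustration under assumptions, not a verified minimal policy. The function names aws_pass_role_policy and caller_can_pass_role are hypothetical, and the account ID and ARNs are supplied by whoever runs it. The second function uses aws iam simulate-principal-policy to ask IAM whether a given user would be allowed to pass the role.

```shell
# Emit an identity policy that lets the calling user pass one specific role.
# Arguments: account id, role name.
aws_pass_role_policy() {
  local account_id="${1}" role_name="${2}"
  cat <<PASS_ROLE_POLICY
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "iam:PassRole",
      "Resource": "arn:aws:iam::${account_id}:role/${role_name}"
    }
  ]
}
PASS_ROLE_POLICY
}

# Ask IAM whether a principal would be allowed to pass the role.
# Prints the simulated decision: allowed, explicitDeny, or implicitDeny.
# Arguments: ARN of the calling user, ARN of the import role.
caller_can_pass_role() {
  local caller_arn="${1}" role_arn="${2}"
  aws iam simulate-principal-policy \
    --policy-source-arn "${caller_arn}" \
    --action-names iam:PassRole \
    --resource-arns "${role_arn}" \
    --query 'EvaluationResults[0].EvalDecision' \
    --output text
}
```

Scoping the Resource to the single import role keeps the grant far narrower than the "iam:*" used during the experiment.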

===> better telemetry from server-side failures is still needed.

@PaulCharlton

to the extent that better telemetry would introduce a breaking change if the HTTP response body is altered, the new payload info could be returned via a new response header field.

@PaulCharlton changed the title from "[bug] aws ec2 import-image inconsistent results and too many error sources grouped into one error response" to "[feature] provide consistent results for aws ec2 import-image and provide separate errors to allow triage" Aug 20, 2024
@PaulCharlton

Still more useless and ambiguous telemetry. A message which essentially says "upload deleted, invalid image due to missing filesystem components" needs to say what it was expecting and what actually happened. Something like "/etc/fstab is missing", "root volume is missing", "no partition contains the root volume", or "unable to install grub updates" would be much more informative.

https://docs.aws.amazon.com/vm-import/latest/userguide/what-is-vmimport.html

@PaulCharlton

This one, "An error occurred (InvalidParameter) when calling the ImportImage operation: The service role VMImportRole002 provided does not exist or does not have sufficient permissions", is also flat-out wrong when the actual error is that the USER invoking the SDK API does not have the "iam:PassRole" action allowed.

@PaulCharlton

here's another useless message: ClientError: Unknown OS / Missing OS files.

ok, sure ... but which files are missing? Please.

@amberkushwaha

here's another useless messages clientform but which files are missing in it.

@amberkushwaha

aws ec2 import-image --region us-east-1 --role-name VMImportRole002 --disk-containers file://.disks.json --dry-run

An error occurred (InvalidParameter) when calling the ImportImage operation: The service role VMImportRole002 provided does not exist or does not have sufficient permissions

The context of the file is still in the middle of the conceptual behaviour and more often the file in it.code of conduct.contact the file remember the dialogue in the file concept of it.

Also its been in the contributions.

aws ec2 import-image --region us-east-1 --role-name VMImportRole002 --disk-containers file://.disks.json --dry-run

An error occurred (InvalidParameter) when calling the ImportImage operation: The service role VMImportRole002 provided does not exist or does not have sufficient permissions.paste drop

@amberkushwaha

Add a comment in the main box d=systems of the following file in the circuit.paste drop or click to add files is also code of conduct in it for the given time time period and issues in it were also docs and contact management cookies section circuits were prompted for the main portals.contributing guidelines security policy and code of conduct.manage cookies in the file for interuptions.

@PaulCharlton

@amberkushwaha I do not understand your word-salad -- other than cut/paste from some of the comments above, why mention a "code of conduct" or "circuits" or "main portals" ?
