Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BCDA-8301: Fix large object S3 trigger + update logging #984

Merged
merged 14 commits into from
Aug 13, 2024
Merged

Conversation

kyeah
Copy link
Contributor

@kyeah kyeah commented Aug 7, 2024

🎫 Ticket

https://jira.cms.gov/browse/BCDA-8301

🛠 Changes

  • Look for ObjectCreated:CompleteMultipartUpload events
  • Remove non-JSON logging statements
  • Add additional log fields (i.e. import_filename and import_count) to easily find logs for specific import runs

ℹ️ Context

Noticed that most files received yesterday were quietly skipped. After investigation, when a large file is uploaded, it triggers the multipart upload event instead of the ObjectCreated:Put event.

This blog post indicates that there are several different events that can trigger an ObjectCreated:* event: https://aws.amazon.com/blogs/aws/s3-event-notification/

In addition, the error alert was triggered due to non-JSON logs being picked up, and several "warning-only" logs being logged at the ERROR level. This updates that to ensure only real errors are logged at the ERROR level.

🧪 Validation

  • Integration testing
  • I was able to deploy this to prod and import one of the large files (triggered an event by re-uploading the file.)

@kyeah kyeah marked this pull request as ready for review August 7, 2024 20:18
@kyeah kyeah changed the title (CCLF Import) Fix large object S3 trigger + update logging BCDA-8301: Fix large object S3 trigger + update logging Aug 9, 2024
@@ -308,7 +286,7 @@ func (importer CclfImporter) ImportCCLFDirectory(filePath string) (success, fail
}

if len(cclfMap) == 0 {
log.API.Info("Failed to find any CCLF files in directory")
importer.Logger.Info("Failed to find any CCLF files in directory")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit on the existing log message: should we update the language here to omit the word failed, since a lack of CCLF files doesn't necessarily indicate that something has gone wrong?

Possibly something like this: No CCLF files in directory

@@ -360,14 +335,12 @@ func (importer CclfImporter) ImportCCLFDirectory(filePath string) (success, fail
defer c()
return importer.FileProcessor.CleanUpCCLF(ctx, cclfMap)
}(); err != nil {
fmt.Println(err.Error())
log.API.Error(err)
importer.Logger.Error(err)
}

if failure > 0 {
err = errors.New("one or more files failed to import correctly")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another logging suggestion/non-blocking issue: would it be helpful to include the # of files that failed to import vs a generic message?

@@ -58,11 +59,12 @@ func cclfImportHandler(ctx context.Context, sqsEvent events.SQSEvent) (string, e
// Send the entire filepath into the CCLF Importer so we are only
// importing the one file that was sent in the trigger.
filepath := fmt.Sprintf("%s/%s", e.S3.Bucket.Name, e.S3.Object.Key)
logger.Infof("Reading %s event for file %s", e.EventName, filepath)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

like the addition of logging here to get more visibility!

@laurenkrugen-navapbc
Copy link
Contributor

LGTM! Were there any files that were too large for ingestion that were cleaned up by BFD that we may need to retrieve to try and process again?

@kyeah
Copy link
Contributor Author

kyeah commented Aug 13, 2024

LGTM! Were there any files that were too large for ingestion that were cleaned up by BFD that we may need to retrieve to try and process again?

Good question! The timing is such that it's unlikely there were files cleaned up before I noticed, but I can double-check since we're still waiting for A* model entity files this month (last import was July 8th looking at the prod DB).

@kyeah kyeah merged commit ceafe63 into main Aug 13, 2024
11 checks passed
@kyeah kyeah deleted the kev/log-eventname branch August 13, 2024 21:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants