Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix http corrupted download #5275

Merged
merged 5 commits into from
Sep 3, 2024
Merged

Fix http corrupted download #5275

merged 5 commits into from
Sep 3, 2024

Conversation

pditommaso
Copy link
Member

This PR tries to address the issue with corrupted download reported by #5214.

The overall approach consist on using an input stream filter that checks that the number of bytes read matches the expected one.

However, I think this kind of check is already implemented by the http client, and therefore still persists.

Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
@pditommaso pditommaso marked this pull request as draft September 2, 2024 15:40
Copy link

netlify bot commented Sep 2, 2024

Deploy Preview for nextflow-docs-staging ready!

Name Link
🔨 Latest commit 08d00e2
🔍 Latest deploy log https://app.netlify.com/sites/nextflow-docs-staging/deploys/66d6bd60796acb0008e2837e
😎 Deploy Preview https://deploy-preview-5275--nextflow-docs-staging.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
@pditommaso
Copy link
Member Author

Actually using the python server included in #5214 it looks working properly.

@scwatts would you be able to test this branch will real data?

@pditommaso pditommaso marked this pull request as ready for review September 2, 2024 16:37
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
@scwatts
Copy link

scwatts commented Sep 3, 2024

@pditommaso I've also confirmed this branch catches corrupt HTTP downloads as served by the toy example I gave. Testing this is in a real-world situation is tough since it relies on the intermittent failure of the Cloudflare R2 server.

I have oncoanalyser running in a loop configured only to stage and decompress the large VIRUSBreakend database with both Nextflow compiled from this branch and the v24.04.4 release. So far I have only captured one instance of such a server failure, which was in a run with this branch.

Your changes look to correct the behaviour: oncoanalyser.nextflow_branch.stdstrm.failure_captured.txt.

I'll post here to confirm the negative case under the same conditions where I am able observe that as well.

@pditommaso
Copy link
Member Author

Excellent!

@pditommaso pditommaso merged commit bf0cd32 into master Sep 3, 2024
5 checks passed
@pditommaso pditommaso deleted the fix-http-corrupted-download branch September 3, 2024 14:50
@scwatts
Copy link

scwatts commented Sep 3, 2024

After leaving the oncoanalyser loops to run overnight I've now replicated six fatal failures with v24.04.4 and an additional three successfully handled failures using this branch.

Thank you for this fix @pditommaso!

@pditommaso
Copy link
Member Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Foreign file staging can allow pipelines to proceed with corrupt files
2 participants