
Test database persistency on argocd #2807

Draft · wants to merge 11 commits into main
Conversation

@anna-parker (Contributor) commented Sep 16, 2024

resolves #

preview URL:

Summary

Launch the database, then try adding more metadata fields to this persistent database, currently running here: https://test-presistent.loculus.org/.
Then:

  • Confirm uploading sequences with the new fields does not crash the database:
  • Upload of new cchf sequences with empty lat and long fields is successful:
    [screenshot]

Hmmm... the page is behaving OK, but I see some LAPIS issues when I look at Argo CD (the SILO pods are degraded):

[2024-09-18 08:33:54.882] [logger] [info] [logging_request_handler.cpp:21] Request Id [c562bba9-db79-40ef-aa8f-cb3dd0579a84] - Responding with status code 503
[2024-09-18 08:33:55.099] [logger] [info] [logging_request_handler.cpp:15] Request Id [ff72065f-bf61-46b8-91f0-bacb9bb41030] - Handling GET /info
[2024-09-18 08:33:55.099] [logger] [info] [error_request_handler.cpp:30] Caught exception: Database not initialized yet
[2024-09-18 08:33:55.099] [logger] [info] [logging_request_handler.cpp:21] Request Id [ff72065f-bf61-46b8-91f0-bacb9bb41030] - Responding with status code 503
[2024-09-18 08:33:55.343] [logger] [info] [database_directory_watcher.cpp:112] No data found in /data/ for ingestion

Maybe this is just due to LAPIS being restarted? Hmm... search and download are working. Interesting: LAPIS for cchf seems to have restarted, its logs look fine, and downloads now include the new metadata fields. However, for West Nile and Ebola, where the SILO pods are degraded, downloads do not include the new fields.

Ok I tried to force the SILO pods to restart by deleting the old ones but they stay in this state.

So it would seem that until new sequences which include all metadata fields are uploaded, the new SILO pods cannot start...

Screenshot

PR Checklist

  • All necessary documentation has been adapted.
  • The implemented feature is covered by an appropriate test.

@anna-parker added the "preview" label (Triggers a deployment to argocd) on Sep 16, 2024
@anna-parker (Contributor, Author):

Sadly I don't think this actually tested persistency, as the database pod restarted after the last commit:
[screenshot]

@@ -39,5 +39,5 @@ spec:
         - name: POSTGRES_HOST_AUTH_METHOD
           value: "trust"
         - name: LOCULUS_VERSION
-          value: {{ $dockerTag }}
+          value: "commit-e3f268f"
Member commented:

For info - this isn't consumed at all by Postgres, so it could also just be removed.

Member commented:

(It is used specifically to cause the db to be recreated because that is useful in some scenarios.)

@anna-parker (Contributor, Author):

Looks like the culprit is this error in SILO preprocessing:

[2024-09-18 08:34:49.414] [logger] [error] [api.cpp:282] The field 'geoLocLatitude' which is contained in the database config is not contained in the input field '/preprocessing/input/data.ndjson.zst'.
std::exception
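
For context, a rough way to reproduce the check SILO is complaining about, outside the pod, is to compare the metadata fields declared in the database config against the keys of the first record in the preprocessing input. This is only a sketch: the config path, its `schema.metadata` layout, and the per-line `metadata` key are assumptions, not verified against the deployed setup (requires the `PyYAML` and `zstandard` packages).

```python
import io
import json

import yaml
import zstandard

CONFIG_PATH = "/preprocessing/input/database_config.yaml"  # assumed location
DATA_PATH = "/preprocessing/input/data.ndjson.zst"          # path from the error above

# Fields the SILO database config declares (assuming a schema.metadata list
# with one "name" entry per field).
with open(CONFIG_PATH) as f:
    config = yaml.safe_load(f)
config_fields = {field["name"] for field in config["schema"]["metadata"]}

# Metadata keys actually present in the first input record.
with open(DATA_PATH, "rb") as f:
    reader = zstandard.ZstdDecompressor().stream_reader(f)
    first_line = io.TextIOWrapper(reader, encoding="utf-8").readline()
record = json.loads(first_line)
input_fields = set(record.get("metadata", {}).keys())

# Anything printed here is what the "not contained in the input" error points at,
# e.g. geoLocLatitude for the older entries.
print("in config but missing from input:", sorted(config_fields - input_fields))
```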

@anna-parker (Contributor, Author):

This makes sense - get-released-data just streams the contents of the processed metadata (the joint_metadata field in the sequence_entries_view). This metadata was computed by preprocessing before the additional metadata fields were added, and thus does not contain the new fields.
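
As a purely illustrative sketch (not the actual backend code), the mismatch looks like this: entries whose joint_metadata was computed before the schema change never got the new keys, while freshly processed entries carry them, possibly as null. The example values and the sampleCollectionDate field name are made up; geoLocLatitude/geoLocLongitude come from the error above.

```python
# Illustrative only: shape of joint_metadata before and after the schema change.
old_joint_metadata = {  # processed before the geoLoc fields existed
    "country": "Nigeria",
    "sampleCollectionDate": "2024-01-01",
}
new_joint_metadata = {  # processed after the new fields were added
    "country": "Nigeria",
    "sampleCollectionDate": "2024-01-01",
    "geoLocLatitude": None,   # empty but present
    "geoLocLongitude": None,
}

# get-released-data streams these objects as-is, so the NDJSON handed to SILO
# mixes records with different key sets depending on when preprocessing ran.
for record in (old_joint_metadata, new_joint_metadata):
    print(sorted(record))
```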

@anna-parker (Contributor, Author) commented Sep 18, 2024:

Now that we see this behavior, I think it makes total sense. We could either alter get-released-data to do this check (which doesn't really make sense), or, as already suggested, just increase the version of the preprocessing pipeline so that all samples are reprocessed and get the new fields.

Also, we allow the JSON to not include all metadata fields and just assume they are null if not given.
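
For what it's worth, here is a minimal sketch of that null-filling idea applied when building the SILO input; the helper and field names are hypothetical, not existing Loculus code.

```python
from typing import Any


def backfill_metadata(record: dict[str, Any], schema_fields: list[str]) -> dict[str, Any]:
    """Ensure every schema field is present in the record's metadata, defaulting to None."""
    metadata = record.get("metadata", {})
    record["metadata"] = {field: metadata.get(field) for field in schema_fields}
    return record


# Example with a record processed before the new fields were added:
schema_fields = ["country", "sampleCollectionDate", "geoLocLatitude", "geoLocLongitude"]
old_record = {"metadata": {"country": "Nigeria", "sampleCollectionDate": "2024-01-01"}}
print(backfill_metadata(old_record, schema_fields)["metadata"])
# {'country': 'Nigeria', 'sampleCollectionDate': '2024-01-01',
#  'geoLocLatitude': None, 'geoLocLongitude': None}
```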
