Structure metadata and additional properties in output documents #502
Comments
I think making those documents a subclass of Personally, I'm less keen on adding calculation parameters for the reason that you mentioned - these are code specific so creating a general schema will be difficult and could change in the future, e.g., with ML force fields. Both the
Maybe, however, we could add information to the documentation on how to retrieve the static computations belonging to an elastic or phonon computation? It might help clarify this for all users.
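As a rough illustration of what such a documentation entry could show, here is a minimal sketch. It assumes that every stored job document carries a `uuid`, a `name`, and a `hosts` list with the UUIDs of the flows containing it; that layout, the sample documents, the job names, and the helper `sibling_jobs` are all hypothetical and are not confirmed anywhere in this thread.

```python
from typing import Any, Dict, List

# Hypothetical job documents, standing in for what a job store might return.
# Field names and values are placeholders, not real outputs.
JOB_DOCS: List[Dict[str, Any]] = [
    {"uuid": "a1", "name": "static 1/2", "hosts": ["flow-1"]},
    {"uuid": "a2", "name": "static 2/2", "hosts": ["flow-1"]},
    {"uuid": "a3", "name": "fit elastic tensor", "hosts": ["flow-1"]},
    {"uuid": "b1", "name": "static 1/1", "hosts": ["flow-2"]},
]


def sibling_jobs(
    job_docs: List[Dict[str, Any]], final_uuid: str, name_contains: str = ""
) -> List[Dict[str, Any]]:
    """Return the other jobs that share at least one host flow with the final job."""
    final_doc = next(doc for doc in job_docs if doc["uuid"] == final_uuid)
    flow_ids = set(final_doc["hosts"])
    return [
        doc
        for doc in job_docs
        if doc["uuid"] != final_uuid
        and flow_ids.intersection(doc["hosts"])
        and name_contains in doc["name"]
    ]


# Static calculations that belong to the same flow as the final elastic document.
statics = sibling_jobs(JOB_DOCS, "a3", name_contains="static")
print([doc["uuid"] for doc in statics])
```

Against a real database the same idea would become a query on the flow identifier instead of a list comprehension; the exact query depends on how the store is configured.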
Agreed, that would be very useful.
Thanks for the feedback. Indeed I was using a needlessly complicated approach to query the input parameters.
+1 for I strongly support keeping calculation details out of the document model, except maybe in a very abstract way (e.g. something like an originating task_id key, or a generic method field or similar). In general, I'm observing that it's very difficult to stop a proliferation of document models. There's already another elastic property document model being proposed (tagging @janosh) for ML-generated elastic properties, despite the fact the property itself is the same. We have multiple document models for the same task in different repos, or even within the same repo etc. What I'm proposing/asking for is that we adopt a very disciplined approach for keeping document models focused and consistent, e.g. one and only one model per property, and thinking very carefully about what fields to include and how to define units etc. In the long term I think this will greatly benefit us (e.g. allowing analysis that combines property data generated via multiple methods), and avoid very costly database migrations (e.g. if a document model changes in a breaking way).
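The "one model per property" idea with only an abstract pointer to the calculation could look roughly like the sketch below. The class `ElasticPropertyDoc`, its field names, the unit choice, and the numbers are hypothetical placeholders chosen for illustration; they are not an existing schema and not real results.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ElasticPropertyDoc:
    """Hypothetical single document model for one property, whatever produced it."""

    formula_pretty: str
    bulk_modulus_gpa: float  # unit fixed once, in the field name
    method: str  # generic label instead of code-specific input parameters
    origin_task_id: Optional[str] = None  # abstract pointer back to the calculation


# Placeholder entries: the same property obtained with two different methods.
DOCS: List[ElasticPropertyDoc] = [
    ElasticPropertyDoc("Si", 88.0, method="DFT", origin_task_id="task-001"),
    ElasticPropertyDoc("Si", 91.5, method="ML force field", origin_task_id="task-002"),
]


def by_method(docs: List[ElasticPropertyDoc], formula: str) -> Dict[str, float]:
    """Collect one property across methods, which a shared model makes trivial."""
    return {
        doc.method: doc.bulk_modulus_gpa
        for doc in docs
        if doc.formula_pretty == formula
    }


comparison = by_method(DOCS, "Si")
print(comparison)
```

Because both entries share one model, comparing methods needs no per-model translation layer, which is the kind of combined analysis described above.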
I strongly agree. Also, very timely observation that applies more generally across the MP stack considering the current discussion to deduplicate
Btw, the same probably applies to phonons as well, including the outputs of workflows that compute phonons with force fields.
Fixed in #514
Flows like the ones calculating elastic tensors or phonons usually produce a final document containing the parsed results but no default metadata about the structure or how the calculation was performed. This makes it somewhat difficult to retrieve the outputs, as one does not have any obvious way to filter them. I think it would be useful to convert the output documents (e.g. `ElasticDocument` and `PhononBSDOSDoc`) to be subclasses of `StructureMetadata`. I don't think this will add much data to the DB, but it could be useful.
Another related point is the fact that the final document does not contain the input parameters used to perform the simulation. To retrieve that information one would first need to match the final output to the perturbation jobs and extract the information from there. While this is not too complex, it might still be a bit involved for a casual user. It could thus be interesting to store the input parameters of one of the main steps of the flow in the final document (e.g. one of the perturbations for the elastic or phonon flow). However, this would be code dependent and not necessarily representative of all the steps of the flow, so it may be less straightforward than the structure metadata.
If any of these options seem acceptable I can try to work on it.
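To make the first proposal concrete, here is a minimal sketch of how inheriting structure metadata would make output documents filterable. It uses plain dataclasses as stand-ins: `StructureMetadataSketch`, `ElasticDocumentSketch`, their fields, and the placeholder values are assumptions for illustration only and do not reproduce the real `StructureMetadata` or `ElasticDocument` classes.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, List, Optional


@dataclass
class StructureMetadataSketch:
    """Simplified stand-in for a structure metadata base class (hypothetical fields)."""

    formula_pretty: Optional[str] = None
    chemsys: Optional[str] = None
    nsites: Optional[int] = None


@dataclass
class ElasticDocumentSketch(StructureMetadataSketch):
    """Stand-in output document: parsed results plus the inherited metadata."""

    bulk_modulus_gpa: Optional[float] = None


def filter_docs(docs: List[Dict[str, Any]], **criteria: Any) -> List[Dict[str, Any]]:
    """Return the stored documents whose fields match every given criterion."""
    return [
        doc
        for doc in docs
        if all(doc.get(key) == value for key, value in criteria.items())
    ]


# Placeholder documents as they might be serialized into a database.
STORED = [
    asdict(ElasticDocumentSketch(formula_pretty="Si", chemsys="Si", nsites=2, bulk_modulus_gpa=88.0)),
    asdict(ElasticDocumentSketch(formula_pretty="MgO", chemsys="Mg-O", nsites=2, bulk_modulus_gpa=160.0)),
]

# With the metadata stored next to the results, filtering needs no job matching.
hits = filter_docs(STORED, chemsys="Mg-O")
print([doc["formula_pretty"] for doc in hits])
```

The extra fields are a handful of short strings and integers per document, which is in line with the expectation above that this would not add much data to the DB.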