
Mtg 230821

Bryan Lawrence edited this page Aug 21, 2023 · 1 revision

When: 2023-08-21

Who: Bryan, David George

Technology Bits

  1. Apparently "CMOR variable names" appear as netCDF variable names, so we probably need to pull out the netCDF variable names as variable properties as well.
  2. We observe that the table frequency names of the DRS (e.g. 1hr, 3hr, 3hrPt) are not in CF, and must be taken either from filenames at aggregation time, from a combination of the (XIOS?) interval-write and the cell method, or by inspection of the time coordinate and cell method. For CANARI we can use interval-write, but we need this functionality in place.
    1. This is a bit messy, since the table name and frequency name are effectively parts of a domain description. Surely we can do better in CF/CMIP? To discuss with the CMIP7 folks.
  • #bryan needs to do some triage on the cfstore tickets and push most to some legacy state. 📅 2023-08-22
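
A rough sketch of the third option in item 2 (inspection of the time coordinate and cell method), in Python. Everything here is an assumption for illustration: the function name `infer_frequency`, the `_LABELS` lookup, time values given as plain hours, and the rule that a `time: point` cell method adds the `Pt` suffix. It is not how cfstore does it.

```python
from typing import Optional, Sequence

# Hypothetical lookup from a sampling interval (in hours) to a DRS-style label.
_LABELS = {1.0: "1hr", 3.0: "3hr", 6.0: "6hr"}


def infer_frequency(time_hours: Sequence[float], cell_methods: str = "") -> Optional[str]:
    """Guess a DRS-style frequency label from time values and a cell_methods string.

    Returns None when the spacing is irregular, unknown, or cannot be determined.
    """
    if len(time_hours) < 2:
        return None  # a single time value has no spacing to inspect
    # Collect the distinct spacings between consecutive time values.
    steps = {round(b - a, 6) for a, b in zip(time_hours, time_hours[1:])}
    if len(steps) != 1:
        return None  # irregular spacing: no single frequency applies
    label = _LABELS.get(steps.pop())
    if label is None:
        return None  # spacing not covered by the illustrative lookup
    # Assumption: instantaneous samples are flagged as "time: point".
    if "time: point" in cell_methods:
        label += "Pt"
    return label
```

For example, `infer_frequency([0, 3, 6, 9], "time: point")` gives `"3hrPt"`, while irregular spacing gives `None`, which is the case that would need a human decision.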

Use Case Discussion

Think about the following use case:

  • Entire CANARI as it stands: it has several simulations, with several variant ids, with differing levels of data, with some data on JASMIN disk, some on tape, and some still at ARCHER.
    • For this version, let's assume we can ignore the data at ARCHER unless it is relevant for the workflow of updating the database. Query this with George!

Then:

What experiments have been run?

  • Soon this will include historical as well as SSP370. What about the various macro/micro initialisations: how do we find those out?

What variants (e.g. r1i1f1) have been run? (How do we get from those variant ids to explanations of what they mean?)

How many years of data have been run for a particular variant simulation? (Clue: we have a bounding box; how do we get at that in the web interface?)

What do we do about data that doesn't have interval-write? Can we inspect the files at aggregation time and construct an interval-write if it is the same across the aggregation? (If it is not, what does that mean?)
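
One way the "construct an interval-write at aggregation time" idea might look, as a minimal Python sketch. The function name `construct_interval_write` and the input shape (one list of time values in hours per file) are hypothetical, and the sketch only looks at spacing within each file; it does not check for gaps between files.

```python
from typing import Optional, Sequence


def construct_interval_write(files_time_hours: Sequence[Sequence[float]]) -> Optional[float]:
    """Return the single time step (hours) shared by every file, or None.

    None means the step could not be determined or was not the same across
    the aggregation, which is the case the notes flag as an open question.
    """
    steps = set()
    for times in files_time_hours:
        # Spacing between consecutive time values within one file.
        steps.update(round(b - a, 6) for a, b in zip(times, times[1:]))
    # Exactly one distinct step means the aggregation is consistent.
    return steps.pop() if len(steps) == 1 else None
```

Here `construct_interval_write([[0, 3, 6], [9, 12]])` returns `3`, whereas two files with different steps return `None` rather than a guessed value.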

What data is online across an experiment? For a particular variable? For a particular frequency (interval-write) and/or cell-method?
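
Questions like these amount to filtering records on a handful of properties. A minimal Python sketch, assuming each variable is held as a flat dictionary of properties; the field names (`variable`, `experiment`, `frequency`, `location`) and the example values are illustrative only, not the actual database schema.

```python
from typing import Any, Dict, List


def select(records: List[Dict[str, Any]], **facets: Any) -> List[Dict[str, Any]]:
    """Keep only the records whose properties match every requested facet value."""
    return [
        record
        for record in records
        if all(record.get(key) == value for key, value in facets.items())
    ]


# Illustrative records; names and values are made up for this sketch.
records = [
    {"variable": "tas", "experiment": "SSP370", "frequency": "3hr", "location": "disk"},
    {"variable": "tas", "experiment": "SSP370", "frequency": "3hr", "location": "tape"},
    {"variable": "pr", "experiment": "SSP370", "frequency": "1hr", "location": "disk"},
]

# "What data is online for a particular variable?"
online_tas = select(records, variable="tas", location="disk")
```

With no facets given, `select(records)` returns every record, which matches the unconstrained starting state of a faceted browse.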

Having used "faceted browse" (the check boxes on the right-hand side of the current GUI), when we hit submit, we expect:

  • the list on the left (the list of variables) to shrink according to the constraints of the facets
  • when we click on a variable, we are presented with a list of "version" "collections", that is, a list of variables which differ across a collection of properties. (At this point, I think George creates a collection, but I think that's precipitate.)
  • At that point, there are three (possibly four) sets of interesting information:
    1. The metadata they have in common.
    2. The metadata (keys) that differ across the set.
    3. The values of the metadata properties that correspond to those differences.
    4. Some of those are trivial/management (e.g. creation date) and some are important/science (e.g. cell-method).
  • We need to be able to display that information, and allow someone to create/name a collection when they have selected one of these "versions".
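
The sets of information described above could be computed along these lines. This is a Python sketch under stated assumptions: each "version" is a flat dictionary of metadata, the function name `compare_versions` is hypothetical, and the set of trivial/management keys (`creation_date` here) is a made-up default that would need agreeing.

```python
from typing import Any, Dict, FrozenSet, List


def compare_versions(
    versions: List[Dict[str, Any]],
    management_keys: FrozenSet[str] = frozenset({"creation_date"}),
) -> Dict[str, Any]:
    """Split metadata into what the versions share and what differs between them."""
    keys = set().union(*(v.keys() for v in versions))
    common: Dict[str, Any] = {}
    differing: Dict[str, List[Any]] = {}
    for key in sorted(keys):
        # A key missing from a version shows up as None and counts as a difference.
        values = [v.get(key) for v in versions]
        if all(value == values[0] for value in values):
            common[key] = values[0]
        else:
            differing[key] = values
    return {
        "common": common,        # 1. metadata in common
        "differing": differing,  # 2. keys that differ, 3. their values per version
        # 4. split the differing keys into science and management
        "science": [k for k in differing if k not in management_keys],
        "management": [k for k in differing if k in management_keys],
    }
```

The `science` list is probably what the GUI should show first when someone is choosing which "version" to name as a collection, with the `management` differences available on request.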

Current GUI:

Pasted image 20230821141504
