Version 2 (modified by bryan, 2 years ago)

--

Experiment Naming

(ticket:930)

One of the difficulties with marrying the Metafor description of experiments with the http://cmip-pcmdi.llnl.gov/cmip5/output_req.html?submenuheader=2#req_formatDRS and http://cmip-pcmdi.llnl.gov/cmip5/experiment_design.html?submenuheader=1 Taylor et al descriptions (which differ) is that the word "experiment" is used in different ways by CMIP5 and Metafor.

So starting with some helpful concepts (from metafor):

  • An experiment is a description of an activity that needs to be carried out. It is independent of whether or not anyone has carried out such an activity. So, in the case of a numerical experiment, it can exist whether or not someone has "run" a "simulation" which conforms to it.
    • In the case of CMIP5, the metafor experiment description should correspond as closely as possible to the description in Taylor et al ...
  • A simulation (in metafor) describes one or more "runs" which conform to the experiment (if more than one, it is an ensemble of runs).

(In truth I think we need to rename simulation in metafor to become "ensemble" and accept that it might have only one member - since any simulation can be turned into an ensemble at some later date. We should consider that as part of ticket:920.)

In the questionnaire, we enter an ensemble by defining a simulation, and then adding modifications for ensemble members ...

So

  • In the case of CMIP5, we probably want simulations to correspond to ensembles within specific Taylor et al experiments.

Now, let's take the DRS view of the world:

/CMIP5/output1/UKMO/HadCM3/decadal1990/day/atmos/day/r3i2p1/v20100105/tas/
tas_day_HADCM3_decadal1990_r3i2p1_199001-199012.nc

corresponding to

<activity>/<product>/<institute>/<model>/<experiment>/<frequency>/<modeling realm>/
<MIP table>/<ensemble member>/<version number>/<variable name>/<CMOR filename>.nc

In this string, the experiment concept conflates the idea of an experiment, a simulation, and to an extent, the concept of an ensemble.
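To make the template concrete, here is a minimal sketch that splits the example directory path into its DRS components. The function name and the dictionary keys are our own labels for illustration, not part of the DRS or of any Metafor code.

```python
# Labels for the eleven directory components of the DRS template.
# The key spellings (e.g. "modeling_realm") are assumptions for this sketch.
DRS_FIELDS = [
    "activity", "product", "institute", "model", "experiment",
    "frequency", "modeling_realm", "mip_table", "ensemble_member",
    "version", "variable",
]


def parse_drs_directory(path):
    """Split a DRS directory path into a dict keyed by template field."""
    parts = [p for p in path.strip().split("/") if p]
    if len(parts) != len(DRS_FIELDS):
        raise ValueError(
            f"expected {len(DRS_FIELDS)} components, got {len(parts)}"
        )
    return dict(zip(DRS_FIELDS, parts))


example = "/CMIP5/output1/UKMO/HadCM3/decadal1990/day/atmos/day/r3i2p1/v20100105/tas/"
fields = parse_drs_directory(example)
print(fields["experiment"])       # decadal1990
print(fields["ensemble_member"])  # r3i2p1
```

Note that the single "experiment" component (decadal1990) carries both an experiment name and a start year, which is one symptom of the conflation described here.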

To a great extent that conflation is Bryan's fault: he took his eye off the ball at a crucial time, and we are where we are. We need to fix it though ...

It has been compounded by three further issues, typified by the confusion with experiment names in the DRS and Taylor et al.

  1. In Taylor et al, experiment names include, for example:
    1. 1.1 "Ensemble of 10-year hindcasts and predictions"
    2. 1.2 "Ensemble of 30-year hindcasts and predictions"
    3. 7.1 historical with natural forcings
    4. 7.2 historical with GHG only
    5. 7.3 other historical forcings
  2. Which are, in DRS language:
    1. experiment name decadalXXXX where XXXX should be the initialisation year.
    2. experiment name decadalXXXX where XXXX should be the initialisation year.
      • i.e. no different from the first, so to find 1.2 experiments, you need to search for simulations which went for 30 years ...
    3. "historicalNat"
    4. "historicalGHG"
    5. "historicalMisc"
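The many-to-one problem in the two lists above can be shown with a small lookup table. This is only an illustrative sketch: the dictionary holds just the five examples listed here, "decadalXXXX" is kept as a literal pattern, and the helper name is hypothetical.

```python
from collections import defaultdict

# Taylor et al experiment number -> DRS experiment name, as listed above.
# "decadalXXXX" is the literal pattern; XXXX stands for the initialisation year.
TAYLOR_TO_DRS = {
    "1.1": "decadalXXXX",
    "1.2": "decadalXXXX",
    "7.1": "historicalNat",
    "7.2": "historicalGHG",
    "7.3": "historicalMisc",
}


def ambiguous_drs_names(mapping):
    """Return DRS names that more than one Taylor et al experiment maps to."""
    by_drs = defaultdict(list)
    for taylor_id, drs_name in mapping.items():
        by_drs[drs_name].append(taylor_id)
    return {name: ids for name, ids in by_drs.items() if len(ids) > 1}


print(ambiguous_drs_names(TAYLOR_TO_DRS))  # {'decadalXXXX': ['1.1', '1.2']}
```

Because 1.1 and 1.2 share a DRS name, the name alone cannot tell a 10-year run from a 30-year one; that has to come from the simulation length.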

Each of these can of course be carried out with an ensemble, and the following is mandatory:

r<N>i<M>p<L>

in all DRS names, where:

  1. p<L> is used to identify perturbations in physics (from the same base model),
  2. i<M> is used to identify perturbations in initialisation method, and
  3. r<N>, the "realisation" number, is used to identify ensemble members which differ in other characteristics.
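As a rough sketch, an ensemble member label can be split into its realisation (r), initialisation method (i) and physics (p) numbers with a regular expression. The class and function names are hypothetical, chosen only for this example.

```python
import re
from typing import NamedTuple


class EnsembleMember(NamedTuple):
    realisation: int     # r<N>
    initialisation: int  # i<M>
    physics: int         # p<L>


_RIP = re.compile(r"r(\d+)i(\d+)p(\d+)")


def parse_rip(label):
    """Parse a label such as 'r3i2p1' into its three integer parts."""
    match = _RIP.fullmatch(label)
    if match is None:
        raise ValueError(f"not an r<N>i<M>p<L> label: {label!r}")
    return EnsembleMember(*(int(g) for g in match.groups()))


print(parse_rip("r3i2p1"))
# EnsembleMember(realisation=3, initialisation=2, physics=1)
```

The numbers are only identifiers: nothing in the label says what a given physics or initialisation number means, which is why the metadata is needed.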

So the three issues, to be clear, are:

  1. Are the experiment names decadal1960 etc or are they decadal?
  2. Should we have experiments with suffixes (such as E for extended)?
  3. Should we ever expect to see any of r.i.p appearing in an experiment description?

The first should be resolved as follows:

  1. The official experiment name should be "decadal", and that's what the questionnaire should show.
    • Metafor should provide guidance that folks doing the 1960 ensemble should create a simulation with start date 1960, and an ensemble within that; similarly a simulation for 1965, etc ...
    • The questionnaire-to-XML code should combine the experiment name (decadal) with the start year, and write that into the "target DRS name" attribute for each ensemble member, thus allowing the portal to go from ensemble member to the right metadata.
    • Issue: do we believe the ESG portal is handling ensembles sensibly yet? Is there a ticket on that? (At one point they wanted all ensemble members to be handled as simulations, in curator language, which would result in very unwieldy lists following searches.)
  2. The second issue can effectively be handled via the same mechanism. Folks should both see decadal-E as an experiment and enter their extensions as continuations of simulations from the first. However, in their filenames on disk, they follow the DRS naming, and we deal with it in metadata.
  3. The answer to the last question is unambiguously no. And we can't expect folks to assume that any given p number will correspond to the same forcing between experiments or between modelling centres. Folks will need to go to metadata (both sorts) to get to that!
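The first resolution (official name "decadal", with the start year joined on only when writing the target DRS name) could look roughly like the sketch below. Both function names are hypothetical, and the assumption that the start year is always a four-digit suffix comes from the decadalXXXX examples above, not from any questionnaire code.

```python
import re


def target_drs_experiment(experiment, start_year):
    """Join the official experiment name and a start year, e.g. decadal1960."""
    return f"{experiment}{start_year:04d}"


def split_drs_experiment(drs_name):
    """Split a DRS experiment name into (official name, start year or None)."""
    match = re.fullmatch(r"([A-Za-z]+)(\d{4})", drs_name)
    if match is None:
        # Names without a year suffix, such as historicalNat, pass through.
        return drs_name, None
    return match.group(1), int(match.group(2))


print(target_drs_experiment("decadal", 1960))  # decadal1960
print(split_drs_experiment("decadal1990"))     # ('decadal', 1990)
print(split_drs_experiment("historicalNat"))   # ('historicalNat', None)
```

With something like this, the questionnaire can show only "decadal" while each ensemble member still records the name that appears on disk.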