Ticket #729 (assigned Task)
Service Model within the CIM
| Reported by: | bryan | Owned by: | bryan |
|---|---|---|---|
| Priority: | blocker | Milestone: | D4.5 Service report and revised infrastructure |
| Component: | WP4 - Deployment of Services | Version: | |
| Keywords: | Cc: | ||
| Requirement: | http://metaforclimate.eu/Work-Package-2/Developing-the-CIM/Project-Requirements-summary.htm | ||
Description
This issue is associated with modelling services applying to multiple replicates of data.
The issue is that a Thredds catalog describes data along with the services local to that specific Thredds server. It currently does not distinguish between properties taken from the underlying files and properties added by the ESG publisher or by the Thredds catalog itself.
We and ESGF are treating the Thredds catalog as authoritative about the data held on a specific node, but there is no comprehensive Thredds data model beyond an ad hoc description.
Since, for the foreseeable future, the data available at any location is likely to be exposed by a Thredds catalog (hereafter a tds-c) in an ESG data node, we need to think about the consequences.
To first order the tds-c consists of a bunch of data descriptions and a bunch of service descriptions. We need to harvest those into
- CIM data objects, and
- CIM service objects (the latter don't yet exist).
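As a rough illustration of that split, the sketch below walks a tds-c and separates the dataset descriptions from the service descriptions. It assumes the standard InvCatalog 1.0 namespace and the usual `service`, `dataset` and `property` elements; the sample catalog, the property name `dataset_version` and the dictionary layout are invented for the example and are not the CIM schema.

```python
import xml.etree.ElementTree as ET

# Namespace of the THREDDS InvCatalog 1.0 schema.
NS = {"t": "http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"}

# Illustrative catalog fragment; names and property values are invented.
SAMPLE_CATALOG = """<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0" name="example">
  <service name="opendap" serviceType="OPENDAP" base="/thredds/dodsC/"/>
  <dataset name="tas example" ID="example.tas.v1" urlPath="example/tas.nc">
    <property name="dataset_version" value="1"/>
  </dataset>
</catalog>"""


def harvest(catalog_xml):
    """Split a catalog into (data descriptions, service descriptions)."""
    root = ET.fromstring(catalog_xml)
    # Every <service> element becomes one service description.
    services = [
        {"name": s.get("name"), "type": s.get("serviceType"), "base": s.get("base")}
        for s in root.iter("{%s}service" % NS["t"])
    ]
    # Every <dataset> element becomes one data description, keeping its
    # direct <property> children as a plain name -> value mapping.
    datasets = []
    for d in root.iter("{%s}dataset" % NS["t"]):
        props = {p.get("name"): p.get("value") for p in d.findall("t:property", NS)}
        datasets.append({"id": d.get("ID"), "name": d.get("name"),
                         "url_path": d.get("urlPath"), "properties": props})
    return datasets, services


datasets, services = harvest(SAMPLE_CATALOG)
```

Note that the properties come out as one flat mapping, which is exactly the problem described above: nothing in the catalog says whether a given property came from the file, the publisher or the catalog.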
We need to ensure that the ESG publisher tags the datasets in such a way that we can compare the dataset descriptors at two different TDS sites and confirm whether they describe the same data.
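A minimal sketch of such a comparison follows. The tag names in `IDENTITY_TAGS` are hypothetical; they stand in for whatever the publisher would have to write identically at every node. Descriptors that lack any of the tags are treated as not comparable rather than as equal.

```python
# Hypothetical tag names: the fields the publisher would need to write
# identically at every node for a comparison to be possible.
IDENTITY_TAGS = ("dataset_id", "dataset_version", "checksum")


def identity_key(descriptor):
    """Return the identity tuple, or None if any identity tag is missing."""
    try:
        return tuple(descriptor[tag] for tag in IDENTITY_TAGS)
    except KeyError:
        return None


def same_data(a, b):
    """True only when both descriptors are fully tagged and the tags agree."""
    key_a, key_b = identity_key(a), identity_key(b)
    return key_a is not None and key_a == key_b


# Invented descriptors, as they might be harvested from different nodes.
site_a = {"dataset_id": "example.tas", "dataset_version": "1",
          "checksum": "abc123", "node": "node-a"}
site_b = {"dataset_id": "example.tas", "dataset_version": "1",
          "checksum": "abc123", "node": "node-b"}
site_c = {"dataset_id": "example.tas", "dataset_version": "2",
          "checksum": "def456", "node": "node-c"}
untagged = {"dataset_id": "example.tas", "node": "node-d"}
```

Node-local fields such as `node` are deliberately left out of the key, so two replicates compare as the same data even though they are served from different places.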
Then we need to ensure that we can extract the service information, so that when we ingest the information, we can do one of the following:
- Simply add the service description to our database, or
- Add the data description (it's currently unique), and then add a service description, and
- In both cases, register the association which points from the service to the data (and, if they exist, their replicates).
Note the complication if one tds-c has only part of a dataset, the rest being held elsewhere.
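The ingest logic above can be sketched with an in-memory registry. The `Registry` class, its method names and the identity key are all hypothetical stand-ins for the real database, and the sketch does not attempt the partial-dataset case.

```python
class Registry:
    """Toy stand-in for the database of data and service descriptions."""

    def __init__(self):
        self.data = {}          # identity key -> data description
        self.services = []      # service descriptions, in ingest order
        self.associations = []  # (service index, identity key) pairs

    def ingest(self, key, data_description, service_description):
        """Ingest one (data, service) pair; return True if the data was new."""
        is_new = key not in self.data
        if is_new:
            # The data description is currently unique, so store it first.
            self.data[key] = data_description
        # In both cases, add the service and point it at the data.
        self.services.append(service_description)
        self.associations.append((len(self.services) - 1, key))
        return is_new

    def services_for(self, key):
        """All services registered against the data with this identity key."""
        return [self.services[i] for i, k in self.associations if k == key]


registry = Registry()
key = ("example.tas", "1")  # invented identity key
first = registry.ingest(key, {"name": "tas example"},
                        {"node": "node-a", "type": "OPENDAP"})
second = registry.ingest(key, {"name": "tas example"},
                         {"node": "node-b", "type": "OPENDAP"})
```

Because the association points at the identity key rather than at a node-specific record, a service harvested from a second node is attached to the same data description as the first, which is how the replicates end up sharing one entry.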
