SAFE Metadata Model for Auxiliary files

During the PDR-C collocation meeting it was agreed to analyse whether OGC-OM can be used as the base metadata model for auxiliary data and, if it cannot, to decide which metadata model should be used as the base instead (PDR-C_A10).

The attached document “SAFE Metadata Model for Auxiliary Data” (PDGS-SAFE-GMV-TN-12/0214) provides the analysis and conclusions reached on this topic.
All your comments will be appreciated.

Best Regards.

Adrián Sanz (GMV)
LTDP SAFE Project Manager


Re: SAFE Metadata Model for Auxiliary files

Please consider the following comments from EUMETSAT:
1) have a product format version additional to file version
2) have a quality flag describing the quality of the aux data (e.g. passed, degraded, failed)
3) have validityStartTime and validityStopTime as applicable date/times for the aux data.
4) discuss if (like presented) grouping aux files into groups as described is beneficial in comparison to specify all different aux file types (e.g. Sentinel 3 has ~245 different aux file types, each specified as a type)
5) do we need the mission name in the aux metadata? Currently it's not there.
6) do we need some information about the origin of the aux file/data, e.g. creation organisation/facility?
7) safe-aux:earthSurfaceRepresentation can be vector, then maybe safe-aux:vectorTopology should include as well multipolygon
8) safe-aux:earthSurfaceRepresentation can be vector, then maybe a bounding (multi)polygon to describe the extent in addition to bounding box would be beneficial?

Regards
--
Stephan Zinke


Re: Re: SAFE Metadata Model for Auxiliary files

Dear Stephan,

Thank you so much for your feedback. Your comments have been checked and some changes have been introduced in the proposed schema; a brief description follows.

1) have a product format version additional to file version
A new field set for this purpose, named safe-aux:fileFormatInformation has been added within safe-aux:fileInformation
In turn safe-aux:fileFormatInformation comprises safe-aux:fileFormat and safe-aux:fileFormatVersion

2) have a quality flag describing the quality of the aux data (e.g. passed, degraded, failed)
A new field set for this purpose, named safe-aux:qualityInformation has been added within safe-aux:fileInformation
In this case a subsidiary type, safe-aux:qualityInformationValueType, has been defined; the allowed values are “PASSED”, “DEGRADED”, “FAILED” and other (to be defined by the user).

3) have validityStartTime and validityStopTime as applicable date/times for the aux data.
A new field set for this purpose, named safe-aux:validityTime has been added within safe-aux:fileInformation.
In turn safe-aux:validityTime comprises safe-aux:validityTimeStartTime and safe-aux:validityTimeStopTime

4) discuss if (like presented) grouping aux files into groups as described is beneficial in comparison to specify all different aux file types (e.g. Sentinel 3 has ~245 different aux file types, each specified as a type)
The design of the schema was driven, among other considerations already discussed in the trade-off document, by the idea of abstraction: keeping it open and flexible enough to accommodate any kind of file type, independently of its usage or purpose. At the same time, grouping file types into thematic groups allows auxiliary files to be discovered through a metadata catalogue. For example, a user interested in gathering information on the archived DEMs would find that difficult if the files were not indexed into thematic groups.

5) do we need the mission name in the aux metadata? Currently it's not there.
We discussed this question internally during the design phase and decided not to include it, because in some cases an individual auxiliary file is used by several missions (e.g. star files, land-sea mask).

6) do we need some information about the origin of the aux file/data, e.g. creation organisation/facility?
A new field set for this purpose, named safe-aux:fileOriginInformation has been added within safe-aux:fileInformation
In turn safe-aux:fileOriginInformation comprises safe-aux:fileOriginOrganisation, safe-aux:fileOriginFacility and safe-aux:fileOriginArchivingCenter

7) safe-aux:earthSurfaceRepresentation can be vector, then maybe safe-aux:vectorTopology should include as well multipolygon
The option “MULTIPOLYGON” has been included within the restriction list of safe-aux:VectorTopologyValueType/VectorTopologyValueEnumerationType. At the same time a simple type safe-aux:VectorTopologyValueType/VectorTopologyValueOtherType has been created to accommodate other topologies

8) safe-aux:earthSurfaceRepresentation can be vector, then maybe a bounding (multi)polygon to describe the extent in addition to bounding box would be beneficial?
Some changes have been introduced in the field safe-aux:SpatialDomain as follows:
A new complex type, safe-aux:boundingBox, has been created, comprising the already existing simple types safe-aux:westBoundingCoordinate, safe-aux:eastBoundingCoordinate, safe-aux:northBoundingCoordinate and safe-aux:southBoundingCoordinate.

To accommodate the definition of polygon and multipolygon topologies, two types have been created: safe-aux:boundingPolygon (cardinality 1) and safe-aux:boundingMultipolygon (cardinality 2…*).
The spatial domain is described by a closed polygon (last point = first point) given as latitude, longitude pairs. The expected structure is gml:Polygon/gml:exterior/gml:LinearRing/gml:posList.
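
For illustration only, the closure rule above (last point = first point, latitude/longitude pairs in a gml:posList) could be checked with a small Python sketch such as the following. The function names and the sample coordinates are assumptions made for this example; they are not part of the proposed schema.

```python
def parse_pos_list(pos_list):
    """Split the text of a gml:posList into (latitude, longitude) pairs."""
    values = [float(v) for v in pos_list.split()]
    if len(values) % 2 != 0:
        raise ValueError("posList must contain an even number of values")
    return list(zip(values[0::2], values[1::2]))


def is_closed_ring(points):
    """A ring is closed when it has at least 4 points and last point == first point."""
    return len(points) >= 4 and points[0] == points[-1]


# Sample coordinates are invented for the example (a 1 x 1 degree square).
ring = parse_pos_list("40.0 -4.0 40.0 -3.0 41.0 -3.0 41.0 -4.0 40.0 -4.0")
print(is_closed_ring(ring))
```

A ring whose last pair differs from the first would be rejected by `is_closed_ring`, which is the condition a converter would presumably want to enforce before writing safe-aux:boundingPolygon.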


Regards



Re: SAFE Metadata Model for Auxiliary files

Dear GMV,
thanks a lot for your prompt feedback.
Re. 4: I understand the rationale and was just wondering whether it would create issues if Sentinel 3 were ever merged/transcribed into SAFE 2.0. I personally also feel that the S3 approach is over the top.
It would be interesting, though, to hear what other members of this discussion group think...
Re. 5: Understood.
Re. 8: "safe-aux:boundingMultipolygon (cardinality 2…*)": why not use the gml:MultiPolygon approach here? Then the cardinality need only be 1.
Regards
--
Stephan


Re: Re: SAFE Metadata Model for Auxiliary files

> Re. 8: "safe-aux:boundingMultipolygon (cardinality 2…*)", why not use here the gml:MultiPolygon approach, then the cardinality need only be 1.

You're right, Stephan, it should be gml:MultiPolygon (cardinality 1). The schema will be updated accordingly.
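
As a rough sketch of what this would look like, the Python below builds a single safe-aux:boundingMultipolygon wrapping one gml:MultiPolygon with several polygon members. The safe-aux namespace URI is a placeholder invented for the example, the GML 3.1.1 namespace and the gml:polygonMember nesting are assumptions about the GML version in use, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml"   # GML 3.1.1 namespace (assumed version)
SAFE_AUX_NS = "urn:example:safe-aux"    # placeholder, not the real safe-aux URI


def bounding_multipolygon(rings):
    """Wrap several closed rings in ONE gml:MultiPolygon (cardinality 1)."""
    wrapper = ET.Element(f"{{{SAFE_AUX_NS}}}boundingMultipolygon")
    multi = ET.SubElement(wrapper, f"{{{GML_NS}}}MultiPolygon")
    for ring in rings:
        member = ET.SubElement(multi, f"{{{GML_NS}}}polygonMember")
        polygon = ET.SubElement(member, f"{{{GML_NS}}}Polygon")
        exterior = ET.SubElement(polygon, f"{{{GML_NS}}}exterior")
        linear_ring = ET.SubElement(exterior, f"{{{GML_NS}}}LinearRing")
        pos_list = ET.SubElement(linear_ring, f"{{{GML_NS}}}posList")
        # Latitude, longitude pairs, space separated.
        pos_list.text = " ".join(f"{lat} {lon}" for lat, lon in ring)
    return wrapper


# Two invented closed rings (first point repeated as last point).
ring_a = [(40.0, -4.0), (40.0, -3.0), (41.0, -3.0), (41.0, -4.0), (40.0, -4.0)]
ring_b = [(42.0, 1.0), (42.0, 2.0), (43.0, 2.0), (43.0, 1.0), (42.0, 1.0)]
element = bounding_multipolygon([ring_a, ring_b])
print(ET.tostring(element, encoding="unicode"))
```

The multiplicity moves inside gml:MultiPolygon, so the safe-aux element itself appears once regardless of how many polygons describe the extent.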

Thanks



Re: SAFE Metadata Model for Auxiliary files

Hi.

On other activities not directly related to SAFE, we have actually been thinking about this and we have opted for quite a different solution. In general, we believe that you need very little metadata for auxiliary files, so much so that we are planning to extract the important metadata from the filename, eventually by imposing a suitable naming convention. We will not use any metadata model specific for auxiliary files. This is more than enough for missions like ENVISAT and Earth Explorers (GOCE, SMOS, Cryosat) which pretty much "see" auxiliary products as normal products and apply exactly the same naming convention to both.

We consider that the important information to have is the file type and the validity start/stop time (that Stephan asked for as well). This is the basic information that is used for selecting auxiliary files, namely for processing purposes. In fact, what you "need" is determined by what you want to do with it and while "normal" metadata is useful for cataloguing and searching, it is not common, at least today and for the average user, to want a catalogue of auxiliary data and wanting to search for it, at least in a very advanced manner. At ESA, in general, the services provided for "normal" data cataloguing and search are not integrated with those for auxiliary data (typically, for the latter, you have just an FTP/HTTP server giving access to all data, static web pages with links, etc.).
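
To make the filename-based idea concrete, here is a minimal Python sketch. It assumes a purely hypothetical naming convention, `<FILE_TYPE>_<START>_<STOP>[.ext]` with `YYYYMMDDTHHMMSS` timestamps; this is not the actual ENVISAT or Earth Explorer convention, and the file names and function names are invented for the example.

```python
import re
from datetime import datetime
from typing import NamedTuple

# Hypothetical convention: <FILE_TYPE>_<START>_<STOP>[.ext]
_PATTERN = re.compile(
    r"^(?P<file_type>[A-Z0-9_]+?)_"
    r"(?P<start>\d{8}T\d{6})_"
    r"(?P<stop>\d{8}T\d{6})(?:\.\w+)?$"
)
_TIME_FORMAT = "%Y%m%dT%H%M%S"


class AuxFileInfo(NamedTuple):
    name: str
    file_type: str
    validity_start: datetime
    validity_stop: datetime


def parse_aux_filename(name):
    """Return the metadata encoded in the file name, or None if it does not match."""
    match = _PATTERN.match(name)
    if match is None:
        return None
    return AuxFileInfo(
        name,
        match["file_type"],
        datetime.strptime(match["start"], _TIME_FORMAT),
        datetime.strptime(match["stop"], _TIME_FORMAT),
    )


def select_aux(names, file_type, at):
    """Select files of one type whose validity interval covers the time `at`."""
    selected = []
    for name in names:
        info = parse_aux_filename(name)
        if info and info.file_type == file_type \
                and info.validity_start <= at <= info.validity_stop:
            selected.append(info)
    return selected


# Invented file names following the hypothetical convention.
names = [
    "AUX_DEM_20120101T000000_20121231T235959.dat",
    "AUX_DEM_20130101T000000_20131231T235959.dat",
    "AUX_LSM_20120101T000000_20121231T235959.dat",
]
for info in select_aux(names, "AUX_DEM", datetime(2012, 6, 1)):
    print(info.name)
```

File type plus validity start/stop is all the selection step uses here, which is the point being made: for processing purposes no separate metadata file is consulted.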

I think the analysis on the TN is interesting and may be useful as input to auxiliary metadata standardization efforts (in current or future HMA projects, for example), but for SAFE I'm not so sure. In 2.0 we have made steps in the standardization direction (with DFDL and OGC EOP O&M), so a proprietary metadata model seems like going in the opposite direction, especially if we don't have strong use-cases in support. From a Long-Term Preservation perspective, it may make more sense to record quite some information about auxiliary files than in other perspectives, but adopting a standard (which does not exist today) is, I would say, even more important from that perspective.

It also seems to me that filling in all these metadata fields for each auxiliary file type would be a burden, first on the definition of the specialisation and then on the conversion infrastructure/activities. This has never been given any thought in the current Toolset/concept, because auxiliary data was largely disregarded in SAFE 1.3 (so, further work and complexity on that side).

To conclude, we should consider limiting our ambitions and in particular I propose that either we simply say that there is no special metadata file for auxiliary files (the SAFE structure, in any case, would not prevent you from adding any metadata in any format you wish to) or at least that the auxiliary metadata file, following this proposed schema, is optional inside the Auxiliary SAFE Package (as opposed to the OGC EOP O&M metadata file which is mandatory for EO Product Packages).

As for the model itself, I would just comment that since most of the types are defined as enumerations of fixed values (with the option to add virtually anything else as "OTHER"), there is an underlying assumption that the content is not extracted from the products (or some other related entity) but rather statically defined by an operator, or even as part of the specialisation for the given auxiliary file type, prior to conversion/creation. This means that the content will typically be exactly the same for all products of the same type, i.e. it defines characteristics of the product type rather than of the product. This is of limited value (it also adds redundancy) and would make more sense as part of "auxiliary data collection metadata", which does not exist in SAFE and which I'm not proposing to include.

Paulo


