Metadata for a Global Spatial Data Infrastructure
David M. Danko
National Imagery and Mapping Agency
Standards and Interoperability Division
12310 Sunrise Valley Drive
Reston, VA 22091
USA
ABSTRACT
The International Organization for Standardization Technical Committee for Global GIS Standards (ISO/TC211) is developing an integrated suite of standards to promote global interoperability. Metadata is an important part of this standard. Metadata provides a vehicle to locate and understand geospatial data which may be produced by one community and applied by another. This paper examines the premises and principles which are guiding the development of the metadata standard, its present status, and its future in supporting a Global Spatial Data Infrastructure (GSDI).
INTRODUCTION
Someday, as the use of geospatial datasets increases:
These and many other questions require a good understanding of data. They require that data be well documented; they require complete and correct metadata. As we move into the age of spatial data infrastructures, metadata is essential, allowing users to locate, evaluate, extract, and employ geospatial data. Diverse communities with a common understanding of metadata will be able to manage, share, and reuse each other's geographic data, and make global interoperability a reality. The ISO Standard for Geographic Information - Metadata (ISO 15046-15) will provide this common understanding.
TRADITIONS IN METADATA
Metadata is not new; it is used every day in library card catalogs, Compact Disc (CD) jackets, user’s manuals, and in many other ways. Geographic data has a long history of using metadata. The marginalia on maps and charts are, of course, metadata. The title, source, scale, accuracy, producer, symbols, navigation notices, warnings, and all of the information found in the borders of maps and charts are metadata. This metadata is very user oriented; just about anyone can pick up a map, understand the metadata, and use the map. Map catalogs are another traditional use of metadata. Typically, catalog metadata is limited to information such as area coverage, series identifiers (subject matter and scale), publication dates, and distribution information.
The tradition of metadata continued as we entered the digital world. Digital products such as the defense community’s DIGEST/VPF based products, Vmap™ and DNC™ for example, are loaded with metadata. The whole philosophy behind these products is to provide information to the data users, the people who need to make geographic decisions when using the data.
Several years ago, with the increase in the number of geospatial datasets, the need for metadata to support the sharing and efficient utilization of geospatial data became apparent. In 1994 the US Federal Geographic Data Committee (FGDC) developed the FGDC Content Standard for Geospatial Metadata to facilitate locating data, determining its fitness for use, and promoting data sharing. The FGDC Clearinghouse is now implementing this standard using SGML, Z39.50, and other protocols.
METADATA PERSPECTIVES
Non-geographers using geospatial data: A revival in the awareness of the importance of geography and how things relate spatially, combined with the advancement in the use of electronic technology, has caused an expansion in the use of digital geospatial information and geographic information systems worldwide. Increasingly, individuals from a wide range of disciplines outside of the geographic sciences and information technologies are capable of producing, enhancing, and modifying digital geospatial information. As the number, complexity, and diversity of geospatial datasets grow, a method for providing an understanding of all aspects of this data grows in importance.
Geospatial data is imperfect: Digital geospatial data is an attempt to model and describe the real world. Any description of reality is always an abstraction, always partial, and always just one of many possible "views". This "view" or model of the real world is not an exact duplication; some things are approximated, others are simplified, and some things are ignored - there is no such thing as perfect, complete, and correct data. To ensure that data is not misused, the assumptions and limitations affecting the collection of the data must be fully documented. Metadata allows a producer to fully describe a dataset; users can understand the assumptions and limitations and evaluate the dataset's applicability for their intended use.
Increasingly, the producer is not the user: Most geospatial data is used multiple times, perhaps by more than one person. Typically, it is produced by one individual or organization and used by another. Proper documentation will provide those not involved with data production with a better understanding of the data and enable them to use it properly. As geospatial data producers and users handle more and more data, proper documentation will provide them with a keener knowledge of their holdings and allow them to better manage data production, storage, updating, and reuse.
The Metadata Environment
Metadata is required in at least four different circumstances, and perhaps in a different form for each: in a catalog for the discovery and locating of geospatial data; within a database management system or within a dataset, to be used by application software operating on geospatial data; in a historical archive; and in a human readable form to allow users to understand and get a feel for the data they are using.
Catalogs: Metadata for cataloging purposes should be in a form not unlike a library card catalog or on-line catalog. Metadata in a catalog should support searches by subject matter/area coverage/theme, author/producer, detail/resolution/scale, currency/date, data structure/form, and physical form/media.
Historical Records: Metadata should support the documentation of data holdings to facilitate storage, updates, production management, and maintenance of geospatial data. Historical records will provide legal documentation to protect an organization if conflict arises over the use or misuse of geospatial data.
Within a geospatial dataset: Metadata should accompany a dataset and be in a form to support the proper application of geospatial data. GIS and other application software using data will need to evaluate data as it applies to a situation. In this form the metadata may be incorporated into the structure of the data itself.
Human readable: Metadata in a form in which a computer can locate and sort information, or manage the warehousing, production, and use by application software, will greatly enhance the use of geospatial data, but eventually a human must understand the data. One person’s or organization’s geospatial data is a subjective, abstract view of the real world; it must be understood by others to ensure the data is used correctly. Metadata needs to be in a form which can be readily and thoroughly understood by users.
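The catalog searches described above can be illustrated with a minimal Python sketch. The field names, the sample entries, and the search function are hypothetical and are not taken from any metadata standard; area coverage is left out to keep the example short. The sketch only shows how a few of the listed criteria (theme, producer, scale, date, media) might be combined in one query.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CatalogEntry:
    """One catalog record. Field names are illustrative assumptions."""
    title: str
    theme: str
    producer: str
    scale_denominator: int  # 50000 means a scale of 1:50,000
    published: date
    media: str


def search(entries, theme=None, producer=None,
           max_scale_denominator=None, published_since=None, media=None):
    """Return the entries that satisfy every criterion that was supplied."""
    results = []
    for entry in entries:
        if theme is not None and entry.theme != theme:
            continue
        if producer is not None and entry.producer != producer:
            continue
        if (max_scale_denominator is not None
                and entry.scale_denominator > max_scale_denominator):
            continue
        if published_since is not None and entry.published < published_since:
            continue
        if media is not None and entry.media != media:
            continue
        results.append(entry)
    return results


# Made-up sample holdings, for illustration only.
CATALOG = [
    CatalogEntry("Harbor Chart", "hydrography", "Agency A",
                 25000, date(1996, 5, 1), "CD-ROM"),
    CatalogEntry("Regional Roads", "transportation", "Agency B",
                 250000, date(1994, 3, 15), "tape"),
    CatalogEntry("Coastal Overview", "hydrography", "Agency A",
                 500000, date(1992, 8, 30), "CD-ROM"),
]

# Hydrographic data at 1:100,000 or more detailed.
hits = search(CATALOG, theme="hydrography", max_scale_denominator=100000)
print([entry.title for entry in hits])
```

Each criterion is optional, so the same function serves a narrow query and a broad browse of the holdings.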
Metadata Supported Applications
Metadata supports many applications, which can be classified into four primary functions:
Locate: Metadata enables users to locate geospatial information and allows producers to "advertise" their data. Metadata helps organizations locate data outside the organization and find partners to share in data collection and maintenance.
Evaluate: By having proper metadata elements describing a dataset, users will be able to determine its "fitness for an intended use." Understanding the quality and accuracy, the spatial and temporal schema, the content, and the spatial reference system used allows users to determine if a dataset fills their needs. Metadata also provides the size, format, media, price, and restrictions on use.
Extract: After locating a dataset and determining if it meets users' needs, metadata is used to describe how to access a dataset and transfer it to a specific site. Once it has been transferred, users will need to know how to process and interpret the data and incorporate it into their holdings.
Employ: Metadata is needed to support the processing and the application of a dataset. Metadata facilitates utilization of data, allowing users to properly merge and combine data with their own, apply it properly, and have a full understanding of its properties and limitations.
STANDARDS FOR GLOBAL INTEROPERABILITY
In late 1994, the International Organization for Standardization formed a Technical Committee, ISO/TC211, to establish standards for Geographic Information. The work is being performed in five working groups. The standards development effort is further divided into 20 Work Items with a project leader for each item. Working Group 1 is developing the high-level conceptual design - integrating the geographic standard with existing and developing information technology standards; Working Group 2 is standardizing the spatial schema - how geographic things are modeled; Working Group 3 is standardizing geographic data administration - the management and description of geographic data; Working Group 4 is standardizing geospatial services; and Working Group 5 is developing standards for profiles - how to develop and register profiles. Many of the major geographic standards existing today will probably become registered profiles of the ISO standard. The 20 Work Items and the development schedule for the standard are shown in Figure 1.
Figure 1.
ISO 15046-15 Geographic Information - Metadata
The ISO metadata standard is being developed within Working Group 3. It defines and standardizes a comprehensive set of metadata elements, their characteristics, and the schema necessary to fully document geographic data. The standard applies to all geographic data: dataset series, datasets, individual geographic features, and their attributes. The standard defines metadata for two levels of conformance. Conformance Level 1 metadata is a minimum set of metadata elements (50) which supports the cataloging of datasets for discovery, especially in on-line catalogs and clearinghouses; it is actually a profile or subset of the full metadata set. Conformance Level 2 metadata is a complete inventory of the metadata required to fully describe geospatial data. Many of these elements are optional and, when used, will standardize metadata down to the lowest level of detail. For ease of understanding, the metadata is divided into sections: Identification Information, Data Quality Information, Lineage Information, Reference System Information, Spatial Representation Information, Feature Catalog Information, Distribution Information, and Metadata Reference Information.
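As a rough illustration of the sections and of Conformance Level 1, the Python sketch below stores a metadata record as a dictionary keyed by the section names listed above and reports which required elements are missing. The four required elements are hypothetical stand-ins chosen for the example; they are not the standard's actual Level 1 list, and the element names are assumptions.

```python
# Section names as listed in the text above.
SECTIONS = (
    "Identification Information",
    "Data Quality Information",
    "Lineage Information",
    "Reference System Information",
    "Spatial Representation Information",
    "Feature Catalog Information",
    "Distribution Information",
    "Metadata Reference Information",
)

# Hypothetical stand-in for the Level 1 element list. The real list
# is defined by the standard and is not reproduced here.
LEVEL_1_REQUIRED = (
    ("Identification Information", "title"),
    ("Identification Information", "geographic_extent"),
    ("Distribution Information", "format"),
    ("Metadata Reference Information", "metadata_date"),
)


def missing_level_1(record):
    """List the (section, element) pairs that are absent or empty."""
    missing = []
    for section, element in LEVEL_1_REQUIRED:
        value = record.get(section, {}).get(element)
        if value is None or value == "":
            missing.append((section, element))
    return missing


# Made-up record with only the identification section filled in.
record = {
    "Identification Information": {
        "title": "Harbor Chart",
        "geographic_extent": "hypothetical bounding box",
    },
}

for section, element in missing_level_1(record):
    print(f"missing: {section} / {element}")
```

A record with an empty result would satisfy this stand-in for Level 1; the optional Level 2 elements would simply be additional keys under the same sections.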
Because of the diversity of geographic data, no single set of metadata elements will satisfy all requirements. For this reason the ISO metadata standard provides a standard way for users to extend their metadata and still ensure interoperability. By using standard methods other users will be able to understand and use this extended metadata.
An Informative Section provides users with ideas for implementing metadata and provides sample metadata sets and guidelines for use.
Development Methodology
Development of the metadata standard did not need to start from scratch. Geographic information has been in use for many years, and experience has been growing in this field. Many national, regional and special-use groups have developed standards and methods for transferring and handling this type of data. These standards evolved in separate niches and in many cases are incompatible, which highlights the need for an international standard. However, the experience gained in the development and use of these standards was invaluable in the development of the ISO metadata standard. The initial ISO standard was based on the ANZLIC Working Group on Metadata: Core Metadata Elements, the Canadian Directory Information Describing Digital Geo-referenced Data Sets, the European Committee for Standardisation (CEN) Standard for Geographic Information - Metadata, and the US Federal Geographic Data Committee (FGDC) Content Standard for Geospatial Metadata. Many transfer/exchange standards also carry metadata. These transfer standards were also examined and provided input to the ISO standard. Some of the transfer standards which provided metadata elements were: the Digital Geographic Information Exchange Standard (DIGEST), the International Hydrographic Organization Special Publication 57, and the Spatial Data Transfer Standard (SDTS). Experienced users of these standards also added input as to what worked, what didn’t work, and what was important. Early drafts of the metadata standard were reviewed by Project Experts representing countries from around the world. These individuals represented the producer and user communities, so the metadata requirements and experiences of both communities were incorporated into the developing standard.
As proposed metadata elements were considered for incorporation into the standard, they were examined as to their utility in supporting the applications in the environments mentioned in the previous paragraphs of this paper. All candidate metadata elements were compared against a Metadata Usage Reference Matrix (Table 1) before inclusion into the metadata standard.
|          | Catalog | Within Dataset | Historical Record | Human Readable |
| Locate   | X       | X              | X                 |                |
| Evaluate | X       | X              | X                 | X              |
| Extract  | X       | X              |                   |                |
| Employ   | X       | X              |                   |                |
Table 1. Metadata Usage Reference Matrix
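The screening of candidate elements against the matrix can be sketched in Python as a simple lookup. Two things here are assumptions. First, the marks in USAGE_MATRIX follow Table 1 read in row order and may not match the original cell placement. Second, the paper does not state the acceptance rule, so the sketch assumes a candidate passes when at least one of its claimed uses falls on a marked cell.

```python
ENVIRONMENTS = ("Catalog", "Within Dataset", "Historical Record", "Human Readable")

# Assumed reading of Table 1: function -> environments marked with an X.
USAGE_MATRIX = {
    "Locate": {"Catalog", "Within Dataset", "Historical Record"},
    "Evaluate": {"Catalog", "Within Dataset", "Historical Record", "Human Readable"},
    "Extract": {"Catalog", "Within Dataset"},
    "Employ": {"Catalog", "Within Dataset"},
}


def supported_uses(claimed_uses, matrix=USAGE_MATRIX):
    """Keep the (function, environment) pairs that the matrix marks."""
    return [(function, environment)
            for function, environment in claimed_uses
            if environment in matrix.get(function, set())]


def passes_screening(claimed_uses, matrix=USAGE_MATRIX):
    """Assumed rule: at least one claimed use must be a marked cell."""
    return bool(supported_uses(claimed_uses, matrix))


# A hypothetical candidate element and the uses claimed for it.
claimed = [("Evaluate", "Human Readable"), ("Employ", "Historical Record")]
print(supported_uses(claimed))
print(passes_screening(claimed))
```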
Community Profile
The ISO metadata standard defines almost 300 metadata elements, most of which are listed as "optional." Individual communities, nations, or organizations will develop "community profiles" of the ISO standard. They will make a select set of metadata elements mandatory for their applications or special needs. For example, the "price" of a dataset may be established as "mandatory" for a certain community which will always want that metadata element reported. A community of users may also want to establish additional metadata elements which are not in the ISO standard. For example, a community may want to develop metadata elements for the status of datasets within their own system to help manage production. However, these added elements will not be known outside the community unless they are published. Profiles may be applied within an organization, nationally, or internationally. International profiles are registered within ISO as International Standardized Profiles (ISP).
Community profiles may also establish field sizes and domains for metadata elements. If one system within a community uses 32 characters for the title of a dataset and another system handles only 8 characters, interoperability will not be achieved. Standardizing selected domains within a community is also important, allowing more efficient searches and less chance of misunderstanding of the data.
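The ideas in this section (elements made mandatory, community extensions, agreed field sizes, and agreed domains) can be pictured with a short Python sketch. The class, the element names, and the base element set are hypothetical; only the "price" and dataset status examples and the 32-character title come from the text above.

```python
from dataclasses import dataclass, field


@dataclass
class CommunityProfile:
    """A hypothetical profile layered over a base set of elements."""
    name: str
    mandatory: set = field(default_factory=set)      # elements the community requires
    extensions: set = field(default_factory=set)     # elements added by the community
    max_lengths: dict = field(default_factory=dict)  # element -> maximum characters
    domains: dict = field(default_factory=dict)      # element -> allowed values

    def validate(self, record, base_elements):
        """Return a list of problems; an empty list means the record conforms."""
        problems = []
        for element in sorted(self.mandatory):
            if record.get(element) in (None, ""):
                problems.append(f"missing mandatory element: {element}")
        for element, value in record.items():
            if element not in base_elements and element not in self.extensions:
                problems.append(f"unknown element: {element}")
            limit = self.max_lengths.get(element)
            if limit is not None and isinstance(value, str) and len(value) > limit:
                problems.append(f"{element} exceeds {limit} characters")
            allowed = self.domains.get(element)
            if allowed is not None and value not in allowed:
                problems.append(f"{element} value {value!r} not in domain")
        return problems


# Stand-in for the elements defined by the base standard (assumed names).
BASE_ELEMENTS = {"title", "price", "producer"}

profile = CommunityProfile(
    name="example community",
    mandatory={"title", "price"},
    extensions={"production_status"},
    max_lengths={"title": 32},
    domains={"production_status": {"planned", "in_work", "complete"}},
)

record = {
    "title": "A dataset title that runs well past the agreed limit",
    "production_status": "complete",
}
print(profile.validate(record, BASE_ELEMENTS))
```

Publishing such a profile, including its extension elements, is what lets systems outside the community interpret the extra fields.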
SUMMARY
Metadata has always played an important role in cartography. For centuries it has provided users with an understanding of maps. It has allowed us to find the maps we need in map catalogs. Metadata is equally important as we have moved into the digital environment. Because digital data is an imperfect representation of the real world, assumptions made during production need to be understood by users. With the proliferation of data from an ever widening array of sources and producers, we need information to control and manage geographic data. Metadata is absolutely essential to spatial data infrastructures, networks, and warehouses. Proper metadata will allow users across networks to locate, evaluate, extract, and employ geographic data. Metadata adhering to an international standard will allow global networks to operate, providing a common global understanding of geographic data and promoting global interoperability.
REFERENCES