The FAIR Principles for research data are gaining attention with the development of open science and data-driven research. “FAIR” is an acronym for four words: “Findable,” “Accessible,” “Interoperable,” and “Reusable.” As it becomes widely accepted that data services should meet these four criteria at a high level, DIAS also recognizes the importance of enhancing the value of its data services with reference to the FAIR Principles.
Metadata is the critical information that makes data findable. What is the data about, how and when was it obtained, who or what organization is responsible for it, what kind of pre-processing or post-processing did they apply to it, and in what format is it provided? With this information, data can be searched, sorted, and used. In other words, metadata is created not for the data creators and providers but for the “others,” the data users. DIAS provides the DIAS metadata management system to support metadata entry, and our data curators review the metadata to improve the Findability of the data.
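The questions above map naturally onto the fields of a metadata record. The following is a minimal sketch, assuming a hypothetical record layout: the class name, field names, and sample values are illustrative only and are not the actual schema of the DIAS metadata management system.

```python
from dataclasses import dataclass, asdict


@dataclass
class DatasetMetadata:
    """Hypothetical metadata record; the field names are illustrative, not the DIAS schema."""
    title: str              # what the data is about
    acquisition: str        # how and when it was obtained
    responsible_party: str  # who or what organization is responsible for it
    processing: str         # pre-processing or post-processing applied
    data_format: str        # format in which the data is provided


def matches(record: DatasetMetadata, keyword: str) -> bool:
    """Case-insensitive keyword search across every metadata field."""
    needle = keyword.lower()
    return any(needle in str(value).lower() for value in asdict(record).values())


# Made-up sample records, used only to exercise the search.
records = [
    DatasetMetadata("Example precipitation grid", "rain gauges, 2001-2010",
                    "Example Institute", "quality-controlled, gridded", "NetCDF"),
    DatasetMetadata("Example river discharge", "stream gauges, 1995-2005",
                    "Example Agency", "daily means", "CSV"),
]

# Searching and sorting work only because the metadata exists.
hits = sorted(r.title for r in records if matches(r, "netcdf"))
print(hits)
```

A user who knows nothing about the data creators can still find the first dataset by its format alone, which is the sense in which metadata is written for the "others."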
Furthermore, DIAS is enhancing the findability of data by promoting metadata sharing with external services.
- First, by providing metadata created by DIAS to external services, DIAS datasets also become findable through those services.
- Second, by collecting metadata created outside DIAS, we make the corresponding external datasets findable from the DIAS Dataset Search and Discovery system.
- Third, by utilizing the DIAS metadata infrastructure, it is possible to create metadata about datasets held outside DIAS and make them findable.
Thus, in cooperation with data creators, DIAS publishes metadata, the content linked to datasets, and serves as an infrastructure for making a wide range of global environmental datasets findable.
Provision of DIAS metadata to external services
This section introduces three services to which DIAS provides the metadata it creates, so that DIAS datasets become findable outside DIAS.
Providing metadata to the GEOSS Platform operated by the Group on Earth Observations (GEO)
GEOSS is an international initiative to make wide use of Earth observation data for solving social issues. The GEOSS Platform is a system that automatically collects (“harvests”) metadata of Earth observation data released by organizations in various countries, and DIAS metadata is part of that collection. The metadata collected by the GEOSS Platform can be searched in the GEOSS Portal, which makes DIAS datasets findable by the international research community.
Providing metadata to DataCite for DOI assignment
There is a growing trend in open science to assign identifiers to all academic resources, including papers and datasets. In many academic fields it is already common to include DOIs in the citations of academic papers to keep the cited resources accessible over the long term. Citing datasets by DOI will likewise become more common in the future. Including DOIs in citations will play an essential role in making the contributions of various research activities visible as citation relationships. Therefore, DIAS also actively assigns DOIs to datasets. For datasets whose providers request it, DIAS supplies metadata to DataCite in order to assign DataCite DOIs. A variety of DOI-based services use this metadata. Typical examples include research achievement management services such as ORCID and researchmap, and research achievement search services such as CiNii Research. Furthermore, DIAS has developed Mahalo Button as a service that utilizes DOIs, and is expanding it from a tool for understanding dataset usage through DOIs into a service that visualizes the contributions of data creators.
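To illustrate how a DOI ends up in a citation, here is a minimal sketch that turns a DOI into a resolvable link and formats a dataset citation around it. The function names, the citation layout, and the sample values (including the placeholder DOI `10.5555/example-dataset`) are assumptions made for illustration; they are not taken from DIAS or prescribed by DataCite.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"  # public DOI resolver


def doi_url(doi: str) -> str:
    """Build a resolvable link, percent-encoding characters that are unsafe in URLs.

    The "/" separating the DOI prefix from the suffix is left unencoded.
    """
    return DOI_RESOLVER + quote(doi.strip(), safe="/")


def cite_dataset(creator: str, year: int, title: str, publisher: str, doi: str) -> str:
    """Format a dataset citation; this layout is one plausible style, not a mandated one."""
    return f"{creator} ({year}). {title}. {publisher}. {doi_url(doi)}"


# Placeholder values only; the DOI below is not a real DIAS identifier.
citation = cite_dataset(
    creator="Example Institute",
    year=2020,
    title="Example precipitation grid",
    publisher="Example Repository",
    doi="10.5555/example-dataset",
)
print(citation)
```

Because the link goes through the resolver rather than pointing at a specific server, the citation keeps working even if the dataset's publication page moves.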
Support for Google Dataset Search
Google Dataset Search works as follows: publishers embed appropriate metadata in their dataset publication pages, and Google crawls those pages and incorporates the metadata into its search engine. DIAS datasets are already searchable via Google Dataset Search, which is expected to make them discoverable by users in a wide range of fields.
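The embedding step can be sketched as follows. This example assumes the schema.org `Dataset` vocabulary serialized as JSON-LD, a markup that Google Dataset Search reads; the source does not state which markup DIAS pages actually use, and the function names, property selection, and sample values here are hypothetical.

```python
import json
from typing import Optional


def dataset_jsonld(name: str, description: str, url: str,
                   doi: Optional[str] = None) -> dict:
    """Build a schema.org Dataset description (illustrative subset of properties)."""
    record = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
    }
    if doi:
        record["identifier"] = "https://doi.org/" + doi
    return record


def script_tag(record: dict) -> str:
    """Wrap the record in the script element that goes into the publication page."""
    body = json.dumps(record, indent=2, ensure_ascii=False)
    # Escape "</" so a string value cannot close the script element early;
    # "\/" is a valid JSON escape for "/".
    body = body.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Placeholder values only; not a real DIAS dataset page.
tag = script_tag(dataset_jsonld(
    name="Example precipitation grid",
    description="Gridded daily precipitation, provided as an illustrative record.",
    url="https://example.org/datasets/precipitation",
    doi="10.5555/example-dataset",
))
print(tag)
```

A crawler that visits the page can parse this block without understanding the page layout, which is why the publisher only has to maintain the metadata and not a separate submission to the search engine.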
Finally, we want to discuss metadata licensing, which is crucial when providing metadata to external services. The external services mentioned above show that metadata is the material that makes datasets findable, so it is essential to guarantee that external services are free to use it. Therefore, DIAS licenses metadata under Creative Commons CC0, which allows use under conditions as close as possible to the public domain. We hope this will lead to greater use of metadata. However, since the purpose of promoting the use of metadata is to make datasets findable, the licenses and conditions of use set by the data providers still apply to downloading the datasets themselves.
Collection of external metadata
DIAS does not only provide metadata to external services; it also operates a service that collects metadata from external services in cooperation with them. Currently, DIAS links with four services below:
You can search metadata collected from these services in the DIAS Search and Discovery system. The search results indicate that the dataset originates from the linked system, and a link to the source dataset publication page is displayed. If your institution is considering a new linkage, don’t hesitate to contact DIAS.
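The behaviour described above, where each search result carries its originating system and a link back to the source page, can be sketched as a small data structure. Everything in this example is hypothetical: the class and function names, the source labels "Partner A" and "Partner B", and the sample records are assumptions for illustration and do not describe the actual DIAS Search and Discovery implementation.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Record = Tuple[str, str]  # (title, URL of the dataset publication page)


@dataclass(frozen=True)
class SearchEntry:
    title: str
    source: str        # system the metadata originates from
    landing_page: str  # link to the source dataset publication page


def build_index(local: List[Record], harvested: Dict[str, List[Record]]) -> List[SearchEntry]:
    """Merge local records with records collected from each linked system."""
    entries = [SearchEntry(title, "DIAS", url) for title, url in local]
    for source, records in harvested.items():
        entries.extend(SearchEntry(title, source, url) for title, url in records)
    return entries


def search(entries: List[SearchEntry], keyword: str) -> List[SearchEntry]:
    """Case-insensitive title search that preserves the provenance of each hit."""
    needle = keyword.lower()
    return [e for e in entries if needle in e.title.lower()]


# Made-up records from two hypothetical linked systems.
index = build_index(
    local=[("Example precipitation grid", "https://example.org/dias/precipitation")],
    harvested={
        "Partner A": [("Example ocean temperature", "https://example.org/a/ocean")],
        "Partner B": [("Example land cover map", "https://example.org/b/landcover")],
    },
)
results = search(index, "ocean")
for entry in results:
    print(f"{entry.title} [source: {entry.source}] {entry.landing_page}")
```

Keeping the source label and landing page on every entry lets the search interface send users to the originating system instead of presenting collected metadata as if it were local.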
Use case of DIAS metadata infrastructure
In addition to the linkages between DIAS and external services described so far, DIAS has also served as a base for providing metadata about external datasets. DIAS has created and released metadata for the datasets created and managed in the projects listed in the “Action Plan for Earth Observation in Japan” compiled by MEXT.
In this way, the metadata infrastructure that DIAS has built to date offers many functions for creating, collecting, and distributing metadata, and can serve as a distribution hub for metadata in the global environment field. If you have any new requests for service linkages of this kind or for use of the metadata infrastructure, please consult the DIAS office.