IUSSP-CODATA FAIR Vocabularies Working Group

  

The IUSSP Council has approved a Scientific Panel on FAIR Vocabularies for Population Research, a joint initiative with CODATA, the Committee on Data of the International Science Council. The working group is co-chaired by George Alter (University of Michigan, IUSSP), together with Arofan Gregory (DDI Alliance) and Steven McEachern (Australian National University and DDI Alliance) from CODATA.

 

This Panel responds to the growing movement to make data “Findable, Accessible, Interoperable, and Reusable” (FAIR). Population research is an empirically focused field with a long tradition of widely shared, easily accessible data collections. The FAIR Principles point to ways that this tradition can be enhanced by taking advantage of emerging standards and technologies. The Panel will focus on developing FAIR vocabularies for population data, an essential step in making data reusable and interoperable.

 

FAIR vocabularies yield benefits when data from different sources must be combined. Consider the most basic variable in demographic analysis: age. The OECD has a list of 643 age categories, while the UN Population Division copes with more than 1100 age groups. If the meanings of variables in a dataset are available only through human-readable documentation, such as a PDF, harmonizing data from two providers will remain a tedious manual process. However, if the age categories are linked to persistent identifiers in machine-actionable metadata, software can be written to harmonize age groupings automatically. If these operations are performed across dozens of variables in hundreds of data sources, enormous amounts of human time will be saved.
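
As a rough illustration of the harmonization step described above, the following Python sketch aggregates counts from two providers into a common set of age groups. Everything in it is an assumption made for illustration: the `https://example.org/age/...` identifiers, the `AGE_VOCAB` lookup table, and the `harmonize` function are hypothetical and do not come from the OECD, the UN Population Division, or any existing vocabulary service. In practice the age bounds would be retrieved from published machine-actionable metadata rather than hard-coded.

```python
# Hypothetical vocabulary: persistent identifier -> (min_age, max_age), inclusive.
# In a real setting these bounds would be read from published metadata.
AGE_VOCAB = {
    "https://example.org/age/Y0T4": (0, 4),
    "https://example.org/age/Y5T9": (5, 9),
    "https://example.org/age/Y10T14": (10, 14),
    "https://example.org/age/Y15T19": (15, 19),
    "https://example.org/age/Y0T9": (0, 9),
    "https://example.org/age/Y10T19": (10, 19),
}


def harmonize(counts, target_ids, vocab=AGE_VOCAB):
    """Sum counts keyed by source age-group identifier into target groups.

    A source group is assigned to the single target group that fully
    contains it. If no target group, or more than one, contains it,
    a ValueError is raised instead of guessing how to split the count.
    """
    totals = {target: 0 for target in target_ids}
    for source_id, value in counts.items():
        low, high = vocab[source_id]
        matches = [
            target
            for target in target_ids
            if vocab[target][0] <= low and high <= vocab[target][1]
        ]
        if len(matches) != 1:
            raise ValueError(f"cannot map {source_id} onto the target groups")
        totals[matches[0]] += value
    return totals


# Illustrative (made-up) counts: provider A reports five-year groups,
# provider B reports ten-year groups.
provider_a = {
    "https://example.org/age/Y0T4": 100,
    "https://example.org/age/Y5T9": 120,
    "https://example.org/age/Y10T14": 90,
    "https://example.org/age/Y15T19": 80,
}
provider_b = {
    "https://example.org/age/Y0T9": 210,
    "https://example.org/age/Y10T19": 160,
}

TARGET_GROUPS = [
    "https://example.org/age/Y0T9",
    "https://example.org/age/Y10T19",
]

harmonized_a = harmonize(provider_a, TARGET_GROUPS)
harmonized_b = harmonize(provider_b, TARGET_GROUPS)
```

Because both results are keyed by the same identifiers, they can be compared or merged directly. The sketch deliberately refuses to split a source group that straddles two target groups, since doing so would require assumptions about the age distribution within the group.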

 

In cooperation with CODATA, this new IUSSP Panel will build upon the work of the FAIR Vocabularies Group, which recently released “Ten Simple Rules for making a vocabulary FAIR”. Most of its guidance is straightforward, like “Determine the governance arrangements and custodian responsible for the legacy vocabulary.” But some steps require specialized expertise in standards like the Simple Knowledge Organization System (SKOS) or the Web Ontology Language (OWL). FAIR vocabularies will also need to be maintained, which requires sustainable institutions with the capacity to support the necessary technologies. The Panel will be advised by members of the FAIR Vocabularies Group, which is chaired by Simon Cox (CSIRO Australia), and experts from other scientific domains will be invited to evaluate alternative strategies (e.g. centralized versus federated) and software.

 

The operational goal will be to work with three to five partners in international organizations and academia to bring their existing vocabularies into line with the FAIR Principles. The group will give special attention to coordinating with existing initiatives, like the terminology repository supported by Statistical Data and Metadata eXchange (SDMX).

 

The ultimate goal of this initiative is to make demographic data more interoperable by publishing controlled vocabularies that can be found and acted upon by software. This has the potential to vastly reduce the cost to researchers of merging population data from multiple sources. The Panel will learn where additional technical development is needed and when community involvement through IUSSP and other organizations is beneficial. A two-year work plan is envisioned.

 

Members interested in learning more about this new initiative or participating in the work of this Panel should contact George Alter (FAIRvocab@iussp.org).