The GDC Blog
This is the first in a series of dialogues, tips & tools to help understand what “good data” is and how to license it. Our goal is to help develop and share best practice information within the community of users with activities outside of their traditional home markets. The topics will usually fall into one of three areas: Data Sourcing, Data Licensing or Data Management.
While we are highly knowledgeable about international data & technology, we are not professional bloggers, so if we violate some general standards, please be gentle.
Data Sourcing: The Definition of Good Data
In past conversations with numerous customers, a key issue has been determining how good the data is.
There is no generally accepted standard for measuring the quality of international reference data. Many data suppliers, and many technologies that use international reference data, have created their own ambiguous standards tied to their specific datasets. The intent is to simplify what can be a complex topic. Typically a supplier presents a score from A+ to C, with A+ defined as the “Best Data” and C as “As Good As It Gets Data.” Customers should remember that the “Best Data” standard may really mean “The Best the Supplier Can Get.” Because each scale is defined by the supplier, customers cannot easily compare solutions between vendors. As a result, one vendor may score an A+ and another an A-, while neither offers a true level of completeness for the country in question.
How to solve this problem? When discussing quality of data with a vendor, start by asking these four questions:
When was the dataset created?
How many sources comprise the dataset?
Are the sources governmental, private or both?
How many unique records are in the dataset?
In our next entry we will discuss how the answers to these questions can be used to better assess whether you have the right vendor for your needs.