Dedicated Dashboard, another tool to improve the data quality in Wikidata
At Wikimania 2018, Lydia Pintscher, Product Manager for Wikidata at Wikimedia Deutschland (WMDE), presented a cool poster about data quality tools for Wikidata. Several tools are cited, including:
- built-in tools in MediaWiki, fully integrated into Wikidata, like Watchlist, Recent changes, and Page information;
- Constraints, an implementation of business rules in Wikidata;
- Wikidata vandalism dashboard, a dashboard using ORES to find possible vandalism by language;
- Listeria, a bot that generates lists using a SPARQL query to Wikidata (example);
- sparql-rc, a tool to track changes in Wikidata;
- EditGroups, a user interface to monitor sets of changes in Wikidata.
In my opinion, this list lacks tools dedicated to a specific topic, like the dashboard I made about French members of parliament (MPs), whose first version was released in 2017.
A dedicated dashboard
The goal of this tool is to provide a quick and convenient way to monitor data quality about the members of the French National Assembly in the Fifth Republic. It is divided into two components:
- a dashboard, that gives numbers based on several criteria;
- a list of MPs when you click on a cell of the dashboard.
Each row of the dashboard represents a business rule: the first rows provide general statistics (for example, the number of MPs) and the following ones flag data quality issues (for example, MPs without France as country of citizenship). To make this set easier to work on, the data is broken down by time, and each column of the dashboard represents a legislative term.
When you click on a cell, the MPs meeting both criteria (the rule of the row and the legislative term of the column) are listed. For each MP, the tool provides a link to the Wikidata item, to the French Wikipedia, and to useful third-party databases, like that of the French National Assembly. This allows Wikidata contributors to easily find data quality issues on this topic and to quickly fix them with reliable sources.
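To make the row and column model concrete, here is a minimal sketch in Python. It is not the code of the actual tool: the `MP` record, the `SAMPLE` data (made-up placeholders, not real MPs), and the function names are all assumptions for illustration. Each rule is a predicate, each cell is the number of MPs of one legislative term matching one rule, and clicking a cell corresponds to listing those same MPs.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class MP:
    """Hypothetical, simplified record for one MP in one legislative term."""
    name: str
    term: int
    citizenships: tuple[str, ...]


# Made-up placeholder records, only here to exercise the functions below.
SAMPLE = [
    MP("MP A", 14, ("France",)),
    MP("MP B", 14, ()),
    MP("MP C", 15, ("France",)),
    MP("MP D", 15, ()),
    MP("MP E", 15, ()),
]

# One entry per dashboard row: a label and the rule it checks.
RULES: dict[str, Callable[[MP], bool]] = {
    "Number of MPs": lambda mp: True,
    "MPs without France as country of citizenship":
        lambda mp: "France" not in mp.citizenships,
}


def cell_members(mps, rule, term):
    """MPs listed when a cell is clicked: they match the row AND the column."""
    return [mp for mp in mps if mp.term == term and rule(mp)]


def build_dashboard(mps, rules, terms):
    """Return {row label: {term: count}}, one count per cell."""
    return {
        label: {term: len(cell_members(mps, rule, term)) for term in terms}
        for label, rule in rules.items()
    }


dashboard = build_dashboard(SAMPLE, RULES, [14, 15])
```

In this sketch, the count shown in a cell and the list behind it come from the same function, so they cannot disagree. In the real tool the two are refreshed at different times, so a cell can briefly show a number that no longer matches its list.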
Statistics in the main dashboard are updated once a day, whereas the lists of MPs are fetched live from the Wikidata Query Service (WDQS, which has a 5-minute cache).
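As an illustration, the query behind one cell could look like the sketch below, written in Python. This is an assumption about the shape of such a query, not the one the tool actually runs. The Wikidata identifiers are my best understanding and should be checked on Wikidata before use, and `term_qid` stands for the item ID of a legislative term, which you have to supply yourself. The same function produces either the count (for the daily statistics) or the list (for the live view).

```python
from urllib.parse import urlencode

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# Assumed Wikidata identifiers: verify them on Wikidata before relying on them.
POSITION_HELD = "P39"
PARLIAMENTARY_TERM = "P2937"
COUNTRY_OF_CITIZENSHIP = "P27"
FRENCH_MP = "Q3044918"  # assumed: member of the French National Assembly
FRANCE = "Q142"


def cell_query(term_qid: str, count_only: bool = False) -> str:
    """SPARQL for 'MPs of a given term without France as country of citizenship'."""
    if count_only:
        select = "(COUNT(DISTINCT ?mp) AS ?count)"
        labels = ""
    else:
        select = "DISTINCT ?mp ?mpLabel"
        labels = 'SERVICE wikibase:label { bd:serviceParam wikibase:language "fr,en". }'
    return f"""
SELECT {select} WHERE {{
  ?mp p:{POSITION_HELD} ?statement .
  ?statement ps:{POSITION_HELD} wd:{FRENCH_MP} ;
             pq:{PARLIAMENTARY_TERM} wd:{term_qid} .
  FILTER NOT EXISTS {{ ?mp wdt:{COUNTRY_OF_CITIZENSHIP} wd:{FRANCE} . }}
  {labels}
}}
""".strip()


def wdqs_url(query: str) -> str:
    """URL that asks WDQS to run the query and return JSON results."""
    return f"{WDQS_ENDPOINT}?{urlencode({'query': query, 'format': 'json'})}"
```

Calling `wdqs_url(cell_query(term_qid, count_only=True))` gives a URL whose result could be stored once a day, while the same call with `count_only=False` gives the list to fetch when a cell is clicked.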
Generalizing
I generalized the features of this tool into a new one, Dedicated Dashboard, which allows you to set up your own dashboard on the topic of your choice, using SPARQL queries to populate it. Several examples are provided:
Compared with the original tool, only the links to Wikimedia projects and third-party databases are not yet implemented. The tool also needs a way to list existing dashboards so that users can easily manage them (at the moment, the configuration of a dashboard is stored in its URL).
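Storing a configuration in a URL can be sketched as follows in Python. The encoding shown here (JSON, then URL-safe Base64, in a `config` parameter) is purely an assumption to illustrate the idea, and `BASE_URL` is a placeholder: the format Dedicated Dashboard really uses is not described here and may well differ.

```python
import base64
import json
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "https://example.org/dashboard"  # placeholder, not the real tool URL


def config_to_url(config: dict) -> str:
    """Pack a dashboard configuration into a shareable URL."""
    raw = json.dumps(config, separators=(",", ":"), sort_keys=True).encode("utf-8")
    token = base64.urlsafe_b64encode(raw).decode("ascii")
    return f"{BASE_URL}?{urlencode({'config': token})}"


def url_to_config(url: str) -> dict:
    """Recover the configuration from a URL built by config_to_url."""
    token = parse_qs(urlsplit(url).query)["config"][0]
    return json.loads(base64.urlsafe_b64decode(token))


# Hypothetical configuration: a title, one SPARQL query per row, and the columns.
example = {
    "title": "French MPs",
    "rows": [{"label": "Number of MPs", "sparql": "SELECT ..."}],
    "columns": ["term 14", "term 15"],
}
url = config_to_url(example)
```

The advantage of this approach is that a dashboard needs no server-side storage and can be shared as a simple link. The drawback is the one noted above: with nothing stored, there is nothing to list, and long SPARQL queries make for very long URLs.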
Happy birthday Wikidata 😉