Wikidata, the first new project to emerge from the Wikimedia Foundation since 2006, is now beginning development. The organization, best known for its user-edited encyclopedia Wikipedia, recently announced the new project at February’s Semantic Tech & Business Conference in Berlin, describing Wikidata as a new effort to provide a database of knowledge that can be read and edited by humans and machines alike.
There have been other attempts at creating a semantic database built from Wikipedia’s data before – for example, DBpedia, a community effort to extract structured content from Wikipedia and make it available online. The difference is that, with Wikidata, the data won’t just be made available, it will also be made editable by anyone.
Developing a semantic, machine-readable database doesn’t just help push the web forward; it also helps Wikipedia itself. The data will bring all the localized versions of Wikipedia on par with each other in terms of the basic facts they house. Today, the English, German, French and Dutch versions offer the most coverage, with other languages falling much further behind.
Wikidata will also enable users to ask different types of questions, such as “Which of the world’s ten largest cities have a female mayor?” Queries like this are today answered by user-created Wikipedia lists – that is, manually created structured answers. Wikidata, on the other hand, will be able to create these lists automatically.
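To make the idea concrete, here is a minimal Python sketch of how such a list could be generated from structured records rather than compiled by hand. It is an illustration only: the `City` record, the `largest_cities_with_female_mayor` function, and the placeholder data are all assumptions made for this example, not Wikidata's actual data model or query interface.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class City:
    """Hypothetical structured record for one city."""
    name: str
    population: int
    mayor_gender: Optional[str]  # "female", "male", or None if unknown


def largest_cities_with_female_mayor(cities: List[City], top_n: int = 10) -> List[str]:
    """Return names of the top_n most populous cities whose mayor is female."""
    # First narrow to the top_n largest cities, then filter by mayor.
    largest = sorted(cities, key=lambda c: c.population, reverse=True)[:top_n]
    return [c.name for c in largest if c.mayor_gender == "female"]


# Placeholder records for illustration only; these are not real figures.
cities = [
    City("City A", 9_000_000, "female"),
    City("City B", 12_000_000, "male"),
    City("City C", 3_000_000, "female"),
    City("City D", 7_500_000, None),
]

print(largest_cities_with_female_mayor(cities, top_n=3))
```

Because the answer is computed from the records, updating a single record (a new mayor, a revised population) would change every list that depends on it, with no manual list editing.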
The initial effort to create Wikidata is being led by the German chapter of Wikimedia, Wikimedia Deutschland, whose CEO Pavel Richter calls the project “ground-breaking,” and describes it as “the largest technical project ever undertaken by one of the 40 international Wikimedia chapters.” Much of the early experimentation which resulted in the Wikidata concept was done in Germany, which is why it’s serving as the base of operations for the new undertaking.
The German chapter will carry out the initial development of Wikidata, then hand over operation and maintenance to the Wikimedia Foundation once that work is complete. The hand-off is expected to occur a year from now, in March 2013.
The overall project will have three phases, the first of which involves creating one Wikidata page for each Wikipedia entry across Wikipedia’s more than 280 supported languages. This will provide the online encyclopedia with one common source of structured data that can be used in all articles, no matter which language they’re in. For example, the date of someone’s birth would be recorded and maintained in one place: Wikidata. Phase one will also involve centralizing the links between the different language versions of Wikipedia. This part of the work is scheduled to be finished by August 2012.
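The phase-one idea of a single shared record per entry can be sketched in a few lines of Python. This is a rough illustration under stated assumptions: the `Item` and `CentralStore` names, the `date_of_birth` key, and the placeholder values are hypothetical and do not reflect how Wikidata will actually store its data.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Item:
    """Hypothetical central record shared by every language edition."""
    item_id: str
    facts: Dict[str, str] = field(default_factory=dict)      # language-independent facts
    sitelinks: Dict[str, str] = field(default_factory=dict)  # language code -> article title


class CentralStore:
    """Hypothetical stand-in for the one common source of structured data."""

    def __init__(self) -> None:
        self._items: Dict[str, Item] = {}

    def add(self, item: Item) -> None:
        self._items[item.item_id] = item

    def fact(self, item_id: str, prop: str) -> str:
        # Every language edition reads the same stored value.
        return self._items[item_id].facts[prop]

    def other_languages(self, item_id: str, lang: str) -> Dict[str, str]:
        # Links to the same entry in all editions except the current one.
        links = self._items[item_id].sitelinks
        return {code: title for code, title in links.items() if code != lang}


store = CentralStore()
store.add(Item(
    item_id="item-1",
    facts={"date_of_birth": "1900-01-01"},  # placeholder value, not a real person
    sitelinks={"en": "Example Person", "de": "Beispielperson", "fr": "Personne exemple"},
))

print(store.fact("item-1", "date_of_birth"))
print(store.other_languages("item-1", "en"))
```

The point of the design is that the birth date is stored once, so a correction made in the central record is visible to the English, German and French articles alike, and the cross-language links no longer need to be maintained separately in each edition.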
In phase two, editors will be able to add and use data in Wikidata, and this will be available by December 2012. Finally, phase three will allow for the automatic creation of lists and charts based on the data in Wikidata, which can then populate the pages of Wikipedia.
In terms of how Wikidata will impact Wikipedia’s user interface, the plan is for the data to live in the “info boxes” that run down the right-hand side of a Wikipedia page (for example, those on the right side of NYC’s page). The data will be entered at data.wikipedia.org, which will then drive the info boxes wherever they appear, across languages, and in other pages that use the same info boxes. However, because the project is just now going into development, some of these details may change.