Mass data imports in the CEE region

Let’s talk about importing data sets from/about the CEE region into Wikidata.

What projects/datasets do you know have already been imported into Wikidata?

What datasets are you aware of that could be imported, but haven’t been yet? Perhaps we can organize a technical team to help accomplish that.

Some examples of datasets that can be imported:

  • Census data
  • Open government/municipal data
  • Catalogs and metadata of GLAMs
  • Prize winners and recipients of state honors
  • Past and present members of parliament

Relevant wiki pages:

What about data from Book Chamber of Ukraine (Книжкова палата України) with their data on books printed in Ukraine? Is it relevant to your request?

Hi @Perohanych and welcome to Wikimedia Discuss-Space,

I think Asaf kindly offered his support for importing existing, digitally available and free datasets into Wikidata. The Book Chamber of Ukraine can be relevant if its data on printed books is digitized and there is no legal restriction on using that data in Wikidata. If the second condition is fulfilled but the first isn’t, it could be a nice Wikisource project first…

But please correct me if I am wrong.

In Estonia we are slowly importing data about every painting in a museum collection in the country. Thanks to the fact that all public museums share a database, MuIS, it’s at least doable to use the same process to import almost all of them (although the quality of the data itself does vary a lot depending on the museum). In the future we might want to expand this to other kinds of artworks (such as sculptures).

We recently imported coordinates and addresses for every library in the country (the data had been compiled by people at the National Library but it wasn’t really being used anywhere publicly as far as we know). In 2020 we plan to do the same for museums, schools and kindergartens.

We’re trying to find a good source with a compatible license for lexeme importing (or to convince one of the main dictionaries to change their already pretty open CC-BY license to CC0).

I’d love to import election data, but I’m not quite sure of its licensing nor of the Wikidata guidelines for notability with regard to elections (I know people who got elected to the national parliament are notable, but what about people who stood but did not get elected, or about people elected to a small municipality’s council?) :)

One current global project that needs to be mentioned here is FindingGLAMs.

While the project also invites small contributions, it has a list of available datasets for many countries. CEE communities definitely can play a role here.

We have population lists for the districts and cities of Turkey, from 2007 to 2017. We have been collecting the Q-ids manually for the whole list. I talked with @Tgr at the CEE meeting about this mass upload.
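For a list like this (Q-id, year, population), one possible route is to generate QuickStatements commands. The sketch below is only an illustration, not the process anyone here has agreed on: it assumes QuickStatements V1 tab-separated syntax, uses P1082 (population) with a P585 (point in time) qualifier at year precision, and the `PopulationRow` type and `to_quickstatements` function are hypothetical names made up for the example. The sample row uses the Wikidata Sandbox item (Q4115189) and a placeholder number, not real data.

```python
from typing import Iterable, List, NamedTuple


class PopulationRow(NamedTuple):
    """One row of the (hypothetical) spreadsheet: item, year, population."""
    qid: str
    year: int
    population: int


def to_quickstatements(rows: Iterable[PopulationRow]) -> List[str]:
    """Turn rows into QuickStatements V1 lines (tab-separated).

    Each line adds a population statement (P1082) qualified with
    point in time (P585). The "/9" suffix marks year precision.
    """
    lines = []
    for row in rows:
        # Reject anything that does not look like a Q-id before it reaches Wikidata.
        if not (row.qid.startswith("Q") and row.qid[1:].isdigit()):
            raise ValueError(f"not a valid Q-id: {row.qid!r}")
        if row.population < 0:
            raise ValueError(f"negative population for {row.qid}")
        time_value = f"+{row.year:04d}-01-01T00:00:00Z/9"
        lines.append(
            "\t".join([row.qid, "P1082", str(row.population), "P585", time_value])
        )
    return lines


if __name__ == "__main__":
    # Placeholder data: the Wikidata Sandbox item and an invented number.
    sample = [PopulationRow("Q4115189", 2007, 1000)]
    print("\n".join(to_quickstatements(sample)))
```

The output can be pasted into a QuickStatements batch. A real import would probably also want a source for each statement (for example the statistical office's publication), which this sketch leaves out.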

We imported the data about cultural monuments of Slovenia two years ago, which enabled us to do Wiki Loves Monuments this year (as I was explaining in my talk in Belgrade last month).

The Slovene “Spatial data and application portal” features export-friendly APIs that may be useful again in the future - for now, only the Registry of Cultural Monuments is there in full, other government agencies have not yet put their data up.

Lists of monuments obviously - where they exist in a reasonable form.

Census and/or recent population data is to some extent available for most countries. I have encountered it for Russia, Ukraine, Belarus, Kazakhstan, and Austria. I do not think it has been imported (perhaps for Austria, though I did not check).