Difference between revisions of "APIs"

From Wikidocumentaries
(Data provenance)
 
(33 intermediate revisions by the same user not shown)
Line 1: Line 1:
==Seaching for related content==
+
==Finding related content==
Several components can ideally display content from sources across the web. The goal is to make available tools to enrich Wikimedia projects with then. Possible scenarios include
+
Several components can ideally display content from sources across the web. The goal is to enrich Wikimedia projects with content from the sources using the available tools. We currently read some image sources, but the content could ideally include newspapers, sounds, archival material, films, videos, literature, magazines, scientific articles etc.
  
===For images===
+
The source content could be used in many ways: It could be imported as such, the metadata could be consolidated with metadata of similar objects from other outlets, or the data/metadata could be used to enrich items in Wikimedia projects. In these scenarios the open projects (Wikimedia, OSM) are seen as central databases which store the collectively enriched data and can serve that back to the institutions.
* Find PD or openly licensed images for a topic in different image repositories, based on all the available data in Wikidata (labels in different languages, aliases, time, location, authority data, other related information)
+
 
* Match metadata fields to Wikidata or SDC properties.
+
==Selecting data for the query==
* Instantly save image metadata without reconciliation as text. Provide opportunities to associate them with Wikidata items.
+
* Labels and aliases in all languages
* Save and share metadata field mappings for later use or for use by others.
+
** Not all languages can be handled by an API. The unsuitable ones need to be filtered out in the preprocessing phase.
* Detect similar images in different repositories
+
** If the main languages of the API are known, the query can use labels in those languages as the primary option.
* Compare and consolidate metadata from different sources.
+
** Preprocessing can collect all necessary values and send them to the local API. The local component for a specific API can then arrange the data to suit that API, for example by concatenating the query strings in different ways, using AND or OR.
* Store metadata statement sources and create data provenance based on that.
+
* Item's location can be used to narrow down search results or to distinguish from namesakes.
* Manage copyright status for images based on information about creators.
+
* The dates of the item can be used to narrow down search results. This is especially useful with maps.
 +
* For maps the zoom level or the scale can be calculated using the size of the area the item covers.
  
 
==Metadata roundtripping==
 
==Metadata roundtripping==
[[File:Metadata transformation.png]]
+
<gallery mode=slideshow>
 +
Roundtrip.png
 +
Roundtrip (1).png
 +
Roundtrip (2).png
 +
Roundtrip (3).png
 +
Roundtrip (4).png
 +
Roundtrip (5).png
 +
Roundtrip (6).png
 +
Roundtrip (8).png
 +
Roundtrip (9).png
 +
</gallery>
 +
 
  
# GLAM makes available images and their metadata through their public API. Wikidocumentaries uses many properties from the current topic to query that.
+
# GLAM makes available images and their metadata through their public API. Wikidocumentaries uses many Wikidata properties from the current topic to query that.
# When reading the data through the Wikidocumentaries API, the metadata is normalized using different transformations for each GLAM.
+
# When reading the data through the Wikidocumentaries API, the metadata is normalized using a different transformation for each GLAM.
# The metadata from different GLAMs is displayed in a uniform format in the Wikidocumentaries metadata display.
+
# The metadata from different GLAMs is displayed in a uniform format in the Wikidocumentaries metadata display as strings.
 
# When an image is saved to Wikimedia projects, users can reconcile string values with Wikidata items. String values can also be saved as they are; they remain available for reconciling later.
 
# When an image is saved to Wikimedia projects, users can reconcile string values with Wikidata items. String values can also be saved as they are; they remain available for reconciling later.
 
# In the Wikidocumentaries metadata interface, the string values are replaced by reconciled Wikidata items. Differences between the source data from the GLAM and Wikidata, such as recent changes in the GLAM's metadata, can also be highlighted.
 
# In the Wikidocumentaries metadata interface, the string values are replaced by reconciled Wikidata items. Differences between the source data from the GLAM and Wikidata, such as recent changes in the GLAM's metadata, can also be highlighted.
Line 23: Line 35:
  
 
==Consolidate data from different sources for the same item==
 
==Consolidate data from different sources for the same item==
Especially if Wikidocumentaries decides to store images or their metadata locally, these scenarios become available.
+
Especially if Wikidocumentaries decides to store images or their metadata locally, these scenarios become available. Similar images from different sources can be detected. Their metadata can be compared, and the user is asked to verify the correct information. The updated data is saved in the central repository (Wikimedia Commons) with a reference to the source that provided this information. One of the information types that can be compared is the copyright status.
* Similar images from different sources can be detected. Their metadata can be compared, and the user is asked to verify the correct information. The updated data is saved in the central repository (Wikimedia Commons) with a reference to the source that provided this information. One of the information types that can be compared is the copyright status.
 
 
 
==Search criteria for different media types==
 
  
===Images===
+
==APIs in Wikidocumentaries==
Image search is ideally based on
+
===Wikimedia APIs===
* Item label in the current language > must be changed to read the most suitable label.
 
* Native label
 
* Aliases > Which languages
 
* Item location > must be made more contextual
 
* Item date
 
 
 
===Digitized maps===
 
* Date
 
* Names of historical administrative entities
 
* Zoom level / scale
 
 
 
==Wikimedia APIs==
 
  
 
* '''MediaWiki API'''
 
* '''MediaWiki API'''
Line 51: Line 48:
 
** [https://www.wikidata.org/w/api.php Wikidata API help] - MediaWiki API
 
** [https://www.wikidata.org/w/api.php Wikidata API help] - MediaWiki API
  
==Image APIs==
+
===Image APIs===
 +
====In use====
  
 
* '''Wikimedia Commons'''
 
* '''Wikimedia Commons'''
Line 65: Line 63:
 
* '''Flickr'''
 
* '''Flickr'''
 
** [https://www.flickr.com/services/api/ Flickr API Docs] - Includes a useful API Explorer
 
** [https://www.flickr.com/services/api/ Flickr API Docs] - Includes a useful API Explorer
* '''Smithsonian'''
 
* '''Paris Musées'''
 
* '''Creative Commons Search, Openverse'''
 
* '''Internet Archive'''
 
* '''Ajapaik'''
 
** [https://ajapaik.ee Ajapaik.ee]
 
* '''Topothek'''
 
** Images from Topotheks submitted by local people
 
* '''K-Samsök'''
 
  
==Data APIs==
+
====To be explored====
 +
* [https://www.programmableweb.com/api/smithsonian-institution-open-access Smithsonian Institution Open Access API]
 +
* [https://www.programmableweb.com/api/smk-open-rest-api-v110 SMK Open API]
 +
* [https://www.programmableweb.com/api/metropolitan-museum-art-met-collection Metropolitan Museum of Art Met Collection API]
 +
* [https://www.programmableweb.com/api/deutsche-digitale-bibliothek Deutsche Digitale Bibliothek API]
 +
* [https://www.programmableweb.com/api/rijksmuseum Rijksmuseum API]
 +
* [https://www.programmableweb.com/api/digitalnz DigitalNZ API]
 +
* [https://www.programmableweb.com/api/artsy-rest-api-0 Artsy REST API]
 +
* [https://www.programmableweb.com/api/natural-history-museum-rest-api Natural History Museum REST API]
 +
* [https://www.programmableweb.com/api/victoria-albert-museum-rest-api Victoria & Albert Museum REST API]
 +
* Paris Musées
 +
* [https://www.programmableweb.com/api/soch SOCH API]
 +
* Creative Commons Search, Openverse
 +
* [https://archive.org/services/docs/api/ Internet Archive]
 +
* [https://www.programmableweb.com/api/cleveland-museum-art-open-access The Cleveland Museum of Art Open Access API]
 +
* [https://ajapaik.ee Ajapaik.ee]
 +
* Topothek
 +
* [https://pro.dp.la/developers/api-codex DPLA]
 +
* Structured Data on Commons
 +
* [https://www.loc.gov/apis/ Library of Congress]
 +
 
 +
===Data APIs===
 +
====To be explored====
 
* '''Nimiarkisto.fi'''
 
* '''Nimiarkisto.fi'''
 
* '''Linked Data Finland'''
 
* '''Linked Data Finland'''
** BiographySampo
+
** [https://colab.research.google.com/drive/1scxCJl-w0Fsq_cn1cI1gwIhUk9DLA4K4?usp=sharing BiographySampo]
 
** SotaSampo
 
** SotaSampo
 
* '''FIN-CLARIN'''
 
* '''FIN-CLARIN'''
  
==Map APIs ==
+
===Map APIs ===
 +
====To be explored====
 
* '''Map Warper'''
 
* '''Map Warper'''
 
** [https://github.com/timwaters/mapwarper/blob/master/README_API.md Map Warper API documentation]
 
** [https://github.com/timwaters/mapwarper/blob/master/README_API.md Map Warper API documentation]
 +
* Library of Congress
 +
 +
==Wikimedia integrations==
 +
* Wikisource
  
 
{{design-nav}}
 
{{design-nav}}

Latest revision as of 19:46, 11 December 2021

Finding related content

Several components can ideally display content from sources across the web. The goal is to enrich Wikimedia projects with content from these sources using the available tools. We currently read some image sources, but the content could ideally include newspapers, sounds, archival material, films, videos, literature, magazines, scientific articles, etc.

The source content could be used in many ways: it could be imported as such, its metadata could be consolidated with the metadata of similar objects from other outlets, or the data and metadata could be used to enrich items in Wikimedia projects. In these scenarios, the open projects (Wikimedia, OSM) are seen as central databases that store the collectively enriched data and can serve it back to the institutions.

Selecting data for the query

  • Labels and aliases in all languages
    • Not all languages can be handled by an API. The unsuitable ones need to be filtered out in the preprocessing phase.
    • If the main languages of the API are known, the query can use labels in those languages as the primary option.
    • Preprocessing can collect all necessary values and send them to the local API. The local component for a specific API can then arrange the data to suit that API, for example by concatenating the query strings in different ways, using AND or OR.
  • Item's location can be used to narrow down search results or to distinguish from namesakes.
  • The dates of the item can be used to narrow down search results. This is especially useful with maps.
  • For maps the zoom level or the scale can be calculated using the size of the area the item covers.
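
The selection steps above can be sketched roughly as follows. This is a minimal illustration, not the Wikidocumentaries implementation: the function names (select_terms, build_query, zoom_for_span), the quoting of terms, and the sample labels are all assumptions made for the example. The zoom calculation assumes the common Web Mercator tile scheme with 256-pixel tiles and only considers the east-west extent of the item.

```python
import math
from typing import Dict, Iterable, List


def select_terms(
    labels: Dict[str, str],
    aliases: Dict[str, List[str]],
    supported: Iterable[str],
    primary: str,
) -> List[str]:
    """Collect labels and aliases in the languages an API can handle.

    Terms in the primary language come first; unsupported languages
    are filtered out and duplicate terms are dropped.
    """
    supported_set = set(supported)
    ordered = [primary] + sorted(supported_set - {primary})
    terms: List[str] = []
    for lang in ordered:
        if lang not in supported_set:
            continue
        candidates = []
        if lang in labels:
            candidates.append(labels[lang])
        candidates.extend(aliases.get(lang, []))
        for term in candidates:
            if term not in terms:
                terms.append(term)
    return terms


def build_query(terms: List[str], operator: str = "OR") -> str:
    """Concatenate quoted terms with AND or OR (hypothetical query syntax)."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    return " {} ".format(operator).join('"{}"'.format(t) for t in terms)


def zoom_for_span(lon_span_deg: float, viewport_px: int = 256, max_zoom: int = 19) -> int:
    """Largest zoom level at which lon_span_deg still fits in viewport_px.

    Assumes Web Mercator tiles: the world is 256 * 2**zoom pixels wide.
    """
    if lon_span_deg <= 0:
        return max_zoom
    zoom = math.floor(math.log2(360.0 * viewport_px / (256.0 * lon_span_deg)))
    return max(0, min(max_zoom, zoom))


# Illustrative sample values for one topic.
labels = {
    "fi": "Helsingin tuomiokirkko",
    "sv": "Helsingfors domkyrka",
    "en": "Helsinki Cathedral",
    "ja": "ヘルシンキ大聖堂",
}
aliases = {"fi": ["Suurkirkko"]}

terms = select_terms(labels, aliases, supported=["fi", "sv", "en"], primary="fi")
print(build_query(terms))
print(zoom_for_span(360 / 1024))
```

Keeping the language filter and the concatenation in separate functions mirrors the split described above: preprocessing collects the values once, and each API-specific component decides how to join them.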

Metadata roundtripping


  1. A GLAM makes images and their metadata available through its public API. Wikidocumentaries uses many Wikidata properties of the current topic to query it.
  2. When reading the data through the Wikidocumentaries API, the metadata is normalized using a different transformation for each GLAM.
  3. The metadata from different GLAMs is displayed in a uniform format in the Wikidocumentaries metadata display as strings.
  4. When an image is saved to Wikimedia projects, users can reconcile string values with Wikidata items. String values can also be saved as they are; they remain available for reconciling later.
  5. In the Wikidocumentaries metadata interface, the string values are replaced by reconciled Wikidata items. Differences between the source data from the GLAM and Wikidata, such as recent changes in the GLAM's metadata, can also be highlighted.
  6. Maybe the GLAM could query Wikidocumentaries for changed information?
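
Steps 2 to 5 above could look something like the sketch below. It assumes two made-up sources ("source_a" and "source_b") with invented field names, and a placeholder item identifier ("Q_PLACEHOLDER"); none of these come from an actual GLAM API or from the Wikidocumentaries code.

```python
from typing import Callable, Dict

Record = Dict[str, str]


def from_source_a(raw: dict) -> Record:
    """Transformation for a hypothetical GLAM that lists authors and a year."""
    return {
        "title": raw.get("title", ""),
        "creator": ", ".join(raw.get("authors", [])),
        "date": str(raw.get("year", "")),
        "license": raw.get("rights", ""),
    }


def from_source_b(raw: dict) -> Record:
    """Transformation for a hypothetical GLAM with differently named fields."""
    return {
        "title": raw.get("itemLabel", ""),
        "creator": raw.get("byline", ""),
        "date": raw.get("timeLabel", ""),
        "license": raw.get("licence", ""),
    }


# One transformation per GLAM, as in step 2.
TRANSFORMS: Dict[str, Callable[[dict], Record]] = {
    "source_a": from_source_a,
    "source_b": from_source_b,
}


def normalize(source: str, raw: dict) -> Record:
    """Return the metadata in a uniform, string-only format (step 3)."""
    if source not in TRANSFORMS:
        raise ValueError("no transformation registered for " + source)
    record = TRANSFORMS[source](raw)
    record["source"] = source
    return record


def apply_reconciliation(record: Record, reconciled: Dict[str, str]) -> Record:
    """Replace string values with reconciled item ids where available (step 5).

    Fields without a reconciled item keep their string value, so they
    stay available for reconciling later.
    """
    result = dict(record)
    for field, item_id in reconciled.items():
        if field in result:
            result[field] = item_id
    return result


raw_a = {"title": "Market square", "authors": ["A. Photographer"], "year": 1907, "rights": "CC BY 4.0"}
record = normalize("source_a", raw_a)
print(record)
print(apply_reconciliation(record, {"creator": "Q_PLACEHOLDER"}))
```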

Consolidate data from different sources for the same item

The following scenarios become available especially if Wikidocumentaries decides to store images or their metadata locally. Similar images from different sources can be detected. Their metadata can be compared, and the user is asked to verify the correct information. The updated data is saved in the central repository (Wikimedia Commons) with a reference to the source that provided this information. One of the information types that can be compared is the copyright status.
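
As a rough sketch of the comparison step, the code below takes already-normalized records that are assumed to describe the same image and reports the fields where the sources disagree, so that a user could be asked to verify them. Detecting that two images are similar is outside the scope of this example, and the record layout, the function names (find_conflicts, verified_statement) and the sample values are assumptions made for illustration.

```python
from typing import Dict, Iterable, List


def find_conflicts(
    records: List[Dict[str, str]], fields: Iterable[str]
) -> Dict[str, Dict[str, List[str]]]:
    """Map each disputed field to {value: [sources that state it]}."""
    conflicts: Dict[str, Dict[str, List[str]]] = {}
    for field in fields:
        by_value: Dict[str, List[str]] = {}
        for record in records:
            value = record.get(field, "").strip()
            if value:  # ignore sources that say nothing about this field
                by_value.setdefault(value, []).append(record["source"])
        if len(by_value) > 1:
            conflicts[field] = by_value
    return conflicts


def verified_statement(field: str, value: str, sources: List[str]) -> dict:
    """Build a statement that keeps a reference to the sources behind it."""
    return {"field": field, "value": value, "references": list(sources)}


records = [
    {"source": "source_a", "title": "Market square", "license": "CC BY 4.0"},
    {"source": "source_b", "title": "Market square", "license": "Public domain"},
]

conflicts = find_conflicts(records, ["title", "license"])
print(conflicts)

# Suppose the user verifies the value given by source_b.
chosen = "Public domain"
print(verified_statement("license", chosen, conflicts["license"][chosen]))
```

Grouping sources by the value they state keeps the provenance attached to each candidate, which is what the saved statement needs as its reference.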

APIs in Wikidocumentaries

Wikimedia APIs

Image APIs

In use

To be explored

Data APIs

To be explored

  • Nimiarkisto.fi
  • Linked Data Finland
  • FIN-CLARIN

Map APIs

To be explored

Wikimedia integrations

  • Wikisource

