Difference between revisions of "APIs"
Latest revision as of 19:46, 11 December 2021
Several components could display content from sources across the web. The goal is to enrich Wikimedia projects with content from these sources using the available tools. We currently read some image sources, but the content could ideally include newspapers, sounds, archival material, films, videos, literature, magazines, scientific articles, and more.
The source content could be used in many ways: It could be imported as such, the metadata could be consolidated with metadata of similar objects from other outlets, or the data/metadata could be used to enrich items in Wikimedia projects. In these scenarios the open projects (Wikimedia, OSM) are seen as central databases which store the collectively enriched data and can serve that back to the institutions.
Selecting data for the query
- Labels and aliases in all languages
  - Not all languages can be handled by an API. The unsuitable ones need to be filtered out in the preprocessing phase.
  - If the main languages of the API are known, the query can use labels in those languages as the primary option.
  - Preprocessing can collect all necessary values and send them to the local API. The local component for a specific API can arrange the data to suit that API, for example by concatenating the query strings in different ways, using AND or OR.
- The item's location can be used to narrow down search results or to distinguish the item from its namesakes.
- The dates of the item can be used to narrow down search results. This is especially useful with maps.
- For maps the zoom level or the scale can be calculated using the size of the area the item covers.
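The selection steps above could be sketched roughly as follows. This is a minimal illustration, not the Wikidocumentaries implementation: the function names, the quoted-term query syntax, and the zoom calculation (which assumes Web Mercator tiles of 256 pixels and a viewport of 1024 pixels) are all assumptions made for the example.

```python
import math


def select_labels(labels, supported, primary=None):
    """Keep labels in languages the target API can handle, primary language first.

    labels: {language_code: label}; supported: set of language codes.
    Duplicate label strings are dropped so the query does not repeat terms.
    """
    kept = [(lang, text) for lang, text in labels.items() if lang in supported]
    # The primary language sorts first, the rest alphabetically by language code.
    kept.sort(key=lambda item: (item[0] != primary, item[0]))
    seen, ordered = set(), []
    for _, text in kept:
        if text not in seen:
            seen.add(text)
            ordered.append(text)
    return ordered


def build_query(terms, operator="OR"):
    """Concatenate quoted terms with AND or OR (hypothetical query syntax)."""
    quoted = ['"{}"'.format(term) for term in terms]
    return " {} ".format(operator).join(quoted)


def zoom_for_extent(lon_span_deg, viewport_px=1024, tile_px=256, max_zoom=19):
    """Estimate a zoom level at which lon_span_deg of longitude fits the viewport.

    Assumes Web Mercator tiling, where the world is tile_px * 2**zoom pixels wide.
    """
    if lon_span_deg <= 0:
        return max_zoom
    zoom = math.floor(math.log2(360.0 * viewport_px / (tile_px * lon_span_deg)))
    return max(0, min(max_zoom, zoom))


labels = {"fi": "Helsinki", "sv": "Helsingfors", "en": "Helsinki", "ru": "Хельсинки"}
terms = select_labels(labels, supported={"fi", "sv", "en"}, primary="fi")
query = build_query(terms, "OR")
```

A component for an API that only matches exact phrases might pass `"AND"` instead, or send each term as a separate request; that choice belongs to the per-API component rather than to the shared preprocessing.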
Metadata roundtripping
- A GLAM makes images and their metadata available through its public API. Wikidocumentaries uses many Wikidata properties of the current topic to query that API.
- When reading the data through the Wikidocumentaries API, the metadata is normalized using a different transformation for each GLAM.
- The metadata from different GLAMs is displayed in a uniform format in the Wikidocumentaries metadata display as strings.
- When an image is saved to Wikimedia projects, users can reconcile string values with Wikidata items. String values can be saved as well, and they remain available for reconciling later.
- In the Wikidocumentaries metadata interface, the string values are replaced by reconciled Wikidata items. Differences between the source data from the GLAM and Wikidata, such as recent changes in the GLAM's metadata, can also be highlighted.
- Maybe the GLAM could query Wikidocumentaries for changed information?
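The roundtrip above could be sketched in code along these lines. Everything here is hypothetical: the source names, the field names in `FIELD_MAPS`, and the placeholder item id are invented for illustration and do not reflect the actual Wikidocumentaries API or any GLAM's schema.

```python
# One transformation per GLAM: uniform field name -> field name in that source.
# Source and field names are made up for this sketch.
FIELD_MAPS = {
    "source_a": {"title": "title", "creator": "authors", "year": "date"},
    "source_b": {"title": "name", "creator": "photographer", "year": "year"},
}


def normalize(record, source):
    """Map a source record to the uniform format, with all values as strings."""
    mapping = FIELD_MAPS[source]
    return {field: str(record[key]) for field, key in mapping.items() if key in record}


def apply_reconciliation(normalized, reconciled):
    """Replace string values with Wikidata item ids where they have been reconciled.

    reconciled: {(field, string_value): item_id}; other values stay as strings.
    """
    return {field: reconciled.get((field, value), value)
            for field, value in normalized.items()}


def changed_fields(source_values, stored_values):
    """List fields whose value at the source differs from the stored value."""
    return sorted(field for field in source_values
                  if field in stored_values
                  and source_values[field] != stored_values[field])


raw = {"name": "Harbour", "photographer": "Signe Brander", "year": 1907}
uniform = normalize(raw, "source_b")
# "Q0" is a placeholder, not a real Wikidata item id.
display = apply_reconciliation(uniform, {("creator", "Signe Brander"): "Q0"})
```

Keeping the unreconciled strings alongside the reconciled ids is what allows later comparison: `changed_fields` can flag a field when the GLAM has updated its metadata since the value was stored.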
Consolidate data from different sources for the same item
The following scenarios become available, especially if Wikidocumentaries decides to store images or their metadata locally. Similar images from different sources can be detected. Their metadata can be compared, and the user is asked to verify which information is correct. The updated data is saved in the central repository (Wikimedia Commons) with a reference to the source that provided the information. One of the information types that can be compared is the copyright status.
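The comparison step could look roughly like this, assuming the records have already been matched to the same image and normalized to shared field names. The source names, field names, and values are hypothetical, and detecting similar images is outside the scope of the sketch.

```python
from collections import defaultdict


def find_conflicts(records):
    """Find fields on which the sources disagree.

    records: {source_name: {field: value}}
    Returns {field: {value: [sources that gave this value]}} for fields
    that have more than one distinct value.
    """
    by_field = defaultdict(lambda: defaultdict(list))
    for source, metadata in records.items():
        for field, value in metadata.items():
            by_field[field][value].append(source)
    return {field: dict(values)
            for field, values in by_field.items() if len(values) > 1}


def resolve(field, chosen_value, conflicts):
    """Build the statement to save: the verified value plus its source references."""
    return {"field": field, "value": chosen_value,
            "references": conflicts[field][chosen_value]}


records = {
    "source_a": {"year": "1907", "license": "CC BY 4.0"},
    "source_b": {"year": "1907", "license": "Public domain"},
    "source_c": {"year": "1908", "license": "Public domain"},
}
conflicts = find_conflicts(records)
# The user verifies which value is correct; here they pick "Public domain".
statement = resolve("license", "Public domain", conflicts)
```

The user's choice is the only input the code cannot supply; the sketch only prepares the disagreement for display and attaches the supporting sources once a value has been verified.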
APIs in Wikidocumentaries
Wikimedia APIs
- MediaWiki API
  - https://www.mediawiki.org/wiki/API:Tutorial
  - https://en.wikipedia.org/w/api.php Wikipedia API help
- Wikidata
  - Wikidata Query Service - SPARQL
  - Wikidata API help - MediaWiki API
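As an illustration of the Wikidata API listed above, the labels and aliases used for building queries could be requested with the `wbgetentities` action. The sketch below only builds the request URL and makes no network call; the helper name `labels_request_url` is an assumption, and the item id `Q42` is used purely as an example.

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def labels_request_url(item_id, languages):
    """Build a wbgetentities URL that asks for labels and aliases of one item."""
    params = {
        "action": "wbgetentities",
        "ids": item_id,
        "props": "labels|aliases",
        # Multiple values are separated with "|" in the MediaWiki API.
        "languages": "|".join(languages),
        "format": "json",
    }
    return WIKIDATA_API + "?" + urlencode(params)


url = labels_request_url("Q42", ["fi", "sv", "en"])
```

Restricting `languages` at request time is one way to do part of the language filtering before the data reaches the per-API components.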
Image APIs
In use
- Wikimedia Commons
- Wikimedia Commons API help - MediaWiki API
- Finna.fi
- Images from Finnish museums
- Publications from Finnish libraries
- Documentation
- Finna API documentation
- Old but still somewhat useful Finna API documentation
- CC licensing data available, prepared for RightsStatements
- Europeana
- Flickr
- Flickr API Docs - Includes useful API Explorer
To be explored
- Smithsonian Institution Open Access API
- SMK Open API
- Metropolitan Museum of Art Met Collection API
- Deutsche Digitale Bibliothek API
- Rijksmuseum API
- DigitalNZ API
- Artsy REST API
- Natural History Museum REST API
- Victoria & Albert Museum REST API
- Paris Musées
- SOCH API
- Creative Commons Search, Openverse
- Internet Archive
- The Cleveland Museum of Art Open Access API
- Ajapaik.ee
- Topothek
- DPLA
- Structured Data on Commons
- Library of Congress
Data APIs
To be explored
- Nimiarkisto.fi
- Linked Data Finland
  - BiographySampo
  - SotaSampo
- FIN-CLARIN
Map APIs
To be explored
- Map Warper
- Library of Congress
Wikimedia integrations
- Wikisource