Linking Public Vocabularies – openvocabulary.info
A Web 3.0 service would be nothing without semantics and users.
Before most of the Internet is flooded with semantic annotations of web resources and services, we need to contribute existing vocabularies to the space of linked data.
This is the mission of our new project – Open Vocabulary.
Our goal is to take existing vocabularies (taxonomies or thesauri) that are open and free to use, translate them into SKOS/RDF, and deliver access to these vocabularies through Java and REST APIs.
The Open Vocabularies project consists of three modules:
- SKOS/RDF representations of open and free vocabularies.
- Java and REST API for accessing, searching and managing vocabularies.
- Portal for browsing and searching vocabularies.
Vocabularies in SKOS/RDF
At the moment we have prepared and tested the following vocabularies:
- Taxonomies:
  - Association for Computing Machinery (ACM)
  - Dewey Decimal Classification (DDC)
  - Library of Congress (LOC)
  - Polska Klasyfikacja Tematyczna (PKT) – popular taxonomy used in Polish libraries
  - Universal Decimal Classification (UDC)
  - Open Directory Project Taxonomy (DMoz)
  - Polish concepts from the Open Directory Project (DMoz – PL)
- Thesauri:
  - WordNet 2.0 Thesaurus – the original RDF/OWL extended with SKOS relations
  - OpenThesaurus (PL) – SKOS/RDF version of the Polish thesaurus used in the OpenOffice software
SKOS/RDF representations for all taxonomies (except DMoz) were created from flat-file specifications. The DMoz taxonomies were created from the RDF-like version of DMoz and enriched with SKOS relations. We have also enriched WordNet RDF/OWL with SKOS relations.
The SKOS/RDF version of the Polish OpenThesaurus was created using rules we designed for processing the OpenThesaurus document type.
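To illustrate what a flat-file to SKOS/RDF conversion can look like, here is a minimal sketch in Java. It is not the converter used in the project: the tab-separated input layout (code, label, optional parent code), the example.org base URI, the English language tag and the class name FlatFileToSkos are all assumptions made for this example. Only the SKOS terms themselves (skos:Concept, skos:prefLabel, skos:broader) come from the SKOS vocabulary.

```java
import java.util.List;

final class FlatFileToSkos {
    // Hypothetical base URI; the actual openvocabulary.info URI scheme is not described here.
    static final String BASE = "http://example.org/taxonomy/";

    /**
     * Converts one line of an assumed "code<TAB>label<TAB>parentCode" flat file
     * into a Turtle description of a skos:Concept. The parent code is optional.
     * Codes are assumed to be URI-safe (for example "004.6").
     */
    static String toTurtle(String line) {
        String[] fields = line.split("\t", -1);
        if (fields.length < 2) {
            throw new IllegalArgumentException("expected at least code and label: " + line);
        }
        String code = fields[0].trim();
        String label = fields[1].trim();
        String parent = fields.length > 2 ? fields[2].trim() : "";

        StringBuilder sb = new StringBuilder();
        sb.append("<").append(BASE).append(code).append("> a skos:Concept ;\n");
        sb.append("    skos:prefLabel \"").append(escape(label)).append("\"@en");
        if (!parent.isEmpty()) {
            // A parent code becomes a skos:broader link to the parent concept.
            sb.append(" ;\n    skos:broader <").append(BASE).append(parent).append(">");
        }
        sb.append(" .\n");
        return sb.toString();
    }

    /** Escapes backslashes and double quotes for a Turtle string literal. */
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Converts all non-blank lines and prepends the SKOS prefix declaration. */
    static String convert(List<String> lines) {
        StringBuilder out = new StringBuilder(
                "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n\n");
        for (String line : lines) {
            if (line.isBlank()) {
                continue;
            }
            out.append(toTurtle(line)).append('\n');
        }
        return out.toString();
    }
}
```

Calling FlatFileToSkos.convert(List.of("004\tData processing", "004.6\tComputer networks\t004")) would produce two concepts, the second linked to the first through skos:broader. A real converter would also need to handle each classification's own file layout and labels in multiple languages.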
Java and REST API
OpenVocabulary delivers a Java library which can be used to embed OpenVocabulary in JEE applications. It delivers:
- A Java API for managing taxonomies, thesauri and tags stored in an RDF store (RDF2Go over Sesame).
- Access to a full-text index (using Lucene) to improve search and retrieval of vocabulary concepts stored in the Open Vocabulary repository.
- A REST API that enables access to the Open Vocabulary repository and its concepts through HTML, RDF and JSON clients.
The REST API has been built following the guidelines outlined by the Linked Data initiative, especially the recommendations for Cool URIs and content negotiation. Apart from the RDF rendering API, Open Vocabulary delivers two methods for retrieving vocabulary concepts:
- Lookup – provided through the /vocabularies/lookup?uri= scheme; it renders information about any vocabulary concept in the RDF storage, including concepts from WordNet OWL/RDF, whose URI scheme is not hosted at OpenVocabulary.info.
- Search – provided through the /vocabularies/search?q=[&threshold=&size=] scheme; it searches the full-text index of vocabularies. You can define a similarity threshold, a maximum number of results, or both.
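The two retrieval methods above, combined with content negotiation, can be sketched from the client side as follows. This is a hedged illustration, not the project's client code: the class OpenVocabularyRequests and its method names are made up for this example, the lookup path is read as /vocabularies/lookup, and the exact media types the service honours are an assumption (application/rdf+xml, application/json and text/html are standard types matching the RDF, JSON and HTML clients mentioned). The sketch only builds requests with the standard java.net.http API (Java 11+); it does not send them.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

final class OpenVocabularyRequests {
    // Base address of the portal as given in this post.
    static final String BASE = "http://www.openvocabulary.info";

    /** Percent-encodes a value for use in a query string. */
    static String enc(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    /** Builds the lookup URI for a concept identified by its own URI. */
    static URI lookupUri(String conceptUri) {
        return URI.create(BASE + "/vocabularies/lookup?uri=" + enc(conceptUri));
    }

    /**
     * Builds the search URI. The threshold and size parameters are optional;
     * pass null to leave either one out.
     */
    static URI searchUri(String query, Double threshold, Integer size) {
        StringBuilder sb = new StringBuilder(BASE + "/vocabularies/search?q=" + enc(query));
        if (threshold != null) {
            sb.append("&threshold=").append(threshold);
        }
        if (size != null) {
            sb.append("&size=").append(size);
        }
        return URI.create(sb.toString());
    }

    /** Builds a GET request that asks for a representation via the Accept header. */
    static HttpRequest get(URI uri, String mediaType) {
        return HttpRequest.newBuilder(uri)
                .header("Accept", mediaType)
                .GET()
                .build();
    }
}
```

A request built this way, for example get(searchUri("semantic web", 0.7, 10), "application/json"), could then be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()). Asking for a different representation of the same resource only requires changing the Accept value, which is the point of content negotiation.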
Online portal for browsing and searching vocabularies
We have set up an online portal at http://www.openvocabulary.info/ to:
- Expose supported vocabularies (in the openvocabulary.info URI scheme) through Cool URIs; at the moment we support agents accepting HTML, RDF and JSON documents.
- Present the capabilities of the OpenVocabulary Java API, including access to the full-text index.
- For registered users – provide access to all-in-one RDF bundles with all currently supported vocabularies and the related ready-to-use Lucene indexes.
Open Vocabulary in action
One of the reasons we created the OpenVocabulary project was to support semantic user tagging, i.e., to encourage users of social network systems to use meaningful concepts instead of plain tags. We have realized that idea in the digi.me service. If you have not used it before, you should give it a try. digi.me brings semantics, better social interactions and smart recommendations to social bookmarking; think of it as del.icio.us on steroids. There is also a Polish localization of the digi.me service called www.węzełki.pl.
Open Vocabularies is the next generation of an earlier open source prototype called JOnto (created at DERI, NUI Galway). Unlike JOnto, Open Vocabularies uses SKOS/RDF, provides a REST API (built according to Linked Data guidelines), and features more vocabularies, including Open Thesaurus and WordNet OWL/RDF. Like JOnto, Open Vocabularies is delivered as open source software, but it uses the BSD license (instead of the restrictive Corrib license).
We would like to thank the Linked Data community, especially Richard Cyganiak, for their help.
Tags: digi.me, knowledgehives, public, repositories, service, tags, taxonomy, thesaurus, vocabulary