Vocabularies #ifacturas

iFacturas is based on a vocabulary. It will be published in our API later this year.

A post from the OData Blog has the whole story. Here is a part:

What are vocabularies?

Vocabularies are made up of a set of related 'terms' which when used can express some idea or concept. They allow producers to teach consumers richer ways to interpret and handle data.
Vocabularies range from simple to complex. A simple vocabulary might tell a consumer which property to use as an entity's title when displaying it in a form, whereas a more complex vocabulary might tell someone how to convert an OData person entity into a vCard entry.
Here are some simple examples:
  • This property should be used as the Title of this entity
  • This property has a range of acceptable values (e.g. 1 to 100)
  • This entity can be converted into a vCard
  • This entity is a foaf:Person
  • This navigation property is essentially a 'foaf:Knows [a person]' relationship
  • This property is a georss:Point
  • Etc
Vocabularies are not a new concept unique to OData. They are used extensively in the linked data and RDF worlds to great effect, and we should be able to re-use many of these existing vocabularies in OData.

Why does OData need vocabularies?

OData is being used in many different verticals now. Each vertical brings its own specific set of requirements and challenges. While some problems are general enough that solving them inside OData adds value to the OData eco-system as a whole, most don't meet that bar.
It seems clear then that we need a mechanism that allows Producers to share more information that 'smarter' Consumers MAY understand enough to enable a higher fidelity experience.
In fact, some consumers are already trying to provide a higher fidelity experience. For example, Sesame can render the results of OData queries on a map. Sesame does this by looking for specifically named properties, which it 'guesses' represent the entity's location. While this is powerful, it would be much better if it wasn't a 'guess': the Producer could use a well-known vocabulary to tell Consumers which property is the entity's location.
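The difference between guessing and being told can be sketched in a few lines of Python. This is not how Sesame actually works. The entity, the list of candidate names and the `geo:Location` term are all made up for illustration.

```python
# Hypothetical entity as a consumer might receive it; all names are illustrative.
entity = {"Name": "Oslo Office", "Position": "59.91,10.75", "Notes": "HQ"}

# Approach 1: guess by looking for well-known property names.
LIKELY_LOCATION_NAMES = ("Location", "GeoLocation", "LatLong")


def guess_location_property(entity):
    """Return the first property whose name looks like a location, or None."""
    for name in entity:
        if name in LIKELY_LOCATION_NAMES:
            return name
    return None


# Approach 2: the producer states it explicitly with a vocabulary term.
# Maps property name -> set of terms applied to it (the term name is assumed).
annotations = {"Position": {"geo:Location"}}


def annotated_location_property(entity, annotations, term="geo:Location"):
    """Return the property the producer marked with the given term, or None."""
    for name in entity:
        if term in annotations.get(name, set()):
            return name
    return None


guessed = guess_location_property(entity)
declared = annotated_location_property(entity, annotations)
print(guessed)   # None: the property is called "Position", so the guess misses it
print(declared)  # Position
```

The guess only works when the producer happens to pick a name the consumer expects. The annotation works whatever the property is called.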

Goals

As with any new feature, we need to agree on a set of goals before we can come up with the right design. To get us started I propose this set of goals:
  • Ability to re-use or reference common micro-formats and vocabularies.
  • Ability to annotate OData metadata using the terms from a particular vocabulary.
    • Both internally (inside the CSDL file returned from $metadata)
    • And externally (allowing for third-parties to 'enrich' existing OData services they don't own).
  • No matter how the annotation is made, consumers should be able to consume the annotations in much the same way.
  • Ability to annotate OData data too? Although this one is beyond the scope of this post.
  • Consumers that don't understand a particular vocabulary should still be able to work with services that reference that vocabulary. The goal should be to enrich the eco-system for those who 'optionally' understand the vocabulary.
  • We should be able to reference terms from a vocabulary in CSDL, OData Atom and OData JSON.
It is important to note that our goal stops short of specifying how to define the vocabulary itself, how to capture its semantics, or how to enforce it. Those concerns lie solely with vocabulary writers, and with the producers and consumers that profess to understand the vocabulary. Staying out of this business allows OData to reference many existing vocabularies and micro-formats, without being unnecessarily restrictive about how those vocabularies are defined or the types of semantics they might imply.
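Two of these goals can be illustrated with a rough Python sketch: consuming internal and external annotations in the same way, and letting consumers skip vocabularies they don't understand. The target paths such as `Person/Age`, the dictionary layout and the `validation:Range` term are assumptions for the sake of the example, not part of any OData design.

```python
# Annotations are keyed by a target path such as "Person/Age" (an assumed
# addressing scheme); each target maps term names to values.
internal = {  # as if found inside the CSDL returned from $metadata
    "Person": {"display:Title": "Firstname Lastname"},
}
external = {  # as if published by a third party that does not own the service
    "Person/Age": {"validation:Range": (1, 100)},
    "Person": {"foaf:Person": True},
}


def merge_annotations(*sources):
    """Combine annotation sources into one target -> {term: value} mapping."""
    merged = {}
    for source in sources:
        for target, terms in source.items():
            merged.setdefault(target, {}).update(terms)
    return merged


def understood(merged, target, known_prefixes):
    """Return only the terms whose vocabulary prefix the consumer understands."""
    return {
        term: value
        for term, value in merged.get(target, {}).items()
        if term.split(":", 1)[0] in known_prefixes
    }


merged = merge_annotations(internal, external)

# This consumer knows the display and validation vocabularies, but not foaf.
known = {"display", "validation"}
person_terms = understood(merged, "Person", known)
age_terms = understood(merged, "Person/Age", known)
print(person_terms)
print(age_terms)
```

Once merged, the consumer cannot tell whether a term came from the service itself or from a third party. The `foaf:Person` term is simply ignored rather than causing a failure.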

Exploration

Today if you ask for an OData service's metadata (~/service/$metadata) you get back an EDMX document that contains a CSDL schema. Here is an example.
CSDL already supports annotations, which we could use to refer to a vocabulary and its terms. For example this EntityType definition includes both a structural annotation (validation:Constraint) and a simple attribute annotation (display:Title):

[EntityType definition: the XML markup did not survive; only the contents of the validation:Constraint annotation remain]
      ^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,6}$.
      Please enter a valid EmailAddress

For this to be valid XML the display and validation namespaces would have to be introduced somewhere something like this:
<Schema xmlns:display="http://odata.org/vocabularies/display" xmlns:validation="http://odata.org/vocabularies/validation">
Here the URL of the xsd reference identifies the vocabulary globally.
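As a rough illustration of how a consumer could read both kinds of annotation, here is a Python sketch using the standard library's ElementTree. The CSDL fragment is a hypothetical reconstruction. The `Regex` and `ErrorMessage` element names and the property layout are assumptions, the default EDM namespace is left out for brevity, and the pattern is used without its trailing period.

```python
import re
import xml.etree.ElementTree as ET

DISPLAY = "http://odata.org/vocabularies/display"
VALIDATION = "http://odata.org/vocabularies/validation"

# Hypothetical fragment: element names inside validation:Constraint are assumed.
CSDL = r"""
<Schema xmlns:display="http://odata.org/vocabularies/display"
        xmlns:validation="http://odata.org/vocabularies/validation">
  <EntityType Name="Person" display:Title="Firstname Lastname">
    <Property Name="Email" Type="Edm.String">
      <validation:Constraint>
        <validation:Regex>^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,6}$</validation:Regex>
        <validation:ErrorMessage>Please enter a valid EmailAddress</validation:ErrorMessage>
      </validation:Constraint>
    </Property>
  </EntityType>
</Schema>
"""

root = ET.fromstring(CSDL.strip())
entity_type = root.find("EntityType")

# Attribute annotation: ElementTree exposes it under the {namespace}name key.
title = entity_type.get("{%s}Title" % DISPLAY)

# Structural annotation: a namespaced child element of the Email property.
email = entity_type.find("Property[@Name='Email']")
constraint = email.find("{%s}Constraint" % VALIDATION)
regex = constraint.findtext("{%s}Regex" % VALIDATION)
message = constraint.findtext("{%s}ErrorMessage" % VALIDATION)


def validate(value):
    """Return None if the value satisfies the constraint, else the error message."""
    return None if re.match(regex, value, re.IGNORECASE) else message


print(title)
print(validate("alice@example.com"))
print(validate("not-an-email"))
```

The lookups are keyed on the namespace URL, not on the `display` or `validation` prefix. That is what lets the URL identify the vocabulary globally.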
While this allows for completely arbitrary annotations and is extremely expressive, it has a number of down-sides:

  1. Structural annotations (i.e. XML elements) support the full power of XML. While power is good, it comes at a price, and here the price is figuring out how to represent the same thing in, say, JSON. We could come up with a proposal to make CSDL and OData Atom/JSON completely isomorphic, but is that worth the effort? Probably not.
  2. There is no way to refer to something, such as a property, so that you can annotate it externally, which is one of our goals.
  3. If we allow for annotations inline in the data (and let's not forget metadata would just be data in an addressable metadata service) it would change the shape of the resulting JSON structure. For example, the JavaScript expression to access the age property of an entity would need to change from something like object.Age to something like object.Age.Value so that object.Age can hold onto all the 'inline annotations'. This is clearly unacceptable if we want existing 'naive' consumers to continue to work.
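The third point can be shown with a small Python sketch, where `entity["Age"]` stands in for the JavaScript `object.Age`. The `Value` and `Annotations` keys and the `validation:Range` term are assumed shapes for illustration only.

```python
import json

# Payload shape today: the property holds its value directly.
plain = json.loads('{"Name": "Ann", "Age": 42}')

# Hypothetical shape if annotations were inlined next to the value.
inlined = json.loads(
    '{"Name": "Ann",'
    ' "Age": {"Value": 42, "Annotations": {"validation:Range": [1, 100]}}}'
)


def naive_is_adult(entity):
    """A 'naive' consumer written against the plain shape."""
    return entity["Age"] >= 18


plain_result = naive_is_adult(plain)

try:
    naive_is_adult(inlined)
    broke = False
except TypeError:
    # A dict cannot be compared with an int, so the existing consumer fails.
    broke = True

print(plain_result)  # True
print(broke)         # True
```

The value is still there at `inlined["Age"]["Value"]`, but only a consumer rewritten for the new shape would find it.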
