Note: This article is reproduced from stackoverflow.com.

How practical are units of measure ontologies in RDF?

Published on 2020-04-20 10:39:43

I am creating a collection of materials in RDF. I have come across two approaches for handling units of measure:

  1. By linking a descriptive unit name to the RDF property:

     prop:density prop:hasUnits "kg/m3" .

     <x:MyBrick> a x:Material ;
       prop:density "1676" .

  2. Using an existing ontology library such as the Ontology of units of Measure (OM). Assigning units this way is much more complex because it involves creating multiple objects. Here is how I have assigned the same density to a material:

     <x:MyBrick> a x:Material ;
       om:hasPhenomenon <x:density_MyBrick> .

     <x:density_MyBrick> a om:Density ;
       om:hasValue <x:1676_kilogramspercubicmetre> .

     <x:1676_kilogramspercubicmetre> a om:Measure ;
       om:hasNumericalValue 1.676E3 ;
       om:hasUnit om:kilogramPerCubicmetre .

I have reviewed the different use cases for UOM ontologies, but these mainly focus on what the ontologies can do rather than how they do it and whether it is practical. For instance, I imagine that using the unit ontology makes querying much more complex if you simply want to return all the values of a specified material's attributes. There are also several attributes that the ontology does not cover, such as surface roughness, and it is not clear how to extend it.
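To make the querying concern concrete, here is a minimal Python sketch (not a real triplestore or SPARQL engine) that holds the two snippets above as plain tuples and looks up the density of a material in each. The helper names `objects`, `density_flat` and `density_om` are hypothetical and only serve the illustration.

```python
# Triples from approach 1, written as (subject, predicate, object) tuples.
FLAT = [
    ("x:MyBrick", "rdf:type", "x:Material"),
    ("x:MyBrick", "prop:density", "1676"),
    ("prop:density", "prop:hasUnits", "kg/m3"),
]

# Triples from approach 2 (Ontology of units of Measure style).
OM = [
    ("x:MyBrick", "rdf:type", "x:Material"),
    ("x:MyBrick", "om:hasPhenomenon", "x:density_MyBrick"),
    ("x:density_MyBrick", "rdf:type", "om:Density"),
    ("x:density_MyBrick", "om:hasValue", "x:1676_kilogramspercubicmetre"),
    ("x:1676_kilogramspercubicmetre", "rdf:type", "om:Measure"),
    ("x:1676_kilogramspercubicmetre", "om:hasNumericalValue", 1676.0),
    ("x:1676_kilogramspercubicmetre", "om:hasUnit", "om:kilogramPerCubicmetre"),
]


def objects(triples, subject, predicate):
    """Return every object of the triples matching subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def density_flat(triples, material):
    """Approach 1: the value hangs off the material, the unit off the property."""
    return [
        (value, unit)
        for value in objects(triples, material, "prop:density")
        for unit in objects(triples, "prop:density", "prop:hasUnits")
    ]


def density_om(triples, material):
    """Approach 2: material -> phenomenon -> measure -> (value, unit)."""
    results = []
    for phenomenon in objects(triples, material, "om:hasPhenomenon"):
        # Keep only phenomena typed as densities.
        if "om:Density" not in objects(triples, phenomenon, "rdf:type"):
            continue
        for measure in objects(triples, phenomenon, "om:hasValue"):
            for value in objects(triples, measure, "om:hasNumericalValue"):
                for unit in objects(triples, measure, "om:hasUnit"):
                    results.append((value, unit))
    return results


print(density_flat(FLAT, "x:MyBrick"))
print(density_om(OM, "x:MyBrick"))
```

The second lookup needs two extra hops and a type check, which is roughly the additional pattern matching a query over the OM-style data would have to express.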

I much prefer the first approach. It is much cleaner and more flexible, and it provides the user with useful information that would not otherwise be available in a typical database. I would try to handle unit conversions on the app side, so my priority is storing the information in a query-friendly way. My worry is that, by adopting this approach, I will be limiting the functionality further down the line. Are there any major advantages and disadvantages of choosing one approach over the other?

Questioner: alkey
Answer by Noor (2020-02-08 03:56):

The problem that is mentioned in the question is a known issue within the RDF community and has been discussed in peer-reviewed papers.

With respect to the second approach mentioned in the question, one may think that it is completely unnatural to write using intermediary objects. However, ontologies that describe measurements of things have often been written with several aspects in mind. For example, when something expands over time, intermediary objects may be needed to record the instant at which each measurement was taken. Of course, there are situations where these additional descriptions are not required.

The problem with the first approach is that it completely restricts prop:density to only one unit. If you have a density in a different unit, you will have to perform conversions.

I think that an easy solution in your context is to introduce specific datatypes.

@prefix x: <http://example.com/data#> .
@prefix o: <http://example.com/ontology#> .

x:MyBrick a x:Material ;
    o:density "1676"^^o:kg-m3 .

In your ontology with IRI http://example.com/ontology, you can further describe the resource o:kg-m3. For example, you can say that it is a datatype for densities measured in kilograms per cubic metre, as follows:

@prefix o: <http://example.com/ontology#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

o:kg-m3 a rdfs:Datatype ;
    rdfs:label "Kilogram per cubic metre datatype (kg/m3)" ;
    rdfs:comment "A datatype to type densities measured in kilograms per cubic metre" .

o:kg-l a rdfs:Datatype ;
    rdfs:label "Kilogram per liter datatype (kg/L)" ;
    rdfs:comment "A datatype to type densities measured in kilograms per liter" .

As you can see above, an additional datatype o:kg-l has been defined. Now, using the same property, you can specify densities measured in different units. For example:

@prefix x: <http://example.com/data#> .
@prefix o: <http://example.com/ontology#> .

x:MyBrick1 a x:Material ;
    o:density "1676"^^o:kg-m3 .

x:MyBrick2 a x:Material ;
    o:density "200"^^o:kg-l .

x:MyBrick3 a x:Material ;
    o:density "200a"^^o:kg-m3 .

As you can see above, three instances of x:Material and their respective o:density values have been defined. Looking at these triples, you will notice that in the last one the value of o:density is 200a. You will agree that this is not a well-formed density value. You may also want to know which object, x:MyBrick1 or x:MyBrick2, has the higher density. A conformant RDF triplestore will not be able to recognize that the value in the last triple is not well-formed, because it knows nothing about the custom datatype. Likewise, a conformant SPARQL engine will not be able to perform algebraic operations on o:density values. Nevertheless, you can customize the implementation of an RDF triplestore or SPARQL engine to suit these needs. This paper [1] describes how you can achieve this.
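If you would rather keep these checks on the application side, as the question suggests, the following minimal Python sketch shows one way to validate and compare the typed literals above. It is not the method from [1]. The conversion table `TO_KG_PER_M3` and the function `parse_density` are hypothetical names introduced here, and the sketch assumes the lexical form of both datatypes is a plain decimal number.

```python
from decimal import Decimal, InvalidOperation

# Assumed factors for converting each custom datatype to kg/m3.
# 1 L = 0.001 m3, so 1 kg/L = 1000 kg/m3.
TO_KG_PER_M3 = {
    "o:kg-m3": Decimal("1"),
    "o:kg-l": Decimal("1000"),
}


def parse_density(lexical, datatype):
    """Return the density in kg/m3, or None if the literal is not usable."""
    factor = TO_KG_PER_M3.get(datatype)
    if factor is None:
        return None  # unknown datatype
    try:
        value = Decimal(lexical)
    except InvalidOperation:
        return None  # ill-formed lexical form such as "200a"
    if not value.is_finite():
        return None  # reject NaN and infinities
    return value * factor


# The three literals from the example, as (lexical form, datatype) pairs.
literals = {
    "x:MyBrick1": ("1676", "o:kg-m3"),
    "x:MyBrick2": ("200", "o:kg-l"),
    "x:MyBrick3": ("200a", "o:kg-m3"),
}

parsed = {name: parse_density(*literal) for name, literal in literals.items()}
ill_formed = [name for name, value in parsed.items() if value is None]
well_formed = {name: value for name, value in parsed.items() if value is not None}
densest = max(well_formed, key=well_formed.get)

print("ill-formed:", ill_formed)
print("highest density:", densest)
```

Normalizing every value to one unit before comparing keeps the stored data unchanged while still letting the application rank materials whose densities were recorded in different units.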

[1] Lefrançois, Maxime, and Antoine Zimmermann. "Supporting arbitrary custom datatypes in RDF and SPARQL." European Semantic Web Conference. Springer, Cham, 2016. (https://www.emse.fr/~zimmermann/Papers/eswc2016.pdf)