Sunday, January 31, 2010

An Approach that Uses ODM

Wow, it's been a long time since I had time to write anything here. I need to get back into the habit.

Today I found a really good approach that I think I will adopt for my own OWL profile for UML. Rather than transforming directly into and out of OWL, this approach is a model-to-model transformation from UML to the OWL metamodel of the Ontology Definition Metamodel (ODM). It plays nicely with ODM rather than ignoring or sidestepping it. Also, having all the OWL metaclasses in one place makes it easier to determine whether I have 100% coverage of the OWL concepts.

The other thing I like is that the author(s) of this approach understand how to use UML association ends as OWL properties. After all, the UML 2.x specification defines association ends as specializations of properties, so why use the entire association as the property? The only complaint I have about their example diagram is that the hasArtist property should have hasSculptor and hasPainter as sub-properties, and the creates property should have sculpts and paints as sub-properties. This is easy to do in UML with the subsets meta-association between the association ends, which appears as a constraint on the diagram. (It is also possible to use association generalization, but that can sometimes have issues I won't cover here.)


Saturday, January 23, 2010

What is an ontology? | Ontogenesis

I just read an excellent blog posting on the definition of an ontology: What is an ontology? | Ontogenesis. The only complaint I have is that the authors don't say anything about the difference between the real world and information about the real world or the difference between open world and closed world. This allows some database or OO designers to think they are building ontologies. For example, a job applicant might be the role a person plays with respect to a particular job. In the HR department, they treat the database record about the applicant as the applicant himself. In addition, every person has exactly one biological mother, but a database about people may not record that relation, so its multiplicity might be optional or even disallowed. Thus, databases and OO systems should subset and possibly augment an ontology to lend semantics to their elements, but they are rarely ontologies.

Thursday, December 24, 2009

UML Associations in ODM

One of the major differences between the Ontology Definition Metamodel (ODM) and my UML profile for OWL is how associations are modeled. ODM treats the name of an association, together with its reading-order arrow, as the property, rather than using the association ends. I find this rather odd. Not only are such models unusual, but the UML specification itself uses no association names in its diagrams, except for the examples of that notation.

The way I have been modeling things at the business level for many years does not use the association name at all; it uses the association end names. These association ends are properties in the UML metamodel, so why shouldn't they represent properties in an ontology? One might argue that in UML, classes own their properties, whereas in an ontology, properties are global and not owned by anything. I've made one simple change in interpretation of the UML specification to make this work: ownership does not carry over into OWL. Instead, a property's domain corresponds to its owning class, and its range corresponds to its type. Because an unspecified domain or range is interpreted as Thing, and because, practically speaking, every class in an ontology is a specialization of Thing, this interpretation works for all properties. When the association is bidirectional, the other end is an inverse in OWL, which is very convenient. Therefore, I can model in my usual way, sprinkle on a few stereotypes, and generate OWL.
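Here is a minimal Python sketch of that interpretation, not the actual transformation in my profile. The `AssociationEnd` class, the `end_to_triples` function, and the example names are hypothetical; the output is a plain list of subject/predicate/object tuples rather than a real OWL serialization.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

THING = "owl:Thing"  # used when the domain or range is unspecified


@dataclass
class AssociationEnd:
    """Hypothetical stand-in for a UML association end."""
    name: str
    owner: Optional[str] = None   # owning class, if any
    type_: Optional[str] = None   # type of the end, if any


def end_to_triples(end: AssociationEnd,
                   opposite: Optional[AssociationEnd] = None
                   ) -> List[Tuple[str, str, str]]:
    """Map one association end to property statements.

    Owning class -> domain, type -> range, opposite end -> inverse.
    """
    triples = [
        (end.name, "rdf:type", "owl:ObjectProperty"),
        (end.name, "rdfs:domain", end.owner or THING),
        (end.name, "rdfs:range", end.type_ or THING),
    ]
    if opposite is not None:
        # A bidirectional association yields an inverse property.
        triples.append((end.name, "owl:inverseOf", opposite.name))
    return triples


creates = AssociationEnd("creates", owner="Artist", type_="Artwork")
has_artist = AssociationEnd("hasArtist", owner="Artwork", type_="Artist")

for triple in end_to_triples(creates, has_artist):
    print(triple)
```

Calling `end_to_triples` once per end gives each property its own domain, range, and inverse, which is all the "ownership" that survives the trip into OWL.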

There are three ways to name association ends. Many metamodelers from the OMG prefer noun phrases. Ontologists prefer verb phrases. Data modelers prefer prepositional phrases. Transforming these styles into an ontology is a matter of prepending the word "has" to noun phrases, leaving verb phrases alone, and prepending the word "is" to prepositional phrases.
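The naming rule can be sketched in a few lines of Python. This is only an illustration: the function name `property_name` and the `style` labels are hypothetical, and the caller has to say which style an end name uses because the sketch makes no attempt to detect it.

```python
def property_name(end_name: str, style: str) -> str:
    """Turn an association end name into a lowerCamelCase property name.

    style is one of "noun", "verb", or "preposition".
    """
    words = end_name.split()
    if not words:
        raise ValueError("end name must not be empty")
    if style == "noun":
        words = ["has"] + words        # artist -> has artist
    elif style == "preposition":
        words = ["is"] + words         # employed by -> is employed by
    elif style != "verb":              # verb phrases are left alone
        raise ValueError(f"unknown style: {style}")
    first, rest = words[0], words[1:]
    return first[0].lower() + first[1:] + "".join(
        w[0].upper() + w[1:] for w in rest
    )


print(property_name("artist", "noun"))
print(property_name("creates", "verb"))
print(property_name("employed by", "preposition"))
```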

Here is an example of an association with its ends named as verb phrases:

Here is an example of an association with its ends named as noun phrases:


Here is an example of an association with its ends named as prepositional phrases:




Wednesday, December 23, 2009

UML Profile for OWL, Part 3: Ontology

This is Part 3 of a multi-part series describing my profile for OWL. In this part I start to descend the stereotype hierarchy. This part of the descent is about the «Ontology» stereotype.

Generally, an OWL ontology is a Web-accessible resource with a URL, much like a Web page. Sometimes a Web server builds this resource from statements in a triple store so you can retrieve it. Sometimes the ontology lives on your computer's disk and the URL starts with "file://". In any case, it behaves like a Web resource, so for the purposes of this posting I will describe it as such.

An OWL ontology generally uses "hash URIs" (e.g., "#Person"). These hash URIs are a shortcut for prepending the URI of the ontology. This causes a problem when an ontology is stored in several different locations and someone wants to refer to something in it. My understanding of the way to resolve this problem is to declare a "default namespace" for the ontology. This causes the hash URIs to be relative to the default namespace rather than the current location of the ontology. (Please correct my understanding in a comment if I'm wrong.)
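A small Python sketch shows why a fixed base helps. The `resolve` function, its parameter names, and the example URLs are all hypothetical; `declared_base` stands in for whatever declaration fixes the base (in RDF/XML, `xml:base` is the attribute that sets the base for relative references).

```python
from typing import Optional
from urllib.parse import urljoin


def resolve(hash_uri: str,
            retrieved_from: str,
            declared_base: Optional[str] = None) -> str:
    """Expand a hash URI such as "#Person" to a full URI.

    Without a declared base, the hash URI is relative to wherever the
    ontology was retrieved from; with one, the location no longer matters.
    """
    base = declared_base if declared_base is not None else retrieved_from
    return urljoin(base, hash_uri)


# The same ontology stored in two (made-up) places.
original = "http://example.org/art"
mirror = "http://mirror.example.net/copies/art.owl"

print(resolve("#Person", original))
print(resolve("#Person", mirror))                          # a different URI
print(resolve("#Person", mirror, declared_base=original))  # stable again
```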

The «Ontology» stereotype has several tagged values. Because the «Ontology» stereotype is a specialization of the «RDF Resource» stereotype, it inherits URI and namespace prefix, which were explained in Part 2. In addition to these inherited tagged values, the «Ontology» stereotype adds a default namespace.

Although I haven't constrained what you can apply the «Ontology» stereotype to, it makes sense to apply it to a UML Package. The UML elements in that package will all be in their owning package's default namespace, unless an element has a stereotype that says otherwise.

The namespace prefix is still useful in UML land. Rather than having every ontology responsible for defining a different namespace for every ontology it references, in this profile every ontology has its own namespace prefix, which helps keep things consistent across ontologies. (Although this approach will eventually give me trouble going from OWL to UML, I haven't yet run into a case where the prefix is inconsistent. I think all I would have to do is increase the maximum cardinality of the namespace prefix to deal with this. I'll deal with that later.)

In UML it is convenient to have multiple packages stereotyped as «Ontology» to take advantage of UML namespaces. Most UML tools allow you to show the owner of a class on a diagram, making it unnecessary to clutter the diagram with stereotypes and tagged values when you use a class from another ontology. You simply create a package for the borrowed classes and apply the «Ontology» stereotype to that package.


Saturday, December 19, 2009

Class Name Capitalization?

In another posting I mentioned the convention for capitalizing class names. I've been looking into this more deeply today, and I'm even more puzzled about how we got to where we are than I was when I started!

The convention I've seen the most is UpperCamelCase, which I've seen in object-oriented programming languages, OWL, RDFS, and XML. Where did this convention come from, especially the distinction between UpperCamelCase for class names and lowerCamelCase for properties?

When I model for an audience of business people, I make the names non-technical because they seem to find anything that smells of technology offensive. To make the names non-technical, I use spaces between the words, title case for the words themselves, and I spell out acronyms that would be unfamiliar to a layman. I rely on transformations to convert those names into CamelCase for technical language parsers.
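As a rough illustration of the kind of transformation I mean, here is a Python sketch that turns a business-friendly name into CamelCase. The function name `to_camel_case` is hypothetical, and the sketch does not try to abbreviate spelled-out acronyms back to their short forms.

```python
import re


def to_camel_case(name: str, upper: bool = True) -> str:
    """Convert a spaced, title-case name to CamelCase.

    upper=True gives UpperCamelCase (class names);
    upper=False gives lowerCamelCase (property names).
    """
    # Treat any run of letters or digits as a word; spaces and
    # punctuation such as hyphens act as separators.
    words = re.findall(r"[A-Za-z0-9]+", name)
    if not words:
        raise ValueError("name contains no words")
    parts = [w[0].upper() + w[1:] for w in words]
    if not upper:
        parts[0] = parts[0][0].lower() + parts[0][1:]
    return "".join(parts)


print(to_camel_case("Association End"))
print(to_camel_case("Job Applicant", upper=False))
```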

Still, why do people expect to see UpperCamelCase in computer languages or Title Case in normal English for concepts? One answer I can conjure is that it is useful in English sentences. It signals to the reader that I'm being specific when I capitalize Association End, for example, because it makes it clear that I'm talking about the UML meta-class called AssociationEnd rather than the more general idea of the end of an association, which might include a name, multiplicity, stereotype, and a constraint.

Why are the capitalization rules for conceptual models different from an encyclopedia such as Wikipedia or Encyclopedia Britannica? Wikipedia naming conventions are more or less sentence case, although Wikipedia acknowledges that "Outside of Wikipedia, and within certain specific fields (such as medicine), the usage of all-capital terms may be a proper way to feature new or important items." The Encyclopedia Britannica convention seems to only capitalize proper nouns, as in the example computer programming language. If we were to follow these encyclopedia capitalization rules, proper nouns would only apply to instances (except for cases where a class name contains a proper noun, as in "Epstein-Barr virus"). Do we have the capitalization conventions backwards for classes and instances in an ontology / conceptual model?

One reason may be what the Wikipedia article on capitalization says: "Common nouns may be capitalized when used as names for the entire class of such things, e.g. what a piece of work is Man." This makes me think maybe we're on the right track. However, The Open Biomedical Ontologies Foundry naming wiki page disagrees! It recommends: "Don’t enforce dogmatically, but prefer lower case beginnings for class and property names. Capture names just as they would appear in normal English written text, i.e. where acronyms and proper nouns cannot be avoided in names they should be capitalized." For example, they say to "Use ‘microarray’, ‘DNA microarray’, ‘pH value’, ‘Golgi apparatus’."

So, what is the right way to capitalize concept names in an ontology / conceptual model? In the world of RDFS and OWL, the convention seems to be that the first letter is always capitalized. I think what is important is consistency, but I do question the convention.

Can anyone shed some light on this?


    Friday, December 18, 2009

    Enterprise Ontology

    I just found a great blog entry about Modeling an Organization Using Enterprise Ontology. It was an enlightening read about something I thought to be true but have never seen expressed. Highly recommended for all enterprise architecture practitioners.

    Ontology Definition Redux

    I just had a debate with an ontologist friend about whether or not an ontology is a kind of a taxonomy and a taxonomy is a kind of controlled vocabulary.

    I believe what is true in Formal Concept Analysis should be true in every ontology: every generalization must have fewer attributes than its specializations, and every specialization must have more attributes than its generalizations.

    If we were to create an ontology about controlled vocabularies, taxonomies, and ontologies, we would say that a controlled vocabulary is a set of words; a taxonomy is a set of terms with hypernym, hyponym, and synonym relations between the terms; and an ontology is a set of terms with hypernym, hyponym, and synonym relations between the terms, with additional relations and perhaps axioms. Therefore, I think an ontology is a kind of a taxonomy and a taxonomy is a kind of controlled vocabulary.
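The argument above can be written as a subset check in Python. This is only a sketch of my reasoning: the feature labels and the `is_kind_of` function are made up for illustration, and I have left out the "perhaps axioms" part because it is optional.

```python
# Each concept is described by the set of features it must have.
FEATURES = {
    "ControlledVocabulary": {"terms"},
    "Taxonomy": {"terms", "hypernym/hyponym/synonym relations"},
    "Ontology": {"terms", "hypernym/hyponym/synonym relations",
                 "additional relations"},
}


def is_kind_of(specific: str, general: str) -> bool:
    """True when the general concept has strictly fewer features,
    all of which the specific concept also has."""
    return FEATURES[general] < FEATURES[specific]  # proper subset


print(is_kind_of("Taxonomy", "ControlledVocabulary"))
print(is_kind_of("Ontology", "Taxonomy"))
print(is_kind_of("Taxonomy", "Ontology"))
```

With these feature sets, the first two checks hold and the third does not, which is the chain I am arguing for.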

    What do you think?

    Here is more supporting evidence I've dug up:
    • CLARITY IN THE USAGE OF THE TERMS ONTOLOGY, TAXONOMY AND CLASSIFICATION: "A very simple ontology could perhaps better be named 'taxonomy'"
    • What are the differences between a vocabulary, a taxonomy, a thesaurus, an ontology, and a meta-model?:
      • "A controlled vocabulary is a list of terms that have been enumerated explicitly. This list is controlled by and is available from a controlled vocabulary registration authority. All terms in a controlled vocabulary should have an unambiguous, non-redundant definition."
      • "A taxonomy is a collection of controlled vocabulary terms organized into a hierarchical structure."
      • "A formal ontology is a controlled vocabulary expressed in an ontology representation language."
      • "The word 'ontology' has been used to refer to all of the above things."
    • Ontology Development 101: A Guide to Creating Your First Ontology:
      • "The ontologies on the Web range from large taxonomies categorizing Web sites (such as on Yahoo!) to categorizations of products for sale and their features (such as on Amazon.com)."
    • Organizing Knowledge with Ontologies and Taxonomies:
      • "The concepts are defined in an ontology that maps the main ideas and their relationships. It also include the creation of a set of terms that defines how to label items according to the concepts described in the conceptual map. This structured set of terms is a taxonomy."
      • "Taxonomies are the classification scheme used to categorize a set of information items. They represent an agreed vocabulary of topics arranged around a particular theme."
    • Building and Using Ontologies
      • "An ontology may take a variety of forms, but necessarily it will include a vocabulary of terms, and some specification of their meaning."
