Having a basic familiarity with logic and logical languages is a prerequisite for working with SUMO. Past experience with object modeling in UML or schema creation in XML will be helpful, but not sufficient. One of the better books I've seen from the standpoint of getting up to speed on the practical issues of writing logic expressions is Schaum's Outline of Logic. Online, I suggest Waner and Costenoble's introduction or chapters 9 and 10 of an MIT open course. You might also look at the following: [1, 2]
The place to start is the introductory tutorial with audio, available on the ontology portal home page. If you are going to be creating your own ontology content, or even using some of the domain ontologies, you'll want to run the Sigma browser locally, so you should download and install a copy.
It probably does have it, at least at some level of generality. Try searching the WordNet mappings (enter a word in the English Word box). It is likely you're just searching SUMO and the name you expect for a concept doesn't match the name that was used. A related issue is that the name for a term is just a comment. A term means what its axioms say it means; no more, no less. It may well be that a particular term doesn't accord with the meaning you might intend for its given name. That doesn't mean that it is wrong, just that you might have named it differently. Someone else might have named it differently still. There is no objective basis for deciding on a name. Better to treat each name like an arbitrary symbol, such as GENSYM345432, if the term name doesn't seem evocative for you. If you find a term in SUMO or MILO that is what you are looking for, but too general, also try looking in one of the domain ontologies.
The development of SUMO has involved several approaches, in the rough sequence given below. In philosophical terms, it is closest to "Naturalism". It is both top-down and bottom-up (as well as middle-out). We took a pragmatic and empirical approach by virtue of the domain ontology construction and WordNet mapping, but also a theoretical and philosophically informed approach by working top-down from theories and principles from philosophy. Any comprehensive ontology needs work from top-down and bottom-up. It must be cognizant of theory and yet focused on pragmatic practice and utility. To take one viewpoint over the other is to ignore an influence that can help create a better product. Some of the steps taken were to:
Isn't first order logic too complicated to expect people to use it?
The simple answer is that logic solves an essential problem that other approaches do not. For one explanation of this issue, look at my article "Why Use OWL?". One version of the contrary argument is "looking under the lamppost". Taxonomies, object models, database schemata, controlled vocabularies and the like simply don't capture the meaning of concepts in an unambiguous way that computers can understand, and so don't address the need to capture meaning at all. It's true that FOL is unfamiliar to many people. So was Java when it was first introduced. For that matter, so was assembly language. If a technology has value and can't reasonably be handled by a simpler approach, people will learn it. An end user, however, should no more expect to see SUMO or KIF expressions than he should see the Java code underlying an application.
Why do we need semantics? Isn't XML good enough?
Here's a look at the issues that somewhat parallels my explanation from the article "Why Use OWL?". In logic we might state
Isa(Y,X) ^ Isa(Z,Y)
"Y is a kind of X, and Z is a kind of Y"
The mathematics of logic allows us to conclude
Isa(Z,X)
"Z is a kind of X"
This works in just the same way that 2 + 2 = 4. "2 + 2" means "4". It's not just a procedure for stating a problem-solving process in arithmetic. It's not just a syntax, because we could change the symbols as long as we defined them with a formal theory (like Russell & Whitehead's Principia Mathematica). "2 + 2" necessarily entails "4" whether I have some system to state it or prove it or calculate it or not.
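The entailment can even be reproduced mechanically. Here is a minimal Python sketch (the facts and relation name are illustrative, not drawn from SUMO) that derives Isa(Z,X) by computing the transitive closure of a set of isa assertions:

```python
# Derive Isa(Z, X) from Isa(Y, X) and Isa(Z, Y) by computing the
# transitive closure of a set of "isa" facts (pairs of symbols).
# The facts themselves are illustrative, not drawn from SUMO.

def transitive_closure(facts):
    """Repeatedly apply: isa(a, b) and isa(b, c) entails isa(a, c)."""
    closure = set(facts)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

facts = {("Y", "X"), ("Z", "Y")}                # Isa(Y,X) ^ Isa(Z,Y)
print(("Z", "X") in transitive_closure(facts))  # prints True: Isa(Z,X)
```

The point is that the conclusion follows from the mathematics of entailment; the program merely makes the consequence visible.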
XML syntax doesn't have that inherent property. The following expression
<and> <isa "Y" "X"> <isa "Z" "Y"> </and>

doesn't mean anything, unless one defines the formal semantics behind those statements. In RDFS, there is some semantics, because there is a formal theory of what the symbols mean. So at least
<rdfs:Class rdf:about="http://a.b.c/my-schema#Y">
  <rdfs:subClassOf rdf:resource="http://a.b.c/my-schema#X"/>
</rdfs:Class>
<rdfs:Class rdf:about="http://a.b.c/my-schema#Z">
  <rdfs:subClassOf rdf:resource="http://a.b.c/my-schema#Y"/>
</rdfs:Class>

logically entails
<rdfs:Class rdf:about="http://a.b.c/my-schema#Z">
  <rdfs:subClassOf rdf:resource="http://a.b.c/my-schema#X"/>
</rdfs:Class>

Note that this has nothing to do with human interpretation or any computational process, but is part of the mathematics of logical entailment defined (or not defined) in these various languages. It's always instructive to have someone take a theory that uses familiar symbols and replace the basic vocabulary with random strings. If there is a proper semantics for the language of the theory, it will still be possible to give a definite meaning to the theory.
Isn't first order logic too slow to use?
It is important not to confuse representation with implementation. Performing representation in the same language as the implementation risks using a language that makes it impossible (or at least very difficult or awkward) to capture certain kinds of information. For example if your implementation language doesn't allow for stating if..then rules, then you won't be able to capture that kind of information. But such rules are almost certainly needed to define each term precisely. A better approach is to capture the information and then decide how, and how much, of that knowledge can be expressed and used efficiently in your application. At least you'll have documented carefully what your concepts mean. Just because implementations can't directly reason with English, doesn't mean we shouldn't have English definitions in our data dictionaries.
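To make the point about if..then rules concrete, here is a toy forward-chaining sketch in Python. The rules and facts are invented for illustration; a representation language that cannot state such rules simply has no place to put this information:

```python
# Toy forward chaining over if..then rules, illustrating the kind of
# information a rule-free representation language cannot capture.
# The rules and facts are invented for illustration.

rules = [
    # (antecedents, consequent): if all antecedents hold, conclude consequent
    ({"isDog(Fido)"}, "isMammal(Fido)"),
    ({"isMammal(Fido)"}, "isWarmBlooded(Fido)"),
]

def forward_chain(facts, rules):
    """Apply rules until no new conclusions are produced."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedents, consequent in rules:
            if antecedents <= facts and consequent not in facts:
                facts.add(consequent)
                changed = True
    return facts

derived = forward_chain({"isDog(Fido)"}, rules)
print("isWarmBlooded(Fido)" in derived)  # prints True
```

Whether an application later runs such rules, compiles them away, or ignores them is an implementation decision; capturing them is a representation decision.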
X is/isn't an upper level concept. Why is/isn't it in SUMO?
This is the sort of question that lacks any objective basis for fruitful discussion. We set an arbitrary limit of 1000 terms in SUMO because much more is likely to be too hard to learn in a reasonable amount of time, and much less is likely not to cover a broad enough space of concepts to be useful by itself. As to whether something belongs in SUMO or MILO, it's a judgement call on which reasonable people can disagree.
How do you know SUMO is right?
How do you know your operating system is "right"? It's a combination of a priori formal methods to ensure internal consistency, empirical tests of utility and coverage, and a lot of human testing and inspection. There are no shortcuts here. The specific tasks have been
How do you know SUMO is suitable for all domains or tasks?
We don't, exactly, but we don't have any concrete examples to the contrary either. It's a reasonable conjecture that SUMO might not be optimal for some new task or domain, but that's only a conjecture until a pragmatic and specific example is found and formalized. The range of domains to which SUMO has been applied gives us some empirical evidence to the contrary. There are many ongoing debates in metaphysics about different ways of carving up the world at a high level, but the very existence of such debates should show that there are no critical flaws with the major positions. A typical debate is on models for action and change.
Is SUMO biased toward English, or western culture in general?
SUMO is language independent. The original SUMO term names are in English, but they are at best coincidentally equivalent to English words. SUMO terms have been translated into a variety of different languages, some of which are primary in very different and non-western cultures; these include Hindi, Chinese and Czech. The ease with which these translations have been performed, and the extent to which SUMO is in regular use by non-English users, gives us considerable confidence that there is no deep-seated linguistic or cultural bias in SUMO, any more than there is linguistic or cultural bias in areas of mathematics discovered by the ancient Greeks or Chinese.
Is SUMO done?
It would be better to say that SUMO is stable. The structure of SUMO has not changed appreciably in several years. However, there are still many things which could be improved and elaborated. From time to time we get reports of typos in rules, or other problems which, although isolated, do need fixing. SUMO is likely to continue to evolve, especially as it gets wider usage in reasoning applications. It's also likely that SUMO will not change much compared to the level of change and evolution of the domain ontologies.
How are sorts defined in SUO-KIF and SUMO?
In SUO-KIF, variables are not typed. In SUMO, all relations have defined argument types. SUMO uses the domain relation (as well as domainSubclass, and range for functions) for this. Some logical languages use an explicit syntax such as
(forall (?X:Object, ?Y:Process) ...)

One achieves the same effect in SUMO with
(forall (?X ?Y)
  (and
    (instance ?X Object)
    (instance ?Y Process)
    ...))

or by just using the restrictions inherent in given relations. For example
(forall (?X ?Y)
  (and
    (instrument ?Y ?X)
    ...))

since "instrument" constrains its arguments to Process and Object, respectively.
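The effect of such restrictions can be sketched in a few lines of Python. The argument sorts for instrument (Process and Object) come from the text above; the toy class hierarchy and the checker itself are only an illustration, not Sigma's actual type-checking code:

```python
# Sketch of sort checking driven by declared argument types, in the
# spirit of SUMO's domain declarations for "instrument" (whose
# arguments are constrained to Process and Object, respectively).
# The class hierarchy and checker are illustrative, not Sigma's code.

domains = {"instrument": ["Process", "Object"]}       # declared sorts
subclass = {"Cutting": "Process", "Knife": "Object"}  # toy hierarchy

def is_instance_of(cls, sort):
    """Walk up the (single-parent) toy hierarchy looking for sort."""
    while cls is not None:
        if cls == sort:
            return True
        cls = subclass.get(cls)
    return False

def check(relation, arg_classes):
    """True iff each argument's class satisfies the declared sort."""
    sorts = domains[relation]
    return all(is_instance_of(c, s) for c, s in zip(arg_classes, sorts))

print(check("instrument", ["Cutting", "Knife"]))  # True: sorts respected
print(check("instrument", ["Knife", "Cutting"]))  # False: arguments swapped
```

This is why using instrument already restricts ?X and ?Y without any explicit quantifier annotation.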
How does SUMO employ higher order logic?
This is a difficult issue, since higher order logic is very difficult to reason with efficiently, but very hard to do without from a representational standpoint. Higher order expressions are used when necessary in SUMO. A practical reasoning system may also wish to define concepts which effectively incorporate modal or temporal parameters into domain specific predicates. The tradeoff though is that such predicates will be less reusable. It is a delicate balance that must be maintained. Specifically, SUMO does include a number of predicates that take formulae as arguments, such as holdsDuring, believes, and KappaFn. In Sigma, we also perform a number of "tricks" which allow the user to state things which appear to be higher order, but which are in fact first order and have a simple syntactic transformation to standard first order form. We also integrate with the THF language to do real HOL with LEO-II and other HOL provers.
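To give the flavor of such a syntactic transformation (this is only an illustration, not Sigma's actual algorithm): one common trick is to push the temporal argument of holdsDuring into the embedded predicate, so that an apparently higher order statement like (holdsDuring ?T (P a b)) becomes a first order statement (P-during ?T a b) over a fresh predicate:

```python
# Illustrative sketch (not Sigma's actual transformation): rewrite an
# apparently higher order formula (holdsDuring ?T (P a b)) into a
# first order one by folding the temporal argument into a fresh
# predicate, here named by suffixing "-during".

def relativize(formula):
    """formula is a nested tuple, e.g. ("holdsDuring", "?T", ("P", "a", "b"))."""
    op, time, inner = formula
    assert op == "holdsDuring", "sketch handles only holdsDuring"
    pred, *args = inner
    return (pred + "-during", time, *args)

print(relativize(("holdsDuring", "?T", ("P", "a", "b"))))
# prints ('P-during', '?T', 'a', 'b')
```

The rewritten form carries the same information but is plain first order syntax, which standard FOL provers can handle directly.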