I have been reading a lot about Ontologies lately. Looking at the job offers of companies building Intelligent Assistants, such as Amazon Echo or IBM Watson, this seems to be a favoured approach to managing Knowledge.
As a matter of fact, I see two approaches to creating chatbots. The Machine Learning approach is doable only when you have a lot of labelled data - I mean REALLY a lot. However, bots have been around for decades, and what were people doing before that? Among other things, they were creating Ontologies.
Ontology is a concept that comes from philosophy, investigated in particular by Wittgenstein, but in Computer Science it denotes a formal naming and definition of the types, properties, and interrelationships of the entities that exist in a particular domain of discourse.
Ontologies are one of the many tools for dealing with Semantics - the traditional way of processing language before Machine Learning became cool. Now many people are blinded by successes in Machine Learning in fields such as Automatic Translation, but these are possible only if you have an outrageous amount of data.
If your domain knowledge is your starting point, you should identify the relevant Concepts and the various Labels used to refer to them. Relations may be important, but even more useful is establishing Rules and Facts.
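To make this concrete, here is a minimal sketch of what Concepts, Labels and Facts could look like, assuming Python with the rdflib library and an invented pizza-ordering domain (all names and URIs are purely illustrative):

```python
# A minimal sketch, assuming the rdflib library and an invented
# pizza-ordering domain; names and URIs are purely illustrative.
from rdflib import Graph, Literal, Namespace, RDF, RDFS

EX = Namespace("http://example.org/pizza/")
g = Graph()

# Concepts: the types of things that exist in the domain.
g.add((EX.Pizza, RDF.type, RDFS.Class))
g.add((EX.Topping, RDF.type, RDFS.Class))

# Labels: the many ways users may refer to a Concept.
g.add((EX.Pizza, RDFS.label, Literal("pizza")))
g.add((EX.Pizza, RDFS.label, Literal("pie")))

# Facts: concrete statements about individuals in the domain.
g.add((EX.Margherita, RDF.type, EX.Pizza))
g.add((EX.Margherita, EX.hasTopping, EX.Mozzarella))

print(g.serialize(format="turtle"))
```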
Now you can derive Questions from Facts - the art of turning each statement the bot knows into the questions a user might ask about it.
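A naive sketch of this derivation, assuming Facts are stored as (subject, predicate, object) triples and using made-up question templates:

```python
# A naive sketch: derive Questions from (subject, predicate, object) Facts;
# the facts and templates are invented for illustration.
FACTS = [
    ("Margherita", "hasTopping", "Mozzarella"),
    ("Margherita", "hasPrice", "8 EUR"),
]

# One question template per predicate the bot knows about.
TEMPLATES = {
    "hasTopping": "Which toppings does the {subject} have?",
    "hasPrice": "How much does the {subject} cost?",
}

def derive_questions(facts):
    """Turn each known Fact into a Question the bot should be able to answer."""
    for subject, predicate, obj in facts:
        template = TEMPLATES.get(predicate)
        if template:
            yield template.format(subject=subject), obj

for question, answer in derive_questions(FACTS):
    print(f"{question} -> {answer}")
```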
Facts, Concepts and Rules can be established by reading literature about the domain.
For chatbots, you are probably not going to need Taxonomies (as you will most likely be limited to one domain, unless you are Amazon). Semantic relations, along of course with Facts, will be more valuable if you need to recognize and act on entities.
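As an illustration of the "recognize and act on entities" part, here is a sketch that assumes spaCy (with the en_core_web_sm model installed) and a hand-made mapping from surface Labels to ontology Concepts; everything domain-specific here is invented:

```python
# A sketch of recognizing entities in a user utterance and mapping them
# to ontology Concepts; assumes spaCy with the en_core_web_sm model
# installed, and the label-to-concept mapping is invented.
import spacy

nlp = spacy.load("en_core_web_sm")

# Labels from the ontology, mapped to the Concept they identify.
LABEL_TO_CONCEPT = {
    "pizza": "Pizza",
    "pie": "Pizza",
    "mozzarella": "Topping",
}

def recognize_concepts(utterance):
    """Return (surface form, Concept) pairs found in the utterance."""
    doc = nlp(utterance)
    hits = []
    for token in doc:
        concept = LABEL_TO_CONCEPT.get(token.lemma_.lower())
        if concept:
            hits.append((token.text, concept))
    return hits

print(recognize_concepts("I'd like a pie with extra mozzarella"))
```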
However, you can still use Machine Learning for ontologies. Assuming you managed to gather a lot of documents related to the domain, you can use techniques ranging from Word2Vec to extracting relations between noun chunks. The generation of a full-fledged ontology from a collection of documents, however, is hindered by the same problem described before - the data would need to be labelled.
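A rough sketch of both ideas, assuming gensim for Word2Vec and spaCy for noun chunks (the corpus here is a toy stand-in for your real domain documents):

```python
# A rough sketch of mining candidate ontology material from domain text;
# assumes gensim and spaCy (en_core_web_sm), and the corpus is a toy example.
import spacy
from gensim.models import Word2Vec

nlp = spacy.load("en_core_web_sm")
corpus = [
    "The margherita pizza is topped with mozzarella and basil.",
    "The marinara pizza is topped with tomato and garlic.",
]

# Word2Vec: find terms used in similar contexts (candidate related Concepts).
sentences = [[t.lower_ for t in nlp(text) if t.is_alpha] for text in corpus]
w2v = Word2Vec(sentences, vector_size=50, window=3, min_count=1, epochs=50)
print(w2v.wv.most_similar("mozzarella", topn=3))

# Noun chunks plus the verb linking them: candidate (subject, relation, object)
# triples to review before adding them to the ontology.
for text in corpus:
    doc = nlp(text)
    chunks = list(doc.noun_chunks)
    if len(chunks) >= 2:
        verbs = [t.lemma_ for t in doc if t.pos_ == "VERB"]
        relation = verbs[0] if verbs else "related_to"
        print((chunks[0].text, relation, chunks[1].text))
```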
Last but not least, you can look at the many existing ontologies, or possibly even at Linked Data.
For instance, DBpedia is a trove of structured information. The IBM computer that won Jeopardy! used Linked Data - but for a limited-knowledge chatbot it might be overkill. Anyway, this deserves a post of its own.
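Just to give a taste of what querying Linked Data looks like, here is a sketch that assumes the SPARQLWrapper library and the public DBpedia endpoint (the query itself is only an example):

```python
# A small sketch of querying DBpedia as Linked Data; assumes the
# SPARQLWrapper library and the public endpoint, and the query is
# only an example (English abstract of the "Chatbot" resource).
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setQuery("""
    PREFIX dbo: <http://dbpedia.org/ontology/>
    SELECT ?abstract WHERE {
      <http://dbpedia.org/resource/Chatbot> dbo:abstract ?abstract .
      FILTER (lang(?abstract) = "en")
    }
""")
sparql.setReturnFormat(JSON)

results = sparql.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["abstract"]["value"][:200])
```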