The Learning Page
This is the learning page of the Operation Manual for the Corby application. The other pages of the manual are:
Main page – This is the main page of the Operation Manual.
Using Corby – Describes the usual operation procedures.
Options – Describes the options that control application behaviour.
Resources – Explains how to best use your computing resources.
Knowledge base – Describes back-up, restore and recovery procedures.
This page includes the following sections:
… and finally some fun and games
Perhaps you came to this page as a last resort, after trying to get Corby to say something and receiving nothing except “…” in reply. Then, in the grand old tradition of “when everything else fails, read the flying manual”, you decided to come here and give it one last chance before declaring the thing broken and giving up.
If this is the case, let me assure you of two things. The first is that you have come to the right place: here you will learn how to teach Corby the appropriate responses to your questions, and then perhaps even become one of its regular users. The second is that the behaviour displayed by Corby above is perfectly normal: everything that Corby knows must be learned. Like a newborn baby, the first time you use Corby its knowledge base, where it keeps everything it knows, is empty, and therefore it will not know how to answer any of your questions.
It is easy enough to make Corby say something: when it replies with “…” to a question of yours, open the Feedback dialog box with the Action/Feedback command or hit F9. Then write the correct response in the sub-window titled “The system response should have been:” and hit OK. From that point on, Corby will remember the correct response to your question.
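At its simplest, this feedback loop can be pictured as a question-to-response lookup. The sketch below is purely illustrative; the class and method names are invented, and Corby's real knowledge base stores links between knowledge elements, not a literal dictionary.

```python
# Toy sketch of the feedback loop: a plain question -> response lookup.
# All names here are invented for illustration only.

class CannedResponder:
    def __init__(self):
        self.responses = {}          # question -> remembered response

    def ask(self, question):
        # Unknown questions get the "..." reply described above.
        return self.responses.get(question, "...")

    def feedback(self, question, correct_response):
        # Equivalent of typing into "The system response should have been:".
        self.responses[question] = correct_response

bot = CannedResponder()
print(bot.ask("say cat"))            # "..." - nothing learned yet
bot.feedback("say cat", "cat")
print(bot.ask("say cat"))            # "cat"
```

As the next paragraph notes, this is exactly the "tape recorder" level of behaviour; it answers only questions it has seen verbatim.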
This, however, is not very exciting. Providing canned responses to user questions is
what a tape recorder is capable of. An intelligent device should be able to
provide responses to questions it has never seen before. But that requires
inference and this in turn demands the use of concepts.
Everybody knows what a table is, yet there are tables of many shapes and sizes; if I show you a table with a shape you have never seen, you will still be able to identify it as a table. That is because you have the “table” concept. We acquire concepts by being exposed to many exemplars of an object; then we are able to identify the object’s components and distinguish the ones that are essential to the object from the ones that are accessory.
In the case of a table, the essential elements seem to be a horizontal surface of any shape or size, sustained above the ground by one or more legs. Other table characteristics, such as colour, are not essential to the table concept.
The concept then enables us to refer to the object as if it were a collection of its essential components. These, in turn, can be other, lower-level concepts. More generally, we can say that a concept is a compact representation of a family of similar ideas.
Concepts bring many useful characteristics to an intelligent entity. For starters, they provide a very powerful data compression mechanism. Instead of having to remember each possible combination of the concept’s components, we just have to remember the components themselves. It is true that by doing this we lose some information, because not all combinations are valid, but one can’t have everything.
The second advantage of concepts is that they provide a very good inductive inference
mechanism. This type of inference uses the level of similarity between two
things. If these two things are two instances of the same concept, they can be
treated in the same way. Even if they belong to different concepts we can
establish their similarity by the number of elements that the two concepts have
in common. This inference mechanism is what enables us to identify as a table
an object that we have never seen before.
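The "number of elements that two concepts have in common" can be turned into a numeric similarity score. The sketch below uses set overlap (the Jaccard index); both the measure and the component sets are my own assumptions for illustration, not Corby's documented internals.

```python
# Similarity between two concepts, measured as the overlap of their
# component sets (Jaccard index). Component sets are illustrative only.

def similarity(components_a, components_b):
    """Fraction of components shared by the two concepts."""
    a, b = set(components_a), set(components_b)
    return len(a & b) / len(a | b)

table = {"horizontal surface", "legs"}
desk  = {"horizontal surface", "legs", "drawers"}
chair = {"legs", "backrest", "seat"}

# A desk shares more essential components with a table than a chair does,
# so an inference mechanism of this kind treats desks and tables more alike.
print(similarity(table, desk) > similarity(table, chair))   # True
```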
Finally, concepts are important because they make possible what we call creativity: the
ability to come up with something that we have never encountered before.
Considering the table example, it is easy for you to come up with a table
design that nobody has seen before just by trying a new combination of a
tabletop and a set of legs that you have actually seen. Better yet, considering
that a table leg is itself a concept, as is the case for the tabletop, you can
come up with an original set of legs, an original tabletop, put the two
together and there you have a truly original table design. Of course, following
that recipe, you may come up with a table that nobody likes, but that is always
the challenge of creativity.
Of all the things that concepts make possible, the inference mechanism is perhaps the most
important one. Consider for instance the many ways in which a question can be
asked. If all reduce to the same concept, once you know the answer to one
instance, you immediately acquire the correct response to all the other
instances. This knowledge amplification power is one of the most important
characteristics of intelligent beings. This is also what enables them to thrive
in a complex, ever changing environment.
Given the above, it is no surprise that the main objective of the learning process in Corby is the discovery of concepts. Concepts, however, are hard to come by. Corby distils them from a huge quantity of language samples, as a gold prospector isolates little gold nuggets from tons of sand.
If you want to gain some insight into what is going on under the hood in Corby, you should try the old “say <something>” routine, using an empty knowledge base. This is the routine that we use on toddlers to improve their pronunciation abilities. It could go like this (user input in bold):
say cat
… [corrected to cat]
say dog
… [corrected to dog]
say bird
… [corrected to bird]
say cat
cat
say dog
dog
say bird
bird
say aardvark
aardvark
From the seventh iteration on, Corby can correctly answer any instance of the question. If you look in the Statistics window under the "Concepts created this session" title, you will see that Corby created two concepts: one of them corresponds to the “say <something>” pattern and the other is a temporary version of the same that will later be discarded.
This experiment, simple as it is, illustrates several fundamental aspects of the Corby architecture. The first is a practical demonstration of the power of inference afforded by concepts. With a modicum of effort (just three instances of the concept in the example above), the system becomes able to answer literally thousands of questions.
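One way to picture how a "say <something>" pattern could be generalised from repeated examples is sketched below. This is not Corby's actual algorithm, which is not documented here; the names are invented, and the three-example threshold simply mirrors the transcript above.

```python
# Toy sketch of generalising a "say <something>" pattern from examples.
# Invented for illustration; not Corby's real learning mechanism.

class PatternLearner:
    def __init__(self, threshold=3):
        self.pairs = []              # remembered (stimulus, response) pairs
        self.patterns = set()        # learned constant prefixes, e.g. "say"
        self.threshold = threshold

    def learn(self, stimulus, response):
        self.pairs.append((stimulus, response))
        # Count constant prefixes whose variable remainder equals the response.
        prefixes = {}
        for s, r in self.pairs:
            head, _, tail = s.partition(" ")
            if tail == r:
                prefixes[head] = prefixes.get(head, 0) + 1
        for head, count in prefixes.items():
            if count >= self.threshold:
                self.patterns.add(head)

    def ask(self, stimulus):
        head, _, tail = stimulus.partition(" ")
        if head in self.patterns and tail:
            return tail              # generalise: echo the variable part
        for s, r in self.pairs:     # otherwise fall back to exact recall
            if s == stimulus:
                return r
        return "..."

bot = PatternLearner()
for word in ("cat", "dog", "bird"):
    bot.learn(f"say {word}", word)
print(bot.ask("say aardvark"))       # "aardvark" - never seen before
```

After three examples, the learner answers any "say <something>" instance, including words it has never seen, which is the behaviour the transcript demonstrates.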
The second aspect that the experiment can highlight is Corby’s language independence. You can try it with any language you like or, better yet, you can invent a new one. You can, for instance, replace each word in the above example with a random number of your choice. The number of instances needed for Corby to pick up the concept may vary slightly, but the end result will be just about the same.
The third aspect worth noticing is that Corby doesn’t care in the least what the symbols mean to you. What it cares about are the relationships between symbols. This becomes immediately apparent if you try the experiment with numbers. When you substitute one set of symbols with another in a consistent manner, the symbol relationships are preserved, and that is all that Corby cares about.
Finally, and this is perhaps the most important insight of the whole exercise: after the seventh iteration, Corby behaves, for all practical purposes, as if it has captured the meaning of the word “say”. It is true that this is just one of the possible meanings of the word, and perhaps the simplest one, but the fact remains that Corby captured the meaning of a word using only symbol relationships.
This brings us directly to one of the fundamental and most difficult problems in Artificial Intelligence: the problem of semantics. It is understood, with good reason, that until a machine can really understand the meaning of what is said, it will not even come close to emulating a human being.
However, if the above example is anything to go by, what we call meaning is no more than a set of rules underlying the relationships between words, or, as many people are fond of saying, meaning is an emergent property of certain symbol relationships. Some people have suspected as much and have been saying for quite some time that “words do not have meanings, they have uses”. Given the above, perhaps it would be more correct to say: words have meanings, which can be derived from their uses.
Regarding the way Artificial Intelligence programs approach the semantics problem, we can consider two extremes. At one end of the spectrum sit most of the programs existing today. They rely almost exclusively on the knowledge that the programmer has of the language. These programs have, in a more or less well disguised way, a set of instructions of the form: “If the user says A then reply B”. The archetype of these programs is Joseph Weizenbaum’s old program ELIZA.
This kind of program is nowadays much discredited. Few people call such programs intelligent anymore, and with good reason, because the intelligence in those cases is in the programmer’s head, not in the program itself. The fundamental problem with this approach is that there are only so many question-answer pairs that can be made available to a program: firstly because of the storage requirements; secondly because of the huge amount of work needed to fill that storage. Just imagine how badly a program like this would cope with the “say <something>” example given above. Yet any toddler in the early stages of learning the language can deal with it. The problem manifests itself when the program is confronted with a question it has never seen before. It usually fails, because the information it needs to get out of the tight spot is precisely inside the programmer’s head.
At the other end of the spectrum we could envision programs that would try to mimic the human brain. They would use a great number of simple processing elements, massively interconnected. They would explore all the possible relationships between all the elements of the language and derive meaning from that. There are two problems with this approach: the first is the enormous amount of computing power needed. The second is related to the data structures needed to support the relationships discovered. One possible solution to this problem would be to use some kind of genetic algorithm and try to evolve the needed structures. This, however, would be equivalent to duplicating the work done by evolution over millions of years.
Corby is a good compromise between the above two extremes. It is set up to explore certain types of relationships and has the necessary structures to support them. It is able to learn and derive meaning from symbol relationships. It is language independent and, most importantly, it is compatible with the computers we have available today.
Everything that Corby knows about the world is
stored in the knowledge base. When you say something to Corby, the first thing
it does is to parse your input into the appropriate knowledge structures and
then store the result in the knowledge base in the form of links between
knowledge elements.
Therefore, the knowledge base contains, among
other things, Corby’s world model, which it uses to build responses to your
questions. But retrieving information from the world model is by no means an
easy task. To appreciate what is involved in this task, try the following
example with an empty knowledge base. Again, you can use any language of your
choice; user input is in bold.
today is saturday
…
what day is today?
{saturday | sunday} [correct as appropriate]
… [force autolearn]
[clear the context with Ctrl+N]
today is sunday
…
what day is today?
{saturday | sunday} [correct as appropriate]
… [force autolearn]
[clear the context with Ctrl+N]
today is saturday
…
and bob's your uncle
…
what day is today?
{saturday | sunday} [correct as appropriate]
… [force autolearn]
[clear the context with Ctrl+N]
today is sunday
…
and bob's your uncle
…
what day is today?
{saturday | sunday} [correct as appropriate]
… [force autolearn]
[clear the context with Ctrl+N]
Repeat the above sequence until Corby answers all the questions correctly. As the process converges, the information given
by the sentence “today is…” doesn’t need to be in the context anymore. At that
point try the sequence:
today is saturday
…
[clear the context with Ctrl+N]
what day is today?
saturday
[clear the context with Ctrl+N]
today is sunday
[clear the context with Ctrl+N]
what day is today?
sunday
At this point, Corby is able to pick up the information conveyed by the sentence “today is…” from its world model. You can even
introduce the sentence, exit the program, restart it, and then ask the
question. If you are lucky, Corby may have been able to pick up the concept
associated with this example. In that case it will be able to answer correctly
for all the other days of the week. It will behave as if it understands the
meaning of both the sentence “today is…” and the question “what day is today?”;
in fact, what it is doing is exploring some relationships between symbols.
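The difference between the short-lived context and the persistent world model can be sketched as follows. Everything here is an invented illustration: Corby stores links between knowledge elements, not a dictionary, and the "today is <day>" pattern stands in for a concept the system would have to learn first.

```python
# Toy sketch: a statement matching a learned pattern ("today is <day>")
# writes a fact to the world model, which survives clearing the context.
# Structures and names are invented for illustration.

class WorldModel:
    def __init__(self):
        self.facts = {}              # slot -> value, e.g. "today" -> "sunday"
        self.context = []            # recent sentences only

    def tell(self, sentence):
        self.context.append(sentence)
        if sentence.startswith("today is "):
            self.facts["today"] = sentence[len("today is "):]

    def clear_context(self):         # the Ctrl+N step in the transcript
        self.context = []

    def ask(self, question):
        if question == "what day is today?":
            return self.facts.get("today", "...")
        return "..."

m = WorldModel()
m.tell("today is sunday")
m.clear_context()                    # context is gone...
print(m.ask("what day is today?"))   # "sunday" - ...but the fact persists
```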
This example also shows the importance of the
context during learning. Corby is able to pick up the correct relationships
because the relevant elements are all in the context. Once those are
established, not all the elements need to be in the context anymore.
This example also highlights the importance of
interactive learning, due to the context control that it affords. An unusual
aspect of context control is exemplified by the sentence “and bob's your uncle”
in the above example. What this sentence does is to introduce noise in the
context, so that Corby can decouple the irrelevant relationships from the
useful ones.
Finally, a word on knowledge representation, a
hot issue among Artificial Intelligence practitioners. As said above, the
information is stored in the knowledge base in the form of links between
knowledge elements. The information is stored in a diffuse form, so if, for
instance, you were to binary-edit the knowledge base data file, you wouldn’t
find any of the sentences in the example above. But they are there alright, as
the system behaviour demonstrates.
This mimics, as far as we can tell, how the human brain stores information. If this is true, you can appreciate how difficult it is for those who are trying to gain insight into the intelligence algorithms by analysing the brain.
Some people have this idyllic notion that once we get a program that learns, we can set it
to read books 24 hours a day, seven days a week or let it loose on the Internet
and in no time flat we would have a PhD level intelligent device. If things
were so simple, we would have no use for schools, teachers and textbooks: We
would set up libraries, define reading lists and people would learn that way.
Or they would just surf the Net and find there what they need to learn.
Learning is a very long and complex process. Just consider that it takes about twenty years of continuous learning for a human to reach intellectual maturity. Nobody seems to know exactly how we humans learn. A testimony to that are
the education reforms that occur periodically in our school systems. We can
however define some general principles that we can use to improve the learning
process in Corby.
Repetition
Repetition is perhaps the most distinctive aspect of the learning process in Corby. The knowledge structures it uses are not created in their final form; rather, they evolve through a series of intermediate structures until they reach the final
form. Even then new information may cause the whole process to be started all
over again.
What drives this evolutionary process is the repeated occurrence of certain stimulus-response pairs during learning. We can consider several types of repetition:
· Basic repetition – The same stimulus-response pair is submitted repeatedly for learning.
· Group repetition – A group of stimulus-response pairs is submitted repeatedly for learning.
· Context repetition – The same pair, or a group of stimulus-response pairs, is submitted repeatedly while the context is changed.
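The three schedules can be expressed as small helpers that generate the training stream fed to the learner. The pair contents and function names below are illustrative only; Corby does not expose such an API.

```python
# Sketch of the three repetition schedules as builders of training streams
# of stimulus-response pairs. Names and data are illustrative only.

def basic_repetition(pair, times):
    """The same stimulus-response pair, over and over."""
    return [pair] * times

def group_repetition(pairs, times):
    """A whole group of pairs, cycled repeatedly."""
    return [p for _ in range(times) for p in pairs]

def context_repetition(pair, contexts):
    """The same pair, each time preceded by a different context sentence."""
    return [(ctx, pair) for ctx in contexts]

pairs = [("say cat", "cat"), ("say dog", "dog")]
print(len(group_repetition(pairs, 3)))   # 6 pairs in the stream
```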
Structure
Knowledge structures are composed of other, simpler ones. Complex concepts are made of simpler ones. It is important that the learning process evolve through a series of steps of increasing complexity, where the simpler elements are learned first. The same is true of humans: people do not start to learn rocket science before they can do basic arithmetic.
Context
Most of the time, the answer to a question depends not only on the question itself but also
on what has been said before. Corby is able to establish the appropriate
relationships between the response and the context but this process is very
demanding in terms of computing resources.
It is therefore important that during learning the user limits the context to the
relevant elements so as to minimize the processing requirements. It is also
essential that the response be submitted in several different contexts so that
the system can isolate the elements that condition the response.
Conceptualisation
The main objective of the learning process is the formation of concepts. Corby picks up
a concept by being exposed to several instances of that concept. It is
therefore important that the learning environment provides these groups of
instances of a concept.
This, incidentally, is the most distinctive feature of school books and indeed of the whole academic environment: when they set out to teach a concept, they provide the best possible environment, with several examples of the thing in all the appropriate contexts, followed by evaluation mechanisms so that the student can demonstrate that he has acquired the concept.
This is the main reason why learning from general-purpose books is a very inefficient process: the probability of finding several instances of the same concept together is very low.
Specialization
As is the case with humans, learning in Corby is more efficient if it specializes in some specific area of knowledge. The fact that the probability of finding several instances of the same concept together increases with specialization may be the correct explanation for the efficiency increase.
Interactivity
Corby learns from pairs of stimulus-response paragraphs provided by the user. There are two ways in which you provide learning material to Corby: one of them is by submitting text or HTML files. The other occurs during normal interaction with the system through the auto-learn feature, with occasional recourse to the feedback mechanism.
In both cases the system receives a stream of paragraphs where each paragraph is the
response to the one that immediately precedes it. The learning process is then
fed with a series of stimulus-response pairs that constitute the learning
material.
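Turning a stream of paragraphs into stimulus-response pairs can be sketched in a few lines. The blank-line paragraph-splitting rule is my assumption about the file format, not a documented detail of Corby.

```python
# Sketch: each paragraph is treated as the response to the paragraph that
# immediately precedes it, as described above. Splitting on blank lines
# is an assumption about the submitted file format.

def stimulus_response_pairs(text):
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    return list(zip(paragraphs, paragraphs[1:]))

sample = "what day is today?\n\nsaturday\n\nthank you"
print(stimulus_response_pairs(sample))
# [('what day is today?', 'saturday'), ('saturday', 'thank you')]
```

Note that every paragraph except the first and last plays both roles: it is the response to its predecessor and the stimulus for its successor.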
The most efficient way of teaching Corby is through the interactive process, because you can control with great precision what is being submitted and can react promptly to the way Corby changes behaviour as a result of learning. This, however, is very time consuming, and part of the learning process must be done by submitting files that the system can process unattended.
Source material
Corby is a GIGO (garbage in, garbage out) system: the quality of its productions is proportional to the quality of the learning material used for training. The learning speed also depends on the type of source material you submit. There is a reason we humans make books designed specifically for learning; we don’t just use general-purpose books for that.
The ultimate objective of the learning process is the formation of concepts. That,
in turn, results from the discovery of certain types of relationships between
language elements. So, the text files submitted to Corby for learning should
provide an environment where those relationships are easier to find. They should
also follow as much as possible the principles enunciated above in this
section. For instance, they should provide enough instances of a concept for
the system to pick it up, they should build complex concepts from simpler ones
and they should provide the appropriate context for each response.
Good source material in the target language is hard to find in electronic form. One good source of learning material is the several language corpora available on the Internet. These are usually adapted to language research but can be easily modified for Corby training. Another reasonably good source of teaching material is the several FAQs available throughout the Net.
If the target language is English you can also use corpora built with the specific aim
of teaching concepts to artificial intelligence devices, like for instance
ConceptNet.
As a last resort, you can always fall back on books in electronic form. The best source for these is undoubtedly Project Gutenberg. Although most of its books are in English, it includes books in other languages as well.
HTML files are usually the poorest source of learning material. The reason is that most Web pages are designed for visual impact, and visual information is lost on Corby. For instance, a block of text may be describing a picture next to it; that text will make no sense to anyone who is not seeing the picture.
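Before submitting an HTML file, you would typically want to reduce it to plain paragraphs. A minimal sketch using only the Python standard library is shown below; real pages need far more clean-up (scripts, navigation, picture captions) than this illustrates, and nothing here reflects Corby's own HTML handling.

```python
# Sketch: reduce an HTML page to a list of paragraph texts using only
# the standard library. Real pages need more clean-up than shown here.

from html.parser import HTMLParser

class ParagraphExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self.in_p = False

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.in_p = True
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self.in_p = False

    def handle_data(self, data):
        if self.in_p:                # keep only text inside <p> elements
            self.paragraphs[-1] += data

extractor = ParagraphExtractor()
extractor.feed("<html><p>today is saturday</p><p>what day is today?</p></html>")
print(extractor.paragraphs)          # ['today is saturday', 'what day is today?']
```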
When files do not provide the required learning level, you have to resort to supplying the learning material manually through the interactive process. This, as you can imagine, is very time consuming, but in many instances it cannot be avoided.
Comments and suggestions about this page are welcome and should be sent to fadevelop@clix.pt
Rev 1.0 - This page was last modified 2005-07-05 - Copyright © 2004-2005 A.C.Esteves