The Corby Engine
The Corby Engine is a static library that contains all the Artificial Intelligence related software that is used in the Corby chatbot. You can use this library to build your own Artificial Intelligence applications. This is the main page of the Corby Engine documentation.
To fully understand all the features included in the Corby Engine you need to read the Corby manual, available at:
If you want to know more about the Corby chatbot, which is a practical implementation of an application using the Corby Engine, go to:
A detailed description of the Corby Engine API is available here.
The Corby Engine library was built with Microsoft Visual C++ version 5.0. It includes all that you need to build Artificial Intelligence applications, namely the AI core, the knowledge base and auxiliary file management.
The Corby Engine is not re-entrant and therefore cannot be used by more than one thread at the same time. The interface with the user software is not protected against violations of this rule and consequently it will malfunction in that situation.
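Since the library offers no protection of its own, a multi-threaded application has to serialize its calls. The following is a minimal sketch of one way to do that, assuming a modern C++ compiler; `engine_respond_unsafe` is a hypothetical stand-in for whatever entry point INTFACE.H actually declares, and `SerializedEngine` is a name invented for this illustration.

```cpp
#include <mutex>
#include <string>

// Hypothetical stand-in for a non-re-entrant engine call. The real entry
// points are declared in INTFACE.H and are not reproduced here.
inline std::string engine_respond_unsafe(const std::string& stimulus) {
    return "echo:" + stimulus;
}

// Wrapper that lets only one thread at a time enter the engine.
class SerializedEngine {
public:
    std::string respond(const std::string& stimulus) {
        std::lock_guard<std::mutex> lock(mutex_);  // released on return
        return engine_respond_unsafe(stimulus);
    }

private:
    std::mutex mutex_;  // guards every call into the engine
};
```

All threads would share a single `SerializedEngine` object and never call the library directly.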
All the definitions needed by the user software to access the library are contained in the file INTFACE.H, available for download. See the Application Example for details.
The basic model of the Corby Engine consists of a stimulus-response mechanism. Moreover, the Corby Engine is an abstract symbol manipulation device: it does not care in the least what the symbols represent to you; it just manipulates them according to some rules and is able to derive the meaning of a symbol from its relationships with other symbols appearing in the same context. This means that the Corby Engine is able to manipulate symbols representing very different things. For instance, in a chatbot such as the Corby application, the symbols represent letters of some alphabet. But suppose that we feed the Corby Engine with symbols representing speech data. The Corby Engine would then become a speech processor.
The Corby Engine is able to merge the information coming from several sources so that the response to a stimulus from a given source can take into account the information received from other sources. In order to do that, the Corby Engine uses the concept of a channel. This also allows a stimulus on one channel to produce a response on some other channel.
Suppose that the Corby Engine were used to control a robot with audio and vision processing. We would then assign one channel to audio and another to video. In a response to a verbal question, the system would take into account the visual information carried by the video channel. Moreover, the system could be set up so that when some visual stimulus was present, it would warn the user through the audio channel. Extending this a bit further, a stimulus detected in one channel could originate simultaneous responses on several channels.
Although the engine does not care what the symbols represent to you, it must maintain symbol coherence: a given symbol used in one channel may have a very different meaning when used in another channel. Therefore the engine keeps the symbols received through each channel separate internally.
The Corby Engine is able to process up to 16 bi-directional channels. Channels 0..14 are available to the user software; channel 15 is reserved for internal use. In the great Corby tradition, the engine does not care what you use each channel for. It is up to you to assign them to specific sources and it is your responsibility to use them consistently during learning.
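The channel numbering can be captured in a few constants on the application side. This is only a sketch: the names below are invented for illustration and are not taken from INTFACE.H, and the audio/video assignment simply mirrors the robot example given earlier.

```cpp
#include <stdexcept>

// Channel layout as described in the text. All names are hypothetical.
constexpr int kTotalChannels = 16;    // channels 0..15
constexpr int kReservedChannel = 15;  // reserved for internal use

// True for the channels the user software may use (0..14).
constexpr bool is_user_channel(int channel) {
    return channel >= 0 && channel < kReservedChannel;
}

// Illustrative assignment of sources to channels. The engine does not care
// which source goes where, as long as the assignment stays consistent.
enum class Source : int { Audio = 0, Video = 1 };

// Rejects the reserved channel and out-of-range values before they
// reach the engine.
inline int checked_channel(int channel) {
    if (!is_user_channel(channel)) {
        throw std::out_of_range("channel must be in the range 0..14");
    }
    return channel;
}
```

Keeping the assignment in one place makes it harder to mix up channels between learning and normal use.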
As we have seen above, the Corby Engine is an abstract symbol manipulation device. This means that it manipulates symbols according to some rules but without relying on the knowledge of what the symbols represent to you.
On the other hand, Corby is, first and foremost, a language processor. The most obvious application for the Corby Engine is the Corby chatbot as described at The Corby Home page. However, given its ability to manipulate abstract symbols, many of the algorithms that it uses are also applicable to other intelligent functions. The rationale for this is based on the fact that, as far as we know, evolution did not need to create new intelligence mechanisms to deal specifically with language; it used what it already had for other intelligent functions.
The challenge here is to come up with the structures that are the equivalents of language elements in non-language related applications. Of course, by doing that, we are doing some sort of reverse engineering on what nature did. For nature, language is the latest step; for the Corby Engine it is the first. Nevertheless, this approach seems appropriate to the task at hand.
One interesting feature of the Corby Engine, made possible by its basic stimulus-response model, is the ability to output a sequence of arbitrary length. This is achieved by feeding the output of one channel (or a derivative of it) back to its input and establishing the proper response for each response-derived stimulus. Just one original stimulus is enough to produce the sequence. Moreover, the response-derived stimuli can be freely intermixed with original stimuli so as to produce variations in the output sequence or simply to stop it if necessary. Two or more channels can also be used for this, in a configuration where the output of one channel is fed back to the input of other channels. As any given response is a function of both the stimulus and the context, and as the length and contents of the context are controlled by the user software, this feature is able to generate sequences corresponding to very complex functions.
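The feedback idea can be illustrated with a toy model. In the sketch below a plain lookup table stands in for the engine's learned responses; this is an assumption made for brevity, since the real engine also takes the context into account. `ResponseTable` and `run_feedback` are hypothetical names.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for learned behaviour: stimulus -> response.
using ResponseTable = std::map<std::string, std::string>;

// Produces a sequence from a single original stimulus by feeding each
// response back as the next stimulus. Stops when no response is known
// or after max_steps, which guards against endless cycles.
std::vector<std::string> run_feedback(const ResponseTable& table,
                                      const std::string& first_stimulus,
                                      std::size_t max_steps) {
    std::vector<std::string> sequence;
    std::string stimulus = first_stimulus;
    for (std::size_t step = 0; step < max_steps; ++step) {
        const auto it = table.find(stimulus);
        if (it == table.end()) {
            break;  // nothing learned for this stimulus: the sequence ends
        }
        sequence.push_back(it->second);
        stimulus = it->second;  // response-derived stimulus
    }
    return sequence;
}
```

With the table `start -> one`, `one -> two`, `two -> three`, the single stimulus `start` yields the sequence `one`, `two`, `three`. Injecting an original stimulus between steps would change or stop the sequence, as described above.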
This section describes some of the possible applications for the Corby Engine. However, at this time, none of these applications has been tried and there is no guarantee whatsoever that the engine will be successful at any of them. Therefore you should treat this information just as an indication of the possible areas of research that could eventually lead to real products.
The main advantage of using the Corby Engine in a language translator is the ability to switch contexts on the fly, without user intervention. Such a system would also benefit from the engine’s learning abilities: in case of an error, the user would simply correct the system by providing the appropriate translation and that would be automatically included in the knowledge base.
This application could use several knowledge bases, one for each pair of languages to translate. Or it could use several output channels and translate into several languages simultaneously.
Learning in such an application consists simply of supplying the system with pairs of paragraphs, one in the source language and the other in the target language. An entire database can be automatically built by the system from a corpus containing a significant number of samples.
Very often, an organization, commercial or otherwise, has a main e-mail address to which most of its electronic messages are addressed. These messages must subsequently be sorted according to the subject and routed to the specific departments that deal with them, possibly with carbon copies if one e-mail is of interest to more than one recipient.
The Corby Engine could be used to analyse both the subject line and the message text and output a code identifying the subject and the confidence level of the classification; this could then be used by the interface software to route the message. As the Corby Engine deals with entire paragraphs, the interface software would submit each paragraph to the Corby Engine and receive the information indicating the probability of the paragraph belonging to each of a set of possible subjects. By combining the output for all the paragraphs in the message, the interface software would know how to route it.
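The text does not say how the per-paragraph outputs should be combined. One simple possibility, shown here purely as an assumption, is to average the probabilities of each subject over all paragraphs and route to the subject with the highest mean. The types and the function name are hypothetical and belong to the interface software, not to the engine.

```cpp
#include <cstddef>
#include <vector>

// One row per paragraph, one column per subject; every row is assumed to
// hold a probability in [0, 1] for each subject.
using Scores = std::vector<std::vector<double>>;

struct Routing {
    int subject;        // index of the chosen subject, -1 if no paragraphs
    double confidence;  // mean probability of that subject
};

// Averages each subject over all paragraphs and picks the best one.
Routing route_message(const Scores& per_paragraph) {
    Routing best{-1, 0.0};
    if (per_paragraph.empty()) {
        return best;
    }
    const std::size_t subjects = per_paragraph.front().size();
    for (std::size_t s = 0; s < subjects; ++s) {
        double sum = 0.0;
        for (const auto& row : per_paragraph) {
            sum += row.at(s);  // throws if a row is shorter than the first
        }
        const double mean = sum / per_paragraph.size();
        if (best.subject < 0 || mean > best.confidence) {
            best = {static_cast<int>(s), mean};
        }
    }
    return best;
}
```

Other combination rules, such as weighting paragraphs by length or giving the subject line extra weight, would fit the same interface.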
During training, the Corby Engine would be fed with examples of real messages, paragraph by paragraph, together with the intended output for each paragraph. In this case, the same code, corresponding to the manual classification of the message, would be assigned to all the paragraphs in the message. The Artificial Intelligence algorithms contained in the Corby Engine would then be able to abstract the information provided and combine it with previous knowledge so that it would respond correctly to situations not contemplated explicitly in the learning phase.
If desired, instead of a single output channel, we could use several channels, one for each subject, up to 15. The output of the Corby Engine would then be a probability value for each of the channels.
This is just an extension of the mail sorting application. Each paragraph in the message could be classified as spam in the same manner as it would be classified as a useful subject.
Just as it is able to engage in a conversation with the user, as demonstrated by the Corby chatbot, the Corby Engine can answer an e-mail message. The answer can be given paragraph by paragraph or be lumped together in a single paragraph.
As the output of the Corby Engine always has a confidence level associated with it, the interface software can decide to accept the answer given by the engine or route the message to a human operator.
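The accept-or-escalate decision amounts to a threshold test in the interface software. The sketch below assumes a confidence value normalized to the range 0 to 1, which the text does not specify; the names and the idea of a tunable threshold are illustrative only.

```cpp
// What the interface software does with an answer produced by the engine.
enum class Disposition { SendAutomaticReply, ForwardToOperator };

// Accepts the engine's answer only when its confidence reaches the
// threshold; otherwise the message goes to a human operator.
// Both values are assumed to lie in [0, 1].
constexpr Disposition decide(double confidence, double threshold) {
    return confidence >= threshold ? Disposition::SendAutomaticReply
                                   : Disposition::ForwardToOperator;
}
```

A cautious deployment would start with a high threshold and lower it as the knowledge base improves.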
This can be a stand-alone application or be part of a mail processing system that also does mail sorting and spam filtering.
The main advantage of using the Corby Engine in a help-desk application is its ability to process questions from the user in a free form: the user is able to pose the question in the manner that is most natural to him. Moreover, if the answer doesn’t satisfy him, the user can follow up with further questions that use the context provided by previous exchanges.
If you are like me, you will often feel frustrated when you receive from the current internet search engines an enormous amount of references to web pages that you must visit to get the answer that you are looking for. Sometimes you have to go through dozens of pages before you find what interests you and sometimes you just give up and use the old method: Go ask someone. I’m sure that you would like to have someone that could read all those web pages and that could afterwards answer a couple of questions about the subject.
This is where an application using the Corby Engine could help. In this application, the interface software would get pages from the net and feed them to the engine. You could then enter into a dialog with the system to get the information you are looking for. This would not be quick but it has the advantage that it can be done without human intervention. Besides, once the system learns about one subject it will not forget and therefore, as you use the system, you will get more and more parts of the net into your knowledge base.
Traditional speech applications rely on text-to-speech and speech-to-text algorithms. This is, of course, less than optimal because speech is a much richer medium than text. To make full use of the richness of the medium we should use native speech information and that is where the Corby Engine can excel. Of course, it can also implement text-to-speech and speech-to-text functions, using for instance auxiliary channels in addition to the main speech channel.
We could implement a speech processor by feeding the Corby Engine with PCM audio data and collecting PCM data from its output. But that would be a very inefficient system: audio data is usually plagued with noise and redundant data; it is not without reason that the human ear is a very sophisticated audio processor. In the same manner, we should provide our system with an audio pre-processor.
Take for instance loudness. It is important that the system knows this information: if someone is shouting, the Corby Engine should definitely know about it. But three bits of loudness information every millisecond should be more than enough. Using raw PCM data, that information is supplied with every sample; besides, it is mixed with other types of information.
There is also a very thorny issue regarding audio processing: source separation and identification. Humans can easily hold a conversation in a noisy environment, even with other people talking close by. An artificial system must also deal with that.
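As an illustration of the kind of reduction a pre-processor could perform, the sketch below turns 16-bit PCM samples into one 3-bit loudness level per millisecond. The peak detector and the linear quantization are assumptions chosen for simplicity; a real pre-processor would more likely use a perceptual (logarithmic) scale. The function name is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Returns one loudness level (0..7, i.e. three bits) per millisecond of
// 16-bit PCM audio. Each level is the peak absolute amplitude of the
// window, linearly quantized by dropping the 12 low-order bits.
std::vector<std::uint8_t> loudness_3bit(const std::vector<std::int16_t>& pcm,
                                        std::size_t sample_rate_hz) {
    // Samples per millisecond; at least 1 so the loop always advances.
    const std::size_t window =
        std::max<std::size_t>(1, sample_rate_hz / 1000);
    std::vector<std::uint8_t> levels;
    for (std::size_t start = 0; start < pcm.size(); start += window) {
        const std::size_t end = std::min(pcm.size(), start + window);
        int peak = 0;
        for (std::size_t i = start; i < end; ++i) {
            peak = std::max(peak, std::abs(static_cast<int>(pcm[i])));
        }
        // 32767 >> 12 is 7; the clamp covers the single value -32768.
        levels.push_back(static_cast<std::uint8_t>(std::min(peak >> 12, 7)));
    }
    return levels;
}
```

At 8 kHz this replaces every 8 samples (128 bits) with 3 bits of control information, which could then be sent on a channel separate from the main audio data.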
In some applications, source location is also important. It can, among other things, provide some help in the source separation process.
So the question now is: What kind of data should our pre-processor feed the Corby Engine with? Should that be time-domain or frequency-domain data? We want that data to be as abstract as possible in order to accommodate as many users as possible but on the other hand it must be detailed enough so as to make possible, for instance, user identification.
In the basic application for the Corby Engine we have letters. Groups of letters form words, separated from other words by one or more spaces. Several words form a paragraph, which is the basic input/output unit. In a speech processing application we must come up with the equivalents of these elements; otherwise, the assumption that the algorithms applicable to one kind of data are also valid for the other may not hold. Words and paragraphs are relatively easy: they are determined by the silence periods. This, however, gets complicated in a noisy environment and brings back the source separation issue.
The main issue is then letters: what is the audio equivalent of the letters in a written text? Another issue is the number of channels needed for audio processing: it looks reasonable to have two channels, one for control information such as loudness and source location, another for the basic audio information.
On the output we have the opposite problem: We want to receive from the Corby Engine abstract data, so that we can change how the system sounds (male or female, child or adult). We would also want the system to have a characteristic sound, independent from the audio data used for training. It would be unthinkable to have a system that changed its sound characteristic for each interaction.
Visual data is a very rich source of information: it is not for nothing that we say that a picture is worth a thousand words. For the reasons outlined in the speech processor application, for machine vision we would also need a pre-processor. As to what type of processing it should do, I’m at a loss. Suffice it to say that neurologists discovered that the human brain possesses more than thirty areas, each one dedicated to some specific type of visual processing. For instance one area is specialized in detecting horizontal lines, another in vertical lines, yet another in slanted ones and so on.
What I said about letters, words and paragraphs in a speech application also applies here. We must find the visual equivalent to those textual elements for a machine vision application to have a good chance of succeeding.
Machine vision is also special because a Corby Engine channel assigned to vision doesn’t have an obvious output. Therefore, in a multi-channel environment, visual data would only be used to affect the output of other channels.
In many instances, financial applications must deal with the problem of fusing information coming from very different sources. The Corby Engine’s ability to discover relationships between a response and the context could be used for instance in a stock portfolio management program.
The program would be able to give buy/sell orders for a particular stock, based on a variety of information sources, supplied to the engine as elements of the context. The program would then be able to extract the rules that govern the rise and fall of the stock and give orders accordingly.
The challenge here is to collect information coming from diverse sources such as news headlines, statements coming from market analysts and debt rating firms, earnings forecasts, and so on, and train the program to understand the relevance of those statements.
This application involves the processing of sensors and the control of mechanical devices, the kind of setup used, for example, in a mobile robot.
Of all the applications described so far, this is the one that involves the highest degree of speculation. There is no guarantee that the Corby Engine could be used successfully in this application, nor do I know what kind of pre-processing we could use for the sensory information.
This application is mentioned for the single reason that human beings seem to use for motor control the same kind of modules in the brain that are used for other functions that the Corby Engine can simulate successfully.
Applications in this area, for instance the control of a robot arm, can make use of the Corby Engine’s ability to generate sequences of commands of arbitrary length and complexity. The robot arm could be trained by inverse kinematics or another suitable method and would be able to integrate several sensors. This application can use several channels, for instance one for each motor, or use a single channel with appropriate encoding.
Comments and suggestions about this page are welcome and should be sent to firstname.lastname@example.org
Rev 1.1 - This page was last modified 2005-08-26 - Copyright © 2004-2005 A.C.Esteves