Application Program Interface
This page
describes the Corby Engine API, including all the data structures and the
procedures available to the user software. In order to understand most of the
information contained in this page you should have already read the
documentation associated with the Corby chatbot, available at The Corby Home page.
The
procedures available to the user software are divided into two groups: One
includes the procedures that are available at all times; the other includes
restricted access procedures, whose availability depends on the current state
of the engine as explained below for each individual procedure.
The Corby
Engine is not re-entrant and therefore cannot be used by more than one thread
at the same time. The interface with the user software is not protected against
violations of this rule and consequently it will malfunction in that situation.
struct CORBY_SubmitData {
    int sChannelID;
    int undLevel;
    int slength;
    unsigned short *sdata;
};
This
structure carries data to be submitted to the Corby Engine and the feedback
from the engine relative to the parsing of the submitted data. sChannelID identifies the channel to which the
data is destined. undLevel is a value in the range 0…100, filled by the Corby Engine, that
indicates how well it understood the submitted data, that is, how few of the
elements in the submitted data are new to the engine; the higher the number,
the better the understanding. slength is the number of items in the sdata array. sdata is an array of 16-bit values that
correspond to the data being submitted.
The engine
fills in the undLevel member after it has parsed the submitted data; the user software fills in
all the other members of the structure. The sChannelID member must be in the range 0…14.
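As an illustration of how the user software might fill in this structure, here is a minimal sketch. The helper makeSubmitData() is hypothetical and not part of the Corby API. It assumes that text is submitted as UTF-16 code units (the page only says "16-bit values") and that the array form of new is acceptable for sdata (the page only says "the new operator").

```cpp
#include <cstddef>
#include <string>

// Structure as declared on this page.
struct CORBY_SubmitData {
    int sChannelID;
    int undLevel;
    int slength;
    unsigned short *sdata;
};

// Hypothetical helper (not part of the Corby API): fills a CORBY_SubmitData
// from UTF-16 text. Returns false if the channel is outside the 0...14 range.
bool makeSubmitData(int channel, const std::u16string &text, CORBY_SubmitData &out) {
    if (channel < 0 || channel > 14) return false;
    out.sChannelID = channel;
    out.undLevel = 0;  // filled in by the engine after parsing
    out.slength = static_cast<int>(text.size());
    out.sdata = nullptr;  // slength 0 and sdata null stay consistent
    if (!text.empty()) {
        // Assumption: array new; the page does not say which form is expected.
        out.sdata = new unsigned short[text.size()];
        for (std::size_t i = 0; i < text.size(); ++i)
            out.sdata[i] = static_cast<unsigned short>(text[i]);
    }
    return true;
}
```

The description of CORBY_Submit() states that the engine itself releases the sdata block, so the form of allocation used here should be checked against the actual engine before relying on it.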
struct CORBY_Response {
    CORBY_Response *next;
    int rChannelID;
    int confLevel;
    int rlength;
    unsigned short *rdata;
};
This
structure carries a single response from the Corby Engine to the user software.
When the system uses more than one channel, the responses corresponding to the
several channels are carried in a singly linked list of CORBY_Response structures; hence the next member, which points to the next
element in the list. The last element in the list has its next member set to null.
rChannelID is the channel associated with the response. confLevel is a value in the range 0…100,
filled by the Corby Engine, that indicates the degree of certainty it assigns
to the response; the higher the number, the greater the certainty. A low value
doesn’t mean that the response is incorrect, just that the engine is not sure
about it. rlength is the number of items in the rdata array; rdata is an array of 16-bit values that
correspond to the response. The Corby Engine fills in all the members of this
structure.
This
structure can be used to carry an empty response; in this case the rlength member is 0 and the rdata member is null. An empty response
is a perfectly valid response to a given stimulus: For instance, suppose you
are engaged in a conversation with someone; if you make a statement and your
interlocutor says “OK”, you are usually left with no response and you must find
a new topic to keep the conversation going. In the Corby Engine environment, we
would say that the “OK” statement has an empty response.
struct CORBY_ResponseData {
    unsigned short rChannelMap;
    int timeOut;
    int nitems;
    CORBY_Response *respList;
};
This is the
structure that carries one or more responses from the Corby Engine to the user
software. The response is relative to the last stimulus submitted to the
engine.
rChannelMap is a bit-map that indicates in which channels
the user software wants a response; the least significant bit corresponds to
channel 0. For instance, a value of 0xFFFF requests a response in all the
channels; a value of 2 requests a response in channel 1 only. timeOut is the time, expressed in seconds,
that the engine can take while looking for a response. nitems is the number of individual
responses, one for each channel, contained in the response list. respList is a singly linked list of CORBY_Response structures containing the responses
themselves. The last element in the list has its next member set to null.
The user
software fills in the rChannelMap and timeOut members; the engine fills the other
two.
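The bit-map convention for rChannelMap can be sketched with a few small helpers. These functions are hypothetical and not part of the Corby API; they only encode the rule that the least significant bit corresponds to channel 0. The single-bit check is included because some procedures described later on this page require a map with exactly one bit set.

```cpp
// Hypothetical helpers (not part of the Corby API) for the rChannelMap bit-map.

// Bit corresponding to one channel; channel 0 is the least significant bit.
// Returns 0 for a channel outside the 0...14 range.
unsigned short channelBit(int channel) {
    if (channel < 0 || channel > 14) return 0;
    return static_cast<unsigned short>(1u << channel);
}

// True when the map requests a response on the given channel.
bool requestsChannel(unsigned short map, int channel) {
    return (map & channelBit(channel)) != 0;
}

// True when exactly one bit of the map is set.
bool isSingleChannel(unsigned short map) {
    return map != 0 && (map & (map - 1)) == 0;
}
```

For example, channelBit(1) yields 2, the value quoted above for a request on channel 1 only.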
If the
engine cannot get a response, the nitems member will be set to 0 and the respList member set to null. When this
happens, the engine will return an error. Do not confuse this situation with
the one where the response is empty. In the latter case, the respList member will have a CORBY_Response structure, whose rdata member is set to null, as explained
earlier.
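The difference between "no response" and "empty response" can be made explicit in the user software. The following is a minimal sketch; the Outcome type and the classify() function are hypothetical names introduced for illustration, and only the structure layouts come from this page.

```cpp
// Structures as declared on this page.
struct CORBY_Response {
    CORBY_Response *next;
    int rChannelID;
    int confLevel;
    int rlength;
    unsigned short *rdata;
};

struct CORBY_ResponseData {
    unsigned short rChannelMap;
    int timeOut;
    int nitems;
    CORBY_Response *respList;
};

// Hypothetical classification of what the engine returned.
enum class Outcome { NoResponse, AllEmpty, HasData };

Outcome classify(const CORBY_ResponseData &rd) {
    // No list at all: the engine could not get a response.
    if (rd.nitems == 0 || rd.respList == nullptr) return Outcome::NoResponse;
    // Walk the singly linked list; the last node has next == nullptr.
    for (const CORBY_Response *r = rd.respList; r != nullptr; r = r->next) {
        if (r->rlength > 0 && r->rdata != nullptr) return Outcome::HasData;
    }
    // Every node is present but carries rlength 0 and rdata null.
    return Outcome::AllEmpty;
}
```

An AllEmpty outcome is a valid answer and typically signals that a new topic is needed, whereas NoResponse accompanies an error return.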
struct CORBY_LearningData {
    int sChannelID;
    int slength;
    unsigned short *sdata;
    CORBY_Response *respList;
};
This
structure carries a stimulus and its set of responses, in the learning
submission process. The slength and sdata members constitute the stimulus; respList carries the responses to the
stimulus in each of the appropriate channels.
The user
software fills in all the fields in the structure. sChannelID is the channel associated with the
stimulus. All the members of this structure except sChannelID are optional, but they must be
consistent: if the sdata member is null, the slength member should be set to 0. Likewise, when a response is not
required, the respList member should be set to null.
The confLevel member of the CORBY_Response structures used to carry the
responses for each individual channel is not used in this situation.
Of course,
the learning process can have an empty response as a perfectly valid response
for a given stimulus. In this case, the respList member will have a CORBY_Response structure, whose rdata member is set to null.
struct CORBY_FeedbackData {
    int rChannelID;
    int grade;
    int xlength;
    unsigned short *xdata;
    int rlength;
    unsigned short *rdata;
};
This
structure is part of the mechanism that the user has to suppress undesirable
behaviours and to encourage desirable ones. Undesirable behaviour arises mainly
from the engine’s inference mechanism and from user mistakes. This structure
also serves the purpose of training the auto grade mechanism, whereby the
engine can grade its own productions based on the user response to them.
rChannelID is the channel associated with the response
about which the user wants to give feedback; it must have a value in the 0…14
range.
grade is a value in the -5…+5 range that indicates how the user grades the
engine’s production; a negative value decreases the probability of the
offending response being issued again; a positive value, on the contrary,
increases that probability. When this member is 0, no grading takes place.
The user
can optionally supply an explanation for the grade given, by using the xlength and xdata members. This will be used to train
the auto grade mechanism.
The user
software can also use this structure to supply a response by using the rlength and rdata members. This replaces the
response given by the system, if any; otherwise it simply adds a new response.
Most of the
procedures described below return an error code. Sometimes the error relates to
the parameters used in the procedure call or to the operations performed during
the call. However there are situations where the error returned has nothing to
do with the procedure itself, but with some anomalous condition detected
previously by the engine.
One of the
most important functions of Corby Engine is the updating of the knowledge base.
Great care was taken during the design of the software to avoid at all costs
the possibility of corruption of this database. This includes things like CRC
control and the use of transaction control with rollback capability, so as to
make it possible to restart an interrupted operation and therefore avoid creating
an inconsistent record.
For this
reason, in all situations where the low-level file I/O procedures of the
Operating System report an error, the engine will immediately take steps to
avoid further access to the knowledge base until the engine is shut down and
restarted. In that situation, the engine will also disable most of the
procedures that provide interface with the user software.
This is
done through the control of the variable associated with the engine state. If a
situation occurs that can put in danger the integrity of the knowledge base,
the system will change its state to Corby_Critical and refuse further
access to all the procedures that can access the knowledge base.
Some of the
procedures described below are restricted to specific states of the engine;
others can be called in any state. When a critical error occurs, these
procedures will return Corby_CriticalError. In this case, the user
software can get information about the specific error that occurred through the
Corby_LastError variable.
The engine
follows the general principle that the procedure that allocates dynamic memory
remains responsible for its release. However, in the case of the interface
between the engine and the user software, many procedures allocate memory to
carry information to their counterpart in the interface. In these cases, the
receiving procedure becomes responsible for releasing those blocks of
memory.
In the
description of each procedure below, whenever a situation like this occurs, it
will be stated clearly who is responsible for memory release. It is fundamental
that the user software follow these rules in order to avoid memory leaks,
which, given the dynamic nature of this software, could quickly bring the whole
system to its knees.
int CORBY_SetGlobalParameter(int ParameterID, __int64 value);
This
procedure sets the value of a global parameter that controls the functioning of
the engine. A global parameter is one that affects all the channels. ParameterID identifies the parameter to change.
The other input parameter carries the new value for the parameter. See below Option
parameters for more details.
The return
is an error code; a value of 0 indicates success. See below Error
Messages for more details.
For some
parameters, the change in value affects the engine immediately; for others the
effect takes place only at engine initialisation. See below Option
parameters for more details.
This
procedure can return the following error codes: Corby_ParameterIDError, Corby_OutOfRangeError.
int CORBY_GetGlobalParameter(int ParameterID, __int64 *value);
This
procedure gets the current value of a global parameter. ParameterID identifies the parameter. The other
input parameter is a pointer to a 64-bit variable that will receive the
information requested.
The return
is an error code; a value of 0 indicates success. See below Error
Messages for more details.
This
procedure can return the following error codes: Corby_ParameterIDError.
int CORBY_SetChannelParameter(int ParameterID, int channelID, __int64 value);
This
procedure sets the value of a channel parameter that controls the functioning
of the engine. A channel parameter affects a single channel, that is, each
channel can have a different value for a given parameter. ParameterID identifies the parameter to change.
channelID
identifies the channel to which the parameter corresponds; it must be in the
0…14 range. The other input parameter carries the new value for the parameter.
See below Option
parameters for more details.
The return
is an error code; a value of 0 indicates success. See below Error
Messages for more details.
For some
parameters, the change in value affects the engine immediately; for others the
effect takes place only at engine initialisation. See below Option
parameters for more details.
This
procedure can return the following error codes: Corby_ParameterIDError, Corby_ChannelIDError,
Corby_OutOfRangeError.
int CORBY_GetChannelParameter(int ParameterID, int channelID, __int64 *value);
This
procedure gets the current value of a channel parameter. ParameterID identifies the parameter. channelID identifies the channel to which the
parameter corresponds; it must be in the 0…14 range. The other input parameter
is a pointer to a 64-bit variable that will receive the information requested.
The return
is an error code; a value of 0 indicates success. See below Error
Messages for more details.
This
procedure can return the following error codes: Corby_ParameterIDError, Corby_ChannelIDError.
int CORBY_GetVariable(int variableID,__int64 *value);
This
procedure gets the value of one of several internal variables that reflect the
current state of the engine. variableID identifies the variable. The other input
parameter is a pointer to a 64-bit variable that will receive the information
requested.
The return
is an error code; a value of 0 indicates success. See below Error
Messages for more details.
This
procedure can return the following error codes: Corby_VariableIDError.
int CORBY_CheckKB(char *path);
This
procedure checks if all the required files that make up the knowledge base are
present and are accessible. The input parameter corresponds to the absolute
path in the file system where the files must be found. This is the path that
will be used subsequently during normal engine operation.
If the
input parameter is carried in a dynamic memory block, its release is the
responsibility of the caller.
The return
is an error code; a value of 0 indicates success. See below Error
Messages for more details.
When the
engine is started for the first time, none of the knowledge base files will
exist. This is reported as an error that must be handled properly by the user
software, possibly by asking the user whether it is OK to create an empty
knowledge base.
This
procedure can only be called when the engine is in the state Corby_InitialState.
If this procedure is successful, it will change the engine to the state Corby_Kbchecked.
See below Engine information for more details.
This
procedure can return the following error codes: Corby_OutOfContextError,
Corby_NullPathError, Corby_InvalidPathError, Corby_NoKBFilesError,
Corby_MissingKBFilesError, Corby_KbIncompatVersion, Corby_CriticalError.
int CORBY_CreateKB();
This
procedure creates a new, empty set of files that together make up the knowledge
base.
The return
is an error code; a value of 0 indicates success. See below Error
Messages for more details.
This
procedure can only be called when the engine is in the state Corby_InitialState.
This procedure doesn’t change the state; therefore, before proceeding to open
the knowledge base, the user software must call CORBY_CheckKB() again so that the engine rechecks the
knowledge base and changes state. See below Engine information
for more details.
This
procedure can return the following error codes: Corby_OutOfContextError,
Corby_CriticalError.
int CORBY_RecoverKB(void *upar, void Cbprog(void *upar, int prog));
When the
Corby Engine reports Corby_MissingKBFilesError in a CORBY_CheckKB() call, it is able to recover the
missing files from the main knowledge base data file. The user software can
then call this procedure to recover the missing files. Depending on the size of
the knowledge base, this process can take a long time; because of that, the
user should be informed of the fact. Moreover, this procedure can inform the
user software of the progress of the operation through a callback procedure.
upar is a general-purpose parameter that will be passed to the callback
procedure. Cbprog is the address of a callback procedure that receives the progress
information. It has the form: void Cbprog(void *upar,int prog) where upar is the general-purpose parameter
described above and prog is a value in the range 0…100 indicating recovery progress; this will
typically be used in a progress-bar control. A negative value indicates that an
error occurred; a value greater than 100 indicates that the recovery procedure
has ended.
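A callback that interprets the prog value as described above might look like the following sketch. The RecoveryStatus type and the onRecoverProgress() name are hypothetical; only the callback form void Cbprog(void *upar, int prog) comes from this page. Atomic members are used on the assumption that the callback is invoked from the engine's own thread while the user interface reads the status from another.

```cpp
#include <atomic>

// Hypothetical state shared between the callback and the user interface.
struct RecoveryStatus {
    std::atomic<int> percent{0};
    std::atomic<bool> failed{false};
    std::atomic<bool> done{false};
};

// Matches the documented form: void Cbprog(void *upar, int prog).
void onRecoverProgress(void *upar, int prog) {
    RecoveryStatus *st = static_cast<RecoveryStatus *>(upar);
    if (prog < 0) {
        st->failed = true;   // negative value: an error occurred
    } else if (prog > 100) {
        st->done = true;     // value greater than 100: the procedure has ended
    } else {
        st->percent = prog;  // 0...100: progress, e.g. for a progress bar
    }
}
```

The address of a RecoveryStatus object would be passed as upar, and onRecoverProgress as Cbprog, in the CORBY_RecoverKB() call. Whether a negative value is always the last notification is not stated on this page, so the sketch only records it.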
The
recovery procedure runs on its own thread; therefore this procedure returns
immediately. The user software must check for completion by the progress value
passed to the callback procedure. The recovery procedure can be aborted by the CORBY_AbortRecover() procedure, described below.
The return
is an error code; a value of 0 indicates success. See below Error
Messages for more details.
This
procedure can only be called when the engine is in the state Corby_InitialState.
If this procedure is successful, the state is changed to Corby_Kbrecover.
To use the engine after this procedure terminates, the user software must call
the CORBY_Reset()
procedure, which takes the engine back to Corby_InitialState. See below Engine information for more details.
This
procedure can return the following error codes: Corby_OutOfContextError,
Corby_NoCallBackError, Corby_CriticalError.
int CORBY_AbortRecover();
This
procedure aborts the knowledge base recovery process.
The return
is an error code; a value of 0 indicates success. See below Error
Messages for more details.
This
procedure can only be called when the engine is in the state Corby_Kbrecover.
If this procedure is successful, to use the engine again, the user software
must call the CORBY_Reset() procedure, which takes the engine back to Corby_InitialState.
See below Engine information for more details.
This
procedure can return the following error codes: Corby_OutOfContextError,
Corby_CriticalError.
int CORBY_Open();
This is the
procedure that initialises the Corby Engine. It only returns when the
initialisation is complete. As this can take some time, the user should be
informed about what is going on.
The return
is an error code; a value of 0 indicates success. See below Error
Messages for more details.
This
procedure can only be called when the engine is in the state Corby_Kbchecked.
If the initialisation is successful (a return value of 0), the state of the
engine changes to Corby_Running, the normal operating state. See below Engine information for more details.
This
procedure can return the following error codes: Corby_OutOfContextError,
Corby_CriticalError.
int CORBY_Close(void *upar, void Cbprog(void *upar, int prog));
This
procedure initiates the closing down process. It returns immediately without
waiting for the process to finish. The user software can monitor the shutdown
progress through a call-back procedure that indicates how the process is going.
upar is a general-purpose parameter that will be passed to the call-back
procedure. Cbprog is the address of a call-back procedure that receives the progress information.
It has the form: void Cbprog(void *upar,int prog) where upar is the general-purpose parameter
described above and prog is a value in the range 0…100 indicating progress; this will typically
be used in a progress-bar control. A value greater than 100 indicates that the
closing down process has ended.
When the
engine is in the Corby_Critical state this procedure cannot be used to
shut the engine down; in this situation, the user software should use the CORBY_Reset() procedure described below.
The return
is an error code; a value of 0 indicates success. See below Error
Messages for more details.
This
procedure can only be called when the engine is in the state Corby_Running.
As soon as this procedure is called, it changes the state of the engine to Corby_Closing,
the closing down state. To use the engine again, the user software must call
the CORBY_Reset()
procedure, which takes the engine back to Corby_InitialState. See below Engine information for more details.
This
procedure can return the following error codes: Corby_OutOfContextError,
Corby_CriticalError.
int CORBY_Reset();
This
procedure takes the engine back to its initial state - Corby_InitialState
- from a Corby_Kbrecover, Corby_Closing or Corby_Critical
state.
The return
is an error code; a value of 0 indicates success. See below Error
Messages for more details.
This
procedure can only be called when the engine is in the states Corby_Kbrecover,
Corby_Closing or Corby_Critical, with no threads running. If the
procedure is successful, it changes the state of the engine to Corby_InitialState.
See below Engine information for more details.
This
procedure can return the following error codes: Corby_OutOfContextError,
Corby_ThreadsRunningError.
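The state restrictions of the procedures described so far can be summarised as a small model. This is only an illustrative sketch: the enum values and the applyCall() function are hypothetical, the real state constants (Corby_InitialState, Corby_Kbchecked and so on) have values not given on this page, and each call is assumed to succeed. The sketch also assumes that CORBY_AbortRecover() leaves the engine in the Corby_Kbrecover state until CORBY_Reset() is called, which the text implies but does not state.

```cpp
// Illustrative model only; not the engine's real constants.
enum class State { Initial, KbChecked, KbRecover, Running, Closing, Critical };
enum class Call { CheckKB, CreateKB, RecoverKB, AbortRecover, Open, Close, Reset };

// Returns true and updates `s` if `c` is allowed in state `s`,
// assuming the call itself succeeds.
bool applyCall(State &s, Call c) {
    switch (c) {
    case Call::CheckKB:
        if (s != State::Initial) return false;
        s = State::KbChecked;
        return true;
    case Call::CreateKB:
        return s == State::Initial;    // does not change the state
    case Call::RecoverKB:
        if (s != State::Initial) return false;
        s = State::KbRecover;
        return true;
    case Call::AbortRecover:
        return s == State::KbRecover;  // assumption: state kept until Reset
    case Call::Open:
        if (s != State::KbChecked) return false;
        s = State::Running;
        return true;
    case Call::Close:
        if (s != State::Running) return false;
        s = State::Closing;
        return true;
    case Call::Reset:
        if (s != State::KbRecover && s != State::Closing && s != State::Critical)
            return false;
        s = State::Initial;
        return true;
    }
    return false;
}
```

The transition into the critical state is made by the engine itself when it detects a dangerous condition, so it does not appear as a call in this model.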
int CORBY_Submit(CORBY_SubmitData *subdata);
This
procedure submits to the Corby Engine a block of data occurring on a given
channel. In text-based applications this block of data corresponds to one paragraph.
A response
to the submitted data can be retrieved using the CORBY_GetResponse() procedure described below. In most cases
where a response is required following a submission, the user software will use
CORBY_Dialog();
this procedure performs both operations in a single call.
This
procedure implements automatic learning, whereby the submission data carried by
the subdata
structure constitutes a valid response to the last response given by the engine
on the same channel. This procedure also performs auto grade, whereby the last
response given by the engine is graded according to the data carried by the subdata structure.
If the user
software submits an entry with the slength member set to 0 and the sdata member set to null, the
engine will assume that the previous entry has a null response and act
accordingly.
If the subdata structure resides in dynamic memory
the user software is responsible for releasing it. However, the sdata member of that structure, if used,
must correspond to dynamic memory allocated with the new operator and it
will be released by the engine with the delete operator. When this
happens, the sdata member will be set to null and the slength member set to 0.
The return
is an error code; a value of 0 indicates success. See below Error
Messages for more details.
This
procedure can only be called when the engine is in the state Corby_Running,
the normal running state. See below Engine information for
more details.
This
procedure can return the following error codes: Corby_OutOfContextError,
Corby_CriticalError, Corby_ChannelIDError, Corby_InconsistentData.
int CORBY_GetResponse(CORBY_ResponseData *respdata, bool forceFlag);
This
procedure gets from the Corby Engine the set of responses corresponding to the
latest submitted data. The individual responses are carried in a linked list
corresponding to the respList member of the respdata structure.
When forceFlag is set to true, the engine will try
at all costs to get a response, even if only loosely related to the submitted
data. This is useful in situations where creative output is acceptable, for
instance when the engine is used to write an essay about a subject. This
feature is only available when the user requests a response for a single channel;
therefore, when forceFlag is set to true, the rChannelMap member of the respdata structure must have only one bit
set.
If the respdata structure resides in dynamic memory
the user software is responsible for releasing it. Both the CORBY_Response structures corresponding to each
response and the rdata member of each structure correspond to dynamic memory allocated by the
engine with the new operator and must be released by the user software
when they are no longer needed.
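Releasing the response list could be done along the following lines. The freeResponses() helper is hypothetical and not part of the Corby API. It assumes that each CORBY_Response node was allocated with scalar new and each rdata array with array new; the page only says "the new operator", so the matching delete form should be confirmed against the actual engine.

```cpp
// Structures as declared on this page.
struct CORBY_Response {
    CORBY_Response *next;
    int rChannelID;
    int confLevel;
    int rlength;
    unsigned short *rdata;
};

struct CORBY_ResponseData {
    unsigned short rChannelMap;
    int timeOut;
    int nitems;
    CORBY_Response *respList;
};

// Hypothetical helper: releases every node of the list and resets the
// engine-filled members. Returns the number of nodes released.
int freeResponses(CORBY_ResponseData &rd) {
    int freed = 0;
    CORBY_Response *r = rd.respList;
    while (r != nullptr) {
        CORBY_Response *next = r->next;  // save the link before deleting the node
        delete[] r->rdata;               // assumption: new[]; a null rdata is a no-op
        delete r;
        r = next;
        ++freed;
    }
    rd.respList = nullptr;
    rd.nitems = 0;
    return freed;
}
```

Resetting respList and nitems makes a second call harmless and leaves the structure ready for reuse in a later request.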
When a
timeout occurs, the engine returns immediately with whatever responses it got
so far. In this case it will return an error indication even if it got one or
more valid responses. In a timeout situation, as the engine processes responses
for each channel in turn, the lower order channels have a greater probability
of getting a response. For this reason, the user software should assign the
lower channels to the more important data.
Usually,
the responses obtained by this procedure are relative to the latest submitted
data. However, the user software can call this procedure twice without
submitting any data between calls. This is only valid if the second call
requests a response for a single channel. In this case the engine will use as
stimulus the last response it got for the same channel. If no stimulus can be
obtained, the engine will report an error.
If the
engine cannot get a response for one of the channels requested, it will return
an error. If the engine cannot get a response in any of the channels, the nitems member of the respdata structure will be set to 0 and the respList member set to null. This should not
be confused with the situation where there is an empty response. In the latter
case, the respList member will have a CORBY_Response structure, whose rdata member is set to null. The user
software must distinguish between the two situations and act accordingly. An empty
response is a perfectly normal response and happens many times, for instance,
in a conversation between humans, when a subject is exhausted and a new topic
must be found.
If the last
entry of the submission queue is empty, the engine cannot build a response and
therefore it will return an error.
The return
is an error code; a value of 0 indicates success. See below Error
Messages for more details.
This
procedure can only be called when the engine is in the state Corby_Running,
the normal running state. See below Engine information for
more details.
This
procedure can return the following error codes: Corby_OutOfContextError,
Corby_CriticalError, Corby_NoStimulusError, Corby_ChannelMapError,
Corby_TimeOutError, Corby_NoResponseError.
int CORBY_Dialog(CORBY_SubmitData *subdata, CORBY_ResponseData *respdata);
This
procedure combines the functionality of CORBY_Submit() and CORBY_GetResponse(). This procedure
implements both automatic learning and auto grade. The subdata structure cannot be empty; in the
cases where the user software wishes to insert a break in the conversation, it
should use the CORBY_Submit() procedure, described above.
The return
is an error code; a value of 0 indicates success. See below Error
Messages for more details.
This
procedure can only be called when the engine is in the state Corby_Running,
the normal running state. See below Engine information for
more details.
This
procedure can return the following error codes: Corby_OutOfContextError,
Corby_CriticalError, Corby_ChannelIDError, Corby_InconsistentData,
Corby_NoSubmitDataError, Corby_NoStimulusError, Corby_ChannelMapError,
Corby_TimeOutError, Corby_NoResponseError.
int CORBY_Bootstrap(CORBY_ResponseData *respdata, int type, bool autosub);
The engine
keeps internally several sets of statements that the user software can get when
it needs a new topic to restart a conversation that reached a dead end. The
user software notices this situation when it receives an empty response from
the engine. The new conversation topic is carried by the respdata structure. autosub is a flag that, when set to true,
causes the new topic to be inserted automatically in the conversation stream.
If this variable is set to false, the user software is responsible for dealing
with the response.
type identifies the type of topic requested; it can have the values:
1. Initial topics - This is a set of topics appropriate for starting an interactive session. These topics
correspond to the first sentence introduced by the user at the start of a
session when the engine is used in a passive mode; they can be used at the
start of a session when using the engine in an active mode.
2. User topics - These correspond to the first sentence entered by the user after a conversation
break, that is, after the user submits an entry with empty data.
3. User preferences - This set of topics corresponds to the concepts most frequently entered by the user.
4. Engine preferences - This set of topics corresponds to the concepts for which the engine is not sure what
the answer is; this is due to conflicting information that the engine has about
some issue. This is the only way that the engine has to force the user to
supply more information, which may help to resolve the issue.
5. Recent concepts - This set of topics corresponds to the concepts most recently created by the engine.
6. Frequent concepts - This set of topics corresponds to the most frequently used concepts encountered during
learning.
In some
instances the engine returns the topic exactly as it was received from the
input. In other cases, for instance in type 4 topics, the engine will walk backwards
in the chain of stimulus-responses until it finds the head of the chain.
Whenever the topic corresponds to a concept, the engine must get a concrete
instance of it. This process will be affected by the current context;
therefore, the engine can supply a great variety of topics, even when the
internal list of entries is relatively small.
Type 5 and
type 6 topics will provide some sort of summary of the things recently learned
by the engine. When used, for instance, immediately after the engine finished
processing a file, they will provide a summary of the topics used in that file.
All the
topics belonging to types 1 to 4 are kept in a file and loaded into memory at
start time. They will be saved to the file at engine shutdown time.
Topic types
5 and 6 are of a temporary nature: They are not saved to a file and are lost at
engine shutdown time. Moreover they are cleared by the CORBY_Restart() procedure described below.
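For readability, the user software may wish to give names to the six topic types. The enumerator and function names below are hypothetical; only the numeric values 1 to 6 and the persistence rule (types 1 to 4 are kept in a file, types 5 and 6 are temporary) come from this page.

```cpp
// Hypothetical names; only the numeric values 1...6 come from this page.
enum CorbyTopicType {
    TopicInitial = 1,
    TopicUser = 2,
    TopicUserPreferences = 3,
    TopicEnginePreferences = 4,
    TopicRecentConcepts = 5,
    TopicFrequentConcepts = 6
};

// True for a value that can be passed as the type parameter.
bool isValidTopicType(int type) {
    return type >= TopicInitial && type <= TopicFrequentConcepts;
}

// Types 1...4 are loaded from and saved to a file; 5 and 6 are lost at shutdown.
bool isPersistentTopicType(int type) {
    return type >= TopicInitial && type <= TopicEnginePreferences;
}
```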
The engine
collects these topics during its interactions with the user or during learning.
For each invocation of this procedure, the engine will retrieve the most
probable topic of the requested type. The probability of a topic depends on the
number of occurrences, its grade value and the time elapsed since the topic was
last returned by this procedure. The confLevel member of the CORBY_Response structure carried by respdata will reflect this
probability as a value in the range 0…100.
This
procedure can only request a topic for a specific channel; therefore, the rChannelMap member of the respdata structure must have only one bit
set. Otherwise, this procedure will report an error.
When the autosub flag is set, the topic generated
will be automatically inserted in the conversation stream as if it were a
response to the last entry. In this situation, the engine will also insert a
break in the learning queue through the auto-learning feature; therefore, the
user software doesn’t need to explicitly separate the two conversation blocks.
If the respdata structure resides in dynamic memory
the user software is responsible for releasing it. Both the CORBY_Response structure corresponding to the
topic and its rdata member correspond to dynamic memory allocated by the engine with the new
operator and must be released by the user software when they are no longer
needed.
If the
engine cannot get a new topic, the nitems member of the respdata structure will be set to 0 and the respList member set to null. It will also
return an error code.
The return
is an error code; a value of 0 indicates success. See below Error
Messages for more details.
This
procedure can only be called when the engine is in the state Corby_Running,
the normal running state. See below Engine information for
more details.
This
procedure can return the following error codes: Corby_OutOfContextError,
Corby_CriticalError, Corby_ChannelMapError, Corby_NoResponseError.
int CORBY_Learn(CORBY_LearningData *learndata);
This
procedure inserts an item in the learning queue. In the stimulus-response
paradigm that underlies the whole Corby architecture, the slength and sdata members of the learndata structure carry the stimulus and
the respList
member carries the response to that stimulus in the appropriate channels. sChannelID identifies the channel associated
with the stimulus. All the members of the learndata structure except sChannelID are optional.
The confLevel member of the CORBY_Response structures used to carry the
responses for each individual channel is not used by this procedure.
In the
Corby Engine, the learning processes run in the background; therefore this
procedure just puts the learning request in the learning queue. There is no
advantage whatsoever in having many items pending in the queue; on the
contrary, it makes the whole operation slower. Therefore, the user software
must avoid filling the queue in situations where that can be avoided. For
instance, while submitting a group of files for learning, the software must not
start submitting a new file while there are items pending in the learning queue.
The number of items pending for learning is given by the Corby_NumLearn
variable. The actual size of the learning queue in Kbytes is given by the Corby_LearnSize
variable. See below Engine information for more details.
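A minimal sketch of the waiting rule just described is shown here. It is not part of the engine API: waitForLearningQueue is a hypothetical helper, and the way the Corby_NumLearn value is read from the engine is left to a caller-supplied function, because the exact call is described elsewhere on this page.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper: polls until the learning queue is empty or the polling
// budget is used up. pendingItems is supplied by the caller and is expected to
// return the current value of the Corby_NumLearn variable.
// Returns true if the queue drained, false if maxPolls was reached first.
bool waitForLearningQueue(const std::function<int()> &pendingItems,
                          int maxPolls,
                          std::chrono::milliseconds pause) {
    for (int i = 0; i < maxPolls; ++i) {
        if (pendingItems() <= 0) {
            return true;                     // nothing pending: safe to go on
        }
        std::this_thread::sleep_for(pause);  // give the learning process time
    }
    return pendingItems() <= 0;              // one last check after the budget
}
```

A batch loader could call this helper before starting each new file and only proceed when it returns true.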
The maximum size of the learning queue in Kbytes is set by the Corby_MaxLearn
parameter. See Option parameters for more details. When this value is
exceeded, this procedure will return an error and the entry will be rejected.
In this situation the auto-learn feature will also be disabled. For those
reasons the user software should avoid this situation at all costs.
In interactive learning situations, it sometimes helps if the knowledge base
immediately reflects the result of the learning triggered by the user input.
If this is deemed an important feature of the application, the user software
can provide this kind of synchronized learning, whereby a new stimulus is not
submitted to the engine while there are still items pending in the learning
queue.
This procedure is intended for batch learning and should not be mixed with the
CORBY_Submit() and CORBY_Dialog() procedures, because the auto-learn feature
makes them interfere adversely with each other. Where mixing is unavoidable,
it should be kept to a minimum. The reason is that learning, like the whole
engine operation, is heavily dependent on context, and in this situation two
very different contexts are intermingled. This is, after all, no different
from what happens in humans: if you are studying a book, you avoid being
disturbed by other events as much as possible, and you certainly do not engage
at the same time in a conversation about some unrelated subject.
In many
situations, the response to a given stimulus depends not only on the stimulus
itself but also on some event that occurred earlier and is now kept in the
knowledge base. So, in a learning situation, the user software must create the
proper context by previously inserting in the learning queue all the
information relevant to the response being learned. This is done by calling
this procedure with a learndata structure where the respList member is set to null. The number of items that can be kept in the
learning context is determined by the Corby_LearningContextDepth
parameter. See Option parameters for more details.
When the learndata structure has a response but no stimulus, the engine will
assume that the stimulus for that response corresponds to the response in the
last entry of the learning queue for the channel identified by the sChannelID
member. If no such stimulus is available, the engine still accepts the entry
but it will not cause learning; it can eventually be used as the stimulus for
the next response.
When the learndata structure lacks both a stimulus and a response, it breaks
the links between the entry that precedes it and the following one; in this
case, the sChannelID member is irrelevant, but it should still have a value in
the range 0…14. This break doesn’t affect the learning context.
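The four combinations of stimulus and response described above can be summarised in code. The enumeration and the function below are hypothetical names used only for illustration; they restate the rules of this page and do not reproduce engine code.

```cpp
// Roles that a learning-queue entry can play, following the rules above.
// These names are illustrative and are not defined by the engine.
enum class LearnEntryKind {
    StimulusAndResponse,  // a complete stimulus-response pair
    ContextOnly,          // stimulus with a null respList: establishes context
    ResponseOnly,         // response paired with the previous response on the channel
    Break                 // neither: breaks the link between neighbouring entries
};

// Classifies an entry from the members of the learndata structure.
// respList is taken as an opaque pointer so that the sketch stands alone.
LearnEntryKind classifyLearnEntry(int slength, const unsigned short *sdata,
                                  const void *respList) {
    const bool hasStimulus = (slength > 0 && sdata != nullptr);
    const bool hasResponse = (respList != nullptr);
    if (hasStimulus && hasResponse) return LearnEntryKind::StimulusAndResponse;
    if (hasStimulus) return LearnEntryKind::ContextOnly;
    if (hasResponse) return LearnEntryKind::ResponseOnly;
    return LearnEntryKind::Break;
}
```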
As said
above, learning is heavily dependent on context. Whenever appropriate, the user
software should clear the context using the CORBY_Restart() procedure described below. In a
controlled learning situation, the user software should make liberal use of
this feature, so as to minimize learning time and CPU use. The objective is to
keep in the context just what is needed for a given response. The reason for
this is that the engine will try to establish correlations between the response
and all the elements in the context. Irrelevant relationships will eventually
be weeded out but, as said earlier, that takes time and CPU power. Flushing of
the learning queue is, like all the other learning operations, a delayed action:
it will take place at the proper time according to the learning queue.
When
submitting files for learning, it is good practice to flush the learning queue
before each file, so that the learning data from the previous file gets
properly detached from the next. The engine will also flush the learning
context at the start of the application, so that the learning data from the
previous invocation gets properly detached from the current one.
If the learndata structure resides in dynamic memory, the user software is
responsible for releasing it. However, the sdata member of that structure, if
used, must correspond to dynamic memory allocated with the new operator and it
will be released by the engine with the delete operator. When this happens,
the sdata member will be set to null and the slength member set to 0.
Likewise, the respList structures, if used, must correspond to dynamic memory
allocated with the new operator and they will be released by the engine with
the delete operator. When this happens, the respList member will be set to
null.
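As an illustration of the ownership rule above, the following sketch builds a stimulus buffer that is meant to be handed over to the engine. makeStimulusBuffer is a hypothetical helper. The page only says "the new operator"; whether the engine expects the array form (new[]) used here is an assumption that should be checked against the engine's actual behaviour.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: copies 16-bit text into a freshly allocated buffer
// intended for the sdata member; slength receives the matching item count.
// Assumption: allocating with new[] is acceptable to the engine.
// For empty input it returns nullptr with slength 0, so that the two members
// stay consistent with each other.
unsigned short *makeStimulusBuffer(const std::u16string &text, int &slength) {
    slength = static_cast<int>(text.size());
    if (slength == 0) {
        return nullptr;
    }
    unsigned short *buffer = new unsigned short[text.size()];
    for (std::size_t i = 0; i < text.size(); ++i) {
        buffer[i] = static_cast<unsigned short>(text[i]);
    }
    return buffer;  // ownership passes to the engine once it is submitted
}
```

After the buffer has been passed to CORBY_Learn(), the user software should not release it or keep using it, since the engine is described as releasing it and resetting the members.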
The return value is an error code; a value of 0 indicates success. See Error
Messages below for more details.
This procedure can only be called when the engine is in the state
Corby_Running, the normal running state. See Engine information below for more
details.
This
procedure can return the following error codes: Corby_OutOfContextError,
Corby_CriticalError, Corby_ChannelIDError, Corby_MaxLearnExceeded,
Corby_InconsistentData.
int CORBY_Feedback(CORBY_FeedbackData *fbdata, bool supres);
This procedure implements the mechanism for suppressing undesirable behaviours
and encouraging desirable ones. The user software can also use this procedure
to train the auto grade mechanism, whereby the engine can grade its
productions based on the user response to them.
Feedback
information is carried by the fbdata structure. The user software must fill the rChannelID member. All the other members are
optional. supres is a flag used to suppress a response given by the engine.
The feedback is relative to the last entry submitted to the engine and its
associated set of responses. If the entry has a response but no stimulus, then
that response is the result of calling the CORBY_Bootstrap() procedure and the
engine will grade the corresponding topic. If the entry has no response for
the channel indicated in the rChannelID member, no grading will take place.
However, this is not an error, as the user software might use this procedure
to add a new response for the stimulus (see below). In this case, if there is
no stimulus, the engine will report an error.
The user
can optionally supply an explanation for the grade given, by using the xlength and xdata members of the fbdata structure. This will be used to
train the auto grade mechanism. For instance, in a text channel, the user can
give this explanation for a grade value of -3: “You’re just being stupid”. This
means that later on, when the user says that sentence in a conversation, the
engine will automatically grade its last production with the same grade value
of -3. Given the engine’s abstraction capabilities, even if the user calls the
system stupid using a different expression, the engine will be able to grade
itself in the same way. Positive feedback works in the same manner. When this
feature is used, the grade member must not be 0 and a response in the indicated channel must be
available.
The
training of the auto grade mechanism, provided by this procedure, will go
through the learning queue and will take into account the current context, as
is the normal case. However, in this case, the engine will take special care so
that the auto grade training doesn’t disturb the context for the normal flow of
learning.
The user software can also use this procedure to supply a response by using
the rlength and rdata members of the fbdata structure. This will replace the
response given by the system, if any, or otherwise simply add a new response.
Later on, through the auto-learn feature, this will be submitted to the
learning process. This provides a very straightforward mechanism for teaching
the engine interactively, especially useful in cases where the engine gives
the wrong response or no response at all. When this feature is used, the grade
member must be 0. Also, when the user software wants to add a new response,
the engine will check if there is a stimulus available and will report an
error if there is none.
This
procedure can be called as many times as necessary in order to grade and
correct all the responses associated with the stimulus, in all the appropriate
channels.
If the user software doesn’t want to provide an explanation, it must set the
xlength member to 0 and the xdata member to null. Likewise, if the user
software doesn’t want to provide the correct response, it must set the rlength
member to 0 and the rdata member to null.
The user
software can also use this procedure to simply suppress a response given by the
engine, thus preventing it from reaching the learning process through the auto-learn
feature. To do this, the user software must call this procedure with the supres flag set. In this case the rlength member must be set to 0 and the rdata member to null. This results in the
text corresponding to the old response being removed, transforming it into an
empty response.
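The rules for the three uses of this procedure (grading with an explanation, supplying a response, and suppressing a response) can be checked before the call. The sketch below is an assumption-laden illustration: FeedbackFields is a hypothetical mirror that only lists the members named in the text, not the real CORBY_FeedbackData layout, and checkFeedback and its result codes are not part of the engine.

```cpp
// Hypothetical mirror of the members of the fbdata structure named in the
// text; the real CORBY_FeedbackData layout comes from the engine header.
struct FeedbackFields {
    int rChannelID;
    int grade;
    int xlength;
    unsigned short *xdata;
    int rlength;
    unsigned short *rdata;
};

// Illustrative result codes; these are not engine error codes.
enum class FeedbackCheck {
    Ok,
    InconsistentData,        // a length and its pointer disagree
    ExplanationNeedsGrade,   // explanation supplied but grade is 0
    ResponseNeedsZeroGrade,  // response supplied but grade is not 0
    SuppressWithResponse     // supres set together with a response
};

// Applies the documented rules to a request before it is sent to the engine.
FeedbackCheck checkFeedback(const FeedbackFields &fb, bool supres) {
    const bool hasExplanation = fb.xlength > 0;
    const bool hasResponse = fb.rlength > 0;
    if (hasExplanation != (fb.xdata != nullptr) ||
        hasResponse != (fb.rdata != nullptr)) {
        return FeedbackCheck::InconsistentData;
    }
    if (supres && hasResponse) return FeedbackCheck::SuppressWithResponse;
    if (hasExplanation && fb.grade == 0) return FeedbackCheck::ExplanationNeedsGrade;
    if (hasResponse && fb.grade != 0) return FeedbackCheck::ResponseNeedsZeroGrade;
    return FeedbackCheck::Ok;
}
```

The check does not cover conditions that only the engine can know, such as whether a stimulus or a response is actually available for the channel.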
If the fbdata structure resides in dynamic memory, the user software is
responsible for releasing it. However, if the xdata member is used, that
structure must correspond to dynamic memory allocated with the new operator
and it will be released by the engine with the delete operator. When this
happens, the xdata member will be set to null and the xlength member set to 0.
Likewise, if the rdata member is used, that structure must correspond to
dynamic memory allocated with the new operator and it will be released by the
engine with the delete operator. When this happens, the rdata member will be
set to null and the rlength member set to 0.
The return value is an error code; a value of 0 indicates success. See Error
Messages below for more details.
This procedure can only be called when the engine is in the state
Corby_Running, the normal running state. See Engine information below for more
details.
This
procedure can return the following error codes: Corby_OutOfContextError,
Corby_CriticalError, Corby_ChannelIDError, Corby_InconsistentData,
Corby_InvalidGradeError, Corby_SRMissingError.
int CORBY_Restart();
This
procedure clears the submission queue. This is useful, for instance, when the
user software wants to start a new interactive session without shutting down
the application. The learning context and topic types 5 and 6 will also be
cleared by this procedure. However, flushing of the learning queue is, like all
the other learning operations, a delayed action: it will take place at the
proper time according to the learning queue.
The return value is an error code; a value of 0 indicates success. See Error
Messages below for more details.
This procedure can only be called when the engine is in the state
Corby_Running, the normal running state. See Engine information below for more
details.
This
procedure can return the following error codes: Corby_OutOfContextError,
Corby_CriticalError.
Option parameters
This section describes the parameters that affect the way that the Corby
Engine does some internal processing. These parameters have a default value
that can be changed by the user via the CORBY_SetGlobalParameter and
CORBY_SetChannelParameter procedures. The current value of each parameter can
be retrieved by the CORBY_GetGlobalParameter and CORBY_GetChannelParameter
procedures.
For some parameters, a change in value affects the engine immediately; for
others, the effect takes place only at engine initialisation.
There are parameters that affect all the channels; these are called global
parameters. Others, however, can be defined per channel, that is, each channel
can have a different value for the parameter; these are called channel
parameters.
Here is a list of the global parameters:
Corby_LearningContextDepth
This
parameter corresponds to the number of previous items that the learning
procedures scan when looking for correlations with the response being currently
learned. Therefore all the dependencies must be contained in that context,
otherwise the system will not be able to learn properly.
The greater
this value is, the greater are the chances that the system will find the
necessary correlations, especially in the cases where multiple channels are
used. However, as the size of the context increases, so does the time it takes
to process one response because the number of items to process increases.
The number
of calls to the CORBY_Learn() procedure, establishing context for a given
response, determines the context size needed.
This
parameter can range from a minimum of 4 to a maximum of 256. The default value
is 8. A change in this parameter is only acknowledged at initialisation time.
Corby_MaxCachePerc
The Corby Engine uses a memory cache to speed up access to the knowledge base.
This parameter defines the maximum amount of memory space that the cache can
occupy, expressed as a percentage of main physical memory.
This
parameter can have values in the range 5…95. The default value is 10.
Corby_MaxKBMbytes
This sets the maximum size of the knowledge base expressed in megabytes. This
parameter controls the working of the Garbage Collector, a thread that scans
the knowledge base looking for records that are no longer used. If no such
records are found and the size of the knowledge base is still above the limit,
the Garbage Collector will try to discard the least important information by
deleting the weakest links in the knowledge base; this will ultimately lead to
deleted records.
The actual size of the knowledge base is given by the Corby_KbSize variable
described below. However, you should bear in mind that deleting a record
doesn’t change the size of the knowledge base files; the record is only marked
as deleted. Deleted records will be reused later when the engine needs to
create a record of the same size. However, the action of the Garbage Collector
is governed by the real size of the knowledge base, that is, it will only
delete records when the real size of the knowledge base exceeds the value set
by this parameter.
There is no upper limit for this parameter; the minimum size is 5 megabytes;
the default value is 2048 megabytes (2 gigabytes).
Corby_MaxLearn
This sets the maximum size of the learning queue in Kbytes. When this value is
exceeded, the CORBY_Learn() procedure will return an error and the entry will
be rejected. In this situation the auto-learn feature will also be disabled.
For those reasons the user software should avoid this situation at all costs.
This parameter can have values in the range 50…50000. The default value is
5120 (5 megabytes).
Corby_AutoLearn
This parameter enables or disables the auto-learning feature, which consists
of sending to the learning procedures all the entries in the submission queue.
This is a binary parameter and consequently it can have only two values: 0 to
disable auto-learning and 1 to enable it. The default value is 1.
Corby_MemoryDepth
This
parameter controls the number of relationships associated with each knowledge
element. The greater the number, the richer the knowledge that the engine has
about something. The drawback is that as the number of relationships increases
so does the time needed to perform many of the engine’s operations. This
parameter also greatly affects the size of the knowledge base: the more
relationships an element has, the more space is needed to store them.
This
parameter can take values in the range 8…256. The default value is 64.
Corby_LearningDepth
This
parameter controls the depth to which the learning process goes while learning
a stimulus-response pair. The greater the depth the more efficient the process
will be. The drawback is again the time needed to process a learning item. This
parameter should be adjusted to the time available for learning a particular set
of items.
This
parameter can take values in the range 1…100. The default value is 10.
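The ranges and default values given above for the global parameters can be collected in a small table, which the user software may use to validate a value before trying to set it. ParamRange, kGlobalParams and isValidGlobalParam are hypothetical names; the parameters are identified here by their names as strings because the numeric identifiers expected by the engine are not shown in this section.

```cpp
#include <climits>
#include <cstring>

// One row per global parameter: name, documented minimum, maximum and default.
struct ParamRange {
    const char *name;
    long min;
    long max;
    long def;
};

// Values taken from the parameter descriptions above.
static const ParamRange kGlobalParams[] = {
    {"Corby_LearningContextDepth", 4, 256, 8},
    {"Corby_MaxCachePerc", 5, 95, 10},
    {"Corby_MaxKBMbytes", 5, LONG_MAX, 2048},  // no documented upper limit
    {"Corby_MaxLearn", 50, 50000, 5120},
    {"Corby_AutoLearn", 0, 1, 1},
    {"Corby_MemoryDepth", 8, 256, 64},
    {"Corby_LearningDepth", 1, 100, 10},
};

// Returns true if value lies within the documented range of the named
// parameter; returns false for out-of-range values and unknown names.
bool isValidGlobalParam(const char *name, long value) {
    for (const ParamRange &p : kGlobalParams) {
        if (std::strcmp(p.name, name) == 0) {
            return value >= p.min && value <= p.max;
        }
    }
    return false;
}
```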
Here is a
list of the channel parameters:
Corby_WordSeparator
This parameter corresponds to the code used for the word separator. This is
used by the parser during analysis of the user input for the channel.
Any 16-bit
value can be used for this parameter. A negative value indicates that no word
separator should be used for this channel. The default value is 32, the ASCII
code for space.
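To make the role of the word separator concrete, here is a minimal sketch of splitting 16-bit channel data into words. It is only an illustration of the idea and an assumption about the general behaviour; the engine's parser is not described on this page, and splitWords is a hypothetical function. Skipping empty words between consecutive separators is a choice made for this sketch.

```cpp
#include <vector>

// Splits 16-bit data into words at each occurrence of separator.
// A negative separator means that no word separator is used, so the whole
// input is returned as a single word. Empty words are skipped.
std::vector<std::vector<unsigned short>> splitWords(const unsigned short *data,
                                                    int length,
                                                    int separator) {
    std::vector<std::vector<unsigned short>> words;
    std::vector<unsigned short> current;
    for (int i = 0; i < length; ++i) {
        if (separator >= 0 && data[i] == separator) {
            if (!current.empty()) {
                words.push_back(current);
                current.clear();
            }
        } else {
            current.push_back(data[i]);
        }
    }
    if (!current.empty()) {
        words.push_back(current);  // last word has no trailing separator
    }
    return words;
}
```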
Error Messages
This section includes a list of error conditions detected by the engine and
reported to the user through return error codes or by the CORBY_GetVariable
procedure accessing the Corby_LastError variable. The list is organized by
error code.
1. Corby_CriticalError. This corresponds to a situation where the system must
disable all the procedures that access the knowledge base. This is necessary
to prevent the corruption of the knowledge base and occurs, for instance, in
all situations where the low-level file I/O procedures of the Operating System
report an error. The occurrence of a critical error changes the engine state
to Corby_Critical. See Engine information below. Most of the procedures
described above can return this error. When a procedure returns this error,
the user software can check the Corby_LastError variable; in some cases, it
will contain another error code that indicates what has happened.
2. Corby_VariableIDError. The VariableID parameter is out of range in the
CORBY_GetVariable procedure. See Engine information below.
3. Corby_ParameterIDError. The ParameterID parameter is out of range. See the
CORBY_SetGlobalParameter, CORBY_SetChannelParameter, CORBY_GetGlobalParameter
and CORBY_GetChannelParameter procedures.
4. Corby_OutOfRangeError. The parameter value is out of range. See the
CORBY_SetGlobalParameter and CORBY_SetChannelParameter procedures.
5. Corby_ChannelIDError. The channelID parameter is out of range. See
CORBY_SetChannelParameter, CORBY_GetChannelParameter, CORBY_Submit(),
CORBY_Dialog(), CORBY_Learn() and CORBY_Feedback().
6. Corby_OutOfContextError. This procedure cannot be called in the current
state. All the restricted access procedures can return this error.
7. Corby_NullPathError. The path parameter is null. See CORBY_CheckKB().
8. Corby_InvalidPathError. The path parameter is invalid: it cannot be
accessed. See CORBY_CheckKB().
9. Corby_NoKBFilesError. One or more essential knowledge base files do not
exist. See CORBY_CheckKB().
10. Corby_MissingKBFilesError. Some of the knowledge base files are missing.
See CORBY_CheckKB().
11. Corby_ThreadsRunningError. Some of the threads are still running. See
CORBY_Reset().
12. Corby_NoSubmitDataError. There is no data in a CORBY_SubmitData structure.
See CORBY_Dialog().
13. Corby_NoResponseError. The engine could not get a response in the time
allowed. See CORBY_GetResponse(), CORBY_Dialog() and CORBY_Bootstrap().
14. Corby_TimeOutError. The time allowed elapsed while getting a response. See
CORBY_GetResponse() and CORBY_Dialog().
15. Corby_InvalidGradeError. The grade value is outside the allowed range. See
CORBY_Feedback().
16. Corby_SRMissingError. The required stimulus or response is missing. See
CORBY_Feedback().
17. Corby_NoStimulusError. There is no stimulus in the queue for which to
provide a response. See CORBY_GetResponse() and CORBY_Dialog().
18. Corby_ChannelMapError. There are either too many (more than one) or too
few (none) channels selected in the rChannelMap member of the
CORBY_ResponseData structure. See CORBY_GetResponse(), CORBY_Dialog() and
CORBY_Bootstrap().
19. Corby_NoCallBackError. The pointer to the callback procedure is invalid.
See CORBY_RecoverKB().
20. Corby_SymbolUpdateError. An error occurred while updating a symbol in the
knowledge base. This is a critical error.
21. Corby_SymbolNotFoundError. A referenced symbol could not be found in the
knowledge base. This is a critical error.
22. Corby_SymbolCreationError. An error occurred while creating a symbol in
the knowledge base. This is a critical error.
23. Corby_SymbolDeletingError. An error occurred while deleting a symbol from
the knowledge base. This is a critical error.
24. Corby_MaxLearnExceeded. The maximum size of the learning queue has been
exceeded. See CORBY_Learn().
25. Corby_InconsistentData. A data structure has inconsistent data, e.g. the
sdata member is null and the slength member is not 0 or vice-versa. See
CORBY_Submit(), CORBY_Dialog(), CORBY_Feedback() and CORBY_Learn().
26. Corby_LearningQueueError. An error occurred while reading from or writing
to the learning queue. This is a critical error.
27. Corby_KbIncompatVersion. The version of the knowledge base is incompatible
with the current version of the Corby engine software.
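For reference, the list above can be written as an enumeration. This sketch assumes that the position of each item in the list is its numeric error code, which the phrase "organized by error code" suggests but does not state outright; the real definitions come from the engine header and would clash with these if both were included. isDocumentedCritical is a hypothetical helper that only reflects the errors explicitly flagged as critical in the list.

```cpp
// Error codes in list order; the numeric values are an assumption based on
// the position of each item in the list.
enum CorbyErrorCode {
    Corby_CriticalError = 1,
    Corby_VariableIDError = 2,
    Corby_ParameterIDError = 3,
    Corby_OutOfRangeError = 4,
    Corby_ChannelIDError = 5,
    Corby_OutOfContextError = 6,
    Corby_NullPathError = 7,
    Corby_InvalidPathError = 8,
    Corby_NoKBFilesError = 9,
    Corby_MissingKBFilesError = 10,
    Corby_ThreadsRunningError = 11,
    Corby_NoSubmitDataError = 12,
    Corby_NoResponseError = 13,
    Corby_TimeOutError = 14,
    Corby_InvalidGradeError = 15,
    Corby_SRMissingError = 16,
    Corby_NoStimulusError = 17,
    Corby_ChannelMapError = 18,
    Corby_NoCallBackError = 19,
    Corby_SymbolUpdateError = 20,
    Corby_SymbolNotFoundError = 21,
    Corby_SymbolCreationError = 22,
    Corby_SymbolDeletingError = 23,
    Corby_MaxLearnExceeded = 24,
    Corby_InconsistentData = 25,
    Corby_LearningQueueError = 26,
    Corby_KbIncompatVersion = 27
};

// True for the codes that the list explicitly describes as critical errors,
// plus Corby_CriticalError itself.
bool isDocumentedCritical(int code) {
    switch (code) {
        case Corby_CriticalError:
        case Corby_SymbolUpdateError:
        case Corby_SymbolNotFoundError:
        case Corby_SymbolCreationError:
        case Corby_SymbolDeletingError:
        case Corby_LearningQueueError:
            return true;
        default:
            return false;
    }
}
```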
Engine information
This section describes the information about the engine that the user software
can get through the procedure CORBY_GetVariable. The list is organized by
InfoType.
1. Corby_EngineState. This is the variable that controls access for the
procedures in the restricted group. It can have the following values:
0 - Corby_InitialState - Initial state, engine not initialised
1 - Corby_Kbchecked - Knowledge base checked
2 - Corby_Kbrecover - Recovering knowledge base files
3 - Corby_Running - Normal running state
4 - Corby_Closing - Closing down
5 - Corby_Critical - Critical error
2. Corby_LastError. This variable contains the last error detected by the
engine that caused it to change its state to Corby_Critical. By reading it,
the user software can find out which specific error occurred and take
appropriate action.
3. Corby_NumThreads. This variable indicates the number of currently active
threads in the Corby Engine. The engine can only be shut down when this
variable is 0.
4. Corby_NumLearn. This variable indicates the number of items in the learning
queue awaiting processing.
5. Corby_LearnSize. This is the actual size of the learning queue in Kbytes.
6. Corby_KbSize. This variable indicates the size of the knowledge base
expressed in megabytes. This variable corresponds to the real size occupied by
the main knowledge base data file. This value is usually less than the one
reported by the file manager, due to deleted records, which occupy file space
but are no longer in use by the engine. This variable doesn’t take into
account the space used by the auxiliary files (usually a fraction of the space
occupied by the main data file).
7. Corby_KbRecords. This variable indicates the number of active records in
the knowledge base.
8. Corby_NumItemsCache. This variable indicates the number of items currently
in the cache.
9. Corby_MemoryUsed. This is the amount of main memory currently used by the
cache.
10. Corby_SessionConcepts. This variable indicates the number of concepts
created in the current session.
11. Corby_KBConcepts. This variable indicates the total number of concepts in
the knowledge base. The product of this variable with the
Corby_InstancesPerCcept variable constitutes the Figure of Merit, which
indicates the quality level of the knowledge base.
12. Corby_InstancesPerCcept. This is the average number of instances per
concept in the knowledge base. The product of this variable with the
Corby_KBConcepts variable constitutes the Figure of Merit, which indicates the
quality level of the knowledge base.
13. Corby_KbaseVersion. This variable indicates the knowledge base version
times 10. It is used to verify the compatibility between the knowledge base
and the current version of the Corby engine software. When the knowledge base
is created, it inherits the software version of the Corby engine that created
it. This is a number in the form x.y where x is the major revision number and
y is the minor one. For instance, if the knowledge base was created by the
Corby engine software version 1.5, this variable will hold the value 15. As
the software evolves, it may no longer be compatible with earlier versions of
the knowledge base. In those cases, a utility will be provided that upgrades
the old knowledge base to the new version.
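Two of the variables above lend themselves to a short illustration: the engine state values and the encoding of the knowledge base version. In the sketch below the enumeration simply restates the documented state values (the real definitions come from the engine header), while KbVersion, decodeKbVersion and canCallRunningOnlyProcedure are hypothetical names introduced for this example.

```cpp
// Engine states with the values listed for Corby_EngineState.
enum CorbyEngineState {
    Corby_InitialState = 0,
    Corby_Kbchecked = 1,
    Corby_Kbrecover = 2,
    Corby_Running = 3,
    Corby_Closing = 4,
    Corby_Critical = 5
};

// Major and minor revision numbers of a knowledge base version x.y.
struct KbVersion {
    int majorRev;
    int minorRev;
};

// Corby_KbaseVersion holds the version times 10, so 15 stands for 1.5.
KbVersion decodeKbVersion(int kbaseVersion) {
    KbVersion version;
    version.majorRev = kbaseVersion / 10;
    version.minorRev = kbaseVersion % 10;
    return version;
}

// Procedures documented as callable only in the normal running state should
// be attempted only when the engine state is Corby_Running.
bool canCallRunningOnlyProcedure(int engineState) {
    return engineState == Corby_Running;
}
```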
Comments and suggestions about this page are welcome and should be sent
to fadevelop@clix.pt
Rev 1.0 - This page was last modified 2005-08-26
- Copyright © 2004-2005 A.C.Esteves