IBM Agent Building Environment Developer's Toolkit


Appendix B. Frameworks in General


Engine Framework

Description

The Engine Framework operates between knowledge and adapters -- between an agent's internal representation of the world and its interactions with the external world itself. The word engine is commonly used to describe a broad range of decision-making technologies, such as script interpreters, linguistic analyzers, rule interpreters, and many other information technologies. The Engine Framework provides a structure for all these engine types but remains technology-neutral; it does not require the use of any particular decision technology. However, this release of the toolkit is currently targeted at supporting rule interpreters.

The Engine Framework is based on the event-condition-action paradigm, which operates across this boundary. The paradigm has three parts: events that arrive from the world and trigger processing, conditions that are evaluated against the agent's knowledge, and actions that are taken in response.

These interfaces are paralleled by the Adapter Interface, through which an engine controls applications. In fact, this is the point of the engine interfaces; engines can work with each other just as if they were objects under an Adapter Interface. Each engine can specify its semantic interface, which allows any other engine to process its events and interact with it according to the dynamic configuration allowed by run-time binding. In fact, most engines will be under the configuration and policy control of rules within an inference engine.

Types of Engines

The engine framework in Level 6 of the IBM Agent Building Environment Developer's Toolkit is represented by the most generally required engine type, a rule-based Inferencer, but the framework can be extended to include many other types, such as engines for learning, which is also becoming generally required. This section explains the different engine types and how such other engines can be added, so that engine developers can begin working with the other IBM Agent Building Environment Developer's Toolkit components.

Chainer

Every agent contains a chainer engine, of one sort or another. A chainer observes events and interacts with the environment through Adapters. A rule-based inference engine is the most common example. Although this type is labeled Chainer, there is no presumption that it uses inferential chaining. This engine type simply chains the event to some network of symbols, nodes, other engines, or whatever, eventually chaining back to sense or effect the world through an adapter.

Executive

Very often an application will also need an Executive engine, which observes events but does not provide the core processing function of a Chainer. Instead, an Executive is responsible for specific problems outside the role of the Chainer and manages one or more Chainers. For example, a UserIterator engine within a multiuser system receives a single event from the environment and steps a rule-based inference engine through the rules of each user.

An Executive engine has two sets of interfaces. As an engine that communicates with adapters, it observes events and requests sensors and effectors. As an event source to another engine, it generates events and acts as if it were an adapter to its downstream engines. Specifically, an Executive engine mimics an adapter, taking its single eventId and exploding it into a series of events (such as one for each user in the UserIterator). The Chainer can call any other adapter directly, but it cannot return the eventId back to the original adapter because the Executive has changed it. The Executive must intercept the request to the original adapter, resolve the user-specific eventId back to the original single eventId, and pass the request through to the adapter.
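
The eventId bookkeeping described above can be sketched briefly. The following Python sketch is illustrative only -- the class, method names, and eventId scheme are invented and do not reflect the toolkit's actual C++ interfaces.

```python
# Hypothetical sketch of an Executive's eventId mapping. All names are
# assumptions for illustration, not the toolkit API.

class UserIteratorExecutive:
    """Explodes one adapter event into per-user events, then resolves
    user-specific eventIds back to the adapter's original eventId."""

    def __init__(self, chainer, users):
        self.chainer = chainer
        self.users = users
        self.event_map = {}  # user-specific eventId -> original eventId

    def observe(self, event_id, facts):
        # Step the downstream Chainer once per user, each episode with
        # its own user-specific eventId.
        for user in self.users:
            user_event_id = f"{event_id}:{user}"
            self.event_map[user_event_id] = event_id
            self.chainer.observe(user_event_id, facts)

    def perform_action(self, adapter, user_event_id, action):
        # Intercept a request headed back to the original adapter and
        # restore the single eventId that adapter handed out.
        original_id = self.event_map[user_event_id]
        adapter.perform_action(original_id, action)
```

The essential point is the `event_map`: the Executive is the only component that knows both the exploded and the original identifiers, so every request back to the originating adapter must pass through it.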

An Executive can control one or more other engines; it is often useful, for example, to intercept the decisions of several different engine types in order to resolve conflicts or eliminate redundant actions on the same event context. An Executive can manage other engines in parallel, or it may be designed to serialize them, such as when hypothesis generation (by a generalizing learning engine) is followed by safety checking (rule-based filtering). This complexity is not required in all cases; therefore, the interfaces for such Adapter mimicry are optional.

The internal technology of Executives is not specified. They can be hard-coded, script-based, state-controlled, or whatever.

Analyzer

In Adapter terms, Analyzer engines provide sensor functions. Given a set of facts, they translate, filter, map, transform or otherwise generate new (derived) facts. Just as an Adapter can provide several different sensors, each Analyzer registers and provides one or more analytical procedures. For instance, a linguistic analyzer might provide the following derivations, given a long string of text:

With such an Analyzer, a rule's antecedent can be written at a higher level -- "if the article contains keyword 'Lotus'" can be raised to "if the article is about automobiles", whether or not "automobiles" is explicitly contained as a keyword. Analyzers provide a sensing function to other engines, but they are not Adapters because they do not truly interact with the external application world. (There are other complexities, which are mentioned later.) Their mapping or evaluative functions tend to be stateless, self-contained, and wholly internal to the agent composition. Also note that linguistic Analyzers are very sensitive to National Language considerations.
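
A minimal category-from-keywords derivation might look like the following Python sketch. The category table, function names, and Boolean-sensing shortcut are all invented for illustration; the toolkit's actual Analyzer interface is not shown here.

```python
# Illustrative Analyzer sketch: derives higher-level category facts
# from lower-level keyword facts. Stateless and self-contained, as the
# text describes. The category table is an assumption.

CATEGORY_KEYWORDS = {
    "automobiles": {"lotus", "ferrari", "sedan"},
    "software":    {"compiler", "database"},
}

def category_from_keywords(keywords):
    """Sensor-like, synchronous query: map keywords to derived categories."""
    found = set(k.lower() for k in keywords)
    return sorted(c for c, words in CATEGORY_KEYWORDS.items()
                  if found & words)

def article_is_about(keywords, category):
    """Boolean sensing, the special case allowed for efficiency."""
    return category in category_from_keywords(keywords)
```

Because the functions depend only on their arguments, two calls with the same keywords always yield the same derived facts -- the statelessness that distinguishes an Analyzer from an Adapter.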

Analyzers can be thought of as between Adapters and Engines, and can serve the needs of either. For instance, category-from-keywords might be geared to the application of electronic commerce adapters. On the other hand, some engines might require special transformations such as one-of-N or "thermometer" codings. Analyzers can also be image-based rather than text-based, but in general, their functions cut across the application focus of specific adapters. Their analyses are based on their argument types.

Analyzers have a much simpler interface than Adapters. Like Adapters, they must register their procedures when asked to identify themselves, but otherwise, they only provide sensor-like, synchronous queries. As with Adapters, Boolean sensing is a special case allowed for efficiency. Note that the reference to IAEventHeader is still required; Analyzers provide stateless, static functions based only on the binding strings they are given, but the event header contains fields that might become critical, such as request priority during real-time inferencing.

The interface otherwise uses terminology identical to the Adapter Interface. The use of "sensor" within function names emphasizes the similarities between the two and will ease migration for engine developers from the Adapter Interface to this one. Analyzers provide internal "sense".

Monitor

Monitor type engines can range from trivial loggers to the most sophisticated learning algorithms. In general, they attach to and watch an event stream. However, the semantics and control of learning (for example) are much more complicated than the generic IAEngine::observe(TriggerEvent) function. While it is architecturally allowed that any engine can implement itself as a full IAEngine, it is preferred that each monitor focus on its core functionality and leave all of the application authoring control to rule-based configuration. Not only does this approach provide more application flexibility; it also avoids the considerable work of implementing a full IAEngine, effort that is better spent developing the Monitor's core functions.

A lot of authoring is required around Monitors. For instance, core learning technologies (such as a statistical analysis or neural network) do not always evaluate the relevance of an event or the contiguity of cause and effect in real-time. This application specific knowledge must be authored. Something other than the associative algorithm must specify when to learn and what to learn. What events are indeed important to the application and would be valuable to predict in the future or merely remember? What are the relevant attributes for attention? In some regards, answering this question is the job of learning technology, but the stripping of attributes can help the Monitor better focus. This "authoring" is also true of real neural systems, which demonstrate instinctive behaviors and -- even in learning -- a required "preparedness" to learn.

As well, a learning engine would be trivial if it only watched; it should also do something to be of value. Learning engines are most valuable when they generalize their knowledge to new situations -- but this can be a very dangerous benefit! Simply because an agent predicts that a user will delete a mail item does not mean that it should in fact delete it. To the agent, it might as well delete it as place it in a "Junk Folder". The decision of what to do must be authored by the application developer (configuration rules) or end-user (personal rules).

This is a general principle of knowledge and the control of engines within the IBM Agent Building Environment Developer's Toolkit. Knowing something does not necessarily imply doing something. Predicting an action certainly does not imply doing the identical action. In fact, prediction of an event is often useful for controlling avoidance of the event. Therefore, the control of Monitors must be provided by explicit authoring/instruction, such as through a rule-based Inferencer.

Monitor designers will often also want to generate events, although this is entirely optional. By notifying the engine composition that some confidence threshold has been reached or that a new link has been formed, for example, some consequence can be attached to the notification. The user can be asked whether the agent should automate some step in the future, or the agent may pursue any other course of action as it might be instructed.

The Monitor interface is more complex than the Analyzer interface and requires almost the entire Adapter Interface. A Monitor will provide write services, which are used when the Monitor should watch an event stream and record important information. The performAction function allows a Monitor to express its write services. Different forms of writing, even within the same Monitor, might require different arguments, and therefore different effectors should be defined. For instance, some forms of learning differentiate between the simple observation of an object (for sensory correlations) and the learning of cause and effect (for predictive contiguities).

A Monitor will also provide read services, which are used when the Monitor uses its stored information to respond to queries. The answerQuery function allows the Monitor's read services to be expressed. For instance, the application might require pattern matching; given a partial set of facts, many learning/memory engines can complete the pattern. Predicting an event based on the partial set of facts, on the other hand, is another, different function.

Finally, the notify function is optional, but Monitor engines will tend to implement it. Monitors will often reach internal critical states, which are potentially important to the agent system. For instance, as learning slowly develops case by case, reaching some level of confidence might present a valuable opportunity, such as asking the user about automating some action that is now confidently predicted. Again, rule authoring of such agent behavior is suggested. The Monitor's responsibility is to give notice of such an event; the consequence of the event is the responsibility of the application.
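
The write, read, and notify services described above can be pictured with a toy Monitor. The following Python sketch is an assumption-laden illustration -- the class, the simple counting scheme, and the fixed threshold are invented, and the real toolkit interfaces are C++ and considerably richer.

```python
# Toy Monitor sketch: records (context, action) observations (a write
# service), predicts the most common action (a read service), and gives
# notice when a confidence threshold is reached. All names are invented.

class CountingMonitor:
    def __init__(self, notify, threshold=3):
        self.counts = {}       # (context, action) -> observation count
        self.notify = notify   # callback standing in for event generation
        self.threshold = threshold

    def perform_action(self, context, action):
        # Write service: watch the event stream and record the pairing.
        key = (context, action)
        self.counts[key] = self.counts.get(key, 0) + 1
        if self.counts[key] == self.threshold:
            # Internal critical state reached: make notice of it and
            # leave the consequence to the application's rules.
            self.notify(context, action, self.counts[key])

    def answer_query(self, context):
        # Read service: predict the most frequently observed action.
        candidates = {a: n for (c, a), n in self.counts.items() if c == context}
        return max(candidates, key=candidates.get) if candidates else None
```

Note that the Monitor only notifies; deciding what to do with the prediction -- automate, ask the user, or ignore -- is left to authored rules, exactly as the text prescribes.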

Compositions

These engine types can be configured into several common design patterns. Only a few of the many possible configurations are reasonable, ranging from a single Inferencer to a mixture of hybrid types and controls. The composition of these engines requires that the knowledge of one engine be expressed in terms of the knowledge and inferential processes of another.

Simple Composition

The simple composition is one Chainer, which observes events and calls adapters for sensing and effecting. This is the composition provided in Level 6 of the IBM Agent Building Environment Developer's Toolkit, where the Chainer is a rule-based inference engine. The strongest requirements for IA systems are for rule-based inferencing; therefore, it has been provided first.

Reflective Composition

The reflective composition is one Chainer, supported by an Analyzer. This composition is called reflective because the agent's decisions are based on the state of another internal component.

Hybrid Composition

The hybrid composition consists of one Chainer, supported by an Analyzer and a Monitor. Although hybrid technology systems are still in the minority, this is an emerging trend. A complete agent is constituted from a set of different, complementary technologies. In most applications, both rule-based and learning-based requirements coexist; users need to give explicit instruction in some cases. In other cases, the user cannot or does not want to give instruction. While the Chainer can control the operation of the Monitor, the value of rule-based instruction is available in its own right. Obviously, such complexities should be made seamless at the user interface so that the hybridization is not a hodge-podge of user controls. This is the responsibility of the View Framework, but rule-based composition and control of these various other engine types preserves the perceived integrity of the agent for the user.

Complex Composition

Agents will often be built as multi-user services, which adds complexities for managing different knowledge sets for different users. When the agent receives a single event from an adapter representing a common source and needs to service this event for many users, executive control is required, which is not the responsibility of inference or learning engines.

In this composition, an Executive controls a Chainer. The Executive receives a trigger event from an adapter and then explodes that single trigger event into multiple inferencing episodes, one episode for each user supported by the Executive. For this and other such problems, the Engine Framework will expand to allow for other basic types and more complex compositions.


Knowledge Framework

The Knowledge Framework is not truly an architectural "layer" with a well-defined API and SPI. Knowledge is used by engines to do their processing, and this knowledge is stored by the library. Because of that role, to understand the Knowledge Framework, you need to look at the Engine Framework and the Library Framework.


Library Framework

The IBM Agent Building Environment Developer's Toolkit Library is the manager of persistent storage and retrieval of objects that are needed for engine inferencing. It is also used by rule editors to originate and modify (author) the materials that drive inferencing, e.g. rules and long-term facts. In addition, the library provides storage for scoping and organizational grouping information, such as user lists and storage for metadata on such grouping information, e.g. user profiles, organizational profiles, and system profiles. This information is referred to as control information. The inferencing information from the library (built and maintained by rule editors) is retrieved as needed to build knowledge sets that are used to drive inferencing of IBM Agent Building Environment Developer's Toolkit engines. Knowledge sets are memory objects and are part of the knowledge store. The library provides the retrieval of inferencing information, but it is not the component that builds the knowledge set from it.

Conduct Set Objects

The set of inferencing materials (rules, facts, etc.) that are needed for an instance of inferencing by an inferencing engine is called a conduct set and the persistent storage of this information is referred to as inference store. For the particular case of rule inferencing we refer to the persistent storage as inference rule store. In summary, control information is used to select a particular inferencing set and that inferencing set is used to build a knowledge set for an inferencing engine. All of this is kicked off by trigger event arrivals. The library provides a platform-independent and repository-independent system for storing and retrieving the control information and inferencing sets without being involved in the formulation of the data itself, i.e. the library is a passive service.

Access Library Objects by Name

Information storage and retrieval from the IBM Agent Building Environment Developer's Toolkit Library is accomplished through names rather than through platform or repository-specific navigational information. The only platform/repository-specific information required by the library is a map that is provided through the initial agent configuration process. This map allows the library to find a library object in persistent storage from which the remaining library objects can be located. The IBM Agent Building Environment Developer's Toolkit user is given platform/repository-specific instructions for providing this map (string).

Each object in the library store is named and typed for identification. Naming is controlled by the user of the library through the administrative process or rule editor, for example. Persistent library object typing is accomplished automatically and implicitly by the library based on the library object (in memory) and the context that is used to create the persistent storage object. Because the library provides the means for users to scope names, selection of unique names is made simple. Name scopes are based on hierarchical groupings of inferencing objects. These same groupings provide control points for secure access and consistency for concurrent access to persistent objects in the library.
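
Hierarchical name scoping can be sketched as follows. This Python fragment is purely illustrative -- the class, the slash-separated naming convention, and the per-user scopes are assumptions, not the library's actual object model.

```python
# Sketch of name-scoped library storage: the same short name can be
# reused in different scopes without collision. Names are invented.

class LibraryScope:
    """A hierarchical naming scope for library objects."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.objects = {}

    def full_name(self, short_name):
        # Walk up the hierarchy to qualify the short name.
        parts, scope = [short_name], self
        while scope:
            parts.append(scope.name)
            scope = scope.parent
        return "/".join(reversed(parts))

    def store(self, short_name, obj):
        self.objects[short_name] = obj

    def retrieve(self, short_name):
        return self.objects[short_name]
```

Because each scope holds its own name table, "mailRules" for one user and "mailRules" for another are distinct persistent objects; the scope is also a natural control point for access and concurrency, as the text notes.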

Rules and facts in the library can be selected by name, as described above. Additionally, you can control the state of rules and facts, where such state can affect which rules or facts are loaded for inferencing. Changes in state can also be effected by the consequent of a rule firing; thus one rule can affect the loading of subsequent rules.

Local access to the IBM Agent Building Environment Developer's Toolkit Library is through a simple set of local objects in agent memory. Transparent to the library user, access to local or remote library repositories is accomplished through these local objects. These local library objects are for library access, not directly for inferencing. They are used as a source for building knowledge sets which are used for inferencing.

Knowledge Interchange Format

Inferencing rules and long term facts are stored in the inferencing rule store of the library in the form of linear Knowledge Interchange Format (KIF). This format is intended to permit interchange of inferencing rules between engines of like type and to provide a common target format for multiple rule editors. In addition, the same KIF format is used by IBM Agent Building Environment Developer's Toolkit adapters to encode short term facts to engines through trigger events. This common externalization of rules and facts promotes interchange and architectural consistency. Although KIF syntax is human-readable, it is not intended as a rule editing language. The library's only direct sensitivity to the contents of rules and facts is by way of its service for validating KIF syntax before storing rules in the inference rule store.
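
KIF is a parenthesized, s-expression notation, and the library's syntax-validation service amounts to checking that a rule is well formed before it is stored. The sketch below is a deliberately minimal stand-in: the sample rule's predicates (mailArrived, sender, notify) are invented, and the real validation is far richer than a parenthesis balance check.

```python
# Minimal stand-in for KIF syntax validation before storing a rule.
# The real library service is richer; this only checks balance and shape.

def kif_syntax_ok(text):
    """Return True if text is a non-empty, balanced parenthesized form."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:        # closing paren with no open
                return False
    return depth == 0 and text.strip().startswith("(")

# A hypothetical mail-filtering rule in linear KIF form (predicates
# are invented examples):
RULE = '(=> (and (mailArrived ?msg) (sender ?msg "boss")) (notify ?msg))'
```

The `=>` form expresses implication: when the antecedent facts match, the consequent action is licensed -- the same shape an adapter uses when it encodes short term facts onto a trigger event.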

Learning Objects

In addition to storing traditional inferencing materials and control information for selection of inferencing sets, the library also expects to be a provider of persistent storage for learning engine objects such as episodic memory, associative objects, and profiles for recording user actions, activities, and preferences (expressed or learned).

Logs

The library also allows you to store and display chronological and persistent logs of agent activities. You can maintain many such activity logs, each with a particular organizational scope. For example, you could maintain a log of inferencing activity for each user for whom inferencing is done. This could be used by a user to examine the actions that result from the rules that were run on his/her behalf. The activities recorded depend on the application's or individual engine's use of logs. Other uses for logging include debugging, auditing, and accounting. Here again the library offers logging as a passive service, not defining what is logged or when it is logged.

Object Metadata

The library provides for persistent metadata storage and retrieval. Metadata can be associated by name with most library objects. When you retrieve an object, you can optionally retrieve metadata with that object. This metadata can be used to describe the object contents. For example, rule editors can use metadata to associate with rules such things as end-user terminology, GUI panel associations, and template associations. Engines can use the metadata to record signatures of rule elements. It can also be used to associate registration information with inferencing sets.

To allow for multiple instances of metadata for a single library object, metadata can be named. The scope of the name is limited to the scope of the object to which the metadata is associated. Multiple instances of metadata allow you to describe the same inferencing data in different ways. You can use this to set up conventions whereby the same objects can be shared, interpreted, or used in different contexts by different programs or users.

Note: Although multiple instances of metadata are supported at the Collector, RuleSet, and LTFactSet levels, we are still examining the feasibility of multiple instances of metadata at the lowest level, e.g. the Rule or LTFact level. Although it might be nice to allow multiple rule editors for the same set of rules, it is not clear how these editors would coordinate changes, especially concerning their own versions of metadata. Therefore, while we allow multiple editors of the same rules, multiple instances of Rule or LTFact-level metadata are not yet permitted. (If we had two rule editors, each with its own metadata for the same set of rules, it is not clear what happens when one of these editors adds a rule to the set. Even though the editor that adds the rule would be able to add its own metadata to the rule, the second editor might not know how to retrofit its different metadata to the new rule. Metadata tends to fall out naturally from the authoring of a new rule; retrofitting metadata after rule creation would seem to be unnatural.)


View Framework

The View Framework includes all interactions between the user and the agent (and between the user and the Library, not formally contained by the agent). Such views cover a wide scope of issues including:

Moreover, the View Framework will address a number of advanced topics such as

While the full framework covers this scope, the initial focus of this document is on agent instruction, which includes administration and some underlying services.

The word "editors" is used loosely, because rule editors have mostly failed for common end-users. Rather than have users write rules (which they do not do), some researchers claim that agents should just watch and learn. There is much truth to this approach, but the View Framework is more open to the range of knowledge technologies than such a single paradigmatic claim. For instance, many forms of knowledge, such as corporate or departmental policy, are rule-based by nature. Some instructions to an assistant can be explicit and well stated. Sometimes, there is no other method of instruction; an agent cannot learn how to handle office jobs when a user is on vacation; the agent must be told. On the other hand, learning is in fact a primary form of instruction, and so the View Framework's "editor" must be able to surface and simplify the underlying complexity -- from single inference engines to the hybrid combination of many engine types.

This framework focuses on the notion of an agent dialog or "smart guide". Similar to the Engine Framework's use of rule-based composition -- even around the inclusion of learning -- the View Framework uses instruction as a primary notion for the following tasks:

This philosophy of instruction allows the seamless use of smart guide dialogs from application integrator to administrator to end-user -- based on a standard KIF representation language.

Interaction with other Frameworks

Adapters

Adapters make absolutely no assumption about viewers and editors. They are responsible only for their semantic interface, which is symbolic. Any association between adapter symbols and end-user terminology is provided by the view framework.

Adapters will tend to require some administrative viewing and control. For instance, some adapters, such as for e-mail, will require the end-user's password (depending on the system's security model) in order to act as an autonomous agent. The collection and maintenance of such data is the responsibility of the View Framework. Installation and removal of agent components in the operating system's registry is also a primary administrative task, which must be addressed.

In future releases of the IBM Agent Building Environment Developer's Toolkit, when learning engines are also provided, adaptive user modeling will require that adapters provide more and more semantic events. For instance, the MAILARRIVED event is driven by the mail system, not the end-user. It is the critical event for agent automation, but other events such as OPENED, CLOSED, and PRINTED would need to be delivered by an adapter for agent learning. Special adapters built specifically for end-user interaction are also required. For instance, a rule might require the "agent" to ask the user a question (a sensor) or deliver a message (an effector). Some special events, such as COMMAND, can be sent directly to the agent. The user interface to these functions and the semantic interface to the agent are provided by specialized adapters.

These last two requirements for the user-agent runtime dialog are not the focus of this document, however. The initial needs from the View Framework are for rule-based instruction.

Library

For any type of knowledge representation, viewing and explicit editing of knowledge are performed through the Library. As much as possible, the Library is based on standard knowledge representations such as KIF, allowing any viewer or editor to plug-and-play in the View Framework, so long as it works against the KIF format. In the same way that Engines are free to convert KIF into a parochial format for their run-time if they choose, rule editors might choose to map KIF into their own format to best manage the presentation.

As a service to Views, the Library can also store any view specific metadata associated with a rule. The KIF representation contains user provided values, but otherwise is entirely symbolic and formal. Unfortunately, some views may lose data when "compiling" to KIF format; therefore, the Library can be used to store such additional data as needed. For instance, natural language mapping from a rule to a more "common language" expression can be stored as metadata.

Other relationships such as to the Engine Framework are also required and will be elaborated in future releases.

Design Criteria

The gamut of design criteria for the View Framework is as large as the framework itself. Of course, user interface design criteria are primary. These include issues of panel design and the special issues involved with user-agent interaction. Also, IA engenders new system design issues: For instance, an agent that polls every five minutes for new mail might be disconcerting to the user who assumes the agent acts immediately when mail arrives; the user might see an example of junk mail that he/she knows should be filtered by the agent -- but the agent has not yet "seen" this item.

Aside from all such other issues, there are the design goals of the View Framework itself:

Examples of these design points are provided in the following components, which are based around the natural metaphor of user-to-agent instruction.

Components

Note: These components are not provided in Level 6 of the IBM Agent Building Environment Developer's Toolkit, but they give some sense of direction for how the IBM Agent Building Environment Developer's Toolkit will grow.

Rule Authoring

Rule authoring can be done with a composite client program for administrators and end-users. This program will allow creation of rules that configure an agent and of rules that instruct the agent. While administrators and end-users will author different sorts of rules, there is little difference between construction and instruction of the agent.

Rule authoring will use a combination of these advanced presentation techniques:

Form-based and other graphically-based methods of direct rule authoring by common end-users have generally failed. Many rule editors have been very well done. For instance, some graphical editors do not require the explicit expression of "and" or "or" by making these functions implied in the sequential/parallel arrangement of nodes. Some form-based editors try to simplify the problem by disallowing any term nesting, and even disallowing OR between terms (all terms are ANDed; another rule can be made to OR another case). But end-users simply do not want to write rules. Not only is this a secondary task to doing business, it is the very hardest work in IA.

Common language can be used to suggest rules as templates; users merely fill in the blanks if they like the overall function of the template. However, even this imposes a hard cognitive load. The View Framework provides both the common language and smart guide methods in concert with each other.

InstructionDialog

This is the script which runs the dialogs. Dialogs tend to be presentation- and media-independent. The Frame classes mentioned below assume a graphical user interface, but the nature of dialog allows the user-agent interactions to be played over a telephone as well.

Stylistically, each instance of a dialog should be a relatively small, modular set of questions about a specific context, such as deleting junk mail. Each dialog is associated with that context.

InstructionDialogEngine

InstructionDialogs are controlled by an interpretive engine.

ContextListFrame

The rule authoring program must be scalable. It must handle the instruction of a simple agent with only one or two adapters as well as a monolithic personal secretary agent that handles virtually all of a user's office objects. To achieve this, each InstructionDialog is modular and associated with a particular object and event. The event type will tend to define the context for the dialog, such as "Let's talk about what you want me to do when mail arrives." "Do you get a lot of junk mail from your manager?" "Who is your manager?" "What is the typical subject of this junk mail?" This list of contexts can be managed in terms of adapters and adapter events, or can otherwise include an item for other tasks such as administration.

The context list frame is not only scalable; it is also context sensitive. The starting point can be specified so that any particular dialog can become the initial focus. This allows two flavors of sensitivity:

The modularity of InstructionDialogs allows users to incrementally build the agent's repertoire, and with it, to incrementally build up trust in the agent. For instance, the user might initially use only one of several dialogs. This is similar to getting a new secretary: as things seem to go well, the user can return to give more instructions.

RuleTemplate

While KIF is used as the standard knowledge format for rules, KIF syntax is very far from the natural language of common end-users. The mapping from a KIF rule to a common language representation is defined by a RuleTemplate. The major part of this mapping is locating the user-specified values in each of the two forms.
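
One way to picture a RuleTemplate is a pair of template strings that share placeholders, so the user-specified values can be carried between the KIF form and the common-language form. The sketch below is an assumption: both template formats, the class, and the sample predicates are invented for illustration, not toolkit-defined.

```python
import re

class RuleTemplate:
    """Maps user-specified values between a KIF form and a common-language
    form via shared {placeholders}. Both template formats are invented
    examples, not the toolkit's actual representation."""

    def __init__(self, kif_form, common_form):
        self.kif_form = kif_form
        self.common_form = common_form

    def to_kif(self, values):
        # Fill the user's values into the KIF skeleton.
        return self.kif_form.format(**values)

    def to_common(self, values):
        # Fill the same values into the common-language sentence.
        return self.common_form.format(**values)

    def values_from_common(self, sentence):
        # Recover the user-specified values by matching the common form.
        pattern = re.escape(self.common_form)
        pattern = re.sub(r"\\{(\w+)\\}", r"(?P<\1>.+?)", pattern) + "$"
        match = re.match(pattern, sentence)
        return match.groupdict() if match else None
```

Because the placeholders are the same in both forms, editing a value in the common-language sentence can be pushed straight back into the stored KIF rule.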

InstructionListFrame

As the user works with the dialogs to build rules, the common language representations of those rules can be displayed in an "Agent Instructions" list. In other words, the dialogs can display a fair representation of what the agent "thinks" it needs to do as a result of the dialog. This is one form of confirmation, an important feedback mechanism in any dialog: the confirmation re-expresses the dialog in another format. The dialog itself and the instructions it generates are different but related. The user can edit an instruction by selecting it, which places the user back in the dialog. Direct viewing of the KIF structure will be allowed through a context menu.

EndUserLabelDictionary

Given the absolute separation of models from views, Adapters and Engines need only specify their symbolic interface. They do not maintain terminology resources. This is the responsibility of the EndUserLabelDictionary, one of the underlying presentation services of the View Framework. Its use is not required by any view but is helpful to any views that present events, conditions, and actions in lists or other highly structured forms. It manages associations between the adapter/engine symbols and end-user strings (defined by National Language Support or other customization to particular end-user needs).
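
The symbol-to-label association can be sketched as a simple lookup keyed by symbol and locale. The class, method names, and fallback behavior below are illustrative assumptions, not the toolkit's defined service interface.

```python
# Sketch of an EndUserLabelDictionary: adapter/engine symbols mapped to
# end-user strings per locale. All names and labels are invented.

class EndUserLabelDictionary:
    def __init__(self):
        self.labels = {}  # (symbol, locale) -> end-user string

    def register(self, symbol, locale, label):
        self.labels[(symbol, locale)] = label

    def label_for(self, symbol, locale):
        # Fall back to the raw symbol when no label is registered,
        # so unlabeled symbols are still visible rather than lost.
        return self.labels.get((symbol, locale), symbol)
```

Keeping the locale in the key is what lets National Language Support swap the presentation strings without touching the adapters' symbolic interfaces.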

CommonTermsDictionary

Whenever an end-user specifies a particular value such as a boss' name, an important customer project name, even numeric data such as rate limits, these values are assumed to be important and potentially reusable. For instance, a phone number specified in one rule might also be used in another rule. Any editor can use this dictionary for providing these already-used terms as suggestions.

This service leads to CommonTermManagement. For instance, heavily reused literals should be made into variables -- through conversation with the user. Once a variable is defined, such as MYPHONENUMBER is XXX-XXXX, the variable can be edited rather than changing the literal values through all the dialogs. This service will be especially important to learning engines, which can automatically track the relevancy of terms as users go about their business.
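
Term reuse and promotion to variables can be sketched as follows. This Python fragment is illustrative only: the class, the `$NAME` substitution convention, and the use counting are assumptions about how such a service might behave, not the toolkit's design.

```python
# Sketch of CommonTermsDictionary / CommonTermManagement: track reused
# literals, suggest them, and promote heavy reuse to variables.
# All names and conventions are invented for illustration.

class CommonTermsDictionary:
    def __init__(self):
        self.uses = {}       # literal value -> use count
        self.variables = {}  # variable name -> literal value

    def record_use(self, value):
        self.uses[value] = self.uses.get(value, 0) + 1

    def suggestions(self):
        # Offer already-used terms, most reused first.
        return sorted(self.uses, key=self.uses.get, reverse=True)

    def promote(self, name, value):
        # A heavily reused literal becomes a variable (after asking the user).
        self.variables[name] = value

    def expand(self, rule_text):
        # Substitute variable values into a rule at load time, so editing
        # the variable updates every rule that mentions it.
        for name, value in self.variables.items():
            rule_text = rule_text.replace("$" + name, value)
        return rule_text
```

Once a literal is promoted, only the variable binding is edited; every rule that references it picks up the new value, which is the maintenance benefit the text describes.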

