Current agent-enabled applications are rigid systems with regard to messaging and information access. The IBM Agent Building Environment Developer's Toolkit allows for flexibility: it draws on the appropriate skills (intelligence) and knowledge (logic, rules) to interface with the user or other applications.
Skill describes what the agent can do, or what the user can expect from the agent. An agent's skills can be rigid or flexible. Consider the request: get me first-class tickets on the 1700 Air France to Paris for 15 June. In a rigid-skills environment, an agent will perform the task of reserving the tickets. With more proficient skills, or intelligence, the agent would need only the request get reservations for the Thompson Meeting. The flexible agent, with access to the user's calendar and personal preferences, could arrange travel, lodging, and dinner accommodations.
The user could also select the best KNOWLEDGE representation, which is often described as 'rules' ("if Tom sends a note, page me"). A rule set could be more complex and could have inferred conditions ("if Tom sends a note, first check my calendar, determine where I am -- or was last -- and call me"). In general, knowledge should be thought of as information about the user's preferences or needs; this includes information about the groups to which the user belongs.
Interface includes both the user and the application interfaces.
The user interface could be as simple as a form or template of the type used in many e-mail systems today. For example: IF mail is from Joe, THEN put it on the top of the list. These are clear, deterministic commands. Alternatively, the interface could be something which looks over the user's shoulder, observes how the user handles messages, and builds the agent's own rules to emulate what the user has done. This more sophisticated interface has been researched extensively, but is still scarce in commercial applications.
The application interface or adapter is an entirely new concept: instead of closed, self-contained agent-enabled applications (as is now the case), the Agent Design Model calls for agents to adapt to any application. This allows a user to retain a current application which is otherwise satisfactory, but to add agent capability to the extent desired. From early experience, it is clear that some applications are easier to enable with agents than others. However, once an 'adapter' (software which implements the interface between the application and skills utilized) is created for an application, the designer is able to add any amount of skills and knowledge as described in preceding paragraphs.
Design Model: Knowledge
Design Model: Skills
Design Model: Interfaces
The toolkit components provide binary interfaces for dynamic binding. The five frameworks allow each component developer to concentrate on a layer of responsibility, thus leveraging the work of other developers in the other frameworks. There are three primary specializations:
New adapters, engines, or knowledge editors can be developed and configured into an agent. The products of these specialists are configured together by an application developer.
These components need to be brought together. For this, intelligent agent technology relies on the paradigm of engines as interpreters. For example, a rule-based system is an interpreter of rules, which dynamically define the interaction of events, conditions, and actions among applications, the end-user, and even other engines and agents. Examples of these interactions are:
APIs between frameworks are dynamically-bound and non-proprietary. Any company or individual can enhance the IBM Agent Building Environment Developer's Toolkit for themselves as desired.
IBM Agent Building Environment Developer's Toolkit's component philosophy helps focus and simplify the responsibilities of all IA developers, but the IBM Agent Building Environment Developer's Toolkit gives special consideration to two of the most critical development perspectives:
Each adapter is responsible for working with a particular application domain, device type, or protocol. For instance, one adapter might be responsible for an e-mail connection, another for mobile paging, and yet another for a connection to newsgroups. This version of the toolkit focuses on the development of adapters as the most important framework to establish first.
Adapters have three basic responsibilities:
This Adapter Interface is a 'wrapper' that links the application/object world with the knowledge representation and the other essential aspects of the agent. Advanced object-oriented systems often provide event-trigger mechanisms, and OO systems are generally more agent-friendly; nevertheless, few applications are object-based. The Adapter framework wraps all such differences. Each adapter is responsible for mapping the unique aspects of its application domain to the IBM Agent Building Environment Developer's Toolkit and the standard knowledge representation on which the rest of the frameworks can operate.
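To make the wrapper idea concrete, the following is a minimal Java sketch of an adapter that exposes its domain as trigger events, sensor queries, and effector actions. The type and method names (Adapter, TriggerEvent, EngineCallback, sense, effect) are hypothetical illustrations, not the toolkit's actual programming interface.

```java
// Hypothetical sketch only: these interfaces are NOT the toolkit's real API.
// An adapter maps its application domain onto trigger events, sensor
// queries, and effector actions that an engine can work with uniformly.
import java.util.List;
import java.util.Map;

interface TriggerEvent {                      // something that happened in the domain
    String name();                            // e.g. "newMail", "fileChanged"
    Map<String, String> facts();              // facts delivered with the event
}

interface Adapter {
    void start(EngineCallback engine);        // begin watching the domain
    String sense(String sensorName,           // answer a condition query
                 List<String> arguments);
    void effect(String effectorName,          // perform an action in the domain
                List<String> arguments);
}

interface EngineCallback {
    void trigger(TriggerEvent event);         // adapter reports an event to the engine
}
```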
The following is a short description of adapters that are currently provided by the IBM Agent Building Environment Developer's Toolkit:
The Time adapter is asked to generate periodic or scheduled time-based events (alarms). It also utilizes the day and date query sensor functions available to the system. The Time adapter uses effector calls as the way to set alarms: one event may cause a rule to set an alarm, which will eventually trigger the engine, causing other rules to fire.
The File adapter allows an agent to observe and manipulate files to which it has access. Working with the Time adapter, the File adapter provides trigger events that detect file changes such as creation, deletion, and changes in size or date, as well as errors in File adapter effector execution. File adapter effectors provide for checking and manipulating (for example, copy, move, delete, execute, append to) individual files or specified files in a directory. The checking effectors are typically triggered by time events (for example, a rule that specifies that a file should be checked every 30 minutes).
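As an illustration of the kind of checking just described, the following stand-alone Java sketch polls one file on a timer and reports when its size or date changes. It is not the toolkit's File adapter; the path and the 30-minute period are arbitrary examples.

```java
// Illustrative stand-alone sketch (not the toolkit's File adapter): a timed
// check of one file, reporting a change when its size or date moves.
import java.io.File;
import java.util.Timer;
import java.util.TimerTask;

public class FileWatchSketch {
    public static void main(String[] args) {
        File target = new File("inbox/report.txt");   // hypothetical path
        long[] last = { target.lastModified(), target.length() };

        TimerTask check = new TimerTask() {
            @Override public void run() {
                long modified = target.lastModified();
                long size = target.length();
                if (modified != last[0] || size != last[1]) {
                    last[0] = modified;
                    last[1] = size;
                    // In the toolkit this would surface as a trigger event
                    // that rules can fire against; here we just print it.
                    System.out.println("file changed: " + target);
                }
            }
        };
        // Comparable to a rule that asks the Time adapter for a periodic alarm:
        new Timer().schedule(check, 0, 30L * 60 * 1000);   // every 30 minutes
    }
}
```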
The HTTP adapter casts the HTTP domain in terms of events, conditions, and actions. For instance, any change to a Uniform Resource Locator (URL) can be a significant event. The HTTP adapter monitors changes to a URL through its time stamp, polling particular URLs for changes since the last check; this polling request is defined as an effector request. By using the Time adapter's triggers (periodic or scheduled), one rule can determine the polling policy of any particular URL. When the URL changes, the HTTP adapter provides a new IATriggerEvent, which fires any rules that have been written against it.
The HTTP adapter can also simply fetch a URL, whether it has changed or not. These actions are typically triggered by time events, but any rule triggered by any event can call them as a consequence.
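The change-detection idea can be sketched independently of the toolkit. The following Java fragment issues a HEAD request for a URL and compares the Last-Modified time stamp with the value from the previous poll; in the toolkit, a change would be surfaced as an IATriggerEvent rather than a return value.

```java
// Illustrative sketch only (not the toolkit's HTTP adapter): poll a URL's
// Last-Modified header and report when it differs from the previous check.
import java.net.HttpURLConnection;
import java.net.URL;

public class UrlPollSketch {
    private long lastSeen = 0;

    /** Returns true when the URL appears to have changed since the last poll. */
    public boolean pollOnce(String address) throws Exception {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(address).openConnection();
        conn.setRequestMethod("HEAD");            // we only need the headers
        long modified = conn.getLastModified();   // 0 if the server omits it
        conn.disconnect();

        boolean changed = modified != 0 && modified != lastSeen;
        lastSeen = modified;
        return changed;   // an adapter would raise a trigger event here
    }

    public static void main(String[] args) throws Exception {
        UrlPollSketch poller = new UrlPollSketch();
        System.out.println(poller.pollOnce("http://example.com/"));
    }
}
```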
The NNTP adapter deals with USENET newsgroups on the Internet. It has a set of effector functions, triggered by an agent-startup event, that perform initialization. These functions supply configuration information: the name of the NNTP news server from which to obtain news, the names of pertinent newsgroups, and so forth.
The NNTP adapter also has an effector to fetch all new articles that have arrived at the server since the previous fetch. This effector is typically triggered by time events (for example, a rule that specifies that new articles should be fetched every 6 hours), although any rule driven by any event could call it.
The adapter generates a new IATriggerEvent for each article fetched, thus allowing the engine to inference on each article. To support this article-driven inferencing, the NNTP adapter provides a set of article sensors. For example, there are sensors to search article bodies for key words or phrases and to sense whether an article comes from a particular person or e-mail address. So rules driven by the 'new article' IATriggerEvent, and using these sensors, can effectively select an article and pass it on for action by other adapters. For example, an e-mail adapter might provide effectors that could be used to e-mail the selected article to interested people, or a file adapter might provide an effector to append the article to a given file along with other articles on the same topic.
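The article-driven inferencing just described can be illustrated with a small, self-contained Java sketch: each fetched article plays the role of a trigger event, a keyword test plays the role of a sensor, and a print statement stands in for an e-mail effector. The class and method names are hypothetical.

```java
// Illustrative sketch (not the toolkit's NNTP adapter): for each fetched
// article, a keyword "sensor" selects it and a hypothetical e-mail
// "effector" forwards it, mirroring the new-article rule described above.
import java.util.List;

public class ArticleRuleSketch {
    record Article(String from, String subject, String body) {}

    static boolean bodyContains(Article a, String phrase) {   // sensor-like test
        return a.body().toLowerCase().contains(phrase.toLowerCase());
    }

    static void forwardByMail(Article a, String recipient) {  // effector-like action
        System.out.println("would mail '" + a.subject() + "' to " + recipient);
    }

    public static void main(String[] args) {
        List<Article> fetched = List.of(
                new Article("alice@example.com", "agents", "intelligent agents ..."),
                new Article("bob@example.com", "gardening", "tomatoes ..."));

        for (Article article : fetched) {              // one trigger event per article
            if (bodyContains(article, "intelligent agents")) {
                forwardByMail(article, "team@example.com");
            }
        }
    }
}
```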
The Utility adapter provides basic arithmetic and comparison operations on numbers, and conversion operations between integer terms and string terms representing integers. String append and print operations are provided for string terms. For both comparison and arithmetic operations, string operands must represent valid floating-point numbers; they are converted to floating-point numbers and the appropriate operation is performed.
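The conversion rule can be pictured with a tiny Java sketch (purely illustrative, not the toolkit's Utility adapter): string operands are parsed as floating-point numbers before the comparison is performed.

```java
// Tiny illustration of the conversion rule described above: string operands
// are parsed as floating-point numbers before comparing them.
public class UtilityCompareSketch {
    static boolean greaterThan(String left, String right) {
        return Double.parseDouble(left) > Double.parseDouble(right);
    }

    public static void main(String[] args) {
        System.out.println(greaterThan("3.5", "2"));   // true
    }
}
```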
The primary intent of this version of the IBM Agent Building Environment Developer's Toolkit is to support adapter development, which is crucial to enabling IA technologies in and across applications. You will find design points and helpful tips throughout the documentation, so the following simply summarizes the points made so far:
Finally, adapter developers automatically benefit from the addition of new engine types. Although the 'when-if-then' representation of rules most closely reflects the event-condition-action pattern of the agent paradigm, that pattern is not limited to rule-based systems; it reflects how the agent interacts with the world through passive sensing, active sensing, and effective action. The adapter developer is isolated from engine differences: any application can include any number of adapters with any number of engines. The same adapter built for the rule-based system in the Alpha version of the IBM Agent Building Environment Developer's Toolkit will benefit from the inclusion of additional engines in subsequent versions.
Engines are the 'brain centers' within an agent. An agent can contain more than one engine type, similar to the way the real brain has various centers for sensory processing and long-term memory. The trend in IA is toward hybrid systems, and the Engine Framework defines the various types of engines and how they can work together.
Engine responsibilities depend on type. Some engines are 'master' engines that directly observe events and access adapters' sensors and effectors. Others provide support services such as linguistic analysis or case-based memory. The specific responsibilities of each engine type are best described by the following examples:
The Alpha version of the IBM Agent Building Environment Developer's Toolkit provides a forward-chaining inference engine, the most widely used and required type of IA engine. It observes events and calls adapter sensors and effectors.
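For readers unfamiliar with forward chaining, the following minimal Java sketch shows the general mechanism: rules whose conditions are all satisfied by the known facts assert their consequence, and the loop repeats until no new facts can be derived. It illustrates the principle only and is not the toolkit's engine.

```java
// Minimal forward-chaining sketch (not the toolkit's engine): rules whose
// conditions are all satisfied by known facts add their consequence as a new
// fact, and the engine repeats until nothing more can be derived.
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ForwardChainSketch {
    record Rule(List<String> conditions, String consequence) {}

    static Set<String> run(Set<String> facts, List<Rule> rules) {
        Set<String> known = new HashSet<>(facts);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (Rule rule : rules) {
                if (known.containsAll(rule.conditions())
                        && known.add(rule.consequence())) {
                    changed = true;       // a rule fired; re-scan for new matches
                }
            }
        }
        return known;
    }

    public static void main(String[] args) {
        List<Rule> rules = List.of(
                new Rule(List.of("note-from-Tom"), "check-calendar"),
                new Rule(List.of("check-calendar", "user-out"), "page-user"));
        System.out.println(run(Set.of("note-from-Tom", "user-out"), rules));
    }
}
```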
Given some data to analyze, this engine type derives new facts from it. For instance, given a long text string such as the body of a note or an append to a newsgroup, an analyzer can deduce the subject of the note, the number of questions asked, or the reading level. The Analyzer interface is similar to the sensor interface methods for adapters: given an input, an analyzer can be sent a synchronous message to return the facts of its analysis.
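A toy analyzer along these lines might look like the following Java sketch, which synchronously derives a couple of simple facts (a question count and the first line as a crude subject) from a block of text. The interface is hypothetical.

```java
// Illustrative analyzer sketch (hypothetical, not a toolkit interface): given
// a block of text, synchronously return simple derived facts about it.
import java.util.Map;

public class TextAnalyzerSketch {
    static Map<String, String> analyze(String text) {
        long questions = text.chars().filter(c -> c == '?').count();
        String firstLine = text.lines().findFirst().orElse("");
        return Map.of(
                "questionCount", Long.toString(questions),
                "firstLine", firstLine);          // a crude stand-in for "subject"
    }

    public static void main(String[] args) {
        System.out.println(analyze("Budget review\nCan we meet Friday? Before noon?"));
    }
}
```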
Although rule-based inferencing is the most common requirement, learning engines are being used more and more in IA, especially for adaptive user modeling. Learning is a complicated concept that can range from simple counters to path memories, associative memories, predictive models, time series, and beyond. Regardless of the complexity of the technology used, such memory and learning functions will often use the monitor interface, which is similar to the adapter interface: monitors must have effector-like calls in order to load memories, change parameters, and so on, and sensor-like calls so that other engines can determine their decisions. These decisions can be predictions or pattern matches based on past experience.
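The effector-like and sensor-like character of a monitor can be shown with about the simplest possible learning memory -- a counter. The following Java sketch is purely illustrative; real monitors would use far richer models.

```java
// Illustrative monitor sketch (hypothetical): effector-like calls record
// observed outcomes, and sensor-like calls return a prediction based on the
// counts seen so far -- about the simplest possible "learning" memory.
import java.util.HashMap;
import java.util.Map;

public class MonitorSketch {
    private final Map<String, Integer> counts = new HashMap<>();

    public void record(String outcome) {                    // effector-like: load memory
        counts.merge(outcome, 1, Integer::sum);
    }

    public String predict() {                                // sensor-like: query memory
        return counts.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElse("no-experience-yet");
    }

    public static void main(String[] args) {
        MonitorSketch monitor = new MonitorSketch();
        monitor.record("deleted");
        monitor.record("deleted");
        monitor.record("read");
        System.out.println(monitor.predict());               // "deleted"
    }
}
```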
While the primary focus of the Alpha version of the IBM Agent Building Environment Developer's Toolkit is on adapter developers, there is also one design point that allows engine developers to begin adding new engine types, even though the full design is not disclosed:
For the analyzer and monitor engine types just discussed, the interfaces are very similar to that of an adapter. In the same way that applications represent themselves with the event-condition-action paradigm, engines, too, can provide internal events, conditions, and actions, which allow the inferencer, for example, to control these other engines according to rule-based instructions. This is crucial for the control of learning technology. Although other methods of integration are allowable under this architecture, engine control through the knowledge representation itself is the preferred design. In other words, engines, like adapters, are plug-and-play components: they are scriptable and otherwise dynamically configurable.
Given this similarity between these engine types and adapters, engine developers can begin to implement their own engine design interfaces using the adapter interface in this version.
The Knowledge framework is not truly an architectural layer with well-defined APIs and SPIs. No such interface is possible between Knowledge and Engines because of the variety of knowledge representations, and of their control, allowed within the architecture. It is more important that each engine be closely associated with the particular knowledge type (K-type) best suited to its function. The most important reason for keeping this layer separate is to encourage the designers of new K-types to follow the design points listed below. Otherwise, this framework is simply a collection of the knowledge structures used by different engine technologies.
The knowledge framework has no particular bias toward one intelligent agent technology or another, but here are some examples of knowledge structures:
The first is the set of long-term and short-term facts held in the agent's memory (working set). Long-term facts, such as the user's pager number and manager's e-mail address, require persistent storage in the Library. Short-term facts, such as the facts delivered in the IATriggerEvent or through sensor calls, are also included.
The second is the rule structure, defining antecedents, consequences, and the connectives between them.
A third is associative memory. There are several techniques for associative memories, including inverted and vector indexing schemes. For instance, a Monitor-type engine uses an inverted index to store and recall web pages based on keywords and category labels. The list of K-types can be very large, however, including decision trees, neural networks, fuzzy rules, and genetic encodings, to name a few of the more prominent techniques being applied within IA.
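As an illustration of one such structure, the following Java sketch implements a small inverted index that stores page URLs under each keyword and recalls them by keyword lookup. It shows the data structure only, not any toolkit interface.

```java
// Illustrative inverted-index sketch (hypothetical): pages are stored under
// each keyword they contain, so recall by keyword is a simple map lookup.
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class InvertedIndexSketch {
    private final Map<String, Set<String>> index = new HashMap<>();

    public void store(String pageUrl, Set<String> keywords) {
        for (String keyword : keywords) {
            index.computeIfAbsent(keyword, k -> new HashSet<>()).add(pageUrl);
        }
    }

    public Set<String> recall(String keyword) {
        return index.getOrDefault(keyword, Set.of());
    }

    public static void main(String[] args) {
        InvertedIndexSketch memory = new InvertedIndexSketch();
        memory.store("http://example.com/agents", Set.of("agents", "kif"));
        memory.store("http://example.com/news", Set.of("agents", "nntp"));
        System.out.println(memory.recall("agents"));
    }
}
```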
The Knowledge framework also includes a set of services:
As a general design point, Library persistence (on disk) and knowledge as working set (in memory) are kept somewhat separate. The Library framework requires the Knowledge Interchange Format (KIF) as a standard for representing LongTermFacts and Rules, while the Knowledge framework allows any preferred run-time representation to suit an engine. On the other hand, engines that do use KIF-based object representations in their working memory can use caching services for switching and restoring several inferencing contexts. This is typical in multi-user agent services, which need to swap different users' rule sets in and out of working memory depending on the event 'owner' (for example, whose mail is this?).
While the Knowledge framework allows the flexibility of particular knowledge types, it also encourages the ability to convert from one K-type to another. For instance, many agents will be built as a hybrid of rule-based and learning-based engines, so that an agent can be instructed by explicit rule, by explicit example, or by implicit watching. Such hybrids should be able to 'promote' a discovered piece of knowledge in one K-type to an explicit rule K-type. For instance, an agent might ask: "I've noticed that you don't look at e-mail from MACHINE@SERVICE that says 'Archive Notice'. Would you like this e-mail automatically deleted?" In a hybrid system where the user also gives rule-based instruction, a positive response to this question might be expected to generate an explicit instruction.
Because the Knowledge layer is not an architected framework, it is kept separate in order to emphasize several important design points:
Knowledge structures should not assume any control strategy or decision function; that is the responsibility of the Engine Framework. For example, a rule structure implemented as an object should not 'evaluate' or 'decide' itself. If it does, it binds itself to predetermined functions and must be recoded to allow new ones. Inference chaining is the best case in point: a Rule object that encapsulates its structure and implements only a 'decide' method as forward chaining will preclude the dynamic addition of backward or mixed chaining strategies. By keeping the structural object -- the rule -- separate from any single control strategy, new engines can be developed that utilize the Knowledge Framework's objects already provided. An inverted index is another example: it is a generally useful structure that should not assume only one use by one engine. A minimal sketch of this separation appears after these design points.
All knowledge representations should reference the semantic symbols provided by the adapters. They should not store implementation-specific procedures within themselves. This ensures the portability and sharability of knowledge across the vagaries of implementation. For instance, the effector label for FORWARDing an e-mail note will be common across e-mail implementations; the adapter's responsibility is to bind such a label to a particular implementation.
This is a weak point because object storage technology tends to blur the distinction between working set and persistence, but the point is that KIF is required to allow plug-and-play of different engines and editors through the strong library interface. So long as an engine or editor can read and write KIF to the library, it does NOT need to use KIF in its own working set. Any engine can convert the KIF representation into its preferred representation for whatever reasons. Depending on the engine, the distinction between working set knowledge and the library of ALL persistent knowledge might otherwise be useful or required, in which case the engine has such an option.
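The separation of structure from control mentioned in the first design point above can be sketched as follows: the rule is a passive data object, and each chaining strategy is a separate, pluggable component. The names (PlainRule, ControlStrategy) are hypothetical.

```java
// Sketch of the design point above (names hypothetical): the rule object is
// pure structure, and each engine supplies its own control strategy, so new
// strategies can be added without recoding the rule itself.
import java.util.List;
import java.util.Set;

record PlainRule(List<String> antecedents, String consequent) {}   // structure only

interface ControlStrategy {                                         // control lives here
    boolean applies(PlainRule rule, Set<String> facts);
}

class ForwardStrategy implements ControlStrategy {
    public boolean applies(PlainRule rule, Set<String> facts) {
        return facts.containsAll(rule.antecedents());   // data-driven: fire from facts
    }
}

class BackwardStrategy implements ControlStrategy {
    public boolean applies(PlainRule rule, Set<String> goals) {
        return goals.contains(rule.consequent());       // goal-driven: work from goals
    }
}
```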
Otherwise, the knowledge framework and the library framework are parallel and usually can be considered together.
In contrast to a working set in local memory, a library must make stronger assumptions. Access control notwithstanding, a library's contents should in principle be sharable. In terms of IA, several agents might share the same rules, or a copy of a rule might be passed to another user on another system using an agent based on another product. This is why KIF is so critical: the library is the reference point at which any agent (with proper authority) should be able to check out and 'read' content in some standard language.
Libraries hold a variety of material but require a common way to catalog and cross-reference all of it. For the IBM Agent Building Environment Developer's Toolkit, the library organizes all the agent's knowledge in a component type called a 'Store'. The library is persistent and ensures that the agent's internal knowledge representations are transcribed into standard formats whenever possible, so that engines and editors can plug-and-play using the library interfaces.
Each store's interface defines the requirements for a particular K-type. For instance, the KIF-based RuleStore defines what it means to be a KIF rule and enforces this both in its interface semantics and its implementations such as with syntax checking.
There are other special pieces of information, not explicitly discussed in the Knowledge framework, that an agent needs to do its job on behalf of the user. For an agent to act on behalf of a user, the agent must have the same authority as the user. Depending on the underlying security service, the agent might need to know the user's password, for example to check the user's private e-mail on his or her behalf. The library is obligated to encrypt this information properly. In fact, the SecurityStore is a type of store that can gatekeep any information in the library. As a side issue, it is important that agents begin to rely on authorization certificates rather than passwords and other methods that require authentication. As security services move to smart-card technology, the agent cannot be issued such a card; besides, the authorization given by a user to an agent is all that security services should require to give the agent access.
Each new K-type and its store interface will define the particular structure and requirements of the K-type, but the Library Framework provides a common schema so that they are all relatively consistent and therefore easy to use (typically by application developers and administrators). As new K-types are included in the IBM Agent Building Environment Developer's Toolkit, this common schema helps to organize and integrate these structures into a consistent whole. These are the common properties of any store:
Every store is associated with one and only one K-type. This modularity allows a particular library to be scaled and customized to only the K-types required by the application.
Every K-type can be organized as sets of sets; the use of a set is, however, arbitrary. For example, most rule-based end-user IA systems use the concept of a set to let the end user group rules (weekend rules, customer-care rules, etc.). Sets can also be used programmatically, such as to organize K-types by application ownership or context (memory of the user's web-browsing behavior versus memory of the user's behavior on a pay-for-service).
Many K-types need to store metadata about each item or about each set. For instance, the name of the rule set is kept by the RuleStore. Use of metadata is optional and its content is arbitrary to each K-type and/or K-type editor. For instance, a rule-based editor can use the rule metadata to store view-specific information that would be otherwise lost in the 'compilation' to KIF.
Each K-type store should provide such services, not only for its own integrity, but as a service to any other object that might find them useful.
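A minimal Java sketch of this common schema might look like the following: a store bound to one K-type (rules), items grouped into named sets, and optional metadata per set. The class is hypothetical, and the rule text is an illustrative KIF-style expression rather than the toolkit's exact encoding.

```java
// Hypothetical sketch of the common store schema (not the toolkit's real
// interface): one store per K-type, items grouped into named sets, with
// optional metadata per set.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RuleStoreSketch {
    private final Map<String, List<String>> sets = new HashMap<>();      // set name -> KIF rules
    private final Map<String, Map<String, String>> metadata = new HashMap<>();

    public void add(String setName, String kifRule) {
        sets.computeIfAbsent(setName, k -> new ArrayList<>()).add(kifRule);
    }

    public void tag(String setName, String key, String value) {          // optional metadata
        metadata.computeIfAbsent(setName, k -> new HashMap<>()).put(key, value);
    }

    public List<String> read(String setName) {
        return sets.getOrDefault(setName, List.of());
    }

    public static void main(String[] args) {
        RuleStoreSketch store = new RuleStoreSketch();
        store.add("weekend-rules", "(=> (from ?note Tom) (page user))");  // illustrative only
        store.tag("weekend-rules", "owner", "thompson");
        System.out.println(store.read("weekend-rules"));
    }
}
```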
The library is a true framework, providing an abstract interface that does not expose the particulars of implementation or service provider. The Alpha version of the IBM Agent Building Environment Developer's Toolkit provides a file-based implementation, but the same interface can be implemented for database persistence. Especially in the common situation where data is distributed and variously modeled (such as personal profile information), a store's implementation might also require communications and substantial integration. But here again, the simple first-order structure of KIF at the interface can help hide some of the data-modeling complexities and problems.
Aside from adapters and engines as components, stores are likewise plug-and-play through binary interfaces. An application developer can select the Store component based on its K-type and implementation such as FileRuleStore or DBRuleStore. All the other components (engines, adapters and views) are independent of the implementation choice.
A special object called a collector is also part of the Library framework. It can be used for any arrangement of stores into groups but will most often be used for collecting the stores associated with a particular user. This collection of all the agent's knowledge for and about a user is often called a 'persona'. This is only one possible use of a collector, however.
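A collector can be pictured as little more than a map from K-type to store, as in the following hypothetical Java sketch; in a real design the stores would share a common interface rather than being plain objects.

```java
// Hypothetical collector sketch: a persona simply groups the stores that
// hold everything the agent knows for and about one user.
import java.util.HashMap;
import java.util.Map;

public class PersonaSketch {
    private final Map<String, Object> storesByKType = new HashMap<>();

    public void attach(String kType, Object store) {       // e.g. "rules", "facts"
        storesByKType.put(kType, store);
    }

    public Object storeFor(String kType) {
        return storesByKType.get(kType);
    }
}
```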
Finally, the Library is not a general persistence mechanism for any type of data. Each type of store must be explicitly defined and will often provide type and syntax checking. However, the Library's general schema is extendable to any new type of knowledge structure, such as a FuzzyRuleStore. As with any such store, this store should be designed similarly to the others, so that administrators, application integrators, and engine and editor developers can rely on the common schema.
The view framework defines the standards of interaction between the user and the agent (including the agent's Library). It is the critical framework for successful deployment of agents.
What has been learned from rule-editing by end-users can lead the way to new and successful ideas. Even if one argues that adaptive user modeling (learning by watching the user) is the answer to eliminate rule authoring, there are many new issues to be considered, such as the underlying structure to allow a user to be watched, how the agent and user communicate state and intentions, and how the agents can be controlled and trusted.
Even with the emergence of adaptive user modeling, rule editing is still the primary requirement. Rule-based systems remain primary, so rule authoring is still required, even if it depends on the application developer or administrator for successful deployment. But form-based and even graphical node-and-arc rule editors have failed with end-users.
Getting an agent to behave is only the first step. The agent must also conduct a dialog with the user at run time, such as asking the user a question (a sensor request) before proceeding. This problem is itself complex, involving multiple and occasionally connected client machines. IA promises to actually help this difficult general problem, but this requires much more elaboration.
Full dialog between agent and user must also include user initiation, for instance to give the agent a direct command. Such agents are sometimes called 'conversational', but also tend to be event- and rule-driven, so the IBM Agent Building Environment Developer's Toolkit will also help in these directions. The pinnacle of IA, however, is learning: watching and learning from the user's behavior. Encompassing this will not require another intelligent agent toolkit or architecture; the Engine Framework includes memory and learning technology and the Adapter Framework is universal to any event-condition-action domain.
These points are reviewed more specifically for the development of the IBM Agent Building Environment Developer's Toolkit's rule editor in "View Framework: Editors", but these criteria apply to the View Framework in general:
The other frameworks make no assumptions about the user, the presentation media, or any other technique of agent-user interaction. This is very practical because it makes adapter development easier: no user-interface resources are required of the adapter developer, and, more strongly, the adapter developer should not assume any. These issues are the specialty of user-interface experts, and new techniques of interaction will make simple devices, such as end-user terminology strings, obsolete. (User-interface adapters are an exception to this rule, but, as a separate adapter module, any particular implementation can be replaced by another component that supports a different technique.)
Presentation techniques should strive to work independently of specific media. For instance, the metaphors of instruction and dialog are common to both face-to-face and telephone-based interactions. Something -- but not all -- is lost in switching from one medium to another; most importantly, the underlying style and cognitive model are maintained.
To provide a View Framework rather than a specific application solution, techniques should be applicable to a single niche application as well as to a general personal-secretary-style agent that is responsible for many applications. Applications should have control of this scope and context-sensitivity. The combination of media-independence and scalability is also critical to many future small devices.
While adapters, engines, and stores provide themselves as model parts, the view framework provides a set of view parts. Given the breadth of applications and types of end user, the application developer, administrator, and end-user should have a choice of view components. Some people like anthropomorphized agents; some don't. The decision rests with the application and end-user and should not be bound with the rest of the view parts and the rest of the frameworks.
Techniques for viewing agents and their knowledge should readily evolve with the user-interface and underlying engine technologies. For instance, techniques for agent instruction must anticipate learning technologies and help to hide rather than expose the complexities of hybrid and distributed agent systems.
IBM Agent Building Environment Developer's Toolkit was born out of the need to provide frameworks for all IBM and partner agent developers to work together. The primary activities and requirements are for rule-based, stand-alone IA systems. However, requirements are quickly emerging to include learning engines and agent-to-agent communication. The IBM Agent Building Environment Developer's Toolkit and its frameworks are designed for such growth.
The inclusion of learning has been discussed, but inter-agent communication has not. Not only is the use of KIF by the IBM Agent Building Environment Developer's Toolkit valuable in its own right; this growing standard for knowledge representation is also the basis for an emerging standard for agent interaction, the Knowledge Query and Manipulation Language (KQML). KQML essentially wraps KIF as its internal "packet" structure, making it relatively easy to migrate the Adapter and Library Frameworks toward the larger agent systems being developed as solutions to distributed computing.
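The wrapping relationship can be illustrated with a hypothetical example: a KQML performative whose :content parameter carries a KIF expression. The field names follow common KQML conventions, but the exact messages exchanged by any particular agent system will differ.

```java
// Illustrative sketch of the KQML-wraps-KIF idea: a KQML performative whose
// :content parameter carries a KIF expression. The message shape is only an
// example; real agent systems will vary in performatives and parameters.
public class KqmlSketch {
    static String tell(String sender, String receiver, String kifContent) {
        return "(tell"
                + " :sender " + sender
                + " :receiver " + receiver
                + " :language KIF"
                + " :content \"" + kifContent + "\")";
    }

    public static void main(String[] args) {
        System.out.println(tell("calendar-agent", "mail-agent",
                "(=> (from ?note Tom) (page user))"));
    }
}
```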
In the meantime, the responsibility-based frameworks make it easy for one developer to leverage the expertise of another, especially in this quickly emerging field of IA. No one can be an expert in all the IA technologies:
None of this is of any consequence without the Adapters, which hook all the other frameworks to real applications. They provide the way for agents to do real work in the real world. The focus of the IBM Agent Building Environment Developer's Toolkit and the Adapter Framework is to enable such applications easily and quickly as new technologies become available.