Note: The objects and methods described below, as well as many others, are illustrated in Chapter 3. "Agent Server Sample" and its associated code in Appendix A. "Reminder Agent Sample Formatted Code".
All inference rule persistent storage services of the IBM Agent Building Environment Developer's Toolkit Library are obtained through a small set of Library object classes:
Other persistent storage services of the IBM Agent Building Environment Developer's Toolkit Library are obtained through two additional Library object classes:
The methods for the above classes are also abstracted to base and abstract base classes. To simplify the user interface, this inheritance is not visible to the library user. This is accomplished by employing wrapper classes for indirection. These wrapper classes also minimize the need to change or recompile programs that use the library should the library implementation change in the future.
To improve readability, the common IALibInference prefix to the class names is omitted in much of the following discussion, leaving the shortened names as follows:
Also, the common IALib prefix to the class names is omitted in much of the following discussion, leaving the shortened names as follows:
Library objects are named through their constructors. OS/2 HPFS is required for using the Library because of its name lengths and implementation conventions.
In addition, your application should not use the following file extensions for files contained hierarchically in directories in and under the directory that is given for the top Collector of the Library (see the description of the Library map in "Creating Collectors and Storing in Them"):
Library services through these object classes are intended to be independent of the execution platform and the repository where the inferencing information is stored. For this reason, common object class libraries, such as Open Class, are avoided at the library user interface (member function) level, i.e. in those member functions that are intended for public use. For the public part of the Library interface, for example, char * is used rather than IString. This allows Library users to convert char * to whatever string class is available in the current platform's environment. In this way the Library interface remains stable across multiple platforms.
In addition to wrapping the classes, the constructors and destructors for all of the objects are packaged as "factory" methods in the IALibrary object. This makes it possible to maintain implementation-independent interfaces, with run-time binding to a dynamically linked implementation (DLL). This gives maximum flexibility, as new implementations can be added and old ones changed with little or no impact on programs that use the interfaces.
The primary focus of inference rule storage is the rules themselves. Rules are grouped into sets called RuleSets. When Rules are loaded from persistent storage for inferencing, they are presented for loading as a single RuleSet.
Note: In future releases of the IBM Agent Building Environment Developer's Toolkit, Library services may support gathering and merging of multiple RuleSets into a single RuleSet that is to be used for inferencing and/or stored by name for future inferencing. For example, inferencing for a particular event may require merging user, department, and some system RuleSets. Such a merge will be possible dynamically or in advance of event arrival. Currently the library includes the naming and methods to operate on named RuleSets, but it does not yet include functions to do such a merge.
The library is not directly involved in deciding which RuleSets are loaded for inferencing and when they are to be loaded. Instead it offers storage and retrieval of RuleSets by name. As discussed later, it also allows for groupings of RuleSets, both as a way of scoping their names for name conflict avoidance and for performance benefits in storage/retrieval.
The Library supports the current (RAISE) engine through a high-level interface for loading rule and fact sets via an IALibrary method called getConductSet(). This allows the engine to specify a conduct set name and the qualified Collector name. getConductSet() finds the implied elements of the conduct set (a rule set and a long term fact set with the same name) and instantiates a RuleSet object and a LTFactSet object (the latter is optional) for the engine. This function instantiates the Collectors needed for this and caches intermediate-level Collector objects when appropriate (terminal node Collector objects are not cached). The engine uses the results to build the engine objects that are required for inferencing. The invocation of getConductSet() is implicit through an IAAgent method called loadConductSet().
The Library also supports a very similar function, called getInferenceSet(), whereby the RuleSet and LTFactSet names do not have to be the same.
The library also allows for maintenance of RuleSet state (active and inactive). This is to assist in rule editing and dynamic switching of RuleSets. For example, the logic that loads RuleSets for inferencing could look only at active RuleSets (in the current scope, e.g. user); then merge the active RuleSets into one RuleSet for inferencing. Again, future library services may include such a selection and merging option. As discussed below, a similar state (active/inactive) is maintained for Rules within RuleSets so that selection and merger can be even more granular. In the future, as an optimization, the library may also offer the storage of an "effective RuleSet" in persistent storage, where the effective RuleSet is the same as would be obtained by merging all of the active Rules in all of the active RuleSets of that scope (e.g. user).
A RuleSet must be assigned a name when it is created, generally through a rule editor. If a RuleSet is not given a name when it is created, the name defaults to "default". This name must be unique in the scope of library naming (discussed later). The RuleSet name is a permanent assignment because it is used to identify the RuleSet to the library for storage and retrieval. The length and special character restrictions on RuleSet names are based on the platform's restrictions on file names. In addition to this name, a RuleSet can also optionally be assigned an identifier, a string of originator-specified size. This can be used by rule editors in any way that makes sense to them, and is stored, but not used, by the library. RuleSets also have a state, active or inactive, expressed as the character 'A' or 'I'. Currently the state is not used, but in the future it can be used for controlling the selection of RuleSets, keeping inactive RuleSets rather than deleting them so that they can be re-activated later. The RuleSet name, id, and state are considered part of the standard metadata for a RuleSet.
A RuleSet contains an ordered, but unsorted, sequence of Rule objects. It also has member functions for accessing those Rule objects, either through a find member function or an implicit cursor. The find function looks for a match on a Rule name and positions the implicit cursor on that Rule object. You can also position the cursor explicitly to the first, last, next, or previous Rule in the sequence. All of these functions return a reference to the Rule object that can be used to apply Rule object member functions to set or extract the contents of the Rule. The cursor is also used to indicate where Rule objects are to be added to the sequence in a RuleSet, as well as which rule is to be deleted from the RuleSet. The RuleSet includes these add and delete member functions.
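The find/cursor behavior described above can be modeled with a short sketch. The Library itself is C++; the Python below is purely illustrative, and the class names, method names, and insert-after-cursor behavior are assumptions for the sketch, not the actual Library interface.

```python
class Rule:
    def __init__(self, name, text, state='A'):
        self.name, self.text, self.state = name, text, state

class RuleSet:
    """Illustrative model of a RuleSet's ordered sequence, implicit cursor,
    find by Rule name, and cursor-positioned add/delete."""
    def __init__(self, name):
        self.name = name
        self._rules = []      # ordered, but unsorted, sequence of Rule objects
        self._cursor = -1     # implicit cursor position

    def add(self, rule):
        # Assumed behavior: a new Rule is inserted after the cursor position.
        self._cursor += 1
        self._rules.insert(self._cursor, rule)
        return rule

    def find(self, name):
        # Position the implicit cursor on the first Rule with a matching name.
        for i, r in enumerate(self._rules):
            if r.name == name:
                self._cursor = i
                return r
        return None

    def first(self):
        self._cursor = 0
        return self._rules[0] if self._rules else None

    def next(self):
        if self._cursor + 1 < len(self._rules):
            self._cursor += 1
            return self._rules[self._cursor]
        return None

    def delete(self):
        # The cursor indicates which Rule is deleted from the sequence.
        if 0 <= self._cursor < len(self._rules):
            return self._rules.pop(self._cursor)
        return None
```

Note how unnamed Rules cannot be located by find(), matching the behavior described for the Library.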
Rules contain the same standard metadata as the RuleSets that contain them: name, id, and state. Unlike RuleSet names, Rule names are optional and have no platform restrictions on size. It is strongly recommended that you name your rules, if for no other reason than to identify them for rule authoring and editing. The generic rule editor of the IBM Agent Building Environment Developer's Toolkit sets rule names if they are not provided by the user. The library is not sensitive to Rule name values, and the name values do not affect storage or retrieval by the library, except that the RuleSet find function is based on the Rule name attribute. When rules are not named, they cannot be found by the RuleSet find member function.
Unlike RuleSet name, which is assigned permanently and used for library storage and retrieval, the Rule name is completely under user control and can be changed without affecting the Rule object insofar as library management is concerned. Also, there is no system-defined relationship between the standard metadata values in the rules and those corresponding values assigned to the associated RuleSet. Rule names can be used in inferencing, e.g. for conflict resolution prioritization, as well as in rule editing.
A Rule can be assigned a state. When a conduct set is loaded, rules that have an inactive state (I) are ignored and not loaded.
The primary attribute in a Rule object is the rule itself, which is stored in the Rule as a single string. The format of this string is defined by linear KIF (Knowledge Interchange Format), a formal language for the interchange of knowledge among disparate computer programs. The current version of KIF used by the IBM Agent Building Environment Developer's Toolkit is a subset suitable to the current engine's inferencing requirements. This subset may grow over time, but the intention is to continue to conform to the linear KIF standard. The current version of the KIF subset is described in Chapter 2. "Writing Rules and Understanding Rule Inferencing". The linear KIF format of rules is human readable, but that is not its purpose; it is intended for interchange, and begs for rule editors for usability. This same KIF form is used in the IBM Agent Building Environment Developer's Toolkit for the externalization of long and short term facts (statements of belief in the form of predicates with their lists of terms).
The library is sensitive to the linear KIF string format when called upon to validate a Rule through the validate member functions of the Rule object.
Note: Currently the validate function only checks for matching parentheses. This will be expanded later to include full syntactical validation. Later this same validation will also be done automatically when a RuleSet is stored by the library.
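The current matching-parentheses check can be sketched in a few lines. This is an illustrative Python model of the behavior the note describes, not the Library's C++ validate member function:

```python
def validate_kif(rule_text):
    """Check the balanced-parentheses property that the current validate
    function enforces (full syntactical validation comes later)."""
    depth = 0
    for ch in rule_text:
        if ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
            if depth < 0:
                return False   # a closing paren with no matching open
    return depth == 0          # every open paren must have been closed
```

A balanced rule string such as `(=> (p ?x) (q ?x))` passes, while a string with a dropped or extra parenthesis fails.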
Rule editors, or other rule authoring tools, supply the rules for inferencing and therefore create and maintain RuleSets. Although these editors may operate using their own version of objects to represent the rules and the elements of rules, the editors must convert their objects to the linear KIF form required by the library for storage. This requirement allows for a standard interchange of persistently stored rules. This ensures that other editors can edit the same rules, if appropriate. It also provides for any participating engine to use the rules for inferencing.
Long Term Fact Sets (LTFactSets) can also be stored in the library. Because the internal library naming of persistent objects includes a type value, RuleSets and LTFactSets can optionally share the same name in the same name scope without conflict or ambiguity. This means that a single name can be used to identify a conduct set, consisting of a RuleSet and a LTFactSet that have the same name (the LTFactSet is optional). Although the library is not sensitive to the grouping called a conduct set, it does nevertheless support this grouping through its allowance for assigning a RuleSet and a LTFactSet the same name. As described above, this allows engines to load rule sets and fact sets with the same name (conduct sets) or with different names (inference sets). For example, one could choose to always load a default LTFactSet, even though different RuleSets were loaded at different times (based on some events, for example).
LTFactSets are currently considered structurally the same as RuleSets, i.e. they have the same basic attributes and naming scheme, as well as the same member functions for operating upon them. Just as RuleSets contain Rule objects, LTFactSets contain LTFacts. LTFacts are used for inferencing in a different way than Rules, and are simpler than Rules. Whereas Rules consist of antecedents and consequences that are inferred from the antecedents, LTFacts are relatively simple expressions of facts or beliefs that can contribute to rule inferencing. These (long term) facts are optional. When a RuleSet is loaded for inferencing, a LTFactSet can also be loaded. LTFacts also use a linear KIF string form, except that they are simpler than Rules and therefore employ less of the KIF syntactical capabilities. Like RuleSets and Rules, LTFactSets and LTFacts are built and maintained through "rule" editors. Except that they are segregated for inferencing and editing, LTFactSets are treated exactly the same as RuleSets for persistent storage purposes by the library.
So far we have talked about the elements of a conduct set or inferencing set, that is a set of objects used for inferencing on behalf of a particular client. Here client is used in the broad sense, to include anyone, anything, or any group on whose behalf inferencing may take place. Typically, client is assigned a unique name. Generally it is a particular user or group of users. A client name allows for scoping the names of the inferencing objects that are associated with that client. This simplifies the selection of a RuleSet and LTFactSet for loading by reducing the scope to a current client. In order to allow for this grouping/scoping for inferencing clients, the library provides an object class called Collector. The Library provides this as a means for defining inferencing scoping/grouping without exposing the particular implementation (platform or repository). The Library does not define the particular use of the collector objects any more than a file system specifies how its directories are to be used.
This method of subsetting the set of all inferencing rules, etc. permits performance improvements and name scoping. It is very similar to the grouping and scoping of files into directories in many file systems and has the very same benefits. In fact, file system directories are used for the current file system implementation of the IBM Agent Building Environment Developer's Toolkit Library. In other repositories, other means of scoping/grouping may be selected for the implementation, while retaining or extending the basic Collector concept of hierarchical aggregation.
As hierarchical grouping representations, collectors are not only containers for inferencing objects such as rulesets, but also containers of other Collector objects (again similar to hierarchical directory structures, where one directory contains (parents) another). For example, there may be a Collector for each of several users of an IBM Agent Building Environment Developer's Toolkit agent; then a parent Collector for all of those user-associated Collectors. The latter can be considered and used as a list of all users on whose behalf inferencing may take place in the particular agent environment. That is, it can be used by the "driver" of the inferencing to find and retrieve the Collector for a particular user for which inferencing is to be done based on the arrival of events. Then, having established a Collector for that user, it can be used to assist in retrieving inferencing information for loading the inferencing context for that user.
Where a Collector represents a user, for example, you could add a new user by creating a Collector object, giving it a name that associates it with the new user, and specifying that it be contained by the Collector (parent) that represents the set of all the users. Then, of course, you would go on to allow the new user or associated user tool to edit and build RuleSets that are stored in the Collector that represents that user. Similar to RuleSets and LTFactSets, Collectors also allow you to use built-in cursors to look through the contents and allow for searches of contents by name. The result of a cursor method or find method for a Collector returns a ContentsElement object reference. This is a very simple object that contains the name, type (Collector, RuleSet, LTFactSet, etc.), etc. of the object that is contained by the Collector. Then this object's attributes can be used to instantiate the object that it represents (object contained by the Collector).
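The Collector contents listing described above can be modeled as follows. This Python sketch is illustrative only; the real Library is C++, and the two-level user scheme, names, and field layout here are assumptions for the example:

```python
from collections import namedtuple

# A ContentsElement carries just enough (name and type) to instantiate
# the object that the Collector contains.
ContentsElement = namedtuple('ContentsElement', ['name', 'type'])

class Collector:
    """Illustrative model of a Collector's contents list and find-by-name."""
    def __init__(self, name):
        self.name = name
        self.contents = []

    def add(self, name, obj_type):
        self.contents.append(ContentsElement(name, obj_type))

    def find(self, name):
        for elem in self.contents:
            if elem.name == name:
                return elem
        return None

# A two-level scheme: a top Collector listing users, one Collector per user.
users = Collector("users")
users.add("alice", "Collector")
users.add("bob", "Collector")

alice = Collector("alice")
alice.add("reminders", "RuleSet")
alice.add("reminders", "LTFactSet")   # same name, distinct type: a conduct set
```

Finding "alice" in the top Collector yields a ContentsElement whose attributes could then be used to instantiate that user's Collector.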
Collectors are also control points for synchronization of multiple processes/threads, permitting control of inferencing object editing concurrently with rule inferencing - or synchronization of multiple rule editor activities. For this, there is both version management and locking for atomicity of reads.
These Collector control points are also useful for managing the secure access to inferencing objects.
Note: This is not yet supported by the IBM Agent Building Environment Developer's Toolkit Library. In a mobile agent environment, these control points allow for secure examination of an IA environment by a visiting mobile agent; that is, the IA environment can make available its hierarchy of Collectors for examination by a mobile agent, limited of course by Collector-level access controls for that mobile agent.
Whereas name, id, and state are standard metadata for RuleSets, Rules, LTFactSets, and LTFacts (and name is standard metadata for Collectors), user-defined metadata, which is optional, is simply a string that can be associated with each of these base objects. Although its use is not specifically defined, it is expected to be used by rule editors for associating rule graphic presentation information. It can also be used to record and associate user-specific terminology with shorter or symbolized versions of that information as expressed in the inferencing rules.
When an object is stored or retrieved, it can optionally include user-defined metadata. Metadata is associated with its base object by name, i.e. metadata is stored in the library with the same name as the base object (one to which it is associated), but with a distinguishing type name that is paired with the base object name to make it unique in persistent storage. As before, the type distinction between library objects is managed internally by the library.
Generally speaking, you can have more than one instance of metadata for a given base object. This is accomplished by another name called the metadata name. Thus each object in persistent storage is distinguishable by a key composed of up to three parts:
Metadata can also be used by converters when loading standard KIF strings (rules or long-term facts) into engine-specific objects for inferencing. It can also be used to retain rule signatures that can enforce rule consistency with editor or adapter specifications. Finally, where a collector represents a particular user, collector metadata could be used to hold profile information for that user. In all of these cases, the originator of the object is also the composer and interpreter of the metadata string that is associated with that object. The library simply allows for the association of the metadata with the base object so that it can be retrieved (optionally) when the object is retrieved and (optionally) modified when the object is updated.
In addition to allowing for Rule and LTFact level metadata, you can also specify metadata for the RuleSet or LTFactSet as a whole. This level of metadata can be named and stored/retrieved independently from the metadata for the Rules/LTFacts that are contained in the Ruleset/LTFactSet. In summary, there are exactly five types or levels of metadata: Collector, RuleSet, Rule, LTFactSet, and LTFact. Each has a distinguishing type code.
Rule and LTFact-level metadata is not selectable on an individual Rule or LTFact basis, i.e. metadata at this level is named and selected through the RuleSet/LTFactSet that contains the base object (Rule or LTFact). This means that you can assign a metadata string to each Rule in a RuleSet, for example, and maintain that association, but all of the controls for those metadata strings are associated with the RuleSet. The reason for this is that the library does not store or retrieve parts of RuleSets, but only whole RuleSets, so we also do not name or store/retrieve the metadata for only some Rules in a RuleSet, but instead do so for all of the metadata for the Rules in a RuleSet as a single unit. Even so, metadata for Rules (and LTFacts) is still associated with the individual rules/facts persistently and consistently.
The Library also supports multiple versions or instances of metadata for the same base object instance. This is done by allowing the originator of the metadata to assign it a unique name (unique within the scope of the base object).
Note: Currently, only one instance of metadata is supported by the library at the element (Rule or LTFact) level. Although you can use member functions to name metadata at this level, inconsistencies would result from adds and deletes of Rules or LTFacts if multiple instances of metadata were actually used. If the need for this capability firms up, this restriction will be lifted later. Furthermore, you are required to get your element-level metadata along with your RuleSet or LTFactSet if you are going to do adds or deletes to the Rules or LTFacts therein. The latter restriction will also be lifted if we lift the restriction that limits element-level metadata to a single instance.
Currently up to 36 such metadata names are permitted using a single character with values of 0-9 and A-Z. The default metadata name is 0 (zero). This metadata name can be considered an extension of the base object name and the type as previously described. This sounds complex, but is really simple because the association between metadata and its base object is always implied when it is defined and metadata is typed internally. The naming of metadata only comes into play when there is a particular need for more than one instance of metadata; otherwise it defaults to a name of '0'. Normally, all of this naming is not really apparent externally and is only seen when specifically requesting that detail as an option when examining the contents of a Collector.
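The three-part key (base object name, type, metadata name) can be illustrated with a short sketch. The actual key encoding is internal to the Library; the type codes and tuple layout below are purely hypothetical:

```python
# Hypothetical single-character type codes; the real encoding is internal
# to the Library and not exposed to users.
TYPE_CODES = {'Collector': 'C', 'RuleSet': 'R', 'Rule': 'r',
              'LTFactSet': 'F', 'LTFact': 'f'}

# Up to 36 metadata names: a single character 0-9 or A-Z, defaulting to '0'.
VALID_METADATA_NAMES = set('0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ')

def storage_key(base_name, obj_type, metadata_name='0'):
    """Compose the up-to-three-part key that distinguishes a base object
    (or one named instance of its metadata) in persistent storage."""
    if metadata_name not in VALID_METADATA_NAMES:
        raise ValueError("metadata name must be one of 0-9 or A-Z")
    return (base_name, TYPE_CODES[obj_type], metadata_name)
```

Because the type is part of the key, a RuleSet and a LTFactSet named "reminders" coexist without ambiguity, and each named metadata instance extends that key further.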
Because some or all metadata is not necessarily pertinent to rule inferencing, it is only optionally retrieved with the rules that are loaded for inferencing. The metadata is stored by the library separately from the associated base object. When an editor or rule converter wants to retrieve or store metadata with an object, it uses a get (retrieval) or put (store) call that indicates metadata is requested. When this is done, the Library retrieves/stores not only the base object information, but also the metadata. In this way the performance of loading, for example, can be optimized for the non-metadata case when appropriate. Also, only the one named version of the metadata specified (or defaulted) is accessed.
Note: There may be a requirement for the concurrent retrieval or storage of more than one instance of metadata for an object. If this requirement materializes, then appropriate functions will be added to the library interface to address this requirement.
Atomicity applies similarly to concurrent write operations: one write should complete before another can begin (atomic writes).
The problem also applies to related information in multiple files. That is, one could have data in two or more files that are dependent in such a way that changing one without changing the other(s) would give an inconsistent "picture" of the overall data. Managing the latter is typically addressed by transaction management systems. In the current IA Library file implementation we are concerned with both single file read atomicity and multiple file atomicity. The latter is a concern because a single "object" can be implemented in and reflected by multiple persistent storage files.
The IBM Agent Building Environment Developer's Toolkit Library supports concurrency for Library object functions that affect persistent storage, but it does not directly support concurrency controls for Library functions that affect shared memory objects. That is, the Library addresses the first three problems listed above but not the last problem. The get(), put() and del() kinds of functions on Library objects are implicitly locked to ensure atomicity of these persistent storage operations. However, object-only functions such as find() and cursor operations are not protected against concurrent adds and deletes (or other cursor operations) that can affect either the results or the cursor positioning. This is discussed further below.
In addition to atomicity of persistent storage support, the Library also allows you to detect persistent storage changes by another thread or process (even at another node in the network) through version management. For example, a get() or put() may return an error indicating that the version of the object (persistently) is different than you would have expected because of concurrent changes by another process or thread. It is then your application's choice to repeat your work on the object or to override the concurrent changes. This is addressed further below.
Scope of Locking - Collector: The scope of the implicit lock is the Collector that holds the objects that are to be locked. This Collector must be instantiated by each thread/process that uses it. That is, threads can have individual instantiations of the same Collector, and each of these instantiations of the same logical Collector object is the control point for managing atomic consistency for concurrent accesses by the threads to the contents of the common logical Collector, as represented by its persistent storage implementation. This works because the locking is not implemented locally in one particular Collector memory object, but through the common persistent storage implementation of a particular Collector.
Atomicity Locking for LogRecordSets - For LogRecordSets the point of control for locking is the LogRecordSet itself rather than the containing Collector. However, when a new LogRecordSet is put(), it is also necessary to lock the containing Collector before locking the LogRecordSet. This is required to ensure that another thread, that is doing a get() for the containing Collector's contents, gets a consistent snapshot of the contents. This locking is implicit with the put().
Write operations are serialized by managing object versions. The version is incremented for each put(), del() or delMetadata(), but before incrementing it, the current version, that was the basis for the current operation, is checked against the current version in persistent storage to ensure proper sequencing of writes. When such an operation encounters a version problem (the current object version is not consistent with the version of the object in the library persistent storage), the object version is replaced by the persistent storage version and a failure is indicated. As a result of this failure, you can:
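This check-then-increment scheme (often called optimistic concurrency control) can be sketched briefly. The Python below is an illustrative model only; the real Library's put() reports the conflict through a return code, and all names here are hypothetical:

```python
class VersionError(Exception):
    """Raised (in this sketch) when a write is based on a stale version;
    carries the current persistent storage version."""

class Store:
    """Illustrative model of version-checked writes to persistent storage."""
    def __init__(self):
        self._data = {}      # name -> (version, value); version 0 = not stored

    def get(self, name):
        return self._data.get(name, (0, None))

    def put(self, name, value, based_on_version):
        current, _ = self._data.get(name, (0, None))
        if based_on_version != current:
            # Another thread/process wrote first: surface the storage
            # version so the caller can redo its work or override.
            raise VersionError(current)
        self._data[name] = (current + 1, value)
        return current + 1
```

A writer that loses the race gets the newer storage version back and can then either repeat its work against the fresh copy or deliberately overwrite it.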
The Scope of Version Management: In addition to the atomicity locking described above, put() methods also manage library object versions. Version management applies to the following objects:
Following are examples of some operations on Library objects that are not guaranteed to be thread safe. That is, when these operations are used in a program by multiple threads concurrently, inconsistent results can be experienced:
When your threads use these operations concurrently, inconsistent cursor positioning can result and cause unpredictable results. There may be several more operations that are not thread safe, depending on the sensitivity of the using thread to changes. To assess these you need to consider what objects are shared by your threads and what operations can cause conflicts and inconsistencies. Then, where these occur, you should protect the operations or series of operations with a lock or mutex. The Library does not provide this for you because:
LogRecordSets are used for recording information about what inferencing has taken place for a particular inferencing client (a Collector, for example). This is useful for monitoring inferencing activity, debugging, accounting, auditing, etc.
LogRecordSet is a class that accepts the accumulation of a sequence of LogRecords, and, either implicitly or when one of its member functions instructs, puts these accumulated records to persistent storage by adding them to a wraparound persistent storage object (e.g. one or more files). Other functions include delete (starting over), etc.
Note: This Level 6 of the IBM Agent Building Environment Developer's Toolkit includes this general facility for library supported logging, but the engine does not use it in its logging of all inferencing activity.
The logging done by the current IBM Agent Building Environment Developer's Toolkit engine logs a significant amount of information about inferencing for use in debugging. This is an alpha version tool that will be significantly changed in the future. The log file is a file named raise.log in your current directory.
Note: IALibLogRecordSet is referred to below as LRSet for short and IALibLogRecord is referred to as LR for short.
Log Association with Collectors: The library supports LRSets in such a way that you can do logging that is associated with any level of inferencing that you wish. This is done by associating LRSets with library Collector objects, which can be configured hierarchically in any manner that makes sense to the application. For example, you can go with a two-level Collector scheme with users represented by the second level. With this approach you could log system events and other things that are not user-specific at the top Collector level, and user-specific activities at the second level. The latter would be accomplished by having a LRSet for each user-associated Collector.
Buffering: A LRSet logs individual log records that are represented as LRs. The LRSet implementation buffers input and output LRs internally. The user can select the number of LRs to be buffered when a LRSet is created. Note that input and output buffer sizes are the same; that is, you cannot specify the buffer size on reads of the LRSet; you just inherit the buffer size that was established when the LRSet was written.
Wrap-Around: LRSets have a maximum size that is established when the LRSet is created. This size defines the wrap-around point for the log. That is, when the maximum log size is exceeded, there is a wrap-around such that the oldest log records are replaced by the newest ones.
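The wrap-around behavior can be sketched with a simple bounded list. This Python model is illustrative only; the real LRSet is a C++ class backed by persistent storage files, and the names here are assumptions:

```python
class WrapLog:
    """Illustrative model of a wrap-around log: once the maximum size is
    exceeded, the oldest records are replaced by the newest ones."""
    def __init__(self, max_records):
        self.max = max_records
        self._records = []

    def add(self, record):
        self._records.append(record)
        if len(self._records) > self.max:
            self._records.pop(0)   # wrap-around: drop the oldest record

    def read_from_start(self):
        # Uni-directional reading: forward from the oldest surviving
        # record to the current end of the log.
        return list(self._records)
```

With a maximum of three records, writing five leaves only the three newest, which is what a reader positioned at the beginning will see.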
Uni-directional: You can only move forward in the log in either input or output mode. You can position the log at the beginning or at some other starting point (log record), relative to the end of the log, but from that starting point, read moves forward chronologically to the current end of the log. Normally, you position at the last (newest) log record before beginning to write to an existing log.
Non-standard Data Content: There are two ways to include non-standard information in a log record:
Levels of Logging and Types of Log Records:
You can specify the types of LogRecords that are acceptable for writing to a LogRecordSet by setting the log level (setLevel()). When you set the level for log writes, addRecord() rejects LogRecord types that are not encompassed by the level that is currently set. The log level can be changed dynamically.
You can also specify a level for reading LogRecords that can limit the types of LogRecords that getRecord() presents to you.
Some LogRecord types are suggested, but the library is not sensitive to these or any other types that can be added. The suggested types are:
For example, if you set level to 7 for writes, you would only be able to write normal activity (1), error (2), and information (4) type log records. Then, if you set level for reading to 3, you would only be given normal activity (1) and error (2) log records.
Log Record Markup (Metadata): When you view an existing log (as a log reader), you can apply markup to the log records. For example, you can annotate the log entries to indicate that they have been viewed, that you don't want to see them again, etc. This markup is referred to as log record metadata. It is not Library defined. You specify the size of it and set its content. You can have multiple, named versions of the metadata for a single LRSet, allowing multiple viewers to each maintain their own markup. The selection of metadata by name and its management is under user control. The size of metadata that is associated with a log record is fixed, but this fixed size can be different for each named instance of log set metadata.
Concurrency: The library assumes that write concurrency for logs is handled by serializing (at the Collector level of logging) through a thread for that level; that is, no two threads should be writing to the same log at the same time. The library does not provide this serialization, nor does it enforce it. While the Library maintains versions for each log independently, versions are not used to control consistency for concurrent log writes because of this assumption of external serialization of writes.
The library does, however, provide for log read (by a log viewer or formatter) atomic consistency. This is done by serializing persistent storage reads and writes to a particular log on a block (buffer full) basis (because that is when persistent storage reads and writes occur). This serialization is implicit on log get() and put() operations. Serialization is also implicit for other operations when they affect or are affected by changes in persistently stored log information: addRecord() (when a buffer is written), clear(), del(), and getRecord() (when a buffer is read). This means that a log viewer/formatter does not see partial changes from log writes. Instead, when concurrent writes cause inconsistency for readers (due to overwrites on wrap-around logging), the reader is given a return code and can thereby start over or complete with what has already been read.
Atomicity lock conflicts between log writes and log reads are expected to be minimal for several reasons:
Note: Logically there is no conflict between concurrent log readers, and the implementation does nothing in particular to support this. There is no locking to provide a read snapshot level. The read snapshot is established when the read starts by getting a particular level of the attribute file. This establishes the reader's picture of the log for the duration of that particular log read session. This read of the attribute file is under the control of an atomicity lock that provides read atomicity from any log write operations.
When objects are added to or deleted from a Collector, the holding Collector is locked to ensure atomicity of reading those objects, as well as atomicity of examining Collector contents. This applies to LRSets as well. When a new LRSet is added to or deleted from a Collector, the Collector is locked.
The types of persistent objects supported by the library are:
Library is a component of the IBM Agent Building Environment Developer's Toolkit. It is configured like the other frameworks of the Agent Building Environment Developer's Toolkit. By downloading the IBM Agent Building Environment Developer's Toolkit for a particular platform (OS/2, Windows, etc.), you get library load modules that support that platform for C++ (iaglibr.lib in your installed inc directory) and Java (iaglibj, also in the inc directory). With these DLLs you have static linking capabilities for either a C++ or Java programming interface when using the Library Component.
In addition, at run time you indicate the components with which you want to configure (dynamically link) for the running agent. By selecting the appropriate library DLL, you can establish the particular Library implementation that you want. Currently there is only one such implementation, a file system implementation. It is selected as the iaglibf DLL in the Configuration File library statement for the sample IAAgent program, or through the attachImplementation() method of the IALibrary object when you write your own agent program using IAAgent. Later, additional Library implementations may be available. For example, a database implementation could become available through another dynamically linked DLL.
When you use the library through the sample agent to do normal inferencing, using the sample agent driver program, you need not be concerned with the details of how the library is linked and used as the agent does all of this for you. You only need to specify the Library Configuration File statement. Refer to Chapter 6. "Building and Starting Your Agent" for instructions on specifying the Library statement of the Configuration File.
If you replace the sample agent driver program with one of your own, most of the details of linking to the Library are still hidden from you; this linking is done for you through the addLibrary() method of IAAgent.
Likewise, the generic Rule Editor supplied with IBM Agent Building Environment Developer's Toolkit takes care of linking with the Library for you.
If you are writing application programs or rule editors from which you plan to use the Library services, you can statically (at compile time) link with the C++ or Java Library DLLs for your platform. Then, at run time, you can dynamically link with the file system implementation DLL (iaglibf). See Chapter 9. "How Your Application Provides Rules for ABE" for instructions on dynamically linking with the Library in this case.
The Library supports Double Byte Character Sets (DBCS) through standard C++ character strings in the Library member functions. DBCS may be included in KIF rule strings, but not as keywords used to parse the KIF strings.
Note: The methods described below, as well as many others, are illustrated in Chapter 3. "Agent Server Sample" and its associated code in Appendix A. "Reminder Agent Sample Formatted Code".
The anchor point for a top Collector is established via a special parameter for any of the following three functions, depending on the use of the Library that is involved:
Note: In the case where a repository has multiple instances of IBM Agent Building Environment Developer's Toolkit Libraries, the (map or equivalent) parameter used to establish the Library establishes the particular instance of the Library that applies to your application.
When you first start out and have no Library established, that is, you have no rules written, you only need to establish an empty directory (for the current file implementation of the Library) in your file system and specify the path to that directory via the map parameter or through the generic rule editor.
If you are writing an application that works with Collector and ContentsElement objects directly, such as would be the case when your application stores or reads rules directly (refer to Chapter 3. "Agent Server Sample"), you need to understand in more detail the use of the get() and put() methods for the Collector. Also in this case, the following steps allow for setting up the Library:
Note: A very important point is that you are not required to use the Collector put method to record the RuleSet in persistent storage, because this is handled implicitly by the same RuleSet put method that you use to put the RuleSet into the containing Collector object (again refer to the section on RuleSets that follows). The only time that you must use the Collector put for a top Collector is when you want to add or update Collector-level metadata for that Collector. If you have multiple Collectors, that is, Collectors contained in Collectors (hierarchically), such as for a multi-user server, your contained Collectors must be "put" when they are created to establish them in persistent storage. In the file system implementation, for example, this implicitly creates a subdirectory for them under the directory that you established for the top Collector. This put for the sub-Collector must come before any "puts" for objects, such as RuleSets, that are to be contained in that sub-Collector. Once you have done this one "put" for the new Collector, you need not use the Collector put again, except for adding or updating Collector metadata. So once the new Collector is established (by a put), it is maintained in persistent storage through puts to its contained objects, not through puts to the Collector itself.
Now consider an established Library, one that already has objects such as Collectors, sub-Collectors, and RuleSets established in it. In this case it is appropriate to initialize very much as described above, but without using the Collector put method. Rather, after completing the attachImplementation() and getTop(), it is appropriate to "fill" the Collectors (that is, the memory objects) from persistent storage to obtain the latest contents. This is accomplished by using the Collector get method. Note that this is required for any existing Collector, even a top Collector. This allows your application to use the cursor and find methods on the Collectors to determine their contents. Once your application has done the get() (this is very much like an "open"), you need not do it again for the current program.
To create and store a Collector, RuleSet, or LTFactSet, you invoke the appropriate constructor, specifying at least the name assigned to the new object and a reference to the Collector that is to be its parent, i.e. the Collector that is to contain it. Then invoke the put member function for the new object to cause it to be stored in persistent storage.
In the case of RuleSets and LTFactSets, you would construct rules/facts and add them to the new RuleSet/LTFactSet after invoking the constructor and before executing the put member function.
Rules and LTFacts are built by invoking their constructors. The attributes of a Rule or LTFact can be set in the constructor or through set method functions.
You can add Rules or LTFacts to RuleSets or LTFactSets through the addElement member function or, after positioning the implicit cursor for a RuleSet or LTFactSet, by using either the addElementAsNext or the addElementAsPrevious function.
When you add, delete, or update objects in an existing Collector (one that has already been put), it is not necessary or proper to put the containing Collector. Changes to an existing Collector are handled implicitly by the put method that is applied to its contained objects. In general, only a single put is ever applied to a particular Collector, that put being the one that originally creates the Collector in persistent storage. This ensures that the "commit" of changes to objects in persistent storage includes the (implicit) updates to parent Collectors. In this way you know that a successful put ensures a complete recording of the object in persistent storage (in case of a subsequent failure).
The addElement member function of a Collector is only used internally by the library to keep the contents of a collector consistent with contained objects changes.
To retrieve a RuleSet or LTFactSet and the Rules/LTFacts that they contain, invoke the constructor for the object, specifying at least the name of the object and a reference to the Collector that contains it (parent). Then invoke the get member function for the object to "fill" it by reading its contents from persistent storage. After this, you can use the cursor methods to get at the Rules/LTFacts that it contains. The firstElement, nextElement, previousElement, and lastElement member functions position the implicit cursor and return Rule or LTFact references for the contained objects. When you use one of these cursor member functions and it results in positioning to an invalid element, an IABool false value is returned. Examples of this include 1) no elements to position on, 2) nextElement used but no more elements found, and 3) previousElement used when you were already at the first element. These references permit the application of Rule and LTFact methods to operate on the contained Rule or LTFact objects. Additionally, the find member function can be used to find a Rule or LTFact by name. The find member function positions the cursor and returns a reference to the matching contained Rule or LTFact. When no element is found, an IABool false is returned.
You can retrieve an existing Collector similarly. Invoke the Collector's constructor, providing the Collector name and a reference to its parent Collector object. Then invoke the get member function on the new object to "fill" it from persistent storage. The object cannot be used, e.g. for operating on its contained objects, until this get is completed successfully.
To add or replace metadata for a Collector, RuleSet, or LTFactSet, you use the setMetadataName member function for that base object to establish the name for the metadata, then use the setMetadata member function to set the metadata itself. Then when you put the base object, the metadata is written to persistent storage with it. To read existing metadata from persistent storage, you do similarly, but use the get method rather than put.
For Rule and LTFact metadata, the metadata name is set by a setElementsMetadataName member function of the containing RuleSet or LTFactSet. To set the actual rule or fact level metadata, you use the setMetadata member function of the Rule or LTFact object. Just as was the case for Ruleset or LTFactSet metadata, Rule or LTFact metadata is stored (put) or retrieved (get) along with the containing base object (RuleSet or LTFactSet), according to the setting of the metadata name(s) attribute(s) in the base object.
You can avoid putting or getting metadata, even when the metadata attributes are set in the base object, by using the putWithoutMetadata or getWithoutMetadata methods.
Generally speaking, you can only retrieve or store a single instance of metadata with a get or put method, even though multiple instances of metadata are associated with the base object. The instance of metadata retrieved is the one named in the base object. However, you can retrieve or store both an instance of RuleSet/LTFactSet-level metadata and an instance of Rule/LTFact-level metadata with a single get/put. This is possible because you can set each of these two levels of metadata for the same base object.
Currently you cannot create multiple instances of metadata at the Rule or LTFact level. Also you must get the metadata at the Rule or LTFact level whenever you expect to add or delete Rules or LTFacts that have metadata associations.
Individual Rules or LTFacts are deleted by finding them or positioning the implicit cursor through the containing RuleSet or LTFactSet, then invoking the deleteElement member function for the containing RuleSet or LTFactSet. The deleteElement member function also causes an implicit nextElement to be invoked, which can result in an IABool false for all the cases where this would occur for nextElement. When a Rule or LTFact is deleted, all metadata for that rule is also deleted implicitly. When you delete a Rule or LTFact, the implementation marks the item for deletion. The actual deletion occurs when you put the containing base object. Cursor positioning does not show the deleted elements.
You can delete metadata without deleting the base object by using the delMetadata member function for a Collector, RuleSet, or LTFactSet. You can do similarly for the Rule- or LTFact-level metadata by using the delElementsMetadata member function. These metadata delete functions allow you to specify the name of the metadata that you want to delete.
A RuleSet or LTFactSet is deleted by invoking the del member function of the RuleSet or LTFactSet. All RuleSet or LTFactSet metadata is implicitly deleted when the base object is deleted. If the RuleSet or LTFactSet contains Rules or LTFacts, they are also deleted, along with their associated metadata. Similarly, a Collector is deleted by invoking the del member function of the Collector. This is not a cascading delete, i.e. the Collector must be empty.
The put() includes, in addition to writing attributes to persistent storage, a logical "close" in that it also causes the last buffer to be written out to persistent storage. You can use put() to flush the buffer and update the persistent attributes at any time; then continue with addRecord() functions. But before terminating the logging you should do a final put to logically close the log. After put() you can run the destructor and finish, or you can use clear(), del(), or start over by using addRecord() once again (continuing the same LRSet at the point where it was left by the put()). In this case, you need not use get() to reestablish the existing attributes or persistent cursor positioning. There is no comparable logical close for log reads.
Positioning for Log Reads: You can position the log starting point for log reads by using get(). There are three ways to position: you can start from the beginning (oldest log record), position n log records from the end of the log, or position at a particular log record (by specifying a log key (identifier)). The last can be used, for example, to save a log key from a log record that you already read, then begin reading the log again at a later time, using that "high water mark" as a new starting point. When starting at a given log key (Key Position), you are positioned at that specified log record, unless that log record no longer exists. You cannot always predict whether a log record still exists, because the log can have wrapped around past the log record that was previously read. When you request positioning on a log record that has been replaced due to a wrap-around, you are positioned at the next available existing log record following the one that you requested.
Setting and Extracting User-defined Log Metadata: You can associate metadata with log records and store it in persistent storage with a LRSet. The log metadata is intended for use by readers or viewers/formatters of the log to permit their "mark-up" of log records. Set and extract of metadata can be done in connection with moving through a LRSet during log read operations. Metadata is user-defined and associated with particular log records. You can set/extract metadata for a particular log record while "positioned" on that record (while viewing the log records sequentially). Alternatively, you can also set/extract metadata independent of this log positioning by providing a "log key" (log record identifier) that you saved from earlier reading of the log record. This may be appropriate for cases where log viewers are remote, allowing markup of log records to be accumulated remotely, then sent to a central location for application to the LRSet without first re-reading the log. Log keys are assigned (internally) by the library when a log record is added to the log set.
Metadata is recorded as a single character string for each log record. When you establish metadata for a LRSet, you also establish this character string size. Once established, this size is fixed for that metadata instance. If you provide a shorter string than you established for the metadata size, it is padded to the right with blanks. When you extract this padded metadata field, you also get a padded character string. When metadata is not set, it defaults to blank(s).
You can have multiple instances of metadata for a particular log set. Each instance is assigned a name when the metadata is instantiated. In this way different users/viewers can associate their own "mark-up" to the log records. Each named metadata instance in the same LRSet can have a different metadata size.
Metadata can only be associated with log records when the log records are associated with a LRSet. This is consistent with the intention that metadata is not designed for use by log writers, who can instantiate LR objects before adding them to the LRSet. Instead, metadata is associated with log records (by log readers) after the log records have become a part of a LRSet. For this reason set/extract metadata are methods of the LRSet, not the log record. Even though metadata is controlled through the LRSet, it is nevertheless associated with log records.
No Explicit Open/Close: There is no explicit open or close function for LRSets. The essential aspects of these functions are provided implicitly through the get() and put() member functions respectively. For writing a new log, you specify LRSet attributes through the LRSet constructor and the implicit open occurs on the first addRecord() member function. For writing to an existing log, the implicit open occurs with the get() that uses the existing attributes of the log and positions the cursor for adding to the log. The addRecord() function also establishes write mode without declaring write-intent via an explicit open. Once this write mode is thus established it cannot be changed without an implicit close (via put() as described below), or use of the clear() or del() functions.
When reading a log, you also start with a get(). This is very much like an open, establishing the log attributes, positioning the starting point, and priming the internal buffer. The getRecord() member function reads log records. The first getRecord() establishes intent (read), after which you cannot do writes (addRecord()) without restarting.
Clear Before Write: The clear() function resets all persistent storage data, but does not affect log attributes. It essentially allows for keeping the same log, but erases existing log records and allows for starting over again. Because the data is reset, the persistent storage cursor attribute is also reset to the top. After a clear, you can begin writing to the log (addRecord()) afresh from the existing object.
Clean Slate: The del() function resets all persistent storage data like clear(), but also resets all LRSet attributes in persistent storage. LRSet attributes in the object remain, so you can begin again, or use set methods to alter the object attributes and then begin again with addRecord(). The main difference from clear() is that persistent storage attributes for the LRSet are lost; if you start over, they are reestablished.
Read/Write Separation: If you both read and write to the same log, you must use separate threads and separate LRSet instantiations for them so that proper serialization of read and write operations can maintain consistency. Following are the log read operations (relative to persistent storage changes): getRecord() and putMetadata(). Following are the log write operations: addRecord(), put(), clear(), and del(). get() is used to initialize either writes or reads to an existing log.