Parser Builder

Fort Riley, Kansas. Decoding a message at the message center which was established by the Signal Corps during a field problem. (Parsing is a form of decoding.)

...many systems need to read data from a stream and classify elements on the steam as objects. Many times the knowledge of how to interpret a stream is know by a different group than the knowledge of how to use that stream, making LooseInterfaces advantageous.

Given a data stream, we want to interpret it, classifying the elements into the appropriate class of object. The data stream contains tags that can be used to identify the raw data, and we want to convert the stream into object form, so we can process the data.

We need a way to create arbitrary objects based on tokens in the data stream.

✥ ✥ ✥
For example consider the problem of reading in raw UNIX files and classifying them into types of files based on their "magic number" --as in the tags in the /etc/magic file. You could create the appropriate subclass of File and then invoke its virtual edit() method, bringing up the appropriate editor.

In a telemetry processing system each telemetry packet has identifying information in its header. The telemetry processing system design requires that an object, once created, knows how to process itself (i.e., we will not use a dispatch table, or a switch on type--this is to satisfy the OrganizationFollowsLocation pattern). At the lowest level objects will be created using a FactoryMethod [BibRef-GOF1995]. Each class of packets will be processed differently; some will assemble themselves into larger units, others will issue messages. Consider the following hierarchy, for a spacecraft that there are two subclasses of Packet:APackets and BPackets:

external image BerczukPLoP953.gif
Sample Packet Hierarchy
We want each Packet, once created, to process itself by using a virtual method, process(). If we pass a data stream into a factory, we want to return a pointer to a Packet that has the appropriate type. To summarize the forces:

  • There is a need to interpret a raw data stream.
  • There is a generic way to process the packets once they are returned from the factory.
  • The raw data contain tags which can be used for classification.


Use a ParserBuilder which reads the identifying information from the header of the packet, and creates an object of the appropriate type, removing only one object's worth of data from the stream.

An example of a client interface in C++ is:

     while (!dataStream.empty()) {
         PacketFactory f;
         Packet* p = f.make(dataStream);
         if(p) p->process()
✥ ✥ ✥
This is a variant of AbstractFactory [BibRef-GOF1995] but the object to be created is defined in the data stream, rather than by the client. HierarchyOfFactories and ParserBuilder can be used to implement LooseInterfaces by providing a means of separating clients from producers of data (assuming that data producers also define the factories).

Other uses:
In some object persistence mechanisms, objects are assigned class Id's which are placed in the storage stream. These Ids are restored first to allow the system decide what class object to make from the restored stream.

ParserBuilder is used in in the pattern QueryObjects [BibRef-BrownWhitenack1999] to convert SQL statements to QUERY objects. (Query Objects addresses the problem of handling the generation and execution of SQL statements in an object-oriented way, when you are trying to use a relational database to store objects.) [BibRef-Riehl1999] discusses similar issues, building objects on a desktop using specifications.

The distinction between this pattern, Builder [BibRef-GOF1995] and FactoryMethod [BibRef-GOF1995] is that in this pattern the factory reads from a stream and the client does not know which type of object will be returned. For text interpretation, ParserBuilder can be a front end to the Interpreter [BibRef-GOF1995] pattern.