The Union Operator
In the Blocks Architecture, a space.cgi implementation (technically
known as a builder) uses the Blocks protocol suite to submit a
request to a SpaceServer engine. When a user submits a query in
the form of retrieve.tag syntax, the job of the space.cgi builder
is to construct a proper request using the <union> operator,
which in turn becomes part of the payload of a packet in the
form of a syntax known as the Simple Exchange Profile.
For developers wishing to bypass the basic syntax of calls to
space.cgi and submit a "native" query, the retrieve.union parameter
provides that functionality. The content of the retrieve.union
parameter is a valid union element, as specified in this tutorial.
The "union" element defines the set-union of one or more "intersect"
elements, each of which defines the set-intersection of one or more "union" or
"compare" elements, each of the latter describing a containment/value assertion.
Think of a union element as a container for a Boolean "or" over a set
of search results, and an intersect element as a Boolean "and" over a
set of search results. In combination, these two elements allow
the developer to submit an arbitrarily complex query to the SpaceServer.
The "compare" elements specify the actual tests to be performed, so
every nesting of union and intersect elements must ultimately contain
at least one compare element.
Each "compare" element contains three attributes, a "path" element,
and a "value" element:
- The "subtree" attribute identifies the naming scope for the comparison.
At present, doc.edgar and doc.rfc are the two valid values for a
subtree attribute.
- The "operator" attribute, if present, identifies the comparison to be made
between the "path" element and the "value" element, one of: "eq" (equals, the
default), "ne" (not equals), "contains", or "excludes".
- The "caseSensitive" attribute, if present, is either "true" (the default)
or "false" to indicate whether the comparison should consider textual case
significant.
- The "path" element identifies a partial containment hierarchy for the
comparison.
- The "value" element contains the text to compare against.
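As a sketch of how the pieces listed above fit together, the following query writes out all three attributes of a "compare" element explicitly, including the two that have defaults. The "email" property name is borrowed from the examples in this tutorial; the value is purely illustrative, and the comments are for the reader only.

```xml
<union>
  <intersect>
    <!-- all three attributes spelled out; operator and caseSensitive
         are shown here with their default values -->
    <compare subtree='doc.rfc' operator='eq' caseSensitive='true'>
      <!-- partial containment hierarchy: any "email" property -->
      <path><element property='email' /></path>
      <!-- text to compare against (an illustrative value) -->
      <value>user@example.com</value>
    </compare>
  </intersect>
</union>
```

Omitting operator and caseSensitive from this query should leave its meaning unchanged, since both are given their default values.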
At the core of the fetch operation is the notion of a containment/value
assertion. Each assertion identifies a comparison between a partial containment
hierarchy and a textual value.
Several examples serve to illustrate these relationships.
First, consider:

    <union>
      <intersect>
        <compare subtree='doc.rfc' operator='contains'
                 caseSensitive='false'>
          <path><element property='email' /></path>
          <value>mrose@</value>
        </compare>
      </intersect>
    </union>

which looks for objects satisfying several criteria: first, the object is
named under "doc.rfc"; second, the object has at least one property called
"email"; and, third, any of those properties contains the string "mrose@"
somewhere within it, according to a case-insensitive comparison. Note that in
this example, the "email" property may occur at any level of nesting within an
object.
Second, a similar example:

    <union>
      <intersect>
        <compare subtree='doc.rfc'>
          <path attribute='surname' />
          <value>Rose</value>
        </compare>
      </intersect>
    </union>

which looks for objects in the same subtree, but is concerned only with
attribute, not property, values. That is, if any property within an object has
an attribute called "surname" and the value of that attribute precisely matches
the string "Rose", then this assertion succeeds. (Recall that the default value
for "operator" is "eq" and the default value for "caseSensitive" is "true".)
Of course, to limit the containment of the attribute, we could write:

    <union>
      <intersect>
        <compare subtree='doc.rfc'>
          <path attribute='surname'>
            <element property='doc.author' />
          </path>
          <value>Rose</value>
        </compare>
      </intersect>
    </union>

which looks for objects in the same subtree, but performs comparisons only on
attributes called "surname" within a "doc.author" property.
In addition, to further constrain the containment of the attribute's
property, we could write:

    <union>
      <intersect>
        <compare subtree='doc.rfc'>
          <path attribute='surname'>
            <element property='rfc' />
            <element property='doc.props' />
            <element property='doc.front' />
            <element property='doc.author' />
          </path>
          <value>Rose</value>
        </compare>
      </intersect>
    </union>

which looks for "doc.author" properties contained within a "doc.front"
property contained within a "doc.props" property contained within an "rfc"
property before looking at the "surname" attribute. Note, however, that there is
no concept of "rooting" in a containment hierarchy, e.g., in this example, the
"rfc" property needn't be the top-level property of the object (i.e., the root
element of the corresponding XML document).
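Because containment hierarchies are not rooted, a shorter path is also a legitimate containment. The following sketch keeps only the two innermost properties named in this tutorial; under the semantics described here, it would be expected to match any "surname" attribute on a "doc.author" property inside a "doc.front" property, whatever encloses them.

```xml
<union>
  <intersect>
    <compare subtree='doc.rfc'>
      <!-- partial hierarchy: doc.front need not be the root element,
           and nothing is said about what contains it -->
      <path attribute='surname'>
        <element property='doc.front' />
        <element property='doc.author' />
      </path>
      <value>Rose</value>
    </compare>
  </intersect>
</union>
```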
Of course, an empty containment hierarchy is also possible:

    <union>
      <intersect>
        <compare subtree='doc.rfc'>
          <path />
          <value>Rose</value>
        </compare>
      </intersect>
    </union>

which looks for the string "Rose" in any property. Similarly, to look for the
string "Rose" in any attribute value, the containment "<path attribute='*'
/>" is used.
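Written out in full, the attribute form of that query might look as follows. This is only a sketch that places the wildcard containment into the same wrapper used by the other examples.

```xml
<union>
  <intersect>
    <compare subtree='doc.rfc'>
      <!-- attribute='*' asks for a comparison against any attribute value -->
      <path attribute='*' />
      <value>Rose</value>
    </compare>
  </intersect>
</union>
```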
Note that in all the preceding examples, both the "union" and "intersect"
elements had only one immediate subordinate, rendering each as the identity
function. However, the presence of the elements is always required, regardless
of whether their functionality is needed. To set-intersect the results of
multiple assertions, the "intersect" element is given multiple subordinates.
Multiple subordinates for the "intersect" element:

    <union>
      <intersect>
        <compare subtree='doc.rfc' operator='contains' caseSensitive='false'>
          <path><element property='email' /></path>
          <value>mrose@</value>
        </compare>
        <compare subtree='doc.rfc'>
          <path attribute='surname' />
          <value>Rose</value>
        </compare>
      </intersect>
    </union>

which evaluates two assertions and returns only those objects satisfying
both.
To set-union the results of multiple intersections, the "union"
element is given multiple subordinates:

    <union>
      <intersect>
        <compare subtree='doc.rfc' operator='contains'
                 caseSensitive='false'>
          <path><element property='email' /></path>
          <value>mrose@</value>
        </compare>
      </intersect>
      <intersect>
        <compare subtree='doc.rfc'>
          <path attribute='surname' />
          <value>Rose</value>
        </compare>
      </intersect>
    </union>

which evaluates two assertions and returns objects that satisfy either. Of
course, recursion is permissible between "union" and "intersect" elements, owing
to their definitions.
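As a sketch of such recursion, the query below nests a "union" inside an "intersect": it asks for objects whose "surname" attribute equals "Rose" and whose "email" property contains either of two strings. The second string, "@example.com", is an invented value used only for illustration.

```xml
<union>
  <intersect>
    <!-- first operand of the "and": the surname attribute equals "Rose" -->
    <compare subtree='doc.rfc'>
      <path attribute='surname' />
      <value>Rose</value>
    </compare>
    <!-- second operand: a nested "or" over two email assertions -->
    <union>
      <intersect>
        <compare subtree='doc.rfc' operator='contains' caseSensitive='false'>
          <path><element property='email' /></path>
          <value>mrose@</value>
        </compare>
      </intersect>
      <intersect>
        <compare subtree='doc.rfc' operator='contains' caseSensitive='false'>
          <path><element property='email' /></path>
          <value>@example.com</value>
        </compare>
      </intersect>
    </union>
  </intersect>
</union>
```

Note that the nested "union" still wraps each "compare" in its own "intersect", because a "union" may contain only "intersect" elements.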
The Union Operator Profile DTD
The Union Operator DTD

    <!--
      DTD for Union Operator, Draft of 9/9/99
      (c) 1998-99 Invisible Worlds, Inc.
      -->

    <!ENTITY % TEXT     "#PCDATA">
    <!ENTITY % NAME     "NMTOKEN">
    <!ENTITY % TYPE     "NMTOKEN">
    <!ENTITY % ATEXT    "CDATA">

    <!ELEMENT union     (intersect+)>

    <!ELEMENT intersect ((union|compare)+)>

    <!ELEMENT compare   (path, value)>
    <!ATTLIST compare
              subtree        %NAME;                      #REQUIRED
              operator       (eq|ne|contains|excludes)   "eq"
              caseSensitive  (true|false)                "true">

    <!ELEMENT path      (element*)>
    <!ATTLIST path
              attribute      %ATEXT;                     "">

    <!ELEMENT element   EMPTY>
    <!ATTLIST element
              property       %TYPE;                      #REQUIRED>

    <!ELEMENT value     (%TEXT;)>
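
One way to check a query against this DTD before submitting it is to run it through a validating XML parser with a document type declaration added. The sketch below assumes the DTD has been saved locally under the hypothetical file name union.dtd. The declaration is meant for local validation only; whether a SpaceServer accepts it inside a retrieve.union value is not specified in this tutorial.

```xml
<?xml version='1.0'?>
<!-- union.dtd is a hypothetical local copy of the DTD shown above -->
<!DOCTYPE union SYSTEM 'union.dtd'>
<union>
  <intersect>
    <compare subtree='doc.rfc'>
      <path attribute='surname' />
      <value>Rose</value>
    </compare>
  </intersect>
</union>
```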