The problem with the previous example is that you need to know the
structure of the documents in order to find them. For example,
when we wanted to find the record for the taxon
Sauroposeidon,
we had to formulate a complex XPath
/Zthes/termName
which embodies the knowledge that taxon names are specified in a
<termName>
element inside the top-level
<Zthes>
element.
This is bad not just because it requires a lot of typing, but more
significantly because it ties searching semantics to the physical
structure of the searched records. You can't use the same search
specification to search two databases if their internal
representations are different. Consider a different taxonomy
database in which the records have taxon names specified
inside a <name>
element nested within an
<identification>
element
inside a top-level <taxon>
element: then
you'd need to search for them using
1=/taxon/identification/name
How, then, can we build broadcasting Information Retrieval applications that look for records in many different databases? The Z39.50 protocol offers a powerful and general solution to this: abstract "access points". In the Z39.50 model, an access point is simply a point at which searches can be directed. Nothing is said about implementation: in a given database, an access point might be implemented as an index, a path into physical records, an algorithm for interrogating relational tables or whatever works. The only important thing is that the semantics of an access point is fixed and well defined.
For convenience, access points are gathered into attribute sets. For example, the BIB-1 attribute set is supposed to contain bibliographic access points such as author, title, subject and ISBN; the GEO attribute set contains access points pertaining to geospatial information (bounding coordinates, stratum, latitude resolution, etc.); the CIMI attribute set contains access points to do with museum collections (provenance, inscriptions, etc.).
In practice, the BIB-1 attribute set has tended to be a dumping ground for all sorts of access points, so that, for example, it includes some geospatial access points as well as strictly bibliographic ones. Nevertheless, this model allows a layer of abstraction over the physical representation of records in databases.
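The layer of abstraction can be pictured as a lookup table from an abstract access point to whatever each database uses internally. The following is only an illustrative sketch, not how Zebra or any Z39.50 server is implemented: the database names `dinosaurs` and `other_taxonomy` and the function names are hypothetical, the two paths are the ones quoted above, and the title access point is taken to be number 4 in BIB-1.

```python
# Illustrative sketch only: a table that maps one abstract access point
# to the physical path used by each database. The database names and
# function names are hypothetical; the paths come from the text above.

BIB1_TITLE = ("bib1", 4)  # BIB-1 access point 4: title

ACCESS_POINTS = {
    "dinosaurs": {BIB1_TITLE: "/Zthes/termName"},
    "other_taxonomy": {BIB1_TITLE: "/taxon/identification/name"},
}


def resolve(database, access_point):
    """Return the physical path a database uses for an abstract access point."""
    try:
        return ACCESS_POINTS[database][access_point]
    except KeyError:
        raise LookupError(
            f"{database!r} does not support access point {access_point!r}"
        ) from None


def broadcast_plan(access_point):
    """Resolve one abstract search specification separately for every database."""
    return {db: resolve(db, access_point) for db in ACCESS_POINTS}
```

A client that calls `broadcast_plan(BIB1_TITLE)` never needs to know either path; only the per-database table does, which is the point of the abstraction.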
In the BIB-1 attribute set, a taxon name is probably best
interpreted as a title - that is, a phrase that identifies the item
in question. BIB-1 represents title searches by
access point 4. (See
The BIB-1 Attribute
Set Semantics.)
So we need to configure our dinosaur database so that searches for
BIB-1 access point 4 look in the
<termName>
element,
inside the top-level
<Zthes>
element.
This is a two-step process. First, we need to tell Zebra that we want to support the BIB-1 attribute set. Then we need to tell it which elements of its record pertain to access point 4.
We need to create an Abstract Syntax
file named after the document element of the records we're
working with, plus a .abs
suffix - in this case,
Zthes.abs
- as follows:
    attset zthes.att
    attset bib1.att
    xpath enable
    systag sysno none

    xelm /Zthes/termId            termId:w
    xelm /Zthes/termName          termName:w,title:w
    xelm /Zthes/termQualifier     termQualifier:w
    xelm /Zthes/termType          termType:w
    xelm /Zthes/termLanguage      termLanguage:w
    xelm /Zthes/termNote          termNote:w
    xelm /Zthes/termCreatedDate   termCreatedDate:w
    xelm /Zthes/termCreatedBy     termCreatedBy:w
    xelm /Zthes/termModifiedDate  termModifiedDate:w
    xelm /Zthes/termModifiedBy    termModifiedBy:w
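Notes on this file:
attset zthes.att: declares the Thesaurus attribute set.
attset bib1.att: declares the Bib-1 attribute set.
xelm: each xelm directive selects the contents of nodes by XPath expression.
title:w: on the /Zthes/termName line, this makes the term name searchable through the Bib-1 title access point as well as through termName.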
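To see which elements end up behind a given access point, the xelm lines can be read as a simple path-to-index table. The sketch below is an assumption-laden illustration, not Zebra's own parser: it handles only three-field xelm lines of the form shown in Zthes.abs, ignores every other directive, and the function names `parse_xelm` and `paths_for` are hypothetical.

```python
# Illustrative sketch only: read the xelm lines of an .abs file and list
# which XPaths feed a given index. This is not Zebra's parser and it
# understands only the simple "xelm <path> <name:type,...>" form.

ABS_TEXT = """\
attset zthes.att
attset bib1.att
xpath enable
systag sysno none
xelm /Zthes/termId termId:w
xelm /Zthes/termName termName:w,title:w
xelm /Zthes/termQualifier termQualifier:w
"""


def parse_xelm(abs_text):
    """Map each xelm path to a list of (index name, index type) pairs."""
    mapping = {}
    for line in abs_text.splitlines():
        fields = line.split()
        if len(fields) != 3 or fields[0] != "xelm":
            continue  # skip attset, xpath, systag and blank lines
        _, path, spec = fields
        targets = []
        for item in spec.split(","):
            name, _, index_type = item.partition(":")
            targets.append((name, index_type))
        mapping[path] = targets
    return mapping


def paths_for(mapping, index_name):
    """Return the paths whose contents are indexed under index_name."""
    return [
        path
        for path, targets in mapping.items()
        if any(name == index_name for name, _ in targets)
    ]
```

With the excerpt above, `paths_for(parse_xelm(ABS_TEXT), "title")` yields only `/Zthes/termName`, the one element that was given a `title:w` entry.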
After re-indexing, we can search the database using the Bib-1 title attribute, as follows:
    Z> form xml
    Z> f @attr 1=4 Eoraptor
    Sent searchRequest.
    Received SearchResponse.
    Search was a success.
    Number of hits: 1, setno 1
    SearchResult-1: Eoraptor(1)
    records returned: 0
    Elapsed: 0.106896
    Z> s
    Sent presentRequest (1+1).
    Records: 1
    [Default]Record type: XML
    <Zthes>
     <termId>2</termId>
     <termName>Eoraptor</termName>
     <termType>PT</termType>
     <termNote>The most basal known dinosaur</termNote>
     ...
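The query typed after `f` in the session is a short string made of an attribute and a term, so a client program can assemble it mechanically. The helper below is a minimal sketch under stated assumptions: `pqf_term` is a hypothetical name, it covers only the single-attribute form used in this tutorial, and it refuses terms containing double quotes rather than trying to escape them.

```python
# Illustrative sketch only: build the "@attr 1=<use> <term>" query string
# used in the session above. pqf_term is a hypothetical helper and covers
# just this one query shape.

def pqf_term(use_attribute, term):
    """Return a query string of the form '@attr 1=<use> <term>'."""
    if not term:
        raise ValueError("term must not be empty")
    if '"' in term:
        # Simplification: escaping is not attempted in this sketch.
        raise ValueError("terms containing double quotes are not supported")
    if any(ch.isspace() for ch in term):
        # Quote multi-word terms so they are sent as a single term.
        term = f'"{term}"'
    return f"@attr 1={use_attribute} {term}"
```

Here `pqf_term(4, "Eoraptor")` reproduces the query from the session, while `pqf_term("/Zthes/termName", "Sauroposeidon")` gives the structure-dependent form that the abstract access point is meant to replace.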