Raiders of the lost MARC:
Mining the Voyager database for fun and profit
When CUL migrated to Endeavor’s Voyager LMS in June 2000, the
change affected everyone who handled or accessed bibliographic
data, staff and users alike. Whether staffing the reference
desk, approving invoices in accounting, placing items on reserve,
checking in serials or cataloging, all CUL staff had to adapt,
and sometimes completely revamp, their own jobs to accommodate
the capabilities and the limitations of the system.
Among the most significant positive changes has been the greatly enhanced
reporting capability. In the NOTIS environment, only library
systems staff could write queries against the database. The
work was time-consuming and complex. With limited staff time
available, requests for reports had to be queued and handled
in priority order system-wide. But with Voyager, the ability
to get at the underlying data was democratized. Anyone with
a PC, the proper software and some training could begin writing
custom reports for themselves or their units.
In Technical Services, we’ve used a number of programs to
perform data analysis, data mining and reporting. Three of them
(Microsoft Access, VgerSelect, and Harvest) represent
our primary tools for these purposes. Each has particular strengths
and shortcomings, but in combination, they give us a very powerful
array of tools to get at hidden data to aid in decision-making
and planning, reporting and even end-user applications.
Microsoft Access is the most flexible of these tools. From their
own workstations, staff can create queries that can look at
virtually any record in the Voyager database, whether it be
a MARC bibliographic or holdings record or a purchase order,
invoice, or item record. Linking fields from the various Voyager
tables, TS staff have written sophisticated queries that count
backlogs by category, generate new acquisitions lists, determine
what we are spending on electronic resources, or tell us how
many items we’ve cataloged for a particular location in
a given time span. Often, these queries have been written with
the assistance of our resident Access guru, Lydia Pettis of
Library Systems. Many TS staff have taken Lydia’s excellent
Access class and have become adept at finding creative ways
to use it in their daily work (see sidebar).
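A linked-table count like the ones described above can be sketched as follows. This is only an illustration: it uses Python's built-in SQLite module in place of Access, and the table names (`item`, `location`), the column names and the sample rows are all hypothetical, not Voyager's actual schema or real data.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with hypothetical tables and rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE location (location_id INTEGER PRIMARY KEY, code TEXT);
        CREATE TABLE item (
            item_id INTEGER PRIMARY KEY,
            location_id INTEGER,
            create_date TEXT  -- ISO dates sort correctly as text
        );
        INSERT INTO location VALUES (1, 'main'), (2, 'music');
        INSERT INTO item VALUES
            (1, 1, '2003-01-15'),
            (2, 1, '2003-02-20'),
            (3, 2, '2003-02-21'),
            (4, 1, '2003-04-02');
        """
    )
    return conn


def count_cataloged(conn, location_code, start, end):
    """Count items for one location created on or after start and before end."""
    row = conn.execute(
        """
        SELECT COUNT(*)
        FROM item AS i
        JOIN location AS l ON l.location_id = i.location_id
        WHERE l.code = ? AND i.create_date >= ? AND i.create_date < ?
        """,
        (location_code, start, end),
    ).fetchone()
    return row[0]
```

Calling `count_cataloged(build_demo_db(), "main", "2003-01-01", "2003-04-01")` counts the sample items for the made-up `main` location in the first quarter of 2003. A real report would join whichever Voyager tables hold the item and location data.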
As useful as it is, Access does have a number of limitations when
working with bibliographic data. Getting at specific fields
or subfields in a MARC record is often difficult, and performance
can be very slow. When item records or acquisitions data aren’t
needed, VgerSelect often comes to the rescue.
VgerSelect is a freeware tool, developed by Gary Strawn of
Northwestern University Library, for harvesting MARC data from
Voyager bibliographic and holdings records. Although VgerSelect
can only extract bib and holdings data, in many ways it is more
useful, and faster, than MS Access. VgerSelect allows a user
to pinpoint exact data in a bibliographic or holdings record,
down to the subfield level. Results can be output in text format,
or the full MARC record can be written to a file. VgerSelect
has been used extensively in database cleanup projects, in tracking
cataloging errors, and in preparing data for our automated e-journal
maintenance routines.
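To make "down to the subfield level" concrete, here is a minimal sketch of subfield selection. It does not reproduce VgerSelect's interface, which this article does not describe. The record layout (a list of tag and subfield pairs) and the sample values are assumptions made for illustration.

```python
def select_subfield(record, tag, code):
    """Return every value of subfield `code` found in fields tagged `tag`.

    `record` is a hypothetical in-memory form of a MARC record:
    a list of (tag, [(subfield_code, value), ...]) tuples.
    """
    return [
        value
        for field_tag, subfields in record
        for sub_code, value in subfields
        if field_tag == tag and sub_code == code
    ]


# Made-up sample record; 856 $u is the MARC subfield that carries a URL.
demo_record = [
    ("245", [("a", "Sample title :"), ("b", "a subtitle")]),
    ("856", [("u", "http://example.org/journal"), ("z", "Full text")]),
    ("856", [("u", "http://example.org/mirror")]),
]

urls = select_subfield(demo_record, "856", "u")
```

Here `urls` holds the two sample URLs in record order. Pulling a single subfield across many records in this way is the kind of extraction that suits e-journal maintenance work.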
Harvest is a locally-developed, Web-based tool created by Peter Hoyt
of Library Systems. Harvest can examine bibliographic, holdings
or authority records. Queries are created using a simple interface,
and users can easily customize or revise the requests. Harvest
queries can also be linked to one another, depending on need.
Harvest has been used extensively in database cleanup projects,
in the e-journal maintenance work, for various statistical reports
in support of LARIS, and for other planning in technical services,
such as the implementation of classification on receipt.
Use of the three primary tools isn’t mutually exclusive; often
they are used in conjunction with one another to produce highly
customized reports. For example, a set of results retrieved
from Access or Harvest may be re-run through VgerSelect. And
follow-on processing can be multi-faceted as well. A VgerSelect
result may be imported into an Access table, an Excel spreadsheet,
or even converted to an XML or HTML file. Once the data is in
another format, charts, reports or Web pages may be generated
from the data.
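As one example of this kind of follow-on processing, the sketch below turns a delimited text result into an HTML table using only Python's standard library. It assumes a tab-delimited file whose first line is a header row. The actual output layout of VgerSelect, Access or Harvest may differ, and the function name is hypothetical.

```python
import csv
import html
import io


def rows_to_html(text, delimiter="\t"):
    """Convert delimited text (first line = header) into an HTML table string."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return "<table></table>"
    header, body = rows[0], rows[1:]
    parts = ["<table>"]
    # Escape every cell so characters such as & and < display correctly.
    parts.append(
        "<tr>" + "".join(f"<th>{html.escape(c)}</th>" for c in header) + "</tr>"
    )
    for row in body:
        parts.append(
            "<tr>" + "".join(f"<td>{html.escape(c)}</td>" for c in row) + "</tr>"
        )
    parts.append("</table>")
    return "\n".join(parts)
```

The same rows could just as easily be loaded into an Access table or a spreadsheet. The point is that once the result is plain delimited text, each downstream format is a small conversion step.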