Digital to Microfilm Conversion:
A Demonstration Project
1994-1996

Final Report to the
National Endowment for the Humanities
PS-20781-94

Anne R. Kenney
Principal Investigator

  Cornell University Library
Department of Preservation and Conservation
Ithaca, NY 14853

  URL: http://www.library.cornell.edu/preservation/publications.html

 

TABLE OF CONTENTS 

ACKNOWLEDGMENTS

INTRODUCTION
The Hybrid Approach

THE CORNELL AND YALE STUDIES
Yale’s Project Open Book
Cornell's Digital to Microfilm Conversion Project

SUMMARY FINDINGS OF THE CORNELL PROJECT
Quality
Recommendation:
Cost
Recommendation:
Process
Recommendation:

BACKGROUND ON THE CORNELL DIGITAL TO MICROFILM PROJECT

FINDINGS AND RECOMMENDATIONS: QUALITY
Quality Finding No. 1: COM meets preservation standards for quality and permanence
Quality Finding No. 2: No resolution loss in recording onto COM
Quality Finding No. 3: Paper Scanning superior to film scanning
Quality Finding No. 4: Scanning COM
Recommendation:
Density
Resolution
Reduction Ratio
Film Size
Bibliographic Characteristics

FINDINGS AND RECOMMENDATIONS: COST
Cost Finding No. 1: Microfilm and Digital Imaging
Table 1: Cornell Digital to Microfilm Conversion Processes: Time and Costs
Table 2: Cornell Annual Equipment Costs
Table 3: Producing Digital Images from Paper vs. Microfilm
Table 4: Comparing Cornell’s Digital to COM Project to Yale’s Project Open Book
Table 5: Costs Associated with the Hybrid Approach: Scan First vs Film First
Cost Finding No. 2: In-House vs. Outsourcing
Recommendation: COM vs. digital archiving program
Table 6: Comparison of COM costs to digital archiving estimates

FINDINGS AND RECOMMENDATIONS: PROCESS
Process Finding No. 1: Film First vs. Scan First:
Film First Approach
Scan First Approach
Table 7: Conversion Differences in Two Hybrid Approaches
Process Finding No. 2: Need to Investigate Grayscale and Color Scanning
Process Finding No. 3: Reasonableness Standard for Copyright Searching
Recommendation: Summit Conference

Appendices:
COM Preparation and Quality Control
Reel Programming
Image Graphics and the Electron Beam Recorder
Proposed Reasonableness Standard for Determining Copyright Status


ACKNOWLEDGMENTS

There are many individuals who contributed to the success of this project. It was truly a collaborative one, involving three divisions within Cornell, the imaging service bureau, Image Graphics Recording Center, and the Technical Advisory Committee. Cornell is particularly grateful to the National Endowment for the Humanities for its financial support and faith in our ability to conduct this important research and demonstration project.

Five nationally recognized experts in the fields of imaging science, micrographics, and standards development comprised the Technical Advisory Committee to the project. The advisory committee members included: Paul Conway (Head, Preservation Department, Yale University and Principal Investigator of Project Open Book); Nancy Elkington (Assistant Director, Preservation Services, Research Libraries Group and the guru of preservation microfilming standards); Michael Lesk (Division Manager, Computer Sciences Research, Bellcore, who had served as a technical advisor to Mann Library in the CORE digital conversion project); Don Williams (Senior Image Engineer, Eastman Kodak Research Laboratories, who chaired the AIIM Committee that developed the technical report, TR-26-1993, Resolution as it Relates to Photographic and Electronic Imaging); and Don Willis (then Vice President of Electronic Publishing, INET, who had authored the influential publication, A Hybrid Systems Approach to Preservation of Printed Materials). The committee met at the beginning of the project when they traveled to Cornell to review responses to the RFP, to inspect sample COM and affirm its potential viability, and to select a vendor. The committee met for a second time at the end of the project at Yale University where they were able to compare and contrast the findings from the Digital-to-Microfilm Project and Yale’s Project Open Book.

Many individuals at Cornell participated in this project:

The Department of Preservation and Conservation served as the principal host, responsible for book scanning and project coordination. John Dean, Director of the Department, and Barbara Berger, Preservation Reformatting Librarian, provided invaluable administrative guidance. Five scanning technicians participated in this project over the course of two and a half years: Michael Friedman, Tami Williams, Tom Tierney, Mary Moon, and Allen Quirk. Marti Hanson (Syracuse University) and Steve Chapman (Harvard University) both served as preservation interns in the Department during the first year of this project, and were responsible for developing the Request for Proposal for Computer Output Microfilm (COM) recording services. Steve continued in the department beyond his internship, and played a significant role as liaison between the scanning team, Mann Library staff, and our COM provider, Image Graphics. He also coordinated the data gathering and initial analysis for the cost study component.  

The participation from the Albert R. Mann Library, the agriculture and life sciences library at Cornell, was substantial. Jan Olsen (Director), Sam Demas (Head, Collection Development and Preservation), Rich Entlich (Preservation Librarian), Marjorie Proctor (Preservation Manager), Stephanie Lamson (Preservation Assistant), Eniko Farkas (Conservation Technician), and Janet McCue (Head, Technical Services) were responsible for the discipline-based selection of the core agricultural historical literature included in the project; for the preparation of the volumes for scanning and the quality control of the Computer Output Microfilm as well as the paper facsimiles produced directly from the digital files; for providing guidance in the development of procedures and recommended practices; for the investigation into copyright clearance; and for the cataloging of the digital files and the resulting COM. Mann Library has also committed to making the digital files from this project accessible via a Web-based user system, and Ted Wong is to be thanked for his efforts in this area.

Cornell Information Technology provided technical oversight for the project. Steve Worona (Assistant to the Vice President for Information Technology) served as the technical coordinator, and the following programmer/analysts, Sal Gurnani, Dave Fielding, Bill Fenwick, George Kozak, and Pela Varodoglu, contributed systems support and programming applications that enabled us to arrange and package the digital images for COM production and for the storage of the digital files.

The Image Graphics Recording Center (IGRC) of Shelton, Connecticut served as the sole COM producer for this project. IGRC was very helpful in working with Cornell staff to undertake this research and demonstration project. We are particularly grateful for the support of Michael Beno (Customer Service Manager), Jeff Driscoll (Sales Representative), and Putnam Morgan (Marketing Manager).

The staff of Yale University’s Project Open Book, and especially Paul Conway and Bob Halloran, were extremely cooperative in working with Cornell to make quality, cost, and process comparisons between the two projects. Yale staff graciously scanned microfilm and COM for Cornell, hosted the second meeting of the Technical Advisory Committee as well as several visits from Cornell staff, and provided process/data work forms and preliminary cost figures at timely intervals.

 


INTRODUCTION

Digital technology holds great promise for the world’s research libraries, for it could revolutionize how we capture, store, preserve, and access information. From the preservation perspective, digital technology offers important reformatting advantages over photocopy and microfilm, including its capability to create a higher quality reproduction of a deteriorating original, the ability to reproduce digital images over and over again with no loss of image quality, great flexibility in terms of output and distribution, and potential cost savings associated with storage and distribution. Most important, digital technology offers unprecedented opportunities for access and use, since it could facilitate the expansion of scholarship by providing timely, distributed access to a variety of sources from a variety of locations.

Although the advantages to digital technology for preservation reformatting and access enhancement are numerous, there are drawbacks as well. These center on the obsolescence associated with the rapid changes occurring in the development of hardware/software system design, a lack of experience on the part of institutions and service bureaus with digital imaging for preservation, and issues of permanency and standards. Digital technology has the potential to redefine preservation reformatting, but until the concerns associated with maintaining long-term accessibility to material stored in digital image form can be resolved, many libraries and archives are loath to initiate digital projects beyond the pilot phase. (1)

THE HYBRID APPROACH

In 1992, the Commission on Preservation and Access published a highly influential report by Don Willis, entitled A Hybrid Systems Approach to Preservation of Printed Materials. In this report, Willis argued convincingly for the creation of microfilm for preservation and digital images for access. He discussed the various options for creating both film and digital files, noting the advantages and tradeoffs associated with filming first and scanning from the film, or scanning first and creating computer output microfilm (COM) from the digital files. Willis predicted that the costs of producing both microfilm and digital images would be roughly the same in either approach, and that a hybrid system could serve as a viable preservation strategy until research institutions developed and implemented digital preservation programs. In the event that the digital master were to become unreadable, the microfilm (or COM) could be scanned to regenerate the digital copy (presumably at lower costs than the original capture process). The real issue, Willis concluded, would be determining the circumstances under which the "film first" approach or the "scan first" approach should be pursued.

THE CORNELL AND YALE STUDIES 

It may seem ironic that microfilm, which has become the principal means for preserving information endangered by the "slow fires" of acidic paper, could become an important legacy measure for coping with the "fast fires" of digital obsolescence. Nonetheless, in 1994 the National Endowment for the Humanities funded two important and complementary projects, designed to test and evaluate the interrelationship between microfilm and digital imagery.

YALE’S PROJECT OPEN BOOK

NEH supported the production phase of Yale University’s Project Open Book, a comprehensive feasibility study on the digital conversion of microfilmed library materials. In partnership with the Xerox Corporation, Yale built a networked, multi-workstation conversion system to convert 2,000 microfilmed books to digital image files (representing 430,000 images). These books, chosen from the fields of American history, Spanish history, and the history of communism, socialism, and fascism, had been microfilmed in the late 1980s according to standards adopted by the Research Libraries Group, Inc. (2) Project Open Book studied the means, costs, and benefits of such an approach. The results of that project are summarized in Paul Conway’s final project report to NEH. (3)

CORNELL'S DIGITAL TO MICROFILM CONVERSION PROJECT

Cornell conducted a two and a half year demonstration project to test and evaluate the use of high resolution bitonal (1-bit, black and white) imaging to produce computer output microfilm (COM) that could meet national preservation standards for quality and permanence. (4) In the course of the project, 1,270 volumes and accompanying targets (representing 450,000 images) were scanned and recorded onto 177 reels of film. The volumes selected for the project represented core holdings in 19th and 20th century agricultural history. All paper scanning was conducted in-house, and Cornell contracted the COM production to Image Graphics, Inc. of Shelton, Connecticut. With the assistance of a Technical Advisory Committee of outside experts (see acknowledgments), the project led to an assessment of quality, process, and costs, and to the development of recommendations for the creation and inspection of preservation quality microfilm produced from digital imagery.

Both Cornell and Yale recognized the significance and complementary nature of each other's projects. The projects had in common:

  • Relying on high quality microfilm as the preservation master
  • Creating approximately the same number of high resolution (600 dpi, 1-bit) digital images from similar collections of 19th and 20th century brittle books.
  • Developing in-house scanning capabilities
  • Using the same basic technology for indexing and file management
  • Collecting and sharing data on costs, production, and quality in a manner that invited comparative analysis

These two projects benefit the larger preservation community as it seeks to understand the circumstances under which scanning first or filming first is most appropriate in achieving the twin goals of preservation and enhanced access through the use of digital technology.

SUMMARY FINDINGS OF THE CORNELL PROJECT

The following findings and recommendations have been reached as a result of the project:

QUALITY

  1. Computer Output Microfilm (COM) created from 600 dpi 1-bit images scanned from brittle books can meet or exceed ANSI/AIIM microfilm standards for image quality and permanence.
  2. No detectable loss of resolution was observed in recording the digital images onto COM.
  3. The quality of digital images created at the same resolution and bit depth will be superior when brittle books are scanned directly from paper rather than from microfilm copies.
  4. COM can be scanned to re-produce high quality digital images in the event that the original digital files become unreadable.

Recommendation:
Standards for COM production and inspection must be developed and adhered to by institutions and service bureaus alike. The Technical Advisory Committee to the project recommends that quality standards for digital imaging of paper source documents be developed and that modifications be made to the standard microfilm quality control practices for evaluating density, resolution, reduction ratios, targets, film size, and bibliographic completeness of COM.

COST

  1. In a hybrid program to create both microfilm and digital images, the costs associated with the scan first approach appear to be less than those incurred in the film first approach. If extant film is scanned, as was the case in Project Open Book, then the costs favor the film first approach. If only digital images are to be produced, the costs of scanning from paper versus film are comparable. However, the cost figures produced by both Cornell and Yale reflect the nature of demonstration projects rather than full production processes.
  2. Creating in-house scanning services may not be as cost-effective as outsourcing the work, provided clear guidelines are developed to ensure compliance with quality (and pricing) requirements suitable to library and archival applications.

Recommendation:
The findings from the Cornell COM project represent a financial benchmark against which to measure costs associated with developing and maintaining a digital archiving program.

PROCESS

  1. The film first and scan first approaches are both viable solutions in a hybrid program. The decision to go with one approach over the other will depend on a range of variables associated with the attributes of the originals, institutional capabilities, and the availability of appropriate imaging products and services.
  2. The Cornell/Yale projects evaluated the use of high resolution bitonal scanning to produce digital versions of brittle books. Further investigation into the quality, processes, and costs associated with grayscale and color scanning should be conducted.
  3. Cornell adopted a "reasonableness" standard for determining the copyright status of twentieth century brittle books to be included in a hybrid approach.

Recommendation:
Cornell and Yale recommend that the National Endowment for the Humanities support a high level conference to assess the findings of their projects; to make recommendations for best practices on the creation and use of conventional microfilm and COM in a hybrid approach; to consult with vendors of imaging services and products in adopting these practices; to identify areas needing additional research and development; and to evaluate the role of the hybrid approach in broader digital preservation efforts.

BACKGROUND ON THE CORNELL DIGITAL TO MICROFILM PROJECT

Since 1990, Cornell University has advocated the use of 600 dpi 1-bit scanning to capture the informational content of 19th and 20th century brittle books. This position is based on a digital Quality Index approach to benchmarking resolution requirements; on an extensive assessment of common printer’s type sizes used by publishers from 1850-1950; and on visual inspection of digital facsimiles produced from over 100 different type fonts (including Roman and non-Roman script) used during this period. Until the mid-twentieth century, commercial books were produced using metal type, which had a tendency to spread with large print runs, so printers were limited in how small or closely spaced letters could be. All common typefaces used during this period were produced at 5 or 6 point type and above. Six hundred dpi 1-bit scanning adequately captures the fine detail, elaborate serifed script, italics, and small body heights that characterize these fonts. Cornell, therefore, concluded that 600 dpi 1-bit scanning was sufficient to capture fully the textual monochrome information contained in virtually all books published during the period of paper’s greatest brittleness. (5) These findings have been confirmed through quality inspection of over a million pages scanned in-house in the Preservation Department.
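
The Quality Index benchmarking mentioned above can be illustrated with a short calculation. The sketch below assumes the bitonal formula from Cornell's published benchmarking work, QI = (dpi x 0.039 x h) / 3, where h is the height in millimeters of the smallest significant character. The formula is not restated in this report, and the 1 mm character height and the function names are illustrative assumptions only.

```python
# Assumed bitonal Quality Index formula: QI = (dpi * 0.039 * h) / 3
# 0.039 is the approximate millimeter-to-inch conversion used in the formula.
MM_TO_INCH = 0.039


def bitonal_quality_index(dpi, char_height_mm):
    """Estimate the Quality Index for 1-bit scanning (assumed formula)."""
    return dpi * MM_TO_INCH * char_height_mm / 3


def required_dpi(target_qi, char_height_mm):
    """Invert the assumed formula: resolution needed to reach a target QI."""
    return 3 * target_qi / (MM_TO_INCH * char_height_mm)


if __name__ == "__main__":
    # Hypothetical smallest character height of 1 mm, chosen for illustration.
    print(round(bitonal_quality_index(600, 1.0), 1))  # 7.8
    print(round(required_dpi(8.0, 1.0)))              # 615
```

Under these assumptions, 600 dpi 1-bit scanning of a 1 mm character yields a Quality Index of about 7.8, close to the rating of 8 generally treated as high quality.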

In 1993, Cornell conducted a preliminary test to record digital files for one brittle book onto computer output microfilm. This test led Cornell to conclude that COM could be produced from these digital files to meet ANSI/AIIM standards for image quality. With funding from NEH the following year, Cornell sought to evaluate the feasibility of producing preservation quality COM for a significant volume of brittle books.

A Request for Proposal (RFP) for the COM production was developed and distributed to 27 service bureaus by June 1994. (6) Of those, 14 expressed an interest in the project. Each was asked to prepare a sample roll of film from Cornell-produced digital image files for 5 books representing the range of material to be converted in this project. A number of the vendors could meet all requirements except the need to produce film in 35mm format. Most companies produce COM on 16mm or 105mm film. Others were able to record onto 35mm film, but could not handle the 600 dpi image files or the small reduction ratios. (7)

In the end, the number of vendors who could actually perform the work as specified was very small, and only one company submitted a response that the Technical Advisory Committee would approve. In August 1994, this committee of nationally recognized experts in the fields of imaging science, micrographics, and standards development met at Cornell. They reviewed the responses to the RFP, inspected sample COM and affirmed its potential viability, and selected a vendor based on the overall quality of the proposal, technical capabilities, quality control measures, price, and consumer/vendor relations. Cornell awarded the COM production contract to Image Graphics Incorporated of Shelton, Connecticut (http://www.igraph.com/) in September 1994.

Mann Library staff members prepared the 1,270 books chosen for this project and also assumed responsibility for local inspection of the COM. (See Appendix I for staff procedures for book preparation and COM inspection.) The actual scanning of the brittle books occurred in-house, in the Department of Preservation and Conservation, using the Xerox Document on Demand R2.x system (XDOD), though prior configurations of the prototype Xerox CLASS system were used in the early stages of the project. The pages were captured as a collection of TIFF files (with accompanying targets), compressed prior to storage using CCITT Group IV compression, then paginated and structured using the XDOD software.

Following scanning, all images were sent to a Xerox Docutech printer to create printouts that supported two functions: quality control of the scanning and creation of facsimile volumes to replace the embrittled, disbound originals. Once the digital files had passed inspection, the XDOD images and accompanying metadata were exported from the proprietary Xerox database structure to an open Cornell Digital Library format. The Xerox structure is based on the Open Document Architecture (ODA) standard, which is not in common use in the United States. Copies of the book images and accompanying targets were reel programmed, using a Cornell-designed "tape generation program," created to accommodate the requirements of the COM recorder used at Image Graphics. The files were then quality checked and read out to Exabyte tapes for shipping to Image Graphics. (See Appendix II for information on the reel programming.)

A relatively new division of Image Graphics, the Image Graphics Recording Center, carried out the COM conversion services. Using the Micrographics EBR System 3000 electron beam recorder manufactured by Image Graphics, the Center recorded the digital images directly from 8 mm Exabyte tapes to 35mm Kodak Image Link HQ microfilm. Electron beam recorders tend to offer better resolution, speed, and dynamic range than other COM recorders (utilizing laser and CRT technology). The electron beam produces a smaller spot (4-6 microns) and is capable of supporting higher resolutions. The equipment also has fewer moving parts and greater flexibility (e.g., multiple film formats). The EBR software controls the reduction ratio, density, image placement, and required spacing (between images, frames, and volumes) on each roll of film. Although the EBR is capable of processing digital data up to 1000 dpi at 24x, the specifications for Cornell were set to 600 dpi, with variable reduction ratios ranging from 6x to 10x. With these EBR settings, Image Graphics reported recording speeds of less than 4 seconds per page.  
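
The relationship between scanning resolution, reduction ratio, and the size of one scanned pixel on film follows from simple geometry. The sketch below is an illustration only, not a calculation reported by the project: it divides the pixel size on the original (25,400 microns per inch divided by the scanning resolution) by the reduction ratio. The function name is hypothetical.

```python
MICRONS_PER_INCH = 25_400


def pixel_pitch_on_film_um(scan_dpi, reduction_ratio):
    """Size in microns that one scanned pixel occupies on film."""
    pitch_on_paper_um = MICRONS_PER_INCH / scan_dpi
    return pitch_on_paper_um / reduction_ratio


if __name__ == "__main__":
    # Project settings quoted above: 600 dpi, reduction ratios of 6x to 10x.
    for ratio in (6, 10):
        pitch = pixel_pitch_on_film_um(600, ratio)
        print(f"{ratio}x: {pitch:.1f} microns per pixel on film")
```

At 600 dpi, one pixel occupies roughly 7.1 microns on film at 6x and 4.2 microns at 10x, the same order of magnitude as the 4-6 micron spot size cited above.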

The Center also assumed responsibility for the initial quality control inspection on a PEPCO MFI-Type R Microfiche Inspector for roll film; conducted density readings using a TD 504, 2mm densitometer; and submitted sample film for third party testing for the presence of residual thiosulfate (methylene blue tests). Cornell conducted a visual inspection of all microfilm on a light box under magnification, took resolution readings and density checks, and returned to IGRC any film that failed to pass inspection. (See Appendix III for information on Image Graphics and Micrographics EBR System 3000.)

The scanning of 450,000 images was completed in June 1996, and the last of the COM produced by August 1996. The members of the Technical Advisory Committee (TAC) met for a second time in September at Yale University where they were able to compare and contrast the findings from the Cornell and Yale projects. The TAC reached a number of findings and recommendations at this meeting.

FINDINGS AND RECOMMENDATIONS: QUALITY 

QUALITY FINDING NO. 1: 35MM COMPUTER OUTPUT MICROFILM CREATED FROM 600 DPI 1-BIT IMAGES SCANNED FROM BRITTLE BOOKS CAN MEET OR EXCEED ANSI/AIIM STANDARDS FOR IMAGE QUALITY AND PERMANENCE.

The ANSI/AIIM standards cover a wide range of issues, including: the preparation of documents; composition of the film stock; quality of image capture as defined by reduction ratio, image placement, resolution, and density; film processing; and storage. Although there were a number of technical and procedural problems encountered in the process, Cornell’s inspection of the COM revealed a body of film that was of high quality, both in its overall consistency and faithful representation of text, line art, and halftones. The COM compared favorably with, and in some cases exceeded, the quality of film produced via traditional, high contrast processes, particularly in the rendering of halftones. Throughout the course of this project, 18 of 177 reels (10%) failed to pass inspection. The majority of problems were encountered in the first 50 reels, and once these problems were resolved, the reject rate was remarkably low (4%). All reels eventually passed inspection as the functional equivalents to standard preservation microfilm.

  • Film stock, processing, and associated packaging: These all met ANSI/AIIM standards for permanence. The 35mm film stock used was Kodak Image Link HQ that is widely used in preservation microfilming; all reels passed third party inspection for residual thiosulfate concentration; and appropriate reels, fasteners, and boxes were used to store the film. The completed COM is stored under controlled environmental conditions in the Research Libraries Group vault at National Underground Storage in Boyers, Pa.
     
  • Film condition: The microfilm was uniformly free of scratches, fingerprints, dust, and processing blemishes. A background "weave pattern" was evident under high magnification, but the Technical Advisory Committee did not judge it to impede the quality of the COM. The weave pattern is not inherent to high resolution COM recording, and new recorders no longer produce this artifact.
  • Density: Readings were highly consistent, with minimal variations within and between each volume. Density is primarily set at the point of scanning, not in film recording, so Image Graphics took only three maximum density readings and one minimum density reading per reel. Cornell measured minimum and maximum density according to specifications described in the RLG Preservation Microfilming Handbook. RLG standards permit a minimum density of no greater than .10. The minimum density values for all reels fell well within specifications, ranging from .02 to .04. Background densities ranged from .90 to 1.06, well within the acceptable range of .90 to 1.10 for medium contrast books. Density variation within titles ranged from .00 to .04, and between titles from .01 to .06, far below the maximum acceptable variation of .20. Because the control of lighting is not a factor in the recording of digital images onto COM, the TAC recommended that a uniform illumination target not be required.
     
  • Spacing and Placement: The images were recorded two images per frame in the cine (IIA) position. Spacing between images and between frames was uniform and consistent, although initially the COM recorder’s film transport improperly advanced the film during recording, causing varying frame sizes and spacing (and in one or two cases, image overlaps) in the first fifty reels of film. Once the problem was corrected, the film spacing remained extremely consistent. There was no problem associated with skew that was attributable to the COM recording.
     
  • Reduction Ratio: At the beginning of the project, the Technical Advisory Committee approved the use of variable reduction ratios to "fill the frame" for each book. (8) This enabled Image Graphics to use the smallest reduction ratio possible, thus ensuring that the highest possible resolution was recorded on film, and to produce an extremely uniform product that potentially would facilitate scanning back from COM if the original digital files ever became unreadable.

    There was general agreement among the members of the Committee that while the use of variable reduction ratios did not lower image quality, the dimensions of the original documents should be recorded on a film target in order to reproduce paper facsimiles at the same physical dimensions as the original volume. Given that file size information for each image was recorded in the TIFF header, and that all scanning was at 600 dpi resolution, a target noting the pixel dimensions (e.g., 2,400 x 3,600) and resolution could be generated automatically from the TIFF header by the program for reel composition (see Appendix II). With this information, one could then calculate the original page width by dividing the first pixel dimension by 600, e.g., the original page width for a 2,400 x 3,600 pixel image would be 4" (2,400 divided by 600 equals 4) and the length could be similarly calculated. COM recording at fixed reduction ratios is also possible, and is being used by Image Graphics in a contract with the Virginia State Archives.
     
  • Bibliographic integrity: The bibliographic integrity of the COM produced in this project was extremely high. COM was reviewed on a light box under 10x magnification. Cornell staff inspected for the proper placement of targets relative to their respective volumes, checked at intervals for correct pagination and completeness, and evaluated the quality of all illustrations. The staff found no missing frames, but in one case observed that the prime bibliographic target had been inverted. Only two reels were rerun, because of missing portions of text and the misidentification of two books (for Reel 90, the pages for title 6 were written to film as those for title 5 and vice versa).

    Staff compared the COM to paper facsimiles printed from the same digital images, in an effort to determine whether the information in the digital file had been accurately transferred to microfilm. In the production of COM, Cornell found an array of problems not encountered when inspecting microfilm produced by a light lens camera. Some errors were caused by the COM recorder's failure to transfer varying amounts of data to the microfilm. In the most extreme case, an entire page was omitted in the COM. In less extreme cases, the text at the edge of a page was "clipped," indicating that some portion of the pixels constituting the text was omitted from the COM, but was present on the paper facsimile. Cornell rejected any reel containing text that was not readable. Clipped text occurred frequently at the beginning of the project but was an intermittent problem in less than 10% of all subsequent reels.

    An ancillary problem was "line drop-out," which occurred when the COM recorder failed to transfer individual rows of pixels to the COM. Being extremely thin, line drop-outs were virtually undetectable on pages of text, but were clearly visible in halftones and in some line drawings. Cornell rejected any reel with an illustration containing more than ten line drop-outs. Line drop-out was not evident after the first fifty reels.
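
Two of the checks described in the list above lend themselves to a short sketch: recovering the original page dimensions from the pixel dimensions recorded in the TIFF header, and comparing density readings with the tolerances quoted under Density. This is a minimal illustration, not the inspection software used in the project. The function names and the way readings are grouped are assumptions; the numeric limits are the ones quoted above.

```python
SCAN_DPI = 600  # all project scanning was done at 600 dpi

# Tolerances quoted in the Density discussion (medium contrast books).
MAX_MINIMUM_DENSITY = 0.10
BACKGROUND_DENSITY_RANGE = (0.90, 1.10)
MAX_DENSITY_VARIATION = 0.20


def original_page_size_inches(width_px, height_px, dpi=SCAN_DPI):
    """Return (width, height) of the scanned page in inches."""
    return width_px / dpi, height_px / dpi


def density_problems(minimum_density, background_readings):
    """Return a list of tolerance violations (empty if all readings pass)."""
    problems = []
    if minimum_density > MAX_MINIMUM_DENSITY:
        problems.append(f"minimum density {minimum_density:.2f} is too high")
    low, high = BACKGROUND_DENSITY_RANGE
    for reading in background_readings:
        if not low <= reading <= high:
            problems.append(f"background density {reading:.2f} is out of range")
    if background_readings:
        spread = max(background_readings) - min(background_readings)
        if spread > MAX_DENSITY_VARIATION:
            problems.append(f"density variation {spread:.2f} is too large")
    return problems


if __name__ == "__main__":
    # The worked example from the text: a 2,400 x 3,600 pixel image.
    print(original_page_size_inches(2400, 3600))       # (4.0, 6.0)
    # Hypothetical readings within the ranges reported for the project.
    print(density_problems(0.03, [0.90, 0.98, 1.06]))  # []
```

A reel programming step could write the computed dimensions to a film target, and an empty list from the density check would indicate that a set of readings falls within the quoted tolerances.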

QUALITY FINDING NO. 2: NO DETECTABLE LOSS OF RESOLUTION WAS OBSERVED IN RECORDING THE DIGITAL IMAGES ONTO COM.

Having determined that 600 dpi bitonal scanning could produce digital files that faithfully rendered all textual information contained in brittle books, Cornell was interested in determining whether there was any loss of detail in recording those files onto COM. Cornell used three resolution test targets during scanning to evaluate scanner performance and recorded those targets onto the COM. The targets were the RIT Alphanumeric Test Object superimposed on the IEEE Std 167-A-1987, and AIIM Scanner Test Chart #2. The RIT target, which consists of block characters and numbers represented in two directions, was judged to be the most useful target for measuring the effective resolution achieved on the COM. (9) Cornell staff also conducted subjective evaluation of the COM rendering of the smallest lower-case "e" contained in a volume, using the ANSI/AIIM Quality Index rating for microfilm inspection. We visually inspected the COM on a light box under 100x magnification. In all cases, the images met the "high quality" standard for Quality Index (8.0) in the rendering of the smallest "e". RIT target readings on the COM ranged from line 8 through line 15 and were identical to those read on-screen during quality control of the digital images. (10)

Cornell did not discern any drop in resolution from the digital images to the microfilm copy. Given the capabilities of the COM recording device, the Electron Beam Recorder from Image Graphics, to record extremely fine resolution with excellent image acuity, virtually all of the information in the 600 dpi 1-bit images could be represented on the 35mm microfilm at reduction ratios between 5x and 10x. This is in sharp contrast to other forms of copying, where one can expect image degradation of 10% or greater when reformatting from one medium to the next or one generation to the next.  

To measure the COM's ability to duplicate halftones, we inspected the AIIM target with a 10x loupe to determine the number of distinguishable halftone wedges, and recorded the appropriate rating. Staff noted wide variation on the AIIM target readings, from 110 to 100, 85, 65, and even 0. Investigating the cause of these inconsistencies, staff inspected the paper facsimiles corresponding to the titles with low AIIM readings. It was discovered that those titles had consistently low levels of contrast between the text and background, with light text or darkened paper or both. Staff also inspected the quality of the halftones in those facsimiles, and closely examined detail in the same halftones appearing on the COM. It appeared that the illustrations in the facsimiles and the COM were represented as well or better than one would expect from a traditional camera producing high contrast preservation microfilm. Cornell concluded that a consistent AIIM target reading was not an effective measure of the COM's ability to faithfully duplicate halftones when the scanner had been set to optimize capture for volumes with low contrast. Cornell recommends further inquiry into ways to measure the ability of a COM recorder to accurately duplicate halftones. One possibility would be to define a distinct range of acceptable AIIM target readings for high-contrast, medium-contrast, and low-contrast material. 

QUALITY FINDING NO. 3: THE QUALITY OF DIGITAL IMAGES CREATED AT THE SAME RESOLUTION AND BIT DEPTH WILL BE SUPERIOR WHEN BRITTLE BOOKS ARE SCANNED FROM PAPER RATHER THAN FROM MICROFILM COPIES. 

With the assistance of Yale University, Cornell conducted a comparison of the quality of digital files produced from scanning paper versus film in bitonal mode. Challenge Industries, an Ithaca-based firm that produces preservation quality 35mm microfilm, filmed the same five books that had been scanned as the sample test for COM production in the RFP process. This roll of conventional negative film was sent to Yale, which produced a 2N copy and scanned both versions using the Mekel M400 scanner. The TIFF files created in this film-to-digital process were saved to optical disk, and printouts from these digital files, as well as those scanned directly from the paper originals, were produced at Cornell on the Xerox Docutech at 600 dpi resolution.

Staff conducted an on-screen, side-by-side evaluation of the two digital files at full pixel display (100%) and compared the images to the original books. They also compared the printouts. Finally, the two sets of images were processed through the Xerox TextBridge 2.0 Optical Character Recognition (OCR) program. The OCR process resulted in 100% and 99.3% text accuracy for the paper-scanned versions of the two pages. The microfilm-scanned versions resulted in text accuracy rates of 98.7% and 99% respectively for the same two pages. These differences are slight, but if one were interested in creating text files from digital images, they suggest that direct scanning from paper may prove more accurate and less expensive than OCRing digital images scanned from microfilm. 

Figures 1 through 4 reflect the difference in the quality of recording text, fine detail, and halftone information from paper and film. The difference in the presentation of text-based information is most evident in the "thickening" of stroke widths on images scanned from the film. Some smaller hand-produced characters, measuring 0.6mm and 0.4mm, are inadequately rendered in digital images from film, while faithfully represented in the digital version scanned from paper (Figures 1 and 2). The most obvious difference is seen in the reproduction of the halftones (Figures 3 and 4). High contrast microfilm and current bitonal film scanners cannot do justice to many halftones, as the following illustrations indicate. The version created directly from paper, however, retains much of the detail present in the original book. In this case, the halftone had received special "windowing" and processing on the XDOD scanner utilizing a descreening and rescreening filter to capture the halftone while treating the text portion of the page as text/lineart. (11) No such enhancement capabilities exist for conventional high contrast microfilming or are yet available in film scanners.

Yale’s study confirmed these findings. "Scanning from the original, if permitted by the condition of the original and its size," Paul Conway noted in the final report, "will almost always produce better quality results than scanning from a microfilm intermediary." (12) Yale’s goal in Project Open Book was to aim for legibility and to determine what quality could be produced in a production environment where costs could be minimized. Cornell’s aim was slightly different: to create digitized images of sufficient quality in order to produce COM that could serve as the functional equivalent of preservation microfilm.

QUALITY FINDING NO. 4: COM CAN BE SCANNED TO REPRODUCE HIGH QUALITY DIGITAL IMAGES IN THE EVENT THAT THE ORIGINAL DIGITAL FILES BECOME UNREADABLE.  

Staff at Yale University graciously scanned samples of the COM using the Mekel M400 film scanner. The quality of the digital images converted from COM was comparable to the quality of digital images produced from conventional microfilm, especially for printed text and line art, although there were aliasing and moire patterns introduced in the reproduction of some halftone information. Conventional microfilm suffers from similar problems when scanned in bitonal mode. (13) As noted above, bitonal film scanners currently do not have the same enhancement capabilities that are available on flatbed scanners, especially the XDOD scanner used in the COM project, although work in this area is underway. (14) An alternative approach may be to use grayscale film scanning for microfilm or COM that contains significant halftone, photographic, or fine lineart information. Preliminary tests utilizing grayscale film scanning offer promising results but the resulting file sizes and costs are likely to be significantly higher in the near term than those incurred with bitonal scanning (see process section). Given its consistent image size, placement, frame spacing, and density, the use of COM can expedite the rescanning process, which suggests that film quality and consistency may have a great impact on the costs of conversion to create suitable digital images from film.  (15)

RECOMMENDATION: STANDARDS FOR COM PRODUCTION AND INSPECTION MUST BE DEVELOPED AND ADHERED TO BY INSTITUTIONS AND SERVICE BUREAUS ALIKE. THE TECHNICAL ADVISORY COMMITTEE TO THE PROJECT RECOMMENDS THAT QUALITY STANDARDS FOR DIGITAL IMAGING OF PAPER SOURCE DOCUMENTS BE DEVELOPED, AND THAT MODIFICATIONS BE MADE TO THE STANDARD MICROFILM QUALITY CONTROL PRACTICES FOR EVALUATING DENSITY, RESOLUTION, REDUCTION RATIOS, TARGETS, FILM SIZE, AND BIBLIOGRAPHIC COMPLETENESS OF COM.

Although COM can meet preservation microfilm standards, procedures for production and inspection of the COM will differ from those appropriate to conventional microfilm. Significant changes in film creation and quality control are introduced in COM recording. Images are generated digitally, not photographically, and factors affecting image quality, such as resolution and density, are determined upstream, at the point of scanning, and not at the point of filming. This has significant ramifications for final film inspection.

The quality of the resulting COM will in large measure be determined by the quality of the initial scanning, not the film recording. It is imperative, therefore, that adequate settings (e.g., resolution and bit depth) be established and used to capture fully the significant information contained in the source documents, and that a rigorous scanning quality control process be instituted, with visual inspection occurring both on-screen and via printouts from the digital images. Although there are currently no formal standards governing quality for digital imaging, work on this front is occurring. The Research Libraries Group, Inc. has established a working group on digital image capture requirements, and a report on their findings is expected by July 1998. A number of institutions and organizations have produced internal guidelines for digital image capture. (16) These efforts should be assessed by the broader preservation community so as to develop quality standards for digital imaging of paper source documents. Rigorous quality control processes should also be established. The Association for Information and Image Management has published guidelines for quality control of image scanners, which include information on the use of targets, and Cornell’s Department of Preservation has published recommendations on verifying image quality.  (17)

In reviewing the findings on image quality and COM inspection from this project, the Technical Advisory Committee recommends that the following modifications be made to the technical and bibliographic inspection procedures for preservation microfilming, as defined in the RLG Preservation Microfilming Manual: (18)

Density:

  • Given that all density readings fell within acceptable range and that there was very little density variance within and between titles, the Technical Advisory Committee recommends that fewer background density readings should be required for COM than conventional microfilm. RLG guidelines specify 3 background density readings per title, or 2 readings with volumes of less than 50 pages, and a minimum of 8 readings per reel. For computer output microfilm such as that produced by IGI for this project, the TAC suggests that the home institution take one reading per title. Over time, this requirement could be even further reduced.
  • COM service providers should take three Dmax readings per reel and one Dmin reading per reel.

Resolution:

  • The same resolution (and bit depth) used to create the digital images should be used to produce the COM. The achieved resolution on film can be evaluated by taking on-screen readings of resolution test targets scanned at the same settings as the source documents and comparing them to readings taken from those same targets recorded on COM. Detail capture should also be evaluated subjectively by examining the smallest significant lower case letter contained in a document recorded on the COM. The appearance of halftones and fine line drawings should also be evaluated for detail capture and the introduction of moire and other evidence of aliasing. The home institution should conduct resolution assessments of COM on a light box under 50-100x magnification.
  • No resolution readings are required by the COM service provider. 

Reduction Ratio:

  • Use of variable (and non-standard) reduction ratios is acceptable, provided that information regarding physical page dimensions, resolution, bit-depth, resulting pixel dimensions, and recording space on film (e.g., 15mm) are included on a technical target. If a standard reduction ratio is used, that ratio must be conveyed on a technical target, according to RLG guidelines.
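The relationship among the values listed for the technical target can be illustrated with a short calculation. The following is a minimal sketch only: the function names and the 6 x 9 inch page are hypothetical, and it assumes that the "recording space on film" is measured along the same axis as the page dimension it is compared with.

```python
MM_PER_INCH = 25.4


def pixel_dimensions(width_in, height_in, dpi):
    """Pixel dimensions of a page scanned at the given resolution."""
    return round(width_in * dpi), round(height_in * dpi)


def reduction_ratio(page_dimension_mm, recorded_dimension_mm):
    """Ratio of the original page dimension to its recorded size on film."""
    return page_dimension_mm / recorded_dimension_mm


if __name__ == "__main__":
    # Hypothetical 6 x 9 inch page scanned at 600 dpi, 1-bit.
    width_px, height_px = pixel_dimensions(6, 9, 600)
    print(f"Pixel dimensions: {width_px} x {height_px}")

    # Page width recorded in 15mm of film (assumed axis, see lead-in).
    ratio = reduction_ratio(6 * MM_PER_INCH, 15)
    print(f"Reduction ratio: {ratio:.2f}x")
```

Recording these inputs on the target, rather than the ratio alone, allows the ratio to be recomputed for any page on the reel whose dimensions vary from the norm.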

Film Size:

  • The preservation community should reevaluate the exclusive use of 35mm microfilm for preservation purposes. More commercial options for COM recording (and film scanning) exist with 16mm film and 105mm fiche than with 35mm film.

Bibliographic Characteristics:

  • COM service providers should perform full frame-by-frame inspection to ensure bibliographic completeness.
  • Provided that the scanned files are fully inspected for completeness, ordering, and image quality, and that the COM service provider performs frame-by-frame inspection, the TAC recommends that bibliographic inspection of the COM by the home institution could be done on a random, 10% sampling of each reel.
  • Technical targets should include information on the scanning process used (e.g., resolution, bit depth, use of enhancements, file formats, type and level of compression) as well as information regarding essential document characteristics, such as physical page dimensions of the original (including all variations from that size, including foldouts, reduced photocopy versions of oversize items), level of detail, tonal reproduction, and presence of color.
  • Include as a target the collation form containing the document control structure information to aid in recreating pagination and indexing if the COM needs to be scanned to recreate digital files. (19) See Appendix I for forms and target sequence used in the COM project.
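The random 10% sampling recommended above for bibliographic inspection by the home institution could be drawn as follows. This is a hedged sketch, not a procedure used in the project: it assumes the sample is taken over frames, numbered from 1 on each reel, and the function name and the 2,400-frame reel are hypothetical.

```python
import random


def sample_frames(frame_count, percent=10, seed=None):
    """Return a sorted random sample of frame numbers (1-based) from one reel.

    The sample size is the given percentage of the reel, rounded up,
    with at least one frame always selected.
    """
    rng = random.Random(seed)
    sample_size = max(1, -(-frame_count * percent // 100))  # integer ceiling
    return sorted(rng.sample(range(1, frame_count + 1), sample_size))


if __name__ == "__main__":
    frames = sample_frames(2400)  # hypothetical reel of 2,400 frames
    print(f"Inspect {len(frames)} frames, starting with {frames[:5]}")
```

Passing a fixed seed makes the selection repeatable, which may be useful if the list of inspected frames is to be recorded with the reel.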

FINDINGS AND RECOMMENDATIONS: COST

COST FINDING NO. 1: IN A HYBRID PROGRAM TO CREATE BOTH MICROFILM AND DIGITAL IMAGES, THE COSTS ASSOCIATED WITH THE SCAN FIRST APPROACH APPEAR TO BE LESS THAN THOSE INCURRED IN THE FILM FIRST APPROACH. IF EXTANT FILM IS SCANNED, AS WAS THE CASE IN PROJECT OPEN BOOK, THEN THE COSTS FAVOR THE FILM FIRST APPROACH. IF ONLY DIGITAL IMAGES ARE TO BE PRODUCED, THE COSTS OF SCANNING FROM PAPER VERSUS FILM ARE COMPARABLE. HOWEVER, THE COST FIGURES PRODUCED BY BOTH CORNELL AND YALE REFLECT THE NATURE OF DEMONSTRATION PROJECTS RATHER THAN FULL PRODUCTION PROCESSES.

 Yale conducted a very extensive cost study, the approach and findings of which are presented in its final report to the National Endowment for the Humanities. Early in both projects, Cornell and Yale agreed to collect data on the primary sub-processes of digital conversion. Yale took the lead in establishing a cost study model, and gathered time and cost statistics for the first 600 volumes scanned from microfilm. Yale also calculated the costs of equipment purchase, lease, maintenance, and replacement.

 Cornell undertook a more modest data gathering effort in March and July 1995 during a typical production phase of the project. Data was collected in the following categories: preparation, scanning, file management, tape creation, and COM inspection. These categories roughly correspond to the categories used in the Yale Cost Study. For comparison purposes to Yale’s Project Open Book, Cornell calculated a "Yale adjusted" mean time to reflect the difference in the average size of books scanned at Yale (216 pages) and Cornell (341 pages). Cornell also calculated costs based on Yale’s combined hourly wage/benefits rate of $15.38 ($0.2563/minute).

Equipment and maintenance costs were calculated for the XDOD scanning system only even though the first 600 volumes were scanned using the prototype CLASS system. Cornell also calculated equipment costs for file management but did not calculate equipment costs associated with the production of COM, which are reflected in the per frame charges. Cornell used Yale’s method of equipment calculations for purchase, maintenance, and replacement. The equipment costs reported here and in Project Open Book document costs for a specific project conducted during a specific period and may not be generalizable to other projects or to the current costs of comparable equipment. 

The costs of COM production were based on a fixed project rate negotiated with Image Graphics, plus additional charges incurred for shipping and one-up recording of certain targets and page foldouts. The costs associated with project management, systems programming, facilities, and equipment down time due to the conversion from CLASS to XDOD scanners were not calculated, as these were seen as specific to the early ramp up phase of the project. (Yale did not record this kind of information either.) Finally, Cornell included typical costs associated with the creation of preservation quality microfilm in order to provide comparative data on an end-to-end hybrid process for a scan first versus a film first approach. Five tables are presented here that detail the findings from the Cornell cost study and offer comparisons to Yale’s findings in Project Open Book.


Table 1 presents time and cost figures associated with the labor to collate, scan, index, and prepare digital files, as well as to inspect COM (the charges for COM recording are reflected here as well). The time figures were based on the number of volumes that could be processed at the various stages during a specific period. For instance, preparation figures are based on a sample of 150 volumes; scanning on a sample of 45 volumes; reel programming on 21 reels containing 120 volumes; and COM inspection on 11 reels, containing 4-10 volumes each. (20) Table 1 indicates that the labor cost in an end-to-end process to create microfilm from digital imagery averaged around $0.30/image in this project.

 TABLE 1: CORNELL DIGITAL TO MICROFILM CONVERSION PROCESSES: TIME AND COSTS

Activity              Mean Time       High/Low   Mean Time [1]     Process Costs [2]
                      (min/vol.)                 (Yale adjusted)   Book      Image
-----------------------------------------------------------------------------------
Preparation [3]       82.6                       82.6              $21.17    $0.098
  select/collate      56.0            15/125     56.0              $14.35    $0.066
  disbind             11.1            25/2       11.1              $2.85     $0.013
  trim                8.2             38/3       8.2               $2.10     $0.010
  structure           3.5             10/2       3.5               $0.90     $0.004
  reel program        3.8             8/2.5      3.8               $0.97     $0.005
Scanning [4]          86.2 (auto)                56.1              $14.38    $0.067
                      126.2 (man.)               73.2              $18.76    $0.087
  targets             2.8                        2.8               $0.72     $0.003
  setup               9.4             25/4       9.4               $2.41     $0.011
  scan (auto)         56.0            63/52      30.7              $7.87     $0.036
  scan (man.)         96.0            194/53     47.8              $12.25    $0.057
  quality con         13.0            34/8       8.2               $2.10     $0.010
  rescans             5.0             20/0       5.0               $1.28     $0.006
Indexing [5]          8.6             30/5       8.6               $2.20     $0.010
File Management [6]   5.2                        5.2               $1.33     $0.006
Tape Creation [7]     4.6                        4.6               $1.19     $0.006
COM Production [8]    fixed rate from IGI                          $20.66    $0.096
COM Inspection [9]    9.5                        7.5               $1.92     $0.009
  density             1.4                        1.4               $0.36     $0.002
  resolution          2.7                        2.7               $0.69     $0.003
  visual check        5.4                        5.4               $0.87     $0.004
Totals
  auto                196.70                     164.60            $62.85    $0.291
  manual              236.70                     181.70            $67.23    $0.311

Table 1 Notes:

1) Mean time. The mean time for each process has been calculated, although it is worth noting that the difference between the high and low figures in each process can be significant. Mean time (Yale adjusted) was calculated based on the difference in the average number of pages/volume in the Cornell and Yale studies. Cornell volumes averaged 341 pages; Yale volumes averaged 216 pages (63.3% of the size of the Cornell volumes). When volume size was relevant, the Yale adjusted mean times were used. 

2) Process Cost. Again for comparison purposes, the translation of time spent to cost was based on the Yale combined hourly wage/benefits rate of $15.38 per hour or $0.2563 per minute. (See Conway, Appendix 7, "Cost Study Model and Principal Data.") The per image costs are calculated by dividing the per book costs by 216, the average page length of books scanned at Yale.  

3) Preparation. Cornell recorded times for activities distinctive to preparing volumes for scanning (as opposed to microfilming). Each of the volumes had to be disbound and the binder’s margin trimmed for scanning on the XDOD. The structuring/indexing information was put on a work form that accompanied each volume. Reel programming was done at this stage as well, but could have been handled by Image Graphics at the point of COM recording. See Section on Scanning Preparation. The basic preparation costs common to both microfilming and scanning (e.g., selection, retrieval, collation, target preparation, etc.) were derived from times reported by Patti McClung in her landmark study "Costs Associated with Preservation Microfilming." Library Resources & Technical Services (Oct/Dec 1986, p. 363-374).  

4) Scanning. Includes the costs for capturing technical and bibliographic targets, book set up, quality control, and two forms of scanning. The first form of scanning was done in an auto-mode in which standard settings were used to capture all pages of the volume. The second form of scanning, "manual mode," involved windowing halftone information on a page and treating it differently than the surrounding text. Only books containing halftones that were considered significant to the meaning of the text received such treatment. Mann Library staff devised guidelines for determining which books would receive such treatment (See Appendix I on Scanning Preparation).  

5) Indexing. The proprietary Xerox Documents on Demand software was utilized to "structure" and paginate each book. Cornell used two hierarchical levels of tags. The base-level tag was used to paginate a book, matching image file names with the page numbers that appeared in the original book (including Roman numerals, no pagination, duplicate page numbers). The second-level tag clusters the pages into groups, such as title page, table of contents, text, index, back matter. Appendix I contains an example of the structuring tags applied to a typical book in this project, and the list of standard terms used to refer to these units.

6) File Management. Activities associated with this step involve setting up for batch move of image files and moving the relevant targets.  

7) Tape Creation. Includes creating work form, moving images to staging area, preparing and running tape creation script and log file for quality control, and running the tape generation script. See Appendix II on Reel Programming.  

8) COM Production. Cornell held a contract with Image Graphics to record the digital images onto COM at $0.09/image. Additional costs included shipment of the completed COM and costs associated with one-up imaging of certain targets and foldouts. The per image cost therefore totaled $0.096.

9) COM Inspection. Staff recorded the time spent in performing density and resolution checks as well as the visual inspection over the light box.
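The conversions described in notes 1 and 2 can be sketched in a few lines of Python. This is an illustrative sketch rather than a tool used in the project; the function names are hypothetical, while the wage rate and page counts are those stated in the notes. It reproduces the Preparation row of Table 1 and the Yale-adjusted quality control time.

```python
# Rates and page counts taken from Table 1 notes 1 and 2.
WAGE_PER_MINUTE = 0.2563   # Yale combined wage/benefits rate, dollars per minute
YALE_PAGES = 216           # average pages per volume at Yale
CORNELL_PAGES = 341        # average pages per volume at Cornell


def yale_adjusted_minutes(cornell_minutes):
    """Scale a page-dependent Cornell time to a 216-page volume."""
    return cornell_minutes * YALE_PAGES / CORNELL_PAGES


def labor_cost(minutes):
    """Return (cost per book, cost per image) for a Yale-adjusted time."""
    per_book = minutes * WAGE_PER_MINUTE
    per_image = per_book / YALE_PAGES
    return per_book, per_image


if __name__ == "__main__":
    per_book, per_image = labor_cost(82.6)  # preparation, minutes per volume
    print(f"Preparation: ${per_book:.2f} per book, ${per_image:.3f} per image")
    print(f"Quality control, Yale adjusted: {yale_adjusted_minutes(13.0):.1f} min")
```

Only times that depend on volume size are scaled; per note 1, activities such as preparation carry the same mean time in both columns.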

Table 2 details the costs of hardware, software, maintenance, network connection, and optical storage media for the paper scanning and reel programming to create COM. Costs associated with equipment replacement are based on the Yale model that assumes an additional 50 percent replacement surcharge for increased functionality. The equipment costs are then calculated on a per book and per image basis, using the adjusted Yale figures for a 216 page book. The costs of COM recording equipment are subsumed under the per frame charge, reported in Table 1. The bottom line is that the average per image cost for equipment ranges from 6.6 cents to 8.1 cents.

TABLE 2: CORNELL ANNUAL EQUIPMENT COSTS

                              Total     Per Book [1]        Per Image [2]
                                        Auto      Manual    Auto      Manual
----------------------------------------------------------------------------
XDOD Scanning System [3]      $12,699   $11.63    $14.70    $0.054    $0.068
  Purchase (20%)              $5,386    $4.93     $6.23     $0.023    $0.029
  Maintenance                 $4,620    $4.23     $5.35     $0.020    $0.025
  Replacement [4]             $2,693    $2.47     $3.12     $0.011    $0.014
File Mgt, Tape Creation [5]   $5,500    $0.57     $0.57     $0.003    $0.003
  Purchase (20%)              $3,000    $0.31     $0.31     --        --
  Maintenance                 $1,000    $0.10     $0.10     --        --
  Replacement                 $1,500    $0.15     $0.15     --        --
Optical Media [6]                       $2.00     $2.00     $0.009    $0.009
Ethernet [7]                  $108      $0.10     $0.13     --        --
Totals                        $18,307   $14.30    $17.40    $0.066    $0.081

Table 2 Notes:

1. Per book costs are based upon estimates of scanning production times for a year, for a single daily shift. A scanning technician works 215 days/year x 7.3 hours/day = 1,570 hours/year. The actual scanning production time is estimated at 75% of capacity, allowing for some down time, meetings, phone calls, etc. We assumed, therefore, full production capacity to be 1177.5 hours/year. A book scanned and indexed in auto mode takes 1.078 hours (64.7 minutes), so 1,092 books/year can be scanned in auto mode. A book scanned and indexed in manual mode takes 1.36 hours (81.8 minutes), so 864 books/year can be scanned in manual mode.

 2. Per image costs are calculated by dividing the per book costs by 216, the average number of pages in books scanned at Yale.

 3. Equipment. Cornell used Yale’s methods for calculating equipment costs. A system is assumed to have a five-year life, so the annual equipment cost is one-fifth the purchase price. The XDOD system, including scanner, optical storage subsystem, computer, monitor, and software came to $26,930, with a partnership discount of $10,000. The scanning equipment costs would be 37% higher without this discount. Annual charges for maintenance agreements reflect actual costs of the Xerox contract covering the XDOD scanning system.

 4. Hardware replacement calculated at fifty percent of annual equipment cost. Assumes decline in costs for similar functionality upon replacement. 

5. File management and tape creation was handled on a Sun Sparc workstation, which cost $15,000 with peripherals. If the system operates 1,570 hours/year, and each volume takes 9.8 minutes (or .163 hours) for file management and tape creation, then the equipment cost is the annual equipment cost ($3,000) divided by number of hours per year (1,570) times the amount of time per volume for file management and tape creation (.163), which equals $0.31 per book. Maintenance and replacement costs are similarly calculated.

 6. Optical media. Cornell uses 600 MB Magneto Optical Disks, which cost $80 each. Each volume requires around 15 MB, for a total of $2/book.

 7. Ethernet connection. The monthly charge for a line is $9, for a total of $108/year.
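The arithmetic in notes 1 through 6 can be checked with a short script. The sketch below is offered for illustration only, under the assumptions stated in the notes (a five-year equipment life, a 50% replacement charge, and 75% productive time); the function names are hypothetical and are not part of the Yale cost model itself.

```python
SHIFT_HOURS_PER_YEAR = 1570   # 215 days x 7.3 hours/day, as rounded in note 1
PRODUCTIVE_SHARE = 0.75       # share of the shift spent scanning


def books_per_year(minutes_per_book):
    """Books one technician can scan and index in a year (note 1)."""
    productive_minutes = SHIFT_HOURS_PER_YEAR * PRODUCTIVE_SHARE * 60
    return round(productive_minutes / minutes_per_book)


def annual_system_cost(purchase_price, maintenance, life_years=5,
                       replacement_share=0.5):
    """Annual purchase share + maintenance + replacement (notes 3 and 4)."""
    purchase = purchase_price / life_years
    return purchase + maintenance + purchase * replacement_share


def shared_cost_per_book(annual_cost, hours_per_book):
    """Cost of equipment charged by the hours a volume occupies it (note 5)."""
    return annual_cost / SHIFT_HOURS_PER_YEAR * hours_per_book


def media_cost_per_book(disk_price, disk_mb, mb_per_book):
    """Optical media cost attributable to one volume (note 6)."""
    return disk_price * mb_per_book / disk_mb


if __name__ == "__main__":
    xdod_annual = annual_system_cost(26_930, 4_620)
    for mode, minutes in (("auto", 64.7), ("manual", 81.8)):
        books = books_per_year(minutes)
        print(f"XDOD, {mode}: {books} books/year, "
              f"${xdod_annual / books:.2f} per book")
    print(f"Workstation purchase: ${shared_cost_per_book(3_000, 0.163):.2f} per book")
    print(f"Optical media: ${media_cost_per_book(80, 600, 15):.2f} per book")
```

Dividing any per book figure by 216 gives the corresponding per image figure, as described in note 2.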

 

Table 3 compares the labor and equipment costs reported in the Cornell and Yale projects to produce digital images. The costs to create either the microfilm or the COM are not included here. The Cornell/Yale findings indicate that the costs to produce digital images from either microfilm or paper are comparable. This table also indicates that there are great disparities in time and costs for the various steps involved in each approach. For instance, preparation time at Cornell was 78.8 minutes per volume, nearly 15 times the amount spent in preparation at Yale. This is because the Cornell project began with the original books and a good deal of the time was spent in identification, assemblage, collation, and physical preparation. With extant microfilm, many of those costs were incurred at the point of microfilming, not at the point of scanning. Second, the time and costs associated with the actual scanning indicate that it takes somewhere between 50% and 90% longer to scan from paper than it does from film. On the other hand, the time and costs associated with indexing, file management, and equipment were significantly greater when scanning film than when scanning paper. In the end, the time and costs associated with the two approaches were remarkably similar.

TABLE 3. PRODUCING DIGITAL IMAGES FROM PAPER VS. MICROFILM

 

                        Cornell: Time & Costs [1]         Yale: Time & Costs [2]
Process                 Mean Time   $/Bk      $/Image     Mean Time   $/Bk      $/Image
---------------------------------------------------------------------------------------
Preparation [3]         78.8 min    $20.20    $0.094      5.3 min     $1.36     $0.006
Scanning
  Auto                  56.1 min    $14.38    $0.067      38.1 min    $9.77     $0.045
  Manual                73.2 min    $18.76    $0.087
Indexing [4]            8.6 min     $2.20     $0.010      29.9 min    $7.66     $0.035
Other [5]               5.2 min     $1.33     $0.006      19.2 min    $4.92     $0.023
Sub Total: Process
  Auto                  148.7 min   $38.11    $0.18       92.5 min    $23.71    $0.110
  Manual                165.8 min   $42.49    $0.20
Equipment               Mode                              Capacity
                        Auto        $14.30    $0.066      High [6]    $24.51    $0.113
                        Manual      $17.40    $0.080      Low [6]     $31.32    $0.145
Total: Process/Equip.   Mode                              Capacity
                        Auto        $52.41    $0.24       High        $48.22    $0.22
                        Manual      $59.89    $0.28       Low         $55.03    $0.26

Table 3 Notes:

1. The time and costs are adjusted for comparison purposes to represent a 216 page book and an average salary/benefits rate of $.2563/minute, used at Yale.

2. Figures for producing digital images from microfilm are taken from Conversion from Microfilm to Digital Imagery, Performance Report, Appendix 7, Digital Image Conversion Processes: Time and Costs.

3. Preparation for paper scanning from Table 1, minus the time to reel program the pages.

4. Indexing. Cornell structured the self-referencing portions of a book (e.g., title page, table of contents, list of illustrations, index, bibliography); Yale structured to the chapter level, one level deeper.

5. Other. For Cornell includes File Management (does not include tape creation or COM inspection).

6. Equipment costs. Yale provided two equipment rates, based on high and low capacity production.

 


 

Table 4 compares the labor and equipment costs reported in the Cornell and Yale projects. Note that the costs in the Cornell project ran 50-60% more than the costs of the Yale project, although the costs differed greatly at various stages of production. The Yale Project began with extant microfilm and resulted in the production of digital images. The Cornell project began with the original paper documents and resulted in the production of both COM and digital images. This table does not include the costs of creating the microfilm in the first place (Table 5 compares costs of full, end-to-end projects).

TABLE 4. COMPARING CORNELL’S COM PROJECT TO YALE’S PROJECT OPEN BOOK

 

                        Cornell: Time & Costs [1]         Yale: Time & Costs [2]
Process                 Mean Time   $/Bk      $/Image     Mean Time   $/Bk      $/Image
---------------------------------------------------------------------------------------
Preparation             82.6 min    $21.17    $0.098      5.3 min     $1.36     $0.006
Scanning
  Auto                  56.1 min    $14.38    $0.067      38.1 min    $9.77     $0.045
  Manual                73.2 min    $18.76    $0.087
Indexing [3]            8.6 min     $2.20     $0.010      29.9 min    $7.66     $0.035
COM                     fixed rate  $20.66    $0.096
Other [4]               17.3 min    $4.44     $0.021      19.2 min    $4.92     $0.023
SubTotal:
  Auto                  164.6 min   $62.92    $0.29       92.5 min    $23.71    $0.110
  Manual                181.7 min   $67.23    $0.31
Equipment               Mode                              Capacity
                        Auto        $14.30    $0.066      High [5]    $24.51    $0.113
                        Manual      $17.40    $0.080      Low [5]     $31.32    $0.145
Total                   Mode                              Capacity
                        Auto        $77.22    $0.36       High        $48.22    $0.22
                        Manual      $84.63    $0.39       Low         $55.03    $0.26

Table 4 Notes:

1. The time and costs are adjusted for comparison purposes to represent a 216 page book and an average salary/benefits rate of $.2563/minute, used at Yale.  

2. Figures for producing digital images from microfilm are taken from Yale’s final report.  

3. Indexing. Cornell structured the self-referencing portions of a book (e.g., title page, table of contents, list of illustrations, index, bibliography); Yale structure