Technical Information

The project team secured permission from ProQuest to digitize its microfilm edition of the Richmond Daily Dispatch covering November 1860 through December 1865. After purchase, the microfilm reels were sent to the imaging and rekeying vendor, Digital Divide Data, to be scanned, hand-keyed, and encoded as XML documents conforming to the TEI P4 specification. Page images were scanned at a resolution of 600 dpi and saved as 8-bit grayscale TIFF images.

The XML files for all of the newspapers were individually checked upon receipt from the vendor and were then transferred to the Perseus Project for named entity identification and encoding. The files returned from this step had changed substantially, which necessitated the creation of a modified Document Type Definition (DTD), based on the TEI P4 specification, so that they could be ingested and indexed by XPAT, the indexing engine of the DLXS software suite from the University of Michigan's Digital Library Production Service. DLXS is a suite of tools that supports XML and Unicode and allows for the processing of large and highly structured texts. The system requires all documents to conform to a particular DTD style before the indexing engine can process them. DLXS requires a UNIX environment, so the University of Richmond installed Red Hat Enterprise Linux on its dedicated server, along with the Apache web server, a MySQL database, and Perl 5.8.0.
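The kind of structural check described above can be illustrated with a minimal sketch. It is not the project's actual validation procedure and it does not perform DTD validation; it only tests that a file is well-formed XML, that its root is the TEI P4 root element TEI.2, and that a few elements are present. The list of required elements, the function name, and the sample document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical list of elements this sketch expects beneath the TEI P4 root.
# The project's modified DTD is not reproduced here.
REQUIRED_PATHS = ["teiHeader", "text", "text/body"]


def check_tei_p4(xml_text):
    """Return a list of problems found; an empty list means the checks passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # The file is not well-formed XML, so no further checks are possible.
        return [f"not well-formed: {exc}"]

    problems = []
    # TEI P4 documents use the un-namespaced root element TEI.2.
    if root.tag != "TEI.2":
        problems.append(f"unexpected root element: {root.tag}")
    for path in REQUIRED_PATHS:
        if root.find(path) is None:
            problems.append(f"missing element: {path}")
    return problems


# Invented sample document, used only to exercise the function.
SAMPLE = """<TEI.2>
  <teiHeader><fileDesc><titleStmt><title>Sample issue</title></titleStmt></fileDesc></teiHeader>
  <text><body><div1 type="article"><p>Sample text.</p></div1></body></text>
</TEI.2>"""
```

Calling check_tei_p4(SAMPLE) returns an empty list, while a file lacking a text element yields one message per missing element. Full conformance to a DTD would need a validating parser, which the Python standard library does not provide.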

JPEG 2000 images were created from the TIFF format images received from the vendor. These images reside on the project server, and are converted on the fly into the JPEG format when viewed by users.

In all, the repository contains 1,384 indexed and searchable issues of the Richmond Daily Dispatch and 4,051 JPEG 2000 image files. The original page images in the TIFF format are archived on a separate server.

In 2015 a new version of the site was released. Built on open source tools (the eXist native XML database and the Djatoka JPEG 2000 image viewer) and a customized web interface designed by University Communications, the new site enables direct searching of TEI P5 XML data. The site is not intended to be static; further content additions and feature developments are anticipated.
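As a rough illustration of what direct searching of TEI P5 data in eXist can look like, the sketch below builds a query URL for eXist's REST interface, which accepts an XQuery expression in the _query parameter and a result limit in _howmany. Whether the site uses the REST interface at all is an assumption; the host, the collection path, the function name, and the choice to search paragraph elements are all hypothetical. The code only constructs the URL and does not contact a server.

```python
from urllib.parse import urlencode

# Namespace used by TEI P5 documents.
TEI_NS = "http://www.tei-c.org/ns/1.0"


def build_search_url(base_url, collection, term, how_many=10):
    """Build an eXist REST URL that searches TEI paragraphs for a term.

    base_url and collection are hypothetical, for example
    "http://localhost:8080/exist" and "/db/dispatch".
    """
    # Escape characters that are special inside an XQuery string literal.
    safe_term = term.replace("&", "&amp;").replace('"', '""')
    xquery = (
        f'declare namespace tei="{TEI_NS}"; '
        f'//tei:p[contains(., "{safe_term}")]'
    )
    # urlencode percent-encodes the query so it is safe to place in a URL.
    params = urlencode({"_query": xquery, "_howmany": how_many})
    return f"{base_url}/rest{collection}?{params}"
```

For example, build_search_url("http://localhost:8080/exist", "/db/dispatch", "Richmond") produces a URL under /exist/rest/db/dispatch whose query string carries the encoded XQuery. The namespace declaration matters because TEI P5 elements, unlike TEI P4 elements, are namespaced.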
