Cover Page

Semi-Annual Interim Performance Report

Grant Number HK-250616-16

The Jubilees Palimpsest Project:
Spectral RTI Technology for the Recovery of Erased Manuscripts from Antiquity

Todd R. Hanneken, Ph.D.

St. Mary’s University

March 31, 2018

Narrative Description for Project Period September 1, 2017 - February 28, 2018

The third of six semi-annual project periods successfully surpassed the proposed goals and brought the project halfway to completion. The overall theme of the second year of the project is to connect scholars (researchers, teachers, students) to the images in an easy web viewer with annotation and collaboration features (in contrast to the emphasis on developing and proving the capture and processing technology in the first project year, and the emphasis on sharing the tools for other imaging projects in the third year). The first broad category of making the images useful to scholars is the implementation of the Mirador viewer with features for annotation and collaboration. The second broad category of connecting to scholars consists of personal interaction at meetings and conferences. The third broad category related to the theme of the project year centers on mentoring student researchers. Additional activities anticipate the third project year or continue the theme of the first project year. The major categories specific to the third semi-annual period are narrated in this section, and the cumulative report below (anticipating the final report) has been updated.

This grant period saw the release of a version of the Mirador viewer that supports the International Image Interoperability Framework (IIIF) Presentation API standard for Image Choice (layers). This feature is essential for spectral imaging, which utilizes a number of processed color enhancements for any given page. Previously, the full data was only visible in the Leaflet-based IIIF Navigator, which did not support Open Annotation. With the implementation of Mirador 2.6.1, the color enhancements and raking light options are all selectable in a "Layers" pane. The addition of an Open Annotation server makes it possible for scholars using Mirador to view annotations made by others and add their own. Compliance with the Open Annotation standard also facilitates use of other tools for editing and managing the annotations, which can be proposed by anyone. Another essential feature, not standard with Mirador, has been added to facilitate scholarly collaboration. Now when scholars navigate through manuscripts and canvases (pages), the address bar of their browser will show not only the web address of the viewer but the "coordinates" of what manuscript and page they are on. When authors copy and paste that address into scholarly collaboration media (annotations, scholarly works, email, etc.), a reader will be able to click the link and go directly to the page intended by the author. Another feature implemented in Mirador is an index of chapter and verse ranges on each page of the Book of Jubilees. The project director also participated in the monthly conference calls of the IIIF Manuscripts group and communicated a list of desiderata to the development team planning the next major version of Mirador.
Additional scholarly tools were also prepared for integration with the online environment, including a paleography chart, a database of other versions of Jubilees, and an EpiDoc TEI XML version of the 1861 critical edition of Latin Moses (Jubilees and the Testament of Moses), which will serve as the basis for new readings and eventually a new critical edition.
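The shareable-address feature described above can be illustrated with a short sketch. It is not the code added to Mirador for this project; the query parameter names `manifest` and `canvas` and the example.org addresses are assumptions chosen for illustration only.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def deep_link(viewer_url, manifest_id, canvas_id):
    """Build a viewer address that records which manuscript and page are open.

    The parameter names "manifest" and "canvas" are hypothetical.
    """
    query = urlencode({"manifest": manifest_id, "canvas": canvas_id})
    return f"{viewer_url}?{query}"


def read_deep_link(url):
    """Recover the manuscript and page "coordinates" from a shared address."""
    query = parse_qs(urlparse(url).query)
    return query["manifest"][0], query["canvas"][0]


# Hypothetical identifiers, for illustration only.
link = deep_link(
    "https://example.org/viewer",
    "https://example.org/iiif/manuscript-a/manifest.json",
    "https://example.org/iiif/manuscript-a/canvas/12",
)
print(link)
print(read_deep_link(link))
```

A reader who opens such a link lands on the same manuscript and canvas the author was viewing, which is the behavior the paragraph describes.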

The second category of connecting scholars to the technology is through human interaction. The main proposed activity is the meeting of scholars. The proposal imagined that this might take place in March at the University of Pennsylvania, but Annette Reed left her position there to start a new position at New York University. We have made the adjustment to have the meeting at the University of Notre Dame in May (15-17). We do not imagine this adjustment as having a deleterious effect on the project goals; if anything it will increase the number of “volunteer” participants. Other presentations and workshops already took place. In each case the slides were published on the project website and the presentation promoted through Twitter. A joint session of the Digital Humanities and Pseudepigrapha groups at the Society of Biblical Literature Annual Meeting in Boston (November 20, 2017) was dedicated to advanced imaging of manuscripts. This session included four members of the team: Todd Hanneken (specifically about the Jubilees Palimpsest Project: slides, transcript, voice-over slides), Mike Phelps (with more emphasis on the Sinai Palimpsests Project), and team scientists Roger Easton and Keith Knox (on the science of textual discovery). About 50-60 scholars were in attendance, and all the presentations were well received. Another successful effort was a workshop by the project director (Hanneken) at the Antiquity Workshop at the University of Texas at Austin on November 9, 2017 (slides). This workshop was exclusively devoted to the Jubilees Palimpsest Project, and was attended by about 50 faculty, students, and librarians. Another workshop took place on campus at St. Mary’s University (the project director’s home campus) on September 8, 2017, which was attended by about 40 students and faculty (slides). Planning efforts were made toward students presenting at three conferences, all of which will take place in the next semi-annual reporting period.

A third category of connecting the technology to audiences consists of the activities of student researchers. This work builds resources for other scholars to use, and serves as a supervised test-bed of user experience. In the second project year the hours proposed for two student researchers to work intensely were distributed to twelve student researchers to work in scheduled workshops and as their schedules permit. As of February 28, 422.5 of the 468 hours of student research budgeted for the second project year were completed. The remaining 45.5 will be spent in March. The main student projects have been to encode Latin Moses in EpiDoc TEI XML, create paleography charts, and provide line-by-line transcriptions in Mirador. The transcriptions compare what the 1861 editor was able to read with what the students are able to read in the spectral enhancements. So far the students have discovered many cases in which the editor expanded an abbreviation or corrected a misspelling in the manuscript. Though none of this deeply changes scholarly understanding of the text, it does reveal ways in which print editions mediate ancient literature imperfectly. Along the way, students learned in semi-weekly workshops the principles of digital humanities and manuscript studies, including TEI XML, GitHub, IIIF Image API, and Mirador. In the spirit of openness, the Student Researcher Guides have been posted on the project archive for 2017 and 2018. Though not particularly intended or promoted for outside use, they are indexed by Google and may be useful to others working on similar digital humanities projects. As mentioned above, the research experience will lead to presentations at student and other conferences in the following reporting period. For a fuller sense of the student researcher experience, see the flyer and slides for the student researcher information session.

Additional activities in the reporting period can be grouped into those that continue the focus of the first project year and those that anticipate the focus of the third project year. Continuing from the first project year, weekly conference calls with the scientists continue to address strategies for improving the legibility of the images through even more advanced processing techniques. One approach being pursued is application of non-linear transformations, such as Laplacian Eigenmaps, to the data. This approach may continue to supervised machine learning to identify characteristics in material properties across larger amounts of data than the human mind could grasp. Another approach focuses on data misalignment resulting from some filters used in the capture. Another approach aligns captures from both sides of a folio to better process "show through" when text is so thoroughly erased that it is more legible from the other side or from light passing through the parchment. Other activities anticipate the third project year. Most significant of these is the rewriting of the SpectralRTI_Toolkit as a Java plugin for ImageJ. This work has been contracted with the Center for Digital Humanities at Saint Louis University, with regular guidance from the project director. This work is on schedule and the plugin will be ready for the third project year. Efforts to extend the use of SpectralRTI among other imaging teams have already begun through collaboration with the Livingstone Online project and the Harry Ransom Center at the University of Texas at Austin.

Narrative Description for Project Period March 1, 2017 - August 31, 2017

The second semi-annual project period saw the completion of the work proposed for the first project year, and more. The following three paragraphs describe progress within the semi-annual project period in the areas of data processing, SpectralRTI technology, and dissemination, respectively. The following sections add accomplishments from the second to the first semi-annual period into a cumulative report that will eventually become the Final Performance Report.

Data processing progressed steadily, centered around weekly conference calls dedicated to data processing. Todd Hanneken (St. Mary's University) facilitated the discussion, summarized progress in updates to the entire project team, and provided feedback from the perspective of philology and humanities scholarship. Keith Knox (Early Manuscripts Electronic Library, EMEL) first completed batch processing of all pages using "recipes" proven on other palimpsests, namely Pseudocolor and Sharpie, except for C73inf for which Ruby was more effective. Subsequently, Knox focused efforts on developing software to correct for registration errors introduced by some of the filters used in capture. Roger Easton (Rochester Institute of Technology, RIT) and a student summer intern, Nicole Polglaze, focused on advanced supervised processing using ENVI software. This advanced processing was completed for some pages of C73inf, four of the five pages of F130sup, all seven pages of H190inf, and all ten pages of O39sup. Twenty pages of S36sup and remaining pages of C73inf are underway in the third semi-annual project period. The calls also included Ken Boydston (MegaVision), Michael Phelps (EMEL), and Josephine Dru (EMEL), all of whom contributed insight from their areas of expertise in capture hardware, object handling, and philology. In addition to the notes from the calls, Hanneken kept archival "Processing Guides" which consolidated notes from the image scientists and scholars (http://palimpsest.stmarytx.edu/AmbrosianaArchive/Guides/). Since the data is freely accessible with a Creative Commons license, it may be hoped that these guides will be of use to independent imaging specialists who may wish to apply their own processing techniques (see dissemination, below). 
Hanneken also performed processing using the SpectralRTI_Toolkit, which offered some alternative color enhancements (Extended Spectrum and PCA Pseudocolor), and added RTI interactivity to the images created by the scientists. All images were published with the International Image Interoperability Framework (IIIF) Image and Presentation APIs on the image repository (http://jubilees.stmarytx.edu). The number of IIIF Image API images (including color enhancements and raking light) is now 4363. The number of WebRTI images is now 1467. The number of IIIF Presentation manifests is eight (C73inf Latin Moses, C73inf Latin Commentary on Luke, A79inf Petrarch, F130sup Greek Commentary on Luke, H190inf unidentified undertext, O39sup Origen's Hexapla, S36sup Gothic Bible, Tests from Startup Phase).

Development of SpectralRTI technology and the SpectralRTI_Toolkit for ImageJ progressed in the three areas of documentation, training, and software development. Hanneken wrote and published an initial release of documentation for SpectralRTI and the SpectralRTI_Toolkit (http://jubilees.stmarytx.edu/spectralrtiguide/). The documentation is also published on GitHub should anyone wish to derive their own versions or contribute improvements to the documentation. As of the first release (July 2017), the documentation is quite complete in content but in need of images and other polish to be more user friendly. In addition to the published documentation, Hanneken successfully trained early adopters from two independent teams on the use of the Toolkit, Kathryn Piquette (University College London) and Sarah Baribeau (Lazarus Project and EMEL). Meanwhile, significant progress has been made on the rewrite of the Toolkit from an ImageJ macro to an ImageJ2 Java plugin. Although the plugin is not yet ready for release due to relatively peripheral bugs, the basic functionality and expected performance improvements (especially memory management) are on track. The coding work has been contracted to the Center for Digital Humanities at Saint Louis University, where Bryan Haberberger is the lead Java developer. Their development fork is available on GitHub. Saint Louis University reports 350 hours of work on software development, and that they are not concerned about doing more than the budgeted work because of their shared interest in the success of the product.

Dissemination of project activities progressed along several avenues. Hanneken has been posting project milestones, particularly publication of images, on Twitter (twitter.com/thanneken). Google seems to reach the widest audience so far, and several scholars have made contact with the project director with various inquiries. Hanneken participates in the monthly IIIF Manuscript Group conference calls, on which a wide range of innovations relevant to the project are discussed. In June, Hanneken presented at the first annual conference of Rochester Cultural Heritage Imaging, Visualization, and Education (R-CHIVE.com). Surrounding the two-day public conference were five additional days of meetings. One significant contribution of the project that became apparent at the meeting is the openness of the data. Rochester's emphasis on imaging science goes back to the glory days of Kodak and Xerox. Today, many students are working on digital image processing of cultural heritage, but lack complete data sets with which to work. Because the data is made available online without any impediment (such as registration or email request) and generously licensed, many imaging specialists will be served by and hopefully serve in turn the humanities interest of the project. This and other connections made at the conference and workshops can be expected to have long-term direct and indirect benefits.

One practical "lesson learned" worth reporting is that Amazon Simple Storage Service (S3) proved not to be effective for storing data for the IIIF image repository. This low-cost service requires transfer of a complete file to the main processing instance before any of it can be read. The key benefit of the Jpeg2000 backend of the IIIF Image server is that only the tiles with the region and detail required need be read and processed. The solution was to move the data from Amazon S3 to Amazon Elastic File System (EFS). This service is more expensive, but savings in other project areas should suffice to cover the additional cost.

Narrative Description for Project Period September 1, 2016 - February 28, 2017

Accomplishments for the first period surpassed goals, partly because the project director had a full-year sabbatical that gave extra flexibility in time. The proposal outlined goals by years, rather than half-years. The first-year goals remaining for the second semi-annual period lie mostly in publishing documentation for general use and rewriting the software from ImageJ Macro Language to Java. The former is on target and the latter is under contract with the Saint Louis University Center for Digital Humanities.

The activities below match those described in the proposal, with the addition of designing and building an arc to automate Spectral RTI captures. The breadth of audiences served surpassed expectations. We believe this is because the news of receiving a grant from the NEH raised awareness of the project even before we had something to show for the work. There were no personnel deletions and some volunteer additions.

The overall budget appears to be on track, with greater expenses in data management equipment (due to anticipated expansion in scope) and lesser expenses for food in Milan (due to favorable currency exchange rates).

The challenges and lessons learned were reasonable and adequately addressed. The biggest single problem was a lost/delayed piece of luggage, which we addressed by assigning part of the team to locate the bag and part to replace the items. Success converged from both directions almost simultaneously after a delay of about one day. Milan is a major city but still posed challenges for finding replacement parts. A long-term lesson learned is to rely less on international transport of equipment; we are networking toward building a consortium of imaging projects in northern Italy that can share imaging equipment.

Cumulative Report

Project Activities

Publish open tools for processing Spectral RTI images

The Spectral RTI Toolkit works with ImageJ to process Spectral (narrowband) and RTI (hemisphere) captures into RTI and WebRTI files. Project activities cover five areas. First, the alpha-version of the Toolkit was published on GitHub (https://github.com/thanneken/SpectralRTI_Toolkit). Second, the Saint Louis University Center for Digital Humanities (http://lib.slu.edu/digital-humanities) was contracted to develop the toolkit from its current state as an ImageJ macro into a Java plugin for ImageJ2. The proposal describes this work as “Java Developer,” and Saint Louis University was selected for its past accomplishments in coding digital humanities projects, particularly those related to the shared interest in the International Image Interoperability Framework (IIIF). The contract covers the proposed activities (recode the plugin) within the budget, and assigns any excess effort to improving the WebRTI viewer to operate on a IIIF backend. Third, the Toolkit is being maintained and improved in light of the needs of users within and beyond the project. In particular, the project director supported Kathryn Piquette (University College London, Advanced Imaging Consultants http://www.ucl.ac.uk/dh/consulting/advanced-imaging-consultants) in implementing the Spectral RTI Toolkit on her independent projects using a different imaging system. The PhaseOne imaging system is the major alternative to the MegaVision system used by our team. Success in establishing shared compatibility was not surprising, but a valuable accomplishment nonetheless. Fourth, documentation has been published on the project website http://jubilees.stmarytx.edu/spectralrtiguide/, along with GitHub to facilitate derivatives and contributions from others, (https://github.com/thanneken/SpectralRTI_Toolkit/tree/master/Guide). Fifth, training began with early adopters Kathryn Piquette and Sarah Baribeau.

Design and build an arc to automate Spectral RTI captures

We worked hard and achieved success beyond the proposed activities in designing and building an arc to automate Spectral RTI captures. Capturing data for RTI images requires a real or virtual dome of discrete lights around the object. This can be done with a handheld flash, which requires manual positioning for each of fifty or so captures, and requires the light positions to be calculated from a reflective hemisphere for each page. It can also be done with a dome with lights fixed to known positions, but the diameter of the dome must be several times the diameter of the page, which is about eight feet for manuscripts. Such domes are unwieldy and interfere with safe object handling.

[Figure: The MegaVision Spectral RTI arc.]

Through a series of conference calls and models (physical and Computer Aided Design, see appendix), our team designed an arc that has the major advantages of a dome but takes less space and can move out of the way. The arc pivots on the light stands already used by spectral imaging. The arc slots into seven positions that do not change from one sequence to the next. The arc holds sixteen lights, of which the odd and even numbered lights fire on alternating arc positions. The result is that fifty-six images capture the reflectance of the object when illuminated by evenly-distributed positions around a virtual hemisphere. The time required for hemisphere captures for RTI decreased from almost twenty minutes in the startup phase to less than five, while increasing the number of captures from thirty-five to fifty-six. The greater number of hemisphere captures increases texture resolution and decreases the impact of shots corrupted by shadows from the camera stand. The current cost to produce the arc is about $5000 and should retail for about $8000. Several buyers are already pursuing orders so we can expect the cost to decrease with production optimizations. Comparative data for the cost of RTI domes is not available. The cost of a handheld flash could easily surpass $1000 for a quality flash with battery pack and radio trigger (an infrared trigger would corrupt spectral imaging captures). Even at $8000 the arc adds tremendous functionality for a fraction of the cost of a spectral imaging system (easily $100,000).
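The count of fifty-six hemisphere captures follows from the arc geometry described above: seven arc positions, with eight of the sixteen lights firing at each. A minimal sketch of that schedule follows. The numbering of positions and lights, and the choice of which parity fires at which position, are assumptions made for illustration and not a description of the MegaVision firmware.

```python
ARC_POSITIONS = 7    # slots the arc pivots into (from the report)
LIGHTS_ON_ARC = 16   # lights mounted on the arc (from the report)


def firing_schedule():
    """Return (arc_position, light_number) pairs for one hemisphere sequence.

    Assumption: odd-numbered lights fire at odd-numbered positions and
    even-numbered lights at even-numbered positions. The report only says
    that odd and even lights alternate between positions.
    """
    schedule = []
    for position in range(1, ARC_POSITIONS + 1):
        for light in range(1, LIGHTS_ON_ARC + 1):
            if light % 2 == position % 2:
                schedule.append((position, light))
    return schedule


print(len(firing_schedule()))  # 7 positions x 8 lights = 56 captures
```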

Capture data in Milan

The team traveled to Milan and captured complete Spectral RTI data for all 144 pages of the Jubilees Palimpsest, plus early modern notes archived with the Jubilees Palimpsest and samples from five additional palimpsests in the Ambrosiana. The team consisted of the seven proposed participants and benefitted from additional volunteer effort from team members extending their time commitment and additional partners assisting at their own expense. The proposed team members were Todd Hanneken (project director), Anthony Selvanathan (graduate researcher from St. Mary’s University), Michael Phelps, Damianos Kasotakis, Roger Easton, Keith Knox, and Ken Boydston. Additional volunteers were Dale Stewart and Giulia Rossetto.

The travel to Milan originally scheduled for March 2017 was moved ahead to January 2017. This saved money and increased time availability of team members on site. We were able to rent three apartments in the same building, which worked very well. The favorable exchange rate helped us stay well within budget.

The narrowband spectral captures were increased to fifty-two captures per page in response to particular properties of the chemical reagent that was used early in the nineteenth century. The narrowband captures included fourteen bands of narrowband reflectance from ultraviolet to infrared, four bands of transmissive illumination, and a total of thirty-four fluorescence captures. The fluorescence captures included four different wavelengths of illumination and seven different filters plus additional variants at different exposure settings when the chemical reagent caused regions to differ radically in reflectance. The hemisphere captures for RTI amounted to fifty-six images per page. A total of 108 images per page were captured for all 144 pages of the Jubilees Palimpsest in just less than three of the four weeks in Milan.

With the remaining time we imaged the forty-six pages of non-palimpsest front matter and early modern notes archived with the Jubilees Palimpsest. Because these pages pose no challenges to legibility we used a reduced thoroughness (but still super archival quality) of sixteen images per page. We also sampled pages from other palimpsests in the Ambrosiana collection to aid demonstration of the utility of Spectral RTI and to probe the potential for future advanced imaging projects at the Ambrosiana. The objects selected were: an illumination from Petrarch’s Vergil that includes a crypto-script signature illegible to the human eye (A79 inf), an unidentified Greek commentary on the Gospel of Luke (F130sup), a palimpsest with several unidentified undertexts (H190inf), Origen of Alexandria’s six-column edition of versions of the book of Psalms (Hexapla, O39sup), and Wulfila’s fourth-century translation of the Epistles of Paul into Gothic, including a liturgical calendar (S36sup). The objects were selected to appeal to a broad range of scholarly, popular, and political constituencies.

In total we captured 239 pages, mostly at a rate of 108 captures per page, 50 megapixels per capture, 16 bits per pixel. Capture and on-site processing generated seven terabytes of data in Milan.
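The capture counts and pixel dimensions reported above allow a rough check of raw data volume. The sketch below assumes one uncompressed 16-bit sample per pixel and counts a single copy of each capture. It is not expected to reproduce the seven-terabyte figure, which also includes on-site processing output and the sample palimpsests, for which a per-page rate is not stated.

```python
NARROWBAND = {"reflectance": 14, "transmissive": 4, "fluorescence": 34}
HEMISPHERE = 56


def captures_per_page():
    """Narrowband plus hemisphere captures for one palimpsest page."""
    return sum(NARROWBAND.values()) + HEMISPHERE


def raw_bytes(pages, captures, megapixels=50, bits_per_pixel=16):
    """Uncompressed size of one copy of every capture (an assumption:
    real DNG files carry headers and may be compressed)."""
    return pages * captures * megapixels * 1_000_000 * bits_per_pixel // 8


palimpsest = raw_bytes(144, captures_per_page())  # Jubilees Palimpsest pages
front_matter = raw_bytes(46, 16)                  # reduced rate of sixteen images

print(captures_per_page())          # 52 + 56
print(palimpsest / 1e12, "TB")      # decimal terabytes
print(front_matter / 1e9, "GB")
```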

Manage and publish archival data

The data generated was archived for accessibility, functionality, and clarity for the immediate team and for posterity. For each capture, three formats are archived. First, the raw data from the camera in digital negative (dng) format was immediately set to read-only and archived for posterity should any of our subsequent processing decisions be questioned. Second, the data was “flattened” (corrected for aberrations in lighting based on a plain white calibration target). This data is most useful to the scientists for processing. Third, the flattened data was gamma-corrected to match the perception bias of the human eye. These gamma-corrected images are necessary for processing designed for human consumption. This data is somewhat redundant in that the later formats could be rederived from the earlier ones. We are considering ways to reduce this redundancy without sacrificing accessibility. The question is how easily, consistently, and reliably posterity will be able to rederive the derived data. In the meantime, all three are considered archival, along with the calibration captures.
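The relationship among the three archived formats can be sketched as two small transformations. This is an illustration under stated assumptions, not the project's processing code: flattening is modeled as pixel-by-pixel division by the white-target capture, and gamma correction as a simple power law with gamma 2.2. The actual calibration and gamma curve used by the team may differ.

```python
def flatten(raw, white, eps=1e-6):
    """Divide each raw pixel by the matching white-target pixel.

    Images are lists of rows of floats in the range 0..1 (an assumption
    that keeps the sketch free of external libraries).
    """
    return [
        [min(r / max(w, eps), 1.0) for r, w in zip(raw_row, white_row)]
        for raw_row, white_row in zip(raw, white)
    ]


def gamma_correct(linear, gamma=2.2):
    """Apply a power-law curve; gamma=2.2 is an assumed, common value."""
    return [[value ** (1.0 / gamma) for value in row] for row in linear]


# A 1x2 image whose right side was lit twice as brightly as its left side.
raw = [[0.2, 0.4]]
white = [[0.4, 0.8]]
flat = flatten(raw, white)
print(flat)                  # uneven lighting removed
print(gamma_correct(flat))   # brightened for human viewing
```

Because both steps are deterministic functions of the raw capture and the calibration capture, the derived formats could in principle be regenerated, which is the redundancy the paragraph weighs against accessibility.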

Extensive capture metadata is encoded into the EXIF headers of the captured images. We supplement this metadata with an XML file for each page that includes all the EXIF metadata for each image in the sequence, while grouping together data that is constant for all shots in the session or image sequence. Additionally, illuminator sequence codes meaningful to the team may not be meaningful to posterity so they are elaborated in companion tags using a namespace specific to spectral imaging.
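The grouping of constant metadata described above can be sketched as follows. The element names (`page`, `session`, `image`) and the sample values are hypothetical; the project's actual XML schema and namespace are not reproduced here.

```python
import xml.etree.ElementTree as ET


def split_constant_fields(records):
    """Separate fields identical in every record from fields that vary."""
    first = records[0]
    constant = {
        key: value
        for key, value in first.items()
        if all(key in rec and rec[key] == value for rec in records)
    }
    varying = [
        {k: v for k, v in rec.items() if k not in constant} for rec in records
    ]
    return constant, varying


def page_metadata_xml(page_id, records):
    """Write shared fields once under <session>, the rest per <image>."""
    constant, varying = split_constant_fields(records)
    page = ET.Element("page", id=page_id)
    session = ET.SubElement(page, "session")
    for key, value in sorted(constant.items()):
        ET.SubElement(session, key).text = str(value)
    for fields in varying:
        image = ET.SubElement(page, "image")
        for key, value in sorted(fields.items()):
            ET.SubElement(image, key).text = str(value)
    return ET.tostring(page, encoding="unicode")


# Hypothetical values using two standard EXIF tag names.
records = [
    {"LensModel": "example-lens", "ExposureTime": "1/30"},
    {"LensModel": "example-lens", "ExposureTime": "1/60"},
]
print(page_metadata_xml("example-page", records))
```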

Data preservation and integrity were ensured at several levels. First (and most often overlooked) we countered the threat of “bit rot” by using the B-Tree File System (BTRFS), which provides checksums at the file system level and redundant file system metadata. Checksums are also used in verifications and duplications using rsync. Second, we countered the threat of drive failure by using RAID 1 or 10 redundancy in the definitive archives and backups. Third, we countered the threat of losing an entire computer or piece of luggage by distributing backups across locations. Several cities could be destroyed in the next world war and our data will survive.
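BTRFS and rsync perform their checksumming internally. For readers who want to see the underlying idea, the following sketch compares an archived file with its backup using a SHA-256 digest from the Python standard library. The choice of SHA-256 and the function names are assumptions for illustration; this is not the tooling the project used.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large captures need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original_path, backup_path):
    """True when the two files are byte-for-byte identical (same digest)."""
    return sha256_of(original_path) == sha256_of(backup_path)
```

A single flipped bit in either copy changes the digest, so a mismatch flags silent corruption before it propagates to further backups.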

The archival data is publicly available for specialists apart from the IIIF image repository described below, which serves a much wider audience. The data archive is available at http://palimpsest.stmarytx.edu/AmbrosianaArchive. Like all grant products, the data is accessible without any kind of encumbrance (e.g., account creation, cookie stalking) under a Creative Commons license (CC BY-SA for everything created solely by the Jubilees Palimpsest Project and CC BY-NC-SA for objects owned by the Biblioteca Ambrosiana).

Process data to approximate and improve upon first-hand experience

Data processing can be grouped into two end goals. The first is to create a digital facsimile that captures the present state of the artifact as accurately as possible. This kind of accuracy is useful to students and scholars who do not have first-hand access to the artifact, and to future conservators and scholars who will not otherwise have precise information on the state of the artifact in 2017. Accurate digitization of first-hand experience is done with high-resolution color using ten wavelengths within the visible spectrum. From this data accurate color images were created in the LAB (preferable for archival quality) and sRGB (preferable for compatibility and accessibility) color spaces. These derivative files have 24-bit color depth. Accurate spatial resolution is achieved by avoiding Bayer or other filters, and by using an apochromatic lens. Accuracy in texture is achieved by using transmissive light (as if holding the page up to a light) and capturing reflectance of light originating from different angles (raking light images and eventually RTI, as if moving a light around the object).

The second major end goal is to surpass first-hand experience for reading illegible text, marginalia, and other features. Some of these follow standard recipes and some involve case-by-case labor. The standard recipe included with the Spectral RTI Toolkit is Extended Spectrum, which essentially squeezes ultraviolet and infrared into the visible spectrum and optimizes contrast. Another standard recipe was created by imaging scientist Keith Knox to deal with the particular problems of the reagent-saturated palimpsest. This method, called RuBY, takes its name from the formula of taking Royal blue fluorescence divided BY transmissive. It has proven effective at reading illegible text in the palimpsest. Two additional recipes developed by Knox, Sharpie and Pseudocolor, were applied to the palimpsest samples other than C73inf. All of the processes described thus far (Accurate Color, Extended Spectrum, RuBY, Sharpie, and Pseudocolor with raking and transmissive light variants and WebRTI) have been completed and published for all pages captured. Additional supervised processing has been completed for the supplemental palimpsests except S36sup. One technique, called PCA Pseudocolor, is built into the Spectral RTI Toolkit. Although this processing can be done by anyone with default settings, we are taking our time to optimize quality. The most advanced technique requires a feedback loop between scholars and scientists. The chief scholar Todd Hanneken and the scientists Keith Knox and Roger Easton are conducting weekly conference calls to discuss processing recipes and focused efforts. The processing guides created through this collaboration are archived, publicly accessible, and discoverable through search engines: http://palimpsest.stmarytx.edu/AmbrosianaArchive/Guides/. A related document studies the paleography of the Jubilees Palimpsest by grouping together legible examples of each letter: http://palimpsest.stmarytx.edu/AmbrosianaArchive/Guides/LatinMosesPaleography.html.
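The formula behind the RuBY name (royal blue fluorescence divided by transmissive) can be sketched as a pixel-by-pixel ratio. Only the division comes from the description above. The guard against division by zero and the rescaling of the result to the range 0..1 are assumptions added for display purposes, and Knox's actual recipe may include further steps.

```python
def ruby_ratio(royal_blue_fluorescence, transmissive, eps=1e-6):
    """Divide a fluorescence image by a transmissive image, pixel by pixel.

    Images are lists of rows of floats. The min-max rescaling at the end
    is an assumed display step, not part of the reported formula.
    """
    ratio = [
        [f / max(t, eps) for f, t in zip(f_row, t_row)]
        for f_row, t_row in zip(royal_blue_fluorescence, transmissive)
    ]
    values = [v for row in ratio for v in row]
    low, high = min(values), max(values)
    if high == low:
        return [[0.0 for _ in row] for row in ratio]
    return [[(v - low) / (high - low) for v in row] for row in ratio]


# Two pixels with equal transmission but different fluorescence.
print(ruby_ratio([[0.2, 0.4]], [[0.4, 0.4]]))
```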

Create an open image repository utilizing IIIF Image and Presentation standards

Together with the Department of Network Services at St. Mary’s University, the project director created a IIIF image repository on an Amazon Web Services EC2 instance with EFS primary storage, Amazon S3 backup storage, Amazon CloudFront international caching, and Domain Name Service for http://jubilees.stmarytx.edu. As described in the proposal, this arrangement is ideal for the predominantly off-campus traffic of the project and the potential need for elasticity if usage spikes with media coverage. Lessons learned include the inadequacy of Amazon S3 for primary storage of Jpeg2000 files because the need to transfer the whole file before reading any of it defeats the advantage of Jpeg2000 that only the region and resolution required need be read and processed. Because CloudFront caches web pages for twenty-four hours it is essential to double-check all data before uploading it to the Amazon EC2 instance.

The project director tested open source alternatives for the JPEG 2000 backend of the IIP image server. Unfortunately, quality, performance, and reliability were acceptable only with the commercial alternative (Kakadu), which is the one thorn in the side of an otherwise entirely open-source project. Once the IIP image server was compiled with the Kakadu JPEG 2000 libraries and the Apache configuration adjusted, IIIF Image API compliance was in place. The IIIF Image API allows project images to be stored once and served in portions at various resolutions. This is essential, for example, for the paleography chart of Latin Moses, which stores the page and region coordinates of each letter exemplar, not cropped duplicate images.
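The idea of storing coordinates rather than cropped images can be sketched as follows. The URL pattern (identifier, region, size, rotation, quality, format) follows the IIIF Image API. The base URL, identifier, and coordinates below are hypothetical placeholders, not actual entries from the paleography chart. The size keyword "full" assumes version 2 of the Image API; version 3 uses "max".

```python
from urllib.parse import quote

def iiif_region_url(base_url, identifier, x, y, w, h,
                    size="full", rotation=0, quality="default", fmt="jpg"):
    """Build an IIIF Image API URL for a rectangular region of a stored image.

    Hypothetical helper for illustration; x, y, w, h are pixel coordinates
    in the full-resolution image.
    """
    region = f"{int(x)},{int(y)},{int(w)},{int(h)}"
    # The identifier is percent-encoded so that slashes or spaces in it
    # do not break the path segments of the request.
    return (f"{base_url.rstrip('/')}/{quote(identifier, safe='')}"
            f"/{region}/{size}/{rotation}/{quality}.{fmt}")

# One chart entry: a letter, the page image it comes from, and where it sits.
exemplar = {"letter": "a", "identifier": "example-page", "xywh": (1200, 3400, 80, 95)}
url = iiif_region_url("https://example.org/iiif", exemplar["identifier"], *exemplar["xywh"])
print(url)
```

Because each exemplar is only an identifier and four numbers, improving the underlying page image improves every letter in the chart without regenerating any crops.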

IIIF Presentation API manifests were written with placeholder data in advance of the capture session and filled in as data was created. This allowed many images to go live before the capture session was complete. One challenge with the IIIF Presentation API is that even Mirador, which was specifically designed for IIIF manifests, does not fully support the standard. The short-term solution was to create a new IIIF Navigator using Leaflet and jQuery (http://jubilees.stmarytx.edu/iiifp/). The long-term solution was to wait for Mirador 2.6.1, which supports the Image Choice feature of the IIIF Presentation API. The Image Choice feature is essential because many images describe each page. For the Jubilees Palimpsest we have old microfilm, an 1861 edition as a scanned page, and the spectral image cube with various color enhancements, transmissive illumination, and raking light directions. The IIIF Presentation manifest also supports links to other resources and annotations, including WebRTI images, transcriptions, and translations.
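The Image Choice structure can be sketched as a canvas whose painting annotation offers one default image and a list of alternatives. The oa:Choice, default, and item keys follow version 2 of the IIIF Presentation API. All URLs, dimensions, and the Image API compliance level in the service block are assumptions for illustration, not values from the project manifests.

```python
import json

def image_resource(image_url, service_url, label):
    """Describe one rendering of the page (for example, one color enhancement)."""
    return {
        "@id": image_url,
        "@type": "dctypes:Image",
        "label": label,
        "service": {
            "@context": "http://iiif.io/api/image/2/context.json",
            "@id": service_url,
            # Assumed compliance level; a real server advertises its own.
            "profile": "http://iiif.io/api/image/2/level1.json",
        },
    }

def canvas_with_choice(canvas_id, label, width, height, default, alternatives):
    """Build a canvas whose single painting annotation is a choice of images."""
    return {
        "@id": canvas_id,
        "@type": "sc:Canvas",
        "label": label,
        "width": width,
        "height": height,
        "images": [{
            "@type": "oa:Annotation",
            "motivation": "sc:painting",
            "on": canvas_id,
            "resource": {
                "@type": "oa:Choice",
                "default": default,      # shown first by the viewer
                "item": alternatives,    # selectable as additional layers
            },
        }],
    }

base = "https://example.org/iiif"  # hypothetical server
default = image_resource(f"{base}/p1-color/full/full/0/default.jpg", f"{base}/p1-color", "Accurate Color")
extra = image_resource(f"{base}/p1-ext/full/full/0/default.jpg", f"{base}/p1-ext", "Extended Spectrum")
canvas = canvas_with_choice("https://example.org/canvas/p1", "Page 1", 6000, 8000, default, [extra])
print(json.dumps(canvas, indent=2))
```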

Starting January 2018, ranges of chapters and verses witnessed on each page were added to the IIIF Presentation manifest for Jubilees, which facilitates browsing in the Index tab in Mirador.
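A minimal sketch of how such ranges might be generated for the "structures" list of a Presentation API 2 manifest follows. The sc:Range type and the canvases key come from that standard. The range identifier pattern and the chapter and verse references are placeholders, not the actual contents of any page.

```python
def verse_ranges(manifest_id, pages):
    """Build one sc:Range per page from (canvas_id, first_ref, last_ref) tuples.

    Hypothetical helper; pages are expected in manuscript order.
    """
    structures = []
    for index, (canvas_id, first_ref, last_ref) in enumerate(pages, start=1):
        structures.append({
            "@id": f"{manifest_id}/range/r{index}",   # assumed URI pattern
            "@type": "sc:Range",
            "label": f"Jubilees {first_ref}-{last_ref}",
            "canvases": [canvas_id],
        })
    return structures

# Placeholder references for illustration only.
pages = [
    ("https://example.org/canvas/p1", "1:1", "1:9"),
    ("https://example.org/canvas/p2", "1:10", "1:18"),
]
structures = verse_ranges("https://example.org/manifest/jubilees", pages)
print([r["label"] for r in structures])
```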

Also starting January 2018, a public annotation server was connected to Mirador, which allows users to contribute annotations to be seen and reviewed by others. These annotations are most often transcriptions, but can also note areas or points of interest, such as marginalia or other scribal practices.
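A transcription contributed this way can be pictured as an Open Annotation whose target is a rectangular region of a canvas. The sketch below is an assumption about the general shape of such a record, not the schema of the project's annotation server. The motivation, the canvas URL, the coordinates, and the text are all placeholders.

```python
def transcription_annotation(canvas_id, x, y, w, h, text, language="la"):
    """Build an Open Annotation that attaches text to a region of a canvas.

    Hypothetical helper; the region is expressed as an xywh media fragment
    in canvas coordinates.
    """
    return {
        "@context": "http://iiif.io/api/presentation/2/context.json",
        "@type": "oa:Annotation",
        "motivation": "oa:commenting",   # assumed; a server may expect another motivation
        "resource": {
            "@type": "dctypes:Text",
            "format": "text/plain",
            "language": language,
            "chars": text,
        },
        "on": f"{canvas_id}#xywh={int(x)},{int(y)},{int(w)},{int(h)}",
    }

note = transcription_annotation("https://example.org/canvas/p1", 410, 220, 900, 60,
                                "placeholder transcription of one line")
print(note["on"])
```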

In February 2018 Mirador was customized to show manifest (manuscript) and canvas (page) coordinates in the address bar. This allows scholars to copy and paste from the address bar in their browsers into any medium (such as an article or annotation) and direct others to the exact page in the viewer.
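The address-bar coordinates amount to a link that carries a manifest identifier and a canvas identifier. The sketch below shows one way to build and read such a link. The query parameter names "manifest" and "canvas" and the viewer URL are hypothetical, since the report does not state how the customized Mirador encodes them.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def page_link(viewer_url, manifest_id, canvas_id):
    """Return a shareable link that points the viewer at one page.

    The parameter names are assumptions made for this sketch.
    """
    query = urlencode({"manifest": manifest_id, "canvas": canvas_id})
    return f"{viewer_url}?{query}"

def parse_page_link(link):
    """Recover the manifest and canvas identifiers from a shared link."""
    params = parse_qs(urlsplit(link).query)
    return params["manifest"][0], params["canvas"][0]

link = page_link("https://example.org/viewer/",
                 "https://example.org/manifest/jubilees",
                 "https://example.org/canvas/p1")
print(link)
print(parse_page_link(link))
```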

Create an EpiDoc TEI XML version of the 1861 edition of Latin Moses

A fully tagged, machine- and human-readable version of the 1861 edition of Latin Moses facilitates study of the manuscript and will serve as the foundation for a new critical edition of the manuscript. Standard tags were used to code unclear characters, fully illegible characters, line and column breaks, chapter numbers, as well as verse numbers and emendations offered by subsequent generations of scholars. The XML edition preserves all available information and can be viewed in customized ways, such as showing the best available scholarly improved text, or the most faithful transcription of the manuscript, or both. Eventually, the past readings and emendations will be combined with new ones to create a complete critical edition.
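How one tagged text can yield both views can be sketched with a few common TEI elements: lb, unclear, gap, and choice with sic and corr. The fragment below is an invented placeholder, not text from the manuscript, and it omits the TEI namespace for brevity. Whether the project encodes emendations with choice or with another element is an assumption. The underdot for unclear letters and the bracketed dots for a gap follow the Leiden conventions.

```python
import xml.etree.ElementTree as ET

# Invented placeholder fragment, not a reading from the manuscript.
SAMPLE = ('<ab><lb n="1"/>et dixit <unclear>ad</unclear> eum '
          '<gap reason="illegible" quantity="3" unit="character"/> '
          '<choice><sic>dominu</sic><corr>dominus</corr></choice></ab>')

def render(elem, improved=True):
    """Return the text of elem (without its tail) for one of two views.

    improved=True  -> scholarly improved text (corrections applied)
    improved=False -> diplomatic text (manuscript reading, unclear letters dotted)
    """
    if elem.tag == "gap":
        return "[" + "." * int(elem.get("quantity", "1")) + "]"
    if elem.tag == "lb":
        return "\n"
    if elem.tag == "choice":
        chosen = elem.find("corr" if improved else "sic")
        return render(chosen, improved) if chosen is not None else ""

    # Generic element: own text, then each child followed by the child's tail.
    parts = [elem.text or ""]
    for child in elem:
        parts.append(render(child, improved))
        parts.append(child.tail or "")
    text = "".join(parts)

    if elem.tag == "unclear" and not improved:
        # Combining dot below (U+0323) after each unclear letter.
        text = "".join(ch + "\u0323" for ch in text)
    return text

root = ET.fromstring(SAMPLE)
print(render(root, improved=True).strip())
print(render(root, improved=False).strip())
```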

Media and scholarly relations, conferences, meetings, and presentations

Activities to support media coverage surged after the NEH announcement in August 2016. Coverage is listed below under Accomplishments. This coverage and web searches led to scholars contacting the project director with various requests, all of which were addressed. Conference presentations and public lectures are listed below under Accomplishments. A scholarly meeting dedicated to the project is planned for May 15-17, 2018 at the University of Notre Dame.

Training and mentoring

The two major categories of training and mentoring were the student researchers at St. Mary’s University and other imaging students and professionals. In academic year 2016-2017 Anthony Selvanathan was trained by the project director in various aspects of the project, and joined the team for five weeks in Milan. There he learned and was actively involved in all aspects of the project, especially manuscript handling and mounting for imaging and operating the image capture equipment. In academic year 2017-2018 the student research opportunities were opened up to all students on the campus of St. Mary’s University. Approximately thirty students were exposed to the project at least at the level of an information session, and twelve continued on to paid work. The training portion consisted of semi-weekly workshops of two hours each with the project director. This training supported independent work for the project on the students’ own schedules. The training included surveys of the general context of the project (the book of Jubilees, manuscript studies, spectral imaging). Specific skills trained and utilized were coding manuscripts in EpiDoc TEI XML, creating paleography charts using the IIIF Image API, and transcribing manuscripts using the annotation features of the IIIF repository and Mirador. The instructions created for the student researchers are included in the public archive of the project (2017 and 2018).

External training on Spectral RTI processing was provided to students and scholars working on other imaging teams, particularly Kathryn Piquette (University College London) and Sarah Baribeau (Lazarus Project and EMEL). This training was accompanied by the development of the Guide to Creating Spectral RTI.

Accomplishments

Spectral RTI Toolkit

Spectral RTI capture equipment and procedures

Data archive for advanced specialists

IIIF image repository and general user interface

Media coverage

Conferences and conference presentations

The Society of Biblical Literature Annual Meeting in Boston, November 2017, included a session titled, “Multi-spectral Imaging and the Recovery of ‘Lost’ Texts from Palimpsests.” It was a joint session of the Pseudepigrapha section and Digital Humanities section. Four team members presented (along with two others):

The Rochester Cultural Heritage Imaging, Visualization, and Education (R-CHIVE) consortium held a conference June 19-20, 2017 (with additional meetings before and after the conference). The following team members presented:

The Project Director (Todd Hanneken) gave public presentations on the project.

A scholars’ workshop is being planned for March 15-17 at the University of Notre Dame.

Audiences

The audiences served can be grouped into three categories: 1) scholars of the ancient literature being recovered; 2) digital humanists interested in the capture and processing technology, or similarly the publication technology; and 3) general interest and popular media.

Scholarly content

Scholarly interest has come through the team members and from presence on the Internet leading individuals to contact the project director. The team members are all well established in their respective specializations, so word has spread quickly. The November 2016 meeting of the Society of Biblical Literature in particular was an excellent opportunity to identify and begin collaboration with scholars specializing in the various texts imaged. Other scholars have come to us based on searching the Internet or reading our publications. Working at the Ambrosiana for a month was a great opportunity to network in the field and demonstrate our capabilities.

The first print publication of an image we captured appeared on the cover of Alexey Eliyahu Yuditsky, A Grammar of the Hebrew of Origen’s Transcriptions. Israel: The Academy of the Hebrew Language (2017).

Digital Humanities communities

Similarly, word about our technologies for capture, processing, and publication spread through professional networks and the Internet. For example, Gregory Heyworth of the Lazarus Project also does spectral imaging with MegaVision, and was one of the first to learn about the new Spectral RTI arc and software to process the data captured therewith. Giulia Rossetto is an imaging specialist who worked on the Sinai Palimpsests Project and donated some of her time to assist our project in Milan. Kathryn Piquette is an RTI specialist expanding into spectral using the PhaseOne system, and has begun using our software with her own captures. The project director demonstrated the project and the IIIF Navigator interface (designed to accommodate the large number of resources per page) to the monthly community video conference of the Manuscript Group of the IIIF community. He also presented on texture imaging at the first annual conference of Rochester Cultural Heritage Imaging, Visualization, and Education (R-CHIVE).

General interest and popular media

There has been significant popular media interest in the project. See especially the publications noted above.

Evaluation

We measure our success by our ability to answer the following questions.

Does Spectral RTI technology work?

Yes. Beyond the limited tests from the startup phase we demonstrated that the technique is feasible in the capture phase and effective in the processing and publication phase. The equipment problems we did encounter had nothing to do with the addition of RTI. We were able to conduct the capture at a steady pace consistent with other spectral imaging projects. The technology works efficiently and consistently combines the advantages of spectral imaging with the advantages of RTI.

Can we increase the speed and efficiency of Spectral RTI?

Yes. By using the MegaVision RTI arc we can conduct Spectral RTI in less time than it takes to do RTI alone using the hand-held flash method. The RTI sequence of 56 captures takes 4-5 minutes. The sustained rate of capture (including object mounting, 52 spectral captures, 56 RTI captures, breaks, occasional troubleshooting, visitor interruptions) averages 20 minutes per page. The arc also saves about five to ten minutes of processing time compared to determining light positions of a handheld flash from a reflective hemisphere.
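The figures above imply a few derived rates, worked out here as simple arithmetic. Only the inputs (56 captures, 4-5 minutes, 20 minutes per page) come from the report. The derived values are straightforward division and are not separately measured.

```python
RTI_CAPTURES = 56          # captures in one RTI sequence
RTI_MINUTES = (4, 5)       # reported duration range of the RTI sequence
MINUTES_PER_PAGE = 20      # reported sustained average per page

# Seconds per RTI capture at the fast and slow ends of the range.
seconds_per_capture = tuple(m * 60 / RTI_CAPTURES for m in RTI_MINUTES)

# Share of the per-page time spent on the RTI sequence itself.
rti_share = tuple(m / MINUTES_PER_PAGE for m in RTI_MINUTES)

# Sustained throughput.
pages_per_hour = 60 / MINUTES_PER_PAGE

print("seconds per RTI capture:", [round(s, 1) for s in seconds_per_capture])
print("RTI share of page time:", rti_share)
print("pages per hour:", pages_per_hour)
```

On these numbers the RTI sequence accounts for roughly a fifth to a quarter of the time spent on each page.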

Can we make Spectral RTI technology available to other image teams?

Yes. This is not a major goal until the third project year, but we were already able to work with Kathryn Piquette to capture and process images from Spectral RTI. The fact that she is working in the UK on a different system (PhaseOne rather than MegaVision) is a good start to expanding our reach. The technology has also been adopted by Gregory Heyworth of the Lazarus Project at the University of Rochester and Sarah Baribeau of EMEL. Our goal is the opposite of forming a monopoly on the technology.

Can the images reach a wide audience through open standards and web interfaces?

This is a long-term goal. The repository and interfaces work now, but they still need online tutorials and interface polish. We plan to measure how many people we reach through the non-specialist interfaces using software designed to analyze Apache web server logs. The logs are being archived but have not yet been processed. We are not tracking users through cookies. Google Analytics is used only for the main page.
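One simple measurement that archived logs allow, without cookies, is a count of distinct client addresses per day for a given interface. The sketch below assumes the logs use the Apache Common or Combined Log Format. The function name, the path prefix, and the choice to count only successful (2xx) requests are assumptions, and the project may use an existing log analysis package instead.

```python
import re
from collections import defaultdict
from datetime import datetime

# Leading fields shared by the Common and Combined Log Formats.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)

def unique_hosts_per_day(lines, path_prefix="/"):
    """Count distinct client addresses per day for successful requests.

    Hypothetical helper; lines that do not parse are skipped.
    """
    hosts = defaultdict(set)
    for line in lines:
        m = LOG_LINE.match(line)
        if not m or not m["status"].startswith("2"):
            continue
        if not m["path"].startswith(path_prefix):
            continue
        day = datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z").date()
        hosts[day].add(m["host"])
    return {day.isoformat(): len(seen) for day, seen in sorted(hosts.items())}

# Invented sample lines using documentation-only IP addresses.
sample = [
    '192.0.2.1 - - [10/Feb/2018:13:55:36 +0000] "GET /iiifp/ HTTP/1.1" 200 2326',
    '192.0.2.1 - - [10/Feb/2018:13:56:01 +0000] "GET /iiifp/x HTTP/1.1" 200 512',
    '203.0.113.9 - - [10/Feb/2018:14:02:10 +0000] "GET /iiifp/ HTTP/1.1" 200 2326',
    '198.51.100.7 - - [11/Feb/2018:09:05:00 +0000] "GET /iiifp/ HTTP/1.1" 404 0',
]
print(unique_hosts_per_day(sample, path_prefix="/iiifp/"))
```

Distinct addresses are only a rough proxy for people, since shared networks undercount readers and crawlers inflate the total.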

Can we overcome the damage done by chemical reagent in the case of the major showcase manuscript?

This is the most interesting question to the image processing scientists. Progress on this test case would have broad application. So far we have decent progress with a new technique called Ruby (Royal blUe fluorescence divided BY transmissive). We also have an understanding of why it is so difficult (because the undertext, overtext, and reagent are all made of the same iron gall ingredients). We do not yet have a slam-dunk universal solution.

Can we correct or add to the scholarly edition published in 1861?

This is the most interesting question to the scholars of Jubilees and the Testament of Moses. After a very short time, it seems we can add at least a few words to what Ceriani could read in 1861. Whether this will have a major impact on our understanding of the original composition remains to be seen. Perhaps more intriguing is our ability to correct Ceriani’s reading. At that time editors took liberties with reconstruction and claiming to see what they expected to see. As we accumulate evidence that Ceriani was a “loose” editor every reading he proposed comes into question and will be subject to additional scrutiny. Work here has barely begun. We started working on the regions Ceriani did not claim to read and only accidentally noticed that some of the surrounding text does not fit his reading or clearly reads otherwise.

Can we offer a resource for teaching and research on the manuscript, beyond the transcription of the major text?

At this early stage I can assert that I think the repository and interface are a great teaching tool. Our graduate researcher was certainly awestruck when we let him interact with the first enhanced images. Additional evidence will be gathered here in future reports.

Continuation of the Project

We have all the data we need to proceed with the rest of the project periods. We can demonstrate the utility of the technology to scholars and prepare the technology for widespread adoption.

Additionally, we are building a strong case for returning to the Ambrosiana to image more palimpsests. The Ambrosiana seems to be comfortable collaborating with U.S. institutions (Notre Dame and the Jubilees Palimpsest Project) as the digital library side of their mission while they continue their non-digital work. We are also building a network of contacts and a case for establishing a consortium of imaging projects in northern Italy. This would save much money on transport and troubleshooting of equipment while allowing individual teams to pursue a diversity of approaches with different collections.

Long Term Impact

We seem poised to make long-term impacts on scholars of ancient literature and on image capture, processing, and publication teams. Evidence will be added here as it becomes available.

Grant Products

Direct

The project website works and includes basic processing of all pages. Advanced supervised processing has a strong start and is ongoing. IIIF Presentation manifests have been created for all the palimpsests imaged.

The ImageJ macro edition of the SpectralRTI_Toolkit and documentation are now available. It will benefit from development in performance and user interface thanks to the contracted work with the Saint Louis University Center for Digital Humanities and future training and feedback sessions.

Media Coverage

See above.

Appendices

Designs for a hemisphere capture device

The design with two low pivots was built:

[Image: TwoLowPivots-off]

The “carousel” design could also be useful in permanent facilities with adequate space and overhead structure.

[Image: Carousel-off2]