ICA/SUV

Archivists and Technology

Sonia Yaco
Old Dominion University, Norfolk, Virginia

Paper Presented at the 2007 Annual Conference of the International Council on Archives Section on University and Research Institution Archives, University of Dundee, Scotland, August 15, 2007



Archivists are at a strategic crossroads. In a profession with a long history, it is easy to rely on our fine-tuned skills for appraising, describing and providing access to our collections. However, in so doing we are putting ourselves at risk. Archivists cannot afford to let our professional abilities stagnate while the world of technology continues to evolve at an ever-quickening rate. Lack of technical skills among archivists bars us from utilizing a variety of new technologies that can be used as tools for processing and for providing better access to our collections. Technical tasks ranging from Web page design to EAD encoding to setting up components such as style sheets and SGML/XML/XSL/HTML parsers for publishing EAD on the Web are increasingly needed to ensure archives are visible to users. Additionally, understanding new technologies like podcasts and voice recognition software will help us to increase access to our collections.

Today I will be discussing two research projects I've conducted on different aspects of using technology to facilitate discovery of our holdings. First I will discuss a pilot study I conducted evaluating the potential for using voice recognition software in the appraisal and transcription of oral history tapes. [1] Automatic computer transcription of spoken word material could revolutionize archivists' and researchers' intellectual access to this material. Audio content, now hidden in sparsely described catalog records, could have keyword indexes automatically created with voice recognition software. Low cost transcripts could be used by archivists evaluating and describing spoken word material as well as by researchers seeking to understand the content of tapes before visiting a remote archives.

Secondly I will look at the results of a survey I conducted with sixteen archives in the United States examining barriers to the implementation of EAD. One of the major findings from that survey is that there is a gap between the technology needed to encode and publish EAD and the skills of many archivists.

Voice recognition software, simply put, is software that transcribes spoken words into written text. Use of this software has the potential to increase intellectual access to the riches of oral histories and other audio tapes. Having computers automatically create transcripts could greatly increase the speed at which archivists process tapes, increase the richness of their description and make inexpensive transcripts available to researchers. Sound tapes, which typically come into archives on analog cassette or reel-to-reel tape, could be converted to digital format and then automatically transcribed with this software.

The mechanism for using voice recognition software begins with a user speaking into a microphone attached to a computer. There are two aspects to voice recognition technology: acoustic phoneme recognition, in which the software maps sounds to words, and statistical language modeling, which weighs the probability that any given word will appear in a sentence. Together, these two components make it possible for the software to choose the most statistically likely word from among similar-sounding ones.

In my pilot study I evaluated two types of voice recognition software. Dragon NaturallySpeaking™ is speaker-dependent, designed to recognize the voice of a specific speaker. AudioMining is speaker-independent, intended to recognize multiple speakers' voices.
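
The way acoustic recognition and statistical language modeling combine to choose between similar-sounding words can be illustrated with a minimal Python sketch. It is not how either product in this study is implemented; the probability tables and the function name pick_word are invented for illustration only.

```python
import math

# Hypothetical acoustic scores: how well each candidate word matches the sound.
# These probabilities are invented for illustration.
ACOUSTIC = {"evolutionary": 0.55, "revolutionary": 0.45}

# Hypothetical bigram language model: probability of a word given the previous word.
BIGRAM = {
    ("quite", "evolutionary"): 0.01,
    ("quite", "revolutionary"): 0.05,
}


def pick_word(previous, candidates, acoustic=ACOUSTIC, bigram=BIGRAM, floor=1e-6):
    """Return the candidate with the highest combined acoustic and language score."""
    def score(word):
        # Adding log probabilities is equivalent to multiplying the probabilities.
        language = bigram.get((previous, word), floor)
        return math.log(acoustic[word]) + math.log(language)

    return max(candidates, key=score)


print(pick_word("quite", ["evolutionary", "revolutionary"]))
```

In this toy example the acoustic score alone slightly favors one word, but the language model's preference for the other word after "quite" outweighs it. Real systems use far larger models trained on recorded speech and text.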

The spoken material I used as test data in this study was a series of cassette tapes containing interviews conducted by University of Wisconsin-Madison emerita Law Professor June Weisberger in 2004. Weisberger interviewed eight people about their role in the creation of the 1984 Wisconsin Marital Property Reform Bill, which sought to create equal rights for women to property during a marriage. The nine speakers on the tapes are college-educated Caucasians ranging in age from 60 to 80. Eight of the speakers are women and one is a man. They were raised in the Midwestern or eastern part of the United States. For all of these speakers, English is their first language.

Table 1 shows the transcription of the first sixty seconds of one of the tapes by three different transcription methods: manual, Dragon NaturallySpeaking™ and AudioMining. There is significant variation among the three methods of transcription. The Dragon NaturallySpeaking™ transcript bears almost no resemblance to the manual transcription. Only three phrases from the entire 60-second transcript were accurately transcribed: 'where we were', 'all of these resources' and 'actually became one of our'. None of these phrases would be helpful in determining the subject of the tape. There is much more of a correlation between AudioMining's transcription and the manual transcription. Although the first few phrases are not transcribed accurately, important keywords in the rest of the transcript, including 'Divorce Reform Act,' are correctly transcribed.

Table 1. Transcription of Tape One, Side B (first 60 seconds)

Transcription method: Manual
Content:

[June Weisberger:] March 3 2004 between June Weisberger and Linda Roberson.

[June Weisberger:] Okay, Lindy.

[Linda Roberson:] Okay. Do you need to check that and make sure...

[June Weisberger:] No, no.

[Linda Roberson:] You can see that it's going. I was talking about what Mary Lou brought to the table. And where we were at that time was that she had just very successfully united bench and bar behind the Divorce Reform Act, which was quite revolutionary in its way and so that was her work there was pretty much done. She had all of its goodwill and all of these resources that she could bring to bear and she'd just finished a tremendous job on a major piece of legislation itself and so could we just slide in another major piece of legislation and move her from protecting women at divorce to protecting people during the ongoing marriage. Which actually became one of our big arguments as I know you recall.

[June Weisberger:] One of the best arguments.

[Linda Roberson:] Yeah.

Transcription method: Dragon NaturallySpeaking™
Content:

As or a clean July 3 a in a any as you know you and I on and I am rocking in and where we were a time that he is very fast lane eight at him are behind format with evolutionary flat on said who worked in the last time she had all been with all of these resources the heated rings in their sieges if any action is tiny and flashes of who defy a knee in the last angle or for a minute for everything he owns or any there was actually became one of our in our is unknown after you.

Transcription method: AudioMining
Content:

To pitch their 2004 between giveaways. And at a list of the need to check many sure and I give you a call, and I may move off the table and where we were at a time with which he just very successfully is not a bench and bar behind the divorce Reform Act, which was quite evolutionary and it's why on and so that was her work there with pretty much done. She had all of its goodwill and all of these resources that she could bring to bear. And she just finished a tremendous job on a major piece of legislation itself can be defined in another age in the Atlanta life said any move for protecting women at work to protect the people during the ongoing marriage, which actually became one of our big arguments as I know you all -- started out because like.
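
A comparison like the one in Table 1 can be approximated automatically. The following Python sketch is my own illustration, not the method used in the pilot study: it lists runs of consecutive words that a machine transcript shares with the manual one, using the standard library's difflib. The function names and the three-word threshold are assumptions.

```python
import re
from difflib import SequenceMatcher


def words(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def shared_phrases(manual, machine, min_words=3):
    """Return runs of at least min_words consecutive words found in both transcripts."""
    ref, hyp = words(manual), words(machine)
    # autojunk=False keeps frequent words from being ignored in longer texts.
    matcher = SequenceMatcher(a=ref, b=hyp, autojunk=False)
    return [
        " ".join(ref[block.a:block.a + block.size])
        for block in matcher.get_matching_blocks()
        if block.size >= min_words
    ]


# Short excerpts taken from the manual and Dragon NaturallySpeaking rows of Table 1.
manual = "She had all of its goodwill and all of these resources that she could bring to bear"
machine = "she had all been with all of these resources the heated rings"
print(shared_phrases(manual, machine))
```

Matching word runs are only a rough proxy for accuracy. A phrase can be transcribed correctly and still say little about the subject of a tape, as the Dragon NaturallySpeaking™ results show.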

Findings

Dragon NaturallySpeaking™ produced a transcript that has little relation to the actual contents of the spoken interview. Few important keywords were transcribed.

AudioMining was much more accurate in transcribing Professor Weisberger's interviews. It created transcripts that conveyed a good sense of the general topics covered in the interviews. Many important keywords including 'divorce', 'commission on women', 'marital property reform', and even 'Weisberger' were correctly transcribed.

However, AudioMining is very expensive, approximately $5,000 [2], and requires a great deal of technical expertise to install and use.

My criterion for success in this study was that the software accurately create a written transcription of the contents of the tapes that could be used by archivists evaluating or appraising them, and by researchers wanting access to the information contained on the tapes without listening to them. In their current versions, the voice recognition packages I evaluated in this pilot study produce results that are too inexact to provide substantial assistance for archivists or researchers. However, the keyword identification done by AudioMining shows great promise. Combined with tools that analyze the frequency of keywords in a text, AudioMining could allow researchers to discover relevant collections of spoken word material. [3]
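
Keyword frequency analysis of a machine transcript can be sketched in a few lines of Python. This is an illustration under simple assumptions, not a description of any tool named in this paper: the stop word list is a small hypothetical sample, and the function name keyword_counts is my own.

```python
import re
from collections import Counter

# A small, hypothetical stop word list; a real tool would use a much longer one.
STOP_WORDS = {
    "the", "and", "that", "was", "which", "she", "her", "for", "with",
    "had", "its", "all", "these", "could", "just", "you", "our", "are",
}


def keyword_counts(transcript, top_n=5):
    """Count the most frequent words in a transcript, skipping stop words."""
    tokens = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter(
        token for token in tokens
        if token not in STOP_WORDS and len(token) > 2
    )
    return counts.most_common(top_n)


# An invented sample sentence, not a quotation from the tapes.
sample = "The divorce reform act and the divorce law changed marital property"
print(keyword_counts(sample))
```

Even when a transcript is too inexact to read, a ranked list of this kind could suggest whether a tape is worth a researcher's closer attention.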

Several new voice recognition products have been released in the last year, such as Sonic Foundry's Mediasite [4], which contains a voice recognition component with keyword searching capabilities for videotape recordings, and Blinkx [5], a web search engine that utilizes automatic speech recognition technology to crawl audio and video content on multimedia web sites.

The continued development of video, audio, commercial and personal computer voice recognition software suggests that we are tantalizingly close to the development of technology that could provide keyword search capability for oral histories.

Now that we've discussed one technology that could help provide access to some of our collections in the future, I'd like to discuss the problems involved in implementing a current technology tool designed to increase access. That tool is Encoded Archival Description (EAD). This markup language was designed as a way to standardize finding aids and, in so doing, facilitate discovery of archival collections across multiple institutions in online environments.

What prevents archives that want to implement EAD from doing so? Nine years ago, when EAD was first released, the answer was that the technology simply was not there: there was no software readily available for editing or viewing EAD.

Today the answer to the question appears to be more complex. This spring I surveyed sixteen archival repositories to try to identify what those obstacles are and how they might be removed. Through this survey I was able to identify three main barriers to implementing EAD and have explored some possible solutions to those barriers.

In order to see what issues there might be among current implementers, I looked at the 78 EAD Implementors Listings on SAA's website. I identified several trends from them:

Implementers use a complicated software workflow; some of them use six different packages to mark up and publish their finding aids in EAD. Numerous institutions had server and technology problems in the implementation process. Of the solutions found for these problems, the one mentioned most consistently was the use of outside resources for various phases of the process.

I created a survey based on the issues raised in the literature I reviewed and my discussions with archivists and librarians at the University of Wisconsin-Madison and the Wisconsin Historical Society. The questions were divided into sections on background, problems, costs, expertise, workflow and solutions. The subjects were perception of EAD complexity and cost, workflow details, degree of institutional support, familiarity with authoring software and availability of technical resources.

There were three main barriers identified by respondents. The first and primary barrier to EAD implementation was lack of staff. Implementing EAD is a time-consuming process. The initial planning, designing a workflow, choosing software for encoding, rewriting or updating of finding aids, setting up an EAD server, and then encoding and publishing finding aids all require vast expenditures of staff time. Respondents said that these tasks might be possible if they could reserve a block of time, but their other duties precluded this. As one said, 'It is not just finding the time to do the encoding, it is finding the uninterrupted time to think out how to do the encoding.'

Secondly, the lack of expertise in server technologies creates a middleware gap. Middleware is software that mediates the exchange of information between an application and a network. In other words, archivists appear to know how to mark up finding aids in EAD, but do not know how to deliver that content to a web site. A review of implementation study literature shows that this is a problem that has existed since the beginning of EAD.

The third and final barrier to implementation identified in this survey is the plan of many archives to rewrite their finding aids before implementing EAD. The majority of respondents with existing finding aids planned to augment, update or rewrite existing finding aids before EAD encoding. Several other studies have suggested that the drive to get all finding aids up to current archival standards prevented institutions from encoding any finding aids.

In addition to identifying the problems preventing EAD implementation, my survey also attempted to determine possible solutions to these problems. Given the concern with staffing levels displayed by respondents, it is not surprising that use of outside consultants was cited by respondents as the single most helpful factor in implementing EAD. Their responses and a review of the literature suggest that the most productive use of consultants would be for planning and workflow design, and to set up server environments.

There are several possible ways to close the middleware gap. One would be to improve staff knowledge of server technologies by expanding standard EAD training to teach the server technology needed to publish EAD-encoded finding aids on the Web.

A solution that would require less server knowledge from archivists is the use of software that reduces or eliminates the server customization required to host EAD. One such package is a promising new version of Archon, recently released by Christopher Prom and his colleagues. This all-in-one software can be used to encode finding aids; output them in several file formats such as HTML, XML and PHP; and provide a server-friendly relational database and search engine.

One way to deal with the desire to encode finding aids only after they have been rewritten would be to encode and publish finding aids in two rounds. The first round would be to encode finding aids up to the basic EAD record guidelines as defined by the Library of Congress's 'Minimum Recommended Finding Aid Elements.' Only finding aids that do not contain these basic elements would need to be updated. A second round of encoding could add other levels of description to these basic EAD records and update all elements to the repository's current standards, as needed.
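
Deciding which finding aids already meet a basic record level is a task that lends itself to automation. The Python sketch below is a hedged illustration only. It assumes EAD files without an XML namespace, and its list of required element paths is a hypothetical stand-in; it is not the Library of Congress list, which should be substituted for it. The element names themselves come from EAD, but the function name and sample record are invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a repository's chosen minimum elements.
# Replace with the paths required by the guideline actually being followed.
REQUIRED_PATHS = [
    "eadheader/eadid",
    "archdesc/did/unittitle",
    "archdesc/did/unitdate",
    "archdesc/did/repository",
]


def missing_elements(ead_xml, required=REQUIRED_PATHS):
    """Return the required paths that are absent or empty in an EAD document."""
    root = ET.fromstring(ead_xml)
    missing = []
    for path in required:
        node = root.find(path)
        # Treat an element with no text anywhere inside it as missing.
        if node is None or not "".join(node.itertext()).strip():
            missing.append(path)
    return missing


# An invented placeholder record, assuming no XML namespace.
SAMPLE = """<ead>
  <eadheader><eadid>example-001</eadid></eadheader>
  <archdesc level="collection">
    <did>
      <unittitle>Example interviews</unittitle>
      <repository>Example University Archives</repository>
    </did>
  </archdesc>
</ead>"""

print(missing_elements(SAMPLE))
```

A finding aid for which this check returns an empty list could go straight into the first round of publication, while the others would be queued for updating.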

As an archival standard, EAD holds the significant promise of improving the quality of archival description. If enough archives adopt EAD, finding aids will improve and users' access to archival material will increase. However, a standard that no one is able to adopt is of limited value, so it is vital that the professional community explore and understand the factors that make implementation of EAD difficult. It is unrealistic to implement EAD with existing staffing levels. With staff workloads already pushed to the limit, it is especially impractical for an institution to expect to implement EAD if it requires every encoded finding aid to be fully compliant with current local standards. This is particularly true at smaller archival repositories.

However, increased staffing alone will not ensure successful implementation, because EAD is a technology-dependent standard. There is still a significant gap between the technological expertise needed to implement EAD and the computer skills of many archivists. What resources archivists use to bridge these gaps will decide the future of our profession. We can bridge them by expanding our internal resources through increased staffing, increased technical training, and the use of less complicated EAD software, or we can bridge them by using outside resources such as consultants. While the survey reveals that many archivists feel the best solution is to use consultants for most aspects of EAD implementation, it may be more appropriate for archives to use consultants as part of a proactive plan to incorporate this technology into the core work of archivists.

Today I've discussed two technologies that archivists can use to facilitate discovery of our collections. Voice recognition software may soon be useful for providing keyword access to spoken word material. EAD exposes content to a wide audience, but only if it can be implemented. Lack of staff technical skills prevents many institutions from fully utilizing this tool.

Archivists need to reach out and embrace these and other new technologies. It is the only way to ensure that we, our profession and our collections do not become irrelevant.



[1]  This study is described in more detail in Sonia Yaco, 'The Potential for Use of Voice Recognition Software in Appraisal and Transcription of Oral History Tapes,' Association for Recorded Sound Collections Journal 2007 (Fall), 214-225.

[2]  The full retail purchase price of AudioMining in July 2007 was $5,000.

[3]  An example is the search engine Grokker, http://grokker.com.

[4]  'Mediasite.com,' http://www.mediasite.com, (accessed: 13 April 2007).

[5]  'Video Search Engine Blinkx,' http://www.blinkx.com, (accessed: 21 October 2007).