Considering Archives and Users in the Internet Age

Maygene Daniels
National Gallery of Art, Washington, D.C.

Paper presented at the 2007 Annual Conference of the International Council on Archives Section on University and Research Institution Archives, University of Dundee, Scotland, August 14, 2007

My subject this afternoon is identifying and understanding our users and their needs, a subject that I'll explore in the context of the changing world of 2007. First, I'd like to emphasize that, as every archivist knows, whatever technologies we use, our services for researchers always will depend on a foundation of good, sound archival arrangement, description, and physical preservation. Yet, although technology is not everything, the digital revolution of the past decade and the explosive growth of internet have caused fundamental changes in who our users are and how we serve them. Bill Maher recognized this 7 years ago when he spoke to this section's meeting in Cordoba on the future of college and university archives. Even then Bill described a rapid evolution in user expectations, citing new interests in visual materials, greater demands for narrower pieces of information, and growing interest in internet-based digital imaging projects. These are important factors, among a host of other changes. By now, 7 years after the Cordoba meeting, essentially every archives uses, and every archives user has access to, a computer. We are in a new era, so we must now ask: how well are we serving our users now that e-mail and internet are inescapable facts of life? To answer that question, we first need to consider who our users are. Evidently each archival repository defines its own community of users, sometimes giving priority to some users and occasionally excluding others. As long as policies are based on some reasonable grounds and consistently applied, this is perfectly acceptable.
My purpose here, though, is not to analyze the specific user communities of different repositories but instead to look broadly at all communities of archives users over time. Here a historical perspective may be helpful. Decades ago, internal, administrative use was considered the principal purpose for traditional institutional records. Hilary Jenkinson enshrined this point of view in his 1922 Manual of Archive Administration when he limited his definition of archives to records identified and preserved due to their value for institutional administrators or their successors. In other environments in the last century, academics and scholars became the privileged and preferred users for archives and manuscript collections. American T. R. Schellenberg reflected this point of view in his seminal handbook Modern Archives, published in 1956, when he defined archives as exclusively documents preserved because of their value for "research and scholarly use." Later, especially during the activist decades of the 1970s and 1980s, archival institutions increasingly recognized the interests of users beyond the academy. In this new atmosphere, archives were not content to await serious researchers but instead were encouraged to actively reach out to new communities. One set of guidelines issued in 1984 by the Mid-Atlantic Regional Archives Conference in the United States observed "well-directed publicity will encourage research use," suggesting that even the general public might appropriately be welcomed in archives. In her excellent manual Providing Reference Services for Archives and Manuscripts, published by the Society of American Archivists in 2005, Mary Jo Pugh reflects this newer vision of archives users when she recognizes both professional and avocational users as legitimate archival clients. She defines professional users as those who need archives for any business purpose.
These include institutional administrators, academic researchers, students, lawyers, documentary filmmakers, journalists, and others. She defines avocational users as those who benefit from archives for reasons of personal interest. Thus genealogists, local historians, hobbyists, and the public at large all can be considered non-professional, but still appropriate users. Now, in the 21st century, archives may well be serving both specialists and the general public, and they may do this either by providing direct individual research services or through indirect services delivered through secondary or derivative sources. Traditionally secondary sources have been enjoyed most by the broad interested public, especially avocational users. They have seen documents on display in archival exhibitions, or have enjoyed books, articles or documentary films. I see no signs that these rewarding archival products will disappear. At the same time, internet technology has created a supplemental way to make such offerings available without the restraints of geography. Archives on every continent are making creative use of this new media for on-line exhibitions, publications, teaching materials, audio podcasts, and documentary films. Although on-line use by the general public is notoriously difficult to evaluate, anecdotally, rich web sites are now reaching internationally new audiences who would never be able to physically visit an exhibition and might not have access to expensive publications. More people are able to reach these resources via internet, and undoubtedly more are doing so. This is a plus from almost any point of view. But what of those researchers who need direct access to archival documents? They need to know where to find collections, they need access to the documents, and finally they need to have copies and the legal right to publish document images. 
In 2007 internet is providing unprecedented access to archival information and in the process is expanding our communities of users. At the same time, the new applications are not without problems and cautionary tales, as I'll explain. The first step in research always must be for a potential user to find the repository that holds needed documents. The repository may be obvious, but often finding the correct repository is not easy. If the institution is not apparent, trained users in the past have used a mixture of knowledge and common sense, clues from footnotes, and library sources. Published collection guides and finding aids and printed multiple-collection catalogs such as the venerable National Union Catalog of Manuscript Collections in the United States (NUCMC) have been primary and essential tools. Let's look at NUCMC in greater detail. This well-regarded tool was designed to be used in a series of annual printed volumes. The series ended in 1993. Microfilms of the volumes are still available, but in 1993 NUCMC went on-line, and this is how it is used today. This evolution is typical for many bibliographic tools, which started in print form and are now available electronically. Without reservation, this is a positive advance in user service. One only has to think of the problems of ordering a hard copy catalogue or microfilm to realize how wonderful internet really is and how broad the reach of our resources has become. Nonetheless, even as guides and finding aids are far more available than in the past, simply moving them from paper to internet does not necessarily create an effective tool - and can lead to oddities in structure and presentation. A search screen on the current version of the NUCMC web site illustrates this. It provides an access point to a large combined source of information - and yet it is a construct of the pre-internet age since the system is entirely closed.
Any search is restricted to data specifically placed in this limited, highly controlled system. It's a model that evolved across a century. The question is whether it is still effective. I think not. To illustrate, using the NUCMC/RLG query screen I searched on the name Chester Dale, an important art collector and donor to the National Gallery of Art. The result of the search is not very friendly to users. The hits are uninformative, repetitive, and arranged in a meaningless alphabetical order - although in fairness if you click on an entry you are led to a MARC record - but not to a complete finding aid for the collection. Overall the result is not impressive - and worse, most users would never find it to begin with since NUCMC is buried in the United States Library of Congress website where it is not given particular prominence. Of course there are other searchable catalogues that have made the transition to internet more gracefully, but all are built on restricted sources and the results are equally limited. So how do researchers search for historical materials now if our traditional closed catalogs and guides don't work? Overwhelmingly archives users now turn to on-line search engines such as Google and Yahoo as their preliminary point of access to archives. The same search for Chester Dale in Google results in references to websites arranged in a hierarchy of relevance. The first in order guide us to art museums that own paintings donated by Chester Dale. This may not be what I'm looking for, but it is informative. And at the bottom of the page you'll see a reference to the Chester Dale papers. It's actually a reference to an on-line exhibit - not what I want, but it lets me know the name of the papers and the institution and from there it's an easy step to the finding aids - not just the MARC records - I'm looking for. 
Checking further on the original Google search, I find surprising and rewarding resources that are completely unexpected - including a link to a press release about Chester Dale in the archives of the Art Institute of Chicago. Why would anyone use the limited NUCMC resource with the universe of Google available? In this international conference, we need to ask whether internet searching is truly open or whether it may limit access in invisible ways, perhaps excluding results outside the immediate geographic or linguistic area. Encouragingly, barriers of language and national borders actually seem to disappear on internet. If a search is in French, French results are likely to appear. Google has a language options screen where the list of choices is intriguing, even including Esperanto. It's significant to note nonetheless that the default option is to search sites in all languages. In effect, then, this model for searching is wide open and unrestricted across language and cultural boundaries, and our researchers realize this. The very accessibility and breadth of internet has made archives tools available to anyone who cares to look. As a result, the base of archives users is unquestionably becoming broader than ever before. Again this seems to me to be a very good thing. Nonetheless there are unquestionably dangers in depending exclusively on this new model of internet searching. To start, open-ended internet searching, however sophisticated, always favors narrow questions associated with uncommon words. Broader themes or subjects are more difficult to find, but to a large extent this has always been true for finding aids. Evidently our users also need some level of sophistication in searching with Google or any other internet search engine and distinguishing between hits. Archival organizations also need the best possible strategies for maximizing the likelihood that potential users will find our resources. These are both important agendas for the profession.
Most significantly, we have no control whatever over these giant search engines, and, although some aspects of their strategies are well-known, many details are closely guarded commercial secrets. Although up to the present competition between Google, Yahoo and others has made the tools free, open, and generally unbiased, we have no way of guaranteeing that this will always be true. It's important for us as archivists and citizens to ensure that it remains so. We also need to ask ourselves whether we as a profession are focusing on the correct questions to build on successes and to make our resources easier to find and use on internet. Here I think that the answer is less encouraging. With our profession's intense attention to standards and bibliographic rules, in my opinion we have been missing key new issues. A historical perspective may be useful. Beginning even in the 1970s, archivists internationally were far-sighted in recognizing that automated systems would substantially alter the information universe. In response, the profession began to look to library precedents to find ways to standardize data to allow for information exchange and more effective access via computer systems. MARC for Archives and Manuscripts - MARC AMC - resulted from this concern and was the first of many descriptive and bibliographic standards in the age of automation. MARC AMC focuses on establishing consistent tags for information items and on the minutiae of how they are presented. Internationally archivists also have been seeking to harness automation through standardization. More than a decade ago in 1994 the International Council on Archives issued its General International Standard for Archival Description, ISAD-G, followed by other descriptive standards. When these rules were first discussed and implemented, they made a great deal of sense.
Internet was in its infancy and the closed traditional forms of searching either in library systems or shared guides on-line seemed to be the best option for the future. With the advent of the open model for on-line searching in Google or Yahoo, this expectation proved to be incorrect, yet our professional focus on the standards required for closed searching has continued. The most recent standard in the United States, Describing Archives, A Content Standard, DACS, continues to be the subject of lively interest. And ICA itself recently has formed a committee to establish a new international standard to describe institutions with archival holdings. Some of you may have seen the recent internet discussion of whether this standard is repetitive. I would go one step further and ask whether it is needed at all. This is not to say that all standards are unnecessary. Open, internationally-accepted technical standards in particular are essential for long-term access to digital information. Nor do I think that efforts to standardize data over the past decades have been wasted. To the contrary, they have been extremely helpful in honing our profession's unified sense of best practices. Now, though, the need for universally accepted descriptive standards for modern groups of records has disappeared and continuing to create further specialized rules seems particularly unproductive. Worse still, this focus on cataloguing rules deviates from the requirements of good archival description. MARC and its successors are highly effective for describing books and published materials within closed library cataloguing systems. They never worked particularly well for varied and complex groups of archival materials. Even EAD - encoded archival description - creates rules and rigidity where common-sense would do equally well. EAD provides an elegant framework to present traditional finding aids on-line. 
It uses a specialized document type definition - or DTD - for web-presentation, but for presentation, plain old off-the-shelf HTML could do equally well. I also wonder whether the standard will keep up with new advances in web presentation? I'm not convinced. It's also worth noting particularly that PDF - portable document format - files can have the same organization and content as an EAD finding aid with enhanced search functions. There are good arguments for using PDF that I won't go into here, except to observe that standards and rules are inflexible, and unless used carefully, can stymie rather than promote progress. Most significantly, I'm not aware of any evidence that any of these costly and difficult rules and standards, whether ISAD-G, DACS, or EAD, improves access to archival resources on-line. Put in another way, description rules are solutions to a problem that has disappeared. With the advent of sophisticated open search engines, our new agenda should be simplified to a single issue: how well do our finding aids perform on internet? Peter Horsman wrote: "The question of whether a particular kind of finding aid is ISAD-compliant is by itself not an interesting one, and even irrelevant." I agree with him completely. Let me turn now to the question of how well we are doing at delivering documents to researchers. To begin, researchers often need to find us on the street, and archival web sites certainly help readers find our facilities, hours, phone numbers and the like from anywhere in the world. Certainly this is a good beginning. Archivists also have long sought ways to deliver surrogates of documents to researchers. There are several reasons for this - to serve users who cannot come to the archives, to save wear-and-tear on fragile originals, and even to improve efficiency for access to heavily-used records or those that might be physically difficult to handle. 
In the past, documentary publications, first letterpress and later microfilm or microfiche, have been successful tools. Initially individual significant documents were selected for publication, but over time more and more series have been reproduced on microfilm in their entirety. Similarly in the digital age, many institutions are making selected documents and sometimes entire series available via internet. Although web publications fundamentally are the same as their analog predecessors, evidently the presentation medium is substantially different. Notably, on internet, especially with underlying searchable text, individual documents can be found quickly and directly without costly indexes for study in any location. But this flexible access has problems as well. Documents reproduced in print or on microfilm were physically captured in a specific, rational, order. This is not necessarily the case on-line where the medium encourages viewing each document as an item, out of sequence and out of context. Web sites also can change without trace, so that tracking a document can be mystifying. Footnotes are particularly challenging. Again we should ask how well we as archivists are serving users in this new medium. In response, I believe that there are enough superb examples to suggest that our colleagues are doing very, very well, although there are still ways to improve. Using a significant grant, the Archives of American Art has placed their entire collection of artist Joseph Cornell's papers on-line - all 33,000 pages. The result is quite amazing. By using a split screen, the order of the documents is shown in its established sequence on the left, folder by folder, front and back, as if on microfilm. Tabs above show the document folder making citation and identification straightforward. At the same time the full document can be examined in detail in the main frame. 
It seems to me that this presentation effectively takes advantage of the good things about digital access while diminishing some of the problems. In another example for the On-line Archive of California, certain difficult-to-access documents selected from a larger series are reproduced on-line. Images are matched with the hierarchical finding aid, so that the context is never in doubt. This is another good example. But there are others that are not so successful. In a typical instance of a problematic web presentation, a historical photograph appears on-screen in a vacuum. There's no way to know whether there are related documents or supporting information. My point is not to criticize, but instead to observe that we are still learning about the pitfalls and potential of this new environment. There is an ever-present danger that materials will be seen out-of-context and diminished in meaning. We need to understand the new environment well enough to avoid this pitfall. Large digitization projects also raise other concerns, since they can be costly to create and challenging to maintain. Digital images of documents are as fragile as any other digital object. For this reason, at the same time that continued quality internet publication seems an excellent thing, it is not universally affordable, or even perhaps wise given competing demands for limited resources. Each repository will need to make its own decisions based on individual circumstances. Digital access is likely to increase exponentially, enlarging the world of archives users, but it is unlikely to replace the research room - at least in our lifetimes. Archives thus will continue to directly serve users by providing on-demand research copies of documents for reference, for use in classrooms, and for publication. In this area the wide availability of inexpensive table-top scanners and digital cameras has infinitely simplified and improved our services, although it does not necessarily help us reach new users.
With this new technology, repositories - or even researchers themselves - can make quick, color reproductions at little cost. Thumbnail images can be distributed via internet, replacing photocopies, and higher-quality scans can make excellent reproductions in print or on internet. The era of the chronically under-funded and overtaxed repository photo lab is over. I think that this is a good thing. On the other hand, digital copies come with their own problems, since the very virtues of digital copies make them difficult to control. Now images can be copied, altered, and recopied until their original source and context are completely lost. Furthermore, in this era of Photoshop, any historical photograph is easy to alter, leading to misunderstandings of all kinds. So while on-demand digital copy services are an impressive advance, the ease with which digital copies can go astray or represent something that they are not is an on-going concern and something we as a profession should address. A corollary is that uncertainties concerning copyright ownership and other rights and permissions in the digital age also complicate efforts to assist users. These issues need our attention, even as the technologies themselves dramatically help us serve users in almost every way. Finally we must realize that evaluating our effectiveness in serving users will be one of the continuing challenges for archivists in this new internet era. Recently there have been some efforts to develop standard metrics for measuring research services. In my view these are doomed to irrelevance from the beginning. Measuring internet use is notoriously difficult, and the variation among users, repositories, and records, and the imponderables introduced by extremely rapid change, make any numerical evaluation essentially meaningless.
On the other hand, although statistics won't work, we need to continually evaluate our assumptions and performance, using common sense approaches and asking the right questions. Users will themselves always be our best source of information, and their comments and observations in reference interviews and debriefings, as well as internet feedback that is solicited and unsolicited, will offer important insights. I also believe that archivists should themselves become researchers from time to time. It's amazingly informative to be on the other end of a reference exchange, and even a few hours spent using finding aids and documents on-line can be infinitely instructive. We also must look to the future. Predicting is impossible, but perhaps some educated guesses might be of interest. To begin, in my view as search technologies become ever more sophisticated, archives will have strong incentives to reinvigorate traditional context-based description - our old inventories and registers. At the folder level these give users deep, potentially searchable, access to documents efficiently and cheaply, especially on-line. Archival websites also will almost certainly accumulate greater richness with on-line exhibitions and films, and geometrically larger numbers of digitized documents. With this improved on-line access more of our researchers will be in distant places. The character of communications between archivists and users also is likely to change as new web technologies make searching more intelligent, repeatable, and interactive. Whatever the future brings, we need to use technologies in ways that maintain the essential context of documents and support the principles of archival work. Within this dedicated profession, I have no doubt that we will be successful.