Coming to Terms with the Information Age in Archeology
My thesis in this paper is simple: although I believe we have made real progress since 1994 in our ability to disseminate data effectively, we as a discipline have yet to give serious attention to infrastructural concerns that must be addressed as we seek to integrate more fully electronic means of data distribution into archeological practice. I shall discuss five issues: 1) the reality of funding, 2) the question of audience and ends, 3) the tyranny of standards, 4) the necessity of continuity, and 5) the innovations that will matter most.
As recently as 1994, it was possible to ask the question “Has archaeology remained aloof from the information age?” in all seriousness.1 The author, Booth, answered in the affirmative, arguing that although a number of innovative uses of computing had appeared in archeology, most of those responsible for conducting archeological research and ultimately disseminating archeological data preferred to publish in “the conventional manner,” that is, on paper in articles, books and monographs. Perhaps of greater interest was his belief that it was becoming increasingly necessary for those who receive archeological data for storage, such as museums, record centers and libraries, to become innovators and develop new and exciting ways to deliver archeological information to the public. One of his main points was that these institutions, often with little participation by those involved in data creation, had to grapple with the emergence and definition of standards for describing data, the evolution of new and different modes of publication, and ways to make information accessible to end users. Although his definition of the “public” is quite limited, his opinions nevertheless are a good departure point for examining just how far the discipline has come in the past five years in delivering archeological information to its various constituencies.
The Reality of Funding
Everyone knows that computing power, software, storage and bandwidth have become much cheaper over the past decade. We still have to buy all of it, of course, and often we find ourselves being pushed to keep up with extremely rapid changes in information technology that threaten to make our existing investment obsolete. An often ignored or overlooked cost of computing, however, is that associated with the deployment of a system of IT support that keeps all of the component parts organized and in good working order. Support generally translates to “people,” and as numerous commentators have observed, it is increasingly difficult to find and retain IT staff.2 This problem shows no signs of abating. Other costs of IT support include those related to data acquisition and evaluation, entry and metadata construction.3
These costs of computing create a real dilemma in archeology, because by any measure we are not a well-funded discipline. Our available resources are generally spent on basic research, analysis, curation and heritage management. Despite a growing recognition that possession of an effective IT strategy is required for individuals as well as organizations, finding resources to implement the strategy is very difficult. Childs (this volume), describing Federal archeology on the Internet, makes a telling point: “The bottom line is that most Federal agencies with archeology programs do not have a formal Web design and development infrastructure for cultural resources, let alone archeology…Regular maintenance is difficult and new product development without extra help and money is even more difficult.”
This scenario is replayed in academic contexts. At my university, I have observed that while it is only moderately difficult to get new equipment, it is almost always close to impossible to get new IT staff. This is at least partially reinforced by our funding agencies. In the Social, Behavioral and Economic Research directorate of the National Science Foundation, which hosts archeology, there are a number of competitions that support the purchase of new equipment for both research and educational ends. But there are no competitions that finance the addition of staff aside from those that support laboratories, a topic I will return to below. While it is easy to make the case that NSF is not in the business of supporting infrastructure, this acknowledgment does not make it any easier to achieve our goals.
The reality of funding, then, is this: if digital dissemination of archeological information is to become a reality, the true and full costs of supporting an IT strategy for individuals and institutions must be recognized. At a minimum, such a strategy must involve regular upgrading of infrastructure (hardware, software, bandwidth, etc.) as well as maintaining an adequate level of support staffing.
However, if we are already stretching our limits in terms of current levels of support, how will this be done? We either find new resources or we reallocate existing resources to meet these new priorities. Although we could maintain the status quo — get done what can get done given what we have, rely on the goodwill of volunteers, etc. — so doing will mean that we will fall even further behind what might be useful and desirable to achieve our ends. And while I am always in favor of lobbying for new resources, it may be more realistic and practical to reassess our priorities. To do this, we must then consider our audience and ends.
The Question of Audience and Ends
The question “What are we disseminating and for whom?” should dominate our consideration of priorities. We all know that a data archive is developed for different ends than is a Web site created for a classroom. But one of the seductions of digital publishing is the belief that with just a little more effort, material presented to one audience or forum can be transformed into material for another. While it may be the case that from a purely technical perspective information can be reformatted and made available in a different package, this does not mean that the repackaged information is necessarily useful to that new audience. The analogy to print publication is clear: an author might choose to prepare a scholarly monograph for his colleagues, and it is thus written in a language and presented in a manner that reflects the expectations of that audience. However, it requires a separate intellectual act to write a book suitable for the educated lay public. Why? Because the public’s expectations for content, style and presentation are very different from those of specialists; a mere repackaging of the monograph is unlikely to achieve the desired result. And while I grant that things difficult to do today will become tomorrow’s routine task, the desire to do more will always be a tension since technology also will continue to offer ever greater possibilities.
Another growing conviction is that since technology offers us storage capacity (or virtual reality, complex three-dimensional renderings, high capacity bandwidth or other IT marvels) undreamed of even ten years ago, we should strive to use it in every instance. Recognition of this temptation arose, for instance, at a recent workshop on digital publishing sponsored by the Digital Archaeology Laboratory at the University of California at Los Angeles’ Institute of Archaeology, during a discussion about the ideal content of a digital monograph. Some argued that a digital monograph could contain “all of the data” generated by a project and that it would ultimately represent a kind of archive offering both interpretation and data presentation. Others present were less sanguine, and suggested that such a monograph would serve neither end particularly well. If one wants an archive, there are better models for it than a monograph format. Although all present agreed that a digital monograph could be anything its authors wanted it to be, a general consensus emerged that defined a digital monograph as a traditional, but value-added product, the production of which was contingent upon the demands and expectations of the audience. Here “value-added” refers to the extras, like sound, color image presentation and innovations in the mode of navigation, that technology allows us to add to our books. But the point remains: our products (Web sites, archives or monographs) must be driven by content, not technology.
This brings us back to the reality of funding. Audiences for many types of archeological information are very small, in some cases no more than a handful of specialists worldwide. Complex data management projects in these instances, such as digitizing large quantities of field notes and records from older projects, are very expensive and can thus be viewed as terribly wasteful of resources since so few benefit. Does this mean, then, that only those projects likely to reach large audiences, such as the Ancient Architects of the Mississippi Web site,4 should be digital? Surely, the answer must be “no” since we have an ethical obligation as archeologists to publish and preserve our data. But must we preserve it digitally? Strong arguments can be mustered on both sides. However, given current priorities that emphasize basic field research as opposed to collections-based projects, digitizing very large collections of original notes, records and other data that only appeal to small audiences will be difficult to justify.
The Tyranny of Standards
The computing world is a world of standards. Although I imagine that it would not be impossible to chart all of the standards that govern IT, it would be a Herculean task. Standards make computing possible because they facilitate communication; without standards, there would be no communication. Standards govern everything: how words, sounds and images are translated into electrons, how those electrons are routed across networks, how they are stored by media as disparate as floppy disks, flash cards and bubble memory, and how they are printed onto paper or other media. Standards govern how to describe data, and how to link data together so that other sets of standards recognize them. Standards, then, govern every aspect of IT from the most technical domain of hardware design and implementation to the broadest level of content definition and identification.
National and international bodies of all kinds — not to mention trade associations; business, academic, and interest-group consortia; and even individual companies — all define and then attempt to codify standards. Some standards have the imprimatur of government and are widely followed. Other standards are created by treaty agreements between nation-states as a part of their trade negotiations. Still other standards are less formal, but no less powerful. Consider the near-monopoly of the Windows operating system for desktop computers. No organization has forced the user to buy Microsoft, but instead, computing in North America has evolved to fix Windows as a de facto standard. Similar de facto standards of the recent past have been the 3.5-inch floppy disk (now being replaced by a number of rival candidates), CD (soon to be displaced by DVD) and the QWERTY keyboard (still going strong).
The tyranny of standards has two facets: 1) you do not control them, yet you must work with them as they dictate, and 2) they will change, and you will have to change with them. Both of these conditions lead back to the inescapable reality of funding: it is expensive to keep up with standards, but you have no choice if you expect to communicate. This is not a question of buying the “latest, fastest, greatest CPU” or keeping up with the Joneses, but a simple matter of keeping abreast of change and coping with it. Consider this example from the world of metadata standards:
“After conducting an analysis of the scope of the project, it was determined that database modeling should focus on the specific metadata requirements necessary to support the Federal Geographic Data Committee’s (FGDC) Content Standards for Digital Geospatial Metadata (CSDGM), the Directory Interchange Format (DIF), the Government Information Locator Service (GILS), and the Earth Observing System Data and Information System (EOSDIS) Information Management System (IMS) data order requirements. As a secondary exercise, the data model will be extended with some additional content elements from the Anglo-American Cataloging Rules (AACR)/Machine-Readable Cataloging (MARC), and both MARC and Standard Generalized Markup Language (SGML) will be incorporated as data element tags into the data model.”5
This statement was issued by CIESIN, the Consortium for International Earth Science Information Network, a not-for-profit, non-governmental organization established in 1989 and now based at Columbia University to “help scientists, decision-makers, and the public better understand their changing world. CIESIN specializes in global and regional network development, science data management, decision support, and training.”6 Note that CIESIN itself does not create the standards for metadata per se. Instead, it coordinates projects that are required to use the standards set by others in order to comply with the terms of contracts. In the quoted example, CIESIN is describing the construction of something called the “Unified Metadatabase,” used to describe the spatial features of demographic, health status and other data. In this case the metadatabase uses standards set by three US Federal government programs (FGDC, GILS, and the EOSDIS IMS), a de facto data exchange format standard that was created by NASA (DIF), a Library of Congress-supported library coding system that has broad-based English language and academic support (MARC) and a text markup language accepted by the International Organization for Standardization in 1986 (SGML).
I hope the point of this example has not been lost in the maze of acronyms. It takes time, resources and staff to keep up with ever-changing standards, and investing resources to stay current with standards may well diminish resources destined for basic research. Changes in hardware, software and communications standards will require investment in infrastructure and the development of a planning process. We should not, however, ignore the emergence of metadata standards because, in the long run, knowledge of and compliance with these standards will structure every digital means of information distribution likely to appear in archeological practice. The question of how we shall seek guidance leads to the next issue, that of continuity.
The Necessity of Continuity
Until recently, the question of continuity has been relatively straightforward for archeologists concerned with their ethical obligations for data dissemination. The relevant repositories have been the library (which houses the monograph), the archive (which houses the unpublished records) and the museum (which houses the objects of study). Digital publishing, however, changes all of this. Where does your Web site go when it’s time for it to retire? Do you keep a copy as a record? Should you? Who keeps your database online? For how long? And who will pay for the costs of keeping up with changes in standards? You? The organization? The government? No one?
Answers to these questions are emerging very slowly in archeology, but we can outline some of the directions our field is likely to take. Although costs will always constrain what we do, continuity in digital publishing depends more on what it is we are publishing and what kinds of obligations we have to preserve these products. Some things probably don’t deserve preservation in perpetuity — classroom Web pages are a good example of ephemera. But we’d like to preserve most other digital products, like primary data records, monographs or other published works.
The preservation of a digital monograph is a good place to begin our discussion. As I noted above, such a monograph can be anything the author wishes and, given existing technology, such publications can contain interpretations and all of the primary data the author sees fit to include. At present, the only two digital formats that could accommodate this enhanced vision of a monograph are CD or DVD “books” and Web sites. I know that a book, if published with acid-free paper and kept under reasonable conditions, will last many decades. A CD, however, has a much shorter shelf life, perhaps only two decades at most. Even if CD readers are still available in three decades, will someone be able to “read” my CD book?
Changes in hardware and software standards may even affect content. If I have chosen to publish my data records in table format, some program will probably exist to translate my data into a new format, although the original formatting is likely to be lost. But if I represent my data by means of some complex construction, such as a three-dimensional rendering of a site or building, there exists the real possibility that changing standards may well make it impossible to capture that rendering accurately. That part of my book, then, will be lost even if some entity exists to migrate my original work into a new form.
Web sites offer similar problems. We really have no idea what “in perpetuity” means with regard to a Web site, but if modern Web practice is any guide, the frequency with which links to pages are returned as non-existent does not bode well for long-term maintenance of complex sites. Having greater storage capacity at the site and more bandwidth available to display more complex images or information more rapidly does not guarantee longevity.
How continuity is achieved is scale-dependent and closely related to the type of digital document produced. Individuals must consider their own IT strategies. You are ultimately responsible for migrating data to new formats and platforms if you choose to store them digitally. Likewise, it is your responsibility to keep your digital archives of papers online should you wish to do so. However, as we move away from the desktop, individuals begin to lose control of their products and it is here that existing institutions must be adapted to new ends or wholly new organizations developed to ensure continuity.
An existing institution undergoing significant reorganization to meet the digital challenge is journal publication. Publishers are experimenting with a plethora of organizational and business models to implement a digital dissemination scheme. The Institute of Electrical and Electronics Engineers, for example, has eschewed print publications altogether and has an ambitious timetable for migrating all of its journals to an online format by the start of the next decade. Traditional publishing houses have created hybrid models, where both print and digital copies of journals or in some cases individual articles are made available through different subscription formats. Not surprisingly, libraries have become concerned with how they maintain their place as repositories of journals as these changes take place. Internet Archaeology is the first major archeological journal in a wholly online format.7 Sponsored by a consortium of British universities, the journal is presently free of charge and is distributed through servers maintained by the consortium. Subscriptions are planned, which may mean a change in the form of distribution as well. Regardless of what we publish, however, continuity requires that some organization will assume the costs associated with access to a product and will provide the resources for continuous migration of the products as standards change.
The maintenance of digital data archives merits special consideration. In this instance, I mean the term “data” to refer to primary records that describe field contexts, objects, images, and other products of field and laboratory studies. As we all know, archeology is a destructive process, and we rely almost wholly upon the paper records, photographs and drawings of the specific field contexts from which the objects of our study have been taken. Without context, we have in a real sense no data. If we embark on a program of digital capture and permanent archiving of these data, we have to guarantee with whatever certainty we can muster that we will in fact maintain these records in perpetuity. This is certainly the implicit promise in a paper-based archive, although it is clear that because paper has longevity, it has been possible in great part to defer the upgrade of the paper products to new formats. But we know all too well that our digital products have a much shorter shelf life and the making of the promise has very tangible and near immediate impacts.
Digital archives, then, must have a very real organizational imprimatur that guarantees continuity. Otherwise, if we lose these data, they are lost forever. Many groups have been concerned with these problems. A few sources of useful information include Beagrie and Greenstein,8 which contains a number of excellent case studies on how archiving institutions must develop effective IT strategies; Day,9 which includes a comprehensive bibliography on digital archiving; and Conservation OnLine,10 an online catalogue of information for professionals in the digital archiving field that has much to offer anyone interested in the topic.
What kinds of institutions currently support digital archives? These range from international scientific organizations such as the International Council of Scientific Unions, which supports the World Data Centers (themselves composed of national and academic bodies that now warehouse primarily geological, geophysical and similar data), to governmental agencies of all kinds at all levels, from the local to the global, to professional societies and avocational groups. Among the organizationally supported data archiving projects in archeology are the National Archaeological Data Base,11 supported by the National Park Service; the Archaeological Data Archive Project, supported by the Center for the Study of Architecture at Bryn Mawr College;12 and the Archaeology Data Service,13 sponsored by a consortium of universities, museums and government programs in the United Kingdom.14 Although these organizations are a good start, it is clear that archeology is far behind most disciplines in the way in which it has approached digital data archiving and dissemination, and this in turn leads to my final point, which is concerned with the significant innovations.
The Innovations That Will Matter Most
If archeology is to join the information age in a serious way, we must see innovation in three domains: data acquisition, education and perception.
Basic field recording techniques in archeology have changed very little over the past 100 years and involve the use of a combination of paper forms, notebooks, graph-paper drawings and standard 35mm and large format photography. While these techniques are reliable, they are very limiting, especially as one moves from the field to analysis into data publication, presentation and archiving. Field drawings must often be redrawn and digitized by hand for integration into advanced geographic information systems. These same field drawings must also be linked by hand to computerized databases that describe their contents. Handwritten field notes rarely are transcribed and searched electronically for information, and forms, while they always contain important information, have to be summarized and described and their content re-transcribed into other paper or possibly digital records. Field notes and forms are searched visually by flipping through ring binders or file folders. Slides, prints and negatives can be integrated into databases, but it is difficult to integrate them into sets of field drawings and maps in a consistent manner. And while many archeologists have begun to digitize these data so that modern IT tools can be used to examine them more rapidly, the costs of this post-hoc approach are very substantial and, further, they tend to introduce new sources of error into these primary data. Indeed, many archeologists have come to believe that traditional field recording methods substantially slow the pace of analysis and certainly the publication and archiving of the results of field research.
With the advent of pen computers, digital theodolites, and other in-field data recording devices, many of the technical challenges of digital primary data recording are now at least surmountable and as these devices improve — as they inevitably do — the costs of so doing will diminish as well. This implies that in-field digital recording of primary archeological data can become standard archeological practice and while it may not replace paper in all circumstances, it will certainly become more widespread. However, this should only occur if students are trained effectively in these technologies and if the use of these technologies is seen as only part of a comprehensive IT strategy and not simply an end in itself.
As I pointed out in a recent review paper on quantitative methods in archeology,15 we as a discipline need to focus more of our attention on how we train our students to use these methods. This argument applies with equal force and relevance to IT. Learning how to use a computer or specialized software is not the same as understanding how to integrate IT effectively into the research process. Although there are many students who learn how to use various types of IT on their own, I have argued that if we want to train true innovators, we will have to provide the appropriate educational context to do so. One such program is the M.Sc. in Archaeological Computing at the University of Southampton.16 Topics covered include digital drawing and imaging, database systems and geographic information systems. Programs like this will have to become more numerous if we expect to have our “own” experts adapt IT to archeological ends. All of this, unfortunately, costs money, and from where will it come?
This, then, brings us to the final innovation that will matter most — a modification of our funding policies. Frankly, I don’t expect that this will happen, but I think it is a message that needs to be delivered regardless of the reception. If digital dissemination of information is to become commonplace in archeology, we must build at every level — from the individual to the global — an effective IT strategy. This will cost money and unless we get new resources, we must create the political will to re-allocate existing resources. Individuals have their part to play in this, but as I have argued, if we are to have real continuity in data dissemination, we will have to look to the creation of organizations that will be charged with these responsibilities. Leadership is required at all levels and we can attempt to enlist the services of the Society for American Archaeology, the Archaeological Institute of America, the National Park Service and the National Science Foundation. The latter two may be the source of funds, but we will need the backing of our professional and academic societies to provide part of the push. NSF has already begun to fund digital data dissemination projects that target archeology, but we need to see more of these.
I also believe that the development of a new conservation ethic for existing data, much as has developed in our field for the preservation of archeological sites themselves, must be a necessary part of this strategy. Indeed, the archiving of all data records, whether digital or not, is taking on new importance as many professional and academic societies of archeologists, such as SAA, AIA and the Register of Professional Archaeologists, among others, promulgate strict archiving standards from an ethical perspective.17 The demands of digital preservation, however, create even more urgency for such standards to be created. Once again, we can begin with individuals who take their ethical responsibilities seriously and will either develop their own IT strategies or will deposit their data with existing organizations. This will require a change in mindset, obviously, as well as in the reward structure in both academic and professional arenas, and we should begin to get equal credit for preserving, as well as publishing, the results of our labors.
Back to Booth. We’ve not remained aloof from the information age, but we still have far to go to recognize what the full implications of going digital will be to our field. I am certain that we have a digital future, but what form it will take is still not clear to me. The optimist in me sees NSF-funded national centers for digital data preservation; the pessimist sees the Dead Media Project page on the Web.18 You choose.
1. B. Booth, “Has archaeology remained aloof from the Information Age?,” in Computer Applications and Quantitative Methods in Archaeology 1994, eds. J. Huggett and N. Ryan (Oxford: British Archaeological Reports, International Series 600, 1995), 1-12.
2. “Current Issues for Higher Education Information Resources Management.” CAUSE/EFFECT 20.4 (1998): 4-7, 62-63. www.educause.edu/ir/library/html/cem9742.html.
3. V. Canouts and S. T. Childs, “National Databases: Promise and Postscript” (Paper presented at the Society for American Archaeology Annual Meeting, Seattle, 1998).
A. Wise and P. Miller, “Why Metadata Matters in Archaeology,” Internet Archaeology 2 (1997). intarch.ac.uk/journal/issue2/wise_index.html.
5. CIESIN. Metadata Guidelines. 1998. www.ciesin.org/metadata/TOC/init.html.
7. M. Heyworth, J. Richards, A. Vince and S. Garside-Neville, “Internet Archaeology: A quality electronic journal,” Antiquity 71 (1997): 1039-42.
8. N. Beagrie and D. Greenstein. A Strategic Policy Framework for Creating and Preserving Digital Collections. Arts and Humanities Data Service. 1998. ahds.ac.uk/manage/framework.htm.
9. M. Day. Preservation of electronic information: a bibliography. 1997. www.ukoln.ac.uk/~lismd/preservation.html.
12. H. Eiteljorg, “Electronic Archives,” Antiquity 71 (1997): 1045-57. csaws.brynmawr.edu:443/web1/adaparchive.html.
14. J. Richards, “Preservation and Re-use of Digital Data: The role of the Archaeology Data Service,” Antiquity 71 (1997): 1057-59.
15. M. Aldenderfer, “Quantitative Methods in Archaeology: A Review of Recent Trends and Developments,” Journal of Archaeological Research 6.2 (1998): 91-120.
16. www.arch.soton.ac.uk/Prospectus/Computing/
17. N. Parezo and D. Fowler, “Archaeological Records Preservation: An Ethical Obligation,” in Ethics in American Archaeology, eds. M. Lynott and A. Wylie (Washington, D.C.: Society for American Archaeology, 1995): 50-55.