Nereida Cross
Helen Jarvis

School of Information, Library and Archive Studies
University of New South Wales, Australia
Sydney, NSW 2052, Australia

This paper looks at applying new technology to preserving, recording, accessing and displaying documents, photographs, archives and geographic information. The work was carried out as part of Yale University's Cambodian Genocide Program (CGP), an international research project aiming to locate, document and preserve all existing evidence relating to the possible commission of war crimes, genocide and other crimes against humanity during the Khmer Rouge regime of Democratic Kampuchea between April 17, 1975 and January 7, 1979. Funding and support have been provided by the United States government, the Royal Government of Cambodia, the Royal Government of the Netherlands and the Australian government.

CGP has created four databases for bibliographic, biographic, image and geographic data. These databases are held in the United States, Cambodia and Australia. Details of these databases, as well as access to them via the Internet, are discussed.


In this paper I will look at applying new technology to preserving, recording, accessing and displaying documents, photographs, archives and geographic information. This has been carried out as part of Yale University's Cambodian Genocide Program (CGP), a project I have been working with for the last three years.

The CGP is an international research project aiming to locate, document and preserve all existing evidence relating to the possible commission of war crimes, genocide and other crimes against humanity during the Khmer Rouge regime of Democratic Kampuchea between April 17, 1975 and January 7, 1979.


The CGP began with legislation, the Cambodian Genocide Justice Act, passed by the United States Congress and signed into law by President Clinton in April 1994. It consists of two parts:

1). Documentation survey and index, and

2). Legal/historiographical research and training.

The contract was awarded to Yale University with Professor Ben Kiernan of the School of History, Yale University, as the CGP Director. Initially the project was funded for two years by the US Department of State, but this has now been extended to the year 2001. Additional funding and support have been provided by the Royal Government of Cambodia, the Royal Government of the Netherlands, the Australian government and the Henry Luce Foundation. Dr Helen Jarvis, from the School of Information, Library and Archive Studies, University of New South Wales, was invited to join the Yale bid as the Program Consultant for Documentation, and I have been working with her on the documentation aspects of the project. There are three nodes for this work: Yale University (History & Law), the Documentation Center of Cambodia (DC-Cam) in Phnom Penh, and our School in Australia.


By the very nature of this project we are dealing with evidence in many formats -- documents, photographs, archives and geographic information. The Cambodian Genocide Data Bases (CGDB) consist of four databases for bibliographic, biographic, image and geographic data.


A biographic database was compiled using CDS/ISIS and CDS/ISIS for Windows (WinISIS) software, and links are provided to text documents and to bitmap images scanned in GIF format. We have used international cataloguing tools and the UNIMARC format for the bibliographic database, which includes archives, prison records and forced confession data. We have also included additional fields relevant to our data, such as specific fields for codes developed by the Human Rights Information and Documentation Systems (HURIDOCS) and for locations developed by the Cambodian Department of Geography. The CGP Tuol Sleng Image Database (CTS) is an index to over 5000 scanned photographs from the Tuol Sleng prison. The databases are held in the United States, Cambodia and Australia.

Some of the issues involved are:

applying international standards to describe items in different formats;

integrating bibliographic records for archival material with library/published items (many of the documents are unpublished - letters, reports, confessions and "biographies");

providing the necessary fields, techniques and tools to allow a thorough analysis and coding of the items to ensure maximum retrieval with regard to genocide and crimes against humanity;

incorporating biographical, photographic and geographic material;

multiple languages and scripts.

As the database was to be used internationally and the resulting information was to be publicly available, we wanted to use internationally accepted formats. The structure of the bibliographic database follows UNIMARC (Universal MAchine-Readable Cataloguing). UNIMARC is used in the National Libraries of Cambodia and Vietnam as well as in many other Asian and European libraries and documentation centres. In 1995, when we started the project, UNIMARC had no formal supplement to deal with archival materials, unlike USMARC, which incorporated the fields from the USMARC Format for Archival and Manuscripts Control. However, just as the project was starting, the additional fields required for archival description, such as provenance, were added to the UNIMARC format in draft form. As the items we are dealing with are in several languages, mainly Khmer, English and French, and in Roman and Khmer scripts, the issues of multilanguage and multiscript content had to be addressed. UNIMARC provides for multiple scripts, allowing any field to be mirrored in a different script.

We decided to use the database software CDS/ISIS, initially mini-micro CDS/ISIS version 3.07 and later WinISIS. CDS/ISIS is an information storage and retrieval software package developed by UNESCO. It is already used in the National Library of Cambodia and in some other libraries in Cambodia. We also conducted a course including CDS/ISIS training for library personnel in Cambodia in December 1994 and January 1995. CDS/ISIS is widespread in Asia and the Pacific as it is powerful, easily obtained through national and regional distributors designated by UNESCO, and free for nonprofit use.

One of the advantages of CDS/ISIS is its capacity to display and work in different scripts. There are Arabic, Vietnamese, Chinese and Thai versions of the software. However, when we were starting the project and inquired about the development of a Khmer version, we were advised to wait, as the Windows version was imminent and the Khmer fonts running under Windows, already being used in word processing, would then be available. Three years later that release date still has not been reached. The program is still in beta testing; the data entry module became available only in a limited way in mid 1996 and was not fully incorporated within WinISIS until the end of that year. Even then, the Khmer font could not be used immediately. Two conversion tables mapping the character positions had to be written so that Khmer letters mismatched by the WinISIS Ansi-to-Oem conversion could be displayed correctly, and extra display formats had to be written to call up the different fonts. Now that we can input Khmer, we have had to go back and edit records already entered. If the document was originally in Khmer, the title, the names, a summary and the place names are all provided in Khmer, with translations in English (summary and title) and transliterations (names and place names) also provided. This is in addition to providing a link to the image of the Khmer document itself. All Khmer documents considered significant are being scanned.
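The idea of a conversion table mapping character positions can be pictured as a byte-for-byte lookup. The following Python sketch is illustrative only: the code positions in REMAP are invented placeholders, not the actual positions in the Khmer font or in the tables written for WinISIS, and the pairing of a "display" table with a "storage" table is an assumption about how two such tables might complement each other.

```python
# Hypothetical example: these byte values are placeholders, not the real
# positions used by the project's Khmer font or by WinISIS.
REMAP = {0x80: 0xC7, 0x81: 0xFC, 0x82: 0xE9}


def build_table(remap):
    """Return a 256-entry translation table; unmapped bytes pass through."""
    table = bytearray(range(256))
    for src, dst in remap.items():
        table[src] = dst
    return bytes(table)


def invert(remap):
    """Build the reverse mapping, refusing tables that are not one-to-one."""
    inverse = {dst: src for src, dst in remap.items()}
    if len(inverse) != len(remap):
        raise ValueError("mapping is not one-to-one and cannot be inverted")
    return inverse


# One table for each direction of the conversion.
TO_DISPLAY = build_table(REMAP)
TO_STORAGE = build_table(invert(REMAP))


def convert(data, table):
    """Apply a translation table to a byte string."""
    return data.translate(table)
```

With these placeholder values, convert(b"\x80A\x82", TO_DISPLAY) moves the two mapped bytes and leaves the letter A untouched, and applying TO_STORAGE to the result restores the original bytes.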

We also wanted the ability in CDS/ISIS to include or call up from the bibliographic references full text documents, pictures and other files from outside the program, as would be desirable in a database for genocide research. Some of the bibliographic records in the database are linked to the full scanned text or image of the original documents. A Pascal program, SHOWBI, was specially written to accomplish this in our databases; it is based on UNESCO's SHOWP, which was already available. SHOWBI enables a list of the scanned pages for the document in each language, or a non-text document (e.g. a photograph), to be displayed for selection from each record. SHOWBI is used in the DOS version, while in the Windows version this is achieved by a different approach.

Apart from bibliographic description, where international standards such as the Anglo-American Cataloguing Rules (AACR2) and archival description rules are strictly adhered to, there was a need for additional information and codes unique either to Cambodian research or to genocide research. Those developed by HURIDOCS were very useful, such as thesaurus terms for the types of crimes committed in Type of Event and Appendix B: List of Index Terms, physical descriptions and occupations of people in List of Physical Characteristics and Appendix H: List of Occupations, and codes for international conventions in Appendix L: List of Instruments. A field was added for the geographic classifications down to village level for the Khmer Rouge zones, regions and districts, as the boundaries and names are different from the current provinces and districts. For the current period we use the Gazetteer Codes developed by the Cambodian Department of Geography and the Geographic Area Codes to province level developed in Cambodia for use in Cambodian libraries. Liaison continues with the National Library of Cambodia and the emerging Cambodian Library Association for the latest developments in national standards.

The Gazetteer codes have also been incorporated into our naming conventions for scanned and text files which refer to specific places. With so many image files, our file naming conventions are very important for organisation and control.
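As an illustration of how a place code can be carried in a file name, the Python sketch below builds and parses names of an assumed form. The pattern (place code, document identifier, two-letter language code, three-digit page number, .gif extension) and the sample values are hypothetical and are not the naming convention actually used by the project.

```python
import re

# Hypothetical pattern: <place code>_<document id>_<language><page>.gif
NAME_RE = re.compile(
    r"^(?P<gaz>\d+)_(?P<doc>[A-Za-z0-9]+)_(?P<lang>[a-z]{2})(?P<page>\d{3})\.gif$"
)


def make_name(gaz_code, doc_id, lang, page):
    """Compose a file name and check that it fits the assumed pattern."""
    name = f"{gaz_code}_{doc_id}_{lang}{page:03d}.gif"
    if NAME_RE.match(name) is None:
        raise ValueError(f"components do not fit the naming pattern: {name}")
    return name


def parse_name(name):
    """Split a file name back into its components."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognised file name: {name}")
    parts = match.groupdict()
    parts["page"] = int(parts["page"])
    return parts
```

Because every name can be parsed back into its parts, all scanned pages referring to one place, or all pages of one document in one language, can be selected from a directory listing without opening the database.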


For the image database we started with over 5,000 photographs from the Tuol Sleng prison and interrogation centre. We scanned the photographs resulting from a recent project by the Tuol Sleng Genocide Museum staff and the Cambodian Photo Archive Group, led by Chris Riley and Doug Niven, to restore and print the negative film. An index database for the photographs has been developed - the CIMG - Tuol Sleng Photographic Database (CTS). Each photograph has a record giving details such as gender, age, clothing and whether a name or number is present. Sometimes a name is written on the photograph itself, on a placard around the prisoner's neck, or occasionally chalked directly on the prisoner. Again, there is a link from this record to the photograph. As so few photographs have names, we decided, when we made them available over the Internet, to provide a form that visitors to the site can fill in to supply details if they recognise someone. We plan to link those photographs to the Tuol Sleng confessions already catalogued.
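The kind of index record described here can be sketched as a small data structure. The Python below is a minimal illustration under assumed field names; it does not reproduce the actual CTS field definitions, and the helper for listing photographs that still lack a name is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PhotoRecord:
    """One index entry; all field names are illustrative assumptions."""
    photo_id: str
    image_file: str                        # link to the scanned photograph
    gender: Optional[str] = None
    age_group: Optional[str] = None
    clothing: Optional[str] = None
    name_on_photo: Optional[str] = None    # name written on photo or placard
    number_on_photo: Optional[str] = None  # prisoner number, if visible

    @property
    def has_name(self) -> bool:
        return bool(self.name_on_photo)


def awaiting_identification(records: List[PhotoRecord]) -> List[PhotoRecord]:
    """Return the records for which no name has been recorded."""
    return [record for record in records if not record.has_name]
```

A list such as the one returned by awaiting_identification is the natural input for a web form through which visitors can supply details about a photograph they recognise.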

The databases and research results are accessed and disseminated via the Internet; the web site was officially launched in January 1997. To do this we used freeWAIS-sf, SFgate, Perl and Linux. The URLs are:

CGP at Yale


Records are displayed over the World Wide Web with the Khmer text shown if the Khmer font is loaded and that option is selected. Many records have links to the scanned original Khmer document. As well as providing search capacity for the CGP databases, we are also using the Web for collecting additional information. As mentioned, if users recognise someone they can fill out a form, providing extra information to supplement the often brief CTS record. A CD-ROM version has also been produced. For this we mounted a search-only version of DOS CDS/ISIS.


An extra dimension to the project was added in 1995 when Helen Jarvis was awarded a grant from the Australian Government (through the Department of Foreign Affairs and Trade) to undertake a pilot Mapping Component. Visits were made to 100 genocide sites in eleven of Cambodia's 22 provinces. Since then visits have been made to over 200 sites covering 17 provinces. On these field trips satellite coordinates from a hand-held GPS (Global Positioning System) unit are recorded to give the location of the prisons, memorials or graves. With the UNSW School of Geomatic Engineering, the GPS data, which includes a brief outline entered as "attributes" on site, has been used to build an ArcInfo GIS (Geographic Information System) database, which is queried, displayed and printed using the map viewer ArcView2. CGP is now able to produce computer-generated maps of the burial sites, prisons and memorials recorded. We are currently adding as "hotlinks" the text reports which describe each site. These are activated by clicking on the relevant point on the map for each site. The ArcView database is held at DC-Cam, at Yale, and in the School. Maps generated from the database using ArcView are also available over the Internet. We hope in the near future to make the geographic data available direct from our web site through the ArcView Internet Map Server. This would allow online searching of the sites in the maps and in the associated attribute tables.
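To make the relationship between a GPS reading, its attributes and a linked text report concrete, the Python sketch below packages one site as a point feature. It uses GeoJSON, a present-day interchange format chosen here purely for illustration; it is not the ArcInfo format used by the project, and the site identifier, site type, coordinates and report file name are placeholders.

```python
import json


def site_feature(site_id, site_type, lat, lon, report_file=None):
    """Build a GeoJSON point feature for one recorded site."""
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError("coordinates out of range")
    return {
        "type": "Feature",
        # GeoJSON stores positions as [longitude, latitude].
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {
            "site_id": site_id,
            "site_type": site_type,   # e.g. prison, memorial or grave
            "report": report_file,    # text report opened from the map
        },
    }


def feature_collection(features):
    """Wrap features so that a map viewer can load them as one layer."""
    return {"type": "FeatureCollection", "features": list(features)}


if __name__ == "__main__":
    # Placeholder values only, not real survey data.
    layer = feature_collection(
        [site_feature("S001", "memorial", 11.55, 104.92, "s001.txt")]
    )
    print(json.dumps(layer, indent=2))
```

Keeping the report file name among the attributes is what allows a viewer to offer the "hotlink" behaviour described above, opening the text report when a point is selected.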

Copies of the databases are held in each of our offices in Australia, the USA and Cambodia. They are kept in sync by forwarding data using ftp, which has only recently become available in Cambodia, and on CD-ROMs and MODs (magneto-optical disks).


American Library Association. (1988). Anglo-American Cataloging Rules. 2nd ed. Chicago, IL: American Library Association.

Cashman, Timothy J. (1997). Data Integration for the Cambodian Genocide Program. Sydney, Australia: School of Geomatic Engineering, University of New South Wales.

Documentation Center of Cambodia. (1997). Mapping the Killing Fields of Cambodia, 1997: A Report. Phnom Penh, Cambodia: Documentation Center of Cambodia [DC-Cam].

Dueck, Judith and Aida Maria Noval, comp. (1993). HURIDOCS Standard Formats: Supporting Documents. compiled with HURIDOCS Task Force members. Oslo, Norway: HURIDOCS.

Jarvis, Helen and Loomans, Robert. (1997). "The Cambodian Genocide Program (CGP)/ SITIS" In The Proceedings of Second New South Wales Symposium on Information Technology and Information Systems. edited by Hugo Rehesaar. Sydney, Australia: School of Information Systems, University of New South Wales.

Jarvis, Helen and Cross, Nereida. (1996). "Cambodian Genocide Program," Paper presented at Communications within Asia 20th Anniversary Conference of the Asian Studies Association of Australia, 8-11 July 1996, La Trobe University, Melbourne, Australia.

Jarvis, Helen and Cross, Nereida. (1996). "Documenting genocide in Democratic Kampuchea: the Cambodian Genocide Program," Paper presented at Cambodia: Power, Myth and Memory, Monash University, 11-13 December 1996.

Stormorken, Björn. (1985). HURIDOCS Standard Formats for the Recording and Exchange of Information on Human Rights. Dordrecht: Martinus Nijhoff.

UNESCO. (1997). CDS/ISIS for Windows June 1997: Release Notes. Paris: UNESCO.

UNESCO. (1989). Mini-micro CDS/ISIS Reference Manual (Version 2.3). Paris: UNESCO.

UNIMARC Manual: Bibliographic Format. (1994). Munich, Germany: K.G. Saur.

U.S. Office of Cambodian Genocide Investigations. (1995). [Washington]: Bureau of East Asian and Pacific Affairs, U.S. Department of State. Typescript.

Wong, K. (1997). Procedures in the Mapping of Cambodia Genocide Sites. Sydney, Australia: School of Geomatic Engineering, University of New South Wales.