A NEW FRONTIER IN CHINESE STUDIES: DATABASES AND SCHOLARLY RESEARCH

Yeen-mei Wu

East Asian Library
University of Washington
Seattle, Washington 98106

Keywords: Information Retrieval, Retrieval, Database, Chinese Database, Chinese Studies, Scholarly Research, Research, Network, Online Searching, CD-ROM, Full-Text.

Abstract: Databases are fast becoming an important resource for research in Chinese studies. They make Chinese texts -- few of which have indexes -- more readily accessible. Recognizing their extraordinary value to researchers in Chinese studies, research institutes in Asia, Europe, and the United States have produced a number of databases. Unfortunately, many of these databases are not widely known to researchers in the field, much less used by them, because there are no bibliographies to make them accessible. This paper studies the development of databases for information retrieval services in general and the network environment for Chinese studies in particular, and surveys the databases available for Chinese studies.

1. INTRODUCTION: INFORMATION RETRIEVAL AND DATABASES

Computer-based information processing and retrieval in libraries began in the early 1960s, and library information networking started in the early 1970s. Although data in machine-readable form became available in the mid-1960s, terminals were physically remote from the computer storing the data and were used by intermediaries (e.g., librarians, information officers) on behalf of their users. Not until recent years have end-users been directly involved in searching data themselves. In Chinese studies, most database search services still depend on intermediaries.

Because of the exponential growth of information, particularly in the scientific and technical fields and in current news services, many bibliographical indexes formerly issued in print are now provided "online" only, with printouts supplied in response to specially tailored subject requests. Some databases are now also distributed in other forms, such as floppy disc and CD-ROM, and are connected directly to the terminals. Libraries and information service centers, and even end-users, may therefore install these databases on their own workstations and search them locally instead of over remote phone lines.

Since the late 1980s, many libraries have been developing institutional network systems to integrate their bibliographic utilities. Once they simplify the different interface setups for the various databases on the network, libraries will be able to provide better and more economical services to end-users through a centralized campus network. For example, UWIN (the University of Washington Information Navigator) is a menu-driven electronic interface designed to become a distribution center for information concerning the university community, as well as a resource center that gives users access to a variety of bibliographic databases via the campus network and the international Internet.

In the case of "online" Chinese language databases, the services are still mostly provided by intermediaries. Chinese language databases require different kinds of hardware and supporting software. The Twenty-five Dynastic Histories Full Text Database at the East Asia Library (EAL) of the University of Washington (U.W.) requires special hardware to display the full character set in use. Software development using a special X Window setup and special fonts is required to integrate it into U.W.'s TCP/IP-based campus network. As the database is currently located on a non-networked computer, all remote access is accomplished through a modem connection. At present, researchers may conduct searches at EAL; search by remote dial-in, provided that they have the necessary hardware setup; or request that EAL conduct the search for them.

2. DATABASES OF CHINESE TOPICS IN THE WESTERN LANGUAGE ENVIRONMENT

Nearly all of the databases concerning China that have been developed in the West, such as those offered through DIALOG, LEXIS, and NEXIS, are in Western languages and emphasize science and technology, business, law, and current events. To date, very few databases developed in China are covered by Western online services. There are a few exceptions. One is CHINALAW, produced by the Computer Center of Beijing University's Legal Department. The West Publishing Company has translated CHINALAW into English and added it to its database WESTLAW. Another is the "Index to foreign legal periodicals" on RLG's RLIN, which indexes six Chinese language legal journals from China and one from Taiwan.

In the libraries' online catalog networks, OCLC-CJK and RLIN-CJK have been able to display vernacular characters since the early 1980s. In January 1992, OCLC and RLG established a direct exchange program for CJK bibliographic records, making it possible for users to obtain more complete bibliographic data for a variety of purposes. Both have also launched special projects: OCLC's recent project to catalog Chinese books published in the Republican Period (1911-1949), and RLG's International Online Union Catalog of Rare Chinese Books and Manuscripts Printed Before 1796.

Most institutions in the United States have developed local library systems and have integrated their CJK records into their bibliographic data since the 1980s. However, most local library systems display CJK records in romanization only, which is inadequate for identifying exact bibliographic data. A software package recently developed for the Innopac system allows libraries to display CJK characters, and several libraries have already adopted the Innopac system.

3. CHINESE DATABASES AND THE SPONSORING INSTITUTIONS

Realizing the importance of and growing demand for databases, many universities and institutions in China, Taiwan, Japan, and Hong Kong have enlisted their own computer centers to assist in the production of databases for Chinese studies. The following are some of the major institutions involved in such activities with notable accomplishments.

• Academia Sinica, Nankang, Taipei

The Computing Center of the Academia Sinica (CCAS) has played a key role in the development of databases for Chinese studies. CCAS's missions are: 1) to improve the research environment and productivity of the Academy; and 2) to promote office automation of the Academy. One of its chief activities is to assist research in Sinology by undertaking joint data processing projects with the institutes of the humanities and social studies in Academia Sinica. Between 1984 and 1990, the Computing Center and the Academia's Institute of History and Philology cooperated in producing The Twenty-five Dynastic Histories Full Text Database, which was the first and is so far the largest database for Sinological studies. Besides full text databases, many formatted databases have also been completed or are in the planning stages. Other established working groups that have contributed to the building of databases include "The Historical Document Records Building Group," "Vocabulary Knowledge Base Group," "Data Entry Group," and "Publication of Papers."

• Chinese Academy of Social Sciences (CASS), Beijing

The Computer Department of CASS, established in 1986, is developing full text databases in a program entitled "Computer Technology for Processing Chinese Ancient Literature." Printed books generated from the CASS databases, an outgrowth of this computerization, are far more accurate than manually produced books. This printed format also expedites the research of those who do not have access to computers.

The Centre for Documentation and Information of CASS, established in 1985, is developing a unified network system for the libraries of CASS. Automation did not start until early 1989. The databases of the union list of the periodicals in the libraries of CASS, and the acquisition list of the foreign periodical subscriptions by CASS libraries, were completed in that year.

• National Central Library (NCL), Taipei

The National Central Library joined Taiwan's Library Association of China (LAC) in 1980 to start the Chinese Library Automation Planning Project, with a mission to improve data management, to raise the quality of information services, and to keep pace with new trends in information exchange involving foreign countries. Since the establishment of its Computer Center in December 1982, NCL has been a leader in library automation in Taiwan in many ways, including the creation of databases such as the "national bibliographic data" and the "index to Chinese periodical literature." NCL also produces "Chinese MARC Data" on CD-ROM and "Index to Chinese Periodical Literature" on disc.

• Ku chi cheng li yen chiu so, Sichuan University, Chengdu, Sichuan

Established in 1983, the institute concentrates its efforts on the study of the Sung Dynasty (including Liao, Hsia, and Chin). The production of the database of the enormous Ch'uan Sung wen began in 1986, and will not be completed until 1995. Several important reference tools have been published as by-products of this database, such as Sung jen pieh chi pan pen mu lu, Hsien ts'un Sung tai chu tso tsung mu, Sung tai jen wu chuan chi tzu liao so yin pu pien, etc. The Ch'uan Chin wen hsien database (including works on Chin Dynasty written by Sung, Yuan, Ming, Ch'ing scholars) was started in 1991 and will be completed in 1997.

• Institute of Chinese Studies, Chinese University of Hong Kong

The Institute has just completed a database of Hsien Ch'in Liang Han ch'uan shih wen hsien tzu liao, which contains 102 works totalling eight million characters. Using this database, the University and the Commercial Press in Hong Kong are jointly producing a concordance entitled Hsien Ch'in Liang Han ku chi chu tzu so yin ts'ung k'an. The first series (12 titles in 12 volumes) will be published in late 1992. The second project, Wei Chin Nan-pei ch'ao ch'uan shih wen hsien, is already underway.

4. BIBLIOGRAPHIC GUIDE TO THE CHINESE DATABASES

Information on Chinese databases is hard to gather because it is not well publicized. Some of the information I have acquired came from my research and reading of publications in library and information science, history, and computer related fields. Other information came from my correspondence with the institutions producing databases and with the persons in charge of their production. Still more information will come as I receive answers to my questionnaire. Finally, I hope to gather more data during my trips to China, Hong Kong, and Taiwan in November and December of this year, when I will visit institutions involved in the production of databases.

The attached preliminary list of the Chinese databases is divided into four groups:

• Traditional works

• Modern works

• Bibliographical indices

• Library produced bibliographies

The bibliography is arranged by these categories. A review of the functions and operation of each database by end-users, intermediaries, and computer specialists will reveal how well it fulfills its intended mission.

One will notice that some titles have been duplicated by different institutions, either because of a lack of communication among the sponsoring institutions or because different editions were chosen as the basis for computerization. Computer environments differ among countries, and they may not even be compatible among institutions in the same country. A few databases are available for purchase on floppy disc or CD-ROM. Except for those run by libraries, many databases operate independently, serving only their own institutions and without any network connections.

5. CONCLUSION

The bibliographic guide presented in this paper is an attempt to make known the available databases for Chinese studies. While this guide is not comprehensive, every effort has been made to include all of the available databases produced in all major institutions. The titles cover virtually the entire sweep of Chinese history from antiquity to the present. They span a number of subjects, including dynastic history, literature, classics, and modern laws and regulations. The material in these databases amounts to thousands of printed volumes. With these databases, scholars can retrieve passages containing specific characters or words, personal and place names, or proper nouns of other kinds. Instead of taking days or weeks, they can now accomplish these tasks in a matter of minutes. This bibliographic guide has been compiled to support that work.
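As a rough illustration of the kind of retrieval described above, the following Python sketch finds every occurrence of a character string in a small collection of texts and returns each hit with a few characters of surrounding context, in the manner of a concordance entry. The function name `kwic_search`, the two sample passages, and the context width are assumptions made for this illustration only; the sketch does not represent how any of the databases surveyed here is actually implemented.

```python
from typing import Dict, List, Tuple


def kwic_search(texts: Dict[str, str], term: str, width: int = 5) -> List[Tuple[str, int, str]]:
    """Return (title, character offset, context) for every occurrence of term.

    Hypothetical helper for illustration; matching is a plain substring
    search, so no word segmentation of the Chinese text is attempted.
    """
    if not term:
        raise ValueError("search term must not be empty")
    hits = []
    for title, body in texts.items():
        start = body.find(term)
        while start != -1:
            # Clip the context window to the bounds of the text.
            left = max(0, start - width)
            right = min(len(body), start + len(term) + width)
            hits.append((title, start, body[left:right]))
            # Resume one character later so overlapping matches are found.
            start = body.find(term, start + 1)
    return hits


# Hypothetical sample collection: two short unpunctuated passages.
sample = {
    "Lun yu": "學而時習之不亦說乎有朋自遠方來不亦樂乎",
    "Meng tzu": "孟子見梁惠王王曰叟不遠千里而來",
}

for title, offset, context in kwic_search(sample, "不亦", width=3):
    print(title, offset, context)
```

Because the search works on raw character strings, it applies equally to single characters, compounds, and personal or place names, at the cost of also matching strings that happen to straddle a word boundary.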
 
 

REFERENCES*

Academia Sinica Computing Center. (June 1990). "Computing Center, a brief sketch" Taipei.

-- Also in Chinese, "Chi suan chung hsin kai k'uang."

App, Urs. (May 1992). "The electronic Bodhidharma," Chinesisch und Computer, 7: 59-62.

Chan, Su-chuan. (June 1990). "'T'ai-wan fang chih ch'uan wen tzu liao k'u' shuo tu kung k'ai," T'ai-wan shih t'ien yeh yen chiu t'ung hsun, 15: 33-34.

Chang, Ch'i & Chien-liang Wang. (1991). "'Ch'uan kuo pao k'an so yin' wei chi pien chi, p'ai pan, chien so i t'i hua hsi t'ung," T'u shu kuan tsa chih, 6: 32-34.

Chang, Yueh-hsiao (Librarian, Center for Documentation & Information, Chinese Academy of Social Sciences), Letter dated March 23, 1992.

Chang, Yueh-hsiao & I-lin Shen. (1991). "Ch'ien chin chung ti Chung-kuo she hui k'o hsueh yuan t'u shu kuan hsi t'ung," Chung-kuo t'u shu kuan hsueh pao, p. 2.

Chen, Fang-cheng (Professor, Institute of Chinese Studies, Chinese University of Hong Kong), Letter, May 1991 & February 1992.

Chinese Academy of Social Sciences, Computer Center. (December 1990). "'Ch'uan T'ang shih su chien hsi t'ung' shih yung shuo ming," Beijing.

Chou, Nancy Ou-lan. (October 1991). "Library automation in the Republic of China, the development of a national bibliographic network," T'u shu kuan hsueh yu tzu liao k'o hsueh, 17 (2): 24-35.

Chu, Ching-kuo, (Director, Computer Dept, Shanghai Museum) "Shang Chou ch'ing t'ung ch'i ming wen yu liao k'u hsi t'ung," 1991? 6p. [source not available]

Chu, Ching-kuo, "Shou hsieh Chia ku wen tsai hsien shih pieh ti mo hu shu hsueh mo hsing," p.80-88, 1991? [source not available]

Chu, Ching-kuo, Letter dated March 30, 1992.

Chu, Hsiu-ling. (June 1992). "'T'ai-wan fang chih ch'uan wen tzu liao k'u' chin hsing kai k'uang," T'ai-wan shih t'ien yeh yen chiu t'ung hsun, 23: 80-83.

Chu, Ming-hsueh, (Researcher, Computer Dept., Shenzhen University), letter dated March 11, 1992.

Convey, John. (1989). "Online information retrieval, an introductory manual to principles and practice," 3rd ed. London: Library Association Publishing Ltd.

Ding, Zy-kaan (Manager of the Twenty-five Dynastic Histories Database at the Computing Center, Academia Sinica), Letters and several meetings, February to May 1992.

Fung, Margaret C. (April 1991). "Biographical & bibliographical database of Chinese studies, an indispensable tool for research," T'u shu kuan hsueh yu tzu hsun k'o hsueh, 17 (1): 26-38.

Ho, Chih-hua, (Researcher, Institute of Chinese Studies, Chinese University of Hong Kong), letter dated July 1, 1992.

Hsieh, Ch'ing-ch'un (Former director of the Computing Center, Academia Sinica), Letter dated February 11, 1992.

Huang, Ch'ing-lien. (June 1991). "'Nien wu shih ch'uan wen tzu liao k'u' yu Chung-kuo li shih ti yen chiu," Hsin shih hsueh, 2 (2): 123-127.

Huang, Jack K.T. & Timothy D. Huang. (1989). "An introduction to Chinese, Japanese and Korean computing," World Scientific, Singapore.

Li, P'o (Researcher, Chinese Dept., Harbin Normal University), Letter dated March 21, 1992.

Lin, Ching-wen (Researcher, East Asia Seminar, University of Zurich), several meetings in May 1992.

Liu, K'un-t'ai. (June 1991). "'Tien nao hua Sung jen pi chi chien so hsi t'ung' ta k'o wen," Hsin shih hsueh, 2 (2): 139-144.

Liu, Tseng-kuei. (June 1991). "Tien nao tsai Han ch'ien yen chiu chung ti ying yung," Hsin shih hsueh, 2 (2): 129-139.

Liu, Yuan-lin. (April 18, 1992). "Tien nao chia ku wen tzu tien ti ch'ang shih pien chi," Chia ku hsueh yu tzu hsun ko chi hsueh shu yen t'ao hui lun wen chi.

McCloy, William (Assistant Librarian, Comparative Law Library, University of Washington), Meeting, March 12, 1992.

Shanghai Museum, Computer Department. (November 1987). "'Tsang p'in pien mu t'u hsiang kuan li hsi t'ung' shih yung shuo ming shu."

Shen, Chih-hung (Assistant Librarian, Ku chi yen chiu so, Sichuan University), Letter dated March 10, 1992.

Shenzhen University Library. (November 1991). "Wen hua pu t'u shu kuan tzu tung hua chi ch'eng hsi t'ung ch'an p'in chien chieh (MC-ILAS)," Shenzhen.

T'ien, I (Researcher, Computer Center, Chinese Academy of Social Sciences), Letter dated February 12, 1992.

Tseng, Tsao-chuang, Chih-hung Shen, "Chi suan chi fu chu cheng li Sung tai wen hsien ti yen chiu," 1991?, p.148-151 [source not available].

Wu, Yeen-mei. (October 1991). "Twenty-five Dynastic Histories Full Text Retrieval Database at the University of Washington," Committee for East Asian Libraries Bulletin, 94: 21-24. -- Also in Chinese, "Hua-sheng-tun ta hsueh shih yung nien-wu shih ch'uan wen ch'ien so tzu liao k'u kai k'uang," Chi suan chung hsin t'ung hsun [Academia Sinica], 8 (8): 2 (April 1992).