THE PREPARATION AND DEVELOPMENT OF THE FIRST CD-ROM CHINABASE ONDISC IN CHINA
Yuanshu Chen & Xianhua Tan
Chongqing Branch, ISTIC
Chongqing, China
Abstract: This article describes the research and development of the first Chinese-language CD-ROM database in China, the Periodicals Chinabase Ondisc. It covers the survey of the user market, the preparation of data, the processing of the disc and the development of the system software, and explains how the project addressed the effective storage of Chinese characters on optical disc, the isolation of erroneous data, compatibility across CD-ROM drives, and quick access to data on the disc. The Periodicals Chinabase Ondisc is comparable to the foreign-language CD-ROM databases that have been introduced into China.
The Periodicals ChinaBase Ondisc (PCO) was completed in June 1992 by the Chongqing Branch of the Institute of Scientific and Technical Information of China (CB ISTIC), under the direction of the Information Department of the State Commission of Science and Technology. At present it is the largest comprehensive title-copied database of Chinese documents in China, covering more than 4,600 scientific and technical periodicals from 1989 to 1991 with a total of more than 610,000 records.
PCO has been tested with in-house developed software by more than 30 organizations, including ISTIC, Beijing Library, the ISTIC of the Transportation Department, Shanghai ISTI, Chongqing University and Chengdu Geology College. Comparison with literature searches on international online databases suggests that PCO offers retrieval functions similar to those of international CD-ROM products. It is also comparable in read-out speed, data specification, indexing depth, total storage capacity and annual data updates. It is the first self-developed CD-ROM database in China.
PCO was developed on the basis of the Bibliographic Database of Chinese Periodicals, which offers information retrieval on microcomputers with magnetic disc as the storage medium. In the three years since its establishment, the database has grown rapidly and continuously (more than 200,000 new records per year), which has made it difficult to operate on microcomputers, and there is a great need to expand its capacity. To solve the storage problem, CB ISTIC set up an optical disc research group in July 1990 to investigate the feasibility of an optical disc version, PCO. The group visited related organizations in turn, including Golden Disk Corporation of Qinghua University, Chinese Magnetic Recordings Equipment Corporation, Beijing Library, Chinese Education Picture Import and Export Corporation and Shenzhen Xianke Laser Television Corporation, and exchanged technical information with overseas optical disc vendors. Having obtained first-hand material on the technology, the group drew up a detailed implementation plan for the production of PCO.
2. RESEARCH GROUP'S INQUIRY
2.1. Objectives
The objective was to change the storage medium of the Bibliographic Database of Chinese Periodicals so as to meet its growing demand for storage capacity and ensure the continued advancement of the database. In other words, the change should fundamentally solve the problem of storing a large amount of data within the limited capacity of microcomputers.
2.2. Investigation
Currently there are at least three basic ways to store large quantities of data:
• Use large-capacity hard disk storage,
• Use WORM discs,
• Use CD-ROM.
The research group made a thorough investigation of each of these three options.
Expanding a computer's hard disk is a common and easy way to solve the storage problem, since the software does not have to be changed. However, hard disk capacity is still quite limited, and as the amount of data grows it becomes difficult to install and expand the database on the hard disk. The mismatch between microcomputer operation and a large-capacity database therefore remains.
WORM discs are an effective way to meet the growing demand for data storage, and they do not require changes to the system software either. However, WORM drives and discs are relatively expensive and are not easy to obtain in China.
CD-ROM is a simple and economical way to solve the storage problem on microcomputers. It occupies a dominant position among storage media owing to its large capacity, convenient use, flexible operation, mature manufacturing technology, high level of standardization and ready acceptance by users. It has thus become the most widely used medium for information storage and publishing both at home and abroad.
After a thorough study of CD-ROM and of hard disk expansion, together with a detailed review of the state of the art of Chinese-language and foreign-language CD-ROMs, the research group concluded that CD-ROM technology best suits China's specific needs for expanding the Bibliographic Database of Chinese Periodicals. A decision was therefore made to develop PCO.
2.3. Inquiries of the Users
The development of a database depends greatly on its market of information users. In China, not enough attention has been paid to database users; as a result, hundreds of self-developed databases in China are used inefficiently. The Bibliographic Database has been rather successful over the last three years because sufficient attention was paid to the user market and to social recognition. User surveys were therefore considered very important when PCO was developed.
The research group's nationwide survey between October 1990 and April 1991 (covering 15 provinces, more than 40 large and medium-sized cities, and about 400 organizations) shows that PCO has a broad user market.
It is estimated that the main users of PCO will be universities and colleges, medical institutions, research institutes and ISTIs, with universities and colleges forming the core group. At present, 200 institutional users can be identified.
2.4. Timetable of the Product Development
June - August, 1990 Inquiry of optical disc
Sept. 1990 - Apr. 1991 Inquiry of users market
May - Sept. 1991 Development of software & inquiry of production units
Oct. 1991 - Apr. 1992 Development of system software
May 1992 Preprocessing and examining of data
Late May, 1992 Submit data to processing factory abroad
June 1992 Cut master-disc, press subdisc
July 1992 to the present Evaluate usage of users
3. PREPARATION OF DATA AND PRODUCTION OF CD-ROM IN CHINESE
The production process for the PCO CD-ROM is similar to that of a music compact disc. Information is mastered onto a plastic disc 120 mm in diameter, and a reflective layer is then coated onto the disc, after which the data can be read out under suitable conditions. Generally speaking, a CD-ROM drive is supplied with driver software and a SCSI interface, which together allow the computer to read the data on the optical disc.
According to international practice in CD-ROM production, the producer of the CD-ROM preprocesses all the data and develops the system software, then hands the processed data to an optical disc company, which cuts the master disc and presses the subdiscs according to international standards and the customer's specification. Since data on a CD-ROM are written once and cannot be revised, the quality of the primary data is critical but difficult to ensure. Furthermore, retrieval software for Chinese CD-ROMs could not be found in China, so we had to develop it ourselves for PCO. The research task for the production of PCO was therefore arduous and time consuming. The process involved:
• Sort out primary data
Examine and correct the primary data strictly according to the relevant international standards, correct all indexing and transcription mistakes, and delete duplicate records so that the data meet the quality requirements.
• Processing and examining data
In accordance with the contract with the optical disc company, the data are organized into standard files based on ISO 9660/High Sierra Format and written to magnetic disc, to be submitted together with the relevant indexes. The data on the disc must be examined carefully to ensure the accuracy of the data files.
• Research and development of software
The system software of PCO comprises installation and retrieval programs. Since no general-purpose Chinese-language CD-ROM software exists either at home or abroad, PCO has to use specially written in-house software.
• Provide data and produce disc
The processed data were submitted to Discover System Corporation in the USA for production of the PCO disc, since such production capability does not exist in China. After the test disc was produced, it was installed and tested.
• User trials
PCO was sent to users for trial use to determine whether the CD-ROM meets the design requirements. From July to September 1992, trial use by more than 30 organizations in Beijing, Shanghai, Chongqing and Chengdu, together with overall tests by experts, showed that the CD-ROM is successful. All its specifications are similar to those of foreign CD-ROMs, and the system functions were rated highly.
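The "sort out primary data" step above calls for deleting repeated records before mastering. The paper does not say how duplicates were detected; the following is only a minimal sketch in Python, assuming each record is a dictionary with hypothetical `title`, `journal`, `year` and `issue` fields and that two records are duplicates when these fields match after normalization.

```python
import unicodedata


def normalize(text):
    """Fold character width and case and collapse whitespace so that
    near-identical entries compare equal."""
    folded = unicodedata.normalize("NFKC", text).casefold()
    return " ".join(folded.split())


def deduplicate(records):
    """Split records into (kept, dropped); dropped holds repeats of an
    earlier record. The field names are assumptions for illustration."""
    seen = set()
    kept, dropped = [], []
    for rec in records:
        key = (normalize(rec["title"]), normalize(rec["journal"]),
               rec["year"], rec["issue"])
        if key in seen:
            dropped.append(rec)
        else:
            seen.add(key)
            kept.append(rec)
    return kept, dropped


sample = [
    {"title": "CD-ROM  in China", "journal": "Info Science", "year": 1990, "issue": 2},
    {"title": "cd-rom in china", "journal": "Info Science", "year": 1990, "issue": 2},
    {"title": "Optical storage", "journal": "Info Science", "year": 1991, "issue": 1},
]
kept, dropped = deduplicate(sample)
print(len(kept), "kept;", len(dropped), "dropped")
```

Keeping the dropped records in a separate list, rather than discarding them silently, lets an editor review the suspected duplicates before the data are written once and for all.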
4. SYSTEM SOFTWARE OF PCO
4.1. Design target
• Effective storage and retrieval of Chinese character codes on CD-ROM
In the last few years many Western CD-ROMs have been introduced into China, so general experience in using such CD-ROMs is fairly mature. Storing and using Hanzi data on CD-ROM, however, is a new problem in China with no precedent to follow, and how to achieve it became a key design issue.
• Improved access speed of CD-ROM
Generally speaking, the read-out speed of CD-ROM is relatively slow. Access speed is therefore one of the main criteria for measuring the performance of CD-ROM system software.
• Isolation of erroneous data on CD-ROM
Any database contains mistakes; the key is that data errors should not affect the normal operation of the database system. This is even more important for PCO, with its large storage capacity. Besides reducing data errors to a minimum, one difficulty that had to be overcome was how to confine errors to a limited area so that they do not affect the operation of PCO.
• System compatibility
To broaden the use of PCO and meet users' diverse needs, the system makes a great effort to remain compatible with CD-ROM drives of different types and models and with different operating systems.
• Friendly user interface
Users install and operate PCO independently on their own machines, so the system must be easy and convenient to operate. It provides full-screen displays and pull-down menus.
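The paper does not describe how PCO actually stores Hanzi or confines data errors. As one possible illustration of the first and third design targets, the Python sketch below assumes records are encoded in GB 2312 (two bytes per Hanzi) and stored in fixed-size slots, each carrying its own CRC-32 checksum. The slot size, header layout and function names are all hypothetical. Because every slot is checked independently and located by simple multiplication, a damaged record is skipped without disturbing its neighbours.

```python
import struct
import zlib

SLOT_SIZE = 128                   # assumed fixed slot size in bytes
HEADER = struct.Struct(">IH")     # CRC-32 of payload, then payload length
MAX_PAYLOAD = SLOT_SIZE - HEADER.size


def pack_slot(text):
    """Encode one record as GB 2312 and pad it to a fixed-size slot."""
    payload = text.encode("gb2312")
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("record too long for one slot")
    body = HEADER.pack(zlib.crc32(payload), len(payload)) + payload
    return body.ljust(SLOT_SIZE, b"\x00")


def read_slot(image, index):
    """Return the decoded record at a slot index, or None if it is damaged."""
    slot = image[index * SLOT_SIZE:(index + 1) * SLOT_SIZE]
    if len(slot) < SLOT_SIZE:
        return None
    crc, length = HEADER.unpack_from(slot)
    if length > MAX_PAYLOAD:
        return None
    payload = slot[HEADER.size:HEADER.size + length]
    if zlib.crc32(payload) != crc:
        return None
    try:
        return payload.decode("gb2312")
    except UnicodeDecodeError:
        return None


image = bytearray(pack_slot("中文期刊") + pack_slot("光盘") + pack_slot("检索"))
image[SLOT_SIZE + HEADER.size] ^= 0xFF    # simulate damage inside slot 1
for i in range(3):
    print(i, read_slot(image, i))
```

Fixed-size slots waste some space on short records, but they make the offset of record n directly computable, which also suits a medium with slow seeks.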
4.2. Function of Software
The system software of PCO has two parts -- installation and retrieval.
• Installation -- Before using PCO, users first have to install the retrieval software in a subdirectory on the computer's hard disk. Installation is quicker and more convenient than that of some CD-ROM systems from abroad.
• Retrieval -- Three retrieval approaches are available: by keyword, by author and by classification number. Each can be used separately, and they can be combined with mixed Boolean logic operations to obtain satisfactory retrieval results. Up to 20 items can be combined in one operation. Truncation can be used during retrieval to improve the recall ratio.
PCO can print either the full screen or selected items, as the user requires.
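The retrieval functions described above (three search fields, mixed Boolean operations over up to 20 items, and truncation) can be illustrated with a small inverted index. This Python sketch is not the PCO software, which the paper says was written in-house. The query format, the `?` truncation symbol and the left-to-right evaluation order are all assumptions made for the example.

```python
MAX_TERMS = 20                          # limit on combined items, from the text
FIELDS = ("keyword", "author", "class")


def build_index(records):
    """Map (field, lower-cased value) to the set of record ids containing it."""
    index = {}
    for rec_id, rec in records.items():
        for field in FIELDS:
            for value in rec.get(field, []):
                index.setdefault((field, value.lower()), set()).add(rec_id)
    return index


def lookup(index, field, term):
    """Exact match, or prefix match when the term ends with '?'."""
    term = term.lower()
    if term.endswith("?"):              # assumed truncation symbol
        stem = term[:-1]
        hits = set()
        for (f, value), ids in index.items():
            if f == field and value.startswith(stem):
                hits |= ids
        return hits
    return set(index.get((field, term), set()))


def search(index, query):
    """Evaluate [(field, term), op, (field, term), ...] from left to right."""
    if len(query[0::2]) > MAX_TERMS:
        raise ValueError("at most 20 items can be combined")
    result = lookup(index, *query[0])
    for op, operand in zip(query[1::2], query[2::2]):
        hits = lookup(index, *operand)
        if op == "AND":
            result &= hits
        elif op == "OR":
            result |= hits
        elif op == "NOT":
            result -= hits
        else:
            raise ValueError("unknown operator: " + op)
    return result


records = {
    1: {"keyword": ["optical disc", "retrieval"], "author": ["Chen"], "class": ["TP3"]},
    2: {"keyword": ["optics"], "author": ["Tan"], "class": ["O43"]},
    3: {"keyword": ["retrieval"], "author": ["Chen"], "class": ["G35"]},
}
index = build_index(records)
print(search(index, [("keyword", "opti?"), "AND", ("author", "Chen")]))
```

Truncation widens a search by matching every indexed value that shares the stem, which is why it raises recall at some cost to precision.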
5. MAIN FEATURES AND TYPE OF PRODUCT OF PCO
5.1. Main features
• CD-ROM as the data storage medium. Its features are large data storage capacity, fast read-out, good stability, high adaptability and low environmental requirements.
• Large collection of documents and wide range of subjects. About 610,000 title-copied records are collected from periodicals across all the natural sciences. The database is updated every half year, with about 120,000 records added per update.
• Compatibility. PCO follows international standards, and its special system software is written in C and is completely independent of the microcomputer's hard disk, so as to ensure the compatibility of the optical disc and the software.
• Convenient operation, fast retrieval, and high recall and precision ratios.
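The collection figures quoted above imply a simple growth projection. The short Python sketch below works it through, assuming (the paper does not state this) that the update rate stays constant at the quoted figure.

```python
INITIAL_RECORDS = 610_000      # records covering 1989 to 1991, from the text
RECORDS_PER_UPDATE = 120_000   # approximate records added per half-year update


def projected_records(updates):
    """Estimated total records after a number of half-year updates,
    assuming a constant update size."""
    if updates < 0:
        raise ValueError("updates must be non-negative")
    return INITIAL_RECORDS + updates * RECORDS_PER_UPDATE


for years in (1, 2, 3):
    print(years, "year(s):", projected_records(2 * years))
```

At this rate the collection grows by about 240,000 records a year, in line with the figure of more than 200,000 new records per year given earlier for the Bibliographic Database.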
5.2. Type of product:
• Data disc. PCO stores title-copied data on Chinese-language scientific and technical periodicals from 1989 to 1991.
• System disc. A high-density floppy disc stores the installation software and retrieval software of PCO.