CURRENT SITUATION AND DEVELOPMENT OF DATABASE SYSTEMS AND COMPUTERIZED INFORMATION RETRIEVAL SERVICES OF ISTIC
Yang Zongli
ISTIC
Beijing 100038, China
Abstract: Under the State Science and Technology Commission of China, ISTIC is a comprehensive scientific and technical information center at the national level. Development of database systems and computerized information retrieval services is one of the most important tasks of ISTIC. Since 1985, a total of 15 databases in English or in Chinese, consisting of 400,000 records and covering almost all subject fields, have been developed, and new information technology and methods have been used to design and search them. An advanced computerized information retrieval system consisting of an IBM 4381, a VAX-750, and hundreds of microcomputers has also been established and provides the public with on-line information services. Up to now, the ISTIC-based international on-line information retrieval terminal network has been connected to nine major international on-line information retrieval hosts, including DIALOG, BRS, ORBIT, and STN, through a satellite telecommunication network. In the coming years, ISTIC will focus its efforts more and more on applying new information technology to develop new databases, database management systems, application software, and information searching services.
The Institute of Scientific and Technical Information of China (ISTIC), directly under the State Science and Technology Commission (SSTC) of China, is a comprehensive scientific and technical information research institution at the national level. It provides scientific and technical services for the whole country. Since it was founded in 1956, ISTIC, with great support from the government, has gradually grown to be a national center for scientific and technical information with relatively rich resources and fairly advanced means of information services.
ISTIC has its headquarters in Beijing and a branch in Chongqing, Sichuan province. The main tasks of ISTIC are as follows:
• To collect various forms of scientific and technical information inside and outside of China for the development of the national economy, science and technology;
• To process and report the Chinese and foreign scientific and technical information materials that have been collected, and to edit, translate and publish related information books and journals;
• To offer manual and computerized retrieval services using different methods and means;
• To develop databases to meet China's own needs, and to establish step by step a nation-wide online information retrieval network to fully utilize information resources at home and abroad;
• To carry out information analysis and research on Chinese and foreign scientific and technical information;
• To report scientific and technical achievements both at home and abroad, and to provide strategic information services for government decision makers on matters related to the principles and policies for the development of the national economy, science and technology;
• To organize documentation work such as document collection, library service, information consulting, technical translation, reproduction, video-tape program production and broadcasting, as well as special services for national level research projects;
• To conduct analysis and research on information theories, policies, management and services, as well as on the application of modern information technology;
• To register, process, report and disseminate major national scientific and technical achievements;
• To follow the advance of domestic science and technology and to prepare reports to the State Council; and
• To promote international cooperation and interflow of scientific and technical information.
2. DESIGN AND DEVELOPMENT OF DATABASE SYSTEMS
To fully utilize and rapidly search information resources at home and abroad, and to provide the public with broad and rapid information services, ISTIC takes the design and development of databases that meet China's own needs as one of its most important tasks. The first ISTIC bibliographic database, the China Academic Conference Paper Database in machine-readable format, originated in 1984. Up to now, ISTIC has developed a total of 15 databases, which contain approximately 400,000 records and cover almost all subject fields. Six of them are in English. They are:
• China Doctoral Dissertation and Master Theses Database (CDMDB) (English version), with 30,000 records;
• China Academic Society Journal Abstracts Database (CSJA), with 20,000 records;
• ISTIC Western Language Documents Database (CATALOG), with 110,000 records;
• Chinese Research Institution Database (CSI) (English version), with 3,000 records;
• ISTIC Western Language Periodicals Database (Journal), with 20,000 records; and
• Chinese Enterprises and Companies Database (CECDB) (English version), with 20,000 records.
The ISTIC databases in Chinese are:
• Chinese Enterprises and Companies Database (CCECDB), with 100,000 records;
• China Major Scientific and Technical Achievement Database (STAC), with 30,000 records;
• China Appropriate Technology Achievements Database (ATAISC), with 90,000 records;
• China Academic Conference Paper Database (JACPC), with 100,000 records;
• Chinese Research Institution Database (CSI), with 5,000 records;
• Database of Union Catalogue of Chinese Scientific and Technical Periodicals (UCCP), with 10,000 records;
• China Master Theses Database (CDMDB), with 50,000 records; and
• China Doctoral Dissertation Database (CDDB), with 2,000 records.
The databases fall into two groups: bibliographic databases and fact databases. More detailed information on some of them follows.
CHINA ACADEMIC CONFERENCE PAPER DATABASE (JACPC), produced by the Database Development Division of ISTIC, is ISTIC's first database and the largest bibliographic file for scientific and technical conference documents in China. It comprises 110,000 documents, each containing the name of the conference, sponsors, title of the paper, authors, publishers, abstract and some other descriptive cataloging information. Indexing terms, here called descriptors, and subject category codes are assigned to each document processed, using a Chinese thesaurus and classification scheme. The database covers aeronautics, agriculture, astronomy and astrophysics, chemistry, earth science and oceanography, electronics and electrical engineering, energy conversion, materials, mathematical science, mechanical, industrial, civil and marine engineering, physics, space technology, communication, computer, control and information science, metallurgy and mining, and so on. It has been available through the on-line information retrieval systems and microcomputer-based information services at ISTIC since 1989.
CHINESE DOCTORAL DISSERTATION AND MASTER THESES DATABASE (CDMDB), a file produced by the Database Development Division of ISTIC, is ISTIC's first database in English. Each record contains the title, author, advisors, degree, name of university, abstract and some physical descriptive information. Indexing terms, consisting of descriptors and identifiers, and subject category codes are assigned to each thesis using the Thesaurus of Engineering and Scientific Terms (TEST) and UMI-DAI codes.
CHINESE ENTERPRISES AND COMPANIES DATABASE (CECDB), a directory file produced by the Technical and Economic Department of ISTIC, is a fact database providing relevant information on the enterprises and corporations in China. Each record of CECDB contains the company's name, abbreviation, director's name, province, city, country code, address, telephone, cable, telex, fax, subsidiaries or representative agencies, organization code, registered capital, subject categories, brief introduction to the unit, patents or new products, and main products or business lines. The number of employees, annual capacity (for enterprises) or sales (for companies), offers of technology and joint venture opportunities are covered too. Indexing terms, here called key words, are assigned to each record for more effective information searching. Classification codes are also assigned, using the Chinese Classification Scheme for the Chinese version and SIC (Standard Industrial Classification) for the English version. At the users' request, ISTIC can provide the following machine-readable CECDB products:
• floppy disk in CDS/ISIS record format: the records can be read and searched on IBM PC or compatible computers using the micro CDS/ISIS information package;
• floppy disk in ISO 2709 record format: all records can be read and searched on microcomputers using appropriate software developed by users;
• magnetic tape in ISO 2709 record format, which is suitable for all kinds of computers and database systems;
• magnetic tape in CCF format, in which all data are recorded in the Common Communication Format (CCF) developed by UNESCO, and which is suitable for all computers and database systems.
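The ISO 2709 exchange format named in the list above frames each record as a 24-character leader, a directory of fixed-width entries, and the variable-length data fields. The following Python sketch shows that framing only. The tags 001, 200 and 210, the sample values, and the use of UTF-8 are illustrative assumptions, not ISTIC's actual CECDB field definitions or character encoding.

```python
FIELD_END = b"\x1e"    # ISO 2709 field terminator
RECORD_END = b"\x1d"   # ISO 2709 record terminator
LEADER_LEN = 24        # the leader (record label) is always 24 characters


def build_record(fields):
    """Serialize (tag, value) pairs into one ISO 2709 style record.

    Tags must be 3 characters. Indicators and subfields are left out
    to keep the sketch short.
    """
    directory = b""
    data = b""
    for tag, value in fields:
        if len(tag) != 3:
            raise ValueError("tags must be 3 characters long")
        body = value.encode("utf-8") + FIELD_END
        # Directory entry: tag (3) + field length (4) + start offset (5).
        directory += f"{tag}{len(body):04d}{len(data):05d}".encode("ascii")
        data += body
    directory += FIELD_END
    base = LEADER_LEN + len(directory)          # where the data fields start
    total = base + len(data) + len(RECORD_END)  # full record length
    # Leader: 0-4 record length, 10-11 indicator/identifier lengths (0 here),
    # 12-16 base address of data, 20-22 directory map ("450").
    leader = f"{total:05d}" + " " * 5 + "00" + f"{base:05d}" + " " * 3 + "4500"
    return leader.encode("ascii") + directory + data + RECORD_END


def parse_record(raw):
    """Read the leader and directory, then slice out each data field."""
    leader = raw[:LEADER_LEN].decode("ascii")
    if int(leader[0:5]) != len(raw):
        raise ValueError("record length in leader does not match data")
    base = int(leader[12:17])
    len_w, pos_w, impl_w = int(leader[20]), int(leader[21]), int(leader[22])
    entry_w = 3 + len_w + pos_w + impl_w
    directory = raw[LEADER_LEN:base - 1]  # drop the directory terminator
    fields = []
    for i in range(0, len(directory), entry_w):
        entry = directory[i:i + entry_w].decode("ascii")
        tag = entry[:3]
        length = int(entry[3:3 + len_w])
        start = int(entry[3 + len_w:3 + len_w + pos_w])
        body = raw[base + start:base + start + length]
        fields.append((tag, body[:-1].decode("utf-8")))  # strip terminator
    return fields


# Hypothetical company record; the tag meanings are invented for this sketch.
sample = [("001", "CECDB-000001"), ("200", "Example Machinery Co."), ("210", "北京")]
record = build_record(sample)
restored = parse_record(record)
```

Because lengths and offsets are counted in bytes, the same framing carries Chinese and English field values alike, which is why one exchange format can serve both versions of a database.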
CECDB covers all types of commercial companies and industrial enterprises, as well as all product areas and subject fields. These establishments may be either headquarters or single-location establishments, most of which are engaged in import and export business and in manufacturing. The subject categories covered include finance and banking; international trade, and import and export business; agriculture and foods; mechanical engineering and machinery; chemical and petrochemical engineering; materials and packaging; computers, electronics and automation; communication and transportation; electrical engineering and household appliances; arts and crafts; forestry and timber processing; medicine and health; civil aviation; architecture and building materials.
Some new information technologies, for example coding technology, Chinese character processing and input methods, and new software and management systems, have been used to design and develop the above databases and their services.
3. THE COMPUTERIZED INFORMATION SERVICES AT ISTIC
Since 1980, ISTIC has gradually established a computerized information retrieval system. At present, the ISTIC computer system, consisting of an IBM 4381, a VAX-750 computer, and hundreds of microcomputers, can provide the public with the following services:
3.1. International On-Line Information Retrieval Service
Up to now, the ISTIC-based international on-line information retrieval terminal network, which consists of almost 100 terminals in China, has been connected to nine major on-line information retrieval hosts, namely DIALOG, BRS, ORBIT, STN, ESA-IRS, ECHO, PFDS, DATASTAR, and FIZ TECHNIK, through the satellite telecommunication network. Also, under the recently signed Sino-Japanese bilateral cooperation agreement in the field of S&T information, ISTIC is going to set up a terminal to access the databases in the information system of the Japan Information Center for Science and Technology. With all these international host systems, users in China can access not only the world's S&T information, but also up-to-date international news and commercial information.
3.2. Domestic On-Line Information Retrieval Services
ISTIC has developed an on-line retrieval system for English-language databases on the basis of CDS/ISIS, a general information software package donated by UNESCO, which is installed on ISTIC's mainframe computer, an IBM 4381. In addition to the imported English databases, Engineering Index (EI) and Compendex, a total of six databases in English are available. Ten databases in Chinese have also been installed and are available in the on-line information retrieval system.
ISTIC has built a national computerized information retrieval network over the long-distance telephone system and CHINAPAC, a packet-switching data communication network based on X.25, both provided by the Ministry of Posts and Telecommunication. Within the network there are nearly 100 terminals throughout the country, each of which can access either the international on-line systems through ISTIC or any of ISTIC's databases. ISTIC has also successfully connected its information retrieval network with two other Beijing-based on-line information retrieval systems, located in the Institute of S&T Information of the Ministry of Chemical Industry and the Institute of S&T Information of the Ministry of Machinery and Electronics Industry. As a result, all the terminals of ISTIC's computerized information retrieval network can now access the databases of these two on-line systems outside ISTIC.
3.3. Development of Retrieval Software
ISTIC uses two information software packages as its database management software: the retrieval software CDS/ISIS (for both mainframe and microcomputers) and TRIP. TRIP, originally designed to handle only alphabetic data, was modified by ISTIC to handle Chinese character information. Both packages perform very well in information retrieval.
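To illustrate why software written for alphabetic data needs modification before it can handle Chinese text, the following Python sketch uses the double-byte GB 2312 encoding. The choice of GB 2312 and the function name truncate_gb2312 are assumptions made for illustration; the source does not say which character code or which changes ISTIC applied to TRIP.

```python
def truncate_gb2312(data: bytes, max_bytes: int) -> bytes:
    """Cut GB 2312 encoded bytes to at most max_bytes without splitting
    a two-byte Chinese character in the middle."""
    i = 0
    while i < len(data):
        # In GB 2312 (EUC-CN form) a byte >= 0x80 starts a two-byte character.
        width = 2 if data[i] >= 0x80 else 1
        if i + width > max_bytes:
            break
        i += width
    return data[:i]


text = "中国 ISTIC"                 # 2 Chinese characters + 6 ASCII characters
encoded = text.encode("gb2312")

char_count = len(text)              # counts characters
byte_count = len(encoded)           # counts bytes; each Chinese character takes 2

# A byte-oriented cut at 3 bytes lands inside the second character.
try:
    encoded[:3].decode("gb2312")
    naive_cut_ok = True
except UnicodeDecodeError:
    naive_cut_ok = False

safe_cut = truncate_gb2312(encoded, 3).decode("gb2312")
```

Any routine that assumes one byte per character, such as field truncation, sorting or term splitting, has to be made aware of character boundaries in this way.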
4. FUTURE DEVELOPMENT OF ISTIC INFORMATION SYSTEM
In the coming years, ISTIC will focus its efforts more and more on applying new information technology to design and develop new databases, database management systems and application software, and to improve retrieval services. Within three years, ISTIC's databases will grow to between 500,000 and 600,000 records, and some new databases will be designed and developed. New methods and technology for inputting and processing Chinese characters will be applied to database management, and new image processing and scanning input technology will be used in database development. ISTIC's software engineers are now designing a new generation of database management system and information retrieval software package. A new general information package for database management named ISTIC vision 1.0, whose source program is written in Pascal and C, has been developed and is used in database management at ISTIC. The new package performs very well. Its main functions are as follows:
• To define a database containing all data elements needed by users and to open up to ten databases at the same time;
• To input new records into a database;
• To edit, check and delete any record contained in a database;
• To build and manage automatically, for each data element of a database, a file suited to rapid access, in order to optimize searching speed;
• To search the information and records of a database using a sophisticated search language;
• To display and print any record of a database;
• To sort the records of a database in any required order; and
• To develop users' own application programs using the integrated programming function provided by the software package.
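The per-element access files and the search language in the list above correspond to what is usually called an inverted file. The following Python sketch shows the idea under stated assumptions: terms are separated by whitespace, and the query syntax (field=term joined by AND, OR, NOT, evaluated strictly left to right) is invented for illustration. It is not the design of the ISTIC package, and the field names and records are hypothetical.

```python
from collections import defaultdict


class InvertedFile:
    """Maps (field, term) to the set of record ids that contain the term."""

    def __init__(self):
        self.records = {}
        self.index = defaultdict(set)

    def add(self, rec_id, record):
        """Store a record and index every term of every field."""
        self.records[rec_id] = record
        for field, value in record.items():
            for term in value.lower().split():
                self.index[(field, term)].add(rec_id)

    def lookup(self, field, term):
        """Return a copy of the posting set for one field/term pair."""
        return set(self.index.get((field, term.lower()), set()))

    def search(self, query):
        """Evaluate 'f=t AND f=t NOT f=t ...' from left to right."""
        tokens = query.split()
        result = self._operand(tokens[0])
        for op, operand in zip(tokens[1::2], tokens[2::2]):
            hits = self._operand(operand)
            if op == "AND":
                result &= hits
            elif op == "OR":
                result |= hits
            elif op == "NOT":
                result -= hits
            else:
                raise ValueError(f"unknown operator: {op}")
        return sorted(result)

    def _operand(self, token):
        field, _, term = token.partition("=")
        return self.lookup(field, term)


# Hypothetical directory records, loosely modelled on a company file.
idx = InvertedFile()
idx.add(1, {"name": "Example Machinery Works", "city": "Beijing", "products": "pumps valves"})
idx.add(2, {"name": "Sample Textile Company", "city": "Shanghai", "products": "silk cotton"})
idx.add(3, {"name": "Demo Pump Factory", "city": "Beijing", "products": "pumps"})

both = idx.search("city=beijing AND products=pumps")
only_pumps = idx.search("city=beijing NOT products=valves")
either = idx.search("city=shanghai OR products=valves")
```

A query touches only the posting sets of the terms it names, so search time depends on the number of matching records rather than on the size of the whole database.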
With its special B* data structure and original programming, the software offers very fast searching and good data exchange functions. Database contents can be exported to ISO 2709 format using the software package. A network version of the package is also being developed. It will be completed at the end of this year and will serve as information retrieval and database management software on local and long-distance computer communication networks.
In addition, ISTIC's computer system will be progressively improved by applying new information technology and by installing new computers and other hardware.