DATA BASE CONSTRUCTION FOR LIBRARY INFORMATION SYSTEM
Cai Hongmei & Liu Qimao
University Library
Huazhong University of Science and Technology
Wuhan, 430074 China
Abstract: This paper briefly explains how a library information system was built on MINISIS and how the data bases in that system were constructed and are used, and it affirms the central position of the data base in the information system. The creation, performance and implementation of the Chinese Major Science and Technology Achievements Data Base (STAC) are described in particular detail. The coordinated application of various special topic data bases in the library information system is discussed. Finally, some existing problems in the application of MINISIS are stated.
The key point in a library automation management system, and in the various special topic data base retrieval systems, is data base construction. The library information system on our campus runs on an HP3000 computer with MINISIS, an information management system developed in 1978 by the International Development Research Centre (IDRC) of Canada. MINISIS can handle many types of information records with variable-length fields, which conventional data processing techniques do not handle well. MINISIS is therefore a suitable software tool for building the library information system.
The structure of the library information system on MINISIS is shown schematically in Figure 1. First, we used MINISIS to build a data base of the books and periodicals in Chinese and Western languages held by the library. Library routines such as acquisition, cataloging, circulation and retrieval were then run on the basis of this data base together with MINISIS and IMAGE. Thereafter, we constructed on MINISIS an article titles data base of Chinese scientific and technological periodicals, a basic research data base, the Chinese major science and technology achievements data base, and a Chinese patent data base, along with their retrieval systems.
The outstanding problems during the construction of these data bases were data standardization and building speed. We therefore adopt standardized bibliographic data sources as far as possible, adopt existing data on the various special topics, and manually enter only the data that these two sources do not cover. For instance, for Chinese books, the UNIMARC data published by the Beijing Library are adopted, and the data in ISO2709 format are transformed into MINISIS data through the MINISIS-MARC interface. For books in Western languages, CD-ROM data are taken over and transferred from CD-ROM to MINISIS. Regarding PC data operation, we have successfully solved the data transmission problems between MINISIS and the CDS/ISIS or DBASE PC systems.
At present, the Chinese and Western language bibliographic data bases play an important role in our library automation system, and the periodical article titles data base has provided readers with a retrieval service since early 1992. From the project funding point of view in particular, the basic research information data base became, in February 1992, an important source of evidence or reference for researchers preparing their project applications.
Taking the Chinese Major Science and Technology Achievements data base as an example, the procedure and method of data base construction with MINISIS are described as follows.
2. THE CHINESE MAJOR SCIENCE AND TECHNOLOGY ACHIEVEMENTS DATA BASE AND ITS RETRIEVAL SYSTEM
2.1. The Design of STAC
The data base STAC (Science and Technology Achievements of China) is defined with the DATADICT module of MINISIS. STAC is an R&D-type data base with fields defined as follows:
FIELD NAME                            MNEMONIC  TAG   FIELD LENGTH
Name of achievement                   CGMC      A000  100
English name of achievement item      CGYM      A010  100
Level of achievement                  CGSP      A020  14
Beginning and end dates of research   YJRI      C200  16
Beginning date                        QSRQ      C201  8
End date                              ZZRQ      C202  8
Accomplishing unit                    WCDW      C300  31
Serial number of accomplishing unit   DWXH      C301  1
Name of accomplishing unit            WCDWM     C302  30
Major researcher                      YJRY      C400  12
Serial number of researchers          YJRX      C401  1
Name of researchers                   RYXM      C402  10
Serial number of researcher unit      YJDWXH    C403  1
Remarks                               NOTE      D300  100
Classification identity               FLBS      D400  16
Key words                             ZTBS      D500  16
Abstract of achievement               D600      D600  400
The above fields are stored as variable-length fields to save space, and subfields and repeatable fields are defined for some of them. The achievement abstract field is provided to help users understand the basic technical content of each achievement.
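The field table above can be written down as a small data structure, which makes it easy to check the definitions. The following Python sketch is only an illustration and is not part of MINISIS or DATADICT. It assumes that tags C201-C202, C301-C302 and C401-C403 are subfields of C200, C300 and C400 respectively (suggested by the tag numbering and by the lengths, but not stated explicitly), and that field lengths are counted in characters. The function names are hypothetical.

```python
# Field definitions copied from the STAC table: tag -> (mnemonic, maximum length).
STAC_FIELDS = {
    "A000": ("CGMC", 100), "A010": ("CGYM", 100), "A020": ("CGSP", 14),
    "C200": ("YJRI", 16), "C201": ("QSRQ", 8), "C202": ("ZZRQ", 8),
    "C300": ("WCDW", 31), "C301": ("DWXH", 1), "C302": ("WCDWM", 30),
    "C400": ("YJRY", 12), "C401": ("YJRX", 1), "C402": ("RYXM", 10),
    "C403": ("YJDWXH", 1),
    "D300": ("NOTE", 100), "D400": ("FLBS", 16), "D500": ("ZTBS", 16),
    "D600": ("D600", 400),
}

# ASSUMPTION: parent/subfield grouping inferred from the tag numbering.
SUBFIELDS = {
    "C200": ["C201", "C202"],
    "C300": ["C301", "C302"],
    "C400": ["C401", "C402", "C403"],
}


def subfield_lengths_consistent():
    """True if every parent length equals the sum of its subfield lengths."""
    return all(
        STAC_FIELDS[parent][1] == sum(STAC_FIELDS[child][1] for child in children)
        for parent, children in SUBFIELDS.items()
    )


def overlong_fields(record):
    """Return the tags in `record` whose value exceeds the defined maximum length."""
    return [tag for tag, value in record.items() if len(value) > STAC_FIELDS[tag][1]]


if __name__ == "__main__":
    print(subfield_lengths_consistent())
    # Made-up sample values, for illustration only.
    print(overlong_fields({"A000": "Laser welding study", "A020": "X" * 15}))
```

Under the stated grouping assumption, the lengths in the table are consistent: 8 + 8 = 16, 1 + 30 = 31 and 1 + 10 + 1 = 12.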
2.2. Data Sources and Their Processing
The Technical Research Achievements Administration Office of the State Science and Technology Commission and the Chinese Scientific and Technological Information Research Institute have been working on the construction of STAC. To avoid duplicate data entry, we process the diskettes supplied by the Chinese Scientific and Technological Information Research Institute, which follow national standard GB2901-XX "Magnetic Tape Format for Bibliography Information Interchange", so that the desired fields can be extracted and transformed to fit the MINISIS data base.
Analysis of the data on the diskettes showed that they basically agree with the ISO2709 format: each record is composed of a header, a catalog field (directory) and data fields, although some positions in the header are defined differently from ISO2709. For this reason, the data are first transferred as files from the PC to the mainframe by emulated communication. On the mainframe, a corresponding definition CD has been set up to map the structure of the ISO-format file to that of the MINISIS data base. The desired fields are then extracted and matched to those of the MINISIS data base, characters such as field indicators and subfield identifiers are deleted, and finally the data are loaded into STAC by running the ISOCONV module on the mainframe. Because MINISIS needs a CSIC (Character Set Identifying Code) to distinguish character sets when Chinese characters and other auxiliary characters are processed, the exit program MAPCHAR is used to insert the CSIC (%16%4) in front of each Chinese character field when the data are loaded into STAC.
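The extraction step described above (read the header, walk the directory, pull out the data fields, then delete indicators and subfield identifiers) can be sketched in Python. This is not the ISOCONV module and does not reproduce the GB2901 header variations mentioned above. It assumes a standard ISO 2709 layout with a 24-character header and a MARC-style entry map of "4500", treats the text as ASCII so that characters and bytes coincide, and uses a hypothetical tag mapping (`TAG_MAP`) and made-up sample data. The CSIC insertion performed by MAPCHAR is not modelled.

```python
FIELD_END = "\x1e"   # ISO 2709 field terminator
RECORD_END = "\x1d"  # ISO 2709 record terminator
SUBFIELD = "\x1f"    # ISO 2709 subfield delimiter

# HYPOTHETICAL mapping from source tags to STAC tags, for illustration only.
TAG_MAP = {"200": "A000", "330": "D600"}


def build_record(fields):
    """Build a minimal ISO 2709 record so the demo has something to parse."""
    directory, data, pos = "", "", 0
    for tag, value in fields:
        body = value + FIELD_END
        directory += f"{tag}{len(body):04d}{pos:05d}"  # tag, length, start
        data += body
        pos += len(body)
    base = 24 + len(directory) + 1           # base address of the data
    total = base + len(data) + 1             # full record length
    header = f"{total:05d}" + "n    " + "22" + f"{base:05d}" + "   " + "4500"
    return header + directory + FIELD_END + data + RECORD_END


def parse_record(record):
    """Return (tag, raw field) pairs using the header and directory."""
    header = record[:24]
    base = int(header[12:17])
    len_len, start_len = int(header[20]), int(header[21])
    impl_len = int(header[22]) if header[22].isdigit() else 0
    entry = 3 + len_len + start_len + impl_len
    directory = record[24:base - 1]
    fields = []
    for i in range(0, len(directory), entry):
        tag = directory[i:i + 3]
        length = int(directory[i + 3:i + 3 + len_len])
        start = int(directory[i + 3 + len_len:i + 3 + len_len + start_len])
        raw = record[base + start:base + start + length]
        fields.append((tag, raw.rstrip(FIELD_END)))
    return fields


def strip_markup(value, indicator_len=2):
    """Delete the indicators and the subfield identifiers from a field."""
    head, *subs = value[indicator_len:].split(SUBFIELD)
    parts = ([head] if head else []) + [s[1:] for s in subs]
    return " ".join(parts)


def convert(record):
    """Keep only the mapped fields and return them keyed by target tag."""
    return {TAG_MAP[tag]: strip_markup(raw)
            for tag, raw in parse_record(record) if tag in TAG_MAP}


if __name__ == "__main__":
    demo = build_record([
        ("200", "1 " + SUBFIELD + "aLaser welding study"),
        ("330", "  " + SUBFIELD + "aShort abstract"),
        ("801", "  " + SUBFIELD + "aCN"),   # unmapped field, dropped
    ])
    print(convert(demo))
```

Real records would be handled as bytes in the character set used on the diskettes, and the header positions that differ from ISO2709 would need their own handling.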
2.3. Functions and Features of the System
The library information system built on the HP3000 with MINISIS offers relatively strong functions in the following respects.
• Information Retrieval
- To limit the disk space overhead, the likely retrieval paths were analyzed in detail according to usual practice and to supervisory and retrieval demands, and inverted files were created only for selected fields of STAC, including name of achievement, research beginning date, end date, accomplishing unit, name of researchers, Chinese classification number, keyword and abstract. These serve as quick retrieval entries, so that the needed information can be accessed quickly and accurately.
- The inverted files for the achievement name and abstract fields are built in word-extraction mode, which allows free text retrieval and makes the operation convenient and flexible.
- Because of organizational changes and the lack of a standard written form, unit names in the data base may not be uniform. Inverting the whole field would therefore be quite inconvenient for retrieval, so word-extraction mode is adopted for unit names as well to allow free text retrieval.
- Boolean expressions may be used to describe a user's complicated retrieval demands flexibly and accurately.
- For fields without an inverted file, sequential retrieval may be run to meet a user's special needs.
• Printout of retrieved results can be obtained easily.
• Manipulating data base contents to produce statistical figures, lists and reports on request.
• Calling intrinsic procedures of MINISIS from a high-level language to perform batched retrieval and printout. This function is mainly used for originality checking during the annual application for funded projects.
• Limiting the applicable scope of data. A projection data base is created in PS projection mode for some confidential achievements to limit their applicable scope and ensure data security.
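The retrieval functions listed above (inverted files built word by word, Boolean combination of terms, and sequential retrieval on fields without an inverted file) can be illustrated with a short Python sketch. It shows the general technique only and does not represent how MINISIS stores or searches its inverted files. The sample records and function names are made up for illustration; the field mnemonics CGMC and NOTE are taken from the STAC field table.

```python
import re
from collections import defaultdict

# Made-up sample records keyed by record number, for illustration only.
RECORDS = {
    1: {"CGMC": "Laser welding of steel plate", "NOTE": "pilot project"},
    2: {"CGMC": "Steel corrosion testing", "NOTE": ""},
    3: {"CGMC": "Laser ranging instrument", "NOTE": "pilot project"},
}


def build_word_index(records, field):
    """Inverted file in word mode: each word maps to the records containing it."""
    index = defaultdict(set)
    for rec_id, rec in records.items():
        for word in re.findall(r"\w+", rec.get(field, "").lower()):
            index[word].add(rec_id)
    return index


def sequential_scan(records, field, text):
    """Fallback for fields without an inverted file: read every record."""
    needle = text.lower()
    return sorted(rec_id for rec_id, rec in records.items()
                  if needle in rec.get(field, "").lower())


if __name__ == "__main__":
    names = build_word_index(RECORDS, "CGMC")
    # Boolean retrieval maps directly onto set operations.
    print(sorted(names["laser"] & names["steel"]))   # laser AND steel
    print(sorted(names["laser"] | names["steel"]))   # laser OR steel
    print(sorted(names["steel"] - names["laser"]))   # steel NOT laser
    print(sequential_scan(RECORDS, "NOTE", "pilot"))
```

Word-mode indexing is what makes free text retrieval tolerant of inconsistent unit names: a query term matches any record containing that word, wherever it appears in the field.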
3. COMPREHENSIVE APPLICATION OF SPECIAL TOPIC DATA BASE IN LIBRARY INFORMATION SYSTEMS
The library information system in our university contains three special topic data bases: basic research information, Chinese major science and technology achievements (STAC) and Chinese invention patents. These can be operated either individually or collectively to serve our users' needs. The purpose of creating these three data bases is to offer comprehensive information to various kinds of users, help them select research subjects, avoid duplicated research effort, and raise the success rate of project applications. They may also serve as a source of evidence or reference for high-level decision making.
All three are R&D data bases built on MINISIS. The question is how to link them so that users can effectively retrieve from any one of the three, or from a combination of them, in a single operation.
First, we analyzed the configuration of the different fields and tried to unify the field identifier, field length and inverted file design of common fields such as name of project, key words and research beginning date.
Next, taking these common fields as retrieval entries, we invoke intrinsic procedures of MINISIS so that the same retrieval request is applied to each of the selected R&D data bases separately, and the hit records are printed out. This process is shown in Figure 2.
Third, good user interfaces were developed to provide users with multipath retrieval in menu format, using the functional modules of MINISIS and application programs.
These three measures allow our users to access the three data bases effectively.
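The second measure, applying one retrieval request to several data bases through their common fields, can be sketched as follows. This Python example only illustrates the idea and does not use or imitate the MINISIS intrinsic procedures. The data base names, the sample records and the function `search_all` are hypothetical, and the key word mnemonic ZTBS is assumed to be shared by all three data bases after unification.

```python
# Made-up sample contents; the three names are hypothetical labels.
DATABASES = {
    "BASIC": {1: {"ZTBS": "laser optics"}, 2: {"ZTBS": "polymer"}},
    "STAC": {1: {"ZTBS": "laser welding"}},
    "PATENT": {7: {"ZTBS": "valve"}, 8: {"ZTBS": "laser diode"}},
}


def search_all(databases, field, term, selected=None):
    """Apply one request to each selected data base and collect the hits.

    `selected` is an optional list of data base names; None means all.
    """
    needle = term.lower()
    hits = {}
    for name, records in databases.items():
        if selected is not None and name not in selected:
            continue
        hits[name] = sorted(
            rec_id for rec_id, rec in records.items()
            if needle in rec.get(field, "").lower()
        )
    return hits


if __name__ == "__main__":
    # Same request against all three data bases.
    print(search_all(DATABASES, "ZTBS", "laser"))
    # Same request against a chosen combination.
    print(search_all(DATABASES, "ZTBS", "laser", selected=["STAC", "PATENT"]))
```

The sketch depends on the first measure: because the common field carries the same identifier in every data base, a single request can be reused without per-data-base translation.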
4. CONCLUSION
For building a library information system, MINISIS has proved versatile and convenient, but it also has certain limitations. These include:
• The maximum length of a supported logical record is only 64 KB, and the repetitions of a repeatable field may not exceed 800 KB, which is insufficient, particularly for handling periodicals.
• When a full-screen input format is designed with the MINISIS and VPLUS interface, the field length must not exceed 256 bytes; beyond that, the only option is to switch to character processing mode.
• Longer records cannot be displayed screen by screen on the terminal.
• The COMPUTE module of MINISIS has certain limitations for statistics: operation can be complicated, percentage calculation is unavailable, and no more than 16 variables can be used.
We hope that these limitations will be eliminated in future versions of MINISIS. In the meantime, we will try to explore new ways to overcome them.