TITLE: THE INTERNATIONAL PATENT CLASSIFICATION, 5TH EDITION, HUNGARIAN TRANSLATION
Agnes Felklné Szanyi
National Office of Inventions
Budapest, Hungary
Sándor Biszak
ARCANUM B.T.
Budapest, Hungary
Abstract: CD-ROM technology is a powerful new tool in patent information. It is an ideal medium for a frequently used database whose content remains stable for a long period, such as the International Patent Classification (IPC), each edition of which is valid for five years. The electronic archive of the printed Hungarian-language edition of the 5th edition of the IPC served as the basis for the fulltext database. The user interface and the software were developed by ARCANUM B.T. specifically to support and improve searching in the database.
The particular advantages of the software package include: left-hand truncation, not only in search statements but also in index searches in expand mode; hypertext-like access to cross references; display of chemical formulae; and both fulltext and hierarchical display, with one-line or multiline presentation of the text.
The first CD-ROM products in the field of patent information reached the National Office of Inventions at the beginning of 1990. CASSIS, FIRST and ESPACE -- patent databases containing bibliographic and facsimile information -- proved to be such a great success in use that by the eve of 1991 the first Hungarian CD-ROM appeared on the market.
The disc ultimately emerged from a book, the printed Hungarian-language edition of the 5th edition of the International Patent Classification (IPC), valid from January 1, 1990 until December 31, 1994. It is the result of successful cooperation between five organizations: translation, proofreading, classification expertise and general management were provided by the National Office of Inventions; typesetting, software and the user interface were developed by ARCANUM B.T.; premastering was carried out by TUDORG, the editor of the disc; the master was prepared by PHILIPS in the Netherlands; and finally GLORIA KFT printed 200 copies for commercial use.
2. DATA CAPTURE, DESIGNING TOOLS
Early in the translation period, the National Office of Inventions decided to convert the text of the IPC into a fulltext database in order to facilitate searching. The Office was aware that the basic condition for achieving this aim was a high-quality electronic text file. Whenever a book or any other printed matter serves as the basis for a database, the key to high quality is the data capture that is done before printing.
This aspect was seriously considered when the Hungarian Ventura DTP system was chosen for editing the printed version. After typesetting, the source files were archived in ASCII format together with all the printing information, e.g. typefaces, layout, etc. The great advantage of the ASCII format is that it integrates data input and word processing and allows as much of the information as required to be edited. The printed version of the IPC-5 also contains about 120 images of chemical formulae that were mounted in the book. These images were digitized in the .PCX format of the PC Paintbrush software.
3. THE USE OF THE INTERNATIONAL PATENT CLASSIFICATION AS A SYSTEM
The International Patent Classification has been developed as a result of broad cooperation of all national patent offices. The aim of the work was to create an effective search tool that could be used for the retrieval of patent documents.
A patent is a means of industrial property protection where the object of protection is a technical process or a product of a technical process. This is why it was decided that patents are best classified according to their technical content. The IPC is used all over the world, and its content is revised approximately every five years.
The structure of the IPC is rather complicated and intricate. It is impossible to give a comprehensive picture of it here, but a basic knowledge is essential in order to understand the complexity of the software solution.
It has a hierarchical construction with many subordinate and coordinate branches. On the highest level there are eight sections, each of them representing a special field of industry, e.g. B - mechanics, C - chemistry, D - paper and textile industries, etc. All of the sections are divided into classes (e.g. A01), and the classes into subclasses (e.g. A01B). The subclasses consist of main groups (e.g. A01B 1/00), which in turn consist of subgroups (e.g. A01B 1/22, A02D 6/30, or C07D 212/3375). The symbols of main groups end in "/00". The place of a subgroup in the hierarchy is not correlated with its numbering. Its role is determined solely by the number of dots preceding its title. The highest subgroup level has one dot; the lowest level can have as many dots as necessary, even eight.
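The levels described above can be read directly off a classification symbol. The following Python fragment is only a minimal sketch of that decomposition and is not taken from the IPC-5 software; the names `IpcSymbol` and `parse_symbol` and the accepted spellings of a symbol (with or without spaces) are assumptions made for illustration.

```python
import re
from typing import NamedTuple, Optional

class IpcSymbol(NamedTuple):
    """Hypothetical container for the hierarchy levels of one IPC symbol."""
    section: str                 # e.g. "A"
    ipc_class: str               # e.g. "A01"
    subclass: str                # e.g. "A01B"
    main_group: Optional[str]    # e.g. "A01B 1/00", None for a bare subclass
    subgroup: Optional[str]      # e.g. "A01B 1/22", None for a main group

# Section letter A-H, two-digit class, subclass letter, optional "group/rest".
_SYMBOL = re.compile(r"^([A-H])\s*(\d{2})\s*([A-Z])(?:\s*(\d{1,4})/(\d{2,6}))?$")

def parse_symbol(text: str) -> IpcSymbol:
    """Split a symbol such as 'A01B 1/22' or 'A 01 B 1/00' into its levels."""
    match = _SYMBOL.match(text.strip())
    if match is None:
        raise ValueError(f"not an IPC symbol: {text!r}")
    section, cls, sub, group, rest = match.groups()
    subclass = section + cls + sub
    if group is None:
        return IpcSymbol(section, section + cls, subclass, None, None)
    main_group = f"{subclass} {group}/00"
    # Main groups end in "/00"; anything else is a subgroup of that main group.
    subgroup = None if rest == "00" else f"{subclass} {group}/{rest}"
    return IpcSymbol(section, section + cls, subclass, main_group, subgroup)
```

Note that the symbol alone only identifies the main group; the depth of a subgroup within it has to come from the dots in front of its title.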
A subgroup title always defines a field of subject matter within the scope of its main group and the nearest group above it having one dot less. The subgroup title is often a complete sentence, in which case it begins with a capital letter. If the subgroup title begins with a lower case letter, it is to be read as a continuation of the title of the next higher, less-indented group, i.e. the one having one dot less. Nevertheless, in all cases, the subgroup title must be read as being dependent upon, and restricted by, the title of the group under which it is indented.
Examples:
a, A 01 B 1/00 Hand tools
   1/24 for treating meadows or lawns
The title of 1/24 is to be read as: Hand tools for treating meadows or lawns.
b, A 01 B 1/00 Hand tools
   1/16 Tools for uprooting weeds
The title of 1/16 is a complete expression but, owing to its hierarchical position, the tools for uprooting weeds are restricted to hand tools.
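The reading rule illustrated by these examples can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the method used by the IPC-5 software: the `Entry` record, the function names, and the dot counts given to the sample entries (one dot each) are introduced here for the example, and the entries are assumed to be stored in printed order.

```python
from typing import List, NamedTuple

class Entry(NamedTuple):
    """Hypothetical record for one group: symbol, number of dots, title."""
    symbol: str
    dots: int      # 0 for a main group, 1..n for subgroups
    title: str

def ancestors(entries: List[Entry], index: int) -> List[int]:
    """Indices of the groups the entry is indented under, main group first."""
    chain = []
    wanted = entries[index].dots - 1
    # Walk backwards: the nearest preceding entry with one dot less is the parent.
    for i in range(index - 1, -1, -1):
        if wanted < 0:
            break
        if entries[i].dots == wanted:
            chain.append(i)
            wanted -= 1
    return chain[::-1]

def read_title(entries: List[Entry], index: int) -> str:
    """Lower-case titles continue the parent's title; capitalised ones stand alone."""
    entry = entries[index]
    if entry.dots > 0 and entry.title[:1].islower():
        parent = ancestors(entries, index)[-1]
        return read_title(entries, parent) + " " + entry.title
    return entry.title

# Sample data; the dot counts are assumed for illustration.
entries = [
    Entry("A01B 1/00", 0, "Hand tools"),
    Entry("A01B 1/16", 1, "Tools for uprooting weeds"),
    Entry("A01B 1/24", 1, "for treating meadows or lawns"),
]
print(read_title(entries, 2))
```

A capitalised title such as that of 1/16 is returned unchanged; its restriction to hand tools is still available through `ancestors`, which returns the path up to the main group.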
There are certain parts of the IPC which don't form part of the hierarchical system:
• contents of the section;
• subclass indexes;
• notes;
• guide headings.
Some of them contain index-type information, some add references to a group or a larger unit, or introduce a non-defined secondary hierarchy like the guide headings or sub-section titles.
Finally references must be mentioned. These are one or more phrases in parentheses -- following in many cases the text of a class, a subclass, a group, guide heading or a note -- referring to another place in the IPC. A reference shows that the subject matter indicated in it is covered by the place referred to.
4. THE FRAMES OF THE PROGRAM
The retrieval software, written in C, handles all aspects of the complicated classification system outlined above.
On the one hand, IPC-5 is a menu-driven system. On the other hand, "hot keys" allow quick navigation in the database.
The main menu contains three pull-down submenus:
• DISPLAY
• SEARCH
• OPTIONS.
These are accessible from any part of the program by pressing the first letter of the menu name while holding down the ALT key.
The F1 key provides context-sensitive help, which is arranged hierarchically. Error messages are shown in pop-up windows in the middle of the screen.
4.1. The DISPLAY Mode
The IPC can be studied as if it were the printed edition by reading it in fulltext mode. It can also be displayed hierarchically, following the whole path of entries from the most specific up to the most generic level, the section. This hierarchical mode is very convenient in the case of long and detailed subgroup structures. For example, A 23 B 4/22 reads in its fulltext environment as "Micro-organisms; Enzymes", whereas in the hierarchical structure it is clear that these micro-organisms are only those used in methods for food preservation.
If a hierarchy is too long for one screen -- because some titles, especially those of main groups or subclasses, have long, detailed references -- the program can change to "one-line mode", where only the first line of each entry is shown.
The F2 toggle key switches between fulltext and hierarchical modes. In hierarchical mode, entries that have subordinate groups can be opened step by step by pressing F3; F4 closes the desired level. Searches can be performed only in fulltext mode. In hierarchical display mode, the possibility of opening a group is indicated by arrows at the beginning of each classification symbol.
Images -- if indicated in the text -- can be displayed by moving the cursor with the arrow keys onto the symbol of a figure and pressing ENTER.
The software handles the cross references as a horizontal menu. A cross reference is selected with the LEFT or RIGHT arrow key, and F8 jumps to the entry referred to. F7 returns from the cross reference. Any number of forward jumps can be made, but only one step back is allowed.
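The jump behaviour described here, unlimited forward jumps with a single step back, can be modelled with one remembered position instead of a full history stack. The class below is a hypothetical Python sketch, not the original C implementation; the class and method names and the sample symbols are assumptions.

```python
class CrossRefNavigator:
    """Sketch of F8/F7 navigation: many jumps forward, one step back."""

    def __init__(self, start: str) -> None:
        self.current = start
        self._previous = None  # only the entry left by the latest jump is kept

    def jump(self, target: str) -> None:
        """F8: follow the selected cross reference to the entry referred to."""
        self._previous = self.current
        self.current = target

    def go_back(self) -> bool:
        """F7: return to the entry left by the latest jump, if there is one."""
        if self._previous is None:
            return False
        self.current = self._previous
        self._previous = None  # a second step back is not available
        return True

# The symbols below are arbitrary placeholders.
nav = CrossRefNavigator("A01B 1/00")
nav.jump("A01G 3/00")
nav.jump("B25G 1/00")
first_return = nav.go_back()    # succeeds, back at "A01G 3/00"
second_return = nav.go_back()   # fails, the earlier position was not kept
```

Keeping a single previous position is the simplest structure that matches the described behaviour; a stack would be needed only if multi-step return were required.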
4.2. How to Search
The search function is not divided into options. Search statements must be formulated either by typing them in or by choosing terms from one of the three indexes: the word index, the IPC symbol index and the cross reference index. A specific feature of the retrieval software is that it handles variable-length fields in the word index, and the length of index terms is unlimited.
The standard Boolean operators AND and OR are supported; NOT is not implemented. AND means that the two words can be found on the same hierarchical path.
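Evaluating AND over a hierarchical path rather than a single entry can be pictured as follows. This Python fragment is a sketch only: the function names are hypothetical, the path titles are simplified placeholders rather than the official wording of the IPC, and a real implementation would work on an inverted index instead of scanning text.

```python
import re
from typing import List, Set

def path_words(path_titles: List[str]) -> Set[str]:
    """All words occurring in the titles along one hierarchical path."""
    words: Set[str] = set()
    for title in path_titles:
        words.update(re.findall(r"\w+", title.lower()))
    return words

def matches_and(path_titles: List[str], first: str, second: str) -> bool:
    """AND: both words occur somewhere on the same path."""
    words = path_words(path_titles)
    return first.lower() in words and second.lower() in words

def matches_or(path_titles: List[str], first: str, second: str) -> bool:
    """OR: at least one of the words occurs on the path."""
    words = path_words(path_titles)
    return first.lower() in words or second.lower() in words

# Placeholder path from a main group down to a subgroup (titles are invented).
path = [
    "General methods for preserving foods",
    "Preserving with chemicals",
    "Micro-organisms; Enzymes",
]
```

With this path, `matches_and(path, "enzymes", "preserving")` succeeds even though the two words never occur in the same entry, which is the practical effect of evaluating AND over the path.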
The following proximity operators are supported:
• (S) - the words connected are in the same entry;
• (nW) - n intervening words, in the order given;
• (nN) - n intervening words, where the order of the words isn't defined.
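The three operators can be sketched over the text of a single entry as shown below. This is an illustrative Python fragment, not the original retrieval code. It assumes that "n intervening words" means at most n words between the two terms, which the text does not state explicitly, and the function names are hypothetical.

```python
import re
from typing import List

def tokens(text: str) -> List[str]:
    """Lower-cased words of one entry, in order of appearance."""
    return re.findall(r"\w+", text.lower())

def same_entry(text: str, first: str, second: str) -> bool:
    """(S): both words occur in the same entry."""
    words = set(tokens(text))
    return first.lower() in words and second.lower() in words

def near(text: str, first: str, second: str, n: int, ordered: bool) -> bool:
    """(nW) if ordered, (nN) otherwise: at most n words between the two terms."""
    words = tokens(text)
    first_positions = [i for i, w in enumerate(words) if w == first.lower()]
    second_positions = [i for i, w in enumerate(words) if w == second.lower()]
    for a in first_positions:
        for b in second_positions:
            if a == b:
                continue
            if ordered and b < a:
                continue  # (nW) requires `first` to precede `second`
            if abs(b - a) - 1 <= n:
                return True
    return False

entry_text = "Hand tools for treating meadows or lawns"
```

For the sample entry, "tools" and "meadows" are separated by two words, so a (2W) test succeeds in that order, while the reversed pair is matched only by (2N).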
Right-hand, left-hand and embedded truncation are supported both in search statements and in index browsing in expand mode. In that mode the DEL key deletes the default "pattern" (with which all words are displayed), and users can define a new pattern so that only the words matching it are shown.
A built-in bridge feature of the software allows retrieved IPC symbols to be transferred to bibliographic and facsimile type online and CD-ROM patent databases.
The search process is very fast; the speed of the disc is comparable to that of a hard disk.
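Pattern filtering of the word index in expand mode can be sketched with wildcard matching. The text does not say which truncation character the software uses, so the `*` below is an assumption, as are the function name and the sample index terms; the sketch relies on Python's standard `fnmatch` module rather than on anything from the original program.

```python
from fnmatch import fnmatchcase
from typing import List

def expand(index_terms: List[str], pattern: str = "*") -> List[str]:
    """Return the index terms that match the pattern.

    "*" stands for any run of characters (an assumed truncation symbol):
      "tool*"      right-hand truncation
      "*organisms" left-hand truncation
      "m*s"        embedded truncation
    The default pattern "*" shows every term, like the default expand display.
    """
    pattern = pattern.lower()
    return [term for term in index_terms if fnmatchcase(term.lower(), pattern)]

# Sample word index (invented for the example), already sorted.
terms = ["enzymes", "lawns", "meadows", "micro-organisms", "organisms", "tools"]
left_truncated = expand(terms, "*organisms")
embedded = expand(terms, "m*s")
```

Left-hand truncation is the expensive case for a sorted index, since matching terms are not adjacent; the linear scan used here sidesteps that question and says nothing about how the original software achieved its speed.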
4.3. Options for Users
The OPTIONS menu provides seven options: directory setting, colour setting, returning to DOS, exit point, profile saving, downloading, and finally short information on the copyright.
5. PERSPECTIVES OF DEVELOPMENT, PUBLICATION PLANS
The IPC-5HU development was the first CD-ROM pilot project at the National Office of Inventions. In the first phase, the performance of the retrieval software was analyzed after the database had been converted from magnetic disk to optical disc. In the second phase, the retrieval engine was optimized for use on the storage medium with slow access time, so the performance of the program was greatly improved.
As for the strategy of ARCANUM, the company concentrated its efforts on a very special application, in which its typesetting expertise was efficiently exploited in the software development phase. This knowledge served as a basis for participation in the IPC:CLASS tender of the World Intellectual Property Organization (WIPO).
The aim of IPC:CLASS is to publish several editions of the IPC (official English 3rd to 5th, official French 3rd to 5th, German 4th to 5th, Spanish 5th edition) on one optical disc, including the special printed search support tools, e.g. the Official Catchword Index, the Stich- und Schlagwörterverzeichnis, concordance tables and other support information, e.g. the file of valid IPC symbols.
The participants in the WIPO tender were the leading software companies in the field of patent information (Dataware, Jouve), other well-known software companies (BRS, Bertelsmann, etc.) and a small Hungarian company, ARCANUM. ARCANUM was selected by WIPO, and the contract was signed in October 1991.
As for the plans of the publisher, ARCANUM is ready to publish some very typical CD titles in the near future:
• Who is Who in Hungary
• The Holy Bible in Hungarian language
The partner in the IPC-5HU project, the National Office of Inventions, plans to publish its patent specifications on optical disc in cooperation with the European Patent Office, starting in 1992. According to the current plans, the disc will contain the facsimile images of the Czech-Slovakian and Hungarian patent specifications (and, later, specifications from Poland).
Because of the poor Hungarian telecommunication facilities, optical disc publication and dissemination of information has a high priority in the information strategy of the Hungarian patent office. The Hungarian patent bibliographic database (HUNPADOC), the Hungarian Trademark Register and the Register of Industrial Designs will therefore be published in the next year.