Title Of Invention:
Method Of Categorization And Indexing Of Information
Field Of The Invention:
This invention generally relates to a method of organizing information, and more particularly to Internet search engines and directories.
Background Of The Invention:
Computers have become a very useful tool for storing information for later retrieval, because they can manage very large collections of records. However, with the ever-increasing collection of documents that is the Internet, organisation of the information has become a very difficult task. The mechanisms currently used for this purpose are directories and search engines. Directories are hierarchical structures that group similar documents together into groups and subgroups. This type of structure is unable to take into account the multiple relationships that normally exist among documents and information.
For Example:
• A typical product directory for machines may group the machines by type, such as lathes, milling machines, grinding machines, shaping machines, etc.
• They may also be organised by application, such as machines for the wristwatch industry, machines for the automobile industry, machines for heavy industry, etc.
• They can also be categorized by type of usage, such as large-scale production, medium-scale production, workshop machines, etc.
• They can also be classified by the type of raw material the machines are designed to work on.
• They can also be categorized by the type of controls used, such as automatic, semiautomatic, manual, etc.
• There can be further categorisation based on make, source, quality certification, special construction features, etc.
A hierarchical directory index cannot effectively address all of these relationships at once.
Search engines such as Yahoo, Excite, AltaVista, Google, etc. store and index keywords extracted from the text of the document. Documents are given a relative ranking for a particular keyword based on the emphasis, number of occurrences, and location of the word in the document. To search for a desired document, the user enters in the search field keywords that are likely to appear in that document. The result is generally a list of documents that contain the entered keywords. The user has to browse through these documents before finding the required one. To improve the precision of the search, current search engines employ various techniques such as ranking based on user activity, proximity grouping, etc.
Even so, search remains far from satisfactory, especially for technical and business information.
Summary Of The Invention:
The invention disclosed here has three main components: an indexing system including a grouping system, a search system, and a user interface.
In one aspect of the invention, the indexing system provides multiple fields for indexing the document. Every field, provided for indexing the document, has a defined relationship with the information contained in the indexed document. The defined relationships are Information Type, Object of the Information, Source Sector Of the object of information, Source process of the object of information, Function of the object of information, Branch of knowledge, Application Or Process for which object of information is used, Relation of the information with the application, Category of the Process or application, Output of the Process Or Application, Sector of the Output.
In one aspect of the invention, the fields Information Type, Source Sector of the object of information, Source process of the object of information, Branch of knowledge, Relation of the information with the application, Category of the Process or application, and Sector of the Output are predefined, and the rest are user defined.
In one aspect of the invention, each unique set of field entries is defined as a distinct category under which documents are grouped. Each category is described by a text expression generated by joining the field entries with suitable defining terms.
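By way of illustration only, the following sketch shows one way a category could be represented as a unique combination of the eleven field entries and rendered as a text expression. The class name `Category`, the attribute names, the helper `category_expression`, and in particular the joining phrases in `JOINING_TERMS` are assumptions made for this example; the specification does not list the actual defining terms.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class Category:
    """One category: a unique combination of entries in the eleven fields."""
    information_type: Optional[str] = None
    object_of_information: Optional[str] = None
    source_sector: Optional[str] = None
    source_process: Optional[str] = None
    function: Optional[str] = None
    branch_of_knowledge: Optional[str] = None
    process: Optional[str] = None
    relation_to_process: Optional[str] = None
    process_category: Optional[str] = None
    process_output: Optional[str] = None
    output_sector: Optional[str] = None


# ASSUMPTION: these joining phrases are invented for illustration only.
JOINING_TERMS = {
    "information_type": "{}",
    "object_of_information": "on {}",
    "source_sector": "from the {} sector",
    "source_process": "made by {}",
    "function": "used for {}",
    "branch_of_knowledge": "in the field of {}",
    "process": "for the process {}",
    "relation_to_process": "as {}",
    "process_category": "in the {} process category",
    "process_output": "to produce {}",
    "output_sector": "for the {} sector",
}


def category_expression(category: Category) -> str:
    """Join the filled-in field entries, in field order, into one expression."""
    parts = []
    for f in fields(category):
        value = getattr(category, f.name)
        if value:  # skip fields that were left empty
            parts.append(JOINING_TERMS[f.name].format(value))
    return " ".join(parts)


# Hypothetical example entries.
example = Category(
    information_type="Product and service information",
    object_of_information="Steel sheet",
    relation_to_process="Raw material",
    process_output="Car body",
)
print(category_expression(example))
```

Because the dataclass is frozen, two sets of identical field entries compare equal and hash identically, so documents indexed with the same entries fall under the same category.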
In one aspect of the invention, the search is carried out in two stages. In stage one, appropriate entries are made in the search fields, where known, and the output is the set of categories available in the database that conform to the search query. In stage two, the desired category is selected to view the documents registered under it.
In one aspect of the invention, making entries in additional fields reduces the number of resultant categories and helps to locate the information quickly.
In one aspect of the invention, queries like raw materials required for manufacturing a given product or technologies for manufacturing a given product or materials going from a given sector to another given sector can be raised.
In one aspect of the invention, complex queries, such as alternatives for raw materials, machinery, technology, etc., or products going from one sector to another, can be put to the database.
Brief Description Of Drawings and Tables:
Figure 1 : Relationships of the expressions used for indexing information.
Figure 2: General arrangement
Table 1 : Table describing fields.
Table 2: List of document type categories
Table 3: List of Branch of knowledge categories
Table 4: List of Sector categories
Table 5: List of Process categories
Table 6: List of Object to process relationships
Detailed Description Of Preferred Embodiments
The method for organising information disclosed here is useful for a variety of applications, which include indexing of web pages on the Internet, indexing of classified advertisements, indexing of interests to receive information by e-mail or instant messaging, etc.
The information is organised in precisely searchable categories and stored in the database.
In the system described here, the required information is searched in two stages. In stage one the appropriate category is searched, and in stage two the information stored under the category is viewed.
The system basically consists of a database (preferably a relational database), a user interface, an information entry program, and an information search program.
A. The database:
The database has the following tables:
a) Document type table: This table stores standardized categories of documents for validation.
b) Object of information table: This table stores entries made in the field of object of information.
c) Branch of knowledge table: This table stores standardized branches of knowledge of documents for validation.
d) Function table: This table stores entries made in the field of function of the object of information.
e) Process table: This table stores entries made in the field of process names.
f) Process category table: This table stores standardized process categories of the processes for validation.
g) Process output table: This table stores entries made in the field of process output.
h) Sector table: This table stores standardized names of sectors for validation.
i) Object to process relationship table: This table stores standardized object to process relationships for validation. These are essentially derived from man, money, machine, material, system, and organisation, expanded further.
j) Category table: This table stores the categories, each created by a unique combination of entries in the eleven fields that uniquely describes the category.
k) Category to URL table: This table records the title of the document, the URL of the document, and the category as described in the category table.
l) Category to classified advertisement table: This table records the title of the classified advertisement, the classified advertisement itself, and the category as described in the category table.
m) Category to e-mail table: This table records the e-mail and the category as described in the category table.
The details of standardised categories are described elsewhere in the document.
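As a non-limiting illustration, the thirteen tables listed above could be laid out in a relational database along the following lines, here using SQLite through Python's standard `sqlite3` module. All table and column names, the helper `create_schema`, and the choice to store an undefined field as an empty string are assumptions of this sketch and are not prescribed by the specification.

```python
import sqlite3

# Tables of standardized (predefined) values, used for validation.
STANDARD_TABLES = [
    "document_type", "branch_of_knowledge", "process_category",
    "sector", "object_process_relationship",
]
# Tables that accumulate user-defined entries.
USER_TABLES = [
    "object_of_information", "object_function", "process", "process_output",
]
# The eleven fields whose combination defines a category (names assumed).
CATEGORY_FIELDS = [
    "document_type", "object_of_information", "source_sector",
    "source_process", "object_function", "branch_of_knowledge", "process",
    "object_process_relationship", "process_category", "process_output",
    "output_sector",
]


def create_schema(conn: sqlite3.Connection) -> None:
    """Create the lookup tables, the category table and the three link tables."""
    for name in STANDARD_TABLES + USER_TABLES:
        conn.execute(
            f"CREATE TABLE {name} (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)"
        )
    # An empty string stands for "field not defined"; NULL is avoided because
    # SQLite treats NULLs as distinct inside a UNIQUE constraint.
    columns = ", ".join(f"{f} TEXT NOT NULL DEFAULT ''" for f in CATEGORY_FIELDS)
    unique = ", ".join(CATEGORY_FIELDS)
    conn.execute(
        f"CREATE TABLE category (id INTEGER PRIMARY KEY, {columns}, UNIQUE ({unique}))"
    )
    conn.execute(
        "CREATE TABLE category_url (category_id INTEGER NOT NULL "
        "REFERENCES category(id), title TEXT NOT NULL, url TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE category_advertisement (category_id INTEGER NOT NULL "
        "REFERENCES category(id), title TEXT NOT NULL, advertisement TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE category_email (category_id INTEGER NOT NULL "
        "REFERENCES category(id), email TEXT NOT NULL)"
    )


conn = sqlite3.connect(":memory:")
create_schema(conn)
```

The UNIQUE constraint over all eleven columns is one way to enforce that each combination of field entries exists as a category only once.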
B. The User interface:
The user interface consists of web pages. Its main functions are information registration and information search, where the information is web pages or classified advertisements.
Registration of information: The sequence of steps is as follows:
1. Communicate the requirements for registration of information to the users.
2. Accept information for registration.
3. Carry out initial validation of the data.
4. Transfer the data to the registration module for information registration.
Search of information: The sequence of steps is as follows:
1. Communicate the requirements for search of information to the users.
2. Accept the search query from the user.
3. Transfer the query to the search module for search of categories.
4. Accept the category information from the search module.
5. Display the category information received from the search module to the user, and accept the user's selection of categories.
6. Transfer the categories selected by the user to the search module.
7. Accept search results from the search module.
8. Display the search results to the user.
The user interface also communicates other related messages.
C. The Information input program:
This program accepts information received from users through the user interface and saves it to the database. The sequence of steps is as follows:
1. Accept entries in the indexing fields for selection of categories.
2. If all the fields are defined, save the information to the database.
3. If the user does not define all the fields, offer the user, for selection, the categories available for registration that conform to the entries made in the fields.
4. Accept the categories selected by the user for registration and save the information against all selected categories.
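The registration steps above can be sketched as follows, purely as an example. An in-memory dictionary stands in for the database, and the names `FIELDS`, `conforming_categories`, `register`, and the `choose` callback (which stands in for the user's selection through the user interface) are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# The eleven indexing fields (attribute names are assumed for this sketch).
FIELDS = (
    "information_type", "object_of_information", "source_sector",
    "source_process", "function", "branch_of_knowledge", "process",
    "relation_to_process", "process_category", "process_output",
    "output_sector",
)
Category = Tuple[str, ...]          # one entry per field, in FIELDS order
Index = Dict[Category, List[str]]   # category -> documents registered under it


def conforming_categories(entries: Dict[str, str], index: Index) -> List[Category]:
    """Return existing categories that agree with every entry supplied."""
    return [
        category for category in index
        if all(category[FIELDS.index(name)] == value
               for name, value in entries.items())
    ]


def register(
    entries: Dict[str, str],
    document: str,
    index: Index,
    choose: Callable[[List[Category]], List[Category]] = lambda offered: offered,
) -> List[Category]:
    """Register a document and return the categories it was saved under."""
    # Step 2: all fields defined, so save directly under that category.
    if all(entries.get(name) for name in FIELDS):
        category = tuple(entries[name] for name in FIELDS)
        index.setdefault(category, []).append(document)
        return [category]
    # Step 3: offer the existing categories that conform to the entries made.
    filled = {name: value for name, value in entries.items() if value}
    offered = conforming_categories(filled, index)
    # Step 4: save the document against every category the user selected.
    selected = [category for category in choose(offered) if category in offered]
    for category in selected:
        index[category].append(document)
    return selected
```

By default `choose` accepts every offered category; a real user interface would instead present the offered list and return only the user's selection.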
D. The information search program:
This program accepts user queries received through the user interface and carries out the search in the database. The sequence of steps is as follows:
1. Accept entries in the search fields for search of categories.
2. Display the categories available in the database that conform to the entries made by the user in the search fields.
3. Accept the categories selected by the user for displaying the information.
4. Send the information available against the selected categories to the user interface.
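The two-stage search described in the steps above can be illustrated with the following minimal sketch. The function names `search_categories` and `documents_for`, the field names, and the sample index with its example.com URLs are all hypothetical and serve only to show the flow.

```python
from typing import Dict, List

Category = Dict[str, str]


def search_categories(query: Dict[str, str], index: List[dict]) -> List[Category]:
    """Stage one: return the categories that conform to every search entry."""
    return [
        entry["category"] for entry in index
        if all(entry["category"].get(name) == value
               for name, value in query.items())
    ]


def documents_for(selected: List[Category], index: List[dict]) -> List[str]:
    """Stage two: return the documents registered under the selected categories."""
    return [
        document
        for entry in index if entry["category"] in selected
        for document in entry["documents"]
    ]


# Hypothetical sample data, invented for this sketch.
INDEX = [
    {"category": {"information_type": "Product and service information",
                  "object_of_information": "Lathe",
                  "relation_to_process": "Machine",
                  "process_output": "Shaft"},
     "documents": ["http://example.com/lathe"]},
    {"category": {"information_type": "Product and service information",
                  "object_of_information": "Steel bar",
                  "relation_to_process": "Raw material",
                  "process_output": "Shaft"},
     "documents": ["http://example.com/steel-bar"]},
    {"category": {"information_type": "Product and service information",
                  "object_of_information": "Wheat flour",
                  "relation_to_process": "Raw material",
                  "process_output": "Bread"},
     "documents": ["http://example.com/flour"]},
]

broad = search_categories({"process_output": "Shaft"}, INDEX)
narrow = search_categories(
    {"process_output": "Shaft", "relation_to_process": "Raw material"}, INDEX
)
results = documents_for(narrow, INDEX)
```

In this sample the broad query conforms to two categories, while adding an entry in a second field narrows the result to one, which is the effect of making entries in additional fields.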
The system described above, and the user interface in particular, can be simplified to enable users to quickly register and search a particular type of information that is used very often. Examples include machines and equipment, raw materials, flats and apartments, plots and real estate, cars and vehicles, tours and travels, computers, jobs and assignments, representation and franchises, etc. In such cases the user interface is modified so that the necessary field entries are built in and irrelevant field entries are removed, which provides the user with a minimal set of relevant options.
The system described above can be used to find information even if the title of the information is not known, because relevant categories can be viewed by selecting appropriate entries in the other fields. For example:
To search for the raw materials for any product, select "Product and service information" in the document type field and "Raw material" in the object to process relationship field, and enter the name of the product in the process output field. The output of the search query displays all the raw materials required for manufacturing the product, along with the processes.
Similarly, information can be searched for technology, consumables, machines and equipment, etc.
Information can also be searched by describing only the function, sector, branch of knowledge, etc.
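As a hedged illustration of the raw material example above, the query could be expressed against a relational category table roughly as follows. The reduced five-column table, the column names, the helper `raw_materials_for`, and the sample rows are assumptions made up for this sketch; only the three field values used in the WHERE clause come from the example in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A reduced category table holding only the fields this query needs.
conn.execute(
    "CREATE TABLE category ("
    "document_type TEXT, object_of_information TEXT, "
    "relation_to_process TEXT, process TEXT, process_output TEXT)"
)
# Hypothetical sample rows, invented for illustration.
conn.executemany(
    "INSERT INTO category VALUES (?, ?, ?, ?, ?)",
    [
        ("Product and service information", "Wheat flour", "Raw material", "Baking", "Bread"),
        ("Product and service information", "Yeast", "Raw material", "Baking", "Bread"),
        ("Product and service information", "Oven", "Machine", "Baking", "Bread"),
        ("Product and service information", "Wheat flour", "Raw material", "Extrusion", "Pasta"),
    ],
)


def raw_materials_for(conn: sqlite3.Connection, product: str) -> list:
    """Return (raw material, process) pairs for categories whose output is the product."""
    rows = conn.execute(
        "SELECT object_of_information, process FROM category "
        "WHERE document_type = ? AND relation_to_process = ? AND process_output = ? "
        "ORDER BY object_of_information",
        ("Product and service information", "Raw material", product),
    )
    return rows.fetchall()


print(raw_materials_for(conn, "Bread"))
```

Changing the value matched in the relationship field, for example to a machine or technology relationship, would give the analogous queries mentioned above without changing the structure of the query.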
The foregoing description of an implementation of the invention has been presented for purposes of illustration and description. It is not exhaustive and does not limit the invention to the precise form disclosed. Modifications and variations are possible in light of the above teachings or may be acquired from practicing of the invention. For example, the described implementation includes software but the present invention may be implemented as a combination of hardware and software or in hardware alone. The scope of the invention is defined by the claims and their equivalents.
* * * * *