WO2011155736A2

WO2011155736A2 - Method for dynamically generating additional terms for each meaning of every natural language expression; dictionary manager, document generator, term annotator, search system, and device for building a document information system based on the method

Info

Publication number: WO2011155736A2
Application number: PCT/KR2011/004113
Authority: WO
Inventors: 박동민
Original assignee: Park Dong Min
Priority date: 2010-06-07
Filing date: 2011-06-06
Publication date: 2011-12-15
Also published as: KR20110133909A; WO2011155736A9; WO2011155736A3

Abstract

The present invention relates to changing an information system comprising natural language expressions to an information system based on unit expressions of meaning, which is accompanied by functional changes for an information search system, term dictionary, document generator, and term converter. The accuracy of current search systems is very low. This is because natural language represents many meanings using few words. Due to the problem of expressions becoming longer and more difficult to recollect as the number of terms increases, people use a small number of terms in a repetitive manner. When unit expressions of meaning having 1 term corresponding to 1 meaning are introduced, the accuracy of a search system can approach 100%. The present invention discloses a method for easily generating unit expressions of meaning, and a method for efficiently applying the generated unit expressions of meaning to documents from around the world. The method for creating unit expressions of meaning is a technique of breaking down each natural language term into the number of its respective meanings. Because this is a matter of a simple breakdown of terms, anyone can generate expressions. The task of applying generated terms to documents from around the world is formidable. For this task, according to the present invention, instead of changing each word that is repetitively used, alignment is performed for each word, and certain aligned word groups are simultaneously processed. Even if one word has been used several hundred billion times in documents throughout the world, there is no need to perform term conversions several hundred billion times. If the word in question has several meanings, the task of conversion can be performed simply by way of several sorting commands. Even if the repetitive use of terms does not impose a large load on term conversion, because the number of unit expressions of meaning itself is enormous, term conversion is not simple. 
The task of processing close to 10 billion unit expressions of meaning is daunting. A method for solving this difficulty is to equally distribute the task to a number of users. The greatest factor contributing to the ambiguity of natural language is the presence of innumerable proper nouns. These encroach on the domains of nouns, adjectives, verbs, and all other parts of speech, causing semantic confusion. While not limited to people's names, when considering proper nouns only in that context, there are over 10 billion terms in this category since the global population exceeds 6 billion. The present invention discloses a configuration in which this prodigious task is equally allotted to a countless number of users. When users have needs, they may perform tasks to fulfill their requirements and benefit from their work. If they feel that term conversion is required, users may perform term generation and term conversion tasks so that a state that is always satisfactory for users can be maintained. The present invention relates to: 1) a unit expression of meaning dictionary manager that can easily generate unit expressions of meaning; and 2) a search annotator which is a means for categorizing words and converting (annotating) words belonging to a word group into unit expressions of meaning. The annotator operates as part of a search system. The alignment and search of words uses existing search system functions. Also provided is 3) a unit expression of meaning converter (annotator) performing a function similar to the search annotator. The task of making a global information system based on unit expressions of meaning is an enormous endeavor. However, the problem of natural language being unclear in meaning presents a large obstacle for development in many fields. 
The present invention discloses a basis for achieving considerable advances in the semantic web field, search system field, language translation field, and artificial intelligence field, by means of providing clear language thereto.

Description

Method for dynamically generating a separate term for each meaning of every natural language expression, and dictionary manager, document writer, term annotator, search system, and document information system construction device based on the method

The present invention relates to the term dictionary, document writer, and information retrieval system involved in generating, collecting, indexing, searching, and using information, and to a term annotator and a document information system construction device for rebuilding these on the basis of semantic unit terms. The Semantic Web is also included in this field.

The technical field to which the present invention belongs is information retrieval. Because the invention concerns meaning-based information retrieval, the Semantic Web field is also related. The Semantic Web is a framework technology in which information about resources (web documents, various files, services, etc.) in a distributed environment such as the Internet, and the relations between those resources, are expressed as semantic information (semanteme) in an ontology that a machine (computer) can process, so that machines can handle them automatically. An ontology is a formal specification of a shared conceptualization of a domain and expresses the semantic information of the domain vocabulary. An ontology is a kind of knowledge representation; a computer can understand the concepts it represents and process the knowledge. The axioms and rules of an ontology are used for inference and proof, and a separate rule language is used to express rules.

The current level of search technology can be seen clearly in the major search engines. Today's mainstream search technology is natural language based. Because information is accumulated in ambiguous natural language and queried with natural language search queries, its meaning-based search accuracy is low. There are 641 people who share the name of the inventor, Hong Gil-dong (a pseudonym). A search for the inventor under the name Hong Gil-dong also returns unwanted information about the 640 other people. In this case, the accuracy of a natural language search system for the keyword Hong Gil-dong is 1/641 on average. Current search engines do return a great deal of material, but much of it is not what the user wants. Techniques such as narrowing the search with additional keywords or ranking the most probable results first, as PageRank does, are therefore used, but they cannot fundamentally resolve the ambiguity of natural language. This is an inherent limitation of natural language based search systems. To improve retrieval fundamentally, the problem of one expression having several meanings must itself be solved. To this end, meaning must be made explicit by adding, separately from the natural language, information that distinguishes every meaning. This requires an expression method based on semantic units rather than natural language, and Internet information must be annotated according to the new method. For the new expression method to take hold, there must also be a practical way to put it into use. Annotating every piece of Internet information is a tremendous task, and creating new dictionaries of semantic unit expressions is too large a task for one or two people. Even creating an accurate dictionary for a single specialized field is known to require a long-term collective effort, and dictionary work covering every field is not something a few people can finish in a year or two. Even if the dictionary were completed, annotating the entire Internet with it would be impossible without special measures.

Creating a dictionary of semantic unit terms and converting the world's documents, including the Internet, into semantic unit terms is a very large task that cannot be accomplished without special methods. There is, however, a way to reduce it to millions of tasks. Done naively, the total amount of work depends on the number of distinct words used in the world's documents and on the number of times each word is repeated. If the words are sorted, however, the work becomes proportional to the number of distinct words and meanings rather than to the total number of word repetitions. How many times a particular word is used has little effect on the overall effort. It is therefore feasible to build a dictionary of semantic unit terms from the index of a retrieval system, where all document contents are already sorted by natural language expression, and to make the index itself semantic unit term based. The present invention finds the several meanings of a specific natural language expression in the sorted contents of a search system, generates a term for each, and annotates the new terms in the index; this has the same effect as converting all Internet documents into semantic unit terms. The semantic unit term based index can then be used to convert the documents themselves. In addition, unlike building an ontology dictionary, this method generates terms through the simple task of dividing a natural language expression into its meanings, so the general public can easily take part in generating terms and in converting documents and indexes. If ordinary users each generate only the few terms that they are interested in and knowledgeable about, search the Internet for the corresponding natural language expression, and annotate the results with the newly created semantic unit terms, the whole Internet can be converted to semantic unit terms.
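The claim that the work scales with the number of distinct words and meanings, not with the number of occurrences, can be illustrated with a minimal sketch. Everything in it is assumed for illustration and is not taken from the specification: the toy inverted index, the helper names `build_index` and `annotate_group`, the sample documents, and the `word_serial` term format.

```python
from collections import defaultdict

def build_index(docs):
    """Build a toy inverted index: word -> list of postings."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        for pos, word in enumerate(text.split()):
            index[word].append({"doc": doc_id, "pos": pos, "term": None})
    return index

def annotate_group(index, word, term, doc_filter):
    """Annotate, in one command, every unannotated posting of `word`
    whose document passes `doc_filter`. Returns the number annotated."""
    count = 0
    for posting in index[word]:
        if posting["term"] is None and doc_filter(posting["doc"]):
            posting["term"] = term
            count += 1
    return count

# Hypothetical documents in which "jaguar" has two meanings.
docs = {
    "d1": "jaguar speed jaguar habitat",
    "d2": "jaguar engine jaguar dealer jaguar",
    "d3": "jaguar habitat",
}
index = build_index(docs)

# One search condition selects the group of documents about the animal.
animal_docs = {d for d, text in docs.items() if "habitat" in text.split()}

# Two commands (one per meaning) cover all six occurrences.
n_animal = annotate_group(index, "jaguar", "jaguar_1", lambda d: d in animal_docs)
n_car = annotate_group(index, "jaguar", "jaguar_2", lambda d: d not in animal_docs)
print(n_animal, n_car)
```

Here two annotation commands handle six occurrences; the number of commands would stay at two however many times the word was repeated in the indexed documents.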

Currently, the meaning-based accuracy of search engines is quite low. The present invention improves meaning-based accuracy severalfold, and in some cases by tens or hundreds of times, compared with existing search engines. In natural language a single term often has several meanings, and numerous proper nouns such as personal names, store names, and place names intrude on common nouns, verbs, and adjectives. By supplementing ambiguous natural language with semantic unit terms, the present invention raises accuracy from the level of the expression to the level of the meaning. The present invention does not merely propose a new model; it includes a method by which the new model can take hold. Setting up a new kind of search engine for 6 billion people around the world is a tremendous task, but the present invention divides the Internet-scale work into work for individuals: each user satisfies his or her own needs with a small effort, and on the basis of that individual satisfaction the invention proposes a structure that can satisfy 6 billion users.

FIG. 1 is a block diagram of a semantic unit term-based information system.

FIG. 2 shows the two-step construction of a semantic unit term-based information system.

FIG. 3 is a flowchart of constructing a semantic unit term-based information system.

FIG. 4 is a configuration diagram centered on the dictionary manager.

FIG. 5 is a comparison of environments for generating semantic unit terms.

FIG. 6 illustrates the ambiguity of natural language and the necessity of semantic unit terms.

FIG. 7 is an example of using a generated unique ID.

FIG. 8 is the conceptual structure of a unique ID dictionary.

FIG. 9 is a flowchart of generating semantic unit terms.

FIG. 10 is a comparison of the semantic unit terms of the present invention and an existing ontology dictionary.

FIG. 11 shows the intuitive classification and hierarchy of semantic unit terms.

FIG. 12 shows how to create and use term aliases.

FIG. 13 shows examples of using semantic unit term divisions.

FIG. 14 is a flowchart of generating, annotating, and searching semantic unit term divisions.

FIG. 15 shows a semantic unit term group.

FIG. 16 is a flowchart of generating and using a semantic unit term group.

FIG. 17 is a block diagram centered on the independent annotator.

FIG. 18 is an example of an individual's default values.

FIG. 19 is an example of default values corresponding to a specific user.

FIG. 20 is a flowchart of determining semantic unit default values for natural language expressions.

FIG. 21 is the conceptual structure of an annotation knowledge table.

FIG. 22 shows the application of various annotation knowledge to one natural language expression.

FIG. 23 is a flowchart of annotation knowledge generation.

FIG. 24 shows the application priority of annotation knowledge and default values.

FIG. 25 is a flowchart in which the knowledge-based annotation unit annotates a document or a query.

FIG. 26 is a flowchart of applying annotation knowledge to an index.

FIG. 27 shows document annotation by the index-based document annotation unit.

FIG. 28 is a flowchart of annotating specific natural language expressions in an indexed document with semantic unit terms.

FIG. 29 shows the scale of a semantic unit term (unique ID +) based information system.

FIG. 30 is a comparison of various methods of constructing a semantic unit term (unique ID +) based information system.

FIG. 31 is a productivity comparison of the document-by-document annotation method and the search annotation method.

FIG. 32 shows a manual annotation type document writer and an automatic annotation type document writer.

FIG. 33 is a flowchart of semantic unit term-based document creation with the help of the knowledge-based annotation unit.

FIG. 34 shows a semantic unit term-based information system built around a retrieval system.

FIG. 35 shows the minimum configuration of a search system.

FIG. 36 is a diagram in which a search annotator is added to the minimum configuration of a search system.

FIG. 37 is a diagram in which an annotator is added to the minimum configuration of a search system.

FIG. 38 shows a meta-based search system.

FIG. 39 is an operation flowchart of a semantic unit term-based search system having only basic functions.

FIG. 40 is an operation flowchart of a search system using basic functions and the search annotation function.

FIG. 41 is an operation flowchart of a search system using basic functions and the annotation knowledge function.

FIG. 42 is a diagram illustrating the configuration of an indexer.

FIG. 43 is the conceptual structure of a unique ID + index.

FIG. 44 is a comparison of the unique ID method and the semantic expression ID method on an index.

FIG. 45 is a flowchart of semantic unit term-based indexing.

FIG. 46 shows all annotation devices belonging to the various devices.

FIG. 47 is a comparison of the annotation devices that form the basis of a semantic unit term-based information system.

FIG. 48 is an example of document annotation, index annotation, and search query annotation.

FIG. 49 shows the difference between word search annotation and document search annotation.

FIG. 50 is a scale comparison of annotation units.

FIG. 51 is a comparison of annotation for a new document and for an existing document.

FIG. 52 is a comparison of the importance of each of the annotation devices.

FIG. 53 is a block diagram centered on the search annotator.

FIG. 54 is a flowchart of search annotation.

FIG. 55 is a flowchart of annotating word search results on the index.

FIG. 56 is a diagram illustrating the structure of a searcher.

FIG. 57 shows search queries.

FIG. 58 shows the interpretation of a unique ID + search query.

FIG. 59 is a flowchart of semantic unit term-based querying.

FIG. 60 shows three ways of displaying search results.

FIG. 61 is a flowchart of searching for words and displaying the results item by item in word units.

FIG. 62 is a flowchart of a search that lists and displays word search results word by word for each document.

FIG. 63 is a flowchart of generating and utilizing search knowledge.

FIG. 64 is a diagram illustrating the configuration of a document information system builder.

FIG. 65 shows a natural language document information system and a unique ID + document information system.

FIG. 66 shows the construction of a document information system using a dictionary, an index, and annotation knowledge.

FIG. 67 shows the construction of a semantic unit term-based document information system using a dictionary and an index.

FIG. 68 shows the construction of a document information system using a dictionary and annotation knowledge.

FIG. 69 is a flowchart of constructing a document information system using an index.

FIG. 70 is a flowchart of constructing a document information system using annotation knowledge.

FIG. 71 is a flowchart of constructing a document information system using a search system index and annotation knowledge.

FIG. 72 is a flowchart of managing disagreements using collective intelligence.

FIG. 73 is a flowchart of merging a search target document source with additional information and then storing and using it.

Explanation of reference numerals

02-01 is a natural language document information system

02-02 is a natural language based search system

02-03 is a semantic unit term based device 1

02-04 is a first-level semantic unit term-based information system

02-05 is a semantic unit term based search system

02-06 is a semantic unit term based document information system builder

02-07 is a semantic unit term based information system

03-01 is the semantic unit term-based document creation step

03-02 is the document collection step

03-03 is a semantic unit term-based index step

03-04 is a semantic unit term-based index step

03-05 is the semantic unit term generation stage

03-06 is the semantic unit term search comment step

03-07 is an annotation knowledge generation step

03-08 is a knowledge base annotation step

03-09 is the establishment stage of document information system based on semantic unit terminology

05-01 is semantic unit term generation

05-02 is semantic unit term generation in the word search process

06-01 is the upper table, which shows that natural language has various meanings.

06-02 is the lower table, which shows that unique IDs are assigned to the various meanings of natural language.

09-01 is the semantic unit term information acquisition step

09-02 is the semantic unit term generation stage

09-03 is the semantic unit term dictionary entry generation step

11-01 is the semantic unit term classification stage

11-02 is the semantic unit term search classification stage

11-03 is the semantic unit term classification stratification step.

11-04 is the change of semantic unit term classification

11-05 is a step to adjust the semantic unit term classification dissent.

12-01 is the term alias registration step

12-02 is the terminology introduction phase

12-03 is the use of terminology aliases

13-01 is the semantic unit term

13-02 is the primary terminology

13-03 is secondary terminology

14-01 is the semantic unit term division generation step

14-02 is the semantic unit term hierarchical division generation step

14-03 is the term division based annotation step

14-04 is the term division search step

16-01 is the semantic unit term group

16-02 is the search step using the term group

20-01 is the step of determining the default value of a semantic unit term by group

20-02 is the semantic unit term default value application step

20-03 is the step of applying the default of the group to which the semantic unit term belongs

20-04 is the step of applying the semantic unit term Internet default

23-01 is the search step

23-02 is the step of receiving annotation knowledge creation request

23-03 is the annotation knowledge generation step

25-01 is the stage of receiving knowledge base annotation requests

25-02 is the annotation knowledge search step

25-03 is the annotation knowledge application step

25-04 is the default value application step

26-01 is the step of receiving a request to apply annotation knowledge to an index

26-02 is the annotation knowledge conversion step

26-03 is the annotation index search step

26-04 is the index annotation step

28-01 is an index based comment step

28-02 is the application of annotation knowledge

28-03 is the default apply step

32-01 is a manual annotation type document writer

32-02 is an automatic annotation type document writer

33-01 is the natural language document creation step

33-02 is the knowledge-based annotation step

33-03 is the annotation change request step

33-04 is the semantic unit term annotation change step

33-05 is the semantic unit term generation and annotation step

38-01 is an external search system

39-01 is the semantic unit term-based document collection stage

39-02 is a semantic unit term-based index step

39-03 is a semantic unit term-based search step

40-01 is the semantic unit term-based document collection step

40-02 is a semantic unit term-based index step

40-03 is the search annotation step

40-04 is the semantic unit term based search step

41-01 is the semantic unit term-based document collection

41-02 is a semantic unit term-based index step

41-03 is an annotation knowledge step

41-04 is the semantic unit term based search step

43-01 is a conceptual index for the document of 43-02

43-02 is a document found by searching for the natural language expression Hong Gil-dong.

44-01 is a conceptual index for the document of 44-02

44-02 is a document found by searching for the natural language expression "instructions"

44-03 is an arrow pointing to a table showing the values to be placed in the semantic unit term field of the index.

45-01 is the natural language indexing stage

45-02 is the semantic unit term-based indexing stage

46-01 is the semantic unit term query comment section (in the semantic term based search commenter).

46-02 is the semantic unit term query part (which is included in the semantic term based searcher).

47-01 is annotated

47-02 is the target document

47-03 is the annotation search

47-04 is the step

47-05 is the importance

49-01 shows the word search annotation method, which annotates all words in a document.

49-02 shows the document search annotation method, which does not record positions within a document and therefore records only one entry for the same natural language expression with the same meaning.

51-01 is new document writer type 1

51-02 is new document writer type 2

51-03 is the existing document annotator.

54-01 is the semantic unit term-based document retrieval step

54-02 is a document retrieval comment request receipt step

54-03 is the document search annotation step

55-01 is the semantic unit term based word search step

55-02 is the word search annotation request receipt step

55-03 is the word search annotation step

57-01 is a natural language search query

57-02 is a unique ID + search query

59-01 is the natural language query step

59-02 is the dictionary lookup step

59-03 is the semantic unit term annotation step

59-04 is the query modification step

60-01 is the method of listing document items

60-02 is the method of listing word items

60-03 is the method of listing document / word items

61-01 is the step of receiving a word search request

61-02 is the word search result display step

62-01 is the document / word search request receipt step

62-02 is a word search result document-by-word display step

63-01 is a search query review step

63-02 is the creation of search knowledge

63-03 is the stage of receiving a search knowledge disclosure request

63-04 is the search knowledge disclosure stage

65-01 is a natural language document information system

65-02 is a unique ID + document information system

69-01 is the document annotation information creation step

69-02 is the document annotation step

69-03 is the semantic unit term document storage step

70-01 is the document information system document collection stage

70-02 is the application of annotation knowledge documents

70-03 is the application stage of annotation knowledge document information system.

71-01 is the document information system document collection stage

71-02 is the document annotation information creation step

71-03 is the document annotation step

71-04 is the semantic unit term document storage step

71-05 shows the application of annotation knowledge documents.

71-06 shows the application of annotation knowledge document information system.

72-01 is the discussion creation phase

72-02 is the discussion step

72-03 is the voting step

72-04 is the step of applying the results of the discussion

73-01 is the step of storing documents and addresses in a separate place

73-02 is the document content change step

73-03 is the changed document use step.

First, the terms are briefly explained.

Semantic unit term - In natural language, the same expression may have several meanings, and conversely a single meaning may be expressed in several ways. A semantic unit term is a term generated one per meaning. When a natural language expression has several meanings, the term is subdivided by a semantic serial number. Conversely, when one meaning has several expressions, a representative natural language expression is used so that they share the same term. As an exception, even if the meaning is the same, separate semantic unit terms are created for different languages.

In the present invention, natural language exists in the form "natural language + semantic unit term", in which the semantic unit term is annotated to clarify the meaning. The phrase "semantic unit term" is used in two senses: it may mean the pair "natural language + semantic unit term", or it may mean the semantic unit term alone, independent of the natural language. Unless otherwise noted, "semantic unit term" means "natural language + semantic unit term". For example, a semantic unit term document is a natural language document annotated with semantic unit terms. The same applies to a semantic unit term search query, and a semantic unit term index is likewise an index containing both natural language information and semantic unit terms. The terms Unique ID and Unique ID + are used to make this distinction explicit.

Unique ID - The representative semantic unit term proposed by the present invention. It is made by attaching a semantic serial number to a natural language representative expression. One is created for each language.

"Unique ID +" - A pair consisting of a natural language expression and a unique ID. It usually takes the form "natural language: unique ID".

Annotation - Here, annotation means clarifying meaning by adding semantic unit terms to natural language expressions.

Conversion - Here, conversion means converting a natural language expression into a (natural language expression, semantic unit term) pair. Annotation and conversion therefore mean the same thing.

When a natural language expression is replaced outright by a semantic unit term, the term substitution is used rather than conversion.

Word / Meaning / Occurrence - A document uses many words, and a word may be used in more than one sense. A word with a particular meaning may appear many times in a document, and each appearance is called an occurrence. For example, suppose a document contains 1000 occurrences, 500 meanings, and 400 words. The numbers of meanings and words in a document never exceed the number of occurrences. In general the number of meanings is greater than the number of words, but if one meaning has many different expressions, the number of words may be greater than the number of meanings. In this example, each word is repeated 2.5 (1000/400) times on average, and there are 100 (500 - 400) more meanings than words. Because one word can be split into several meanings and several words can share a single meaning, the exact number of words that are split cannot be determined from these totals.

GUID - A Globally Unique Identifier is a pseudo-random number used in application software. Generating a GUID does not guarantee a unique value every time, but with an appropriate algorithm it is extremely unlikely that the same number will be generated twice. The system therefore does not need to maintain serial numbers. Its length, however, makes it inconvenient to use.
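The definitions above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the `UniqueId` class, the underscore between expression and serial number, the colon without a space in the Unique ID + form, and the sample words are all hypothetical and are not prescribed by the specification. The GUID is produced with Python's standard `uuid` module purely to compare lengths.

```python
import uuid
from dataclasses import dataclass

@dataclass(frozen=True)
class UniqueId:
    expression: str  # representative natural language expression
    serial: int      # semantic serial number
    language: str    # one unique ID is created per language

    def __str__(self) -> str:
        # Assumed rendering: expression and serial joined by an underscore.
        return f"{self.expression}_{self.serial}"

def unique_id_plus(surface: str, uid: UniqueId) -> str:
    """Render the "natural language:unique ID" pair (Unique ID +)."""
    return f"{surface}:{uid}"

def count_units(annotated):
    """Return (occurrences, distinct words, distinct meanings)."""
    words = {surface for surface, _ in annotated}
    meanings = {uid for _, uid in annotated}
    return len(annotated), len(words), len(meanings)

bank_money = UniqueId("bank", 1, "en")
bank_river = UniqueId("bank", 2, "en")
river = UniqueId("river", 1, "en")

# A tiny annotated document: four occurrences of two words.
annotated = [("bank", bank_money), ("bank", bank_river),
             ("bank", bank_money), ("river", river)]

label = unique_id_plus("bank", bank_river)
occurrences, n_words, n_meanings = count_units(annotated)
guid = str(uuid.uuid4())

print(label)                             # bank:bank_2
print(occurrences, n_words, n_meanings)  # 4 2 3
print(len(guid), len(str(bank_river)))   # 36 6
```

In this toy document the word "bank" is split into two meanings, so there are more meanings (3) than words (2), and the 36-character GUID is much longer than the 6-character unique ID, which is the inconvenience the GUID definition refers to.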

The present invention is a semantic unit term-based information system centered on a retrieval system. Consider first a natural language information system. The basic components of a natural language retrieval system are the document collector, the indexer, and the searcher, and natural language document writers and search systems use natural language dictionaries. A natural language information system centered on a retrieval system therefore consists of five devices: 1) dictionary, 2) document writer, 3) collector, 4) indexer, and 5) searcher. The semantic unit term-based information system includes all the devices of the natural language information system, and its basic framework is the same. The devices added because the system is based on semantic unit terms are 1) the semantic unit term dictionary, 2) the semantic unit term annotator, 3) the semantic unit term search annotator, and 4) the semantic unit term-based document information system builder. The semantic unit term-based information system thus consists of 5 + 4 = 9 devices. The actual diagram shows eight devices, omitting the natural language dictionary, because the natural language dictionary is conceptually included in the semantic unit term dictionary. Of the four added devices, the need for the semantic unit term dictionary is self-evident. The other three (annotator, search annotator, and document information system builder) are needed to convert information written in natural language into information written in semantic unit terms. Unlike natural language, semantic unit terms are not a language that users use in everyday life; there are far more of them, and they are longer. Users cannot memorize them or write documents in them unaided, so devices are needed that help users use semantic unit terms easily. The annotator is a device that converts natural language into semantic unit terms. It is an independent device outside the retrieval system and is used by the document writer, the retrieval system, and the document information system builder. The search annotator is a device inside the search system that converts the contents of an index from natural language to semantic unit terms. The document information system builder is a device that converts all documents into semantic unit terms using the knowledge and information accumulated once the retrieval system has been made semantic unit term based.

Hereinafter, exemplary embodiments of the present invention will be described in detail with reference to the accompanying drawings.

FIG. 1 is a configuration diagram including all devices.

A. The semantic unit term-based information system is an information system that includes the dictionary manager, annotator, document writer, retrieval system, and document information system builder.

B. The semantic unit term dictionary manager is a device that creates semantic unit terms, adds descriptions to them to build dictionaries, and manages those dictionaries. It is a basic device used by all devices of A. the semantic unit term-based information system. It is abbreviated as the dictionary manager.

B1. The semantic unit term generation unit is a device that generates a semantic unit term (a unique ID, a semantic expression ID, or a meaning-based GUID) and adds a description to it to create a dictionary entry. It is abbreviated as the term generation unit.

B2. The semantic unit term management unit is a device that manages modified semantic unit terms.

B3. The semantic unit term dictionary search unit is a dictionary lookup device. When a user searches the dictionary by entering a natural language expression, the corresponding semantic unit terms are listed and the user selects one of them. This resembles entering Hangul and converting it to Hanja, except that Hanja conversion replaces the Hangul with Hanja, whereas the dictionary search unit appends the selected term as an annotation after the natural language expression instead of replacing it. It is abbreviated as the dictionary search unit.
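The lookup-and-select behavior of the dictionary search unit can be sketched as follows. This is a minimal illustration, not the specification's implementation: the in-memory `DICTIONARY`, the function names `lookup` and `annotate_with_choice`, the sample entries, and the `expression:unique_id` output format are all assumptions.

```python
# Hypothetical dictionary: expression -> list of (unique ID, description).
DICTIONARY = {
    "bank": [
        ("bank_1", "financial institution"),
        ("bank_2", "land alongside a river"),
    ],
}

def lookup(expression):
    """List the candidate semantic unit terms for an expression."""
    return DICTIONARY.get(expression, [])

def annotate_with_choice(expression, choice):
    """Append the chosen term after the expression instead of replacing it."""
    candidates = lookup(expression)
    if not 0 <= choice < len(candidates):
        raise IndexError("no such candidate")
    unique_id, _description = candidates[choice]
    return f"{expression}:{unique_id}"

# The candidates are listed, and the user picks one by number.
for number, (uid, description) in enumerate(lookup("bank")):
    print(number, uid, description)

annotated = annotate_with_choice("bank", 1)
print(annotated)  # bank:bank_2
```

The original expression stays in the output, which is the difference from Hanja conversion noted above: the term is added as an annotation rather than substituted.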

C. The semantic unit term annotator is a device that annotates natural language expressions with semantic unit terms and is used by the D. semantic unit term based document writer, the E. semantic unit term based search system, and the J. semantic unit term based document information system builder. It is abbreviated as annotator. Converting every natural language word to a semantic unit term using only a dictionary is very difficult. The annotator annotates automatically, or assists annotation, using annotation knowledge or default values. It is used to annotate natural language in documents, search system indexes, and search query words. It annotates existing documents as well as natural language in newly created documents. It can be run on command or run automatically at regular intervals like an agent. It is used both for annotating documents in bulk and for annotating individual documents.

C1. The annotation knowledge management unit is a device for creating, modifying, and deleting annotation knowledge. Annotation knowledge is knowledge of the form "under 1) a certain condition, 2) a certain natural language expression has 3) a certain meaning." It is usually created with the search annotator: the user finds target documents by searching, annotates a specific natural language expression in them with a specific semantic unit term, and registers the operation as annotation knowledge if the result is satisfactory. Usually, 1) the condition is the query term used in the search, 2) the natural language expression is the specific expression annotated in the search results, and 3) the meaning is the semantic unit term used for the annotation.

C2. The default management unit is a device that manages default values. A default value is the semantic unit term most frequently used for a specific natural language expression by an individual, in a particular organization, in a particular field, or on the Internet. When several default values apply, they usually take priority in the order individual, specific group, field, and Internet, and the user can specify the priority order or the default value directly. If no annotation knowledge applies and a specific natural language expression therefore cannot be annotated with a semantic unit term, the default value with the highest priority is applied.
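
As a non-limiting illustration, the priority rule for default values may be sketched in Python as follows. The scope labels, the function name `resolve_default`, and the sample data are assumptions made for this example only; the description does not prescribe a data structure.

```python
# Hypothetical labels for the four scopes named in the text,
# listed from highest to lowest priority.
DEFAULT_PRIORITY = ("individual", "group", "field", "internet")


def resolve_default(expression, defaults, priority=DEFAULT_PRIORITY):
    """Return the default semantic unit term from the highest-priority scope
    that defines one for `expression`, or None if no scope defines one."""
    for scope in priority:
        term = defaults.get(scope, {}).get(expression)
        if term is not None:
            return term
    return None


# Sample default tables (illustrative data only).
defaults = {
    "field": {"AAA": "AAA_2"},
    "internet": {"AAA": "AAA_1", "BBB": "BBB_1"},
}

print(resolve_default("AAA", defaults))  # the field default outranks the Internet default
print(resolve_default("BBB", defaults))  # only the Internet scope defines BBB
```

Passing a different `priority` tuple corresponds to the user specifying the priority order.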

C3. The knowledge-based annotation unit (document / index / query annotation), usually written simply as C3. knowledge-based annotation unit, is a device that annotates natural language with semantic unit terms, or assists such annotation, using annotation knowledge or default values. It can be called on demand or run at regular intervals like an agent. It can be used for all kinds of annotation, including bulk document annotation.

C4. The index-based document annotation unit is a device that annotates the contents of a document by extracting information from the index once the index has been converted to semantic unit terms. That the index is already based on semantic unit terms means that the semantic unit term based information system is complete.

C5. The annotation management unit is a device that displays all annotations and lets the user review them so that annotation errors can be corrected. Annotations added through the user's annotation knowledge, through the user's search annotations, and so on can be viewed in order of annotation date.

D. The semantic unit term based document writer can write a document directly in semantic unit terms, but conceptually it creates a semantic unit term based document in two steps: the document is written in natural language, and the corresponding semantic unit terms are then looked up from the natural language and annotated onto it. It is abbreviated as document writer.

D1. Natural language writing unit is the same as the existing natural language-based document generator.

D2. The semantic unit term document annotation unit is a device that annotates documents written in natural language with semantic unit terms. Annotation is difficult with the help of the semantic unit term dictionary alone, but it can be done without difficulty with the help of the C3. knowledge-based annotation unit.

E. The semantic term based search system is a device for indexing and searching the collected documents based on semantic unit terms. Internal devices include 1) document collector, 2) indexer, 3) search commenter, and 4) searcher.

F. Document Collector is a device that collects documents to be searched.

G. The semantic unit term based indexer is a device that creates semantic unit term based indexes from the collected documents. It is abbreviated as indexer.

H. The semantic unit term based search annotator is a device that combines search and annotation functions to annotate indexes. It annotates specific natural language expressions contained in the document(s) found by a search with semantic unit terms. It is abbreviated as search annotator.

H1. The document search annotation unit (index annotation) is a device that annotates a specific natural language expression contained in all or some of the documents found by a search with a specific semantic unit term. It is abbreviated as document search annotation.

H2. The word search annotation unit (index annotation) is a device that annotates all or some of the words found by a search with a specific semantic unit term. It is abbreviated as word search annotation.

I. The semantic unit term based searcher is a searcher that runs a query made of semantic unit terms against an index built on semantic unit terms. It is abbreviated as searcher.

I1. The document search unit returns the search results as a list of documents, as in existing search systems. For example, if a search finds 4 words in 2 documents, 2 items are listed. The result items may be subject to document-level processing.

I2. The word search unit returns the search results as a list of words. For example, if a search finds 4 words in 2 documents, 4 items are listed. The result items may be subject to word-level processing.
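
The difference between the two result forms can be illustrated with the "4 words in 2 documents" example. The following Python sketch is illustrative only; the hit format (document name, word position) and the function names are assumptions, not part of the description.

```python
# Hypothetical search hits: one (document name, word position) pair per matching word.
hits = [("doc_a", 4), ("doc_a", 19), ("doc_b", 2), ("doc_b", 8)]


def document_results(hits):
    """I1 style: one result item per document, in first-seen order."""
    return list(dict.fromkeys(doc for doc, _ in hits))


def word_results(hits):
    """I2 style: one result item per matching word occurrence."""
    return list(hits)


print(len(document_results(hits)))  # number of items in the document list
print(len(word_results(hits)))      # number of items in the word list
```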

I3. The search knowledge management unit is a device that creates and manages search knowledge. If the user judges a search query to be meaningful, the user may register it as search knowledge. Existing natural language search was so inaccurate that a query was unlikely to remain useful as knowledge. Semantic unit term based search, on the other hand, can aim for 100% accuracy. Combining low-accuracy knowledge through operations increases the error rate, whereas semantic unit term based search knowledge can be used in combination.

J. The semantic unit term based document information system builder is a device that converts the documents in a document information system to semantic unit terms, either by extracting semantic unit term information from the semantic unit term based index or by using annotation knowledge. It is abbreviated as document information system builder.

J1. The index-based document information system construction unit is a device that converts the documents in the document information system to semantic unit terms using index information.

J2. The annotation knowledge based document information system construction unit is a device that converts the documents in the document information system to semantic unit terms using annotation knowledge.

FIG. 2 shows the semantic unit term based information system stage by stage. Building a semantic unit term dictionary and then converting a natural language document information system to a semantic unit term based one by entering semantic unit terms one at a time is so vast a task that it is almost impossible. To overcome this problem, the proposed method sorts the document information system by word and annotates each word group as a whole. Fortunately, a device that sorts by word already exists: the search system. In a search system, the contents of all indexed documents are sorted by word. The proposed method converts the index of the search system to semantic unit terms instead of converting the document information system itself. Converting the index to semantic unit terms is equivalent to converting the document information system, because a semantic unit term based index can be used to turn a natural language document information system into a semantic unit term based one. The proposed first-stage semantic unit term based information system (02-04) starts by introducing a search system (02-02) into the natural language document information system (02-01). Once the index of the search system has sorted the contents by word, the index must be converted to semantic unit terms. This is done by the semantic unit term based device 1 (02-03), the second addition to the first-stage system. The semantic unit term based device 1 consists of a semantic unit term dictionary, a default value DB, an annotation knowledge DB, and three devices (the semantic unit term dictionary manager, the semantic unit term annotator, and the semantic unit term based search annotator). These devices convert a natural language based index into a semantic unit term based index.

The purpose of the first-stage semantic unit term based information system is to build the semantic unit term dictionary and the semantic unit term based index. Although the dictionary and the index can be regarded as complete after the first stage, the document information system and the search system are still based on natural language. Moreover, the first stage covers only the existing documents indexed in the search system and does nothing for new documents. In the second stage, semantic unit term based processing devices for new documents are added and the search devices are converted to semantic unit terms; to convert the document information system itself, the semantic unit term based search system devices (02-05) and the semantic unit term based document information system builder (02-06) are added.

This completes the semantic unit term-based document information system for existing documents / new documents and changes the search system to the semantic unit term base (02-07).

The core devices of the semantic unit term-based information system are contained in the first-level information system. If the first stage succeeds, there is no obstacle to the completion of the semantic unit term-based information system. This is because the second stage is not a task performed by a large number of users, but a task performed by the operator / developer and the user simply uses the result.

FIG. 3 is a flowchart of the operation of a semantic unit term based information system centered on a search system. The first four steps, namely the document creation step (03-01), the document collection step (03-02), the indexing step (03-03), and the search step (03-04), are the typical functions of a search system and are not based on semantic unit terms. If documents were written in semantic unit terms from the beginning, they could be handled in the same way as in a natural language based information system, and no special procedure would be needed. However, because the semantic unit term dictionary is sparse at the beginning, it is difficult to write documents in semantic unit terms. Almost all documents are therefore collected and indexed as natural language documents, and the index is converted to semantic unit terms in the following steps, where the semantic unit term based procedure begins. Searching in natural language readily reveals the many meanings of a natural language expression; the need for semantic unit terms arises at this stage, and the terms are generated. Natural language search uses existing words, but because the present invention does not rely on terms generated in advance, a term must be generated whenever it is needed. To create a term, the user inputs a natural language expression and a description of a specific meaning and requests term generation. The semantic unit term dictionary manager creates a term from the natural language expression and creates a dictionary entry for the term by pairing the generated term with the description (03-05).

Next, the user must divide specific natural language words by meaning and mark the semantic units in the index, which is sorted by word. The user runs a query to find a specific meaning of a specific natural language expression and, in the index, annotates the corresponding expression in the documents found with the semantic unit term (03-06). A conventional natural language index records document positions and document names per natural language word, whereas a semantic unit term based index records them per natural language word and semantic unit term. The index can be converted to semantic unit terms through search annotation alone, but a more sophisticated approach can be applied. Rather than performing a search annotation once and discarding it, the information can be stored and used for other purposes. The most representative use is application to new documents. The search system index keeps growing as new documents are added, and it is inconvenient for the user to perform search annotation on newly indexed documents at regular intervals. The search query used in a search annotation, the natural language expression annotated, and the semantic unit term used for the annotation become annotation knowledge when stored.
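
The search annotation step (03-06) may be sketched as follows. This Python example is a minimal illustration under stated assumptions: the index is modeled as a mapping from a natural language word to postings, each posting carries an optional semantic unit term, and the names `Posting`, `search_docs`, and `search_annotate` as well as the sample words are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Posting:
    doc: str                    # document name
    position: int               # word position within the document
    term: Optional[str] = None  # semantic unit term annotation, if any


def search_docs(index: Dict[str, List[Posting]], query_word: str) -> Set[str]:
    """Return the names of the documents that contain the query word."""
    return {p.doc for p in index.get(query_word, [])}


def search_annotate(index, query_word, expression, term) -> int:
    """Annotate `expression` with `term` in every document found by the query.

    Only postings that are not yet annotated are changed.
    Returns the number of postings annotated."""
    docs = search_docs(index, query_word)
    annotated = 0
    for posting in index.get(expression, []):
        if posting.doc in docs and posting.term is None:
            posting.term = term
            annotated += 1
    return annotated


# Illustrative index: the word AAA occurs three times in two documents.
index = {
    "AAA": [Posting("doc_a", 3), Posting("doc_b", 7), Posting("doc_a", 12)],
    "xxx": [Posting("doc_a", 5)],
    "yyy": [Posting("doc_b", 9)],
}

n_first = search_annotate(index, "xxx", "AAA", "AAA_1")   # AAA near "xxx" has meaning 1
n_second = search_annotate(index, "yyy", "AAA", "AAA_2")  # AAA near "yyy" has meaning 2
```

Each call annotates a whole group of postings at once, which is the reason for working on the word-sorted index rather than on individual documents.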

Whether a search annotation makes good annotation knowledge depends on the search query. If the user has to select items from the result list after searching, the operation is not suitable as annotation knowledge (03-07). Once stored, annotation knowledge can be executed later to perform the same task as the original search annotation. It is usually executed on different targets than the original search annotation, typically new documents that have been created and added to the search system index since then. Annotation knowledge can be executed as an agent by setting a time and period (03-08). Repeated search annotation and knowledge-based annotation accumulate many semantic unit term annotations in the index. By extracting the semantic unit term annotation information for each document from the semantic unit term based index and applying it to the corresponding document, each document becomes a semantic unit term based document, and the document information system as a whole becomes semantic unit term based (03-09).
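
Re-executing stored annotation knowledge on a new document may be sketched as follows. The Python example assumes, for simplicity, that the condition is satisfied when the query word occurs anywhere in the document; the class `AnnotationKnowledge`, the function `apply_knowledge`, and the sample data are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class AnnotationKnowledge:
    condition: str   # query word used in the original search annotation
    expression: str  # natural language expression that was annotated
    term: str        # semantic unit term used for the annotation


def apply_knowledge(
    words: List[str], knowledge: List[AnnotationKnowledge]
) -> List[Tuple[str, Optional[str]]]:
    """Annotate a new document (a list of words) using stored knowledge.

    A word receives the term of the first rule whose expression equals the word
    and whose condition word occurs in the document; other words get None."""
    present = set(words)
    result = []
    for word in words:
        term = None
        for rule in knowledge:
            if rule.expression == word and rule.condition in present:
                term = rule.term
                break
        result.append((word, term))
    return result


knowledge = [
    AnnotationKnowledge("xxx", "AAA", "AAA_1"),
    AnnotationKnowledge("yyy", "AAA", "AAA_2"),
]
new_document = ["AAA", "appears", "with", "yyy"]
annotated = apply_knowledge(new_document, knowledge)
```

An agent could call `apply_knowledge` on each newly indexed document at the configured time and period.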

Through this process, semantic unit term dictionary is completed, semantic unit term based index is completed, and semantic unit term based document information system is completed.

FIG. 4 is a diagram illustrating the configuration of the dictionary manager.

B1. The semantic unit term generation unit is implemented by selecting one of four methods (unique ID, semantic expression ID, semantic unit GUID, and semantic expression GUID); the methods are not applied at the same time.

B2. The semantic unit term management unit performs seven functions (term correction, term deletion, term merging, term classification, term alias, term division, and term group).

Among them, term merging is used when two semantic unit terms have the same meaning, either by merging one into the other or by creating a third term that replaces both. Term classification means, for example, classifying Obama as "man" and "president." Classifications do not have to be entered when the term is generated, and a term can have multiple classifications.

Term aliases can be created for frequently used semantic unit terms. Long semantic unit terms are given aliases because they are inconvenient to enter and difficult to remember. A term alias is translated into the corresponding semantic unit term before it is used by the actual device.

The term division function subdivides a frequently used term so that it can be annotated and searched in finer detail. A search for a semantic unit term may return only a few results or hundreds of millions of results. When hundreds of millions of results are found, term division is used.

A term group is a group of several terms, and the group search shows the combined results of each of the terms in the group.

B3. The semantic unit term dictionary search unit is a dictionary finder. When a user searches the dictionary by inputting a natural language expression, the corresponding semantic unit terms are listed and one of them is selected. It is similar to the function of typing Hangul and converting it to Hanja; however, whereas Hanja conversion replaces the Hangul with Hanja, the dictionary search unit appends the selected term as an annotation after the natural language expression rather than replacing it.

FIG. 5 compares environments for generating semantic unit terms. When writing a document in natural language, the need for semantic unit terms is usually not felt (05-01). In natural language search, however, it is common to encounter a word with several meanings. Users find that the results contain too much unwanted data and realize that the accuracy problem of search stems from the multiple meanings of natural language (05-02). The search system is the best system for showing related information easily while making the need for semantic unit terms apparent. It makes it easy to generate semantic unit terms and to provide a means of annotating the index with them. The search system is therefore the best tool for converting a natural language based information system to a semantic unit term based one.

FIG. 6 shows how ambiguous natural language is and why semantic unit terms are necessary. The upper part of FIG. 6 shows the motivation for the invention (06-01). A natural language expression has many meanings. As a result, general search engines have a low accuracy rate when measured against a single meaning. In the case of Hong Gil-dong (a pseudonym used in place of the inventor's name), the accuracy rate is 1/641. Countless proper nouns overlap with common nouns, verbs, and adjectives, making the meaning of words unclear.

The lower part of FIG. 6 shows that a semantic unit term is generated for each meaning of a natural language expression (06-02). The unique ID is the representative semantic unit term used in the present invention and is formed by appending a semantic serial number to the representative natural language expression. A unique ID is created separately for each meaning. A search for Hong Gil-dong on a particular social network service (SNS) finds 641 people with that name. In Hong Gil-dong_1, the 1 is the semantic serial number. If another Hong Gil-dong is found later, he becomes Hong Gil-dong_642, one more than the largest serial number in use. If the semantic unit term is used instead of the natural language expression, the search accuracy rate for Hong Gil-dong rises from 1/641 to 100%.

FIG. 7 shows how a generated unique ID is used. The generated semantic unit term is added to the existing natural language expression rather than replacing it. The form in which a unique ID is added to natural language is called "unique ID +". Unique ID + combines the unique ID, which gives a precise meaning, with the natural language expression, which is kept for the user.

FIG. 8 shows the conceptual structure of the unique ID dictionary. The unique ID table contains the representative expression and the unique ID value, together with a one-line description and a detailed description of the meaning of the unique ID. The one-line description is used when many unique IDs are listed at once, and the detailed description is used when there is enough space to show a single unique ID. Natural language expressions and unique IDs are usually in a one-to-many relationship, but a single entity can also have many expressions. In that case, the expressions other than the representative expression are entered as other natural language expressions.

FIG. 9 is a flowchart illustrating the generation of a semantic unit term. Semantic unit terms are generated for all parts of speech in all languages of the world. Because all proper nouns, including personal names, are included, there will be at least 10 billion of them. Before requesting creation of a term, the user normally checks the dictionary to see whether a term with the same meaning already exists. If a term exists with the desired meaning and the same natural language expression, no term needs to be generated. If a term with the desired meaning is found but its representative expression differs and the desired expression is not among its other expressions, the desired expression can be added to the other expressions by modifying the term, and the term can then be used. If a term with the desired meaning is found and the desired expression differs from the representative expression but is already among the other expressions, the term can be used as it is. A new semantic unit term needs to be created only when no semantic unit term with the desired meaning exists.

In order to generate a term, a natural language expression and a description of a specific meaning of the natural language expression must be input (09-01). In the term generation step, a new semantic unit term is generated by connecting the semantic serial number of the natural language expression to the input natural language expression. When the input natural language expression is limited to a natural language representative expression of a specific meaning, a unique ID that is a semantic unit term defined in the present invention is generated (09-02). When the term is generated, the semantic unit term dictionary item is generated by pairing the generated semantic unit term and the obtained description (09-03).
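
Steps 09-01 to 09-03 may be sketched in Python as follows. This is a minimal illustration, not the claimed implementation: the class `TermDictionary`, its methods, and the in-memory storage are assumptions, and only the unique ID method (per-expression semantic serial number) is shown.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Entry:
    term: str         # generated semantic unit term (unique ID)
    expression: str   # representative natural language expression
    description: str  # description of the specific meaning


class TermDictionary:
    """Hypothetical in-memory semantic unit term dictionary."""

    def __init__(self) -> None:
        self._last_serial: Dict[str, int] = {}  # expression -> last serial number used
        self._entries: Dict[str, Entry] = {}    # term -> dictionary entry

    def lookup(self, expression: str) -> List[Entry]:
        """List existing terms for an expression (checked before creating a new one)."""
        return [e for e in self._entries.values() if e.expression == expression]

    def create_term(self, expression: str, description: str) -> Entry:
        """09-01: take expression and description; 09-02: append the next
        semantic serial number; 09-03: store the term paired with its description."""
        serial = self._last_serial.get(expression, 0) + 1
        self._last_serial[expression] = serial
        entry = Entry(f"{expression}_{serial}", expression, description)
        self._entries[entry.term] = entry
        return entry


dictionary = TermDictionary()
first = dictionary.create_term("AAA", "first meaning of AAA")
second = dictionary.create_term("AAA", "second meaning of AAA")
print(first.term, second.term)
```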

FIG. 10 compares the four semantic unit terms of the present invention with an ontology dictionary, a typical conventional semantic dictionary.

The four semantic unit terms (unique ID, semantic expression ID, semantic unit GUID, and semantic expression GUID) are embodiments of the present invention. They are very easy to define and use compared with ontology dictionaries, which can be regarded as the conventional semantic dictionaries. General users without special expertise can therefore take part in generating the semantic unit terms that interest them and build a new document information system using these terms. For example, if the natural language expression AAA has three meanings, creating unique IDs requires only creating the three terms AAA_1, AAA_2, and AAA_3 and writing a description for each. The four semantic unit terms differ in form, but the knowledge required of the user and the information to be input are basically similar. Because the terms are created within the natural language system, they do not require the effort and knowledge needed to create a completely new language.

The semantic unit term generation methods can be compared by supposing that only two natural language words, AAA and BBB, exist in the world and that each has two meanings.

1. Naming the terms AAA_1, AAA_2, BBB_1, BBB_2 is the unique ID method; the system must maintain a semantic serial number for each natural language word.

2. Naming the terms word_1, word_2, word_3, word_4 requires the system to maintain a single serial number across all words.

3. The GUID method generates four very large numbers and does not need to maintain any serial number. Because the numbers are so large, duplicate names are practically impossible.
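
The three naming methods differ mainly in the state the system must keep. The following Python sketch is illustrative only; the class names are hypothetical, and the use of random UUIDs from the standard `uuid` module for the GUID method is an assumption, since the description only calls for very large numbers.

```python
import itertools
import uuid
from collections import defaultdict


class PerExpressionSerial:
    """Method 1 (unique ID): one serial counter per natural language word."""

    def __init__(self):
        self._serials = defaultdict(int)

    def new_term(self, expression):
        self._serials[expression] += 1
        return f"{expression}_{self._serials[expression]}"


class GlobalSerial:
    """Method 2: a single serial counter shared by all words."""

    def __init__(self):
        self._counter = itertools.count(1)

    def new_term(self, expression):
        return f"word_{next(self._counter)}"


class GuidTerm:
    """Method 3: no counter at all; a random 128-bit identifier per meaning."""

    def new_term(self, expression):
        return uuid.uuid4().hex


def name_all(scheme, expressions):
    """Generate one term per meaning, in order."""
    return [scheme.new_term(e) for e in expressions]


meanings = ["AAA", "AAA", "BBB", "BBB"]  # two words, two meanings each
unique_ids = name_all(PerExpressionSerial(), meanings)
global_ids = name_all(GlobalSerial(), meanings)
guids = name_all(GuidTerm(), meanings)
print(unique_ids)
print(global_ids)
```

Only the first method yields names that a user can read back to the original word, which is the readability advantage attributed to the unique ID.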

The unique ID method maintains and uses a serial number for each natural language expression. It is the easiest method for users to read and remember, and the unique ID is the representative semantic unit term proposed by the present invention. The process of dividing a natural language expression into semantic units is easy to understand. On the other hand, treating several expressions as one semantic unit term may be slightly inconvenient for general users because the concept of a representative natural language expression must be introduced. For example, President Obama is referred to in most news as Barack Obama, but also as Barack Hussein Obama, Barack Hussein Obama II, Barack, and Obama. Creating a term for each of these expressions yields semantic expression IDs. Because a semantic expression ID is not a semantic unit term, the semantic expression IDs must be merged into a semantic unit. The result of merging semantic expression IDs into a semantic unit is called a semantic merge ID. In terms of content, the semantic merge ID corresponds to the unique ID, and the semantic expression ID corresponds to unique ID +. Compared with the unique ID method, the semantic expression ID method requires several times the term generation effort. It enlarges the term dictionary and burdens the user with writing a description for every expression rather than for every meaning, which is unnecessary. That unique ID + needs no separate term description confirms the efficiency of the unique ID method.

The unique ID is the most recommended semantic unit term because it requires the least term generation effort among the proposed semantic unit terms and is easy to remember and use. For President Obama, the representative natural language expression is Barack Obama, while Barack Hussein Obama, Barack Hussein Obama II, Barack, and Obama are other expressions. Assuming that the semantic serial number the system holds for this expression is 1, the generated unique ID is Barack_Obama_1. President Obama written as “Barack Hussein Obama” becomes [“Barack Hussein Obama”: Barack_Obama_1]. The part enclosed in square brackets is a unique ID + and corresponds to a semantic expression ID.
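
The bracketed unique ID + form may be produced and read back as in the following Python sketch. The exact serialization is an assumption: straight double quotes are used in place of the typographic quotes shown above, and the function names are hypothetical.

```python
import re

# Matches ["<natural language expression>": <unique ID>] with straight quotes.
_UNIQUE_ID_PLUS = re.compile(r'\["([^"]+)":\s*([^\]\s]+)\]')


def format_unique_id_plus(expression, unique_id):
    """Attach a unique ID to a natural language expression without replacing it."""
    return f'["{expression}": {unique_id}]'


def extract_annotations(text):
    """Return (expression, unique ID) pairs found in annotated text."""
    return _UNIQUE_ID_PLUS.findall(text)


def strip_annotations(text):
    """Recover the plain natural language text for display to the user."""
    return _UNIQUE_ID_PLUS.sub(lambda m: m.group(1), text)


tagged = format_unique_id_plus("Barack Hussein Obama", "Barack_Obama_1")
sentence = f"{tagged} gave a speech."
print(sentence)
print(strip_annotations(sentence))
```

Because the natural language expression is retained inside the brackets, the annotated text can always be rendered for the user exactly as it was written.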

The unique ID of the present invention has the following significance. The unique ID was created to remove the ambiguity of natural language, and a term is created for each of the meanings of a natural language expression. It is the most representative semantic unit term and is divided into semantic units that include all proper nouns, such as personal names and place names, which are easily confused with other words. Covering all languages and all parts of speech, and with each of the world's 6 billion people requiring a separate unique ID entry, at least 10 billion entries are needed. Because unique IDs are easy to create from natural language, general users can create terms and annotate with them. The unique ID is a precise language with a rich dictionary. A prerequisite for a new term system to become established and useful is the ability to annotate all existing documents with unique IDs; without an annotation method it has little value. The present invention provides such an annotation method. Natural language, which can be interpreted one way or another, cannot serve as the basis for search engines, language translation, the semantic web, artificial intelligence (AI), and classification. Eventually, unique ID + will be that basis. The unique ID keeps its natural language based generation method even for a concept that has no existing natural language expression: a natural language expression is first created for the new concept, and a unique ID is then generated from it. In the present invention, the detailed description of semantic unit terms is given using the unique ID. Because the implementation of the semantic expression ID and the semantic unit GUID does not differ greatly from that of the unique ID, they are not described separately unless necessary.

FIG. 11 illustrates a method of intuitively classifying semantic unit terms, arranging the classifications in a hierarchy, and managing them. In semantic unit term classification, the object being classified is a semantic unit term, and the classification name to which it belongs may itself be a semantic unit term. A classification name can be a natural language term, a semantic unit term, or a mixture of the two. A semantic unit term may have zero or more classification names, which can be added or deleted at any time. A classification name does not need to be defined before it is used: if a classification name that did not previously exist is entered when a term is created or changed, it is registered automatically. A classification name itself belongs to zero or more classifications and hierarchies. Where there is disagreement, the classification and hierarchy of terms can be refined through collective intelligence, such as discussion. This is the intuitive semantic unit term classification method.

If a classification name is entered in the classification field of a term while the semantic unit term is being generated or changed, the term belongs to that classification (11-01). Terms can also be classified in bulk through search: the semantic unit term dictionary is searched, and the selected terms are assigned to a specific classification (11-02). Classifications can have a hierarchical structure, which is created by selecting two classification names and setting a hierarchical relationship between them. Repeating this step builds up a complex hierarchy (11-03). The classification of a semantic unit term can be changed if an error or other reason for change is found (11-04). Just as natural language develops through the participation of many people, so does the classification of semantic unit terms. Procedures for proposing, discussing, and voting are provided so that the classification can be developed by many people (11-05).
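
Steps 11-01 to 11-03 may be sketched in Python as follows. The class `Classifier` and its methods are hypothetical, and the classification name "politician" is an illustrative assumption; classification names are registered automatically on first use, and a classification may have any number of parents.

```python
from collections import defaultdict


class Classifier:
    """Hypothetical store for term classifications and their hierarchy."""

    def __init__(self):
        self.members = defaultdict(set)   # classification name -> semantic unit terms
        self.children = defaultdict(set)  # classification name -> child classification names

    def classify(self, term, classification):
        """11-01: assign a term to a classification, registering the name if new."""
        self.members[classification].add(term)

    def set_parent(self, child, parent):
        """11-03: set a hierarchical relationship between two classification names."""
        self.children[parent].add(child)

    def terms_in(self, classification):
        """Return all terms in a classification, including lower classifications."""
        found, seen, stack = set(), set(), [classification]
        while stack:
            name = stack.pop()
            if name in seen:  # guard against cyclic relationships
                continue
            seen.add(name)
            found |= self.members.get(name, set())
            stack.extend(self.children.get(name, ()))
        return found


classifier = Classifier()
classifier.classify("Barack_Obama_1", "man")
classifier.classify("Barack_Obama_1", "president")
classifier.set_parent("president", "politician")  # "politician" is illustrative
print(classifier.terms_in("politician"))
```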

FIG. 12 illustrates a method of creating and using semantic unit term aliases, which are useful when a semantic unit term is long and difficult to remember. Term aliases apply to semantic unit terms and are created and used at the level of an individual, a specific group, or the Internet. A term alias is created from three pieces of information: the group to which it applies, the term alias, and the semantic unit term (12-01). To use the term aliases of a group, the user adds the group's term aliases to the user's own term alias list (12-02). When a user enters a term alias while entering a search query or a semantic unit term in a document, the alias is translated into the corresponding semantic unit term before the query is executed or the document is stored (12-03).
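
Alias translation (12-03) may be sketched in Python as follows. The alias table keyed by (group, alias), the group names, the alias strings, and the function `resolve_aliases` are all assumptions made for the example.

```python
def resolve_aliases(tokens, aliases, subscribed_groups):
    """Translate term aliases into semantic unit terms before a query is executed.

    `aliases` maps (group, alias) to a semantic unit term. Groups are consulted
    in the order given; tokens that are not aliases are passed through unchanged."""
    resolved = []
    for token in tokens:
        for group in subscribed_groups:
            term = aliases.get((group, token))
            if term is not None:
                resolved.append(term)
                break
        else:
            resolved.append(token)
    return resolved


# 12-01: each alias is registered with its group, alias, and semantic unit term.
aliases = {
    ("personal", "obama"): "Barack_Obama_1",
    ("team", "hgd"): "Hong Gil-dong_1",
}

print(resolve_aliases(["obama", "speech"], aliases, ["personal"]))
print(resolve_aliases(["hgd"], aliases, ["personal", "team"]))
```

A user who has not added the "team" group's aliases to the personal list (12-02) would see "hgd" passed through untranslated.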

FIG. 13 shows an example of using semantic unit term division. A search for President Obama finds too many results. The term can be subdivided into term divisions for annotation and search. In the example of the figure, President Obama (13-01) is divided into term divisions for his time as president, his time as senator, and others (13-02), and each of these is divided into second-tier term divisions (13-03). If a specific natural language expression in the list of documents found by a search is annotated with a term division name, it can later be searched by that term division name. A search by an upper semantic unit term or term division naturally includes the contents of the lower term divisions. A term division is written in the form "semantic unit term/term division name".

FIG. 14 illustrates a method of managing a specific semantic unit term by dividing it into term divisions when it needs to be subdivided, and of using the term divisions for annotation and search in the same way as semantic unit terms.

When the semantic unit term to be divided and the term division name are entered and term division is requested, the term division is performed (14-01). Term divisions may form several layers rather than just one. By entering the name of the term division to be split and the name of the child term division to be created, a term division in the lower layer can be created (14-02). Once a term division has been created, it can be used to annotate a document or a search system index (14-03) and to search (14-04).
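
Layered term divisions and the rule that an upper-level search includes the lower divisions may be sketched as follows. The Python example assumes that a division is written as a slash-separated path after the semantic unit term; the division names, the sample annotations, and the function names are illustrative only.

```python
def make_division(term, *division_names):
    """Build a term division path such as 'term/division/sub-division'."""
    return "/".join((term,) + division_names)


def matches(annotation, query):
    """True if the annotation is the queried term or division, or lies below it."""
    return annotation == query or annotation.startswith(query + "/")


def search(annotations, query):
    """Return the documents whose annotation matches the query, sorted by name."""
    return sorted(doc for doc, a in annotations.items() if matches(a, query))


# Illustrative annotations: document name -> annotated term or term division.
annotations = {
    "doc_a": make_division("Barack_Obama_1", "president"),
    "doc_b": make_division("Barack_Obama_1", "senator"),
    "doc_c": make_division("Barack_Obama_1", "president", "first_term"),  # second tier
    "doc_d": "Barack_Obama_1",
}

print(search(annotations, "Barack_Obama_1"))            # upper term: every division
print(search(annotations, "Barack_Obama_1/president"))  # one division and its children
```

Appending the separator before the prefix comparison keeps a term such as Barack_Obama_10 from matching a search for Barack_Obama_1.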

FIG. 15 shows semantic unit term groups. Once a term group is defined, a search query can be written using the term group name. In the example shown in the figure, the search term “2010 Korea High School Grade 1 _Grp” returns the results found with “Hong Gil Dong_1” together with the results found with “Kim Gil Dong_1”. Unlike a term division, a term group is not used to annotate documents or search system indexes. Semantic unit terms are a more precise language than natural language, so a search in semantic unit terms may return only a small number of documents. Term groups can be used to broaden the concept searched or to bring the search results to a reasonable size. Without a term group, the user would have to find the list of graduates and then search for each one; the term group function performs this two-step task at once.

FIG. 16 shows how to create and use a semantic unit term group. When a list of semantic unit terms or groups to be grouped, the name of the group to be created, and a group description are entered and creation of a semantic unit term group is requested, a term group is generated from the input items (16-01). The created term group can be used in search queries. A term group included in a search query is converted into a query of semantic unit terms, and the search is performed (16-02). Natural language cannot serve as a basis for accumulating knowledge in search because its meaning is unclear: the error grows as an expression is reused in different ways. Semantic unit terms can be reused in this way because they are precise and their search accuracy is close to 100%.
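The conversion of a term group into a query of semantic unit terms (16-02) might look like the sketch below. It assumes a group is a named list whose members are semantic unit terms or other groups; the index contents and the functions `expand_group` and `search_group` are hypothetical names introduced for illustration only.

```python
# Group name -> members (semantic unit terms or other group names).
GROUPS = {
    "2010 Korea High School Grade 1 _Grp": ["Hong Gil Dong_1", "Kim Gil Dong_1"],
}

# Hypothetical index: semantic unit term -> documents annotated with it.
INDEX = {
    "Hong Gil Dong_1": {"doc_a", "doc_b"},
    "Kim Gil Dong_1": {"doc_c"},
}


def expand_group(name, groups, _seen=None):
    """Expand a group name into its semantic unit terms, following nested groups."""
    seen = set() if _seen is None else _seen
    if name not in groups:
        return [name]          # not a group, so it is already a semantic unit term
    if name in seen:
        return []              # guard against groups that contain each other
    seen.add(name)
    terms = []
    for member in groups[name]:
        for term in expand_group(member, groups, seen):
            if term not in terms:
                terms.append(term)
    return terms


def search_group(name, groups, index):
    """Search every term of the group and return the union of the results."""
    docs = set()
    for term in expand_group(name, groups):
        docs |= index.get(term, set())
    return docs


print(expand_group("2010 Korea High School Grade 1 _Grp", GROUPS))
print(sorted(search_group("2010 Korea High School Grade 1 _Grp", GROUPS, INDEX)))
```

The group name itself never reaches the index; only the expanded semantic unit terms are searched, which matches the statement that term groups are not used for annotation.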

FIG. 17 shows a semantic unit term-based information system drawn around the independent annotator (the semantic unit term annotator). In the semantic unit term-based information system, the document writer, the search system, and the document information system builder each include annotation devices. This section mainly describes the independent annotator; all annotators are described together in the section on the search annotator of the search system. The semantic unit term annotator provides the annotation function to all of these devices (document writer, search system, system builder).

C. The semantic unit term annotator is a device that annotates natural language expressions with semantic unit terms. It consists of the C1. annotation knowledge management unit, the C2. default value management unit, the C3. knowledge-based annotation unit, the C4. index-based document annotation unit, and the C5. annotation management unit, and it is used together with the semantic unit term dictionary manager.

The C. semantic unit term annotator is also called the independent annotator, meaning that it can be used without depending on a particular device. The search annotator is a powerful annotation device, but it is kept separate from the independent annotator because it depends on the searcher. The independent annotator is called from different devices and used in a variety of ways.

Annotation knowledge consists of 1) an annotation condition, 2) the natural language expression to be annotated, and 3) the semantic unit term to annotate it with. The C1. annotation knowledge management unit is responsible for creating, modifying, and deleting annotation knowledge.

The C2. semantic unit term default value management unit creates and manages default values for individuals and groups. A default value is the semantic unit term that a particular person or group uses most often for a particular natural language expression. When the default values of several groups apply, the group with fewer members generally takes precedence over the group with more members. Accordingly, the individual's default value has the highest priority, groups such as companies or fields come next, and the Internet, which includes everyone, has the lowest priority. Individuals who use default values decide which default values to apply.

The C3. knowledge-based annotation unit is a device that annotates natural language expressions with semantic unit terms using annotation knowledge and default values. Knowledge-based annotation is performed on documents, indexes, and queries; that is, it is used wherever natural language is input. It can be called from the place where natural language is entered, or it can run as an agent that is executed regularly. It can also operate as automatic annotation.

When enough annotation knowledge has accumulated, all annotation can be performed automatically. Knowledge-based annotation applies annotation knowledge and default values when it runs. Whether to fall back to default values when no annotation knowledge exists is determined by configuration. A default value is simply the most frequently used term; it does not guarantee that accuracy is above any given standard.

The C4. index-based document annotation unit is a device that converts a document to a semantic unit term basis using information in the index. To use information in the index, the target document must already be included in the search system index. If the document is based on semantic unit terms, the corresponding part of the index can be converted to a semantic unit term basis. Conversely, if the information in the index is based on semantic unit terms, the document can be converted to a semantic unit term basis. This device can be regarded as a device for converting the form of existing information.

The C5. annotation management unit is a device that displays all annotations so that their contents can be reviewed and annotation errors corrected. A user's own annotation manager shows, in order of annotation date, the annotations added by annotation knowledge that the user created, the annotations added by the user's search annotations, and so on.

FIG. 18 is an example of the default values of an individual (inventor Hong Gil-dong). Each heading is a natural language expression, and the entries below the heading are its various meanings, each represented by a unique ID. The colored unique ID is the default semantic unit term that this person uses for the natural language expression. A default value thus selects one specific meaning among the several meanings of a natural language expression. In the figure, the default value of the natural language expression Hong-gil-dong is set to Hong-gil-dong_1 (inventor Hong-gil-dong), operation is set to operation_3 (operation), and eyes is set to eye_1 (Eye). If default values are configured to apply, then when the user enters one of these natural language expressions the system automatically annotates it with the unique ID recorded as its default value in this default value DB.

FIG. 19 is an example of the default values that apply to a specific user. There may be default values at the Internet level, default values for each group, and individual default values. Their priority order is individual > group > Internet. The default value for the entire Internet usually has the lowest priority, and smaller groups usually have higher priority, so the individual's default value has the highest priority. The number of groups a user belongs to and their priority can be decided by each user or set by the system. If the field of a document is set in advance when the document is created, the default values of that field are applied. In general, higher-priority levels hold default values for only some natural language expressions, and lower-priority levels hold default values for many. The Internet level has default values for all natural language expressions.

If default values exist at every level for a natural language expression, the final default value is that of the individual, who has the highest priority. For the Internet default value, which has the lowest priority, to become the final default value, no individual or group default value may exist. In the figure, the natural language expression Hong Gil-dong has several default values, and the personal default value, which has the highest priority, is the final default value. For the natural language expression operation, only the group and Internet default values exist, so the group default value is the final default value. For the natural language expression eyes, only the Internet has a default value, which is therefore the final default value.

FIG. 20 shows a procedure for determining the semantic unit term default value for a natural language expression. First, the default values of each group are determined, along with the priority order of the groups to which default values are applied. Each group records how often each semantic unit term is used for each natural language expression and sets the most frequently used semantic unit term as the default value for that expression (20-01). If the person is known, because a search query is being made or the owner of a document is specified, the person's default value is applied as the semantic unit term for the natural language expression (20-02). If no personal default value exists and the group (field) of the document is specified, the default value of the group is applied as the semantic unit term for the natural language expression; when there are several groups, they are tried in priority order (20-03). If no group default value exists either, the Internet default value is applied as the semantic unit term for the natural language expression (20-04).
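The procedure of FIG. 20 can be sketched in a few lines. This is an illustrative sketch under assumptions: usage frequencies are kept as simple counts, each priority level is a dictionary from natural language expression to semantic unit term, and the function names `most_used_term` and `resolve_default`, the usage counts, and the values `Hong-gil-dong_2` and `operation_1` are hypothetical.

```python
from collections import Counter


def most_used_term(usage_counts):
    """Step 20-01: the most frequently used semantic unit term becomes the default."""
    if not usage_counts:
        return None
    return Counter(usage_counts).most_common(1)[0][0]


def resolve_default(expression, levels):
    """Steps 20-02 to 20-04: try each level from highest to lowest priority."""
    for level_name, defaults in levels:
        term = defaults.get(expression)
        if term is not None:
            return term, level_name
    return None, None


# Hypothetical data shaped like the FIG. 19 example.
personal = {"Hong Gil-dong": "Hong-gil-dong_1"}
group = {"operation": most_used_term({"operation_3": 5, "operation_1": 2})}
internet = {"Hong Gil-dong": "Hong-gil-dong_2",
            "operation": "operation_1",
            "eyes": "eye_1"}

# Ordered from highest priority (individual) to lowest (Internet).
LEVELS = [("personal", personal), ("group", group), ("internet", internet)]

for expression in ("Hong Gil-dong", "operation", "eyes"):
    print(expression, resolve_default(expression, LEVELS))
```

Several groups with their own priority order can be represented by inserting more `(name, defaults)` pairs between the personal and Internet levels.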

FIG. 21 shows the conceptual structure of an annotation knowledge table. The annotation condition, the first of the three elements of annotation knowledge, is usually a search query. The annotation knowledge in the figure works as follows.

When this annotation knowledge is executed on a search engine, it acts like a search annotation: the search engine searches for "President Obama" and annotates the unique ID barack_obama_1 in the index for the documents found. When this annotation knowledge is executed on a document, it finds "President Obama" in the document and converts Obama to Obama: barack_obama_1.

FIG. 22 illustrates a situation in which several items of annotation knowledge apply to one natural language expression. Applying different items of annotation knowledge may give different results. In that case, the more specific item applies, where the more specific item is the one whose search query returns fewer results. In the figure, "Barack Hussein Obama I" returns the fewest results. When several items of annotation knowledge can apply to one natural language expression, their application priority can also be specified explicitly. The search query can contain not only natural language but also much of the information used in advanced search, such as unique ID +, target site, field, and date range.
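One way to express this selection rule is sketched below. It assumes each candidate records the number of results its query returned and, optionally, an explicit priority where a lower number means higher priority; the result counts, the second semantic unit term, the field names, and the function `pick_annotation_knowledge` are all hypothetical.

```python
def pick_annotation_knowledge(candidates):
    """Choose the item to apply: explicit priority first, then fewest search results."""
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda item: (item.get("priority", float("inf")), item["result_count"]),
    )


# Hypothetical candidates that all apply to the natural language expression "Obama".
candidates = [
    {"query": "Obama", "term": "barack_obama_1", "result_count": 1_000_000},
    {"query": "President Obama", "term": "barack_obama_1", "result_count": 100_000},
    {"query": "Barack Hussein Obama I", "term": "barack_obama_sr_1", "result_count": 1_000},
]

print(pick_annotation_knowledge(candidates)["query"])
```

With no explicit priorities, the item with the smallest result set wins; giving any item a `priority` value overrides the result-count ordering.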

FIG. 23 shows the annotation knowledge creation procedure, in which annotation knowledge is verified through search and then registered. A search query is obtained, using any phrases allowed by the search query grammar, such as natural language expressions, semantic unit terms, operators, period, site, field, and category, and the search is performed (23-01). After the search results are displayed and reviewed by the user, a request to create annotation knowledge is received together with the verified search query, the natural language expression to be annotated, the semantic unit term to annotate, and a description of the annotation knowledge (23-02). Annotation knowledge containing the verified search query, the natural language expression to be annotated, and the semantic unit term to annotate is created together with an annotation knowledge ID, and an annotation knowledge item is created by combining the annotation knowledge, the annotation knowledge ID, and the description (23-03).

FIG. 24 shows the order in which the knowledge-based annotation unit applies annotation knowledge and default values. Annotation knowledge is always applied when knowledge-based annotation is performed. Default values are applied only if they are configured to apply, because a default value is less accurate than annotation knowledge. Whether an expression with no annotation knowledge is left unannotated or receives the default value is therefore determined by configuration. When default values are applied, the order of application is annotation knowledge > personal default > group default > Internet default. If a semantic unit term exists at a higher priority, it is used to annotate the natural language expression; if not, the semantic unit term of the next rank is used. If the semantic unit term used in the annotation is not correct, the user must correct it.

FIG. 25 illustrates the process of performing knowledge-based annotation on a document or query. Annotation of an index works with the help of the search system, but the search system is not involved in annotating documents or queries, so the procedure is very different. First, a natural language expression to be annotated is selected and a knowledge-based annotation request is made (25-01). The annotation knowledge DB is searched for the natural language expression to find the annotation knowledge to apply (25-02). The retrieved annotation knowledge is applied to the natural language expression (25-03). If there is no annotation knowledge and default values are configured to apply, the semantic unit term default value is applied (25-04).

FIG. 25 is not a procedure for annotating an entire document but for annotating a specific natural language expression in a document. The procedure can be invoked by a device that annotates whole documents, or by a person who selects a specific natural language expression and requests annotation. Annotation knowledge is usually created from search system queries, so not all annotation knowledge can be used to annotate natural language expressions directly. Each item of annotation knowledge carries an indication of whether it can be applied without a search system, so its applicability can be checked in advance. If more than one item of annotation knowledge applies, the order in which they are applied is specified in the annotation knowledge itself. In general, the item whose search returns fewer results takes priority, because a smaller result set is judged to be more accurate.

FIG. 26 shows a procedure for executing annotation knowledge against an index. In general, annotation knowledge is created by performing a search annotation and saving its contents, so executing annotation knowledge repeats a search annotation that was performed earlier. Search system indexes, however, are always changing, mainly because new documents are added. It is very inconvenient for a person to perform a search annotation each time new documents are added, but if the contents of the search annotation are saved, it can be executed automatically at regular intervals. When annotation knowledge is executed, some of its contents can be modified, for example to change the time period or for re-execution.

A request to execute annotation knowledge against the index is made by entering an annotation knowledge ID and the elements to be changed (26-01). Before execution, the requested annotation knowledge is modified to reflect the changed elements (26-02). The search query of the annotation knowledge is executed against the index (26-03). The semantic unit term contained in the annotation knowledge is annotated on the search results (26-04).

FIG. 27 shows an index-based document annotation unit annotating a document using only index information.

In general, the search annotator and the annotator accumulate semantic information in the index, whereas the index-based document annotation unit extracts information from the index and applies it to natural language documents. It works in the opposite direction to the semantic unit term indexer. Semantic unit terms are normally annotated on a document by knowledge-based annotation, which uses annotation knowledge and default values. Index-based document annotation instead uses the information accumulated in the index, not annotation knowledge. In a semantic unit term-based index, semantic unit term annotations are accumulated by the search annotator and the annotator. The index may hold more information than can be obtained from annotation knowledge: whenever a search annotation is performed and its contents are not saved as annotation knowledge, the index naturally holds more semantic information than the annotation knowledge does. Index information, on the other hand, cannot be applied to new documents at all. The semantic information of the index and that of annotation knowledge therefore have different characteristics. The index-based document annotation unit is called mainly by the index-based document information system builder. It can also be called by the document writer.

FIG. 28 shows a procedure for annotating semantic unit terms on specific natural language expressions in documents that are indexed in a search system. Even when a document is included in the index, the index does not necessarily hold a semantic unit term annotation for every natural language expression in the document. This figure shows the procedure for annotating a specific natural language expression with a semantic unit term using all available information: the information in the search system index, annotation knowledge, and default values. For documents included in the index, the richest and most accurate information is the annotation information from the index.

Information on the natural language expression in the document is extracted from the index and used to annotate the semantic unit term (28-01). If the information cannot be obtained from the index, the annotation knowledge DB is searched for annotation knowledge on the natural language expression, and the semantic unit term is annotated on the natural language expression (28-02). If there is no applicable annotation knowledge and default values are configured to apply, the default semantic unit term for the natural language expression is applied (28-03).
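The three-stage fallback of steps 28-01 to 28-03 can be sketched as follows. The sketch makes simplifying assumptions that are not in the source: index annotations are keyed by document ID and word position, annotation knowledge is reduced to a plain mapping from expression to term (its annotation condition is ignored), and the function `annotate_indexed_expression` and all data values are hypothetical.

```python
def annotate_indexed_expression(doc_id, position, expression,
                                index_terms, knowledge, defaults,
                                use_defaults=True):
    """Return (semantic unit term, source) for one expression in an indexed document."""
    # 28-01: prefer the annotation already stored in the search system index.
    term = index_terms.get((doc_id, position))
    if term is not None:
        return term, "index"
    # 28-02: otherwise look for annotation knowledge on the expression.
    term = knowledge.get(expression)
    if term is not None:
        return term, "annotation knowledge"
    # 28-03: otherwise apply the default value, but only if configured to do so.
    if use_defaults and expression in defaults:
        return defaults[expression], "default"
    return None, "unannotated"


# Hypothetical data.
index_terms = {("doc1", 4): "barack_obama_1"}
knowledge = {"Hong Gil-dong": "Hong-gil-dong_1"}
defaults = {"eyes": "eye_1"}

print(annotate_indexed_expression("doc1", 4, "Obama", index_terms, knowledge, defaults))
print(annotate_indexed_expression("doc1", 9, "Hong Gil-dong", index_terms, knowledge, defaults))
print(annotate_indexed_expression("doc1", 12, "eyes", index_terms, knowledge, defaults))
```

Returning the source alongside the term lets a reviewer see which annotations came from the less accurate default values.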

FIG. 29 shows the scale of a semantic unit term (unique ID +) based information system. Proper nouns are the main source of ambiguity in natural language. Every natural language expression, including every proper noun, must have unique semantic unit terms if the ambiguity of the language is to be eliminated. Given that the world population is 6 billion, the number of semantic unit terms once the system is established will be at least 10 billion when proper nouns are taken into account. Judging from the size of today's largest search systems, the document information system is expected to reach the exabyte scale before long. At least one default value is needed per natural language expression, and one compound annotation knowledge item is needed per natural language expression. This vast scale shows that the work of converting the Internet to semantic unit terms cannot be done by a few experts. Like natural language itself, an Internet based on semantic unit terms can only be built in a way that lets all Internet users participate and that relies on continually evolving collective intelligence. The present invention has a structure in which users can easily participate.

FIG. 30 illustrates various approaches to building a semantic unit term (unique ID +) based information system. Depending on the approach, putting the Internet on a semantic unit term basis may or may not be feasible. The only workable way to build the system is to break the work down into units small enough that individuals can each take on as much of it as they need. Even at the individual level, however, the work must not burden individuals unevenly. In the early stages of construction, it is hard to make normal progress if individuals are expected to annotate every word of their own documents. One document uses many different words, and processing many different words takes a great deal of effort regardless of the total number of annotations. In fact, an individual's effort is proportional not to the number of annotations but to the number of unique IDs used. Document-level annotation has two problems: it uses many unique IDs, and its benefits do not go to the person who annotates. Authors know their own documents well, so they suffer no confusion of meaning and gain nothing from clarifying it. Building the system one unique ID at a time has the advantage that the burden of the whole construction is spread evenly across individuals. The work also benefits the person who does it, because the annotations concern keywords that the person is interested in. Annotating one unique ID at a time through search is often tens of millions of times more efficient than having a person annotate every word of a document.

FIG. 31 compares the productivity of document-level annotation and search annotation using an example. In the example, annotation by unique ID is 23,000,000 times more productive than document-level annotation. The amount of annotation required for the entire information system is fixed, so annotation productivity is the most important measure in building the new system. Annotation by unique ID is the key device that makes construction of the new system possible. Normally it is set up as an agent and run regularly on new documents.

FIG. 32 shows a manual annotation type document writer and an automatic annotation type document writer.

A document writer can in principle create a semantic unit term-based document with nothing more than a semantic unit term dictionary manager. The author writes the document in natural language, searches the semantic unit term dictionary, and selects the desired semantic unit term by referring to the description of each term. However, a manual document writer is unlikely to be used in practice, because document authors do not themselves suffer from the semantic confusion of natural language and manual annotation is inconvenient (32-01). Once enough annotation knowledge has accumulated, the document writer will annotate semantic unit terms automatically, and the author will review and partially revise the annotations. Documents written in natural language are annotated automatically using annotation knowledge and default values. After automatic annotation, the document writer displays the dictionary description of each annotated semantic unit term and shows which annotation knowledge or default value produced the annotation (32-02).

FIG. 33 illustrates a procedure for creating a semantic unit term-based document with the help of the automatic annotation function of the knowledge-based annotation unit.

First, the document is written in natural language. Annotation knowledge cannot help with annotation when only a single word has been entered. Default values can suggest semantic unit terms even for a single word, but annotating word by word prevents the use of the more accurate annotation knowledge, so annotation normally starts after the natural language document is complete (33-01). The automatic annotation function is invoked, applying the annotator's annotation knowledge and semantic unit term default values to the natural language sentences to annotate each natural language expression with a semantic unit term (33-02). If an annotation made by the automatic annotation function is wrong, the user selects the relevant part and requests a change to start the correction process. A dictionary search on the natural language expression then displays the list of its semantic unit terms (33-03). When the user selects the appropriate semantic unit term from the list, the annotation is changed to the selected term (33-04). If the list contains no appropriate term, the user creates the term and then changes the annotation to it (33-05).

FIG. 34 shows the search system of FIG. 1 and simplifies the other parts. The J. semantic unit term-based document information system builder is a device that only uses the results of the search system and has no bearing on the performance of the search system. The figure includes all the annotation devices that fill in the contents of the semantic unit term-based index.

FIG. 35 shows a configuration with no device for assisting annotation of the index. All semantic unit term information in the index must therefore be obtained from semantic unit term-based documents, which means that the collected documents themselves must be complete semantic unit term-based documents. This is unlikely to work in practice, because it requires users to use semantic unit terms instead of natural language in their everyday writing.

FIG. 36 is a block diagram in which a semantic unit term-based search annotator is added to the basic semantic unit term-based search system. Among the devices that assist annotation, only the annotator is missing. Apart from the problem that annotation knowledge cannot be re-executed by an annotator, the configuration can be considered complete from the point of view of the search system. However, unless the contents of previous search annotations are repeated, for example by an agent, on documents newly added to the index, people must perform the same search annotations again and again, which is inconvenient, so the semantic unit term annotations may remain incomplete. If this capability is included in the search annotator itself, the search system is complete. Even so, the absence of a structure that uses search annotation knowledge outside the search system can be a major obstacle to creating a complete semantic unit term-based information system.

In FIG. 37, the search system itself has no search annotator, but a semantic unit term annotator exists outside it. In this configuration, the performance of the entire search system depends on the completeness of the semantic unit term annotator. The semantic unit term annotator itself may be limited, because annotation knowledge is made from contents verified by the search annotator and the completeness of the annotation knowledge is the key factor in the annotator's performance. For this configuration to operate normally, the annotator must in effect include the function of the search annotator.

FIG. 38 is a meta search system. It has no document collector or indexer of its own. Searches are therefore performed by an external search engine, and the results are annotated and recorded in the system's own semantic unit term-based index. This approach makes the system easier to build at first, but it has serious limitations. For example, a search annotation command that selects, among the documents found with "President Obama", the occurrences of Obama that are not yet annotated and annotates them with barack_obama_1 must receive nearly 100 million items of data from the external system. If 99% are already annotated, only 1% would need to be fetched and processed, yet 100% must be received and processed. How well the system can work together with external search systems is another major problem.

FIG. 39 illustrates a method of operating a semantic unit term-based search system that has only basic functions. In this method the semantic unit term information in the index is obtained solely from semantic unit term-based documents; no other means of adding semantic unit term information to the index is provided. The search system collects the documents to be searched; how much semantic unit term information the collected documents contain determines the degree to which the search system is based on semantic unit terms (39-01). The collected documents are indexed by natural language expression and semantic unit term (39-02). The natural language expressions and semantic unit terms stored in the index are searched using queries that include semantic unit terms and natural language expressions (39-03).

FIG. 40 illustrates a method of operating a semantic unit term-based search system in which semantic unit term information is obtained from collected documents and search annotations.

The search system collects the documents to be searched (40-01). The collected documents are indexed by natural language expression and semantic unit term (40-02). A search annotation request is received together with a query for finding the annotation targets, the natural language expression to be annotated, and the semantic unit term to annotate, and the semantic unit term is annotated in the search system index on the natural language expression contained in the search results of the query (40-03). The natural language expressions and semantic unit terms stored in the index are searched using queries that include semantic unit terms and natural language expressions (40-04).
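The search annotation step (40-03) can be illustrated with a toy index. This sketch assumes the index is a list of word entries with an optional semantic unit term, and that a query matches a document when the document contains all of the query words; real query grammars are richer. The functions `find_documents` and `search_annotate` and the sample entries are hypothetical.

```python
def find_documents(index, query_words):
    """Return the IDs of documents that contain every query word."""
    docs_by_word = {}
    for entry in index:
        docs_by_word.setdefault(entry["word"], set()).add(entry["doc"])
    matches = [docs_by_word.get(word, set()) for word in query_words]
    return set.intersection(*matches) if matches else set()


def search_annotate(index, query_words, expression, term):
    """Annotate `expression` with `term` in every document found by the query.

    Only entries that have no semantic unit term yet are changed.
    Returns the number of index entries annotated.
    """
    hits = find_documents(index, query_words)
    annotated = 0
    for entry in index:
        if entry["doc"] in hits and entry["word"] == expression and entry["term"] is None:
            entry["term"] = term
            annotated += 1
    return annotated


# Hypothetical index: d1 is about the president, d2 is about something else named Obama.
index = [
    {"doc": "d1", "word": "President", "term": None},
    {"doc": "d1", "word": "Obama", "term": None},
    {"doc": "d2", "word": "Obama", "term": None},
    {"doc": "d2", "word": "city", "term": None},
]

print(search_annotate(index, ["President", "Obama"], "Obama", "barack_obama_1"))
```

Because already annotated entries are skipped, repeating the same search annotation after new documents are indexed only touches the new entries.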

FIG. 41 illustrates a method of operating a semantic unit term-based retrieval system for obtaining semantic unit term information from collected documents and annotation knowledge.

The search system collects the documents to be searched (41-01). The collected documents are indexed by natural language expression and semantic unit term (41-02). Semantic unit terms are annotated on natural language expressions using annotation knowledge, which holds the information that a certain natural language expression has a certain meaning under a certain condition (41-03). The natural language expressions and semantic unit terms stored in the index are searched using queries that include semantic unit terms and natural language expressions (41-04).

FIG. 42 is a configuration diagram drawn around the indexer. Parts other than the indexer are simplified.

The indexer is responsible for indexing the collected documents. A semantic unit term-based index has a semantic unit term field added to the index. Semantic unit term annotations found in a semantic unit term-based document are recorded in this added field, and the search annotator also records semantic unit terms in this field. If the indexer cannot fill the field, the search annotator or the annotator fills it, so that the index becomes based on semantic unit terms. If a natural language expression has only one meaning, it does not need to be annotated; the natural language expression itself can serve as a semantic unit term.

FIG. 43 shows the conceptual structure of a unique ID + index. The figure shows the index entry (43-01) for the second occurrence of Hong Gil-dong in a specific document (43-02) found by searching for "Hong Gil-dong". Combining the natural language expression field and the unique ID field of the table on the left of the figure gives the unique ID + value. In effect, this index is a document location index for unique ID + values.

FIG. 44 shows how the unique ID scheme and the semantic expression ID scheme are handled in an index. The document on the right is a specific document (44-02) found with the search term "Gil-dong", and the table on the left is the index (44-01) for Gil-dong. The values entered in the second field, the semantic unit term field, are shown in the table at the lower left (44-03). In the sentence, the same person, Hong Gil-dong, is referred to in two ways, as Hong Gil-dong and as Gil-dong. In this case, the unique ID scheme uses the same ID for both expressions, whereas the semantic expression ID scheme uses a different ID for each.

FIG. 45 illustrates a semantic unit term-based indexing method. The indexer creates a search system index in which each word of the collected documents has an empty semantic unit term field (45-01). If a word carries a semantic unit term annotation, the semantic unit term is recorded in the semantic unit term field of that word's index entry (45-02).
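The two indexing steps can be sketched as below. The sketch assumes, purely for illustration, that an annotated word is written inline as `word:term` with no spaces and that documents are split on whitespace; the source does not specify a storage syntax, and the function `build_index` and the sample document are hypothetical.

```python
def build_index(documents):
    """Create one index entry per word, with a semantic unit term field.

    documents: mapping of document ID -> text, where an annotated word is
    written as "word:term" (an assumed inline syntax).
    """
    index = []
    for doc_id, text in documents.items():
        for position, token in enumerate(text.split()):
            word, _, term = token.partition(":")
            index.append({
                "doc": doc_id,
                "pos": position,
                "word": word,
                # 45-01: the field starts empty; 45-02: filled if annotated.
                "term": term or None,
            })
    return index


documents = {"d1": "President Obama:barack_obama_1 spoke"}
for entry in build_index(documents):
    print(entry)
```

Entries whose `term` field stays empty are the ones that a search annotator or knowledge-based annotator would fill in later.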

FIG. 46 shows all the annotation devices that belong to the various devices. The earlier section on the semantic unit term annotator described the independent annotator; all annotation devices are described here. FIG. 46 differs from FIG. 1. The semantic unit term query annotation unit is included in the search annotator (46-01) and the searcher (46-02). A search requires a query, and the query is also a target of semantic unit term annotation. Because queries are very short, they matter less from the point of view of annotation, and query annotation is usually treated as a part of document annotation. In the search annotator, annotation is performed after a search, and the search part of the search annotator shares much of its functionality with the searcher. A query is therefore used in the search annotator too, and that query is a target of semantic unit term annotation just like a query in the searcher.

The names of annotation devices often contain the word "document". Because the word is used in several senses, its meaning must be understood precisely in each case. "Document" is sometimes used as in "document search annotation", whose counterpart is "word search annotation". "Document" can also denote the target of annotation, in which case its counterpart is the index record. In "semantic unit term document annotation unit", it means that the document, rather than the index, is annotated. In "document search annotation", it means that the annotation is recorded at the document level. The target of all search annotations is the index.

FIG. 47 briefly describes annotation devices that form the basis of a semantic unit term-based information system as part of the description of FIG. 46.

In a semantic unit term-based information system, the core task is to put natural language information on a semantic unit term basis. The function of adding semantic unit terms to natural language is simply called the annotation function. The annotation target is the place where the comment is made; it is divided into document comment, index comment, and search query comment (47-01). The target document distinction indicates whether a document is already indexed in the retrieval system and annotated using the retrieval system's functionality, or annotated by a method that does not use the retrieval system. New documents are not included in the index and are therefore processed independently of the search system (47-02). Search annotation is split into two kinds because existing search results are listed as documents. A document search comment is an incomplete way of annotating what a word in a document means; a word search comment is more precise (47-03). The C4 index-based document annotation unit, the J1 index-based document information system building unit, and the J2 annotation knowledge-based document information system building unit perform functions that are carried out secondarily, after the first-level semantic unit term-based information system has been completed. They are therefore of no early importance (47-04).

Once either the document information system or the index has been put on a semantic unit term basis, the other can easily follow. The first to be put on that basis is the index, not the document information system, because doing so for an index is much easier. The D2 semantic unit term document comment is not a secondary device, but it is not of great importance initially, because it does not annotate the index. The semantic unit term query term comment is not important because the amount of annotation is extremely small. The initially critical devices are therefore the C3 knowledge base comment, the H1 document search comment, and the H2 word search comment (47-05).

FIG. 48 shows an example in which document comments, index comments, and search query comments are actually applied. The index comment is applied using the word search comment method.

FIG. 49 shows the difference between a word search comment and a document search comment.

A word search comment records every occurrence and is the natural method. Each word in the document is annotated, down to each individual occurrence of each word; this is the correct form of annotation. The method is difficult to apply to existing search systems, so a new search device built for this processing, the word search unit, is used (49-01). A document search comment is imprecise. Annotation should properly be done at the level of every word; the problem arises because an existing search system is a device for finding a specific document, not a specific word, so the needed information cannot be obtained. It is an annotation method that may disappear in the long run. Compared with the per-occurrence annotation method, only one "Hong-gil-dong" and two "sea" are recorded, and the positions of the words are not recorded (49-02).

FIG. 50 describes the units in which annotation proceeds. There are many other ways, but only the main items are compared, and the knowledge base comment is not included. Once annotation knowledge is complete, the full meaning of an entire document can be annotated. The accumulation of annotation knowledge means that the semantic unit term-based information system is complete. The earliest and most powerful device is the search commenter, which annotates a particular meaning across all of the found documents at once. A search commenter can sometimes be tens of millions of times more productive than manually annotating each natural language expression in individual documents. It is an important means of making a semantic unit term-based information system feasible.

FIG. 51 compares annotation of new and existing documents, which have different processing environments. Because new documents are not included in the search system index, they cannot be processed through the index. New document comments annotate the document itself, whereas existing documents are annotated on the index (51-03). Existing document annotation is performed with the retrieval system, and new document annotation proceeds independently of the retrieval system. Document writer-2 for new documents writes directly to the search system index; this means that it has a built-in indexer and operates without any intervention from the search system. Storing results directly in the search system's index does not mean that the existing document annotation method is used. In the case of document writer-1, the document writer creates a semantic unit term-based document, which the collector then collects so that a semantic unit term-based index is built (51-01). In the case of document writer-2, the document writer does not pass the semantic unit terms through the collector but indexes them directly (51-02). Direct indexing is convenient when it is difficult to store and keep the annotated documents separately. Normally, a changed document cannot be saved to its original location unless you are the owner of the document. In that situation, the changes are stored directly in the index without storing the changed document.

The information stored in the index can be used at any time to convert a natural language document into a document annotated with semantic unit terms. The existing document commenter annotates the index for documents included in the index. A new document can also be annotated by the existing document commenter if it is written without any semantic unit term annotation and is then included in the index. This is because annotating through the index is more efficient.

FIG. 52 shows the importance of each annotation device stage by stage. The vast majority of existing documents will be annotated by search comments and knowledge base comments, and this stage is complete when most existing documents have been annotated. In the final stage, new documents will be annotated by the document writer with the help of the knowledge base comment. However, it is not known how much annotation document writers will actually do. If the author sees no ambiguity in the text and is unwilling to make the effort of annotating, he or she is likely to leave it as an unannotated natural language document. In that case, the document is annotated by the knowledge base comment after indexing. In the final stage, document comments and knowledge base comments play the major role, but because the document comment part actually calls the knowledge base comment, all annotation is in effect performed automatically by the knowledge base comment.

FIG. 53 is a diagram illustrating the configuration of the search commenter. Parts other than the search commenter are simplified. The document search comment unit is an unavoidable part because existing retrieval systems are structured to search for documents. Once the word search function is added, the document search comment is no longer a necessary device, because comments are attached to specific words rather than to the document.

FIG. 54 shows a procedure for annotating, in the index, a specific semantic unit term on a specific natural language expression for documents found through a search. Because the existing search function is structured to search for documents, this method does not specify which position in the document the natural language expression occupies. A query containing natural language and semantic unit terms is obtained and documents are searched; this step uses the existing search function as it is (54-01). A search annotation request is received together with a list of all or some selected result documents, the natural language expression to be annotated, and the semantic unit term to annotate it with (54-02). For the selected documents, the semantic unit term is annotated on the natural language expression in the search system index, and the position of the natural language expression within the document is not recorded (54-03).
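Step 54-03 can be sketched as follows, assuming a document-level index that maps each document to its annotated expressions. The dictionary layout, the function name `annotate_documents`, and the sample term `bank_2` are hypothetical; the source does not specify a data structure.

```python
# Hypothetical document-level index:
# document id -> {natural language expression: semantic unit term}
doc_index = {"doc-1": {}, "doc-2": {}, "doc-3": {}}


def annotate_documents(index, selected_docs, expression, semantic_term):
    """54-03: record the term once per selected document; no word position is stored."""
    for doc_id in selected_docs:
        index[doc_id][expression] = semantic_term


# 54-02: the request names the selected documents, the expression, and the term.
annotate_documents(doc_index, ["doc-1", "doc-3"], "bank", "bank_2")
```

Because only one entry per document and expression exists, every occurrence of the expression in a selected document is implicitly given the same meaning.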

FIG. 55 shows a procedure for annotating, in the index, a specific semantic unit term on a specific natural language expression for words found through a search. This method identifies the natural language expression at a specific position in a document and, unlike a conventional search function, is structured to search for words. A query containing natural language and semantic unit terms is obtained and words are searched (55-01). A search annotation request is received together with a list of all or some selected result words, the natural language expression to be annotated, and the semantic unit term to annotate it with (55-02). For the selected words, the semantic unit term is annotated on the natural language expression in the search system index, and the position of the natural language expression within the document is explicitly recorded (55-03).
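A word-level variant can be sketched under the assumption that the index is keyed by document and word position. The index layout and the names `search_words` and `annotate_words` are illustrative only.

```python
# Hypothetical word-level index:
# (document id, word position) -> [word, semantic unit term or None]
word_index = {
    ("doc-1", 3): ["bank", None],
    ("doc-1", 9): ["bank", None],
    ("doc-2", 0): ["bank", None],
}


def search_words(index, expression):
    """55-01: return every occurrence (document id, position) of the expression."""
    return sorted(key for key, (word, _term) in index.items() if word == expression)


def annotate_words(index, selected, semantic_term):
    """55-03: annotate only the selected occurrences; each one keeps its position."""
    for key in selected:
        index[key][1] = semantic_term


hits = search_words(word_index, "bank")
annotate_words(word_index, hits[:2], "bank_2")  # 55-02: the user selects two occurrences
```

Unlike the document-level method, an occurrence that was not selected stays unannotated even when it is in the same document as an annotated one.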

FIG. 56 is a diagram illustrating the configuration of the searcher. Parts other than the searcher are simplified. The searcher performs a search with a search query. The I semantic unit term-based searcher includes the I1 document search unit, the I2 word search unit, and the I3 search knowledge management unit, together with a natural language query unit for creating a search query and a semantic unit term query term comment unit. A search comment annotates the words that are found, not the document. To support the search commenter, the searcher has therefore been enhanced with the ability to find words rather than documents. In contrast to a document search, a word search makes clear which words within the found documents are to be listed. In existing natural language search, a search method was not regarded as knowledge. Because the accuracy of natural language search is low, it was difficult to use complex knowledge: combining searches compounds the errors in the result. A semantic unit term-based search, however, can be 100% accurate, so a search can be registered as search knowledge and used in combination with others. Search knowledge is created by registering search experience as knowledge. Both the search commenter and the searcher need a search query, and the query is a target of semantic unit term annotation. The searcher therefore has a natural language query unit and a semantic unit term query term comment unit. In the representative diagram (FIG. 1), the query-related part is not shown as a component.

FIG. 57 shows a search query. Query terms are used by the searcher and by the search commenter of a search system. A natural language search query consists of one or more natural language words and various operators such as and / or, a specific time period, a specific site, a specific classification, and so on (57-01). A unique ID + search query consists of one or more unique ID + terms and various operators such as and / or, a specific time period, a specific site, a specific classification, and so on (57-02).

FIG. 58 shows how the unique ID + search query is interpreted.

FIG. 59 shows a method of creating a semantic unit term-based query term. Because semantic unit terms are difficult to remember and use, the user enters natural language, which is converted into semantic unit terms by a dictionary search. As in the existing query method, natural language is obtained to prepare a query (59-01). The natural language expression to be annotated in the query is selected and a dictionary search request is made (59-02). The item selected from the listed semantic unit terms is obtained and annotated on the natural language expression (59-03). For query terms annotated with a semantic unit term, the natural language / semantic unit term pair is replaced by the semantic unit term alone (59-04).
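The four steps can be sketched as a small query builder. The in-memory `DICTIONARY`, its sample entries, and the `choices` argument that stands in for the user's selection are assumptions for illustration; a real dictionary manager would sit behind `lookup`.

```python
# Hypothetical dictionary: natural language expression -> candidate
# (semantic unit term, description) pairs.
DICTIONARY = {
    "president": [
        ("president_1", "head of state"),
        ("president_2", "head of a company"),
    ],
}


def lookup(expression):
    """59-02: dictionary search for one natural language expression."""
    return DICTIONARY.get(expression.lower(), [])


def build_query(words, choices):
    """59-01 to 59-04. choices maps a word index to the index of the chosen candidate."""
    query = []
    for i, word in enumerate(words):
        candidates = lookup(word)
        if i in choices and candidates:
            term, _description = candidates[choices[i]]  # 59-03: user's selection
            query.append(term)                           # 59-04: keep only the term
        else:
            query.append(word)                           # left as natural language
    return query


query = build_query(["president", "speech"], {0: 0})
```

Words the user does not annotate remain natural language, so a query can mix both kinds of term.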

FIG. 60 shows three ways of displaying search results. In general, a retrieval system is a device for retrieving documents and therefore lists document items (60-01). Document listing makes it difficult to process specific words within a particular document. If a natural language expression is always used in the same sense within a document, this is not a big obstacle to annotation. In practice, document-level annotation is not a major obstacle, because the meaning of each natural language expression in a document can still be annotated. In particular, it does not seriously harm the accuracy of an initial semantic unit term-based retrieval system: the accuracy of natural language retrieval is very low and semantic unit term-based retrieval is far more accurate, so a slight loss of accuracy is not a big problem. In the long run, however, it will clearly be an obstacle to pursuing 100% accuracy, because a document-level comment cannot tell where in a document a natural language expression is located. Word item listing eliminates this problem: the semantic unit term of a natural language expression at a specific position in a specific document can be stated unambiguously (60-02). This is a feature that existing search systems should add. It can be inconvenient, however, when the traditional document listing is needed. The document / word item listing combines the document listing and the word listing (60-03). A word comment does not necessarily process only one word at a time. For example, a search for “President Obama” supports annotating President_1 on "President" and barack_obama_1 on "Obama".

FIG. 61 shows a procedure for searching for words and displaying the results as word-level items. The number of result items equals the number of words found, so the results can be used for word-by-word processing. A search query for finding documents is received together with information on the term (a natural language expression or a semantic unit term) to be found within those documents (61-01). The words found by the word search query are listed and displayed (61-02).

FIG. 62 shows a procedure for searching for words and listing the results word by word under each document. Because the results are organized by document and then by word, they can be used for both document-by-document and word-by-word processing. The search query finds the desired words within the desired documents, displays each document as one item, and displays each found word as an item under its document, so the number of items listed is the number of documents plus the number of words found. A search query for finding documents and a document / word search request are received together with information on the term (a natural language expression or a semantic unit term) to be found within those documents (62-01). The words found by the word search query are listed and displayed word by word under each document (62-02).
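The word listing of FIG. 61 and the document / word listing of FIG. 62 can be contrasted with a small sketch. The `INDEX` rows and the function names are hypothetical, and the document-finding part of the query is omitted for brevity.

```python
from collections import defaultdict

# Hypothetical index rows: (document id, word position, word).
INDEX = [
    ("doc-1", 0, "sea"),
    ("doc-1", 4, "sea"),
    ("doc-2", 2, "sea"),
    ("doc-2", 5, "ship"),
]


def word_listing(term):
    """FIG. 61 style: one result item per matching word occurrence."""
    return [(doc, pos) for doc, pos, word in INDEX if word == term]


def document_word_listing(term):
    """FIG. 62 style: one item per document, with its matching words listed under it."""
    grouped = defaultdict(list)
    for doc, pos in word_listing(term):
        grouped[doc].append(pos)
    return dict(grouped)


words = word_listing("sea")
by_doc = document_word_listing("sea")
item_count = len(by_doc) + len(words)  # documents plus words, as in the document / word listing
```

With this sample data the word listing has three items, while the document / word listing has five: two document items and three word items.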

FIG. 63 shows a procedure for generating and using search knowledge. Existing natural language search was so low in accuracy that a search was unlikely to be reused as knowledge. A semantic unit term-based search, on the other hand, can pursue a 100% accuracy rate. Combining low-accuracy knowledge through operators increases the error rate, but semantic unit term-based knowledge can be used in combination. This procedure provides a means of running search queries, reviewing the results, and registering meaningful search queries as search knowledge for later use. A semantic unit term-based search query is performed and reviewed (63-01). A search knowledge generation request is received along with a search query and its description, a search knowledge ID is generated, and the search knowledge ID, search query, and description are made into search knowledge (63-02) (63-03). The search knowledge is made public (63-04).
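Registration and publication of search knowledge might look like the sketch below. The `SK-n` identifier format, the `public` flag, and the in-memory `SEARCH_KNOWLEDGE` store are assumptions; the source states only that an ID is generated and that the query and description are stored with it.

```python
import itertools

_ids = itertools.count(1)
SEARCH_KNOWLEDGE = {}  # search knowledge id -> record


def register_search_knowledge(query, description):
    """Generate an ID and store the query with its description as search knowledge."""
    knowledge_id = f"SK-{next(_ids)}"  # hypothetical ID format
    SEARCH_KNOWLEDGE[knowledge_id] = {
        "query": query,
        "description": description,
        "public": False,
    }
    return knowledge_id


def publish(knowledge_id):
    """63-04: make the registered search knowledge available to other users."""
    SEARCH_KNOWLEDGE[knowledge_id]["public"] = True


kid = register_search_knowledge(
    "president_1 AND barack_obama_1", "documents about President Obama"
)
publish(kid)
```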

FIG. 64 is a diagram illustrating the configuration of the document information system builder. Parts other than the document information system builder are simplified. The document information system builder builds the document information system using information stored in the index or annotation knowledge.

FIG. 65 shows a natural language document information system and a unique ID + document information system.

The document information system is the entire body of documents, including documents of various types such as Internet documents, company documents, and personal documents. The natural language document information system is a document information system based on the natural language dictionary (65-01), and the unique ID + document information system (65-02) is created based on the unique ID dictionary. Creating a semantic unit term-based document information system is a huge task. The value of converting the document information system is the same as the value of a semantic unit term-based index of a retrieval system that contains all of these documents. Complete annotation knowledge is of the highest value, because it has the added value of being able to put large parts of future documents on a semantic unit term basis. Annotation knowledge cannot be created all at once. Putting the index on a semantic unit term basis is the best way both to put the document information system on that basis and to create annotation knowledge.

FIG. 66 illustrates the construction of a semantic unit term-based document information system using a semantic unit term dictionary, an index, and annotation knowledge. The semantic unit term dictionary is mandatory: without it, neither the semantic unit term index nor the annotation knowledge can be created. The semantic unit term index contains information about what each natural language expression in a document means. Therefore, if the semantic unit term index holds enough information, a semantic unit term-based document information system can be created. Annotation knowledge is knowledge of the form "under certain conditions, a given natural language expression has a given meaning." Therefore, if there is sufficient annotation knowledge, a semantic unit term-based document information system can likewise be created.

FIG. 67 shows the construction of a semantic unit term-based document information system using a semantic unit term dictionary and an index. If the semantic unit term index holds enough information, a semantic unit term-based document information system can be constructed. However, the semantic unit term index gives no information about newly created documents.

FIG. 68 illustrates the construction of a semantic unit term-based document information system using a semantic unit term dictionary and annotation knowledge. If there is sufficient annotation knowledge, a semantic unit term-based document information system can be constructed using annotation knowledge alone, and therefore without the help of a retrieval system. However, this requires more computing power than construction from index information. In general, the index holds more semantic unit term information than the annotation knowledge does.

FIG. 69 illustrates a procedure for putting a document information system such as the Internet on a semantic unit term basis using a search system index in which information for annotating the natural language expressions in each document has accumulated. (The index-based method can be applied only to documents that are among the search targets of the search system.) The semantic unit term annotation information accumulated in the index of the search system is classified by document location to produce the semantic unit term annotation information for each document (69-01). The semantic unit term annotation information for each document is then included in the document collected by the retrieval system (69-02). The documents created by including semantic unit terms are stored, together with their existing document location information, in a separate storage location of the retrieval system (69-03). FIG. 69 is thus a procedure for extracting information from a search system index and constructing a semantic unit term-based document information system.
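The three steps can be sketched as a grouping pass followed by a merge. The index row format, the example URLs, and the representation of an annotated document as (word, term) pairs are assumptions made for this illustration.

```python
from collections import defaultdict

# Hypothetical index rows: (document location, word position, semantic unit term).
INDEX = [
    ("http://example.com/a", 1, "bank_2"),
    ("http://example.com/b", 0, "president_1"),
]
# Hypothetical collected documents: location -> list of words.
DOCUMENTS = {
    "http://example.com/a": ["the", "bank", "flooded"],
    "http://example.com/b": ["president", "spoke"],
}


def build_from_index(index, documents):
    by_doc = defaultdict(dict)
    for location, position, term in index:         # 69-01: classify by document location
        by_doc[location][position] = term
    store = {}                                     # separate storage keyed by original location
    for location, words in documents.items():
        terms = by_doc.get(location, {})
        annotated = [(w, terms.get(i)) for i, w in enumerate(words)]  # 69-02
        store[location] = annotated                # 69-03
    return store


store = build_from_index(INDEX, DOCUMENTS)
```

Keying the separate store by the original location keeps the annotated copy linked to the unmodified original.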

FIG. 70 shows a procedure for putting a document information system such as the Internet on a semantic unit term basis using the annotation knowledge accumulated while annotating natural language expressions with semantic unit terms. Annotation knowledge can be applied without depending on a specific search system, so it is also applicable to documents that are new to a specific search system. The documents in the document information system are collected; a retrieval system is not used, and document collection is performed directly (70-01). For the natural language expressions contained in each document, the corresponding annotation knowledge is retrieved and applied to the expression, so that semantic unit terms are annotated on all natural language expressions in the document (70-02). After annotation of a document is complete, it is stored in a separate storage location together with its existing document location information; repeating these steps makes all documents semantic unit term-based (70-03).
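A minimal sketch of knowledge-based annotation follows. It assumes that each piece of annotation knowledge is a triple of condition words, a natural language expression, and a semantic unit term, and that a condition holds when all of its words appear in the document. The source does not define the matching rule, so this rule and all names are hypothetical.

```python
# Hypothetical annotation knowledge:
# (condition words, natural language expression, semantic unit term)
KNOWLEDGE = [
    ({"river"}, "bank", "bank_1"),
    ({"loan"}, "bank", "bank_2"),
]


def annotate_with_knowledge(words, knowledge):
    """70-02: annotate each expression whose condition words all occur in the document."""
    present = set(words)
    annotated = []
    for word in words:
        term = None
        for condition, expression, semantic_term in knowledge:
            if word == expression and condition <= present:
                term = semantic_term
                break  # first matching rule wins (an assumption)
        annotated.append((word, term))
    return annotated


def build_from_knowledge(documents, knowledge):
    """70-01 and 70-03: process each collected document and store it by its location."""
    return {loc: annotate_with_knowledge(ws, knowledge) for loc, ws in documents.items()}


store = build_from_knowledge({"file:///notes.txt": ["loan", "from", "the", "bank"]}, KNOWLEDGE)
```

Every document is scanned against the knowledge rules, which is consistent with the source's remark that this route needs more computing power than reading annotations from an index.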

FIG. 71 shows a procedure for constructing a semantic unit term-based document information system using both a search system index and annotation knowledge. A document information system such as the Internet is put on a semantic unit term basis by using the search system index for documents that are included in the search system and have sufficient semantic unit term information accumulated in the index, and by using annotation knowledge for new documents or documents outside the search system, which have no information in the index.

The documents belonging to the document information system are collected (71-01). For documents included in the search system, the semantic unit term annotation information accumulated in the index of the search system is classified by document location to produce the semantic unit term annotation information for each document (71-02). The semantic unit term annotation information for each document is included in that document (71-03). The documents created by including semantic unit terms are stored, together with their existing document location information, in a separate storage location of the retrieval system (71-04). For documents not included in the search system, the corresponding annotation knowledge is retrieved for the natural language expressions contained in each document and applied to those expressions to annotate the semantic unit terms (71-05). After annotation of a document is complete, it is stored in a separate storage location together with its existing document location information; repeating these steps makes all documents not included in the search system semantic unit term-based (71-06).

FIG. 72 is a flowchart illustrating a procedure for managing disagreements about the contents of a semantic unit term dictionary item, comment contents, annotation knowledge, default value, and search knowledge by using collective intelligence.

A user who disagrees with the content of a semantic unit term dictionary item, comment content, annotation knowledge, a default value, or search knowledge requests the creation of a discussion together with a discussion topic, and a discussion item is created for the topic (72-01). Participants present and discuss their opinions (72-02). If no consensus is reached in the discussion, a vote is held and the results are tallied (72-03). The results of the discussion and vote are applied (72-04).
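The decision rule in steps 72-02 to 72-04 can be sketched as below. The function name, the plurality rule, and the tie-breaking behaviour are assumptions; the source says only that a vote is held when discussion yields no consensus.

```python
from collections import Counter


def resolve_disagreement(consensus, votes):
    """Return the proposal to apply (72-04).

    consensus: the proposal agreed in discussion, or None if none was reached (72-02).
    votes: a non-empty list of proposals, one per voter (72-03).
    """
    if consensus is not None:
        return consensus
    tally = Counter(votes)
    # Plurality vote (an assumption); ties go to the proposal counted first.
    winner, _count = tally.most_common(1)[0]
    return winner


agreed = resolve_disagreement("keep description", [])
voted = resolve_disagreement(None, ["merge terms", "keep description", "merge terms"])
```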

FIG. 73 illustrates a procedure for merging the original of a search target document with additional information and then storing and using the result. It is a method of storing and using changed document contents when the contents of a search target document of the search system need to be supplemented or changed and the original document cannot be modified directly.

If there is no write permission for the original, the target document is stored in a separate place along with its document address (73-01). The document stored in the separate place is changed (73-02). Upon receiving a request for the changed document by the address of the original document, the changed document is found using the stored original document address and provided (73-03).
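A minimal sketch of such a store, keyed by the original document address, is shown below. The class name, its methods, and the fallback to the original when no changed copy exists are illustrative assumptions.

```python
class ChangedDocumentStore:
    """Keeps changed copies of documents whose originals cannot be modified."""

    def __init__(self):
        self._copies = {}  # original document address -> stored (changed) content

    def save_copy(self, address, content):
        """73-01: store the target document separately, along with its address."""
        self._copies[address] = content

    def change(self, address, new_content):
        """73-02: change the separately stored copy, never the original."""
        if address not in self._copies:
            raise KeyError(address)
        self._copies[address] = new_content

    def get(self, address, fetch_original):
        """73-03: serve the changed copy if one exists, otherwise the original."""
        if address in self._copies:
            return self._copies[address]
        return fetch_original(address)


store = ChangedDocumentStore()
store.save_copy("http://example.com/a", "the bank flooded")
store.change("http://example.com/a", "the bank/bank_1 flooded")
served = store.get("http://example.com/a", lambda addr: "original text")
```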

Converting the world's documents to semantic unit terms is a huge task. However, if repeatedly used words are indexed by word, the amount of work is determined by the number of words and the number of meanings of each word, regardless of how many times each word is used. This reduces the work to a few million items. In addition, semantic unit terms are created by an easy procedure so that the public can participate, and the work is structured so that it can be divided among many users. The vast amount of work is thus reduced to a few million items, each of which is not difficult for an individual user, within a structure for participation and sharing. Users are not forced; each does work that is easy and matches his or her own interests, and the massive task is completed.

Claims

An information system that dynamically generates a separate term for each meaning of every natural language expression and that generates, collects, indexes, annotates, and searches information based on the generated semantic unit terms, the semantic unit term-based information system comprising:

a) a semantic unit term dictionary manager that generates semantic unit terms based on natural language expressions by acquiring a natural language expression and semantic description information, attaches the semantic description information to the generated terms to make semantic unit term dictionary items, and manages the terms by modifying, merging, and deleting them;

b) a semantic unit term-based document generator for finding a list of semantic unit terms from a semantic unit term dictionary for natural language expressions obtained from the document, and annotating the selected semantic unit terms to the natural language expressions to supplement semantically ambiguous natural language expressions;

c) a semantic unit term commenter that creates, modifies, and manages annotation knowledge (commentary terms, natural language expressions to be annotated, semantic unit terms to be annotated), applies this annotation knowledge to target documents and to the index of the information retrieval system, manages a default value applied when there is no information, and converts an existing natural language document into a semantic unit term-based document by using the semantic unit term annotation information included in the index;

d) a document collector for collecting documents to be searched;

e) a semantic unit term-based indexer that indexes natural language and semantic unit terms by adding semantic unit term fields to an existing natural language index, which indexes documents written using only natural language;

f) a semantic unit term-based search commenter that, rather than annotating the document directly, annotates semantic unit terms on the natural language expressions contained in the result documents retrieved by a semantic unit term query;

g) a semantic unit term-based searcher that enables searching with semantic unit terms added to a query in addition to existing natural language; and

h) a semantic unit term-based document information system builder that puts all or part of the documents to be searched on a semantic unit term basis by using annotation knowledge and index information.
A method for a semantic unit term-based information system accompanied by a search system, being a method of using a search system that sorts and indexes global documents by word, finds the various meanings of the sorted words, creates semantic unit terms, and collectively annotates the semantic unit terms on the global documents sorted by word, the method comprising:

a) a semantic unit term-based document creation step of acquiring and annotating semantic unit terms corresponding to some natural language expressions or all natural language expressions in the document;

b) a document collection step of collecting documents to be included in the retrieval system;

c) a semantic unit term-based indexing step that creates a semantic unit term-based index from the collected documents;

d) a semantic unit term-based search step of obtaining a semantic unit term-based query term, searching the semantic unit term-based index, and displaying a result;

e) a semantic unit term generation step of acquiring a term generation request together with a natural language expression and a description of a specific meaning of the expression, generating a semantic unit term, and generating a dictionary item by pairing the term with the obtained description;

f) a semantic unit term search comment step of annotating, in the semantic unit term-based index, the semantic unit term on the natural language expression for the result documents found by searching for a specific meaning of a specific natural language expression with a semantic unit term-based query;

g) an annotation knowledge generation step of acquiring the query terms, natural language expressions, and semantic unit terms used in the search comment step and registering them as annotation knowledge;

h) a knowledge base annotation step of annotating semantic unit terms using annotation knowledge for designated documents such as new documents; and

i) a semantic unit term-based document information system construction step of extracting the semantic unit term annotation information for each document from the semantic unit term-based index and applying it to the corresponding document to make it a semantic unit term-based document, thereby putting the document information system on a semantic unit term basis.
A semantic unit term dictionary manager comprising:

a) a semantic unit term generation unit that obtains natural language expressions and descriptions input by users, for all parts of speech in all languages including proper nouns, and dynamically generates a separate term for each of the various meanings of a natural language expression;

b) a semantic unit term management unit for correcting, merging, and deleting generated terms; and

c) a semantic unit term dictionary search unit that performs a dictionary search function for semantic unit terms.
The semantic unit term dictionary manager according to claim 3, characterized in that the generated term is a term called a "unique ID", which is automatically generated from the natural language representative expression obtained from the user and the meaning of the natural language representative expression, so that if the language and the meaning are the same, there is one semantic unit term.

The semantic unit term dictionary manager according to claim 3, characterized in that the generated term is a term called an "expression meaning ID", which is automatically generated from a natural language expression entered by the user and the meaning serial number of that natural language expression, so that another term is generated when the expression is the same.

The semantic unit term dictionary manager according to claim 3, characterized in that the semantic unit terms are generated in a retrieval system, in which the multiple-meaning problem of natural language is highlighted.

A semantic unit term dictionary created by the dictionary manager of claim 3.
In a method of generating a new term for each meaning when a particular natural language expression has more than one meaning, for all parts of speech in all languages including proper nouns,

a) a term information obtaining step of acquiring a natural language expression for the meaning, a description thereof, and a term generation request, in a situation where no semantic unit term exists for a specific meaning of a specific natural language expression;

b) a term generation step of generating a semantic unit term using the natural language expression and the number of semantic unit terms already generated for that expression (the meaning serial number); and

c) a dictionary item generation step of generating a semantic unit term dictionary item by pairing the generated semantic unit term with the obtained description;

A method of generating semantic unit terms, comprising steps a) to c) above.
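
The term generation and dictionary item steps of this method can be sketched as follows. This is only an illustration under stated assumptions: the claim does not fix a term format, so the `expression#serial` form, the class name `SemanticTermDictionary`, and the method `create_term` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticTermDictionary:
    """Toy dictionary: one entry per (expression, meaning) pair."""
    entries: dict = field(default_factory=dict)   # term -> description
    counters: dict = field(default_factory=dict)  # expression -> terms issued so far

    def create_term(self, expression: str, description: str) -> str:
        # Term generation step: the serial number is the count of terms
        # already generated for this expression, plus one.
        serial = self.counters.get(expression, 0) + 1
        self.counters[expression] = serial
        term = f"{expression}#{serial}"  # hypothetical term format
        # Dictionary item step: pair the generated term with its description.
        self.entries[term] = description
        return term


dictionary = SemanticTermDictionary()
first = dictionary.create_term("bank", "financial institution")
second = dictionary.create_term("bank", "land alongside a river")
print(first, second)  # bank#1 bank#2
```

Because the serial number is derived from a per-expression counter, two meanings of the same expression can never receive the same term.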
In an intuitive method of classifying semantic unit terms, in which the object of classification is the semantic unit term; the classification name to which a semantic unit term belongs is a natural language expression or a semantic unit term; a semantic unit term may have zero or more classification names, which may be added or deleted at any time; a classification name does not need to be defined before it is used in a term, because a classification name that does not yet exist is registered automatically when it is entered during term creation or change; one classification name may belong to zero or more classifications; and, in the case of disagreement, the classification of terms and the hierarchical structure are refined through collective intelligence such as discussion,

a) a semantic unit term classification step of classifying a term by acquiring a classification field value, expressed as a natural language expression or a semantic unit term, from the semantic unit term dictionary when a classification is given for the term during term generation or term change;

b) a semantic unit term search classification step of searching the semantic unit term dictionary, obtaining a list of selected terms and a classification name, and assigning the selected semantic unit terms to that classification;

c) a semantic unit term classification stratification step of acquiring a hierarchy relation setting request for two specific classifications and performing stratification;

d) a semantic unit term reclassification step of obtaining a classification change request and reclassifying the term when the classification of a specific semantic unit term needs to change; and

e) a semantic unit term classification disagreement adjustment step of, when users disagree about the classification of a semantic unit term, obtaining a discussion creation request together with a discussion topic and generating a discussion item, so that users can discuss the matter and reach a conclusion through collective intelligence;

An intuitive method of classifying semantic unit terms and managing their hierarchy, comprising steps a) to e) above.
In a method of applying term aliases to semantic unit terms, in which a specific group or individual creates and uses a term alias for a semantic unit term because semantic unit terms are long and difficult to remember,

a) a term alias registration step of acquiring an alias registration request together with a semantic unit term and a term alias from a specific group or individual, and registering the alias;

b) a term alias introduction step of acquiring a term alias introduction request and a corresponding group name, in order to use the term aliases of a specific group or of the Internet, and including that group's term aliases in the individual's term alias list; and

c) a term alias translation step of translating a term alias when a user inputs the alias where a semantic unit term is expected, in a search query or a document;

A method of using term aliases for semantic unit terms, comprising steps a) to c) above.
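
The alias registration, introduction, and translation steps can be illustrated with a small sketch. Everything named here (`AliasRegistry`, the group name, the `bank#1` term format) is hypothetical, and the rule that an introduced alias never overwrites one the user already has is an assumption rather than something the claim states.

```python
class AliasRegistry:
    """Toy alias store; all names here are hypothetical."""

    def __init__(self):
        self.group_aliases = {}     # group name -> {alias: semantic unit term}
        self.personal_aliases = {}  # user name -> {alias: semantic unit term}

    def register(self, group, alias, term):
        # Registration step: a group (or a single person) registers an alias.
        self.group_aliases.setdefault(group, {})[alias] = term

    def introduce(self, user, group):
        # Introduction step: copy a group's aliases into the user's own list.
        # Assumption: aliases the user already has are not overwritten.
        own = self.personal_aliases.setdefault(user, {})
        for alias, term in self.group_aliases.get(group, {}).items():
            own.setdefault(alias, term)

    def translate(self, user, token):
        # Translation step: replace a known alias by its semantic unit term;
        # any other token passes through unchanged.
        return self.personal_aliases.get(user, {}).get(token, token)


registry = AliasRegistry()
registry.register("finance-team", "bk", "bank#1")
registry.introduce("alice", "finance-team")
query = [registry.translate("alice", t) for t in "bk loan".split()]
print(query)  # ['bank#1', 'loan']
```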
In a method in which, when a specific semantic unit term needs to be subdivided, the term is managed by dividing it into term divisions, and the term divisions are used for annotation and search as if they were subdivided semantic unit terms,

a) a semantic unit term division step of obtaining a term division request, a specific semantic unit term, a term division name to be generated, and a term division description, and generating a term division below the semantic unit term;

b) a hierarchical term division creation step of obtaining a term division request, a specific semantic unit term / (hierarchical) term division name, a term division name to be generated, and a term division description, and generating a hierarchical term division below that term division;

c) a term division based annotation step of obtaining an annotation request, the documents to be annotated, the natural language expression to be annotated, and the semantic unit term / (hierarchical) term division to annotate, and annotating that semantic unit term / hierarchical term division on the corresponding natural language expression in the documents; and

d) a term division based search step of obtaining a search request and a query including a semantic unit term / hierarchical term division, and searching for the corresponding documents;

A method of subdividing semantic unit terms and using the resulting term divisions, comprising steps a) to d) above.
In a method for a semantic unit term-based search system in which, in order to group specific semantic unit terms, the terms are managed in hierarchical groups in tree form and a group name is used in searches like a semantic unit term,

a) a term group generation step of generating a group of semantic unit terms by obtaining a semantic unit term grouping request, a list of semantic unit terms or groups to be grouped, a group name to be generated, and a group description; and

b) a term group use search step of obtaining a search request and a query including a group name, converting the group name into its semantic unit terms, and searching for the corresponding documents;

A method of grouping semantic unit terms and using the groups, comprising steps a) and b) above.
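
As an illustration of the term group use search step, the sketch below expands a group name into the semantic unit terms beneath it before the query is run. The group data, the function names, and the `OR` query syntax are assumptions made for the example; the claim only requires that a group name be converted into semantic unit terms.

```python
def expand(name, groups, seen=None):
    """Return the semantic unit terms under a group name (depth-first)."""
    seen = set() if seen is None else seen
    if name not in groups:
        return [name]          # not a group: already a plain term
    if name in seen:
        return []              # guard against cyclic group definitions
    seen.add(name)
    terms = []
    for member in groups[name]:
        for term in expand(member, groups, seen):
            if term not in terms:
                terms.append(term)
    return terms


def convert_query(tokens, groups):
    """Replace each group name in the query by its member terms."""
    converted = []
    for token in tokens:
        terms = expand(token, groups)
        if len(terms) == 1:
            converted.append(terms[0])
        elif terms:
            converted.append("(" + " OR ".join(terms) + ")")  # hypothetical syntax
    return " ".join(converted)


# Hypothetical two-level group tree: "fruit" contains the subgroup "citrus".
groups = {"fruit": ["apple#1", "citrus"], "citrus": ["orange#2", "lemon#1"]}
print(convert_query(["fruit", "juice"], groups))
# (apple#1 OR orange#2 OR lemon#1) juice
```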
a) an annotation knowledge management unit for creating, modifying, and deleting annotation knowledge (an annotation condition, the natural language expression to be annotated, and the semantic unit term to annotate);

b) a default value management unit that manages the default semantic unit term of a natural language expression, applied when there is no annotation knowledge;

c) a knowledge-based annotation unit that applies annotation knowledge to actual documents, to the index of the information retrieval system, and to search queries, and applies semantic unit term default values where there is no annotation knowledge; and

d) a semantic unit term dictionary manager that, when a natural language expression has various meanings, dynamically generates a separate term for each meaning using natural language expressions and descriptions input by the user, and modifies, merges, and deletes the generated terms;

A semantic unit term annotator comprising units a) to d) above.
The semantic unit term annotator according to claim 13,

further comprising an index-based document annotation unit that adds the semantic unit term information accumulated in an index to the corresponding natural language documents to produce semantic unit term-based documents.
An index-based document annotation unit that adds the semantic unit term information accumulated in the index to documents to produce semantic unit term-based documents; and

a semantic unit term dictionary manager that, when a natural language expression has various meanings, dynamically generates a separate term for each meaning using natural language expressions and descriptions input by the user, and modifies, merges, and deletes the generated terms;

A semantic unit term annotator comprising the above units.
An annotation database created by the annotator of claim 13.
In a method of annotating semantic unit terms, using annotation knowledge or default values, on natural language expressions in new documents before they are collected and indexed, or in search queries,

a) a knowledge-based annotation request receiving step of obtaining the natural language expression to be annotated and a knowledge-based annotation request;

b) an annotation knowledge search step of searching an annotation knowledge DB for the natural language expression to find annotation knowledge to apply;

c) an annotation knowledge applying step of applying the annotation knowledge to the natural language expression; and

d) a default value applying step of applying the semantic unit term default value when there is no annotation knowledge and the use of default values is enabled;

A knowledge-based document annotation method comprising steps a) to d) above.
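
The fallback order of this method (annotation knowledge first, default value second) can be sketched as below. The rule structure, a condition function plus a term, is a hypothetical representation of annotation knowledge; the claim does not specify how conditions are stored or evaluated.

```python
def annotate(expression, context, knowledge, defaults, use_defaults=True):
    """Pick a semantic unit term for one natural language expression."""
    # Search the annotation knowledge for this expression and apply the
    # first rule whose condition holds in the given context.
    for rule in knowledge.get(expression, []):
        if rule["condition"](context):
            return rule["term"]
    # No applicable knowledge: fall back to the default value,
    # but only if the use of default values is enabled.
    if use_defaults:
        return defaults.get(expression)
    return None


# Hypothetical annotation knowledge: "jaguar" near "engine" means the car.
knowledge = {
    "jaguar": [{"condition": lambda ctx: "engine" in ctx, "term": "jaguar#2"}],
}
defaults = {"jaguar": "jaguar#1"}  # hypothetical default: the animal

car = annotate("jaguar", "the jaguar engine roared", knowledge, defaults)
animal = annotate("jaguar", "a jaguar in the forest", knowledge, defaults)
print(car, animal)  # jaguar#2 jaguar#1
```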
In a method of annotating, in an index, a specific semantic unit term on a specific natural language expression under certain conditions, or of using specific annotation knowledge for a particular object,

a) an annotation knowledge execution request receiving step of obtaining an annotation knowledge execution request together with an annotation knowledge ID and change elements (a specified period, a designated object, etc.);

b) an annotation knowledge transformation step of retrieving the annotation knowledge by its annotation knowledge ID and modifying it to reflect the change elements;

c) an annotation knowledge index search step of finding the corresponding index items by executing the modified annotation knowledge; and

d) an annotation knowledge index annotation step of annotating the semantic unit term contained in the annotation knowledge on the index items found;

A knowledge-based index annotation method comprising steps a) to d) above.
In a method of annotating semantic unit terms on natural language expressions in a document using search system index information, and of using annotation knowledge or default values when the index does not hold enough information,

a) an index-based annotation step of extracting the semantic unit term information accumulated in a search system index and annotating the natural language expressions of the corresponding document;

b) an annotation knowledge applying step of applying the annotation knowledge for a natural language expression if no semantic unit term was annotated on it in the index-based annotation step; and

c) a default value applying step of applying the semantic unit term default value when there is no corresponding annotation knowledge in the annotation knowledge applying step;

An index- and knowledge-based document annotation method comprising steps a) to c) above.
In an annotation knowledge generation method of verifying annotation knowledge through search and then registering it as annotation knowledge,

a) a search step of obtaining a search query that uses constructs allowed by the search query grammar, such as natural language / semantic unit term expressions, operators, periods, sites, fields, and categories, and performing the search;

b) an annotation knowledge generation request receiving step of displaying the search results and, after user review, obtaining an annotation knowledge generation request together with the verified search query, the natural language expression to be annotated, the semantic unit term to annotate, and a description of the annotation knowledge; and

c) an annotation knowledge generation step of generating annotation knowledge that includes the verified search query, the natural language expression to be annotated, and the semantic unit term to annotate, generating an annotation knowledge ID, and creating an annotation knowledge item from the annotation knowledge, the annotation knowledge ID, and the description;

An annotation knowledge generation method comprising steps a) to c) above.
In a method of determining the default value of each group and the priority among groups when default values are applied,

a) a group default value determination step in which each group records how often each semantic unit term is used for each natural language expression and sets the most frequently used semantic unit term as the group's default value for that expression;

b) a personal default value applying step of designating, for a specific natural language expression, the semantic unit term that is the individual's default value, when a search query is being written or the owner of the document is known;

c) a group default value applying step of designating, when no default value exists in the personal default value applying step and the group (field) of the document is specified, the semantic unit term that is the group's default value for the natural language expression, giving priority to smaller groups; and

d) an Internet default value applying step of designating the semantic unit term that is the Internet-wide default value for the natural language expression, when no corresponding default value exists in the group default value applying step;

A method of determining semantic unit term default values for natural language expressions, comprising steps a) to d) above.
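
The priority order described in this method (personal default, then group defaults with smaller groups first, then the Internet-wide default) can be sketched as follows. The data layout is hypothetical, and using a numeric `size` field to decide which group is "smaller" is an assumption made for the example.

```python
from collections import Counter


def group_default(usage):
    """Most frequently used term in a usage Counter, or None if empty."""
    return usage.most_common(1)[0][0] if usage else None


def resolve_default(expression, personal, groups, internet):
    # 1. A personal default, if one exists, always wins.
    if expression in personal:
        return personal[expression]
    # 2. Otherwise consult group defaults, smaller groups first.
    for group in sorted(groups, key=lambda g: g["size"]):
        term = group_default(group["usage"].get(expression, Counter()))
        if term is not None:
            return term
    # 3. Finally fall back to the Internet-wide default.
    return internet.get(expression)


# Hypothetical usage counts recorded per group and expression.
groups = [
    {"size": 1000, "usage": {"java": Counter({"java#1": 5, "java#2": 1})}},
    {"size": 20, "usage": {"java": Counter({"java#2": 7})}},
]
print(resolve_default("java", {}, groups, {}))  # java#2
```

Here the 20-member group is consulted before the 1000-member group, so its most frequent term is returned even though the larger group prefers another one.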
a) a natural language writing unit for writing sentences in natural language;

b) a semantic unit term document annotation unit that annotates semantic unit terms on the written natural language expressions; and

c) a semantic unit term dictionary manager that, when a natural language expression has various meanings, acquires natural language expressions and descriptions input by the user, dynamically generates a separate term for each meaning, and modifies, merges, and deletes the generated terms;

A semantic unit term-based document generator comprising units a) to c) above.
The semantic unit term-based document generator according to claim 22,

further comprising a semantic unit term annotator that assists annotation work with annotation knowledge and semantic unit term default values.
In a method of writing a document using natural language and semantic unit terms,

a) a natural language document creation step of creating a document in natural language;

b) a knowledge-based annotation step of annotating a semantic unit term on each natural language expression by applying the annotator's annotation knowledge and semantic unit term default values to the natural language sentences;

c) an annotation change request step of obtaining an annotation change request and the natural language expression whose semantic unit term is to be changed, and displaying a list of semantic unit terms for that expression;

d) a semantic unit term selection annotation step of obtaining the semantic unit term selected from the displayed list and annotating it on the natural language expression; and

e) a semantic unit term generation annotation step of, when the displayed list contains no suitable semantic unit term, obtaining a semantic unit term generation request, a natural language expression, and a description, and generating and annotating a new semantic unit term;

A method of writing a semantic unit term-based document, comprising steps a) to e) above.
a) a document collector for collecting documents to be searched;

b) a semantic unit term-based indexer that indexes both natural language and semantic unit terms by adding a semantic unit term field to an existing natural language index, which indexes documents written using natural language only;

c) a semantic unit term-based searcher that searches with semantic unit terms in addition to existing natural language query terms; and

d) a semantic unit term dictionary manager that, when a natural language expression has various meanings, dynamically generates a separate term for each meaning using natural language expressions and descriptions input by the user, and modifies, merges, and deletes the generated terms;

A semantic unit term-based search system comprising units a) to d) above.
The semantic unit term-based search system according to claim 25,

further comprising a semantic unit term-based search annotator that annotates semantic unit terms on the natural language expressions contained in the result documents retrieved by a query (annotating the index rather than the documents themselves).
The semantic unit term-based search system according to claim 25,

further comprising a semantic unit term annotator that assists annotation work with annotation knowledge and semantic unit term default values.
The semantic unit term-based search system according to claim 25,

further comprising a semantic unit term-based search annotator that annotates semantic unit terms on the natural language expressions contained in the result documents retrieved by a query (annotating the index rather than the documents themselves); and

a semantic unit term annotator that assists annotation work with annotation knowledge and semantic unit term default values.
In a meta-search system that builds its own semantic unit term-based index using information obtained from external search systems, without collecting and indexing documents directly,

a) a natural language-based search annotator that retrieves results from an external natural language retrieval system and stores natural language / semantic unit index information in its own semantic unit term index;

b) a semantic unit term-based searcher that searches its own semantic unit term-based index using semantic unit terms in addition to existing natural language query terms;

c) a semantic unit term-based search annotator that annotates semantic unit terms on the natural language expressions contained in the result documents retrieved by a query (annotating the index rather than the documents themselves);

d) a semantic unit term annotator that assists annotation work with annotation knowledge and default values; and

e) a semantic unit term dictionary manager that, when a natural language expression has various meanings, dynamically generates a separate term for each meaning using natural language expressions and descriptions input by the user, and corrects, merges, and deletes the generated terms;

A meta-search type semantic unit term-based search system comprising units a) to e) above.
In a method in which, for every index word of a search system serving the countries and languages of the world, a separate term is generated for each meaning when the index word has more than one meaning, so that index words are separated into semantic units and information ambiguity is removed,

a) a semantic unit term-based document collection step of collecting the documents targeted by a retrieval system;

b) a semantic unit term-based indexing step of indexing both natural language and semantic unit terms by adding a semantic unit term field to an existing natural language index, which indexes documents written using only natural language; and

c) a semantic unit term-based search step of searching the natural language expressions and semantic unit terms stored in the index with a query that includes semantic unit terms and natural language expressions;

A semantic unit term-based search method comprising steps a) to c) above.
In a method in which, for every index word of a search system serving the countries and languages of the world, a separate term is generated for each meaning when the index word has more than one meaning, so that index words are separated into semantic units and information ambiguity is removed,

a) a semantic unit term-based document collection step of collecting the documents targeted by a retrieval system;

b) a semantic unit term-based indexing step of indexing both natural language and semantic unit terms by adding a semantic unit term field to an existing natural language index, which indexes documents written using only natural language;

c) a search annotation step of obtaining a search annotation request, a query for finding the annotation targets, the natural language expression to be annotated, and the semantic unit term to annotate, and annotating the semantic unit term, on the search system index, on the natural language expression contained in the search results of the query; and

d) a semantic unit term-based search step of searching the natural language expressions and semantic unit terms stored in the index with a query that includes semantic unit terms and natural language expressions;

A semantic unit term-based search method comprising steps a) to d) above.
In a method in which, for every index word of a search system serving the countries and languages of the world, a separate term is generated for each meaning when the index word has more than one meaning, so that index words are separated into semantic units and information ambiguity is removed,

a) a semantic unit term-based document collection step of collecting the documents targeted by a retrieval system;

b) a semantic unit term-based indexing step of indexing both natural language and semantic unit terms by adding a semantic unit term field to an existing natural language index, which indexes documents written using only natural language;

c) an annotation knowledge execution step of annotating semantic unit terms on natural language expressions using annotation knowledge, which holds the information that a certain natural language expression has a certain meaning under specific conditions; and

d) a semantic unit term-based search step of searching the natural language expressions and semantic unit terms stored in the index with a query that includes semantic unit terms and natural language expressions;

A semantic unit term-based search method comprising steps a) to d) above.
A semantic unit term-based indexer comprising a semantic unit term indexing unit that indexes both natural language and semantic unit terms by adding a semantic unit term field to an existing natural language index, which indexes documents written using only natural language.
The semantic unit term-based indexer according to claim 33,

wherein the added semantic unit term field is a term field called a “unique ID”, which is automatically generated from the natural language representative expression input by the user and the meaning serial number of that representative expression.
The semantic unit term-based indexer according to claim 33,

wherein the added semantic unit term field is a term field called an “expression meaning ID”, which is automatically generated from the natural language expression input by the user and the meaning serial number of that expression.
In a method by which a search system indexes, by natural language / semantic unit terms, the documents that a document collector has gathered in its repository,

a) a natural language indexing step of creating a search system index item for each word of a document, with the semantic unit term field left blank; and

b) a semantic unit term index generation step of recording a semantic unit term in the semantic unit term field of a word's index item when that semantic unit term is annotated on the word;

A semantic unit term-based indexing method comprising steps a) and b) above.
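
The two indexing steps can be illustrated with a minimal sketch: index items are first created with an empty semantic unit term field, which is filled in later when a word is annotated. The item layout (`doc`, `pos`, `word`, `term`) and the whitespace tokenization are assumptions made for the example.

```python
def build_index(doc_id, text):
    """Natural language indexing: one item per word, term field left blank."""
    return [
        {"doc": doc_id, "pos": pos, "word": word, "term": None}
        for pos, word in enumerate(text.split())
    ]


def record_annotation(index, doc_id, pos, term):
    """Fill the semantic unit term field of the matching index item."""
    for item in index:
        if item["doc"] == doc_id and item["pos"] == pos:
            item["term"] = term
            return True
    return False  # no such word position in the index


index = build_index("d1", "deposit at the bank")
record_annotation(index, "d1", 3, "bank#1")  # hypothetical term for "bank"
print(index[3])
# {'doc': 'd1', 'pos': 3, 'word': 'bank', 'term': 'bank#1'}
```

Keeping the field blank until an annotation arrives means an existing natural language index can be extended without re-indexing documents.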
In an apparatus for annotating, on a search system index, a specific semantic unit term on a specific natural language expression for documents found through a search,

a) a document search annotation unit that annotates a semantic unit term on the search system index for a particular natural language expression across a plurality of documents found through a search, without specifying which occurrence of the natural language expression in each document is annotated; and

b) a word search annotation unit that annotates semantic unit terms on the search system index, including position information within the document, for each natural language expression to be annotated in the documents found by the search;

A semantic unit term-based search annotator comprising units a) and b) above.
In an apparatus that processes word units rather than document units, annotating on a search system index a specific semantic unit term on a specific natural language expression for words found through a search,

a) a word search unit that searches for words rather than documents, making explicit which words are to be annotated; and

b) a word annotation unit that specifies the document and the position within the document for each word found in the search, and annotates it on the search system index;

A semantic unit term-based search annotator comprising units a) and b) above.
In a search annotation method for an index, characterized by annotating a specific semantic unit term in the index for documents found through a search, without specifying the position of the natural language expression within each document,

a) a semantic unit term-based document retrieval step of retrieving documents by obtaining a query including natural language and semantic unit terms;

b) a document search annotation request receiving step of obtaining a list of all or some selected documents from the search results, the natural language expression to be annotated, and the semantic unit term to annotate; and

c) a document search annotation step of annotating the semantic unit term for the natural language expression on the search system index for the selected documents, without recording the position of the natural language expression within the document;

A document search annotation method comprising steps a) to c) above.
In a search annotation method for an index, characterized by annotating a specific semantic unit term in the index on a specific natural language expression for words found through a search, and by specifying the position of the natural language expression within the document,

a) a semantic unit term-based word retrieval step of searching for words by obtaining a query including natural language and semantic unit terms;

b) a word search annotation request receiving step of obtaining a list of all or some selected words from the search results, the natural language expression to be annotated, and the semantic unit term to annotate; and

c) a word search annotation step of annotating the semantic unit term on the search system index for the natural language expression of the selected words, specifying the position of the natural language expression within the document;

A word search annotation method comprising steps a) to c) above.
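
The difference between the document search annotation method and the word search annotation method is whether a position within the document is recorded. A minimal sketch, assuming a hypothetical index that maps each document ID to a list of annotation entries:

```python
def annotate_documents(index, doc_ids, expression, term):
    """Document search annotation: one entry per document, no position."""
    for doc_id in doc_ids:
        index.setdefault(doc_id, []).append(
            {"expression": expression, "term": term, "pos": None})


def annotate_words(index, hits, expression, term):
    """Word search annotation: one entry per occurrence, with position."""
    for doc_id, pos in hits:
        index.setdefault(doc_id, []).append(
            {"expression": expression, "term": term, "pos": pos})


index = {}
# Documents d1 and d2 were selected from a document search result.
annotate_documents(index, ["d1", "d2"], "bank", "bank#1")
# Two occurrences in d3 were selected from a word search result.
annotate_words(index, [("d3", 4), ("d3", 9)], "bank", "bank#2")
print(index["d1"][0]["pos"], [e["pos"] for e in index["d3"]])  # None [4, 9]
```

A document-level entry applies to every occurrence of the expression in that document, while word-level entries can assign different terms to different occurrences.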
In a search system that finds a desired document through a query,

a) a semantic unit term-based document search unit, for which the object to be found is a document satisfying the search query and results are displayed in document units;

b) a semantic unit term-based word search unit, for which the object to be found is a word satisfying the search query and results are displayed in word units, so that several words from one document appear as several items; and

c) a semantic unit term-based search knowledge management unit for generating and managing knowledge used for searching;

A semantic unit term-based searcher comprising units a) to c) above.
In a method of writing a query based on semantic unit terms, in order to overcome the ambiguity of natural language in a search system that finds a desired document through a query,

a) a natural language query creation step of creating a query by obtaining natural language input, as in a conventional query method;

b) a dictionary search step of obtaining a request to annotate a natural language expression in the query, searching the semantic unit term dictionary, and listing the corresponding semantic unit terms;

c) a semantic unit term annotation step of obtaining the item selected from the listed semantic unit terms and annotating it on the natural language expression; and

d) a query modification step of replacing each natural language / semantic unit term pair in the annotated query with the pure semantic unit term;

A method of writing a semantic unit term-based query, comprising steps a) to d) above.
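
The final query modification step, in which an annotated natural language / semantic unit term pair is reduced to the pure semantic unit term, can be sketched as follows. The `natural/semantic` pair syntax and the function names are hypothetical; the claim does not say how annotated pairs are written in a query.

```python
def parse_token(token):
    """Split a hypothetical 'natural/semantic' pair, e.g. 'bank/bank#1'."""
    natural, _, semantic = token.partition("/")
    return natural, semantic or None


def to_pure_semantic_query(query):
    """Keep the semantic unit term where a token was annotated."""
    tokens = []
    for token in query.split():
        natural, semantic = parse_token(token)
        # Annotated tokens become the semantic unit term alone;
        # unannotated tokens stay as natural language.
        tokens.append(semantic if semantic else natural)
    return " ".join(tokens)


print(to_pure_semantic_query("bank/bank#1 loan"))  # bank#1 loan
```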
In a word search method in which the number of search result items equals the number of words found, so that the results can be used for word-by-word processing, and in which a word search query finds the desired words and displays the results in word units,

a) a word search request receiving step of obtaining a search query for finding documents, a term (a natural language expression or a semantic unit term) to be searched for within the documents found, and a word search request; and

b) a word search result display step of listing the occurrences of the term to be searched for within the documents found by the word search query;

A semantic unit term-based word search method comprising steps a) and b) above.
In a search method in which search results are organized by document and, within each document, by word, so that the results can be used for both document-by-document and word-by-word processing, and in which the search query finds the desired words within the desired documents, displays each document as one item, displays each word found in that document as a further item, and thus lists as many items as the sum of the documents and the words,

a) a document / word search request receiving step of obtaining a search query for finding documents, a term (a natural language expression or a semantic unit term) to be searched for within the documents found, and a document / word search request; and

b) a document / word search result display step of displaying the documents found by the query in document units and, for each document, the word search results for each term;

A semantic unit term-based document / word search method comprising steps a) and b) above.
In a method in which a search query is executed, its results are reviewed, and a meaningful search query is registered and used as search knowledge,

a) a semantic unit term-based search query review step of obtaining and executing a semantic unit term-based search query and displaying the search results for user review;

b) a search knowledge generation step of obtaining the search query and its description, and generating search knowledge and a search knowledge ID;

c) a search knowledge disclosure request receiving step of obtaining a disclosure request when the user who created the search knowledge wishes others to use it; and

d) a search knowledge disclosure step of providing a list of search knowledge so that the search knowledge can be used;

A method of creating and using semantic unit term-based search knowledge, comprising steps a) to d) above.
In a device for extracting semantic unit term annotation information from the semantic unit term-based index of a retrieval system and applying it to the documents of an information system, so that all of its documents become semantic unit term-based documents,

a) a semantic unit term-based retrieval system for accumulating semantic unit term annotations in a semantic unit term-based index; and

b) an index-based document information system construction unit that extracts the semantic unit term information stored in the semantic unit term-based index, sorts it by document, and applies it to each document to produce a semantic unit term-based document;

A device for building a semantic unit term-based document information system, comprising units a) and b) above.
The device according to claim 46,

further comprising an annotation knowledge-based document information system construction unit that produces semantic unit term-based documents by annotating semantic unit terms on the natural language expressions to be annotated in the target documents, using annotation knowledge and default values.
In a device for turning a global document information system or a specific document information system into a semantic unit term-based document information system,

a) a semantic unit term dictionary manager that obtains natural language expressions and meaning descriptions, generates semantic unit terms from the natural language expressions, attaches the meaning descriptions to the generated terms to make semantic unit term dictionary items, and manages the terms by modifying, merging, and deleting them; and

b) an annotation knowledge-based document information system construction unit that produces semantic unit term-based documents by annotating semantic unit terms on the natural language expressions to be annotated in the target documents, using annotation knowledge and default values;

A device for building a semantic unit term-based document information system, comprising units a) and b) above.
In a method of building a document information system such as the Internet on the basis of semantic unit terms, using a search system index that accumulates annotations of the natural language expressions in each document with semantic unit terms,

a) a document annotation information generation step of generating semantic unit term annotation information for each document by sorting the semantic unit term annotation information accumulated in the search system index by document and position;

b) a document annotation step of creating a new document by including the document's semantic unit term annotation information in each document collected by the retrieval system; and

c) a semantic unit term document storing step of storing the documents that now include semantic unit terms, together with the location information of the existing documents, in a separate storage area of the retrieval system;

A method of building a semantic unit term-based document information system using a search system index, comprising steps a) to c) above.
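
The first two steps of this method, sorting the annotations accumulated in the index by document and position and then writing them into a new copy of each document, can be sketched as below. The index item layout and the inline `word[term]` notation are assumptions made for the example; the claim does not prescribe how annotations are embedded in a document.

```python
from collections import defaultdict


def annotations_by_document(index_items):
    """Group annotated index items per document, sorted by word position."""
    per_doc = defaultdict(list)
    for item in index_items:
        if item["term"] is not None:  # skip words that were never annotated
            per_doc[item["doc"]].append((item["pos"], item["term"]))
    return {doc: sorted(pairs) for doc, pairs in per_doc.items()}


def annotate_document(text, pairs):
    """Create a new document text with terms attached to annotated words."""
    words = text.split()
    for pos, term in pairs:
        words[pos] = f"{words[pos]}[{term}]"  # hypothetical inline notation
    return " ".join(words)


# Hypothetical index items accumulated by a search system.
index_items = [
    {"doc": "d1", "pos": 3, "term": "bank#2"},
    {"doc": "d1", "pos": 0, "term": None},
    {"doc": "d1", "pos": 1, "term": "river#1"},
]
per_doc = annotations_by_document(index_items)
print(annotate_document("the river near bank", per_doc["d1"]))
# the river[river#1] near bank[bank#2]
```

The original text is left untouched and a new string is returned, which matches the idea of storing the annotated documents separately from the existing ones.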
A method of constructing a semantic unit term-based document information system using annotation knowledge, in which a document information system such as the Internet is converted to a semantic unit term basis using annotation knowledge, that is, accumulated knowledge for annotating natural language expressions with semantic unit terms, the method comprising:

a) a document collection step of collecting the documents belonging to the document information system;

b) an annotation knowledge document application step of annotating all natural language expressions in a document with semantic unit terms by retrieving the annotation knowledge corresponding to the natural language expressions contained in each document and applying the retrieved annotation knowledge to those expressions; and

c) an annotation knowledge document information system application step of converting all documents into semantic unit term-based documents by repeating, for each document, the step of storing the annotated document together with its existing document location information in a separate storage location once annotation of that document is complete.
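By way of example only, applying annotation knowledge to collected documents could look like the following Python sketch. The representation of annotation knowledge as cue-word rules, the use of the whole document as context, and the names `ANNOTATION_KNOWLEDGE` and `DEFAULTS` are assumptions for illustration; the claim does not prescribe a rule format.

```python
import re

# Hypothetical annotation knowledge: expression -> list of (cue words, term).
ANNOTATION_KNOWLEDGE = {
    "bank": [({"river", "shore"}, "bank#2"), ({"loan", "money"}, "bank#1")],
}
# Hypothetical default values, used when no rule matches.
DEFAULTS = {"bank": "bank#1"}


def annotate_document(text, knowledge, defaults):
    """Step b: look up knowledge for each expression and apply it."""
    context = set(re.findall(r"[a-z]+", text.lower()))

    def annotate_word(match):
        word = match.group(0)
        key = word.lower()
        for cues, term in knowledge.get(key, []):
            if cues & context:  # first rule whose cue words appear in the document
                return f"{word}[{term}]"
        if key in defaults:
            return f"{word}[{defaults[key]}]"
        return word  # no knowledge for this expression: leave it unannotated

    return re.sub(r"[A-Za-z]+", annotate_word, text)


def convert_all(documents, knowledge, defaults):
    """Steps a and c: visit every collected document and store the annotated
    copy in a separate mapping keyed by the existing document address."""
    return {
        address: annotate_document(text, knowledge, defaults)
        for address, text in documents.items()
    }
```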
A method of constructing a semantic unit term-based document information system using a search system index and annotation knowledge, in which a document information system such as the Internet is converted to a semantic unit term basis by using the search system index for documents that are included in the search system and have sufficient semantic unit term information accumulated in the index, and by using annotation knowledge for new documents or documents outside the search system that have no information in the index, the method comprising:

a) a document collection step of collecting the documents belonging to the document information system;

b) a document annotation information generation step of generating semantic unit term annotation information for each document included in the search system by classifying, by document location, the semantic unit term annotation information accumulated in the index of the search system;

c) a document annotation step of creating a new document by adding the semantic unit term annotation information for a document to each document included in the search system;

d) a semantic unit term document storing step of storing each document created with the semantic unit terms, together with its existing document location information, in a separate storage location of the search system;

e) an annotation knowledge document application step of, for every document not included in the search system, retrieving the annotation knowledge corresponding to the natural language expressions contained in the document and applying the retrieved annotation knowledge to those expressions, thereby annotating the expressions with semantic unit terms; and

f) an annotation knowledge document information system application step of converting all documents not included in the search system into semantic unit term-based documents by repeating, for each such document, the step of storing the annotated document together with its existing document location information in a separate storage location once annotation is complete.
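Purely as an illustrative sketch, the split between documents handled through the index (steps b to d) and documents handled through annotation knowledge (steps e and f) could be expressed in Python as follows. The function name and the `min_annotations` threshold are assumptions; the claim does not define what counts as sufficient information in the index.

```python
def route_documents(addresses, index_annotation_counts, min_annotations=1):
    """Decide, per document, which conversion path applies.

    index_annotation_counts maps a document address to the number of
    semantic unit term annotations the index holds for it (hypothetical).
    """
    from_index, from_knowledge = [], []
    for address in addresses:
        if index_annotation_counts.get(address, 0) >= min_annotations:
            from_index.append(address)      # steps b) to d)
        else:
            from_knowledge.append(address)  # steps e) and f)
    return from_index, from_knowledge
```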
A method of resolving disagreements about semantic unit terms, which manages disagreements about the contents of semantic unit term dictionary entries, annotation contents, annotation knowledge, default values, and search knowledge created in a semantic unit term-based information system, the method comprising:

a) a discussion creation step of acquiring a discussion creation request and a discussion topic from a user who disagrees with a semantic unit term dictionary entry, annotation content, annotation knowledge, a default value, or search knowledge, and creating a discussion on that topic;

b) a discussion step of acquiring, storing, and displaying each opinion;

c) a voting step of, if consensus is not reached in the discussion, acquiring a voting request, activating the voting function, and tallying the votes; and

d) a step of obtaining a conclusion from the discussion and voting and applying the result to the semantic unit term dictionary entry, annotation content, annotation knowledge, default value, or search knowledge.
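As a non-limiting sketch, the discussion and voting steps could be modeled in Python as follows. The `Discussion` class, the one-vote-per-user rule, and the use of a simple plurality with ties left unresolved are assumptions for illustration; the claim does not specify how votes are tallied.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Discussion:
    topic: str                                    # step a: topic under dispute
    opinions: list = field(default_factory=list)  # step b: (user, text) pairs
    votes: dict = field(default_factory=dict)     # step c: user -> chosen option
    voting_open: bool = False

    def add_opinion(self, user, text):
        self.opinions.append((user, text))

    def open_voting(self):
        # Activated on request when the discussion reaches no consensus.
        self.voting_open = True

    def vote(self, user, option):
        if not self.voting_open:
            raise RuntimeError("voting has not been activated")
        self.votes[user] = option  # a later vote by the same user replaces the earlier one

    def conclude(self):
        """Return the plurality winner, or None if there are no votes or a tie."""
        ranked = Counter(self.votes.values()).most_common(2)
        if not ranked:
            return None
        if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
            return None
        return ranked[0][0]


def apply_conclusion(entries, key, discussion):
    """Step d: write the agreed result back to the disputed item."""
    result = discussion.conclude()
    if result is not None:
        entries[key] = result
    return result
```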
A method of storing and using a search target document that combines the original document with added information, for storing and using changed document content in a semantic unit term-based information system when the content of a search target document must be supplemented or changed, for example to annotate it, but the original document cannot be modified directly, the method comprising:

a) a document and address storing step of acquiring a change request, the target document, and the document address, and storing the content and the address of the original document in a separate location;

b) a document content changing step of acquiring a content change request and the changed content, and changing and storing the content of the document; and

c) a changed document use step of receiving a request for the changed document at the address of the original document, and finding and returning the changed document using the stored original document address.
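By way of illustration only, steps a) to c) could be sketched in Python as an in-memory store keyed by the original document address. The class and method names are hypothetical, and returning the original content when no change has been stored is an assumption of this sketch.

```python
class ChangedDocumentStore:
    """Keeps changed copies of documents whose originals cannot be edited."""

    def __init__(self):
        self.originals = {}  # original address -> original content
        self.changed = {}    # original address -> changed content

    def register(self, address, original_content):
        # Step a: store the original content and its address separately.
        self.originals[address] = original_content

    def change(self, address, new_content):
        # Step b: store the changed content; the original is never overwritten.
        if address not in self.originals:
            raise KeyError(address)
        self.changed[address] = new_content

    def fetch(self, address):
        # Step c: look up by the original address and return the changed copy.
        # Falling back to the original (or None) is an assumption of this sketch.
        return self.changed.get(address, self.originals.get(address))
```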
A method of converting a document information system to a semantic unit term basis using a word index, in which a global document information system or a specific document information system is efficiently converted to a semantic unit term basis by sorting all documents of the document information system by word and annotating semantic unit terms for all occurrences of a word at the same time, the method comprising:

a) creating a word-by-word index that sorts all documents of the document information system by word;

b) classifying the set of occurrences of a specific word in the index by meaning;

c) generating a semantic unit term for each meaning of the word;

d) annotating each classified group of word occurrences with the corresponding semantic unit term; and

e) annotating the documents with semantic unit terms using the semantic unit terms annotated to each individual word occurrence and the document index information.
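As a non-limiting illustration, steps a) to e) could be sketched in Python as follows. The cue-word classifier standing in for step b), the `word#n` term naming, and the token-position index layout are assumptions made for this sketch; the claim leaves the classification technique open.

```python
import re
from collections import defaultdict


def build_word_index(documents):
    """Step a: word -> list of (document address, token position)."""
    index = defaultdict(list)
    for address, text in documents.items():
        for pos, word in enumerate(re.findall(r"[a-z]+", text.lower())):
            index[word].append((address, pos))
    return index


def classify_occurrences(word, index, documents, sense_cues):
    """Step b: group occurrences by meaning using cue words (stand-in classifier)."""
    groups = defaultdict(list)
    for address, pos in index[word]:
        tokens = set(re.findall(r"[a-z]+", documents[address].lower()))
        for meaning, cues in sense_cues.items():
            if cues & tokens:
                groups[meaning].append((address, pos))
                break
    return groups


def annotate_by_word(word, index, documents, sense_cues):
    """Steps c to e: one term per meaning, one assignment per group,
    then regroup the annotations per document."""
    groups = classify_occurrences(word, index, documents, sense_cues)
    terms = {m: f"{word}#{i}" for i, m in enumerate(sense_cues, start=1)}  # step c
    per_doc = defaultdict(dict)
    for meaning, occurrences in groups.items():
        term = terms[meaning]  # step d: chosen once per group, not per occurrence
        for address, pos in occurrences:
            per_doc[address][pos] = term  # step e: map back to document positions
    return dict(per_doc)
```

The point of the design is that the term for a meaning is decided once for the whole group, however many documents the occurrences are spread over.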
55. The method of claim 54, wherein a search system is used for the word-by-word indexing and for classifying the set of occurrences of a specific word, relying on the search system's retrieval method that uses a word-by-word index.
A computer-readable medium recording a program for causing a computer to execute the method of any one of claims 2, 8, 9, 10, 11, 12, 17, 18, 19, 20, 21, 24, 30, 31, 32, 36, 39, 42, 43, 44, 45, 49, 50, 51, 52, 53, 54, and 55.