WO1999052041A1

WO1999052041A1 - Opening and holographic template type of language translation method having man-machine dialogue function and holographic semanteme marking system

Info

Publication number: WO1999052041A1
Application number: PCT/CN1999/000046
Authority: WO
Inventors: Sha Liu
Original assignee: Sha Liu
Priority date: 1998-04-06
Filing date: 1999-04-06
Publication date: 1999-10-14
Also published as: AU3324999A; CN1296588A; CN1111814C

Abstract

An open, holographic-template language translation method with a man-machine dialogue function includes: creating a restricted natural language dialogue template that contains all the necessary semantic information elements of all natural languages; determining, through full-selection man-machine dialogue on the template, the vocabulary information items and syntax information items actually carried by the natural language symbols; solving the original text; generating a translation based on the solution; and converting the solution into translation symbols so that the syntax of the translation can be queried. The method performs syntax analysis without depending on the contextual language environment and fully uses the complementary advantages of man and machine; it can be used to eliminate the obstacles to transferring syntax information in global network communication.

Description

Open holographic template human-machine dialogue language translation method and holographic semantic tagging system

The present invention relates to a computer translation method, and more particularly to a machine translation method suitable for each network terminal in a computer network to perform information transfer and communication in different natural languages.

Background technique

Computer network technology, with its ability to extend in all directions and reach everywhere, has rapidly created a global era of network information. However, obstacles to the transmission and exchange of semantic information between different natural languages significantly restrict the efficient use of the network and of network information. Using machine translation so that each network end user can transmit semantic information on the network in his own natural language alone is undoubtedly of great practical significance and high commercial value for saving network space, improving the transmission efficiency of network information, and realizing the popular international sharing of network information resources.

At present, in the field of machine translation, the machine translation methods introduced systematically in artificial intelligence textbooks are, on the one hand, rarely used in actual product development; on the other hand, the machine translation methods applied in the machine translation systems that have been developed cannot achieve their expected goals. These phenomena indicate that basic theoretical research is seriously lagging behind, that the machine translation techniques and methods in use share common defects, and that the expected goals are not realistic. Since the 1990s, two types of machine translation method have emerged and gradually become the mainstream of natural language information processing technology. One builds a corpus based on the statistical analysis of large-scale real texts; the other is the human-machine dialogue and natural-language-restricted machine translation method.

The statistical analysis of large-scale real texts samples and analyzes such texts from multiple angles, such as symbols, sentence patterns, parts of speech, and semantics, so as to provide multiple matching modes for symbol strings in any natural language. It is an experience-based method of language information processing. Methodologically, this method can superimpose the multiple matching analysis results of the source language and establish a matching relationship with the multiple matching analysis results of the target language, thereby directly completing automatic translation of natural language. In reality, however, natural language systems are random and open, and any statistical method can provide only probabilistic knowledge. Such a method cannot restrict access to natural language vocabulary and its conceptual definitions, cannot determine the various omitted expressions, and cannot resolve the new ambiguities that arise after the target language is generated. Therefore, although the statistical analysis of large-scale real texts is indeed meaningful basic work for natural language information processing by computer, for machine translation this technical means must be combined with a comprehensive and effective system-level processing method before its application value can be fully realized.

Human-machine dialogue and natural-language-restricted machine translation methods have the user adjust the machine translation dictionary and the source language expression at the input end, and also adjust the translation results. Although this approach can obtain better machine translation quality, it requires the user to be proficient in both the source and the target language, and the learning and operating cost of the human-machine dialogue is relatively high, comparable to that of human translation.

Purpose of the invention

The object of the present invention is to design an open holographic template human-machine dialogue machine translation method to comprehensively solve the problem of obstacles to multilingual information transmission and communication in computer networks, and try to achieve a substantial breakthrough in machine translation technology. This breakthrough must meet the following requirements:

1. Effective access restrictions on natural language common words and their definitions;

2. Do not rely on context for semantic analysis;

3. Realize accurate transmission of semantic information through literal translation;

4. Find the new ambiguity solution after generating the target language;

5. Users only need to be proficient in their mother tongue;

6. Utilize the means and results of large-scale statistical analysis of real texts to fully realize the complementary advantages of human and machine;

7. Meet the need for conversion to multiple target languages.

Another object of the present invention is to propose a holographic semantic annotation system, which can be used to make holographic semantic annotations on a text, store the annotation information together with the text, and recall the annotation information along with the text when needed.

Summary of the invention

According to an aspect of the present invention, an open holographic template-type human-machine dialogue language translation method is provided, including the following steps:

a. General restrictions on various natural languages;

b. Establish a human-machine dialogue template that takes sentences as objects and includes necessary semantic information elements of various natural languages;

c. The man-machine dialogue template provides all alternative semantic information items corresponding to the original language symbols subject to contract restrictions and blank information items for user expansion;

d. The computer of the translation system first automatically optimizes all the alternative semantic information restricted by the contract, and then the original user manually adjusts and confirms the preferred result on the human-computer dialogue template;

e. The translation system generates a translation according to the semantic information items determined by the human-machine complementarity, and converts the semantic information items determined by the human-machine complementarity into a translation symbol, and provides the translation user with the translation for query.
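Steps c to e can be pictured with a minimal Python sketch. The function name `resolve_item`, the convention that the first candidate is the machine's preferred result, and the sample candidate lists are assumptions made for illustration; none of them is taken from the specification.

```python
from typing import Dict, List, Optional


def resolve_item(candidates: List[str], user_choice: Optional[str] = None) -> str:
    """Return the confirmed information item for one language symbol.

    Assumption: `candidates` is already ordered by the system's automatic
    optimization, so the first entry is the machine's preferred result.
    A user choice outside the list is accepted, mirroring the blank item
    that the template leaves open for user expansion.
    """
    if user_choice is not None:
        return user_choice
    if not candidates:
        raise ValueError("no candidate and no user input for this item")
    return candidates[0]


# Hypothetical candidate lists for two symbols of a sentence.
template: Dict[str, List[str]] = {
    "saw": ["see", "understand"],
    "with a telescope": ["predicate modifier", "object modifier"],
}

# The user accepts the machine's preference for "saw" and overrides the other.
confirmed = {
    "saw": resolve_item(template["saw"]),
    "with a telescope": resolve_item(template["with a telescope"], "object modifier"),
}
```

In this sketch the user's work reduces to confirming or overriding one preferred item per symbol, which is the division of labor that step d describes.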

According to another aspect of the present invention, a holographic semantic tagging system is provided, which includes: a necessary semantic information library, which stores basic vocabulary and its conceptual definitions and syntactic information items;

A text input device for inputting text to be semantically annotated;

A text storage device for storing text input through the text input device;

A text display device for displaying a text stored in the text storage device;

A sentence selection device for selecting a sentence in the text displayed by the text display device;

An automatic sentence structure analysis device for automatically analyzing the structure of a selected sentence according to statistical experience;

A semantic annotation template display device for displaying a semantic annotation template. When a sentence is selected by the sentence selection device, the semantic annotation template is displayed for that sentence and includes a lexical information element item and a syntactic information element item for each word in the sentence. Each lexical information element item displays the concept definitions and all synonyms of the corresponding word contained in the necessary semantic information library, and each syntactic information element item displays, according to the analysis result of the automatic sentence structure analysis device, all possible syntactic information items of the corresponding word, each of which is stored in the necessary semantic information library;

A semantic annotation device for selecting the concept definitions, synonyms, and syntactic information items in each information element item of the semantic annotation template;

An annotated text storage device for storing the text together with its annotation information;

An annotation instruction device for instructing that the annotation of a certain sentence in the text displayed by the text display device be displayed;

An annotation display device for displaying, in the form of the annotation template, the annotation information stored in the annotated text storage device that corresponds to the indicated sentence.

Industrial applicability

The technical features of the open holographic template human-machine dialogue machine translation method of the present invention are as follows. The basic point of the human-machine dialogue is that the user directly selects template information, so the user only needs to master the mother tongue and there is basically no learning cost. The method is designed with full consideration of the actual limits of the computer's information processing ability, and takes the accurate transmission of semantic information as its central task and practical goal. The method makes full use of the complementary advantages of human and machine, and the translated content is not limited by language environment or application domain. By establishing a unified restriction standard and a full-selection human-machine dialogue, the method provides a comprehensive system solution to the basic technical obstacles of machine translation, and thus a comprehensive technical guarantee for fundamentally improving machine translation quality. The method can make full use of the results of large-scale corpus construction; its natural language processing is concise and practical and readily implementable. Although a user cannot conduct the human-machine dialogue in a language he does not understand during the source-language solution stage, translation results in multiple languages can be obtained from input in a single language while translation quality is ensured.

The open holographic template human-machine dialogue language translation method of the present invention has universal application value in the field of network information exchange, and has a broad international market in opening online machine translation services.

The holographic semantic annotation system of the present invention can store the lexical interpretation and grammatical structure information of a text together with the text, and display this annotation information when needed. This system can be widely used in the interpretation of legal documents and in language teaching.

Brief description of the drawings

Figure 1 is a schematic diagram of the structure of a natural language holographic dialogue template with a sentence as the object;

Figure 2 shows the content of a holographic dialogue template with an English sentence as the object;

Figure 3 is a schematic diagram of the vocabulary information communication restriction structure between different natural languages;

Figures 4a and 4b are schematic diagrams of two methods for displaying dialogue information in the process of human-machine dialogue;

Figure 5 is a schematic diagram of the spatial positioning structure of syntactic component information;

Figure 6 shows the human-machine interaction information processing flow when an English sentence is translated according to the method of the present invention;

Figure 7 is a schematic diagram of a translation user querying the syntactic information item actually carried by the natural language symbol "with a telescope".

A preferred embodiment of the present invention

The principle and implementation process of the open holographic template human-machine dialogue language translation method of the present invention are explained below with an example of translating an English sentence into Chinese. The example sentence used is

"I saw a boy with a telescope near the bank."

The example sentence contains multiple language symbols. A language symbol here can be either a word or a phrase. Each language symbol carries a certain amount of semantic information, including its concept definition, tense, and voice, and the component it forms in the sentence. For example, the concept definition of the word "saw" is "see", its tense is the past tense, its voice is the active voice, and its component in the sentence is the predicate. However, because natural language is complex and diverse, a language symbol may carry more than one kind of semantic information. For example, the concept of the word "saw" can be defined either as "see" or as "understand", and the syntactic component of the phrase "with a telescope" can be either a predicate modifier or an object modifier.
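The semantic information carried by a language symbol can be modeled as a small record. The following is a minimal sketch, assuming a hypothetical `SymbolInfo` class whose field names and sample values are chosen here for illustration; the concept text given for "with a telescope" and for "boy" is invented for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SymbolInfo:
    """Candidate semantic information carried by one language symbol."""
    symbol: str
    concepts: List[str]                 # candidate concept definitions
    tense: Optional[str] = None
    voice: Optional[str] = None
    components: List[str] = field(default_factory=list)  # candidate syntactic components

    def is_ambiguous(self) -> bool:
        # A symbol needs a human-machine dialogue choice when more than one
        # concept definition or more than one syntactic component is possible.
        return len(self.concepts) > 1 or len(self.components) > 1


saw = SymbolInfo("saw", ["see", "understand"], tense="past", voice="active",
                 components=["predicate"])
with_a_telescope = SymbolInfo("with a telescope", ["using a telescope"],
                              components=["predicate modifier", "object modifier"])
boy = SymbolInfo("boy", ["male child"], components=["object"])

needs_dialogue = [s.symbol for s in (saw, with_a_telescope, boy) if s.is_ambiguous()]
```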

The inventor believes that the fundamental task of natural language translation is to accurately transfer the actual semantic information carried by the original language symbols to users of different languages. To this end, the method adopted by the present invention is to solve all semantic information items of the original text through human-computer interaction on the original user's side, generate a translation according to the solution result, and convert the solution result into translation symbols that are provided to the translation user for query. Translation is thus completed with the original user and the translation user both participating, which improves the quality of semantic information transmission.

In order to solve the semantic information of the original text, the present invention establishes a natural language holographic dialogue template with a sentence as its object, as shown in FIG. 1. The term "holographic" means that the template includes all the necessary semantic information elements of natural language symbols. These necessary semantic information elements include the concept definition items, tense information items, and voice information items, which belong to the lexical information elements, and the syntactic component items, which belong to the syntactic information elements. The dialogue template provides the original user with alternative semantic information items corresponding to the original language symbols for selection through human-computer interaction. As will be explained later, the content of these dialogue information items must be restricted by the system. The dialogue template also includes some non-required information items, such as semantic attributes, grammatical attributes, and higher-level semantics (case). These information items need not be selected by the user; the computer solves them automatically and probabilistically, in order to provide relevant information for the automatic generation of the translation.

In order to accurately convey semantic information between different languages, it is best to use literal translation. This is because machine translation systems cannot randomly adjust the vocabulary and sentence pattern of the target sentence. However, due to the differences between the conceptual systems and syntactic systems of various natural languages, to ensure the quality of literal translations, it is necessary to ensure that lexical information items and syntactic information items can be exchanged equivalently between the source language and the target language. Therefore, the present invention carries out a unified and integrated treatment of the differences between different natural languages by establishing a system of the principle of contractual limitation. This principle of restrictive communication includes syntactic information communication and lexical information communication.

The syntactic principles of syntactic information designed by the present invention include: uniformly merge syntactic information with the same function and different objects; try to delete syntactic concepts that are not indispensable in the analysis of semantic aggregation relations, such as direct objects and indirect objects in English grammar. The present invention provides only the simplified syntactic information concept on the dialog template as a standard syntactic information item of different natural languages for users to choose.

The principle of restricted lexical information communication designed by the present invention is shown in FIG. 3. A set of basic concepts is determined through statistical analysis of vocabulary usage frequencies in the major languages and through synonym merging. In practice, however, the basic concepts of the various natural languages do not correspond completely. When a corresponding concept is missing in a language, other common words of that language are used to describe the concept, so that the basic concepts of the various languages are aligned by force. For example, the verb sense of the English word "orphan" is defined as a basic concept; Chinese has no corresponding single word, so the concept is described in Chinese with common words. In addition, words in the various natural languages that are synonymous with a basic concept are treated as its synonyms. When not every concept of a word in one natural language has a corresponding concept in other natural languages, the word is replaced by a synonymous basic concept word (synonym replacement is also inevitable in human translation). If an item still cannot be handled after these two kinds of restriction processing, a blank information item is provided in the holographic dialogue template as redundant information. In determining the concept definitions of the vocabularies of different natural languages, the present invention adopts a vague convention that focuses on connotation (such as the Chinese and English words for "school"), a unified convention that ignores part-of-speech variants of a concept (such as disregarding all the tense variants of the English word "become"), and a probabilistic convention that gives priority to concepts used in multiple languages. To enrich its expressiveness, any language needs synonyms for the same concept. Therefore, the probability of lexical use serves as the criterion for deciding which lexical concepts are redundant. Preference is given first to words used in multiple languages, and then to words with a high probability of use within one natural language. Words that meet neither condition are treated as redundant concepts, and blank information items are provided for them in the holographic dialogue template. The concept definitions processed through these restrictions are provided in the holographic template as vocabulary alternatives for users of different natural languages to choose from, which ensures that the concept information of natural language vocabulary can be exchanged equivalently. The invention also assigns a unified code to the corresponding lexical concepts in different natural languages, so as to facilitate information transmission on the network.
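The unified coding of concepts, with a description filling a vacancy and a blank item as the last resort, can be sketched as follows. The codes `C0001` and `C0002`, the table names, and the Chinese description string are hypothetical values chosen for this illustration; the specification does not give a concrete encoding.

```python
from typing import Dict, Optional

# Hypothetical unified concept codes mapped to one basic word per language.
CONCEPT_WORDS: Dict[str, Dict[str, str]] = {
    "C0001": {"en": "school", "zh": "学校"},
    "C0002": {"en": "orphan"},  # verb sense; assumed to have no single Chinese word
}

# Where a language has no single word, the concept is described with common words.
CONCEPT_DESCRIPTIONS: Dict[str, Dict[str, str]] = {
    "C0002": {"zh": "使成为孤儿"},  # illustrative description, roughly "cause to become an orphan"
}


def render(code: str, lang: str) -> Optional[str]:
    """Return the word for a concept code, else its description, else None.

    None stands for the blank information item that the template leaves
    for the user to fill in.
    """
    word = CONCEPT_WORDS.get(code, {}).get(lang)
    if word is not None:
        return word
    return CONCEPT_DESCRIPTIONS.get(code, {}).get(lang)
```

Only the code needs to travel over the network in such a scheme; each terminal renders it in its own language.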

On the other hand, in order to process natural language symbols that are not included in the system and to make the human-computer interaction more flexible, the dialogue template of the present invention is designed to be open on the basis of the principle of restriction by convention. That is, when a certain original natural language symbol is not included in the machine translation system, the original user can call up natural language symbols of restricted information items that the system already includes and use them to describe it semantically.
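The open template can be sketched as a symbol library that accepts descriptions built only from symbols it already holds. This specification later adds that the system counts how often such a description is used and adds the symbol once a certain level is reached; the sketch includes that counter. The class name `OpenSymbolLibrary`, the parameter `promote_after`, and the threshold value are assumptions, since no concrete level is stated.

```python
from collections import Counter
from typing import Dict, List, Set


class OpenSymbolLibrary:
    """Symbols known to the system plus user descriptions of unknown ones."""

    def __init__(self, known: Set[str], promote_after: int) -> None:
        self.known: Set[str] = set(known)
        self.promote_after = promote_after        # hypothetical usage threshold
        self.description_uses: Counter = Counter()
        self.descriptions: Dict[str, List[str]] = {}

    def describe(self, symbol: str, description: List[str]) -> bool:
        """Record one use of a description; return True if the symbol is promoted now."""
        unknown = [w for w in description if w not in self.known]
        if unknown:
            raise ValueError(f"description uses symbols not in the library: {unknown}")
        if symbol in self.known:
            return False
        self.description_uses[symbol] += 1
        if self.description_uses[symbol] >= self.promote_after:
            self.known.add(symbol)
            self.descriptions[symbol] = list(description)
            return True
        return False


library = OpenSymbolLibrary(
    known={"institution", "for", "keeping", "or", "lending", "money"},
    promote_after=3,
)
bank_description = "institution for keeping or lending money".split()
promoted = [library.describe("bank", bank_description) for _ in range(3)]
```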

The method of the present invention for compulsorily restricting the concept systems of multiple natural languages is essentially different from the traditional intermediate language method. The traditional intermediate language technique faces completely unrestricted natural language systems and achieves multilingual translation by establishing an intermediate concept system between multiple natural languages, but the openness of the various natural language concept systems makes it impossible for the intermediate language system to have continuity. The compulsory restriction method instead uses man-machine dialogue to impose the necessary restrictions and alignment on words and senses, and reasonably restricts the differences between, and the openness of, the various natural language concept systems, thereby ensuring that the lexical and syntactic concepts of multiple natural languages can be interchanged successfully.

Now refer to FIG. 2 again, and continue to explain the method for the original user to solve the semantic information of the original. The figure shows the alternative restricted semantic information items provided by the man-machine dialogue template to the original user corresponding to each language symbol of the original. The process of solving the original semantic information is the process of selecting, confirming, and supplementing these alternative information items in the human-computer dialogue template.

In the selection of lexical information items, the complementary advantages of human and machine must be fully used. The basic principles followed by the computer's automatic optimization are as follows. Through large-scale statistical analysis of real texts, the lexical information items of polysemous words are ranked by frequency to reduce the range of options the user must search. Through large-scale statistical analysis of real texts, lexical information items are optimized according to the correlation between syntactic information items and lexical information items, further narrowing the selection; for example, the noun sense of a word may be preferred, such as "" and "telescope" in Figure 2. Through large-scale statistical analysis of real texts, collocation probabilities are obtained and used to optimize lexical information items; for example, in a Chinese phrase in which the polysemous word "好" precedes the adjective meaning "beautiful", the most likely sense of "好" is the degree adverb "very". For a text symbol that explicitly expresses part-of-speech information, the lexical information items can be derived accordingly to narrow the selection; for example, although the root of the English word "spring" is ambiguous, the past-tense verb form "sprung" clearly limits the choice of senses.
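Two of these principles, frequency ranking and narrowing by explicit part-of-speech information, can be sketched together. The `Sense` record, the sense wordings, and the frequency counts below are invented for illustration and are not corpus data.

```python
from typing import List, NamedTuple, Optional


class Sense(NamedTuple):
    definition: str
    part_of_speech: str
    frequency: int  # hypothetical corpus count


def rank_senses(senses: List[Sense], required_pos: Optional[str] = None) -> List[Sense]:
    """Order candidate senses by descending frequency.

    When the text symbol makes its part of speech explicit, senses of any
    other part of speech are dropped before ranking.
    """
    candidates = [s for s in senses
                  if required_pos is None or s.part_of_speech == required_pos]
    return sorted(candidates, key=lambda s: s.frequency, reverse=True)


SPRING = [
    Sense("coiled elastic device", "noun", 120),
    Sense("season after winter", "noun", 500),
    Sense("jump suddenly", "verb", 200),
    Sense("source of water", "noun", 80),
]

all_ranked = rank_senses(SPRING)
# An explicitly verbal form such as a past-tense form restricts the candidates.
verb_only = rank_senses(SPRING, required_pos="verb")
```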

Through the automatic option processing of the above technical means, most of the lexical information items actually required by the user are ranked first. Since the lexical information items required to express the semantics already exist in the user's mind, for the user most lexical information item selection is just a process of confirming each preferred information item in the template.

In various natural languages, whether implicitly or explicitly, syntactic information generally includes part-of-speech information, syntactic component information, and higher-level semantic (case) information. Of these, syntactic component information is the only syntactic system with complete organizational capability and universal commonality. Therefore, once the syntactic component information items are determined, the semantic aggregation relationship of a natural language symbol string is in effect determined. In the selection of syntactic information items, the complementary advantages of human and machine must likewise be fully used. The basic principle is that the correlations among word order, part of speech, higher-level semantic (case) information, and syntactic component information are obtained through large-scale statistical analysis of real texts and used to select syntactic information items automatically. For example, if the word order of a word is 1, its part of speech is noun, and its higher-level semantics is the agent of the action, it can be determined to be the subject. The user finally determines the syntactic component information item through an option operation.
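The example rule in this paragraph can be written as a minimal sketch. The function name `propose_components`, the label strings, and the fallback list of components are assumptions; a real rule base would hold many statistically derived rules rather than this single one.

```python
from typing import List

# Assumed set of syntactic components offered when no rule gives a unique result.
ALL_COMPONENTS = ["subject", "predicate", "object", "modifier", "supplement"]


def propose_components(word_order: int, part_of_speech: str, case_role: str) -> List[str]:
    """Return candidate syntactic components for one word.

    A single-element list means the machine found a unique preferred result;
    a longer list is left for the user to settle by an option operation.
    """
    if word_order == 1 and part_of_speech == "noun" and case_role == "agent":
        return ["subject"]
    return list(ALL_COMPONENTS)
```

For instance, `propose_components(1, "noun", "agent")` yields only the subject, whereas a noun in fourth position with another case role is returned with every component still open.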

By selecting lexical information items and syntactic information items on the template through man-machine dialogue, the actual semantic information of the original text is solved. The user selects the lexical information items and syntactic information items actually carried by each natural language symbol string directly on the holographic dialogue template, which is the simplest form of human-computer dialogue. The specific method may be to display the confirmed items in bold, as shown in Figure 1.

Through the human-computer complementary selection and confirmation of lexical information items and syntactic information items in sentences in the holographic dialog template, the information solving task of natural language has been completed, so it is no longer necessary to rely on the context to perform semantic analysis on the sentence.

For users, analyzing and determining abstract syntactic relationships is far more difficult than judging the information items of polysemous words. Therefore, to reduce the difficulty of selecting syntactic component information items, the linearly arranged syntactic component information items can in actual operation be transformed into a spatial positioning representation, as shown in Figure 5, to assist the human-computer dialogue selection. A syntactic information dialogue frame is made with the modifier area, core area, and supplement area of the syntactic information as horizontal coordinates and the subject area, predicate area, and object area as vertical coordinates, and the user can select within the frame the object that "with a telescope" modifies.
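The dialogue frame can be read as a three-by-three grid in which a selected cell names a syntactic component. A minimal sketch follows, assuming the hypothetical function `cell_to_component` and the label strings shown; Figure 5 may lay out or name the areas differently.

```python
ROWS = ("subject", "predicate", "object")        # vertical coordinates
COLUMNS = ("modifier", "core", "supplement")     # horizontal coordinates


def cell_to_component(row: str, column: str) -> str:
    """Translate a selected grid cell into a syntactic component label."""
    if row not in ROWS or column not in COLUMNS:
        raise ValueError(f"cell ({row!r}, {column!r}) is outside the dialogue frame")
    # The core cell of a row is the component itself; the other cells attach to it.
    return row if column == "core" else f"{row} {column}"


# The user places "with a telescope" in the modifier cell of the object row.
choice = cell_to_component("object", "modifier")
```

Placing the phrase in the predicate row instead would give the other reading of the example sentence.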

In the actual human-machine dialogue process, a partial template display method and a virtual template method can also be used, such as the full display of the syntactic information shown in Figure 4a (the "?" in the figure indicates that the user can select again) and the sentence "I saw a boy with a telescope near the bank" shown in Figure 4b. Those skilled in the art should understand that there are many methods for displaying dialogue information during the man-machine dialogue, and they are not limited to the examples in this specification.

By systematically restricting grammatical concepts and common concepts by convention, and by fully selecting information through human-machine complementarity within the scope of the restricted information items, the method of the present invention obtains the information necessary for automatic conversion into the expression forms of a variety of natural languages. However, users always omit some syntactic components. Logically, as long as all the information items of the existing text symbols are determined, most omitted parts (such as an omitted subject or object) can be supplied by the reader from the context when reading the information. However, in order to convey the semantics accurately, components that must not be omitted have to be enforced through the holographic dialogue template to ensure machine translation quality (for example, if a subject and an object have been selected among the alternative information items of a sentence, the related verb cannot be omitted).

To solve the problem of new ambiguities arising after the target language translation is generated, the holographic intermediate solution results are provided to the target language user together with the translation for direct query, so that the new target language ambiguities can be completely resolved. If the user intends to retain the ambiguity or vagueness of an expression, he can select multiple information items at the same time.

Referring to FIG. 6, the flowchart illustrates the basic process of human-computer interaction information processing in the open holographic template-type human-machine dialogue language translation method of the present invention. The middle columns 11 to 17 are the main flow of the computer of the translation system. Columns 21 to 26 show the user's participation process, and columns 31 to 35 on the right show the relationship with the internal database and rule base during human-computer interaction. One-way arrows indicate the direction of human-computer interaction, and two-way arrows indicate the language translation. In the process of calling data and rules in the process, the marked N indicates that the system information processing requires human-computer interaction, and the marked Y indicates the next operation step of automatically entering the system flow. # # # # Indicates the information processing of this translation system and the Internet system interface. Above it is the original client and below it is the translation client.

The process starts, and steps 11 are executed, and the natural language symbols to be translated are input by the original user in sequence.

Referring also to FIG. 2, the ten natural language symbols of this example, "I saw a boy with a telescope near the bank", are filled in order into positions 1 to 10 of the template. In step 12, the main program of the system searches the extensible multilingual corresponding lexical information item symbol library 31 for the alternative lexical information items of each natural language symbol. When the search finds nothing, the original user can, through step 21, describe the semantics of the natural language symbol on the template using semantic symbols the system already holds. This process ultimately generates in the template the alternative lexical information items, consisting of concept definition items, semantic attribute items, tense items, voice items, and so on. If the concept definition item under a natural language symbol is blank, as at the "?" under the symbol "bank", the original user can describe it semantically using lexical symbols already provided in the system, that is, enter the concept definition item "institution for keeping or lending money" in the template.

In step 13, the computer automatically optimizes the multiple alternative lexical information items of each natural language symbol in the template according to the rules in the lexical information item probabilistic selection rule base 32 (see, for example, the information items shown in bold in the template). Through step 22, the original user can select and confirm the semantic information items for which no preferred result has been determined.

In step 14, the main program calls the syntactic component information item automatic labeling rule base 33 and automatically labels the syntactic information items of the natural language symbols. This process ultimately generates the syntactic component items, part-of-speech items, and higher-level "lattice" items in the template.

In step 15, the main program calls the syntactic component information item automatic selection rule base 34 and automatically optimizes the syntactic component information items of each natural language symbol. During this step, the three-dimensional structure model library 23 of the syntactic information items can be called through step 24, and the original user selects and confirms on the template the syntactic information items for which no unique preferred result was obtained, such as the information items shown in bold in the template. The main program of the system can now transmit the confirmed information items over the network in a self-defined encoding.
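Steps 12, 13, 21 and 22 can be pictured with a small sketch. It is an illustration only, under stated assumptions: the names `LEXICON`, `Slot`, `lookup`, `describe`, `auto_select` and `confirm`, the numeric weights, and the `margin` threshold are all hypothetical, and the text does not specify how the probabilistic selection rules are actually formulated.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical stand-in for symbol library 31:
# symbol -> list of (concept definition, weight) pairs. Weights are invented.
LEXICON = {
    "saw": [("perceived with the eyes", 0.9), ("cut with a saw", 0.1)],
    "bank": [],  # concept definition item is blank, shown as "?" in the template
}


@dataclass
class Slot:
    """One template position: a symbol and its alternative lexical items."""
    symbol: str
    alternatives: List[Tuple[str, float]] = field(default_factory=list)
    chosen: Optional[str] = None


def lookup(symbol: str) -> Slot:
    """Step 12: fetch the alternative items for a symbol (may be empty)."""
    return Slot(symbol, list(LEXICON.get(symbol, [])))


def describe(slot: Slot, definition: str) -> None:
    """Step 21: the original user supplies a concept definition by hand."""
    slot.alternatives.append((definition, 1.0))


def auto_select(slot: Slot, margin: float = 0.5) -> Optional[str]:
    """Step 13: pick an item only when it is the sole or a clear winner.

    The margin test is an assumed rule, not the one used in rule base 32.
    """
    ranked = sorted(slot.alternatives, key=lambda item: item[1], reverse=True)
    if len(ranked) == 1:
        slot.chosen = ranked[0][0]
    elif len(ranked) > 1 and ranked[0][1] - ranked[1][1] >= margin:
        slot.chosen = ranked[0][0]
    return slot.chosen


def confirm(slot: Slot, index: int) -> str:
    """Step 22: the original user confirms or overrides the choice."""
    slot.chosen = slot.alternatives[index][0]
    return slot.chosen
```

In this sketch a slot whose `chosen` field is still `None` after `auto_select` is exactly the case that is handed to the user for description or confirmation.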

The dialogue template includes all the information items that natural language symbols can carry. Its alternative information items include not only the concept definition of natural language symbols, tense information, voice information, syntactic information, higher "lattice" information, part-of-speech information, singular and plural information, and masculine and feminine (gender) information; any other information that can be manually designed and labeled can also be added under the open template. When the original user uses the semantic description method to solve an original symbol in step 21 of FIG. 6, the system program also automatically counts how often that description is used. When the frequency of use reaches a certain level, new natural language symbols or new information items are added simultaneously to the natural language symbol libraries of all languages in the translation system. For example, when the frequency of use of the description supplied for "bank" reaches a certain level, the system adds the new symbol "banque" to the French natural language symbol library, describes it semantically using the corresponding French symbols already included in the system, and provides the other relevant alternative information items. The extension method is the same for other languages.

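The frequency-driven extension described above can be sketched as follows. This is a minimal sketch under assumptions: the class name `SymbolLibrary`, the method `record_description`, and the value of `THRESHOLD` are hypothetical, since the text only says that the frequency must reach "a certain level".

```python
from collections import Counter

THRESHOLD = 3  # assumed value; the text says only "a certain level"


class SymbolLibrary:
    """Hypothetical per-language symbol libraries with a shared usage counter."""

    def __init__(self, languages):
        self.entries = {lang: {} for lang in languages}  # lang -> {symbol: concept}
        self.usage = Counter()

    def record_description(self, concept, renderings):
        """Count one user-supplied description of `concept`.

        `renderings` maps a language code to the symbol for that concept.
        Once the count reaches THRESHOLD, the symbol is added to every
        language library at the same time. Returns True when promoted.
        """
        self.usage[concept] += 1
        if self.usage[concept] < THRESHOLD:
            return False
        for lang in self.entries:
            if lang in renderings:
                self.entries[lang].setdefault(renderings[lang], concept)
        return True
```

For instance, recording the description "institution for keeping or lending money" three times with the renderings `{"en": "bank", "fr": "banque"}` would place "banque" in the French library on the third call.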
With reference to FIG. 7, in step 16 the main program of the translation client system calls the automatic translation generation rule base 35 and, according to the multilingual symbol and word-order conversion rules, automatically converts the solution results of the information items confirmed by the original user into a translation in the natural language required by the translation user. As shown in FIG. 7, the Chinese translation generated corresponds to "I saw a boy with a telescope near the bank". In step 17 the main program asks the translation user whether the translation is ambiguous. If there is any ambiguity, the translation user can, in step 26, determine the query range of the related information items through human-computer interaction, during which the multilingual corresponding information item symbol library 25 can be called. For example, to settle whether "with a telescope" modifies the subject or the object, the translation user can, as indicated at "?" in FIG. 7, directly query the syntactic information item that the symbol actually carries and determine that it modifies the object. This concludes the translation process.
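The query in step 26 amounts to reading back a syntactic information item that travelled with the sentence. The sketch below assumes a hypothetical record type `Annotated` and function `what_modifies`; apart from the statement that "with a telescope" modifies the object, the component labels are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Annotated:
    """A sentence unit with the syntactic item confirmed by the original user."""
    text: str
    component: str                  # syntactic component item (labels assumed)
    modifies: Optional[str] = None  # component this unit attaches to, if any


# Confirmed annotation of the example sentence. Only the attachment of
# "with a telescope" to the object comes from the description.
SENTENCE = [
    Annotated("I", "subject"),
    Annotated("saw", "predicate"),
    Annotated("a boy", "object"),
    Annotated("with a telescope", "attribute", modifies="object"),
    Annotated("near the bank", "adverbial"),
]


def what_modifies(sentence, text):
    """Step 26: return the component that the queried unit modifies."""
    for unit in sentence:
        if unit.text == text:
            return unit.modifies
    raise KeyError(text)
```

Because the attachment was fixed on the source side, the translation user resolves the ambiguity by lookup rather than by re-parsing the translated sentence.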

The quality of semantic information transmission is the fundamental obstacle that keeps machine translation technology from winning the huge international market in the era of global network information. To achieve a substantial breakthrough, human-machine dialogue is unavoidable. A translation scheme that exploits the complementary advantages of human and machine can effectively improve translation quality and has practical value. Because this method offers accurate transmission of semantic information, independence from the language environment of the context, convenient operation by users, simultaneous conversion into and generation of multiple target languages, a dialogue scheme that generalizes across languages, and simple and reliable technical means, it has universal application value in the field of network information exchange and will also have a broad market in online machine translation services.

According to the concept of the above method, the present invention also provides a holographic semantic tagging system, which includes:

A necessary semantic information base, which contains the basic vocabulary and its concept definitions and syntactic information items;

A text input device for inputting text to be semantically annotated;

A text storage device for storing text input through a text input device;

A text display device for displaying a certain text stored in the text storage device;

A sentence selection device for selecting a certain sentence in the text displayed by the text display device;

An annotation display device, configured to display, in the form of the annotation template, the annotation information stored in the annotation text storage device that corresponds to the selected sentence.

One application of the holographic semantic labeling system of the present invention is a same-language holographic semantic labeling system. Taking the legal field as an example, there are many types of laws, corresponding knowledge bases need to be established, and expert systems built on them have a wide range of applications. One common application requirement is the semantic understanding and identification of legal provisions by ordinary users. The various expert systems at home and abroad use "question-and-answer" man-machine interfaces: the system asks a series of questions in turn, and the user answers "Yes" or "No" to each or enters simple data; the system then searches the knowledge base, infers a conclusion from the match between the problem and the knowledge, and reports it to the user.

This "question-and-answer" human-machine interface is rigid and cumbersome: the questions asked by the system are fixed in advance and cannot adapt to the user. Such a system appears to lack intelligence.

If same-language semantic annotation technology is used when entering legal interpretation clauses, contracts, agreements, and pleadings, the holographic data of the language symbols used can be entered once, which greatly facilitates querying and classification by users.

Same-language semantic tagging technology is not only applicable to the development of various expert-level knowledge systems; it also has universal practical value for improving the accuracy with which legal interpretations, contract contents, and technical description documents are semantically understood.

Implementation method of the same language semantic tagging technology:

Same-language semantic annotation can be realized simply by applying the original text processing technology of the holographic translation template and providing a professional thesaurus.

One application of the holographic semantic tagging system of the present invention is a foreign language holographic language teaching system.

Computer-assisted instruction is already widely used. In foreign language teaching it mainly takes the form of multimedia teaching (listening, speaking, reading, and writing in parallel) and teaching from examination question banks. The language holographic template provides a computer-assisted instruction method for foreign language teaching that systematically reflects both what the concepts of different languages have in common and what is individual to their symbols.

When the user enters a native-language sentence:

If the user selects the concept definition of a native-language word, the holographic template can, through the multilingual unified coding provided by the system, call up all the corresponding words in multiple languages.

If the user selects the tense, voice, and syntactic component information items of the native-language sentence, the holographic teaching system can use the interface technology and internal conversion rules of the holographic translation system to show, step by step, the symbol inflection and word-order transformation in any language.

If the user directly inputs a foreign-language sentence, the holographic template can, through the multilingual unified coding provided by the system, either provide holographic semantic annotations in the foreign language or convert the holographic semantic annotations directly into the native language.
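The role of the multilingual unified coding in the teaching system can be illustrated with a small lookup table. This is a sketch under assumptions: the code "C-BANK", the table `CONCEPT_TABLE`, and the functions `concept_code` and `equivalents` are hypothetical, and the text does not describe the actual coding scheme.

```python
# Hypothetical unified concept table: one code per concept, one word per language.
CONCEPT_TABLE = {
    "C-BANK": {"en": "bank", "fr": "banque", "de": "Bank"},
}


def concept_code(lang, word):
    """Return the unified code of a word in the given language, or None."""
    for code, words in CONCEPT_TABLE.items():
        if words.get(lang) == word:
            return code
    return None


def equivalents(lang, word):
    """Call up the corresponding words in all other languages via the code."""
    code = concept_code(lang, word)
    if code is None:
        return {}
    return {other: w for other, w in CONCEPT_TABLE[code].items() if other != lang}
```

Selecting the concept definition of "bank" in an English sentence would then return the French and German words that share its code, which is the behavior described for a native-language sentence above.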

Claims

1. An open holographic template human-machine dialogue language translation method, comprising the following steps:

a. Applying general restrictions to various natural languages;

e. The translation system generates a translation according to the semantic information items determined through human-machine complementarity, converts those semantic information items into translation symbols, and provides them to the translation user for querying the translation.

2. The method of claim 1, wherein the necessary semantic information elements in step b include concept definition, tense information, voice information, and syntactic component information items.

3. The open holographic template-type human-machine dialogue language translation method according to claim 1 or 2, characterized in that the general restrictions on various natural languages in step a include: a1. uniformly merging the functions and objects of different syntactic concepts; a2. deleting, as far as possible, syntactic concepts that are not indispensable; a3. establishing a multilingual general basic concept set through statistical analysis of the usage frequency of the vocabulary of the main languages and merging of synonyms; a4. treating the synonyms of a basic concept in each natural language as synonyms and, where a corresponding synonym is absent in another natural language, substituting the basic concept word; a5. for natural language words or concepts that cannot be expressed uniformly with the basic concepts, providing blank information items in the template.

4. The method of claim 1, wherein: in step c, when the same-language alternative information item corresponding to an original language symbol item is blank, the user can call natural language symbols already included in the system to describe it.

5. The open holographic template-type human-machine dialogue language translation method according to claim 4, further comprising: compiling statistics on the frequency of use of the information items extended by users, determining newly added universal basic concepts based on those statistics, and adding the natural language symbol items and corresponding information items to the human-machine dialogue templates of all languages in the translation system.

6. The open holographic template-type human-machine dialogue language translation method according to claim 1, wherein: in step d, the automatic optimization result is manually adjusted and confirmed by the user manually selecting the undetermined information items on the holographic dialogue template.

7. The open holographic template-type human-machine dialogue language translation method according to claim 1, wherein: the sentence-oriented human-machine dialogue template in step b is a dialog frame including a three-dimensional spatial positioning syntax.

8. The method of claim 1, wherein: the human-machine dialogue template of the sentence in step b is a virtual human-machine dialogue template.

9. The method of claim 3, wherein the general restrictions on various natural languages further comprise: a6. fuzzy connotations centered on connotation; and a7. unifying concepts without regard to differences in part of speech.

10. The method of claim 1, wherein in step d the user can manually adjust or manually confirm the preferred results on the holographic dialogue template.

11. A holographic semantic labeling system, comprising:

Necessary semantic information base, which contains the basic vocabulary and its conceptual definitions and syntactic information items;

A text input device for inputting text to be semantically annotated;

A text storage device for storing text input through a text input device;

12. The holographic semantic tagging system according to claim 11, characterized in that the necessary semantic information database correspondingly stores the restricted vocabulary of multiple languages and its concept definitions, and correspondingly stores the restricted syntactic information items of each of the multiple languages.

13. The holographic semantic labeling system according to claim 11, wherein the vocabulary information element item of a given word further displays the word of a specified language that is stored in the necessary semantic information database in correspondence with that word, and the syntax information element item of the word further displays the syntax information item of the specified language that is stored in the necessary semantic information database in correspondence with the syntax information item of that word.

14. The holographic semantic tagging system according to claim 11, characterized in that the content in the vocabulary information element item can be changed to other vocabulary meaning information in addition to the alternative content.