WO2009103208A1 - Sentence component device and reading foreign languages and producing universal language and text conversion method - Google Patents

Info

Publication number
WO2009103208A1
Authority
WO
WIPO (PCT)
Prior art keywords
sentence
cabin
current
language
library
Prior art date
Application number
PCT/CN2008/072593
Other languages
French (fr)
Chinese (zh)
Inventor
刘树根
Original Assignee
Liu Shugen
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Liu Shugen
Priority to CN200880128636.7A (patent CN102007490B)
Publication of WO2009103208A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/40: Processing or translation of natural language

Definitions

  • Sentence component device, and method for reading a foreign language in one's native language, generating world text, and converting text
  • The present invention relates to an apparatus and method for processing or converting natural language, and more particularly to an apparatus and method for text conversion: namely, a sentence component device, a method for making sentence components, and a sentence-component-based text conversion method for reading a foreign language in one's native language.
  • BACKGROUND OF THE INVENTION Among computer information-processing tasks for natural-language text, machine translation is the most technically difficult. Its operational object is a text file produced by the computer's word processing technology. Existing word processing encodes the characters of each language and then generates a text file from the character codes (internal codes).
  • MT (machine translation)
  • The key technology of MT has four aspects: lexical analysis, syntactic analysis, semantic analysis, and literary (style) analysis. Its working process is to split a sentence into words, look up each word's meaning in an electronic dictionary stored in the machine's database, analyze the meaning of the sentence according to grammar rules, and transform it into a conceptual structure; a language model is then used to generate the target language. In principle this series of steps is not difficult to implement, but because languages are idiosyncratic and varied, and because artificial intelligence technology is still limited, fully correct translation between different languages is currently impossible, which is why current machine translation software cannot .
  • Translation memory (TM) is mainly designed for professional translators and organizations. It is built around translation memory and human-computer interaction, and requires users to be able to translate independently.
  • The principle of TM is based on a database: all translated material is stored in the database sentence by sentence.
  • During translation the system automatically analyzes the electronic document. Sentences that match 100% are replaced automatically.
  • Partially matched sentences are translated with reference to their degree of match, and new sentences are translated manually with the help of translation suggestions provided by the system.
  • Scientific research shows that about 30% of translation work is repetitive. TM software ensures that "the same sentence never needs to be translated a second time", which improves efficiency, but the remaining 70% still has to be done by human translators.
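The TM behaviour described above (exact matches reused, partial matches offered by degree of match, everything else left to the translator) can be illustrated with a minimal sketch. The function name `tm_lookup`, the 0.75 threshold, and the use of `difflib.SequenceMatcher` as the similarity measure are assumptions made for illustration; the source does not say how TM products compute the matching degree.

```python
from difflib import SequenceMatcher


def tm_lookup(memory, sentence, threshold=0.75):
    """Look up `sentence` in a translation memory.

    `memory` maps previously translated source sentences to their translations.
    Returns (kind, translation, score), where kind is "exact", "fuzzy" or "none".
    The threshold value is an illustrative assumption.
    """
    if sentence in memory:
        return ("exact", memory[sentence], 1.0)

    best_source, best_score = None, 0.0
    for source in memory:
        score = SequenceMatcher(None, source, sentence).ratio()
        if score > best_score:
            best_source, best_score = source, score

    if best_source is not None and best_score >= threshold:
        # Partial match: offered as a suggestion, still needs human review.
        return ("fuzzy", memory[best_source], best_score)
    # New sentence: must be translated manually.
    return ("none", None, best_score)
```

An exact hit can be substituted automatically, while a "fuzzy" hit is only a suggestion for the human translator.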
  • The object of the present invention is to overcome the deficiencies of the prior art by providing a sentence component device and a method for making sentence components, using human-computer interaction to solve the technical problem that, in existing computer language processing, the characters, words, and sentences of different languages are not expressed in an equivalent or unified way. The technical effect is that the sentence components stored in the sentence component libraries make the characters, words, and sentences of different languages equivalent and unified.
  • A further object of the present invention is to provide a sentence-component-based text conversion method for reading a foreign language in one's native language, to solve the technical problem that the results of existing machine translation and text conversion are poorly readable and fail to express the meaning of the original text; this improves the readability of the translated or converted text and keeps its meaning consistent with the original.
  • The conversion also generates world text, which can be read out in multiple languages. This greatly changes the current situation in which word-processed text can be read and written only by speakers of its own language.
  • The technical solution adopted by the present invention to solve its technical problem is:
  • A sentence pattern library 300, used to store sentence pattern components. It has a sentence pattern code field and English, Chinese, and Russian sentence pattern fields, and contains at least one record. Sentence patterns with the same semantics are stored in the same record, with the sentence pattern of each language in the corresponding language field; the sentence pattern code represents the shared semantics of the sentence patterns in all language fields of that record;
  • A cabin model library 400, used to store cabin model components. It has a cabin model code field and English, Chinese, and Russian cabin model fields, and contains at least one record; cabin models with the same semantics are stored in the same record;
  • An idiom library 600, used to store small idiom components. It has an idiom code field and English, Chinese, and Russian idiom fields, and contains at least one record. Small idioms with the same semantics are stored in the same record, with the idiom of each language in the corresponding language field; the idiom code represents the shared semantics of the idioms in all language fields of that record;
  • A Yitong code compiling unit 103, connected to the sentence component storage unit 101, which receives notifications from the component adding unit 106. When a new record is created in any of the four libraries, it generates the Yitong code by placing the current library's representative number in the high-order position, followed by the record number within that library, and writes the result into the code field of the current library as the unified, double-byte, fixed-length, multilingual semantic interworking code of the sentence component.
  • The Yitong code uniquely represents the shared semantics of the components of every language in the current record of the current library;
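The code layout described above, a library representative number in the high-order position followed by the record number, can be sketched as follows. The library numbering, the five-digit record width, and the decimal-string representation are assumptions chosen for readability; the source specifies only a fixed-length, double-byte code and does not give the exact bit layout.

```python
# Hypothetical library representative numbers (not given in the source).
LIBRARIES = {1: "sentence_pattern", 2: "cabin_model", 3: "meaning_group", 4: "idiom"}
RECORD_DIGITS = 5  # assumed width of the low-order record number


def make_code(library_no, record_no):
    """Compose a code: library number (high order) + zero-padded record number."""
    if library_no not in LIBRARIES:
        raise ValueError(f"unknown library number: {library_no}")
    if not 0 <= record_no < 10 ** RECORD_DIGITS:
        raise ValueError(f"record number out of range: {record_no}")
    return f"{library_no}{record_no:0{RECORD_DIGITS}d}"


def split_code(code):
    """Decompose a code back into (library number, record number)."""
    return int(code[0]), int(code[1:])
```

Because the code decomposes into a library and a record, any language field of that record can be reached from the code alone.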
  • A component reading unit 104, connected to the sentence component storage unit 101, which receives read commands, uses the segments contained in the Yitong code to locate a particular record in a particular library, and reads the required language field of that record.
  • A component matching unit 105, connected to the sentence component storage unit 101, which receives matching commands and, guided by the current operating point, queries the corresponding language index field of the corresponding component library with the given sentence, or sentence content, in the given language.
  • A component adding unit 106, connected to both the sentence component storage unit 101 and the Yitong code compiling unit 103, which receives commands to add new components. After a query confirms that the corresponding component library holds no identical component, it adds the new component to the corresponding language field of that library; when the addition creates a new record, it simultaneously notifies the Yitong code compiling unit 103.
  • A component library operation control and interface unit 107, connected to the component reading unit 104, the component matching unit 105, and the component adding unit 106, which receives calls from the various applications based on the sentence components.
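As a rough illustration of how the reading, matching, and adding units could cooperate over one library, here is a minimal in-memory sketch. The class name `ComponentLibrary`, its method names, and the use of 1-based list positions as record numbers are assumptions for illustration, not the device's actual design.

```python
class ComponentLibrary:
    """One component library: each record holds same-meaning text per language."""

    def __init__(self, library_no, languages):
        self.library_no = library_no
        self.languages = list(languages)
        self.records = []  # record number = list index + 1

    def match(self, language, text):
        """Matching unit: record numbers whose `language` field equals `text`."""
        return [i + 1 for i, rec in enumerate(self.records)
                if rec.get(language) == text]

    def read(self, record_no, language):
        """Reading unit: the `language` field of record `record_no`."""
        return self.records[record_no - 1].get(language)

    def add(self, entries):
        """Adding unit: append a record unless an identical one already exists.

        Returns the record number of the existing or newly created record.
        """
        record = dict(entries)
        if record in self.records:
            return self.records.index(record) + 1
        self.records.append(record)
        return len(self.records)
```

A caller would typically `match` a source string first, then `read` the native-language field of the record that was found.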
  • The sentence components in the sentence component device are obtained by expert operation and human-computer interaction, by analyzing and comparing a corpus of bilingual training samples; another source is user feedback that has been reviewed by experts.
  • A sentence component is a standard part for assembling a sentence, or for encoding a sentence, and comes in the following four types:
  • 1. Sentence pattern component 201, 301: forms the basic structural framework of a sentence, represents the basic semantic category of the sentence, determines the order and number of sentence cabins the sentence contains, and embodies the more complicated grammatical phenomena of that type of sentence;
  • 2. Cabin model component 202, 401: forms the basic structural framework of a complex sentence cabin, represents the basic semantics of the sentence cabin, determines the order and number of cabin eyes the sentence cabin contains, and embodies the more complicated grammatical phenomena of that type of sentence cabin;
  • 3. Meaning group string component 501, 503: a component made of meaning group strings, used to fill the simple sentence cabins 203 to 204 or the cabin eyes 205 to 207. The sentence cabin and the cabin eye stand in a superordinate-subordinate relation, but a simple sentence cabin and a cabin eye are the same size: neither holds more than three meaning groups, not counting words that carry no meaning of their own;
  • 4. Small idiom component 601: a sentence too short to be worth splitting into a sentence pattern and sentence cabins, used directly as a short sentence.
  • The sentence component libraries of the sentence component device are not limited to English, Chinese, and Russian. For each additional language, the sentence pattern library, the cabin model library, the meaning group library, and the idiom library each gain a sentence pattern field, a cabin model field, a meaning group string field, and an idiom field for that language, respectively.
  • A component in the newly added language may be added to an existing record only if it has the same semantics as the components already in that record.
  • A method for making sentence components in which the characters, words, and sentences of different languages are equivalent and unified, including the following steps:
  • Each round takes a bilingual pair of languages A and B as the sample pair, where language A is an alphabetic (phonetic-script) language or a language that has already been analyzed and compared, and language B may be chosen freely.
  • In the first round of bilingual analysis of the training samples, language A is English and language B is Chinese. From the second round on, one language of each new pair must already have been analyzed and compared: when Russian is added, only Chinese-Russian or English-Russian material can be used as bilingual training samples. In the second round of analysis, language A of the sample should be Chinese or English, and language B should be the new language.
  • Each round's training corpus should be large enough that the ratio of newly added sentence patterns to sentences is ⁇ 1% before a new language is added and a second round of analysis is conducted. In addition, according to the industry the training corpus comes from, or its scope of application, the sentence pattern library, the cabin model library, the meaning group library, and the idiom library are marked and divided into corresponding sub-libraries for an industry-specific or special version;
  • Analyze at the sentence cabin level: for the sample sentence pair that has already been divided into sentence patterns and sentence cabins, take out the content of each sentence cabin in turn, divide it into a cabin model and cabin eyes, and store the cabin model in the cabin model library as a cabin model component; store the meaning-group-aligned content of each cabin eye or simple sentence cabin in the meaning group library as a meaning group string component. After all sentence cabins have been processed, move on to the next bilingual sample sentence pair and perform step S2 again.
  • Step S2, the sentence-level analysis in the method for making sentence components, further includes the following steps:
  • step S23 is performed; if there is a matching sentence pattern, step S26 is performed;
  • N = N + 1; check whether two points have been clicked in both the A and B sentences and whether the clicks are valid. If not, the prompt is repeated. If the clicks are correct and valid, the content between the two points of the B sentence is dug out and replaced with "[N]"; this round of digging is finished, and step S24 is repeated to dig the next sentence cabin;
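The digging operation in this step can be pictured as cutting the text between two positions out of a sentence and leaving a numbered placeholder behind. The sketch below assumes the two expert clicks have already been converted to character offsets; the function name `dig_cabin` and the example sentence are illustrative only.

```python
def dig_cabin(sentence, start, end, n):
    """Cut sentence[start:end] out as sentence cabin n.

    Returns (pattern, content): the sentence with the span replaced by "[n]",
    and the span that was removed. Offsets are assumed to come from two valid
    expert clicks; invalid offsets raise ValueError.
    """
    if not 0 <= start < end <= len(sentence):
        raise ValueError("the two points are not valid")
    content = sentence[start:end]
    pattern = sentence[:start] + f"[{n}]" + sentence[end:]
    return pattern, content
```

Repeating the call with n = 1, 2, ... on the result gradually turns a sample sentence into a sentence pattern with numbered cabins.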
  • Step S3, the sentence-level analysis in the method for making sentence components, further includes the following steps:
  • the content of the B-string field of the same record: if that B-string content is contained in the current B-language sentence, it is filled into the B-language field of the reference table; if it is not contained, the field is left empty;
  • where the A-string field has several identical records, the reference table likewise holds several candidate records with the same A-language field, which completes the reference table. Window 2 is opened to display the reference table, the compound-string command button, and a prompt that compound strings can be formed; the expert clicks records in the reference table, and each click is recorded in the record's mark field. When the compound-string command button is clicked and consecutive records in the reference table have been clicked, the A-language field contents of the marked records are joined with "_"
  • Determine whether the current sentence cabin is a simple sentence cabin. If it is, perform step S37. If not, query the cabin model library to determine whether the current sentence cabin contains a known cabin model: if it does not, perform step S34; if it does, take that cabin model as the current cabin model, apply it to the current sentence cabin content, and perform step S36;
  • Open window 3 as an editable window, display the current bilingual sentence cabin content again, accept the cabin model written by the expert, and display the save-cabin-model command button;
  • if the new cabin model meets the format requirements, the newly edited A-language and B-language cabin models are stored as a cabin model component in the A and B cabin model fields of the cabin model library; the current sentence cabin content is then filled into the current cabin model, or into the new cabin model, and displayed as a complex sentence cabin that has been divided into cabin model and cabin eyes;
  • Take out the content of the next cabin eye in sequence, then proceed to step S37;
  • the alignment-confirmation command button is displayed below the reference table in window 2. In the reference table the expert may extend or add word meanings as the examples require, without changing the original characters or words, lengthening or shortening strings as needed. Once the meaning group strings have been formed, the contents of the A-language and B-language fields are stored as a meaning group string component in the A-string and B-string fields of the meaning group library.
  • Step S36 is repeated until all cabin eyes of the current sentence cabin have been processed. It is then determined whether the sample sentence pair still contains an unprocessed sentence cabin: if so, step S32 is executed to continue with the next sentence cabin; otherwise all sentence cabins have been processed, and step S31 is executed to begin the next sentence pair.
  • <3> A method for reading a foreign language in one's native language based on sentence components is provided, comprising the following steps:
  • The user specifies the native language and the source language (the foreign language, likewise below), each of which must be a language held in the component libraries; the screen is divided into upper, middle, and lower (or front, middle, and back) windows.
  • The lower (or back) window displays the source-language text to be read;
  • The upper (or front) window displays the native-language text that has been read;
  • Command buttons such as undo and save-and-exit, together with the word-order shift buttons, are displayed on the prompt line, or are placed on a floating bar that sits immediately below the middle window or can be moved by the user;
  • Reading in a source sentence: one sentence is read from the source text and displayed in the middle window as the current sentence; the native-language window shows the most recently processed sentence, and the source window shows the content of the current sentence.
  • Sentence-end check: once the whole of the current sentence has been processed, the buffers and command buttons are queried:
  • if the feedback buffer is not empty, its contents, together with information such as the source language, the native language, and the current source sentence, are sent by e-mail to the support website; the feedback buffer is then cleared, and a "feedback sentence" flag is stored in the world text buffer. When the undo command button is clicked, then according to the string the user clicked to undo, the corresponding entry in the undo selection cache is taken out.
  • SJW if the source text is not finished, the file header is recorded Lower source text offset; when the repentance operation, the save exit command button are not clicked, determine the current sentence is given, perform steps
  • Step S5: in the method for reading a foreign language in one's native language based on sentence components, the four sentence component libraries are used to convert between the foreign-language component, the corresponding native-language component, and the Yitong code by table lookup, returning the content of the native-language field of the same record.
  • Step S6 further includes the following steps:
  • step S603; if there is no matching sentence pattern, the representative number of the sentence pattern library is stored in the feedback buffer;
  • The native-language sentence pattern is displayed in the upper part of the middle window; the source sentence is shown nested in the source sentence pattern, as an annotation, below the native-language sentence pattern in that window; and the sentence pattern code is written to the world text buffer;
  • Step S604: take a sentence cabin. Working from left to right in the middle window, mark the current sentence cabin in the native-language sentence pattern, store the current sentence cabin number in the world text buffer, and mark and extract the corresponding source-language sentence cabin content as the current sentence cabin content. Determine whether the current sentence cabin content is a simple sentence cabin: if not, perform step S605; if so, perform step S608;
  • if exactly one matching cabin model is found, step S606 is performed; if multiple matching cabin models are found, the middle window is extended downward, the corresponding native-language cabin models are displayed in the extension, and step S606 is performed after the user makes a selection; if there is no matching cabin model, the representative number of the cabin model library is stored in the feedback buffer;
  • The native-language cabin model is displayed in the extension of the middle window; the source sentence cabin content is shown nested in the source-language cabin model, as an annotation, below the native-language cabin model; and the cabin model code is written to the world text buffer;
  • Cabin eyes are taken from left to right in the native-language cabin model: each current cabin eye is marked in turn on the native-language cabin model, its number is stored in the world text buffer, and the corresponding source cabin eye content is marked and extracted. Go to step S608;
  • Step S608: determine the meaning group. Read a word string from left to right in the source-language simple sentence cabin or cabin eye, and query the source-language string field of the meaning group library. If exactly one identical string is found, perform step S609. If several identical strings are found, take out the contents of the native-language string fields of their records, back them up in the undo selection cache, and display them in the lower part of the extended middle window; after the user makes a selection, perform step S609. If no identical string is found, store the current source word string in the feedback buffer;
  • Step S609: take out the content of the native-language string field of the current record and fill it into the current native-language sentence cabin or cabin eye, and store the meaning group code in the world text buffer; return to step S608 until the current simple sentence cabin or cabin eye has been fully processed.
  • the current sentence or cabin eye loss compensation operation According to the information of the personality loss table, the current sentence or cabin eye loss compensation operation; then correct the current sentence or the cabin's native language word order according to the information of the native language word list; last query ⁇ , one move word order button, when ⁇ button Clicked to click on the word string of the current sentence or cabin user ,
  • step S610 is performed subsequently;
  • Step S610: check progress. If the current sentence cabin still has unprocessed cabin eyes, perform step S607. Otherwise, if the current sentence still has unprocessed sentence cabins, perform step S604. If all sentence cabins of the current sentence have been processed, continue with step S7.
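The lookup flow of step S6 can be sketched for the simplest case, a sentence whose cabins are all simple sentence cabins. Everything below is a hedged illustration: the sample records, the `{n}` slot notation, the regular-expression matching, and the function names are assumptions, and cabin models, user selection among multiple matches, loss compensation, and word-order correction are left out.

```python
import re

# Hypothetical sample records; only the "as long as" pattern number 01074
# appears in the source, the texts themselves are invented for illustration.
SENTENCE_PATTERNS = [
    {"code": "101074", "en": "as long as {1}, {2}", "zh": "只要{1}，{2}"},
]
MEANING_GROUPS = [
    {"code": "300001", "en": "you work hard", "zh": "你努力工作"},
    {"code": "300002", "en": "you will succeed", "zh": "你会成功"},
]


def pattern_to_regex(pattern):
    """Turn 'as long as {1}, {2}' into a regex with one named group per cabin."""
    parts = re.split(r"\{(\d+)\}", pattern)  # odd positions are cabin numbers
    pieces = []
    for i, part in enumerate(parts):
        pieces.append(f"(?P<s{part}>.+?)" if i % 2 else re.escape(part))
    return re.compile("^" + "".join(pieces) + "$", re.IGNORECASE)


def read_sentence(sentence, source, native):
    """Return (native text, codes for the world text buffer, feedback items)."""
    for rec in SENTENCE_PATTERNS:
        m = pattern_to_regex(rec[source]).match(sentence)
        if not m:
            continue
        codes, feedback, filled = [rec["code"]], [], rec[native]
        for name, content in m.groupdict().items():
            hit = next((g for g in MEANING_GROUPS
                        if g[source].lower() == content.lower()), None)
            if hit is None:
                feedback.append(content)  # unknown string: report to experts
                text = content
            else:
                codes.append(hit["code"])
                text = hit[native]
            filled = filled.replace("{" + name[1:] + "}", text)
        return filled, codes, feedback
    return None, [], [sentence]  # no matching sentence pattern
```

The returned code list plays the role of the world text buffer, and the feedback list plays the role of the feedback buffer.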
  • In the method for reading a foreign language in one's native language based on sentence components, feedback is e-mailed to the support website. When the support website receives a feedback e-mail from a user, experts process it in real time and add the new component to the corresponding component library; the new component and related information are returned to the user in real time and, with the user's participation, replace the original "feedback sentence" flag. The content of the world text buffer is saved as world text: while the user reads the foreign language in the native language, the world text buffer stores, in real time, the Yitong codes of the sentence patterns, idioms, cabin models, and meaning groups, together with sentence cabin numbers, cabin eye numbers, and so on; saved as world text, these can later be read out directly in a native language.
  • The sentence-component-based method for reading a foreign language in one's native language is just one application of sentence components. By reference to its encoding step, in which the current sentence is encoded using the four sentence component libraries, and its decoding step, in which world text is read out, a variety of sentence-component-based application systems can be built: a world text generation method, used to convert traditional text into world text for subsequent read-out in multiple languages; a text conversion method, used to convert a source text into a target-language text, or into several languages; and a machine translation method, used to translate a source language into a target language, or into several languages.
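The decoding side, reading stored world text out in any library language, amounts to looking each code up and taking the requested language field. The sketch below assumes a simplified world text entry of the form (sentence pattern code, {cabin number: meaning group code}); this structure, the sample records, and the name `render` are illustrative assumptions rather than the file format of the source.

```python
# Hypothetical records keyed by code; texts are invented for illustration.
LIBRARY = {
    "101074": {"en": "as long as {1}, {2}", "zh": "只要{1}，{2}"},
    "300001": {"en": "you work hard", "zh": "你努力工作"},
    "300002": {"en": "you will succeed", "zh": "你会成功"},
}


def render(world_sentence, language):
    """Read one world text sentence out in `language`.

    `world_sentence` is (pattern_code, {cabin_number: meaning_group_code}).
    """
    pattern_code, cabin_codes = world_sentence
    text = LIBRARY[pattern_code][language]
    for cabin_no, code in cabin_codes.items():
        text = text.replace("{%d}" % cabin_no, LIBRARY[code][language])
    return text
```

The same stored entry yields a different surface sentence per language, which is the property the world text is meant to provide.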
  • The beneficial effects of the present invention are:
  • The sentence component storage unit of the sentence component device is provided with four libraries, which store sentence patterns, cabin models, meaning group strings, and small idioms respectively. Each library has a code field that holds the compiled Yitong code.
  • The Yitong code not only uniquely represents the shared semantics of the components in a record, but can also be decomposed to locate a particular record in a particular library. The benefit of this design is that a component in one language can be converted directly into its counterpart in another by way of the Yitong code.
  • The sentence pattern and the cabin model provide the framework of the sentence, carry its complicated grammar, and fix the positions of the sentence cabins and cabin eyes. This avoids the difficulty the prior art faces in using artificial intelligence for syntactic and grammatical analysis, with the benefit that the result stays consistent with the original text.
  • Sub-libraries are split off for industry-specific or special versions and are backed by a support website; this helps to segment users and brings them further benefit.
  • The steps of the method for making sentence components ensure that the sentence components make the meanings of the characters, words, and sentences of different languages equivalent and unified.
  • The sentence component libraries are more reliable than the electronic dictionaries and rule bases of prior-art machine translation, so translation quality and text conversion quality are more dependable, with the benefit of improved text quality.
  • The foreign language reading method is one application of sentence components.
  • A foreign language can be read directly in the native language, by anyone. Reading also generates world text, so once a foreign-language text has been read by one person, everyone afterwards can read the world text in their own native language, whichever language that is, without further intervention; this has long been a dream.
  • FIG. 1 is a schematic structural view of a sentence member device;
  • FIG. 2 is a schematic diagram of a sentence member;
  • FIG. 3 is a schematic diagram of a sentence library;
  • FIG. 4 is a schematic diagram of a cabin model library;
  • FIG. 5a is a schematic diagram of the meaning group string library (English single string);
  • FIG. 5b is a schematic diagram of the meaning group string library (English complex string);
  • FIG. 6 is a schematic diagram of the idiom library;
  • FIG. 7 is a flowchart of the sentence-pattern-level comparison;
  • Sentence: in natural language, the basic unit that expresses complete semantics is called a sentence; sentences in different languages can express the same semantics. A sentence can be divided into two parts, the sentence pattern and the sentence cabins; a sentence pattern contains at least one sentence cabin.
  • Sentence pattern: the part abstracted from a class of sentences that is relatively stable within the sentence, embodies the sentence's basic semantics and category, and forms its basic structural framework.
  • The basic semantics and category expressed by a sentence pattern are common to all humanity and cross-lingual, whereas the basic structural framework is specific to a particular natural language and carries that language's complex and idiosyncratic grammatical phenomena.
  • Sentence cabin: the flexible, replaceable parts embedded in the basic structural framework of a sentence pattern are called sentence cabins.
  • A sentence cabin is selected and constrained by the sentence pattern; it can be filled or replaced with meaning group strings to form rich and specific sentences.
  • The number of sentence cabins and their semantic content are common to all humanity and cross-lingual, but their position and order in the basic structural framework of the sentence pattern, and the meaning group strings that fill them, are specific to each natural language. Explanation of the example sentence patterns (# denotes a line number):
  • Example 1: lines 1#, 6#, and 11# each give the basic semantics and category of one of the three sentence patterns, the part that is common to all humanity and cross-lingual. For example, "as long as" names the category and basic semantics of a sentence pattern, and (01074) is the sentence pattern number, that is, the low-order decimal digits of the Yitong code.
  • Lines 2# to 5#, 7# to 10#, and 12# to 15# give the structural frameworks of the three sentence patterns, which are specific to particular natural languages.
  • The left part of each line is the frame structure of the sentence pattern, with the sentence cabins marked by curly braces; the right part gives the corresponding sentence cabins with example content.
  • Lines 2#, 7#, and 12# are Chinese; 3#, 8#, and 13# are English; 4#, 9#, and 14# are Russian; 5#, 10#, and 15# are for other languages.
  • The numbers in or above the curly braces are the sentence cabin numbers.
  • The number of sentence cabins (for example, two for 1# and four for 6#) and the semantics of each sentence cabin are common to all humanity and cross-lingual; the position and order of a sentence cabin in the sentence structure, and the meaning group strings used to fill it, are language-specific (for example, {2} in lines 7# to 9# sits in a different position in the Chinese, English, and Russian sentence patterns, and is filled with the Chinese, English, and Russian strings for "work" respectively).
  • A sentence cabin is filled with, or composed of, meaning group strings (as a first approximation, strings of words).
  • Sentence cabins vary widely in size.
  • The smallest sentence cabin contains a single meaning group; the largest can contain a whole clause.
  • Meaning group: a meaning group is the equivalence and unity of the "meaning" of characters, words, or phrases across natural languages; it is the basic unit of human thinking.
  • Meaning groups are common to all humanity; they also change as humanity develops.
  • Meaning group strings are either single strings or complex strings: a string consisting of a single original string is a single string (see the English strings in FIG. 5a); a string consisting of two or more original strings joined with "_" is a complex string (see the English strings in FIG. 5b).
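The single/complex distinction comes down to whether original strings have been joined with "_". A minimal sketch follows; the function names and the example phrase "look forward to" are hypothetical and do not come from FIG. 5a or 5b.

```python
def join_complex_string(original_strings):
    """Join two or more original strings into one complex string with "_"."""
    if len(original_strings) < 2:
        raise ValueError("a complex string needs at least two original strings")
    return "_".join(original_strings)


def is_complex_string(string):
    """A string is complex if it contains the "_" joiner, otherwise single."""
    return "_" in string


def split_string(string):
    """Recover the original strings of a single or complex string."""
    return string.split("_")
```

Treating the joined form as one library key lets a multi-word expression be matched and filled as a single meaning group.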
  • Simple sentence cabin: a sentence cabin holding no more than three meaning groups, not counting words that carry no meaning of their own, is called a simple sentence cabin (203 to 204 in FIG. 2).
  • sentence cabin 2 are simple sentence cabins, with the English of sentence cabin 2 using a complex string; sentence cabin 3 is larger than a simple sentence cabin and contains a cabin model, so it is a complex sentence cabin.
  • Cabin model and cabin eye: analyzing the content of a complex sentence cabin further, its frame structure is called the cabin model, and the replaceable parts embedded in that frame structure are called cabin eyes.
  • The sentence cabin and the cabin eye stand in a superordinate-subordinate relation; however, a simple sentence cabin and a cabin eye are the same size, each holding no more than three meaning groups, not counting words that carry no meaning of their own.
  • [1] + is his + [2] + and + [3]
  • (00205) is the cabin model number; this cabin model contains 3 cabin eyes, and the content of each cabin eye is no more than three meaning clusters: 3{1[fairy] 2[dance] and 3[play heavenly music] for him}
  • Statement component: a statement component is a unit of characters, words, or sentences whose meaning is equivalent and unified across different languages. Following the meaning expressed in natural language, people analyze and compare sentence pairs and extract unified sentence patterns, cabin models, meaning clusters, and small idioms.
  • once stored in a library and encoded, a statement component can serve as a part from which sentences are assembled, or as a standard part against which sentences are encoded.
  • statement components comprise sentence pattern components, cabin model components, meaning group string components, and small idiom components. Yitong code: the unified code assigned to statement components that are mutually interpretable across multiple languages is called the Yitong code.
  • World text: a special text file generated from Yitong codes; it reflects the interoperability of many languages and can be used for multi-language reading or text conversion.
  • this special file is intended for worldwide use and is therefore called world text.
  • Sentence structure theory reduces the complexity of natural language, accommodates its flexibility, and reconciles the grammatical inconsistencies between languages.
  • the computer provides a convenient operating platform; through human-computer interaction, the human brain and the computer complement each other well.
  • the sentence patterns, cabin models, meaning group strings, and small idioms produced by the analysis and comparison process are stored in libraries and assigned Yitong codes.
  • in this way the statement component library is generated.
  • what the statement component library stores is statement components.
  • These statement components are equivalent in meaning across multiple languages and can be assembled and spliced into sentences (Figure 2). We can use these components to assemble sentences; we can also use them to encode sentences, using the matching Yitong codes to generate world text, and so on. Throughout, the computer only needs to perform simple table lookups and judgments to carry out the encoding or decoding.
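The table-lookup idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation described in the patent: the record layout, the code values 1001 and 1002, the Chinese entries, and the function names `encode`, `decode`, and `convert` are all invented for the sketch.

```python
# Hypothetical meaning group library: each record holds one shared code and
# one string per language. Codes and entries are invented for illustration.
MEANING_GROUP_LIBRARY = [
    {"code": 1001, "en": "fisherman", "zh": "渔夫"},
    {"code": 1002, "en": "fairy", "zh": "仙女"},
]


def encode(text, lang, library=MEANING_GROUP_LIBRARY):
    """Table lookup: component text in one language -> shared code, or None."""
    for record in library:
        if record.get(lang) == text:
            return record["code"]
    return None


def decode(code, lang, library=MEANING_GROUP_LIBRARY):
    """Table lookup: shared code -> component text in the requested language."""
    for record in library:
        if record["code"] == code:
            return record.get(lang)
    return None


def convert(text, src, dst, library=MEANING_GROUP_LIBRARY):
    """Convert a component between languages by going through the shared code."""
    code = encode(text, src, library)
    return None if code is None else decode(code, dst, library)


print(convert("fairy", "en", "zh"))
```

Because every language field in a record shares one code, conversion in either direction needs no linguistic analysis, only two lookups.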
  • FIG. 1 is a schematic diagram of the structure of the sentence component device.
  • the sentence component device comprises seven parts: a sentence component storage unit 101, an original unit 102, a Yitong code compiling unit 103, a component reading unit 104, a component matching unit 105, a component adding unit 106, and a component library operation control and interface unit 107:
  • the sentence component storage unit 101 is the central part of the device.
  • the sentence patterns are stored in the corresponding sentence pattern field 301.
  • the sentence pattern referred to here is actually the framework part of the sentence pattern, which is specific to each natural language.
  • curly braces denote a sentence cabin, and the number inside them is the sentence cabin's number; a sentence cabin is filled with meaning clusters. The position and order of the sentence cabins in a sentence pattern, and the meaning clusters that fill them, are all specific to the natural language: as the library content 301 in Fig. 3 shows, the same sentence cabin carries the same number in every language, but its position and order differ between the sentence patterns of the various languages.
  • the sentence pattern code field stores the sentence pattern code, which represents the shared semantics of the sentence patterns held in the language fields of the same record.
  • the basic semantics and category that a sentence pattern reflects are common to all humans and cross-lingual, as are the number of sentence cabins it contains and the meaning of each cabin; the sentence pattern code is what expresses this cross-lingual content. In other words, the sentence pattern code represents the semantics of the sentence pattern and maps to the sentence pattern of each language, so a sentence pattern in one language can be mapped to the sentence pattern of another language through the code.
  • as for grammar, it belongs to the individual natural languages.
  • the framework part of the sentence pattern covers the complex, language-specific grammatical phenomena of natural language; whatever grammatical phenomena remain inside a sentence cabin are extremely simple.
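As a rough illustration of how numbered sentence cabins let one set of contents be placed into language-specific frames, the following Python sketch fills `{N}` slots in a pattern. The English pattern is taken from the example sentence used elsewhere in this description; the record layout, the code value "S-demo", the `fill_pattern` helper, and the B-language pattern (an English stand-in gloss that only demonstrates a different slot order) are assumptions.

```python
import re


def fill_pattern(pattern, cabins):
    """Replace every {N} slot in a sentence pattern with the content of cabin N."""
    return re.sub(r"\{(\d+)\}", lambda m: cabins[int(m.group(1))], pattern)


# Hypothetical sentence pattern record: one shared code, one pattern per language.
record = {
    "code": "S-demo",
    "A": "Does {1} {2} as {3} as {4}?",
    # Stand-in gloss for a second language with a different cabin order.
    "B": "{1} like {4} {3} {2}?",
}

# The same cabin numbers are used in every language; only their order differs.
cabins = {1: "John", 2: "work", 3: "hard", 4: "Henry"}

print(fill_pattern(record["A"], cabins))
print(fill_pattern(record["B"], cabins))
```

In a real conversion the cabin contents would themselves be converted through the meaning group library before being filled into the target pattern; the sketch keeps them unchanged to isolate the reordering step.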
  • the cabin model library 400 stores cabin model components and has a cabin model code field and English, Chinese, and Russian cabin model fields, as shown in Figure 4. It contains at least one record; cabin models of the same semantics share one record, and the cabin model of each language is stored in the corresponding cabin model field 401.
  • the cabin model is the frame structure part of a complex sentence cabin and is specific to the natural language.
  • square brackets denote a cabin eye, the number inside them is the cabin eye's number, and a cabin eye is likewise filled with meaning clusters.
  • the position and order of the cabin eyes in a cabin model, and the content that fills them, are all specific to the natural language; as the library content in Figure 4 shows, the same cabin eye carries the same number in every language, but its position and order differ between the cabin models of the various languages.
  • the cabin model code field stores the cabin model code, which represents the shared semantics of the cabin models held in the language fields of the same record.
  • the basic meaning of a cabin model is common to all humans and cross-lingual, as are the number of cabin eyes it contains and the meaning of each cabin eye; the cabin model code is what expresses this. In other words, the cabin model code represents the semantics of the cabin model and maps to the cabin model of each language, so a cabin model in one language can be mapped to that of another language through the code.
  • as for the grammar inside a sentence cabin, it belongs to the individual natural language.
  • the cabin model covers the grammatical phenomena of the natural language; whatever grammatical phenomena remain inside a cabin eye are extremely simple.
  • the meaning group libraries 500 and 502 store meaning group string components and have a meaning group code field and English, Chinese, and Russian string fields, as shown in Figs. 5a and 5b. Each contains at least one record; meaning group strings of the same semantics share one record, and the string of each language is stored in the corresponding string field 501, 503.
  • the meaning cluster is the content of a sentence cabin or a cabin eye.
  • the sentence cabin and the cabin eye stand in a superordinate-subordinate relationship.
  • sentence cabins are divided into simple sentence cabins and complex sentence cabins.
  • when the frame structure of a complex sentence cabin is extracted as a cabin model, the replaceable parts that remain are the cabin eyes.
  • the sentence cabin and the cabin eye stand in a superordinate-subordinate relationship; however, a simple sentence cabin and a cabin eye are equal in size: not counting function words, neither holds more than three meaning clusters.
  • in an alphabetic script, a string is either a single string or a compound string.
  • a single string is one original word string (501).
  • a compound string consists of two or more original word strings joined by "_" (503).
  • the meaning group code field stores the meaning group code, which represents the shared semantics of the meaning group strings held in the language fields of the same record; these semantics are common to all humans, cross-lingual, and valid for all natural languages.
  • the meaning group code represents the semantics of the meaning group strings and maps to the string of each language, so a string in one language can be mapped to the string of another language through the code.
  • the idiom library contains at least one record; small idioms of the same semantics share one record, and the small idiom of each language is stored in the corresponding idiom field 601.
  • the idiom code represents the shared semantics of the small idioms held in the language fields of the same record. In other words, the idiom code represents the semantics of the small idioms and maps to the small idiom of each language, so an idiom in one language can be mapped to the idiom of another language through the code.
  • the structure of the four libraries above enforces one rule: only components of the same semantics may share a record, and every record has a code field that holds the Yitong code.
  • the Yitong code is thus tied to the same-semantics components of its record.
  • this structure ensures that components can be converted into one another directly by way of the Yitong code; that is, different languages can be converted into each other.
  • the four libraries are parallel and do not interfere with one another; they are all held in the sentence component storage unit and accept the operations and control of the other units.
  • the original unit 102 stores the index files for the four libraries above; it also includes the original CPU and the like.
  • the Yitong code compiling unit 103 is connected to the sentence component storage unit 101.
  • the code it compiles is unique: within the current library, it alone represents the shared semantics of the language components in the current record;
  • the component matching unit 105 is connected directly to the sentence component storage unit 101 and receives matching commands. Given a sentence or sentence content in a given language, and guided by the current operating point, it searches the index of the corresponding language field in the corresponding component library and returns the matching language components. If no record matches, it returns a no-match signal.
  • the component adding unit 106 is connected directly to the sentence component storage unit 101 and receives add-new-component commands. After a query confirms that the corresponding component library does not already hold the same component, the new component is added to the corresponding language field of that library. When the new component creates a new record, the Yitong code compiling unit 103 is notified at the same time.
  • the component library operation control and interface unit 107 is connected to the sentence component storage unit 101 through the component reading unit 104, the component matching unit 105, and the component adding unit 106. It receives calls or commands from the various applications built on statement components and returns the statement components the caller requires, or connects through the interface to other application devices built on statement components.
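The behavior of the matching and adding units can be approximated with a small in-memory class. This is a minimal sketch under assumptions: the class name `ComponentLibrary`, the dictionary record layout, and the sequential integer codes standing in for the work of the code compiling unit 103 are all hypothetical.

```python
class ComponentLibrary:
    """Toy component library: one record per meaning, one field per language."""

    def __init__(self, languages):
        self.languages = list(languages)
        self.records = []       # each record: {"code": int, "<lang>": text, ...}
        self._next_code = 1     # stand-in for the code compiling unit

    def match(self, lang, text):
        """Return the record whose `lang` field equals `text`, or None (no match)."""
        for record in self.records:
            if record.get(lang) == text:
                return record
        return None

    def add(self, lang, text, code=None):
        """Add a component only if the library does not already hold it.

        With code=None a new record is created and a new code allotted;
        otherwise the component joins the existing record with that code,
        which the caller asserts has the same semantics.
        """
        if lang not in self.languages:
            raise ValueError("unknown language field: %s" % lang)
        existing = self.match(lang, text)
        if existing is not None:
            return existing
        if code is None:
            record = {"code": self._next_code, lang: text}
            self._next_code += 1
            self.records.append(record)
            return record
        for record in self.records:
            if record["code"] == code:
                if lang in record:
                    raise ValueError("record already has a %s component" % lang)
                record[lang] = text
                return record
        raise KeyError(code)


lib = ComponentLibrary(["en", "zh"])
first = lib.add("en", "fisherman")        # creates a new record and a new code
lib.add("zh", "渔夫", code=first["code"])  # same semantics, same record
print(lib.match("zh", "渔夫"))
print(lib.match("en", "fairy"))           # None is the no-match signal here
```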
  • the statement components described above (see FIGS. 2 and 3 to 6) are parts from which sentences of a language are assembled, and also standard parts against which sentences are split and encoded. There are four types, as follows:
  • the sentence pattern component also determines the order and number of sentence cabins in the sentence, and it covers the more complicated grammatical phenomena of that class of sentence.
  • cabin model components 202, 401 form the basic structural framework of a complex sentence cabin. A cabin model represents the basic semantics of that class of sentence cabin, determines the number and order of the cabin eyes it contains, and covers its more complicated grammatical phenomena. Both sentence patterns and cabin models provide a framework for sentences; the framework carries the complex grammar and fixes the order of the sentence cabins and cabin eyes. This avoids the difficulty of using artificial intelligence for syntactic and grammatical analysis, and helps the result stay consistent with the original text.
  • the meaning group string component (501, 503) is the component formed by a meaning cluster.
  • the simple sentence cabin and the cabin eye, although they stand in a superordinate-subordinate relationship, are alike in holding no more than three meaning clusters, not counting function words;
  • the small idiom component 601: a sentence too short to be split into a sentence pattern and sentence cabins is treated as a small idiom component. It is used to form short sentences directly.
  • in the sentence component device described above, beyond the languages already covered (English, Chinese, Russian), each additional language first requires adding a sentence pattern field, a cabin model field, a string field, and an idiom field to the sentence pattern library, the cabin model library, the meaning group library, and the idiom library respectively.
  • a newly added component may only be placed in an existing record if it has the same semantics as the language components already there. That is, to emphasize once more, only statement components of the same semantics may share a record.
  • Method of making statement components: (1) prepare a sample corpus, taking bilingual or multilingual versions of the same content as training samples. Using human-computer interaction, first analyze and compare at the sentence level, then analyze and compare at the sentence cabin level,
  • so as to extract statement components whose character, word, and sentence meanings are equivalent and unified. The method includes the following steps:
  • the training corpus for each round should be large enough that the ratio of new sentence patterns to new sample sentence pairs reaches ≤ 1% before adding further languages is considered; for example, counting over one working day of operation, the number of new sentence patterns divided by the number of new
  • sample sentence pairs is ≤ 1%.
  • components are labeled according to the industry or application domain from which the training sample corpus comes.
  • FIG. 7 is the sentence-level comparison flowchart. At the sentence level, bilingual sentence pairs are read and divided into sentence patterns and sentence cabins; the sentence patterns are stored in the sentence pattern library as sentence pattern components, and sentences too short to be split into a sentence pattern and sentence cabins are stored in the idiom library as small idiom components. The bilingual sample sentence pairs that have been divided into sentence patterns and sentence cabins are saved as well, for the subsequent comparison at the sentence cabin level. The specific steps are shown in Figure 7. At the start, a bilingual sample sentence pair is read (701). Then the sentence pattern matching subroutine is called (702); it queries the sentence pattern library and returns the matching A-language and B-language sentence patterns. Whether a matching sentence pattern exists is then judged (703).
  • if there is no matching sentence pattern, the new-sentence-pattern operation (704) digs out sentence cabins one at a time. The expert clicks two points in each of the A and B sentences; the span between the two points is valid if it covers at least one word string in an alphabetic script, or at least one character in an ideographic script. If it is not valid, the expert is prompted to redo the selection; if the clicks are correct and valid, the content between the two points in the A and B sentences is dug out and replaced with "{N}". This ends one round of cabin digging, and the next round begins.
  • when the save-sentence-pattern command button is clicked (705) and N ≥ 1, the new-sentence-pattern operation is complete. The corresponding display is cleared and the sentence pattern is saved (706): the two new sentence patterns are added as sentence pattern components to the A-language and B-language sentence pattern fields of the sentence pattern library.
  • if N = 0 at this point, the current bilingual sample sentence pair is too short to be split into a sentence pattern and sentence cabins and is judged to be a small idiom. The corresponding display is cleared, and the two small idioms are added as idiom components to the A-language and B-language idiom fields of the idiom library.
  • the fill-sentence-pattern step (707) then fills the current bilingual sample sentence pair into the current matching sentence pattern, or into the current new sentence pattern, producing a sample sentence pair that has been divided into sentence patterns and sentence cabins, such as "Does 1{John} 2{work} as 3{hard} as 4{Henry}?" and "1{John} Like 4{Henry} 2{Effort} 3{Work}?". The pair is saved for the sentence-cabin-level comparison to read, and this step ends.
  • the next bilingual sample sentence begins, then Execute reading the bilingual sample sentence pair 701. If the sentence pair read above is "How do you do?", "Hello!
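The cabin-digging and classification logic of the sentence-level flow can be sketched as follows. This is a simplified illustration, not the interactive procedure itself: the expert's two mouse clicks are replaced by a list of substrings given in digging order, and the function name `dig_cabins` and the returned labels are assumptions.

```python
def dig_cabins(sentence, contents):
    """Dig cabins out of a sentence, replacing each content with a numbered slot.

    `contents` lists the cabin contents in the order the expert digs them out
    (a simplification of clicking two points in the sentence). Returns a
    (kind, pattern) pair: with N >= 1 cabins the result is a sentence pattern,
    with N == 0 the whole sentence is kept as a small idiom.
    """
    pattern = sentence
    for number, content in enumerate(contents, start=1):
        if not content or content not in pattern:
            raise ValueError("invalid cabin selection: %r" % content)
        # Replace only the first occurrence, as one click pair selects one span.
        pattern = pattern.replace(content, "{%d}" % number, 1)
    kind = "sentence_pattern" if contents else "small_idiom"
    return kind, pattern


print(dig_cabins("Does John work as hard as Henry?",
                 ["John", "work", "hard", "Henry"]))
print(dig_cabins("How do you do?", []))
```

The same routine would be run on the A-language and B-language sentences of a pair, with matching cabin numbers, before the two resulting patterns are stored in one record.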
  • Principle of multilingual comparison and semantic consideration: semantics is considered by comparing multiple languages, or at least two. Where conditions permit, it is naturally better to compare as many languages as possible when extracting sentence patterns; where they do not, at least a bilingual comparison is required. For example:
  • (5) Principle of part of speech and replaceability: a sentence cabin is the part of a sentence that can be replaced by other words.
  • the part of the vocabulary in the sentence cabin is limited to numbers, nouns, adjectives, and plurals. In a few cases, other words (such as verbs, adverbs, etc.) are considered. If you want to give priority to the class of words in the sentence cabin, then the first is the string, the special string, the second string, the adjective string..., and finally consider the verb string. The least considered are prepositions and conjunctions. That is to say, prepositions and conjunctions are almost all part of the sentence.
  • in the four sentence pairs above, "when" in each case introduces a clause.
  • in the first pair, the subject-predicate part and the time clause can be separated into two sentence cabins.
  • the second pair can be made into three sentence cabins.
  • in the third pair the adverbial cannot be separated, so the pair can only be made into one sentence cabin; the fourth pair likewise cannot be separated and can only be made into one sentence cabin.
  • the third and fourth pairs are cases of "it can only be this way", a judgment to be made in the minority of cases.
  • Figure 8 is the sentence-cabin-level comparison flowchart. As shown in Figure 8, at the start a bilingual sample sentence pair that has been divided into sentence patterns and sentence cabins is read (801).
  • the take-sentence-cabin and group-compound-word step (802) takes the sentence cabins of the sample sentence pair out one at a time in order. Window one shows the A-language and B-language sample sentences in its upper part and the current A-language and B-language sentence cabins in its lower part.
  • the current A-language sentence cabin is segmented into word strings, which are filled in turn into the A-language field of the reference table. Each word string is then looked up in the A-language string field of the meaning group library; when it is found, the content of the B-language string field of the same record is taken out.
  • if that B-language string is contained in the current B-language sentence cabin, it is filled into the B-language field of the reference table; if not, the field is left empty.
  • the cabin model library is then queried to judge whether the current sentence cabin contains a cabin model. If it does not, the write-cabin-model step (805) is executed: window three is opened as an editable window, the content of the current bilingual sentence cabin is displayed again, and the expert writes the cabin model.
  • the save-cabin-model command button is displayed as well. If a cabin model is contained, it is used as the current cabin model. Next, the cabin model and cabin eyes are divided (806): once the save-cabin-model command button has been clicked and the editing in the editable window is complete,
  • the current sentence cabin content is filled into the current cabin model, or into the new cabin model, and displayed as a complex sentence cabin that has been divided into cabin model and cabin eyes.
  • the alignment-confirmation command button is displayed.
  • the reference table lets the expert extend or add word meanings according to the example without changing the original word entries: adjusting the length of a string, attaching accompanying words, changing word forms, and supplementing meaning items.
  • when the alignment-confirmation command button is clicked, the A-language and B-language strings in the reference table have been aligned in meaning, that is, meaning group strings have been formed. The save-meaning-group-string operation (808) is executed, and the contents of the A-language and B-language fields are stored record by record
  • as meaning group string components in the A-language and B-language string fields of the meaning group library. If the current operation is on a cabin eye, whether the cabin eyes are finished is judged (809); if the current sentence cabin still has unprocessed cabin eyes, the take-cabin-eye step (807) is executed, until all cabin eyes of the current sentence cabin are done.
  • whether the sentence cabins are finished is then judged (810), that is, whether the sample sentence pair still has unprocessed sentence cabins. If it does, the take-sentence-cabin step (802) is executed to take the next sentence cabin. If all sentence cabins have been processed, the read-sentence-pair step (801) is executed again to read in the next bilingual sample sentence pair that has been divided into sentence patterns and sentence cabins, and the next round of operations begins.
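The reference table lookup described above can be sketched as follows. The sketch assumes that word strings are separated by spaces in the A language and that containment in the B-language cabin is tested by plain substring search; the function name `build_reference_table`, the record layout, and the Chinese entries (stand-ins, since the original Chinese text is not preserved here) are illustrative assumptions.

```python
def build_reference_table(cabin_a, cabin_b, library):
    """Pre-fill a reference table for one pair of sentence cabins.

    Each A-language word string becomes one row. Its B field is filled only
    if a same-record B string from the library occurs in the B-language
    cabin; otherwise it is left empty for the expert to complete.
    """
    table = []
    for word in cabin_a.split():
        candidates = [rec["B"] for rec in library if rec["A"] == word]
        b_value = next((b for b in candidates if b in cabin_b), "")
        table.append({"A": word, "B": b_value})
    return table


# Hypothetical meaning group library (A = English, B = Chinese).
library = [
    {"A": "play", "B": "游戏"},      # "game": a sense that does not fit this cabin
    {"A": "heavenly", "B": "天上"},
    {"A": "music", "B": "音乐"},
]

for row in build_reference_table("play heavenly music", "演奏天上的音乐", library):
    print(row)
```

The empty B field for "play" corresponds to the case where the expert has to extend or add a word meaning by hand.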
  • At the start, a bilingual sample sentence pair that has been divided into sentence patterns and sentence cabins is read (801), for example "1{the fisherman} consents to 2{return the_feather_suit}, on condition that 3{fairy dance and play heavenly music for him}." and "Under the condition of 3{fairy dancing for him and playing the music of the heavens}, 1{fisherman} promised 2{return the feathers}." Each example sentence has three sentence cabins: 1, 2, and 3.
  • the current A-language sentence cabin is segmented into word strings, which are filled into the A-language field of the reference table.
  • in the reference table, the B-language field is empty for the three records taken together, although "feather" on its own has the entry "feather" and "suit" on its own has the entry "clothes"; the expert marks these three records by clicking them.
  • once the compound-word command button is clicked, the contents of the A-language field of the marked records are joined with "_" into a compound word, and the marked records are merged into one record whose A-language field holds the compound word and whose B-language field holds the semantically equivalent string.
  • the reference table is updated accordingly.
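The merge of marked reference table records into one compound-string record can be sketched like this. The function name `merge_marked`, the row layout, the requirement that the marked rows be consecutive, and the Chinese string "羽衣" are assumptions made for the illustration.

```python
def merge_marked(table, marked, b_string):
    """Merge the marked rows of a reference table into one compound-word row.

    `marked` holds the indices the expert clicked; this sketch assumes they
    are consecutive. The A strings are joined with "_" and `b_string` is the
    semantically equivalent B-language string supplied by the expert.
    """
    marked = sorted(marked)
    if not marked or marked != list(range(marked[0], marked[-1] + 1)):
        raise ValueError("marked rows must be a non-empty consecutive run")
    compound = "_".join(table[i]["A"] for i in marked)
    merged = {"A": compound, "B": b_string}
    return table[:marked[0]] + [merged] + table[marked[-1] + 1:]


table = [
    {"A": "return", "B": ""},
    {"A": "the", "B": ""},
    {"A": "feather", "B": ""},
    {"A": "suit", "B": ""},
]
for row in merge_marked(table, [1, 2, 3], "羽衣"):
    print(row)
```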
  • the cabin model library is queried to judge whether the current sentence cabin contains a cabin model. If it does not, the write-cabin-model step (805) is executed: window three is opened as an editable window, the current bilingual sentence cabin content, "fairy dance and play heavenly music for him" and "The fairy dances for him and plays the music in the sky", is displayed again for the expert to write the cabin model, and the save-cabin-model command button is displayed. If a cabin model is contained, it is used as the current cabin model. Next, the cabin model and cabin eyes are divided (806).
  • when the save-cabin-model command button is clicked, the editable window has been edited, for example into "[1] [2] and [3] for him" and "[1] is his [2] and [3]",
  • and the new cabin models meet the format requirements, the newly edited A-language and B-language cabin models are stored as cabin model components in the A-language and B-language cabin model fields of the cabin model library, as shown in Figure 4.
  • the sentence cabin content is filled into the current cabin model, or into the new cabin model, for example "1[fairy] 2[dance] and 3[play heavenly music] for him" and "1[fairy] for him 2[Dancing] and 3[playing the music of the sky]", and displayed as a complex sentence cabin that has been divided into cabin model and cabin eyes.
  • this complex sentence cabin contains 3 cabin eyes. The take-cabin-eye step (807) takes out the content of one cabin eye at a time, after which the meaning group alignment step (808) is performed.
  • the alignment-confirmation command button is displayed, and the reference table lets the expert, guided by the example, extend or add word meanings, change the length of a string, attach accompanying words, change word forms, supplement meaning items, or otherwise adjust the meaning group alignment.
  • for the three cabin eyes, the A-language and B-language sides have corresponding original word strings, which need no further explanation.
  • for "play", however, the original Chinese dictionary entry only has senses such as "game, competition, sports, gambling, script; play", and lacks the sense of "playing" music.
  • the expert therefore extends the entry according to the example, adding that sense to "play".
  • the current cabin eye is cabin eye 3.
  • the reference table is:
  • A-language field: play heavenly music
  • B-language field: Playing the music of the sky
  • the save-meaning-group-string operation (808) is executed, and the contents of the A-language and B-language fields are stored record by record
  • as meaning group strings in the A-language and B-language string fields of the meaning group library. If the current operation is on a cabin eye, whether the cabin eyes are finished is judged (809); if the current sentence cabin still has unprocessed cabin eyes, the take-cabin-eye step (807) is executed, until all cabin eyes of the current sentence cabin are done.
  • whether the sentence cabins are finished is then judged (810), that is, whether the sample sentence pair still has unprocessed sentence cabins. If it does, the take-sentence-cabin step (802) is executed to take the next sentence cabin. If all sentence cabins have been processed, the read-sentence-pair step (801) is executed again to read in the next bilingual sample sentence pair that has been divided into sentence patterns and sentence cabins, and the next round of operations begins.
  • in the sentence-cabin-level comparison, compound words are grouped as described above: original word strings are joined by "_" into a compound word (compound string);
  • V has the meaning of "teaching, teaching”; the length of the word string is “teaching” for splicing, and the meaning of "teaching” is added.
  • the middle window displays the sentence currently being processed and related information during operation;
  • the lower (or rear) window displays the source-language text to be read, and the upper (or front)
  • window displays the native-language text that has already been read.
  • command buttons such as undo and save-and-exit, together with word-order shift buttons, are also displayed; alternatively they are placed on a floating bar directly below the middle window, or wherever the user moves it.
  • at run time, the read-source-sentence step (901) reads one sentence from the source-language text as the current sentence.
  • the judge-small-idiom step (902) queries the source-language idiom field of the idiom library with the current sentence. If it is found, the give-small-idiom step (903) takes the native-language idiom from the native-language idiom field of the same record, displays it in the middle window, and writes the idiom code of the same record into the world text buffer; the read-source-sentence step (901) is then executed again. If it is not found, the sentence pattern matching subroutine (904) is called.
  • the source sentence field of the current sentence query query library if there is no matching sentence pattern, the stored sentence library represents the number in the feedback buffer.
  • where several sentence patterns match, the corresponding native-language sentence patterns are displayed in the lower part of the middle window and the user selects one. Where only one matching sentence pattern is found, the sentence pattern code, the native-language sentence pattern, and the source-language sentence pattern of the same record are returned; the native-language sentence pattern is highlighted in the upper part of the middle window, the source sentence is nested into the source-language sentence pattern and displayed as an annotation below the native-language sentence pattern, and the sentence pattern code is written into the world text buffer. Then the take-sentence-cabin step (905) marks the current sentence cabin in the native-language sentence pattern, from left to right, in the middle window and stores the current sentence cabin number, that is, (the original sentence cabin number in the sentence pattern + FFE0H), in the world text buffer.
  • at the same time, the corresponding sentence cabin content in the source language is marked and taken out as the current sentence cabin content, and whether it is a simple sentence cabin is judged (906). If it is, the word meaning determination step (909) is executed. If not, the check-cabin-model step (907) is executed.
  • the native-language cabin model is highlighted in the extended part of the middle window, the source sentence cabin content is nested into the source-language cabin model and displayed as an annotation below the native-language cabin model, and the cabin model code is written into the world text buffer.
  • the take-cabin-eye step (908) follows: the current cabin eye is marked on the native-language cabin model one at a time from left to right, and the current cabin eye number (the original cabin eye number in the cabin model + FFD0H) is stored in the world text buffer; for the cabin eye content, the word meaning determination step (909) is executed. A word string in the source-language simple sentence cabin or cabin eye is read from left to right and looked up in the source-language string field of the meaning group library.
  • the give-native-string step (909) is then executed.
  • the content of the native-language string field of the current record is placed into the current native-language sentence cabin or cabin eye, and the meaning group code is stored in the world text buffer; word meaning determination (909) continues until the current simple sentence cabin or cabin eye is finished.
  • personality loss compensation (911) for the current sentence or cabin is performed based on the information in the personality loss table.
  • the read-source-sentence step (901) is then executed again.
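The cabin and cabin eye markers written into the world text buffer can be illustrated with a short Python sketch. Only the two offsets FFE0H and FFD0H come from the text; treating the buffer as a list of integers, limiting numbers to 1 through 15, assuming that component codes never fall into the marker ranges, and the code values other than 1061 (read from the example sentence pattern code 001061) are assumptions of this sketch.

```python
SENTENCE_CABIN_BASE = 0xFFE0   # "FFE0H" in the text
CABIN_EYE_BASE = 0xFFD0        # "FFD0H" in the text


def cabin_marker(number):
    """Marker stored for sentence cabin `number` (assumed range 1..15)."""
    if not 1 <= number <= 15:
        raise ValueError("sentence cabin number out of assumed range")
    return SENTENCE_CABIN_BASE + number


def eye_marker(number):
    """Marker stored for cabin eye `number` (assumed range 1..15)."""
    if not 1 <= number <= 15:
        raise ValueError("cabin eye number out of assumed range")
    return CABIN_EYE_BASE + number


def classify(value):
    """Tell markers apart from component codes when reading the buffer back."""
    if SENTENCE_CABIN_BASE < value <= SENTENCE_CABIN_BASE + 15:
        return ("sentence_cabin", value - SENTENCE_CABIN_BASE)
    if CABIN_EYE_BASE < value <= CABIN_EYE_BASE + 15:
        return ("cabin_eye", value - CABIN_EYE_BASE)
    return ("code", value)


# Hypothetical buffer: pattern code, cabin 1 with a meaning group code,
# cabin 3 with a cabin model code, then cabin eye 1 with a meaning group code.
world_text_buffer = [1061, cabin_marker(1), 2001,
                     cabin_marker(3), 305, eye_marker(1), 2002]
print([classify(v) for v in world_text_buffer])
```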
  • the steps above are further illustrated by an example. At the start, the interfaces are prepared.
  • the read-source-sentence step (901) reads a sentence from the source-language text, such as "Children not Allowed!", as the current sentence and displays it in the middle window; the native-language text window shows the preceding sentences that have already been processed, and the source-language text window shows the text from the current sentence onward. The judge-small-idiom step (902) queries the source-language idiom field of the idiom library with the current sentence.
  • when the judge-small-idiom step (902) does not find the current sentence in the source-language idiom field of the idiom library, the sentence pattern matching subroutine (904) is called.
  • the source-language sentence pattern field of the sentence pattern library is queried with the current sentence, and only one matching sentence pattern is found. The sentence pattern code, the native-language sentence pattern, and the source-language sentence pattern of the same record are returned, and the native-language sentence pattern "{1} to his {2}, if Can {4}, it can be {3}."
  • is highlighted in the upper part of the middle window. The source sentence is nested into the source-language sentence pattern "the {1} told his {2} that {3} on condition that {4}." and displayed as an annotation below the native-language sentence pattern: "the 1{doctor} told his 2{patient} that 3{he would prescribe him some patent medicine} on condition that 4{he strictly follow his instructions}."
  • the sentence code "sentence code 001061" is read into the world text buffer. Then execute the step 905 step, from left to right in the middle window to mark the current sentence in the parent sentence, and save the current sentence number in The world text buffer area. At the same time, the corresponding sentence cabin content of the source language is marked and taken out as the current sentence cabin content, and it is judged whether the current sentence cabin content belongs to the simple sentence cabin 906. If it is a simple sentence cabin, the step word meaning is determined 909. For example: Sentence 1, sentence 2; save sentence box number"
  • Him some patent medicine judge the simple sentence cabin 906, no, perform the procedure to check the cabin model 907.
  • the mother tongue module is highlighted in the extension of the middle window, and the source sentence cabin content is nested in the source language module "he would prescribe him 1 [some patent medicine], and the note is displayed below the native font of the window, and Read the cabin model code "" into the essay buffer. However, follow the steps of 902.
  • the native cabin is marked one by one on the native cabin model; The world text buffer; at the same time labeling and extracting the corresponding cabin content of the source language, "some patent medicine execution step word meaning determination 909.
  • Read a word string in the cabin eye of the source language from left to right query the source language of the meaning group library String field. If multiple identical word strings are found, the contents of their same-recorded native language string fields are taken out, backed up in the regret selection cache, and displayed in the lower part of the extended middle window, and the receiving user selects one of them. For example: some several - ifcb quite a few patent patent license effects understand the medical internal medicine user selected as "some special effects". 909.
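The word-meaning determination of step 909 can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the record layout, the function names and the `choose` callback are assumptions, English glosses stand in for the native-language strings, and only the codes 8260, 17655 and 5484 come from the text (the other codes are made up).

```python
# Hypothetical meaning-group string library: one code plus one string field
# per language. English glosses stand in for native-language strings.
MEANING_GROUPS = [
    {"code": 8260, "source": "some", "native": "some"},
    {"code": 90001, "source": "some", "native": "several"},      # made-up code
    {"code": 90002, "source": "patent", "native": "patent"},     # made-up code
    {"code": 17655, "source": "patent", "native": "special effect"},
    {"code": 5484, "source": "medicine", "native": "medicine"},
]


def find_candidates(library, source_string):
    """Return every record whose source-language string field matches."""
    return [rec for rec in library if rec["source"] == source_string]


def determine_meaning(library, source_string, choose, regret_cache):
    """Pick one record for a source word string.

    With several candidates, they are backed up in the regret cache (so the
    user can revise the choice later) and `choose` stands in for the user's
    selection in the window. Returns None when nothing matches.
    """
    found = find_candidates(library, source_string)
    if not found:
        return None
    if len(found) == 1:
        return found[0]
    regret_cache[source_string] = found
    return choose(found)


if __name__ == "__main__":
    cache = {}
    world_text_buffer = []
    for word in ["some", "patent", "medicine"]:
        # Simulated user: always takes the last candidate offered.
        record = determine_meaning(MEANING_GROUPS, word, lambda c: c[-1], cache)
        world_text_buffer.append(record["code"])
    print(world_text_buffer, sorted(cache))
```

Keeping the rejected candidates in a cache keyed by word string is one simple way to support the regret operation described next; the patent's own table adds batch, sentence cabin and cabin-eye fields.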
  • When the feedback buffer is not empty, its contents, together with the source language, the native language, the current source sentence and related information, are made into an email and sent as feedback to the support website; the feedback buffer is then cleared, and a "feedback sentence" flag is stored in the world text buffer.
  • When the regret operation command button is clicked, the system uses the word string the user clicked to fetch the corresponding content from the regret selection cache, so that the user can re-select the word string and make the related modifications. This description is sufficiently clear and complete to be implemented by those skilled in the art.
  • The several candidate items offered together at one time are all recorded in a "regret operation table"; in a regret operation, the system takes the native-language sentence the user clicked and, according to the sentence cabin,
  • cabin eye and word string at the clicked position, presents the contents of the same batch in the table, so that the user can re-select and modify accordingly.
  • The regret operation table has fields such as regret batch, sentence cabin number, cabin-eye number, word string and meaning-group code.
  • The sentence cabin number and cabin-eye number are the same, and the source string is different.
  • the content of the world text buffer, as in the case of the above example, is " ⁇ "00064; Sentence pattern 001061; sentence cabin No.
  • Personality loss arises because the simple sentence cabin and the cabin eye, although superordinate and subordinate concepts, are the same size: each holds at most three meaning-group strings, not counting function words that carry no meaning, such as Chinese measure words and English articles. These lost words are compensated when the text is read in the native language.
  • The compensation information comes from the personality loss table, which contains fields such as the associated string and the compensation string.
  • The native-language word order sometimes needs to be adjusted.
  • The reason again lies in the sentence cabin and the cabin eye. They are the same size, holding at most three meaning-group strings apart from function words that carry no meaning. The components impose no requirement on the order of these three strings, so the various applications based on these sentence components
  • The native-language word-order table contains the first string, read string and adjusted string fields.
  • The read string is the sequence of strings as read out in the native language; the first string is the first string of the read string; the adjusted string is the word order to which the sequence should be adjusted.
  • The table is looked up, and a match is adjusted automatically. The system then offers two move buttons (both given as "one button"). If the user clicks one, the order is adjusted as the user intends.
  • When the first is clicked, the word string the user clicked in the current sentence cabin or cabin eye is moved to after the next word string; when the second "one button" is clicked, the clicked word string is moved forward to before the previous word string; at the same time, the adjusted word order is added to the native-language word-order table for later use.
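The word-order adjustment just described can be sketched as below. This is a minimal illustration under stated assumptions: the function names, the tuple-keyed dictionary used as the word-order table, and the sample strings are all hypothetical; only the three field names (first string, read string, adjusted string) come from the text.

```python
def move_back_one(strings, index):
    """Return a copy in which strings[index] is moved after its next neighbour."""
    result = list(strings)
    if 0 <= index < len(result) - 1:
        result[index], result[index + 1] = result[index + 1], result[index]
    return result


def move_forward_one(strings, index):
    """Return a copy in which strings[index] is moved before its previous neighbour."""
    result = list(strings)
    if 0 < index < len(result):
        result[index - 1], result[index] = result[index], result[index - 1]
    return result


def remember_adjustment(order_table, read_strings, adjusted_strings):
    """Store a user-made adjustment so that it can be applied automatically later."""
    order_table[tuple(read_strings)] = {
        "first_string": read_strings[0],
        "read_string": tuple(read_strings),
        "adjusted_string": tuple(adjusted_strings),
    }


def auto_adjust(order_table, read_strings):
    """Apply a stored adjustment if the read sequence matches, else leave it as is."""
    entry = order_table.get(tuple(read_strings))
    return list(entry["adjusted_string"]) if entry else list(read_strings)


if __name__ == "__main__":
    table = {}
    cabin = ["medicine", "some", "special effect"]   # illustrative strings only
    fixed = move_back_one(cabin, 0)                  # user moves "medicine" back once
    fixed = move_back_one(fixed, 1)                  # and once more
    remember_adjustment(table, cabin, fixed)
    print(auto_adjust(table, cabin))
```

Because a cabin eye holds at most three meaning-group strings, single-step swaps are enough to reach any ordering with a few clicks.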
  • One situation that requires no intervention is when the reader wants speed and does not mind imperfections. For this reason, functions such as loss compensation and native-language word-order adjustment are offered to the user as options.
  • The emails mentioned above are sent to the support website.
  • When the support website receives a feedback email from a user, experts process it in real time, the new component is added to the corresponding component library, and the new component and related information are fed back to the user in real time; with the user's participation, the original "feedback sentence" flag is replaced.
  • This provides ① user support: users with any opinions or suggestions can communicate and obtain support in this way.
  • ② Social testing and social accumulation are obtained. ③ Multilingual development is guided jointly: the application of the present invention provides a platform for worldwide cross-lingual communication. It also ends the history in which natural languages evolved and developed within their own independent systems, and begins a process of rapid joint development of many languages.
  • Sentence cabin No. 1: the meaning-group code leads to record 2131 of the meaning-group string library, and the Chinese string recorded there, "doctor", is filled in as indicated.
  • Sentence cabin 1 becomes: "1{Doctor} tells his {2}, if {4}, it can be {3}."
  • Sentence cabin No. 2: the meaning-group code leads to record 6386 of the meaning-group string library, and the Chinese string recorded there, "patient", is filled in as indicated. Sentence cabin 2 becomes: "Doctor tells his 2{patient}, if {4}, it can be {3}."
  • Sentence cabin No. 4: the Chinese cabin model recorded under cabin model code 207, "[1] + his + [2]", is filled into sentence cabin 4 as follows:
  • Cabin eye 1: the meaning-group codes lead to record 8260 of the meaning-group string library, Chinese string "some"; record 17655, Chinese string "special effect"; and record 5484, Chinese string "medicine"; filled in as indicated, cabin eye 1 becomes:
  • The native-language reading method is one of the applications based on sentence components.
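The slot filling walked through in the example above can be sketched as follows. This is only an illustration: the function name `fill_cabins` is an assumption, and the sketch uses the source-language sentence pattern quoted in the example because the native-language pattern survives only in translation.

```python
import re

# Source-language sentence pattern quoted in the example (pattern code 001061).
SOURCE_PATTERN = "the {1} told his {2} that {3} on condition that {4}."


def fill_cabins(pattern, contents):
    """Replace each numbered sentence cabin {N} with its content.

    Cabins that have no content yet are left as placeholders, which mirrors
    the cabin-by-cabin filling in the worked example.
    """
    def substitute(match):
        number = int(match.group(1))
        if number in contents:
            return contents[number]
        return match.group(0)

    return re.sub(r"\{(\d+)\}", substitute, pattern)


if __name__ == "__main__":
    cabins = {}
    for number, text in [
        (1, "doctor"),
        (2, "patient"),
        (3, "he would prescribe him some patent medicine"),
        (4, "he strictly follow his instructions"),
    ]:
        cabins[number] = text
        print(fill_cabins(SOURCE_PATTERN, cabins))
```

The same function would serve for cabin models if their "[N]" cabin-eye markers were matched instead of "{N}".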
  • Several application systems based on sentence components can be built, among them a method for generating world text based on sentence components, used to convert traditional text into world text that can then be read out in multiple languages.
  • A text conversion method based on sentence components, used to convert a source text into a text in another language, or into several languages.
  • A machine translation method based on sentence components, used to translate a source language into another language, or into several languages.
  • The software system resulting from the implementation of the present invention can run on existing medium-sized computers, minicomputers, microcomputers, supercomputers, notebook computers and the like, on stand-alone or connected machines. Implementations can run on a variety of computer networks, particularly the Internet, and on devices such as a PDA (Personal Digital Assistant).
  • the product after the implementation of the invention can be applied to work, study, leisure, travel, etc., which need to communicate with people in other languages; it can be used in homes, institutions, schools, and various fields involving foreign languages.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

A sentence component device, including: a sentence component storing part (101), comprising sentence component bases that hold sentence components in electronic data form; a sentence pattern base (300) for storing sentence pattern components, having a sentence pattern code field and sentence pattern fields for multiple languages and comprising at least one record, in which the semantemes of the different languages in the same record are the same; a sense cluster base (500, 502) for storing sense cluster components, having a sense cluster code field and sense cluster fields for multiple languages and comprising at least one record, in which the sense cluster semantemes of the different languages in the same record are the same; a phrase base (600) for storing small phrase components, having a phrase code field and phrase fields for multiple languages and comprising at least one record, in which the small phrase semantemes of the different languages in the same record are the same; and a sense understanding code programming part (103), connected to the sentence component storing part (101), which generates a sense understanding code for each record of the sentence component bases.

Description

Sentence component device, and text conversion method for reading foreign-language text in one's native language and generating world text
Technical Field

The present invention relates to an apparatus and method for processing or converting natural language, and in particular to an apparatus and method for text conversion: namely a sentence component device, a method for making sentence components, and a sentence-component-based text conversion method for reading foreign-language text in one's native language.

Background Art

Machine translation is the representative task of computer processing of language and text, and the peak of its technical difficulty. The object it operates on is the text file, which is produced by the computer's word processing technology. Existing word processing technology encodes the characters of each language and then generates text files from the character codes (internal codes). The drawback is that a computer text file produced this way is no different from a document on paper: it can be read, written and exchanged only by people who know its language. People of different languages must therefore rely on translation.

Machine translation is described in Computer Processing of Natural Languages (《自然语言的计算机处理》) by Feng Zhiwei, published by Shanghai Foreign Language Education Press in October 1996, Chapter 8, Section 1, "Machine Translation". That text recounts the history in detail: the idea of using machines for language translation, proposed by the French scientist Artsrouni in the early 1930s; the appearance of the world's first computer in 1946, with machine translation research beginning in the same year and flourishing for a time; the report entitled "Language and Machines (ALPAC)", published in November 1966 by the Automatic Language Processing Advisory Committee of the US National Academy of Sciences, which rejected machine translation and pointed out that it had run into an insurmountable "semantic barrier"; the unprecedented slump in machine translation that followed; the recovery of 1970-1976; and the prosperity after 1976. After some twenty thousand words it concludes that "the 'semantic barrier' that the US ALPAC report of 1964 said machine translation had encountered still exists today, and machine translation technology still seems to have made no breakthrough", and that "making machine translation systems practical and commercial faces a severe test".

Popular Software (大众软件), issue 2 of 2004, carried a report by Wang Shuo based on interviews with a number of industry experts. The article, "Machine Translation: Where Is the Road?", states: "Machine translation currently takes two main forms, MT and TM. MT (machine translation) is the familiar rule-based machine translation software; its main purpose is to give users with poor English a reference translation, but its accuracy is not high. MT has four key techniques: word analysis, syntactic analysis, semantic analysis and contextual analysis. It works by first dividing the sentence into words, looking up their meanings in an electronic dictionary stored in the machine's database, analyzing the meaning of the sentence according to grammar rules, transforming it into a conceptual structure, and then generating the target language with the help of a language model. Although in principle this series of steps is not hard to implement, the particularity and diversity of languages and the limited state of artificial intelligence technology make correct translation between different languages impossible at present, which is also why current machine translation software cannot …
… Memory, translation memory) is aimed mainly at professional translators and translation agencies. It is built around translation memory and human-computer interaction, and requires the user to be able to translate independently. TM is based on a database in which all previously translated material is stored sentence by sentence. During translation the system automatically analyzes the electronic document: sentences that match 100% can be replaced automatically, partially matching sentences receive translation suggestions according to the degree of match, and new sentences are translated manually with the help of the suggestions the system provides. Research shows that repeated work accounts for about 30% of translation, and TM software means that 'the same sentence never needs to be translated a second time', which improves efficiency." But the remaining 70% still has to be done by hand … Finally the author points out: "The problems of machine translation technology itself are the fundamental flaw hindering its development. At present, not only in China but throughout the world, machine translation technology has made no major breakthrough. Trying to raise translation accuracy by giving machines limited rules and corpora cannot succeed in the short term. With the theory of language intelligence still immature, MT software research has hit a technical bottleneck: it cannot solve the problem of choosing the sense of a word in different language environments, nor can it choose grammar rules correctly in complex and changing contexts, so translation quality cannot improve markedly."

Such being the state of the prior art, the hope lies in finding another path!

Summary of the Invention

The technical problems to be solved by the present invention are:
1. In the way existing computers represent language and text, the characters, words and sentences of different languages are neither equivalent nor unified in the meaning they express;
2. Existing machine translation and text conversion produce results that are hard to read and fail to convey the meaning of the original.

The object of the present invention is to overcome the deficiencies of the prior art by providing ① a sentence component device and ② a method for making sentence components. By means of human-computer interaction, these solve the technical problem that, in existing computer representations of language, characters, words and sentences are neither equivalent nor unified in meaning. The technical effect is a set of sentence components, stored in sentence component libraries, in which the characters, words and sentences of different languages are equivalent and unified in meaning.

A further object of the present invention is to provide ③ a sentence-component-based text conversion method for reading foreign-language text in one's native language. This solves the technical problem that existing machine translation and text conversion produce results that are hard to read and fail to convey the meaning of the original, with the technical effect that the translated or converted text becomes more readable and keeps the same meaning as the original. The conversion also generates world text, which can be read out in multiple languages. This greatly changes the present situation, in which word-processed text can be read, written and exchanged only by people who know its language.

The technical solution adopted by the present invention to solve these technical problems is:
<1> A sentence component device is provided, comprising an existing part 102 that includes a CPU, input, output and the storage of query-related index tables, characterized in that it further comprises:

a sentence component storage part 101, containing, in electronic data form, stored multilingual semantic pairs …:

a sentence pattern library 300 for storing sentence pattern components, with sentence pattern code, English sentence pattern, Chinese sentence pattern and Russian sentence pattern fields; it contains at least one record; sentence patterns with the same meaning share one record, the sentence pattern of each language is stored in the sentence pattern field of that language, and the sentence pattern code represents the meaning of the sentence patterns in all the language fields of that record;

a cabin model library 400 for storing cabin model components, with cabin model code, English cabin model, Chinese cabin model and Russian cabin model fields; it contains at least one record; cabin models with the same meaning share one record, the cabin model of each language is stored in the cabin model field of that language, and the cabin model code represents the meaning of the cabin models in all the language fields of that record;

meaning-group string libraries 500 and 502 for storing meaning-group string components, with meaning-group code, English string, Chinese string and Russian string fields; they contain at least one record; meaning-group strings with the same meaning share one record, the meaning-group string of each language is stored in the string field of that language, and the meaning-group code represents the meaning of the meaning-group strings in all the language fields of that record;

an idiom library 600 for storing small idiom components, with idiom code, English idiom, Chinese idiom and Russian idiom fields; it contains at least one record; small idioms with the same meaning share one record, the small idiom of each language is stored in the idiom field of that language, and the idiom code represents the meaning of the idioms in all the language fields of that record;

a Yitong (意通, "meaning-interchange") code compiling part 103, connected to the sentence component storage part 101, which receives notifications from the component adding part 106; only when a new record appears in any one of the four libraries above does it take the representative number of the current library as the high-order word, add the record number in the current library to generate the Yitong code, and fill the code into the code field of the current library, as the sentence components' unified, double-byte, fixed-length code for multilingual semantic interchange; the Yitong code is unique to the shared meaning expressed by the components of every language in the current record of the current library;

a component reading part 104, connected to the sentence component storage part 101, which receives read commands, determines the library and the record from the number segments contained in the Yitong code, and reads the component of the required language from that record of that library;

a component matching part 105, connected to the sentence component storage part 101, which receives matching commands and, according to the sentence or sentence cabin content in the given language and the current operating point, queries the index field of the corresponding language in the corresponding component library, returning either the matching component in the required language or a no-match signal;

a component adding part 106, connected to the sentence component storage part 101 and to the Yitong code compiling part 103, which receives commands to add new components; after a query confirms that the corresponding component library holds no identical component, it adds the new component to the field of the corresponding language in that library, and when the new component is added to a new record it simultaneously notifies the Yitong code compiling part 103;

a component library operation control and interface part 107, connected to the sentence component storage part 101 through the component reading part 104, the component matching part 105 and the component adding part 106, which receives calls from the various applications based on these sentence components or …
… is connected to other application devices.

The sentence components in the sentence component device are as follows. Sentence components are obtained by experts, through human-computer interaction, from parsing and comparing bilingual training sample corpora; a second source is user feedback, which is added after expert review. Sentence components are the parts from which sentences of a language are assembled, or the standard parts with which sentences are encoded, and are of the following four kinds:
① Sentence pattern components 201, 301 form the basic structural framework of a sentence. A sentence pattern represents the basic semantic category of that class of sentence, determines the positions and number of the sentence cabins the sentence contains, and takes care of the more complex grammatical phenomena of that class of sentence;
② Cabin model components 202, 401 form the basic structural framework of a complex sentence cabin. A cabin model represents the basic semantic category of that class of sentence cabin, determines the positions and number of the cabin eyes the sentence cabin contains, and takes care of the more complex grammatical phenomena of that class of sentence cabin;
③ Meaning-group string components 501, 503 are components formed by meaning-group strings and are used to fill simple sentence cabins 203-204 or cabin eyes 205-207. The simple sentence cabin and the cabin eye are superordinate and subordinate concepts but are the same size: each holds no more than three meaning-group strings, not counting function words that carry no meaning;
④ Small idiom components 601 are sentences too short to be divided into a sentence pattern and sentence cabins; they serve as small idiom components and directly constitute short sentences.

The sentence component libraries in the sentence component device are as follows. For each language added to the libraries beyond English, Chinese and Russian, the sentence pattern library, cabin model library, meaning-group string library and idiom library must first each be given one more field: a sentence pattern field, a cabin model field, a string field and an idiom field for that language. A component of the newly added language may be entered in a record only if it has the same meaning as the components of the existing languages in that record. Extracting from the sentence pattern library, cabin model library, meaning-group string library and idiom library two fields, namely the field of a given language (its sentence pattern, cabin model, string or idiom) and the code field, yields a library for that language, a first-language library or a second-language library, for use in language translation or text conversion.
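The way the code compiling part 103 assigns an interchange code (意通代码) when a new record appears can be sketched as below. This is a rough sketch under stated assumptions, not the patent's data layout: the 32-bit packing (library number in the high 16-bit word, record number in the low word), the class and method names, and the library number 3 are all hypothetical; the text only says that the library's representative number forms the high-order word and the record number is added to it.

```python
from dataclasses import dataclass, field


@dataclass
class ComponentLibrary:
    """One component library (sentence patterns, cabin models, strings or idioms)."""
    library_number: int                      # assumed "representative number"
    records: list = field(default_factory=list)

    def add(self, language, text):
        """Add a component; a code is assigned only when a new record is created."""
        for record in self.records:
            if record.get(language) == text:
                return record["code"]        # identical component already stored
        record_number = len(self.records) + 1
        if record_number > 0xFFFF:
            raise OverflowError("record number no longer fits the low word")
        code = (self.library_number << 16) | record_number
        self.records.append({"code": code, language: text})
        return code

    def add_language(self, code, language, text):
        """Store the same-meaning component of another language in an existing record."""
        self._record(code)[language] = text

    def read(self, code, language):
        """Read the component of the requested language for a given code."""
        return self._record(code).get(language)

    def _record(self, code):
        if code >> 16 != self.library_number:
            raise KeyError("code belongs to a different library")
        return self.records[(code & 0xFFFF) - 1]


if __name__ == "__main__":
    strings = ComponentLibrary(library_number=3)     # hypothetical library number
    code = strings.add("en", "doctor")
    strings.add_language(code, "ru", "врач")
    print(hex(code), strings.read(code, "ru"))
```

Because every language field of a record shares one code, a text stored as a sequence of such codes can be read out in any language whose field has been filled in.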
〈二〉、 提供一种语句构件的制作方法, 利用相同内容的双语或多语种文字版本 的语料作为训练样本, 利用人机交互的方式进行句型、 句舱两个层面的剖析比对, 得 出字、 词、 句表意得以对等和统一的语句构件, 包括如下步骤: <2> Providing a method for making a sentence component, using a bilingual or multilingual text version of the same content as a training sample, and using human-computer interaction to analyze the sentence and sentence levels at the two levels. Words, words, and sentences are equivalent and uniform statement components, including the following steps:
S1. 利用相同内容的双语或多语种文字版本的语料作为训练样本, 每轮选 A、 B 双语作为一个样本对, 其中 A语分配给拼音文字或已经比对过的文种, B语可以分配 ; 第一轮双语对训练样本的剖析比对,其中双语对样本的 A语为英文, B语为中文, 从第二轮开始新语对中必须其一是已经进行过剖析比对的, 如当加入俄文时, 只能取 中俄或英俄语料作为双语对训练样本, 第二轮剖析比对的双语对样本中 A语应是已比 对过的中文或英文, B语应是新加的俄文; 每一轮的训练语料样本应大到新增句型 /句例比 < 1%后方可考虑增加新语种、 进 行次 轮的剖析比对, 另 方面, 可以根据训练样本语料的行业来源或应用范围来源 来标记、 划分句型库、 舱模库、 意群串库、 习语库来构成相应分库, 用于行业或专用 版本; S1. Using the corpus of the bilingual or multilingual text version of the same content as the training sample, each round of A and B bilingual as a sample pair, where the A language is assigned to the pinyin text or the already-matched language, the B language can be assigned. The first round of bilingual analysis of the training samples, in which the B language of the sample is English and the B language is Chinese. From the second round, the new language pair must have been analyzed and compared. When joining Russian, only Chinese-Russian or English-Russian materials can be used as bilingual training samples. In the second round of analysis, the B-language in the sample should be compared with Chinese or English, and B should be new. Russian; each round of training corpus samples should be large enough to add new sentence/sentence ratio < 1% before considering adding new languages, conducting second round of analysis, and, in other respect, industry based on training sample corpus The source or application scope source to mark, divide the sentence pattern library, the cabin model library, the meaning group library, and the idiom library to form the corresponding library for the industry or special version;
52.句型层面剖析比对, 读取双语样本句对, 划分出句型、 句舱, 把句型作为句 型构件存入句型库, 把不足以分出句型、 句舱的小习语作为小习语构件存入习语库; 52. Analyze the sentence at the sentence level, read the bilingual sample sentence pairs, divide the sentence pattern, sentence cabin, and store the sentence pattern as a sentence structure in the sentence pattern library, and put a small sentence that is not enough to separate sentence patterns and sentence cabins. As a small idiom component, it is stored in the idiom library;
53.句舱层面剖析比对, 把已经划分出句型、 句舱的样本句例对, 依次取出句舱 内容, 进一歩划分出舱模、 舱眼, 把舱模作为舱模构件存入舱模库, 把经过意群对齐 的舱眼或简单句舱的内容以意群串为单元作为意群串构件存入意群串库; 处理完所有 句舱, 接着下一个的双语样本句对处理、 接续执行步骤 S2。 所述语句构件的制作方法中的句型层面剖析比对的步骤 S2 进一步包括如下步 骤: 53. Analyze the sentence at the sentence level, compare the sample sentences that have been divided into sentence patterns and sentence cabins, and then take out the contents of the sentence cabin, and then divide the cabin model and the cabin eye into a cabin, and store the cabin model as a cabin module. The model library, the contents of the cabin or the simple sentence cabin aligned by the intention group are stored in the group of meaning clusters as the group of the meaning group; after processing all the sentence cabins, the next bilingual sample sentence pair processing Then, step S2 is performed. The step S2 of the sentence level analysis in the method for fabricating the sentence component further includes the following steps:
S21. Read in one bilingual sample sentence pair;
S22. Call the sentence pattern matching subroutine to search the sentence pattern library and return the matching sentence patterns for languages A and B; if there is no matching sentence pattern, go to step S23; if there is one, go to step S26;
S23. Create a new sentence pattern, taking the current bilingual sample sentence pair as the example: pop up a window whose upper row displays the language A sentence and whose lower row displays the language B sentence, with two command buttons, Dig Sentence Cabin and Save Sentence Pattern, below the rows; prompt the expert to click the start and end points of the sentence cabin to be dug in the language A and language B example sentences; set the cabin counter N=0;
S24. When the Dig Sentence Cabin button is clicked, set N=N+1 and check whether two points have been clicked in both the language A and the language B sentence and whether these points are valid; if not, prompt the expert to redo the selection; if the clicks are correct and valid, cut out the content between the two points in each of the A and B sentences and fill in "[N]" in its place. This ends the current round of cabin digging; step S24 is repeated to dig the next sentence cabin;
S25. When the Save Sentence Pattern button is clicked and N ≥ 1, the cabin digging that creates the new sentence pattern is complete: clear the display of steps S23 and S24 and write the two new sentence patterns, as sentence pattern components, into the language A and language B sentence pattern fields of the sentence pattern library. If the Save Sentence Pattern button is clicked while N=0, the current bilingual sample sentence pair cannot be divided into a sentence pattern and sentence cabins and is judged to be a small idiom: clear the display of steps S23 and S24 and write the two small idioms, as small idiom components, into the language A and language B idiom fields of the idiom library;
S26. [...] is saved to disk as a sample sentence pair already divided into a sentence pattern and sentence cabins, ready to be read by step S3; then go to step S21.
In the method for making sentence components, step S3, the sentence-cabin-level analysis and comparison, further includes the following steps:
S31. Read in one sample sentence pair, already divided into a sentence pattern and sentence cabins, that was saved by step S26;
S32. Take a sentence cabin: take out, in order, one sentence cabin of the divided sample sentence pair as the current sentence cabin; open window one, whose upper part displays the language A and language B sample sentences and whose lower part displays the contents of the current sentence cabin in languages A and B.
At the same time, split the current language A sentence cabin into word strings and fill them, in order, into the language A field of a reference table. Then take the word strings one at a time and look each up in the A string field of the meaning-group string library; when a record is found, take the content of its B string field; if that content occurs in the current language B sentence cabin, fill it into the language B field of the reference table; otherwise leave the field empty.
If the A string field of the meaning-group string library holds several records with the same string, the reference table receives one additional candidate record, with a duplicate language A field, for each of them. When the whole reference table is complete, open window two, which displays the reference table, a Form Compound Word button and a prompt that compound words can be formed.
Accept the expert's clicks on the reference table and set the mark field of each clicked record.
When the Form Compound Word button is clicked and consecutive records of the reference table have been clicked, join the language A field contents of the marked records with "_" into a compound word, and merge the marked records into one record whose language A field holds the compound word and whose language B field is filled with the word string of equivalent meaning;
S33. Determine whether the current sentence cabin is a simple sentence cabin; if so, go to step S37. If not, query the cabin model library to determine whether the current sentence cabin contains a cabin model: if it does not, go to step S34; if it does, take that cabin model as the current cabin model, fit the current cabin content into it slot by slot, and go to step S36;
S34. Open window three as an editable window, display the current bilingual sentence cabin contents again, let the expert write a cabin model on that basis, and display a Save Cabin Model button;
S35. When the Save Cabin Model button is clicked, the editable window has been edited and the new cabin model meets the format requirements, store the new language A and language B cabin models, as cabin model components, in the A and B cabin model fields of the cabin model library; at the same time, fit the current sentence cabin content slot by slot into the current cabin model, or into the new cabin model, and display it as a complex sentence cabin already divided into cabin model and cabin eyes;
S36. Take out the content of the next cabin eye in order, then go to step S37;
S37. Meaning-group alignment: display an Alignment Confirmed button below the reference table in window two. The reference table accepts the expert's meaning-group alignment edits, such as extending or supplementing word senses according to the example, lengthening or shortening a string without changing the original characters or words, attaching accompanying characters, and adding sense entries for inflected forms, or the expert's choice of a preferred record;
S38. [...] have become meaning-group strings; then, record by record, store the contents of the language A and language B fields, as meaning-group string components, in the A string or B string field of the meaning-group string library;
S39. If the item just processed is a cabin eye and the current sentence cabin still has unprocessed cabin eyes, go to step S36 until all cabin eyes of the current sentence cabin are done. Then determine whether the current divided sample sentence pair still has unprocessed sentence cabins: if so, go to step S32 to continue processing sentence cabins; if not, all sentence cabins are done, and go to step S31 for the next sentence pair.
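The cabin-digging operation of steps S23 to S25 can be sketched in code. The following is only a minimal illustration, not the patented implementation: the function names `dig_cabin`, `dig_by_text` and `classify` are hypothetical, and a substring search stands in for the expert's two mouse clicks on the start and end points.

```python
def dig_cabin(sentence: str, start: int, end: int, n: int) -> tuple[str, str]:
    """Cut sentence[start:end] out and leave the placeholder "[n]" (step S24)."""
    if not (0 <= start < end <= len(sentence)):
        raise ValueError("invalid cabin boundaries; the selection must be redone")
    cabin = sentence[start:end]
    return sentence[:start] + f"[{n}]" + sentence[end:], cabin


def dig_by_text(sentence: str, text: str, n: int) -> tuple[str, str]:
    """Hypothetical stand-in for the two clicks: locate the cabin by its text."""
    start = sentence.index(text)  # raises ValueError if the text is absent
    return dig_cabin(sentence, start, start + len(text), n)


def classify(n: int) -> str:
    """Step S25: N >= 1 yields a sentence pattern, N == 0 a small idiom."""
    return "sentence_pattern" if n >= 1 else "small_idiom"


# Language A (English) and language B (Chinese) versions of one sample pair.
a_sentence = "Persevere and you will succeed."
b_sentence = "只要你坚持不懈, 你就会成功。"

n = 0
for a_text, b_text in [("Persevere", "坚持不懈"), ("succeed", "成功")]:
    n += 1  # cabin counter N of steps S23 and S24
    a_sentence, _ = dig_by_text(a_sentence, a_text, n)
    b_sentence, _ = dig_by_text(b_sentence, b_text, n)

print(a_sentence)   # [1] and you will [2].
print(b_sentence)   # 只要你[1], 你就会[2]。
print(classify(n))  # sentence_pattern
```

What remains after digging is the pair of language-specific frames that step S25 would write into the sentence pattern library, while the removed contents are what step S3 goes on to analyze.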
<3> A method for reading a foreign language in one's native language based on sentence components, comprising the following steps:
S4. Interface: the user specifies which of the languages contained in the component libraries are the native language and the source language (the foreign language; the same applies below). The screen is divided into three windows, upper, middle and lower (or front, middle and rear): the middle window displays the sentence currently being processed and related information; the lower or rear window displays the source-language text still to be read; the upper or front window displays the native-language text already read. In addition, the prompt line displays command buttons such as Undo and Save and Exit, as well as the ← and → word-order buttons; alternatively, these are placed on a floating bar that follows the lower edge of the middle window or can be moved by the user;
S5. Read in a source sentence: read one source-language sentence and display it in the middle window as the current sentence; append the previously processed sentence to the native-language text display and remove the current sentence from the source-language text display;
S6. Conversion: using the four sentence component libraries, convert the current sentence by table lookup between the foreign-language components, the corresponding native-language components and the Yitong codes (意通代码), and output the native-language field content of the same record;
S7. Sentence confirmation: when all sentence cabins of the current sentence have been processed, check the feedback buffer and the command buttons:
when the feedback buffer is not empty, compose an e-mail from its content together with the source language, the native language, the current source sentence and related information, send it to the support website, clear the feedback buffer, and store a "feedback sentence" flag in the world text buffer;
when the Undo button is clicked, fetch from the undo selection buffer the entry that corresponds to the word string the user clicked, and let the user reselect the word string and make the related changes;
when a Save and Exit command is received, save the content of the world text buffer as world text, with file name = source-language file name .SJW; if the source text has not been finished, record the source text offset in the file header;
when neither the Undo nor the Save and Exit button has been clicked, confirm the output of the current sentence and go to step S5.
In the method for reading a foreign language in one's native language based on sentence components, step S6, in which the four sentence component libraries are used to convert the current sentence by table lookup between the foreign-language components, the corresponding native-language components and the Yitong codes and to output the native-language field content of the same record, further includes the following steps:
S601. Check for a small idiom: query the source-language idiom field of the idiom library with the current sentence; if no match is found, go to [...]; [...] write the idiom code of the same record into the world text buffer, then go to step S5;
S602. Call the sentence pattern matching subroutine and query the source-language sentence pattern field of the sentence pattern library with the current sentence. If exactly one matching sentence pattern is found, go to step S603; if several are found, display the corresponding native-language sentence patterns in the lower part of the middle window and go to step S603 after the user has chosen one; if none is found, store the representative number of the sentence pattern library in the feedback buffer;
S603. Output the sentence pattern code, the native-language sentence pattern and the source-language sentence pattern of the same record: display the native-language sentence pattern with emphasis in the upper part of the middle window, fit the source sentence slot by slot into the source-language sentence pattern and display it as an annotation below the native-language pattern in that window, and write the sentence pattern code into the world text buffer;
S604. Take a sentence cabin: working from left to right, mark the current sentence cabin in the native-language sentence pattern in the middle window and store its cabin label in the world text buffer; at the same time mark and take out the content of the corresponding source-language sentence cabin as the current cabin content. Determine whether the current cabin content is a simple sentence cabin: if not, go to step S605; if so, go to step S608;
S605. Match a cabin model: query the source-language cabin model field of the cabin model library with the current cabin content. If exactly one matching cabin model is found, go to step S606; if several are found, extend the middle window downward, display the corresponding native-language cabin models in the extension, and go to step S606 after the user has chosen one; if none is found, store the representative number of the cabin model library in the feedback buffer;
S606. Output the cabin model code, the native-language cabin model and the source-language cabin model of the same record: display the native-language cabin model with emphasis in the extension of the middle window, fit the source cabin content slot by slot into the source-language cabin model and display it as an annotation below the native-language cabin model in that window, and write the cabin model code into the world text buffer;
S607. Take a cabin eye: following the native-language cabin model from left to right, mark the current cabin eye on it, store the cabin eye label in the world text buffer, at the same time mark and take out the content of the corresponding source-language cabin eye, and go to step S608;
S608. Determine the word sense: read, from left to right, one word string from the source-language simple sentence cabin or cabin eye and query the source-language string field of the meaning-group string library. If exactly one identical string is found, go to step S609; if several are found, take the native-language string field of each matching record, back them up in the undo selection buffer, display them in the lower part of the extended middle window, and go to step S609 after the user has chosen one; if no identical string is found, store the current source word string in the feedback buffer;
S609. Take the native-language string field of the current record and fill it into the current native-language sentence cabin or cabin eye; store the meaning-group code in the world text buffer; repeat step S608 until the current simple sentence cabin or cabin eye is finished.
Then compensate for lost language-specific features in the current sentence cabin or cabin eye according to the information in the personality loss table;
then correct the native-language word order of the current sentence cabin or cabin eye according to the information in the native-language word order table;
finally check the → and ← word-order buttons: when the → button is clicked, the word string the user clicked in the current sentence cabin or cabin eye [...] before the [...] word string; the resulting word order is also added to the native-language word order table for later use. Then go to step S610;
S610. Decide: if the current sentence cabin still has unprocessed cabin eyes, go to step S607; otherwise, if the current sentence still has unprocessed sentence cabins, go to step S604; if all sentence cabins of the current sentence have been processed, continue with step S7.
In the method for reading a foreign language in one's native language based on sentence components:
regarding the e-mail fed back to the support website: when the support website receives a feedback e-mail from a user, experts process it in real time, the new components are added to the corresponding component libraries, the new components and related information are returned to the user in real time, and the original "feedback sentence" flag is replaced with the user's participation;
regarding saving the content of the world text buffer as world text: while the user reads the foreign text in the native language, component codes such as sentence pattern codes, idiom codes, cabin model codes and meaning-group codes, together with sentence cabin labels and cabin eye labels, are written into the world text buffer in real time, and saving them produces the world text. The foreign text is thus read directly in the native language, and world text is generated as a by-product; once one person has read a foreign text, everyone after can read the world text. Reading world text is faster than reading the foreign text in the native language, needs no intervention, preserves the meaning accurately, and lets the user choose the output language. Reading world text out in multiple languages is only a decoding process, with the following steps:
① take out the codes one by one in order;
② use a switch statement to classify the codes and handle each class separately;
③ if the code is a sentence cabin label or a cabin eye label, use it to indicate the current sentence cabin or current cabin eye;
④ decompose a sentence pattern code, idiom code, cabin model code or meaning-group code into a library and a record number, and output the content of the field for the chosen output language in that record of that library; if it is a meaning-group code, place the output in the current sentence cabin or current cabin eye as indicated;
⑤ continue with ① until the end of the text.
The method for reading a foreign language in one's native language based on sentence components is in fact only one application of sentence components. By reference to its encoding steps, in which the four sentence component libraries are applied to the current sentence, and to the decoding steps for reading out world text, several further application systems based on sentence components can be produced:
a world text generation method based on sentence components, used to convert conventional text into world text, which can then be read out in multiple languages;
a text conversion method based on sentence components, used to convert a text in some source language into a text in some target language, or into several languages;
a machine translation method based on sentence components, used to translate some source language into a target language, or into several languages.
Compared with the prior art, the beneficial effects of the present invention are:
1. The sentence component storage unit of the sentence component device has four libraries, which store sentence patterns, cabin models, meaning-group strings [...] code field, used to compile the Yitong code (意通代码). The Yitong code not only uniquely represents the shared meaning of the like components in one record, but can also be decomposed into a particular record of a particular library. With this design, components can be converted into one another directly, or via the Yitong code, without any change of meaning.
2. Sentence pattern and cabin model components give the sentence a framework that takes care of complex grammar and fixes the positions and order of the sentence cabins and cabin eyes it contains. This avoids the syntactic and grammatical analysis by artificial intelligence that the prior art struggles with, and helps the result express the same meaning as the original.
3. Tagging by the industry or application domain of the training corpus yields sub-libraries suited to industry-specific or special-purpose versions, and a support website is provided; this helps segment users and adapts the product better to each user group.
4. The steps of the method for making sentence components make the meanings of characters, words and sentences in different languages equivalent and unified, which is what turns them into sentence components.
5. The sentence component libraries contribute more, and more reliably, to the quality of translation and text conversion than the electronic dictionaries and rule bases of prior-art machine translation; the applications built on sentence components therefore produce translations or converted texts of higher quality.
6. The method for reading a foreign language in one's native language is one application of sentence components. A foreign text can be read directly in the native language by anyone, and world text is generated in the process; once one person has read a foreign text, everyone after can read the world text, read out in their own language without intervention, which is something people have long dreamed of.
7. Website support keeps products based on sentence components usable and establishes contact with users, which benefits service quality and version upgrades.

Brief description of the drawings
Fig. 1 is a schematic structural diagram of the sentence component device;
Fig. 2 is a schematic diagram of sentence components;
Fig. 3 is a schematic diagram of the sentence pattern library;
Fig. 4 is a schematic diagram of the cabin model library;
Fig. 5a is a schematic diagram of the meaning-group string library (English single strings);
Fig. 5b is a schematic diagram of the meaning-group string library (English compound strings);
Fig. 6 is a schematic diagram of the idiom library;
Fig. 7 is a flowchart of the sentence-pattern-level comparison;
Fig. 8 is a flowchart of the sentence-cabin-level comparison;

Detailed description
The inventor holds that, for computer processing of natural-language text, differences in grammar, pronunciation and vocabulary are not the real difficulty. The key problem is that the meanings of characters, words and sentences in different natural languages are neither equivalent nor unified, and the same holds for their representation inside the machine. The computer is therefore asked to understand and analyze text as a person would, with the result that "the 'semantic barrier' still exists today, and translation results are often laughable".

Sentence component theory (the inventor's finding, not previously published):
A computer cannot yet understand meaning as the human brain does; its strength lies in storage and search. The human brain is the opposite: it understands meaning, but its storage and search capacity falls far short of a computer's. Brain and computer complement each other well, but whether this complementarity can be exploited depends on how language is represented inside the computer. Fortunately, the essence of natural language is the expression of meaning, and the meanings expressed are shared by all humans. In every language, characters form words, words form sentences and sentences form texts; the essential properties are that the sentence is the basic unit expressing a complete meaning, and that sentences in different languages can express the same meaning. Natural language is a product of the development of human society as a whole. When languages arose, people lived scattered across many independent societies, separated in space and time, and languages evolved slowly within these independent systems. As a result, languages differ in pronunciation and grammar, and even more in vocabulary and script, so that the meanings of characters, words and sentences are neither equivalent nor unified across languages. If the computer representation of language could make these meanings equivalent and unified, translation and conversion between languages would no longer be difficult. How can the meanings of characters, words and sentences in different languages be made equivalent and unified?
The inventor began from sentence patterns, not the sentence patterns of grammar books but sentence patterns convenient for computer processing, and after many years of study arrived at the sentence component theory. The main propositions of the theory, and the concepts that have a specific meaning in the present invention, are defined and explained as follows:
Sentence: in natural language, the basic unit expressing a complete meaning is called a sentence; sentences in different languages can express the same meaning. A sentence is divided into two parts, the sentence pattern and the sentence cabins, and a sentence pattern contains at least one sentence cabin.
Sentence pattern: the part abstracted from a class of sentences that is relatively stable within the sentence, embodies its basic meaning and category, and forms the basic structural frame of that class of sentences. The basic meaning and category embodied by a sentence pattern are common to all humanity and cross-lingual, whereas its basic structural frame is specific to a particular natural language and takes care of that language's complex, idiosyncratic grammatical phenomena.
Sentence cabin: the flexible, replaceable parts set into the basic structural frame of the sentence pattern. Sentence cabins are selected and constrained by the sentence pattern, and can be filled or replaced with meaning-group strings to form a rich variety of concrete sentences. The number of sentence cabins and their semantic content are common to all humanity and cross-lingual, but their position and order in the frame, and the meaning-group strings that fill them, are specific to a particular natural language; even where sentence cabins show grammatical phenomena, these are extremely [...]
Explanation of sentence patterns and sentence cabins by example (# marks the line number):
1# 只要会 sentence pattern (01074), 2 cabins
2# 只要你{1}, 你就会{2}。 1{坚持不懈} 2{成功}
3# {1} and you will {2}. 1{Persevere} 2{succeed}
4# Если {1}, то {2}. 1{Вы настаиваете} 2{будете с успехом}
5# (other languages omitted)
6# 象一 sentence pattern (00892), 4 cabins
7# {1}象{4}一样{3}{2}吗? 1{约翰} 4{亨利} 3{努力} 2{工作}
8# Does {1} {2} as {3} as {4}? 1{John} 2{work} 3{hard} 4{Henry}
9# {1} {2} так {3} как {4} 1{Работает} 2{Джон} 3{усердно} 4{Генри}
10# (other languages omitted)
11# 的高 sentence pattern (00922), 3 cabins
12# {2}的{1}高于{3}。 2{智慧} 1{价值} 3{红宝石}
13# The {1} of {2} is above {3}. 1{price} 2{wisdom} 3{rubies}
14# {1} {2} выше {3}. 1{Стоимость} 2{мудрости} 3{рубина}
15# (other languages omitted)
In the example above: ① Lines 1#, 6# and 11# each give the part of a sentence pattern that expresses its basic meaning and category; this part is world-oriented and cross-lingual. For instance, "只要会句型" names the category and basic meaning, and (01074) is the sentence pattern number, that is, the low word of the Yitong code written in decimal.
② Lines 2# to 5#, 7# to 10# and 12# to 15# give the structural frameworks of the three sentence patterns, which are specific to individual natural languages. The left part of each line is the framework of the sentence pattern, with the sentence cabins in curly braces; the right part lists the corresponding cabins with example contents. Lines 2#, 7# and 12# are Chinese; 3#, 8# and 13# are English; 4#, 9# and 14# are Russian; 5#, 10# and 15# stand for other languages.
③ In the examples above, the number inside or in front of a pair of curly braces is the sentence cabin number. The number of sentence cabins (2 cabins in 1#, 4 cabins in 6#) and the meaning each cabin carries are world-oriented and cross-lingual, whereas a cabin's position and order in the framework of the sentence pattern, and the meaning-group string used to fill it, are language-specific (for example, {2} in lines 7# to 9# occupies a different position in the Chinese, English and Russian patterns, and is filled with the strings 工作, work and Джон respectively).
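The split between a language-specific framework and cross-lingual cabin numbers can be made concrete with a short Python sketch. It is only an illustration: the dictionary layout, the language keys `zh` and `en`, and the functions `fill_pattern` and `cabin_order` are hypothetical names chosen here, not part of the described device. The frames and cabin contents are copied from lines 2#, 3#, 7# and 8# of the example.

```python
import re

# Frameworks of two sentence patterns from the example, keyed by pattern
# number and language. The layout and the keys "zh"/"en" are illustrative.
PATTERNS = {
    "01074": {"zh": "只要你{1},你就会{2}。", "en": "{1} and you will {2}."},
    "00892": {"zh": "{1}象{4}一样{3}{2}吗?", "en": "Does {1} {2} as {3} as {4}?"},
}

# Cabin contents per language; the cabin numbers are shared across languages.
CABINS = {
    "01074": {
        "zh": {1: "坚持不懈", 2: "成功"},
        "en": {1: "Persevere", 2: "succeed"},
    },
    "00892": {
        "zh": {1: "约翰", 2: "工作", 3: "努力", 4: "亨利"},
        "en": {1: "John", 2: "work", 3: "hard", 4: "Henry"},
    },
}


def fill_pattern(number, lang):
    """Fill every {n} marker in the framework with the content of cabin n."""
    frame = PATTERNS[number][lang]
    cabins = CABINS[number][lang]
    return re.sub(r"\{(\d+)\}", lambda m: cabins[int(m.group(1))], frame)


def cabin_order(number, lang):
    """Return the cabin numbers in the order they appear in the framework."""
    return [int(n) for n in re.findall(r"\{(\d+)\}", PATTERNS[number][lang])]


print(fill_pattern("00892", "en"))
print(fill_pattern("00892", "zh"))
print(cabin_order("00892", "en"), cabin_order("00892", "zh"))
```

The same four cabin numbers appear in both languages, but `cabin_order` returns them in a different sequence for each framework, which is the point made in ③.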
The above is a brief illustration of the definitions of sentence pattern and sentence cabin. A sentence cabin is filled with, or composed of, meaning-group strings governed by meaning groups (as a first approximation, it can be thought of as filled with word strings). Sentence cabins vary considerably in size, however: the smallest contains a single meaning-group string, while the largest can contain a subordinate clause or a coordinate clause. Sentence cabins are therefore divided into two kinds, simple and complex.
Meaning group: a meaning group is the equivalence and unification of the "meaning" of characters, words, word groups or phrases in natural language; it is the basic unit of human thought. Meaning groups are not bound to any language and belong to all humanity; they also come and go as human society develops.
Meaning-group string: the representation of a meaning group in a given written language is called a meaning-group text string, or meaning-group string for short. There are two kinds: a single string contains exactly one original word string (such as the English strings in Fig. 5a); a compound string consists of two or more original word strings joined with "_" (such as the English strings in Fig. 5b).
Simple sentence cabin: a sentence cabin that contains no more than three meaning-group strings, not counting function words that carry no meaning, is called a simple sentence cabin (203 to 204 in Fig. 2). In English, words such as "a, an, the, in, on, to, and" are ignored when they carry no meaning; Chinese measure words are likewise not counted, and other languages are treated analogously. Across languages, the three strings only need to have corresponding strings with the same meaning; their order need not agree.
Complex sentence cabin: a sentence cabin that is larger than a simple sentence cabin and contains a cabin model is called a complex sentence cabin.
The sentence cabins in the example sentences listed above are all simple sentence cabins. The following example sentence contains a complex sentence cabin:
1{the fisherman} consents to 2{return the_feather_suit}, on condition that 3{fairy dance and play heavenly music for him}.
在3{仙女为他跳舞并演奏天上的乐曲}的条件下,1{渔夫}答应2{归还羽衣}。
In this example, sentence cabins 1 and 2 are simple sentence cabins (cabin 2 contains a compound string in English); sentence cabin 3 is larger than a simple sentence cabin, contains a cabin model, and is therefore a complex sentence cabin.
Cabin model and cabin eye: when the content of a complex sentence cabin is analysed further, the part that acts as a framework, like the framework of a sentence pattern, is called the cabin model; the replaceable parts embedded in the cabin model framework are called cabin eyes. Sentence cabin and cabin eye stand in a superordinate-subordinate relation, but a simple sentence cabin and a cabin eye have the same size limit: no more than three meaning-group strings, not counting function words that carry no meaning. Take sentence cabin 3 of the example above:
3{fairy dance and play heavenly music for him}
3{仙女为他跳舞并演奏天上的乐曲}
Analysing it with the sentence pattern theory yields the cabin model:
(00205) {1} + {2} + and + {3} + for him
{1} + 为他 + {2} + 并 + {3}
Here (00205) is the cabin model number. This cabin model contains three cabin eyes, none of which holds more than three meaning-group strings:
3{1{fairy} 2{dance} and 3{play heavenly music} for him}
3{1{仙女}为他2{跳舞}并3{演奏 天上的 乐曲}}
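The nesting of a cabin model inside a sentence cabin can be sketched in Python as a recursive fill. This is a minimal illustration, not the patent's implementation: the function `fill`, the tuple convention for a complex cabin, and the sentence-level frames (read off the fisherman example, which is given no pattern number in the text) are assumptions made here.

```python
import re


def fill(frame, slots):
    """Fill {n} markers in a frame.

    A slot value is either a plain string (simple sentence cabin or cabin
    eye) or a (cabin_model_frame, cabin_eyes) pair (complex sentence cabin),
    which is filled recursively.
    """
    def render(match):
        value = slots[int(match.group(1))]
        if isinstance(value, tuple):
            inner_frame, inner_slots = value
            return fill(inner_frame, inner_slots)
        return value

    return re.sub(r"\{(\d+)\}", render, frame)


# Cabin model (00205) as given in the text.
CABIN_MODEL_00205 = {"en": "{1} {2} and {3} for him", "zh": "{1}为他{2}并{3}"}

# Sentence-level frames read off the example sentence (illustrative).
SENTENCE_FRAME = {
    "en": "{1} consents to {2}, on condition that {3}.",
    "zh": "在{3}的条件下,{1}答应{2}。",
}

english = fill(SENTENCE_FRAME["en"], {
    1: "the fisherman",
    2: "return the_feather_suit",
    3: (CABIN_MODEL_00205["en"], {1: "fairy", 2: "dance", 3: "play heavenly music"}),
})
chinese = fill(SENTENCE_FRAME["zh"], {
    1: "渔夫",
    2: "归还羽衣",
    3: (CABIN_MODEL_00205["zh"], {1: "仙女", 2: "跳舞", 3: "演奏天上的乐曲"}),
})
print(english)
print(chinese)
```

Only cabin 3 carries a cabin model; cabins 1 and 2 are filled directly with meaning-group strings, matching the classification given for this example.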
do? 您好! Get away! 滚开!" and so on.
Statement component: a statement component is the equivalence and unification of the meaning expressed by characters, words and sentences across different written languages. On the premise that the meanings expressed in natural languages are mutually intelligible among humans, multilingual sentence pairs are analysed and compared, yielding statement components whose meanings are equivalent and unified: sentence patterns, cabin models, meaning-group strings and small idioms. Once stored in a library and encoded, statement components serve as parts for assembling sentences, or as standard parts for encoding sentences. Statement components comprise sentence pattern components, cabin model components, meaning-group string components and small idiom components.
Yitong code: the unified, multilingual encoding of statement components whose meanings are equal and mutually intelligible is called the Yitong code.
World text: a special text file generated from Yitong codes, which embodies the mutual intelligibility of meaning across written languages and can be read out or converted into text in multiple languages. Such a file is expected to be usable worldwide and is therefore called world text.
The statement component theory reduces the complexity of natural language while accommodating its flexibility, and resolves the problem of grammatical inconsistency between languages. Syntactic analysis and semantic understanding are assigned to the human brain: experts, following the sentence pattern principle, analyse, compare and organise the meaning of sentences at two levels, the sentence pattern level and the meaning-group string level. This difficult work, which requires understanding, is done by humans once and for all. The routine, monotonous and tedious work of memorising, searching and matching is handed to the computer. The computer provides a convenient operating platform, and human-computer interaction lets the human brain and the computer complement each other. The sentence patterns, cabin models, meaning-group strings and small idioms produced during analysis and comparison are stored in libraries and uniformly assigned Yitong codes, which yields the statement component libraries.
The statement component libraries store statement components. These components are equivalent in meaning across multiple written languages and can be assembled and spliced into sentences (Fig. 2). They can be used to assemble sentences; sentences can also be encoded with them, and the matching Yitong codes can be used to generate world text and the like. In this process the computer only needs to perform simple operations such as table lookup, comparison, encoding and decoding. When natural-language translation or text conversion is based on statement components, the resulting translation or converted text is not only readable but also consistent in meaning with the original.
The content of the present invention is further described below by way of embodiments, with reference to the accompanying drawings.
I. A statement component device
Fig. 1 is a schematic structural diagram of the statement component device. As shown in Fig. 1, the statement component device comprises seven units: a statement component storage unit 101, an original unit 102, a Yitong code preparation unit 103, a component readout unit 104, a component matching unit 105, a component adding unit 106, and a component library operation control and interface unit 107:
(1) The statement component storage unit 101 is the central unit of the device. It contains two-dimensional database tables, held as electronic data, that store statement components whose meanings are equivalent across languages. There are four statement component libraries: the sentence pattern library, the cabin model library, the meaning-group string library and the idiom library (see Figs. 3 to 6):
1. The sentence pattern library 300 stores sentence pattern components. It has the fields sentence pattern code, English sentence pattern, Chinese sentence pattern and Russian sentence pattern, as shown in Fig. 3. It contains at least one record; sentence patterns with the same meaning share one record, and the sentence pattern of each language is stored in the sentence pattern field for that language (301). The sentence pattern stored for a language is in fact the framework part of the sentence pattern and is specific to that natural language. The curly braces in it denote sentence cabins, and the number inside is the cabin number. Sentence cabins are filled with meaning-group strings; the position and order of a cabin in the sentence pattern, and the strings that fill it, are specific to each natural language. As the library contents 301 in Fig. 3 show, the same sentence cabin carries the same number in every language, but its position and order in the sentence patterns of the different languages do not agree. The sentence pattern code field holds the sentence pattern code, which represents the meaning of the sentence patterns in all the language fields of the same record. The basic meaning and category that a sentence pattern gives to its class of sentences are common to all humanity and cross-lingual, as are the number of sentence cabins it contains and the meaning of each cabin; the sentence pattern code is this cross-lingual representation. In other words, the sentence pattern code stands for the meaning of the sentence pattern and maps to the sentence pattern in each language, and the sentence pattern in one language can be mapped through the code to the sentence pattern in another. Grammar belongs to the individual natural languages: the framework part of the sentence pattern absorbs the complex, language-specific grammatical phenomena, whereas any grammatical phenomena remaining inside a sentence cabin are extremely simple.
2. The cabin model library 400 stores cabin model components. It has the fields cabin model code, English cabin model, Chinese cabin model and Russian cabin model, as shown in Fig. 4. It contains at least one record; cabin models with the same meaning share one record, and the cabin model of each language is stored in the cabin model field for that language (401). A cabin model is the framework part of a complex sentence cabin and is specific to each natural language. The square brackets in it denote cabin eyes, and the number inside is the cabin eye number; cabin eyes are likewise filled with meaning-group strings. The position and order of a cabin eye in the cabin model, and the strings that fill it, are specific to each natural language. As the library contents 401 in Fig. 4 show, the same cabin eye carries the same number in every language, but its position and order in the cabin models of the different languages do not agree. The cabin model code field holds the cabin model code, which represents the meaning of the cabin models in all the language fields of the same record. The basic meaning of a cabin model is common to all humanity and cross-lingual, as are the number of cabin eyes it contains and the meaning of each cabin eye; the cabin model code is this cross-lingual representation. In other words, the cabin model code stands for the meaning of the cabin model and maps to the cabin model in each language, and the cabin model in one language can be mapped through the code to the cabin model in another. The grammar inside a sentence cabin likewise belongs to the individual natural languages: the cabin model absorbs the grammatical phenomena, whereas any grammatical phenomena remaining inside a cabin eye are extremely simple.
3. The meaning-group string libraries 500 and 502 store meaning-group string components. They have the fields meaning-group code, English string, Chinese string and Russian string, as shown in Figs. 5a and 5b. Each contains at least one record; meaning-group strings with the same meaning share one record, and the string of each language is stored in the string field for that language (501, 503). A meaning-group string is the content of a sentence cabin or a cabin eye. Sentence cabin and cabin eye stand in a superordinate-subordinate relation. Sentence cabins are either simple or complex; once the framework of a complex sentence cabin, which resembles the framework of a sentence pattern, has been extracted, what remains are cabin eyes. A simple sentence cabin and a cabin eye have the same size limit: no more than three meaning-group strings, not counting function words that carry no meaning. In alphabetic scripts a meaning-group string is either a single string, consisting of one original word string (501), or a compound string, consisting of more than one original word string joined with "_" (503). The meaning-group code field holds the meaning-group code, which represents the meaning of the strings in all the language fields of the same record and is cross-lingual, whereas the strings themselves are specific to each natural language. In other words, the meaning-group code stands for the meaning of the meaning-group string and maps to the string in each language, and the string in one language can be mapped through the code to the string in another.
4. The idiom library stores small idiom components. It has the fields idiom code, English idiom, Chinese idiom and Russian idiom, as shown in Fig. 6. It contains at least one record; small idioms with the same meaning share one record, and the small idiom of each language is stored in the idiom field for that language (601). The idiom code represents the meaning of the small idioms in all the language fields of the same record. In other words, the idiom code stands for the meaning of the small idiom and maps to the small idiom in each language, and the small idiom in one language can be mapped through the code to the small idiom in another.
The structure of the four libraries above enforces that only components of the same kind and the same meaning share a record, and each record has a code field in which the Yitong code is written. The Yitong code and the components in the same record map to one another. This structure guarantees that components can be converted into one another, directly or by way of the Yitong code, without any change of meaning; in other words, different languages can be converted into one another on this basis. The four libraries are parallel to one another: they do not interfere with each other, they coexist within the statement component storage unit, and all of them accept operation and control by the other units.
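The record structure described above, one record per meaning with one field per language plus a code field, can be sketched in Python as follows. This is an illustrative model only: the list-of-dictionaries layout, the field names `code`, `en` and `zh`, the placeholder code values and the three helper functions are assumptions made here, and the strings come from the earlier example sentence.

```python
# A few records of a meaning-group string library. Each record holds strings
# with the same meaning; the "code" values are placeholders, not real codes.
MEANING_GROUP_LIBRARY = [
    {"code": 1, "en": "John", "zh": "约翰"},
    {"code": 2, "en": "work", "zh": "工作"},
    {"code": 3, "en": "hard", "zh": "努力"},
    {"code": 4, "en": "Henry", "zh": "亨利"},
]


def convert(library, text, source, target):
    """Direct conversion: find the record whose source field equals text."""
    for record in library:
        if record.get(source) == text:
            return record.get(target)
    return None  # no record with this component


def to_code(library, text, source):
    """Map a component in one language to the code of its record."""
    for record in library:
        if record.get(source) == text:
            return record["code"]
    return None


def from_code(library, code, target):
    """Map a code back to the component in the requested language."""
    for record in library:
        if record["code"] == code:
            return record.get(target)
    return None


print(convert(MEANING_GROUP_LIBRARY, "work", "en", "zh"))
print(from_code(MEANING_GROUP_LIBRARY, to_code(MEANING_GROUP_LIBRARY, "亨利", "zh"), "en"))
```

Because every field of a record carries the same meaning, the direct route and the route through the code give the same result, which is the property the text attributes to this structure.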
(2) The original unit 102 stores the index files for the four libraries above; it also includes the existing CPU and the like.
(3) The Yitong code preparation unit 103 is connected to the statement component storage unit 101. Only when a new record appears in any one of the four libraries does it take the representative number of the current library as the high word (for example idiom library = FF00H, sentence pattern library = ?00011, cabin model library = EF00H, meaning-group string library = 0001H; these numbers also serve as the library tags of the four libraries and mark the numeric segments: each tag is the start of a segment, and the segment ends at the next library tag minus 1), add the record number of the current library to form the Yitong code, and write it into the code field of the current library. The result is a unified, double-byte, fixed-length Yitong code for statement components whose meaning is shared across languages. Within the current library and the current record, a Yitong code is the unique representative of the single meaning shared by the components in all languages;
(4) The component readout unit 104 is directly connected to the statement component storage unit 101. It receives a readout command and uses the segment tag contained in the Yitong code to determine the library and the record: the Yitong code value minus the minimum library tag gives the record number in the library designated by that tag. It then reads the component in the required language from the corresponding record of the corresponding library.
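One possible reading of this tag-and-segment scheme is sketched below in Python. Several points are assumptions rather than statements of the source: the code is treated as a 16-bit value (suggested by "double-byte"), a code is formed by adding the record number to the library tag, and decoding subtracts the tag of the segment in which the code falls. The sentence pattern library tag is illegible in the source and is deliberately left out, so the cabin model segment in this sketch runs further than it would in the full scheme. The names `LIBRARY_TAGS`, `encode` and `decode` are hypothetical.

```python
from bisect import bisect_right

# Library tags that are legible in the text. The sentence pattern library
# tag is garbled in the source ("?00011") and is left out on purpose.
LIBRARY_TAGS = {
    "meaning_group": 0x0001,
    "cabin_model": 0xEF00,
    "idiom": 0xFF00,
}
CODE_LIMIT = 0x10000  # assumes a 16-bit ("double-byte") code space

_STARTS = sorted((tag, name) for name, tag in LIBRARY_TAGS.items())


def segment_end(name):
    """Last code of a library's segment: the next library tag minus 1."""
    tag = LIBRARY_TAGS[name]
    higher = [t for t, _ in _STARTS if t > tag]
    return (min(higher) if higher else CODE_LIMIT) - 1


def encode(name, record_no):
    """Form a code as library tag + record number, kept inside the segment."""
    code = LIBRARY_TAGS[name] + record_no
    if record_no < 0 or code > segment_end(name):
        raise ValueError("record number outside the library's segment")
    return code


def decode(code):
    """Find the segment containing code; return (library, record number)."""
    index = bisect_right([t for t, _ in _STARTS], code) - 1
    if index < 0 or code >= CODE_LIMIT:
        raise ValueError("code outside every segment")
    tag, name = _STARTS[index]
    return name, code - tag


code = encode("cabin_model", 205)
print(hex(code), decode(code))
```

Keeping the tags in a table rather than hard-coding ranges means a further library only needs one more entry, as long as its tag does not fall inside a segment that is already in use.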
(5) The component matching unit 105 is directly connected to the statement component storage unit 101 and receives matching commands. Given a sentence or sentence cabin content in a stated language, and guided by the current operating point, it searches the index field for that language in the corresponding component library for a matching record and returns the matching component in the required language. If no record matches, it returns a no-match signal.
(6) The component adding unit 106 is directly connected to the statement component storage unit 101. It receives commands to add a new component and, after a query has confirmed that the corresponding component library does not already contain the same component, adds the new component to the field for its language in that library. When the new component starts a new record, it also notifies the Yitong code preparation unit 103.
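The behaviour of the matching unit 105 and the adding unit 106 can be sketched with a small in-memory Python class. This is a simplified stand-in under stated assumptions: the class `ComponentLibrary`, its method names and the callback used to represent the notification to unit 103 are all hypothetical, and a linear scan replaces the index files mentioned for unit 102.

```python
class ComponentLibrary:
    """In-memory stand-in for one component library (one table)."""

    def __init__(self, on_new_record=None):
        self.records = []                   # record number = list index
        self.on_new_record = on_new_record  # stands in for notifying unit 103

    def match(self, lang, text):
        """Return the number of the record whose lang field equals text."""
        for number, record in enumerate(self.records):
            if record.get(lang) == text:
                return number
        return None                         # the "no match" signal

    def add(self, lang, text, record_no=None):
        """Add a component unless an identical one is already stored."""
        existing = self.match(lang, text)
        if existing is not None:
            return existing
        if record_no is None:               # the component starts a new record
            self.records.append({lang: text})
            record_no = len(self.records) - 1
            if self.on_new_record:
                self.on_new_record(record_no)
        else:                               # same meaning, another language
            self.records[record_no][lang] = text
        return record_no


notified = []
idioms = ComponentLibrary(on_new_record=notified.append)
number = idioms.add("en", "How do you do?")
idioms.add("zh", "您好!", record_no=number)
print(idioms.match("zh", "您好!"), notified)
```

The notification fires only when a record is created, not when a second language is attached to an existing record, mirroring the condition under which unit 103 is said to be informed.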
(7) The component library operation control and interface unit 107 is connected to the statement component storage unit 101 through the component readout unit 104, the component matching unit 105 and the component adding unit 106. It receives calls or commands from the various applications built on these statement components, performs the corresponding operations and returns the statement components the caller needs; other application devices based on statement components are also connected through this interface.
The statement components described above (see Fig. 2 and Figs. 3 to 6) are the parts from which sentences of a language are assembled, and also the standard parts used to split sentences into parts and encode them. There are four kinds:
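1. The sentence pattern component constitutes the basic structural framework of a sentence. It represents the basic meaning and category of that class of sentences, determines the position and number of the sentence cabins the sentences contain, and absorbs the more complex grammatical phenomena of that class of sentences.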
2. The cabin model components 202, 401 constitute the basic structural framework of a complex sentence cabin. A cabin model component represents the basic meaning and category of that class of sentence cabins, determines the position and number of the cabin eyes they contain, and absorbs the more complex grammatical phenomena of that class of sentence cabins.
Sentence pattern components and cabin model components both supply a framework for the sentence, absorb the complex grammar and fix the positions of the sentence cabins and cabin eyes. This avoids the burden, found in the prior art, of using artificial intelligence for syntactic and grammatical analysis, and helps the result stay consistent in meaning with the original.
3. The meaning-group string components 501 to 503 are components formed by meaning-group strings. They are used to fill simple sentence cabins 203 to 204 or cabin eyes 205 to 207. A simple sentence cabin and a cabin eye stand in a superordinate-subordinate relation but have the same size limit: no more than three meaning-group strings, not counting function words that carry no meaning;
4. The small idiom component 601 is a sentence too short to be divided into a sentence pattern and sentence cabins. It is used to form a short sentence directly.
When a language is added to the statement component device beyond those already present (English, Chinese, Russian), one field for the new language is first added to each of the sentence pattern library, the cabin model library, the meaning-group string library and the idiom library: a sentence pattern field, a cabin model field, a string field and an idiom field respectively. A component in the new language may only be added to a record whose existing components have the same meaning. This again enforces the rule that only statement components with the same meaning share a record.
From each of the four libraries, two fields alone can be extracted to form a sub-library: the field for one language (sentence pattern, cabin model, string or idiom) together with the code field. Such sub-libraries constitute the library for a given language, or a first-language library and a second-language library, and are used in language translation, text conversion and similar applications.
The statement components described above come from two sources. The first is the analysis and comparison of bilingual training sample corpora, carried out by experts through human-computer interaction. The second is user feedback, which is added after expert review; this is implemented through a support website.
II. A method of making statement components
The method of making statement components is as follows: ① prepare the sample corpus, taking bilingual or multilingual versions of the same content as training samples; then, through human-computer interaction, ② carry out analysis and comparison at the sentence pattern level, followed by ③ analysis and comparison at the sentence cabin level. This yields statement components in which the meanings of characters, words and sentences are equivalent and unified. The method comprises the following steps:
(1) Prepare the sample corpus, using bilingual or multilingual versions of the same content as training samples. In each round of comparison, two languages A and B are chosen as a sample pair. Language A is an alphabetic-script language or a language that has already been compared. In the first round of analysis and comparison of bilingual training samples, language A is English and language B is Chinese. From the second round on, one language of every new pair must already have been analysed and compared. For example, when Russian is added, only a Chinese-Russian or English-Russian corpus can serve as the bilingual training sample: in this second round, language A is the already compared Chinese or English, and language B is the newly added Russian. This requirement is a strong safeguard for the rule emphasised above, that only components with the same meaning share a record.
The training corpus of each round should be large enough that the ratio of newly added sentence patterns to sentence examples falls below 1% before a new language is considered. For example, in the statistics for one working day of operation, the number of newly added sentence patterns divided by the number of newly added sample sentence pairs should be below 1%. Only then is the next round of analysis and comparison considered.
In addition, the training sample corpus is labelled by the industry or field of application it comes from. These labels are used to divide the sentence pattern library, the cabin model library, the meaning-group string library and the idiom library into corresponding sub-libraries, which serve the respective industries or special-purpose versions. Together with the support website, this helps with user segmentation and with version upgrades. Collecting the corpus naturally includes keying it in where necessary.
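The 1% criterion for moving on to a new language is simple arithmetic, shown here as a small Python helper. The function name `ready_for_next_language` and the sample counts are made up for illustration; only the threshold and the ratio (new sentence patterns divided by new sample sentence pairs) come from the text.

```python
def ready_for_next_language(new_patterns, new_sentence_pairs, threshold=0.01):
    """True when new patterns per new sample sentence pair fall below 1%."""
    if new_sentence_pairs <= 0:
        raise ValueError("at least one sample sentence pair is required")
    return new_patterns / new_sentence_pairs < threshold


# Hypothetical daily tallies: 4 new patterns in 500 pairs is 0.8%,
# 6 new patterns in 500 pairs is 1.2%.
print(ready_for_next_language(4, 500))
print(ready_for_next_language(6, 500))
```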
(2) Sentence-pattern-level analysis and comparison
Figure 7 is the flowchart of the sentence-pattern-level comparison. At this level, bilingual sample sentence pairs are read and divided into sentence patterns and sentence cabins. Each sentence pattern is stored in the sentence pattern library as a sentence component, and a small idiom that cannot be divided into a pattern and cabins is stored in the idiom library as a small-idiom component. The bilingual sample pairs that have been divided into patterns and cabins are saved for the further comparison at the sentence-cabin level.
The steps shown in Figure 7 are as follows. First, a bilingual sample sentence pair is read in (701). The sentence-pattern matching subroutine is then called (702); it searches the sentence pattern library and returns the matching patterns for languages A and B. Step 703 tests whether a matching pattern exists. If one exists, processing continues with pattern-example matching (707). If none exists, processing continues with the operation of digging out sentence cabins to make a sentence pattern (704), in which a new pattern is made from the current bilingual sample pair. A window pops up that shows the language A sentence on the upper row and the language B sentence on the lower row, with two command buttons beneath them, "dig cabin" and "save pattern". The expert is prompted to click the start and end points of the cabin to be dug out in both the A and B sentences, and the cabin counter is set to N=0. When the "dig cabin" button is clicked, N is set to N+1, and the program checks that two points have been clicked in both the A and B sentences and that the points are valid. Two points are valid if they enclose at least one word string in an alphabetic script, or at least one character in an ideographic script. If the check fails, the expert is prompted to redo the selection. If the clicks are correct and valid, the content between the two points in each of the A and B sentences is removed and replaced with "{N}", which ends this round of cabin digging; the next round repeats operation 704 to dig out the next cabin. When the "save pattern" button is found to have been clicked (705) and N≥1, the making of the new pattern is complete. The display is cleared and the pattern or small idiom is saved (706): the two new patterns are added as sentence-pattern components to the language A pattern field and the language B pattern field of the sentence pattern library. If N=0 at that moment, the current sample pair cannot be divided into a pattern and cabins and is judged to be a small idiom; the display is cleared and the two small idioms are added as small-idiom components to the language A idiom field and the language B idiom field of the idiom library. Pattern-example matching (707) follows: the contents of the current bilingual sample pair are filled, slot by slot, into the current matching pattern or the newly made pattern, and the result is saved to disk as a sample pair divided into pattern and cabins, to be read by the sentence-cabin-level analysis. This ends the step, and the next bilingual sample pair is read in (701).
A preferred form of the sentence-pattern matching subroutine is as follows. Each pattern is converted in advance into a pattern word string (for example, "Does {1} {2} as {3} as {4}?" becomes " does as as "). The example sentence is then taken from left to right, word by word for English (an alphabetic script) and character by character for Chinese (a non-alphabetic script), and these are used to look up patterns by their first character or first word; the patterns that qualify are collected in a temporary library. A loop then tests the candidate patterns one by one. Inside the loop a switch statement is keyed on the number of gaps in the pattern word string (for example, " does as as " has 2 gaps and 3 segments); once inside, each segment of the pattern word string is compared in turn with the example sentence, and after each comparison the compared portion is discarded from both (identical portions have no effect). A pattern whose segments can all be found in the example sentence is selected as a pattern matching the example, and the matches are listed. If several patterns match, the one with the longest pattern word string is chosen. If the string lengths are equal, manual intervention is accepted.
The flow is illustrated below with concrete sentence pairs. A bilingual sample pair is read in (701), say "Does John work as hard as Henry?" and "约翰象亨利一样努力工作吗?". The matching subroutine is called (702) and searches the sentence pattern library for matching A and B patterns. Step 703 finds no matching pattern, so processing continues with operation 704 and a new pattern is made from the current pair. A window pops up showing the A sentence "Does John work as hard as Henry?" on the upper row and the B sentence "约翰象亨利一样努力工作吗?" on the lower row, with the "dig cabin" and "save pattern" buttons beneath them; the expert is prompted to click the start and end points of the cabin to be dug out in both sentences, and the cabin counter is N=0. When the "dig cabin" button is clicked, N is set to N+1 and the clicks are checked. For example, with clicks at "Does |John| work as hard as Henry?" and "|约翰|象亨利一样努力工作吗?" (where "|" marks a clicked point), the clicks are correct and valid, so the content between the two points in each sentence is removed and, since N=1, replaced with "{1}", giving "Does {1} work as hard as Henry?" and "{1}象亨利一样努力工作吗?". The "save pattern" button has not been clicked, so the next round repeats operation 704 to dig out the next cabin: with clicks at "Does {1} |work| as hard as Henry?" and "{1}象亨利一样努力|工作|吗?", N=2 and "{2}" is filled in, giving "Does {1} {2} as hard as Henry?" and "{1}象亨利一样努力{2}吗?". As long as the "save pattern" button has not been clicked, the process continues with N=3 and N=4. When the sentences have become "Does {1} {2} as {3} as {4}?" and "{1}象{4}一样{3}{2}吗?", the "save pattern" button is found to have been clicked (705) and N≥1, so the making of the new pattern is complete. The display is cleared and step 706 saves the patterns: "Does {1} {2} as {3} as {4}?" is added to the language A pattern field of the sentence pattern library and "{1}象{4}一样{3}{2}吗?" to the language B pattern field. Step 707 then fills the current sample pair, slot by slot, into the newly made pattern, giving a sample pair divided into pattern and cabins, such as "Does 1 {John} 2 {work} as 3 {hard} as 4 {Henry}?" and "1 {约翰}象 4 {亨利}一样 3 {努力} 2 {工作}吗?". This is saved to disk to be read by the sentence-cabin-level analysis, and the step ends. The next bilingual sample pair begins with step 701.
Suppose instead that the pair read in is "How do you do?" and "您好!". When the "save pattern" button is found to have been clicked (705), N=0, which means the pair cannot be divided into a pattern and cabins and is judged to be a small idiom. The display is cleared, "How do you do?" is added to the language A idiom field of the idiom library, and "您好!" is added to the language B idiom field. Pattern-example matching (707) follows and the step ends. The next bilingual sample pair begins with step 701.
The key to sentence-pattern-level comparison is digging out cabins to make patterns: how to dig, and how to make the pattern. The requirement is to pursue representativeness on the premise that operability is guaranteed. Operability means that the computer can do the work with operations such as the table lookup, testing, and storage described above, without needing to understand or analyse the text. Representativeness refers to how many sentence instances a pattern covers; the more instances it covers, the more representative it is. The principles the expert must observe during sentence-pattern-level comparison are as follows:
① Principle of considering semantics across multiple languages
Semantics must be considered across multiple language pairs, or at least across the bilingual pair. If conditions permit, patterns are naturally best extracted from as many languages as possible at the same time; it is only because this is usually impossible that at least a bilingual pair is required. For example:
We used to go to the movies about once a week. 通常我们每周大约去看一次电影。
In this pair, the English could take "go to the movies" as a cabin, but the corresponding Chinese "看电影" has "一次" inserted in the middle. The English could also take "once a week" as a cabin, but the Chinese "每周一次" is again separated by other words. Neither division works; a division is acceptable only if it holds semantically in both languages. This pair can be divided as:
1 {We} used to 2 {go to the movies about once} a week. 通常 1 {我们} 每周 2 {大约去看一次电影}。
② Principle of considering representativeness
The number and size of the cabins directly affect how representative a pattern is. The principle is to pursue representativeness on the premise that operability is guaranteed. The appropriate cabin size is discussed below; representativeness is explained first:
How many are there in your family? 你家有几口人?
If only "your family" is made a cabin in this pair, it can be filled with "his family", "John's family", "your class", and so on. However, "How many" and "family" are semantically linked: for "家" (family) Chinese asks "几口人", whereas for "班级" (class) it should ask "多少学生" or "多少人". Making only "your family" a cabin therefore gives poor representativeness. If "How many" and "family" are made into two cabins, the two can be matched to each other semantically, and representativeness also increases.
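The effect of the two-cabin division can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paired patterns `PATTERN_A` and `PATTERN_B`, the helper `fill`, and the filler "你班" are hypothetical, since the text does not give the exact pattern that results from the division.

```python
import re

def fill(pattern: str, cabins: dict[int, str]) -> str:
    """Replace every {N} slot in a pattern with the content of cabin N."""
    return re.sub(r"\{(\d+)\}", lambda m: cabins[int(m.group(1))], pattern)

# Hypothetical paired patterns with two cabins each (an assumed division).
PATTERN_A = "{1} are there in {2}?"
PATTERN_B = "{2}有{1}?"

# The same pattern pair covers both the "family" and the "class" instance,
# because the question phrase travels with the cabin content in each language.
family_a = fill(PATTERN_A, {1: "How many", 2: "your family"})
family_b = fill(PATTERN_B, {1: "几口人", 2: "你家"})
class_a = fill(PATTERN_A, {1: "How many", 2: "your class"})
class_b = fill(PATTERN_B, {1: "多少学生", 2: "你班"})

print(family_a, family_b)
print(class_a, class_b)
```

Note that the cabin order differs between the two languages ({1} {2} in English, {2} {1} in Chinese); the slot numbers carry the correspondence.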
③ Principle of plain and accurate rendering
Since the Yitong text is positioned to "convey semantics plainly and accurately", when cabin digging cannot satisfy both languages and the expert is at a loss, an ornate translation may be changed into a plain literal one before the pattern is dug. For example:
兵不厌诈。
The translation "兵不厌诈" is both ornate and concise, but a pattern is hard to collect from it. The Chinese is changed to the plain literal translation "战争中再多的诡计也不为过。" and the cabins are then dug:
There can never be too much 1 {deception} in 2 {war}. 2 {战争} 中再多的 1 {诡计} 也不为过。
For all their great size, the elephants moved absolutely noiselessly. 尽管象的身躯庞大, 它走动起来却一点声音也没有。
Here "一点声音也没有" is hard to assign either to the pattern or to any one cabin; changing it to the plain literal translation "走动起来却静静地" resolves the problem:
For all their 1 {great size}, 2 {the elephants} 3 {moved} absolutely 4 {noiselessly}. 尽管 2 {象} 的 1 {身躯庞大}, 2 {它} 3 {走动} 起来却 4 {静静地}。
④ Principle of grammatical simplicity and complexity
On the grammatical side, complex and idiosyncratic grammatical phenomena are all absorbed into the pattern, so that the grammar inside a cabin is extremely simple. The examples above make this clear. In practice the number of cabins can also be increased appropriately to reduce cabin complexity, and large cabins should be made as rarely as possible (see below).
1 {She} never 2 {comes} but 1 {she} 3 {brings something for the children}.
1 {她} 没有一次 2 {来} 不是就 3 {为孩子们带来一些东西}。
If a cabin is added, so that cabin {3} becomes {3} and {4}, the complexity is reduced:
1 {She} never 2 {comes} but 1 {she} brings 3 {something} for the 4 {children}.
1 {她} 没有一次 2 {来} 不是就为 4 {孩子们} 带来 3 {一些东西}。
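The annotated form used in these examples ("N {content}") can be separated mechanically into a pattern and its cabin contents. The sketch below is a minimal illustration, not the patent's own program; the names `split_annotated` and `largest_cabin` are hypothetical, and counting space-separated words is only a rough stand-in for cabin complexity.

```python
import re

# One annotated cabin: a number, optional spaces, then the content in braces.
CABIN = re.compile(r"(\d+)\s*\{([^{}]*)\}")

def split_annotated(sentence: str) -> tuple[str, list[tuple[int, str]]]:
    """Split an annotated sentence into its pattern and its (number, content) cabins."""
    cabins = [(int(n), text.strip()) for n, text in CABIN.findall(sentence)]
    pattern = CABIN.sub(lambda m: "{" + m.group(1) + "}", sentence)
    return pattern, cabins

def largest_cabin(cabins: list[tuple[int, str]]) -> int:
    """Word count of the largest cabin (a crude proxy for cabin complexity)."""
    return max(len(text.split()) for _, text in cabins)

coarse = "1 {She} never 2 {comes} but 1 {she} 3 {brings something for the children}."
fine = "1 {She} never 2 {comes} but 1 {she} brings 3 {something} for the 4 {children}."

coarse_pattern, coarse_cabins = split_annotated(coarse)
fine_pattern, fine_cabins = split_annotated(fine)

print(coarse_pattern, largest_cabin(coarse_cabins))
print(fine_pattern, largest_cabin(fine_cabins))
```

With the extra cabin, the words "brings" and "for the" move from cabin content into the pattern, so the pattern carries more of the grammar and each cabin holds a single word.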
⑤ Principle of part of speech and replaceability
A cabin is the part that can be replaced by other vocabulary. The more vocabulary can replace it, the stronger its replaceability, which indirectly strengthens the representativeness of the pattern. The parts of speech inside a cabin are confined as far as possible to numerals, nouns, adjectives, and compound strings; other word classes (such as verbs and adverbs) are considered only in a few cases. In order of priority for cabin content, number strings and special-purpose strings come first, then noun strings, adjective strings, and so on, with verb strings considered last. Prepositions and conjunctions are considered least of all; that is, prepositions and conjunctions almost always belong to the pattern.
Pattern collection is required to leave the grammar inside the cabins extremely simple, with complex grammatical phenomena all absorbed into the pattern. The pattern left after the cabins are dug out should not contain too few pattern words, because too few make the pattern hard to detect.
The ideal case is for every cabin to have pattern words before and after it, that is, for there to be no consecutive cabins. The rule is that the frame of any pattern, in any language, must contain one or more character strings serving as pattern words. It is not permitted for some language in a multilingual pair to have no pattern word at all. Because Chinese is the most concise, this does happen from time to time; when it does, the work has to be redone, so it must be avoided.
Sometimes several cabins occur in a row, which is the consecutive-cabin problem. There is also the matter of cabin size, that is, making large cabins as rarely as possible. Both are related to the requirement that pattern words not be too few, and both are discussed under their own headings.
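Why too few pattern words make detection hard can be seen from a small sketch of the matching subroutine described earlier: each pattern is reduced to its pattern word segments, and a pattern matches when its segments occur in the example sentence in order. This is a simplified reading under assumptions, not the patent's implementation: the names `segments`, `matches`, `best_patterns`, and the sample `LIBRARY` are hypothetical, the first-word lookup is omitted, string length is measured in pattern words, and only space-delimited scripts are handled.

```python
import re

SLOT = re.compile(r"\{\d+\}")

def segments(pattern: str) -> list[list[str]]:
    """Pattern word segments between cabins, lower-cased, punctuation dropped."""
    parts = (re.findall(r"\w+", part.lower()) for part in SLOT.split(pattern))
    return [words for words in parts if words]

def find_segment(words: list[str], seg: list[str], start: int) -> int:
    """Index of the first occurrence of seg in words at or after start, else -1."""
    for i in range(start, len(words) - len(seg) + 1):
        if words[i:i + len(seg)] == seg:
            return i
    return -1

def matches(pattern: str, sentence: str) -> bool:
    """True if every segment of the pattern is found in the sentence, in order."""
    words = re.findall(r"\w+", sentence.lower())
    pos = 0
    for seg in segments(pattern):
        i = find_segment(words, seg, pos)
        if i < 0:
            return False
        pos = i + len(seg)  # discard the compared portion
    return True

def best_patterns(library: list[str], sentence: str) -> list[str]:
    """Matching patterns with the most pattern words; several results mean a tie."""
    hits = [p for p in library if matches(p, sentence)]
    if not hits:
        return []
    weight = lambda p: sum(len(seg) for seg in segments(p))
    top = max(weight(p) for p in hits)
    return [p for p in hits if weight(p) == top]

LIBRARY = ["Does {1} {2} as {3} as {4}?", "Does {1} {2}?", "How {1} is {2}?"]
result = best_patterns(LIBRARY, "Does John work as hard as Henry ?")
print(result)
```

In this sketch "Does {1} {2}?" also matches the example sentence, because its only pattern word is "does"; the richer pattern wins only because it has more pattern words. A returned list with more than one entry corresponds to the case where manual intervention is accepted.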
⑦ Principle of avoiding multiple consecutive cabins as far as possible
1 {I} 2 {get to work} at 3 {nine o'clock} every morning. 每天早上 3 {九点钟} 1 {我} 2 {开始工作}。
In this example the English has two consecutive cabins, {1} {2}, and the corresponding Chinese has three, {3} {1} {2}. Two or more adjacent cabins are called consecutive cabins; three or more adjacent cabins are called multiple consecutive cabins. Consecutive cabins not only suffer from too few pattern words, they also require manual intervention when the pattern is applied. Runs of three or more in particular should be avoided as far as possible. Reducing the example above to two cabins removes the concern:
I 1 {get to work} at 2 {nine o'clock} every morning. 每天早上 2 {九点钟} 我 1 {开始工作}。
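A check of this kind is easy to automate once the cabins have been replaced by {N} slots. The following sketch is an illustration only; the function names `cabin_runs`, `classify`, and `has_pattern_word` are hypothetical, and cabins are treated as adjacent when nothing but whitespace separates them.

```python
import re

SLOT = r"\{\d+\}"

def cabin_runs(pattern: str) -> list[int]:
    """Lengths of the runs of cabins that are separated only by whitespace."""
    runs = re.findall(SLOT + r"(?:\s*" + SLOT + r")*", pattern)
    return [len(re.findall(SLOT, run)) for run in runs]

def classify(pattern: str) -> str:
    """Name the worst run of adjacent cabins, using the terms of principle 7."""
    longest = max(cabin_runs(pattern), default=0)
    if longest >= 3:
        return "multiple consecutive cabins"
    if longest == 2:
        return "consecutive cabins"
    return "no consecutive cabins"

def has_pattern_word(pattern: str) -> bool:
    """True if anything other than cabins, spaces, and punctuation remains."""
    return bool(re.search(r"\w", re.sub(SLOT, "", pattern)))

print(classify("{1} {2} at {3} every morning."))
print(classify("每天早上 {3} {1} {2}。"))
print(classify("I {1} at {2} every morning."))
print(classify("每天早上 {2} 我 {1}。"))
```

`has_pattern_word` covers the related rule that the pattern in every language must keep at least one pattern word; a pattern such as "{1} {2}" would fail it.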
⑧ Principle of making large cabins as rarely as possible
Cabins vary in size: the smallest contains a single meaning-group string, and the largest can contain a whole clause or subordinate clause. When making patterns, large cabins should be made as rarely as possible. How is "as rarely as possible" to be judged? The yardstick is "there is no other way". Consider the following pairs:
Can you guess 1 {what I was doing} 2 {this morning}? 你能猜到 2 {今天上午} 1 {我在做什么} 吗?
I have forgotten 1 {what time} he said he 2 {had dinner} 3 {last night}. 我忘记他说他 3 {昨天晚上} 是 1 {什么时候} 2 {吃的晚饭}。
What were you doing when 1 {I called you on the telephone}? 1 {我打电话给你} 的时候, 你在做什么?
I have forgotten 1 {what he said his address was}.
In these four pairs a subordinate clause is introduced by "what" or "when". In the first pair the subject-predicate part and the time adverbial can be separated into two cabins. The second pair can be made into three cabins. In the third pair the adverbial cannot be separated, so only one cabin can be made. In the fourth pair "what" questions the predicative and cannot be separated either, so again only one cabin can be made. The third and fourth pairs both fall under "there is no other way", and that is the yardstick for making large cabins rarely.
⑨ Principle of review after cabin digging
After the cabins have been dug out, the result must be reviewed carefully: the pattern and then each cabin are reviewed in turn. First examine the meaning of the pattern: it must come entirely from the pattern words and must not be entangled with cabin content. Then review each cabin: it must be replaceable, separate from the pattern, and not entangled with the meaning of the pattern. If a cabin is entangled with a pattern word, it must be revised. For example:
When do you think 1 {the meeting will be held}? 你认为 1 {会议在什么时候召开}?
This division into pattern and cabin is wrong: the cabin content "什么时候" is entangled with the pattern word "When". It should be revised to:
When do you think 1 {the meeting} will be 2 {held}? 你认为 1 {会议} 会在什么时候 2 {召开}?
(3) Sentence-cabin-level analysis and comparison
Figure 8 is the flowchart of the sentence-cabin-level comparison. As shown in Figure 8, when the process starts, a bilingual sample pair that has already been divided into pattern and cabins is read in (801). The step "take cabin, display, group compound words" (802) is then executed: one cabin of the sample pair is taken out at a time, and window 1 is opened, showing the A and B sample sentences in its upper part and the current A and B cabin contents in its lower part. At the same time, the current A cabin is segmented into word strings, which are entered in order into the language A field of a reference table; each word string is then looked up in turn in the A-string field of the meaning-group string library, and when it is found, the content of the B-string field of the same record is retrieved. If that B-string content occurs in the current B cabin, it is entered in the language B field of the reference table; otherwise the field is left empty. Once the whole reference table has been built, window 2 is opened to show the reference table, a "group compound word" button, and a prompt that compound words may be grouped. The expert may click records in the reference table, and each clicked record is flagged in its mark field. When the "group compound word" button is clicked and consecutive records in the table have been clicked, the A-field contents of the flagged records are joined with "_" into a compound word and the flagged records are merged into one record, whose A field holds the compound word and whose B field is filled with a word string of equivalent meaning.
Step 803 then tests whether the cabin is a simple cabin. If it is, processing continues with meaning-group alignment (808). If not, the cabin model library is queried to determine whether the current cabin contains a cabin model (804). If it contains none, "write cabin model" (805) is executed: window 3 is opened as an editable window that shows the current bilingual cabin content again, on which basis the expert writes a cabin model, and a "save cabin model" button is also shown. If it does contain a cabin model, that model is taken as the current model and the step "divide into cabin model and cabin eyes" (806) is executed. When the "save cabin model" button is clicked, the editable window has been edited, and the newly written cabin model meets the format requirements, the new A and B cabin models are stored as cabin-model components in the A cabin-model and B cabin-model fields of the cabin model library. At the same time the current cabin content is filled slot by slot into the current cabin model, or into the newly written one, and displayed as a complex cabin divided into cabin model and cabin eyes.
"Take cabin eye" (807) follows: the content of one cabin eye is taken out at a time, and meaning-group alignment (808) is executed. An "alignment confirmed" button is shown below the reference table in window 2. The reference table accepts the expert's alignment edits: extending or supplementing word senses according to the instance; lengthening or shortening a string without changing the original characters or words; attaching accompanying characters; adding sense entries for inflected forms; or selecting a preferred record. When the "alignment confirmed" button is clicked, the word strings of languages A and B in the reference table are aligned as meaning groups, that is, they have become meaning-group strings, and the save operation (808) stores the A and B field contents, record by record, as meaning-group string components in the A-string and B-string fields of the meaning-group string library. If a cabin eye is being processed, step 809 tests whether the cabin eyes are finished; if the current cabin still has unprocessed cabin eyes, step 807 is executed again, until all cabin eyes of the current cabin are done. Step 810 then tests whether the cabins are finished, that is, whether the current sample pair still has unprocessed cabins; if so, step 802 is executed to continue with the next cabin. When all cabins have been processed, step 801 reads in the next bilingual sample pair already divided into pattern and cabins, and the next round begins.
The flow is illustrated below with a concrete sentence pair. When the process starts, a bilingual sample pair that has been divided into pattern and cabins (by the sentence-pattern-level comparison) is read in (801), for example "1 {the fisherman} consents to 2 {return the_feather_suit}, on condition that 3 {fairy dance and play heavenly music for him}." and "在 3 {仙女为他跳舞并演奏天上的乐曲} 的条件下, 1 {渔夫} 答应 2 {归还羽衣}。". Each sentence of this pair has three cabins, 1, 2, and 3. Step 802 is executed as described above: one cabin is taken out, window 1 shows the sample sentences and the current cabin contents, the reference table is built from the word strings of the current A cabin and the meaning-group string library, and window 2 shows the reference table, the "group compound word" button, and the grouping prompt. Suppose cabin 2 is the current cabin: the current A cabin is "return the feather suit" and the current B cabin is "归还羽衣". The reference table is then:
A-language field: return / the / feather / suit
B-language field: 归还 / (empty) / (empty) / (empty)
Mark field: (none) / V / V / V
Three B-language records in the reference table are empty, although the library has "羽毛" for "feather" and "衣服" for "suit"; all three records carry the mark set by the expert's clicks. The "group compound word" button has also been clicked, so the A-field contents of the marked records are joined with "_" into a compound word, the marked records are merged into one record, the compound word is entered in its A field, and its B field is filled with a word string of equivalent meaning. The reference table becomes:
A-language field: return / the_feather_suit
B-language field: 归还 / 羽衣
The current A cabin is now "return the_feather_suit" and the current B cabin is "归还 羽衣". Step 803 tests whether this is a simple cabin; it is, so processing continues with meaning-group alignment and saving of the meaning-group strings (808). The meaning groups are already aligned at this point, so the meaning-group strings are saved: "return" and "归还" are added to the A-string and B-string fields of the meaning-group string library, as in Figure 5a, and "the_feather_suit" and "羽衣" are added likewise, as in Figure 5b. Cabin 3, by contrast, is not a simple cabin, so when it is the current cabin, step 804 queries the cabin model library to determine whether it contains a cabin model.
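The reference-table handling for cabin 2 (filling the B field only when the library's B string occurs in the current B cabin, then merging the marked records into a compound word) can be sketched as follows. This is a minimal sketch under assumptions: the record layout, the function names `build_reference_table` and `group_compound`, and the tiny `LIBRARY` dictionary are hypothetical; only the word pairs themselves come from the text.

```python
def build_reference_table(cabin_a: str, cabin_b: str, library: dict[str, list[str]]) -> list[dict]:
    """One record per A word string; the B field is filled only if a library
    B string for that word actually occurs in the current B cabin."""
    table = []
    for word in cabin_a.split():
        b_field = next((b for b in library.get(word, []) if b in cabin_b), "")
        table.append({"a": word, "b": b_field, "marked": False})
    return table

def group_compound(table: list[dict], b_text: str) -> list[dict]:
    """Merge the marked records, which must be consecutive, into one record whose
    A field joins the words with '_' and whose B field is the expert's string."""
    marked = [i for i, rec in enumerate(table) if rec["marked"]]
    if not marked or marked != list(range(marked[0], marked[-1] + 1)):
        raise ValueError("marked records must be consecutive")
    merged = {"a": "_".join(table[i]["a"] for i in marked), "b": b_text, "marked": False}
    return table[:marked[0]] + [merged] + table[marked[-1] + 1:]

# Hypothetical excerpt of the meaning-group string library (A string -> B strings).
LIBRARY = {"return": ["归还"], "feather": ["羽毛"], "suit": ["衣服"]}

table = build_reference_table("return the feather suit", "归还羽衣", LIBRARY)
for index in (1, 2, 3):          # the expert clicks "the", "feather", "suit"
    table[index]["marked"] = True
grouped = group_compound(table, "羽衣")   # the expert supplies the B string

print([(rec["a"], rec["b"]) for rec in table])
print([(rec["a"], rec["b"]) for rec in grouped])
```

The B fields for "feather" and "suit" stay empty because "羽毛" and "衣服" do not occur in "归还羽衣", which is what prompts the expert to group the three words into one compound.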
No, without cabin model, execute the step of writing cabin module 805, open window three as an editable window, and the current double statement cabin content "fair dance and play heavenly music for him,,""The fairy dances for him and plays the music in the sky. "Re-display, accept the expert to write the cabin model, and also display the storage module command button. If the cabin module is included, the cabin module is used as the current cabin module: the cabin mold and the cabin eye 806 are divided. The cabin mode command button is clicked and the window can be edited, such as "{1} {2} and {3} for him", "{1} is his {2} and {3}", has been edited The new cabin model also meets the format requirements, and the newly edited A and B language modules are stored as cabin modules in the cabin model A and the B cabin model fields, as shown in Figure 4. The sentence cabin content is filled in the current cabin model, or filled into the new cabin model, such as: "1 {fairy} 2 {dance} and 3 {play heavenly music} for him", "1 {fairy} for him 2 { Dancing} and 3 {playing the music of the sky} "As a complex sentence cabin display that has been divided into cabins and cabins. This complex sentence Contains 3 cabin eyes. Continue to take the cabin eye 807, take out the contents of one cabin eye in turn, and then perform the step of performing the group alignment 808. In the second window reference table, the alignment determination command button is displayed, and the reference table accepts the expert extension by example or Adding the meaning of the word, changing the length of the string, adding the tape, attaching the word, changing the word form, adding the meaning of the word, etc., or modifying the meaning of the group alignment. The three cabins A and B have The corresponding original word string does not need to be explained. Among them, the "play" Chinese original dictionary only has "game, competition, sports, gambling, script; play, play, play, play, play", etc. 
without "playing" acceptance. The expert "extends or adds words according to the example" plus "play" and "play". The current cabin eye is the cabin eye 3. The reference table is:
A-language field: play heavenly music
B-language field: 演奏 天上的 乐曲
When the alignment-confirm button is clicked, the word strings of languages A and B in the reference table are taken to be meaning-group strings, and the save operation of step 808 is carried out: record by record, the contents of the A- and B-language fields are stored as meaning-group string components in the A-string and B-string fields of the meaning-group string library. If a cabin eye is being processed, step 809 judges whether the cabin eyes are finished. If the current sentence cabin still has unprocessed cabin eyes, step 807 is executed again, until all cabin eyes of the current sentence cabin are done. Step 810 then judges whether the sentence cabins are finished, that is, whether the current sample sentence pair, already divided into sentence pattern and sentence cabins, still contains an unprocessed sentence cabin. If so, step 802 (take sentence cabin) is executed and processing continues. When all sentence cabins have been processed, step 801 reads in the next bilingual sample sentence pair that has been divided into sentence pattern and sentence cabins, and the next round begins.
At the sentence-cabin level, the alignment operation of "forming compound words" described above joins the original word strings with "_" into a compound word (compound string).
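The two expert-side operations in the walk-through above, merging flagged records into a compound word and fitting a sentence cabin into a cabin mold, can be sketched as follows. This is a minimal illustration rather than the patented implementation: the `Record` type, the function names and the regular-expression matching are all assumptions made for the example, and a mold that repeats a slot number is not handled.

```python
import re
from dataclasses import dataclass


@dataclass
class Record:
    """One row of the reference table (hypothetical layout)."""
    a: str                  # A-language word string
    b: str                  # B-language word string (may be empty)
    flagged: bool = False   # True once the expert has clicked the record


def merge_compound(records, b_text):
    """Join the A fields of all flagged records with "_" into one record."""
    flagged = [r for r in records if r.flagged]
    if not flagged:
        return list(records)
    merged = Record("_".join(r.a for r in flagged), b_text)
    out = []
    for r in records:
        if not r.flagged:
            out.append(r)
        elif merged is not None:
            out.append(merged)   # takes the place of the first flagged record
            merged = None
    return out


_SLOT = re.compile(r"\{(\d+)\}")


def fit_mold(mold, cabin):
    """Return {slot number: cabin-eye text}, or None if the mold does not fit."""
    parts = _SLOT.split(mold)    # literal text at even indexes, slot numbers at odd
    pattern = "".join(
        re.escape(p) if i % 2 == 0 else f"(?P<s{p}>.+?)"
        for i, p in enumerate(parts)
    )
    m = re.fullmatch(pattern, cabin)
    if m is None:
        return None
    return {int(name[1:]): text for name, text in m.groupdict().items()}


def annotate(mold, slots):
    """Show the cabin with its eyes marked, e.g. 1{fairy} 2{dance} ..."""
    return _SLOT.sub(lambda m: f"{m.group(1)}{{{slots[int(m.group(1))]}}}", mold)


table = [
    Record("return", "归还"),
    Record("the", "", flagged=True),
    Record("feather", "羽毛", flagged=True),
    Record("suit", "衣服", flagged=True),
]
merged_table = merge_compound(table, "羽衣")

eyes = fit_mold("{1} {2} and {3} for him",
                "fairy dance and play heavenly music for him")
marked = annotate("{1} {2} and {3} for him", eyes)
```

Here `merged_table` holds the two records "return" / "归还" and "the_feather_suit" / "羽衣", and `marked` is the annotated cabin with three cabin eyes. The non-greedy slots simply take the shortest text that lets the rest of the mold match, which is one possible matching rule and not something the description specifies.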
This covers several cases, which the expert is required to master:
① When a direct word-for-word rendering cannot express the meaning, form a compound string by meaning group. The literal rendering of "you were gone" would be "你 (是) 走"; the "了" comes from the past tense, so the compound string "you_were_gone" / "你走了" is formed. Further examples:
call_on_me 来访我
knew_nothing_about_it 一无所知
compelled_to_go 非去不可
show_himself_in_his_true_colours 现出原形, 现出本色
② Align to the more concise side: the side with the more complex expression forms a compound string. English has "per-mu grain yield" (每亩谷物产量) where Chinese has "亩产". The English is formed into the compound string "per-mu_grain_yield" to match the Chinese. Further examples:
late_at_night 深夜
down_to_the_countryside 下乡
fight_it_out_to_the_end 斗争到底
③ Expressions that denote one thing in a transferred sense, or habitual collocations, form a compound string:
fell_asleep 入睡
knows_nothing 一无所知
come_back_from_the_front 下火线
the_sweat_was_pouring_down 汗如雨下
pictures_it_have_just_taken 近照
put_my_finger_on 指出
④ Words whose meanings overlap or repeat form a compound string:
very_well 不错
doubts_of_questions 疑问
bear_fruit 结果
the_far_distance 远处
After these compound-word operations, some of the word strings in a sentence cabin have often been combined into compound strings, which turns some seemingly complex sentence cabins into simple ones. Forming compound words is also one of the means of meaning-group alignment and should be applied flexibly.
For the sentence-cabin-level "meaning-group alignment" operation described above, the principles and measures the expert must master are as follows:
① Extend or supplement word senses according to the example.
In an example whose B-language side reads "… 2{应用的} 就不可能 1{完美}。", the word "useful" has the senses "有用的, 有帮手的, 有益的", none of which fits, so the sense "应用的" is added.
1{She} was 2{strong}, for all 1{she} was so 3{small}. 1{她} 虽然 3{瘦小}, 但很 2{结实}。 Here "small" has the sense "小的", and the sense "瘦小" is added.
I am very ill. 我患重病。 The verb "am" is given the added sense "患".
Learn the truth 明白真相: "learn" is given the added sense "明白".
② Lengthen or shorten a string without changing the original characters or words, so that strings splice together easily.
I ask you to teach me every other day. 我请你每隔一天来教我。 Here "teach" has the senses "讲授, 教授"; the string is shortened to "教" for ease of splicing, and "教" is added as a sense.
Attend school 入学: "attend" gains the sense "入", short for "加入", and "school" gains the sense "学", short for "学校".
③ Attach accompanying strings. For example, "good 好" takes attachments to give "好处, 好事, 好心, 好用" and so on (this is distinct from the measure words that accompany Chinese nouns, which are handled separately). Likewise "word 词" becomes "词儿", and so on.
④ Supplement senses for inflected forms: the meaning conveyed by participles, comparatives and the like is added as a corresponding sense, so that morphological analysis and processing can be omitted.
been: add the senses "还是, 怎么样"
punished: add the sense "受处分"
best: add the sense "最好的"
had: add the sense "以前"
done: add the senses "做好了, 完成了"
villages: add the sense "多个农村"
3. A method for reading foreign-language text in the native language, based on sentence components
Fig. 9 is a flowchart of the method for reading foreign-language text in the native language based on sentence components. As shown in Fig. 9, the method starts with preparations such as setting up the interface. The user specifies which of the languages contained in the component library serve as the native language and as the source language. The screen is divided into three windows (upper, middle and lower, or front, middle and rear). The middle window shows the sentence currently being processed and related information; the lower or rear window shows the source-language text still to be read; the upper or front window shows the native-language text already read. In addition, the prompt line shows command buttons such as undo and save-and-exit, together with the → and ← word-order shift buttons; alternatively these are placed on a floating bar that sits just below the middle window or can be moved by the user. The method then runs. Source sentence read-in 901: one sentence of the source language is read in as the current sentence.
Small-idiom check 902: the current sentence is looked up in the source-language idiom field of the idiom library. If it is found, step 903 gives the small idiom: the native-language small idiom in the native idiom field of the same record is taken out and displayed in the middle window, the idiom code of that record is written into the world-text buffer, and step 901 is executed again.
If it is not found, the sentence-pattern matching subroutine 904 is called. The current sentence is looked up in the source-language pattern field of the sentence pattern library. If no pattern matches, the representative number of the sentence pattern library is stored in the feedback buffer. If several patterns match, the corresponding native-language patterns are displayed in the lower part of the middle window and the user selects one. Once a single pattern is determined, either by selection or because only one matched, the pattern code, native-language pattern and source-language pattern of that record are given. The native-language pattern is displayed with emphasis in the upper part of the middle window, the source sentence is fitted slot by slot into the source-language pattern and displayed as an annotation beneath the native-language text in that window, and the pattern code is written into the world-text buffer.
Step 905 (take sentence cabin) follows. Working from left to right, the current sentence cabin is marked in the native-language pattern in the middle window, and the current sentence cabin label, namely (original sentence cabin label within the pattern + FFE0H), is stored in the world-text buffer. The corresponding source-language cabin content is marked and taken out as the current sentence cabin content, and step 906 judges whether it is a simple sentence cabin. If it is, word-sense determination 909 is executed. If it is not, step 907 (look up and match cabin mold) is executed: the current sentence cabin content is looked up in the source-language mold field of the cabin mold library. If no mold matches, the representative number of the cabin mold library is stored in the feedback buffer. If several molds match, the middle window is extended downward, the corresponding native-language molds are displayed in the extension, and the user selects one. Once a single mold is determined, the mold code, native-language mold and source-language mold of that record are given. The native-language mold is displayed with emphasis in the extension of the middle window, the source cabin content is fitted slot by slot into the source-language mold and displayed as an annotation beneath the native-language mold, and the mold code is written into the world-text buffer.
Step 908 (take cabin eye) follows. Working from left to right on the native-language mold, each cabin eye in turn is marked as the current cabin eye; the current cabin eye label (original cabin eye label on the mold + FFD0H) is stored in the world-text buffer; and the corresponding source-language cabin eye content is marked and taken out. Word-sense determination 909 is then executed: one word string of the source-language simple sentence cabin or cabin eye is read, from left to right, and looked up in the source-language string field of the meaning-group string library. If no identical string is found, the current source string is stored in the feedback buffer. If several identical strings are found, the native-language string field contents of their records are taken out, backed up in the undo-selection cache, and displayed in the lower part of the extended middle window for the user to select one. Once a single string is determined, the step of giving the native-language string (909) follows: the native-language string field of the current record is filled into the current native-language sentence cabin or cabin eye, and its meaning-group code is stored in the world-text buffer. Word-sense determination 909 continues until the current simple sentence cabin or cabin eye has been fully processed.
In step 911, individuality-loss compensation is applied to the current sentence cabin or cabin eye according to the individuality-loss table, and the native-language word order of the current sentence cabin or cabin eye is then corrected according to the native word-order table. The → and ← word-order shift buttons are also polled. When → is clicked, the word string the user clicked in the current sentence cabin or cabin eye is moved to just after the following string; when ← is clicked, it is moved to just before the preceding string. The resulting word order is added to the native word-order table for later use.
Step 912 then judges whether cabin eyes remain: if the current sentence cabin still has unprocessed cabin eyes, step 908 is executed. Step 913 judges whether the sentence is finished: if the current sentence still has unprocessed sentence cabins, step 905 is executed. When all sentence cabins of the current sentence have been processed, the feedback buffer and the command buttons are polled. If the feedback buffer is not empty, its content, together with the source language, the native language, the current source sentence and similar information, is composed into an e-mail and fed back to the support website; the feedback buffer is cleared and a "feedback sentence" flag is stored in the world-text buffer. When the undo command button is clicked, the content saved in the undo-selection cache for the word string the user clicked is retrieved so that the user can reselect the word string and make the related modifications.
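The labels written to the world-text buffer in steps 905 and 908 are plain offsets, which a short sketch makes concrete. Only the two offsets FFE0H and FFD0H come from the description; the `WorldTextBuffer` class, the `(kind, value)` entry layout and the code values used below are assumptions for illustration, since the description does not say how the buffer is physically encoded.

```python
from dataclasses import dataclass, field

SENTENCE_CABIN_OFFSET = 0xFFE0   # "FFE0H" in the description
CABIN_EYE_OFFSET = 0xFFD0        # "FFD0H" in the description


def sentence_cabin_label(n: int) -> int:
    """Label stored for sentence cabin n of the matched sentence pattern."""
    return n + SENTENCE_CABIN_OFFSET


def cabin_eye_label(n: int) -> int:
    """Label stored for cabin eye n of the matched cabin mold."""
    return n + CABIN_EYE_OFFSET


@dataclass
class WorldTextBuffer:
    """Hypothetical in-memory form of the world-text buffer."""
    entries: list = field(default_factory=list)

    def put_code(self, kind: str, code: int) -> None:
        # kind is e.g. "idiom", "pattern", "mold" or "meaning_group"
        self.entries.append((kind, code))

    def put_sentence_cabin(self, n: int) -> None:
        self.entries.append(("sentence_cabin", sentence_cabin_label(n)))

    def put_cabin_eye(self, n: int) -> None:
        self.entries.append(("cabin_eye", cabin_eye_label(n)))


# Order of writes for one sentence: pattern code, then each cabin label
# followed by either its meaning-group codes or a mold code and its eyes.
buf = WorldTextBuffer()
buf.put_code("pattern", 1061)          # illustrative code values
buf.put_sentence_cabin(1)
buf.put_code("meaning_group", 2131)
buf.put_sentence_cabin(4)
buf.put_code("mold", 207)
buf.put_cabin_eye(1)
buf.put_code("meaning_group", 16841)

for kind, value in buf.entries:
    print(f"{kind:15s} {value:#06x}")
```

Because the labels sit at the top of the 16-bit range, a reader of the buffer can tell structural markers from component codes by value alone, provided the component codes stay below FFD0H; that reading is an inference from the two offsets, not a statement in the description.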
When the save-and-exit command is received, the content of the world-text buffer is saved as a world-text file named after the source-language file with the extension .sjw; if the source text has not been finished, the offset into the source text is recorded in the file header. If neither the undo button nor the save-and-exit button has been clicked, step 901 (source sentence read-in) is executed next.
The flow is further illustrated by the following example.
The method starts with preparations such as setting up the interface. Source sentence read-in 901: a source sentence such as "Children not Allowed!" is read in and displayed in the middle window as the current sentence; the native-language display appends the previously processed sentence, and the source-language display drops the current sentence. Small-idiom check 902: the current sentence is looked up in the source-language idiom field of the idiom library. It is found, so step 903 takes out the native-language small idiom "儿童不许入内!" from the native idiom field of the same record, displays it in the middle window, and writes the idiom code of that record into the world-text buffer (for readability, codes are shown here and below as a code name followed by the low-order digits in decimal). Step 901 is then executed again.
Suppose the source sentence read in is "The doctor told his patient that he would prescribe him some patent medicine on condition that he strictly follow his instructions." It is displayed in the middle window as the current sentence; the native-language display appends the previously processed sentence "儿童不许入内!", and the source-language display drops the current sentence. Small-idiom check 902 finds nothing in the idiom library, so the sentence-pattern matching subroutine 904 is called. The lookup in the source-language pattern field of the sentence pattern library finds exactly one matching pattern, and the pattern code, native-language pattern and source-language pattern of that record are given. The native-language pattern "{1} 告诉他的 {2}, 如果能 {4}, 就可以 {3}。" is displayed with emphasis in the upper part of the middle window. The source sentence is fitted slot by slot into the source-language pattern "the {1} told his {2} that {3} on condition that {4}." and displayed as an annotation beneath the native-language text: "the 1{doctor} told his 2{patient} that 3{he would prescribe him some patent medicine} on condition that 4{he strictly follow his instructions}." The pattern code "sentence pattern code 001061" is written into the world-text buffer.
Step 905 (take sentence cabin) follows: working from left to right, the current sentence cabin is marked in the native-language pattern in the middle window and its label is stored in the world-text buffer, while the corresponding source-language cabin content is marked and taken out as the current sentence cabin content. Step 906 judges whether it is a simple sentence cabin; if so, word-sense determination 909 is executed. Here sentence cabins 1 and 2 are marked in turn and the labels "sentence cabin label 01" and "sentence cabin label 02" are stored. The corresponding source contents "doctor" and "patient" are both judged to be simple sentence cabins (a single word string is not expanded here; cabins with three strings are expanded below), and their senses are determined as "医生" and "病人". The step of giving the native-language string (909) fills these into native-language sentence cabins 1 and 2, giving "1{医生}告诉他的 2{病人}, 如果能 {4}, 就可以 {3}。"
Sentence cabin 4 is processed next. Step 905 takes out the source content "he strictly follow his instructions". Step 906 judges that it is not a simple sentence cabin, so step 907 looks up a cabin mold and gives the mold code of the matching record, the native-language mold "[1] + 他的 + [2]" and the source-language mold "he + [1] + his + [2]". Fitted slot by slot into the source-language mold, the cabin content becomes "he 1[strictly follow] his 2[instructions]". Processed in the same way as sentence cabins 1 and 2, the native-language text becomes "医生告诉他的病人, 如果能 4{1[确实地 执行] + 他的 + 2[医嘱]}, 就可以 {3}。"
Sentence cabin 3 follows, with the source content "he would prescribe him some patent medicine". Step 906 judges that it is not a simple sentence cabin, so step 907 looks up the source-language mold field of the cabin mold library with the current cabin content. Exactly one mold matches, "he would prescribe him [1]", and the mold code of that record, the native-language mold "开 [1] 给他" and the source-language mold "he would prescribe him [1]" are given. The native-language mold is displayed with emphasis in the extension of the middle window; the source cabin content, fitted into the source-language mold as "he would prescribe him 1[some patent medicine]", is displayed as an annotation beneath the native-language mold; and the mold code is written into the world-text buffer. Step 908 (take cabin eye) then marks the current cabin eye on the native-language mold, working from left to right, stores the current cabin eye label in the world-text buffer, and marks and takes out the corresponding source content "some patent medicine". Word-sense determination 909 reads one word string of the cabin eye at a time, from left to right, and looks it up in the source-language string field of the meaning-group string library. Where several identical strings are found, the native-language string field contents of their records are taken out, backed up in the undo-selection cache, and displayed in the lower part of the extended middle window for the user to select one. The three strings give the following candidates:
some: 若干, 一些, 相当的, 几个
patent: 专利的, 执照, 特效, 明白的
medicine: 内科的, 内服, 药
The user selects "一些 特效 药". The step of giving the native-language string (909) fills these into the current native-language cabin eye, giving "开 1[一些特效药]给他", and the meaning-group codes "meaning-group code 008264", "meaning-group code 017655" and "meaning-group code 005484" are stored in the world-text buffer. The current cabin eye is now finished. (Loss compensation and native word order 911 are not involved here and are covered in the supplementary notes below.) The native-language sentence is now "医生告诉他的病人, 如果能确实地执行他的医嘱, 就可以 3{开 1[一些 特效 药] 给他}。" Step 912 finds no further cabin eyes, and step 913 finds that the sentence is finished.
At this point all sentence cabins of the current sentence have been processed, and the feedback buffer and command buttons are polled. If the feedback buffer is not empty, its content, together with the source language, the native language, the current source sentence and similar information, is composed into an e-mail and fed back to the support website; the feedback buffer is cleared and a "feedback sentence" flag is stored in the world-text buffer. When the undo command button is clicked, the content saved in the undo-selection cache for the word string the user clicked is retrieved so that the user can reselect the word string and make the related modifications. This is sufficiently clear and complete for a person skilled in the art to implement. The undo operation works as follows: during word-sense determination, whenever one sense is chosen from several, the sentence cabin and cabin eye concerned and all the candidate entries are recorded in an "undo table". On undo, for the native-language sentence the user clicks, the entries of the same batch are retrieved from the table according to the sentence cabin, cabin eye and word string concerned, so that the user can reselect and modify accordingly. The undo table has fields such as undo batch, sentence cabin number, cabin eye number, word string and meaning-group code. Where the cabin number and eye number are the same but the source string differs, the undo batch number differs as well.
Suppose the save-and-exit command is now received. The content of the world-text buffer for the two example sentences above, "idiom code 00064; sentence pattern code 001061; sentence cabin No. 1; meaning-group code 002131; sentence cabin No. 2; meaning-group code 006386; sentence cabin No. 4; cabin mold code 00207; cabin eye No. 1; meaning-group code 016841; meaning-group code 017951; cabin eye No. 2; meaning-group code 019882;", is saved as world text under the file name formed from the source-language file name and the extension .SJW; if the source text has not been finished, the offset into the source text is recorded in the file header. The whole flow then terminates.
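The way the worked example assembles its native-language sentence, with meaning-group strings filling cabin eyes, cabin eyes filling a mold, and molds filling the sentence pattern, can be sketched as a small recursive substitution. The nested data layout and the `render` function are assumptions made for this illustration; the patterns, molds and selected strings are those of the example above, with "{n}" marking sentence cabins and "[n]" marking cabin eyes.

```python
import re

# "{n}" marks a sentence cabin in a pattern, "[n]" a cabin eye in a mold.
_SLOT = re.compile(r"\{(\d+)\}|\[(\d+)\]")


def render(node):
    """Render a node as native-language text.

    A node is either a list of meaning-group strings (a simple sentence
    cabin or a cabin eye) or a (template, children) pair, where children
    maps slot numbers to further nodes.
    """
    if isinstance(node, list):
        return "".join(node)
    template, children = node

    def fill(match):
        number = int(match.group(1) or match.group(2))
        return render(children[number])

    return _SLOT.sub(fill, template)


sentence = (
    "{1}告诉他的{2}, 如果能{4}, 就可以{3}。",
    {
        1: ["医生"],
        2: ["病人"],
        3: ("开[1]给他", {1: ["一些", "特效", "药"]}),
        4: ("[1]他的[2]", {1: ["确实地", "执行"], 2: ["医嘱"]}),
    },
)

native_text = render(sentence)
print(native_text)
```

Slots are filled by number rather than by position, which is what lets the native-language pattern place cabin 4 before cabin 3 without any separate reordering step.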
If the repentance operation and the save exit command button are not clicked, the execution step source statement is read into 901. Supplement (the above is "loss compensation, native language order 911", etc., here): According to the information of the personality loss table, the current sentence or cabin eye loss compensation 911. The reason for the loss of personality loss is that the above-mentioned simple sentence cabin and cabin eye are the upper and lower concept, but the size is the same, except that they are less than or equal to three meaning clusters except the word. Not imaginary words such as Chinese quantifiers, English articles, etc. These lost words are compensated when read in their native language. The information is derived from the personality loss table, and the personality loss table contains fields such as associated strings and compensation strings. The native language word order sometimes needs to be adjusted. The reason is also from the sentence cabin and the cabin eye. They are the same size, and they are less than or equal to three Italian clusters except for the unspoken words. There is no requirement for the order of the three clusters in the machine, so the various applications based on the components of this statement. There may be cases where the native language word order sometimes needs to be adjusted. The adjustment is simple and convenient. First, a native language word list can be used. The native language word list contains the first word string, the read string, and the adjustment string field. The read string is the serial sequence when read in the native language; the first string is the first string of the read string; the adjusted string is the word order that should be adjusted. After the current sentence box or the current cabin eye is determined by all the words, the table is checked and the match is automatically adjusted. Then, let the system interpret the "one button" and "one button" buttons. 
If one of them is clicked, the order is adjusted according to the user's intention. When the "→" button is clicked, the word string clicked by the user in the current sentence cabin or cabin eye is moved after the following word string; when the "←" button is clicked, the word string clicked by the user in the current sentence cabin or cabin eye is moved before the preceding word string; at the same time, the resulting word order is added to the native-language word order table for later use. Intervention is not required: a reader who wants speed and does not mind the word order need not intervene. The functions of loss compensation, native-language word order adjustment and so on are therefore offered to the user as options.
About the support website: the emails mentioned above are sent to the support website. When the support website receives a feedback email from a user, an expert processes it in real time, the new component is added to the corresponding component library, the new component and related information are fed back to the user in real time, and the original "feedback sentence" flag is replaced with the user's participation. This provides (1) user support: users with any opinions or suggestions can communicate and obtain support in this way. In addition, (2) for version upgrades, social testing and social accumulation can be obtained. (3) It guides the joint development of multiple languages: application of the present invention provides a platform for worldwide cross-language communication. It also ends the history in which natural languages evolved and developed within their own independent systems, and begins a process of rapid joint development of multiple languages. For example, where a sense-group string needs to be corrected, eliminated or added, or a new term needs to be promoted, suggestions and recommendations can be made directly to authors through applications of the present invention, and explained and publicized to readers.
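The table-driven word order adjustment and the two move buttons described above can be sketched as follows. This is a minimal Python sketch under stated assumptions: the table is a list of dicts with the keys "first", "read" and "adjusted" (standing for the first word string, read-out string and adjusted string fields), all function names are hypothetical, and the mis-ordered read-out "特效 一些 药" is an invented example.

```python
def adjust_word_order(strings, order_table):
    """Return the adjusted order if the read-out sequence matches a table row."""
    for row in order_table:
        if strings and strings[0] == row["first"] and strings == row["read"]:
            return list(row["adjusted"])
    return list(strings)  # no match: leave the order unchanged

def move_later(strings, i):
    """'→' button: move the clicked word string after the following one."""
    out = list(strings)
    if 0 <= i < len(out) - 1:
        out[i], out[i + 1] = out[i + 1], out[i]
    return out

def move_earlier(strings, i):
    """'←' button: move the clicked word string before the preceding one."""
    out = list(strings)
    if 0 < i < len(out):
        out[i - 1], out[i] = out[i], out[i - 1]
    return out

def remember_order(order_table, before, after):
    """Add a manually corrected order to the table for later automatic use."""
    order_table.append(
        {"first": before[0], "read": list(before), "adjusted": list(after)}
    )

# The user clicks "特效" (index 0) and presses "→"; the result is remembered.
table = []
read_out = ["特效", "一些", "药"]       # hypothetical mis-ordered read-out
fixed = move_later(read_out, 0)
remember_order(table, read_out, fixed)
print(adjust_word_order(read_out, table))
```

Because every manual correction is appended to the table, the same sequence is adjusted automatically the next time it occurs, which matches the "added to the table for later use" behaviour in the text.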
As described above, while the user reads a foreign-language text in the native language, the contents of the world text buffer are saved to generate the world text. Once a foreign-language article has been read by one person, thousands of later readers can read its world text. Reading the world text is faster and easier than reading the foreign language in the native language: it needs no intervention, its semantics are accurate, the readout language is chosen by the user, and the world text can be read out in multiple languages,
[...] sense-group code 002131; sentence cabin 2; sense-group code 006386; sentence cabin 4; cabin model code 00207; cabin eye 1; sense-group code 016841; sense-group code 017951; cabin eye 2; sense-group code 019882; sentence cabin 3; cabin model code 00206; cabin eye 1; sense-group code 008260; sense-group code 017655; sense-group code 005484;": if it is read out in Chinese, the following steps are performed: ① take out the codes one by one, in order; ② use a switch statement to classify the codes and handle each class separately; ③ a sentence cabin label or cabin eye label indicates the current sentence cabin or current cabin eye; ④ a sentence pattern code, idiom code, cabin model code or sense-group code is decomposed into a library and a record number, and the content of the readout-language field of that record in that library is given; for a sense-group code the content is placed, as indicated, into the current sentence cabin or current cabin eye; ⑤ return to ① and continue until the text ends. For example:
Idiom code 00064: take the content of the Chinese idiom field of record 64 of the idiom library, "儿童不许入内!" ("Children are not allowed inside!").
Sentence pattern code 001061: take the Chinese sentence pattern of record 1061:
"{1} 告诉他的 {2}, 如果能 {4}, 就可以 {3}。" (roughly, "{1} tells his {2} that if {4} can be done, then {3} is possible.")
Sentence cabin 1 is indicated. Sense-group code 002131: take the Chinese string "医生" ("doctor") of record 2131 of the sense-group string library and fill it into sentence cabin 1 as indicated, giving:
"1{医生} 告诉他的 {2}, 如果能 {4}, 就可以 {3}。"
Sentence cabin 2 is indicated. Sense-group code 006386: take the Chinese string "病人" ("patient") of record 6386 of the sense-group string library and fill it into sentence cabin 2 as indicated, giving:
"医生告诉他的 2{病人}, 如果能 {4}, 就可以 {3}。"
Sentence cabin 4 is indicated. Cabin model code 00207: take the Chinese cabin model "[1]+他的+[2]" ("[1] + his + [2]") of record 207 and fill it into sentence cabin 4 as indicated, giving:
"医生告诉他的病人, 如果能 4{[1] 他的 [2]}, 就可以 {3}。"
Cabin eye 1 is indicated. Sense-group code 016841: take the Chinese string "确实地" ("faithfully") of record 16841 of the sense-group string library; sense-group code 017951: take the Chinese string "执行" ("carry out") of record 17951; fill them into cabin eye 1 as indicated, giving:
"医生告诉他的病人, 如果能 4{1[确实地执行] 他的 [2]}, 就可以 {3}。"
Cabin eye 2 is indicated. Sense-group code 019882: take the Chinese string "医嘱" ("doctor's orders") of record 19882 of the sense-group string library and fill it into cabin eye 2 as indicated.
Sentence cabin 3 is indicated. Cabin model code 00206: take the Chinese cabin model "开 [1] 给他" ("prescribe [1] for him") of record 206 of the cabin model library and fill it into sentence cabin 3 as indicated, giving:
"医生告诉他的病人, 如果能确实地执行他的医嘱, 就可以 3{开 [1] 给他}。"
Cabin eye 1 is indicated. Sense-group code 008260: take the Chinese string "一些" ("some") of record 8260 of the sense-group string library; sense-group code 017655: take the Chinese string "特效" ("specially effective") of record 17655; sense-group code 005484: take the Chinese string "药" ("medicine") of record 5484; fill them into cabin eye 1 as indicated, giving:
"医生告诉他的病人, 如果能确实地执行他的医嘱, 就可以 3{开 1[一些特效药] 给他}。" ("The doctor tells his patient that if he can faithfully carry out his medical orders, some specially effective medicine can be prescribed for him.")
The method of reading a foreign language in the native language is one of the applications based on sentence components. By reference to its step of encoding the current sentence with the four sentence component libraries and its step of decoding the world text for readout, several application systems based on sentence components can be produced:
A world text generation method based on sentence components, used to convert conventional text into world text, which can then be read out in multiple languages.
A text conversion method based on sentence components, used to convert a source-language text into a target-language text, or into several languages.
A machine translation method based on sentence components, used to translate a source language into a target language, or into several languages.
The software system produced by implementing the present invention can run on existing supercomputers, medium, small and micro computers, notebook computers, handheld computers and the like, whether standalone or connected in a network. It can run on various computer networks, in particular the Internet. It can also run on devices such as a personal digital assistant (PDA).
Products implementing the invention can be applied to work, study, leisure, travel and other occasions that require communication with people who speak other languages; they can be used in homes, institutions, schools, and all trades and professions where foreign languages are involved.
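The world text readout procedure (steps ① to ⑤) and the doctor and patient example in the description above can be sketched in Python as follows. This is a sketch under stated assumptions, not the patented implementation: the world text is assumed to be serialized as the labelled, semicolon-separated string shown in the description, the four libraries are reduced to small dicts holding only the records used in the example, and the field key "zh" and all function names are hypothetical.

```python
# Tiny stand-ins for the four component libraries (record number -> fields).
IDIOMS = {64: {"zh": "儿童不许入内!"}}
PATTERNS = {1061: {"zh": "{1}告诉他的{2}, 如果能{4}, 就可以{3}。"}}
CABIN_MODELS = {207: {"zh": "[1]他的[2]"}, 206: {"zh": "开[1]给他"}}
SENSE_GROUPS = {
    2131: {"zh": "医生"}, 6386: {"zh": "病人"}, 16841: {"zh": "确实地"},
    17951: {"zh": "执行"}, 19882: {"zh": "医嘱"}, 8260: {"zh": "一些"},
    17655: {"zh": "特效"}, 5484: {"zh": "药"},
}
LABELS = {
    "idiom code": "idiom", "sentence pattern code": "pattern",
    "cabin model code": "model", "sense-group code": "sense",
    "sentence cabin": "cabin", "cabin eye": "eye",
}
WORLD_TEXT = (
    "idiom code 00064; sentence pattern code 001061; sentence cabin 1; "
    "sense-group code 002131; sentence cabin 2; sense-group code 006386; "
    "sentence cabin 4; cabin model code 00207; cabin eye 1; "
    "sense-group code 016841; sense-group code 017951; cabin eye 2; "
    "sense-group code 019882; sentence cabin 3; cabin model code 00206; "
    "cabin eye 1; sense-group code 008260; sense-group code 017655; "
    "sense-group code 005484;"
)

def parse(text):
    """Split the labelled world text into (kind, number) pairs."""
    tokens = []
    for item in filter(None, (part.strip() for part in text.split(";"))):
        label, _, number = item.rpartition(" ")
        tokens.append((LABELS[label], int(number)))
    return tokens

def read_out(tokens, lang="zh"):
    """Decode world text tokens into sentences of the chosen readout language."""
    sentences, current, cabin, eye = [], None, None, None

    def flush():
        nonlocal current
        if current is None:
            return
        text = current["pattern"]
        for no, slot in current["cabins"].items():
            filled = slot["model"] or "".join(slot["strings"])
            for eye_no, strings in slot["eyes"].items():
                filled = filled.replace(f"[{eye_no}]", "".join(strings))
            text = text.replace("{%d}" % no, filled)
        sentences.append(text)
        current = None

    for kind, number in tokens:              # step 1: take the codes in order
        if kind == "idiom":                  # step 2: classify each code
            flush()
            sentences.append(IDIOMS[number][lang])
        elif kind == "pattern":
            flush()
            current = {"pattern": PATTERNS[number][lang], "cabins": {}}
        elif kind == "cabin":                # step 3: labels set the current slot
            cabin = current["cabins"].setdefault(
                number, {"model": None, "strings": [], "eyes": {}})
            eye = None
        elif kind == "model":
            cabin["model"] = CABIN_MODELS[number][lang]
        elif kind == "eye":
            eye = cabin["eyes"].setdefault(number, [])
        elif kind == "sense":                # step 4: fill the current cabin or eye
            target = eye if eye is not None else cabin["strings"]
            target.append(SENSE_GROUPS[number][lang])
    flush()                                  # step 5: end of text
    return sentences

for sentence in read_out(parse(WORLD_TEXT)):
    print(sentence)
```

Reading the same world text in another language would only require another field in each record (for example a hypothetical "en" key) and a different `lang` argument; the token stream itself is unchanged.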

Claims

WO 2009/103208 Claims PCT/CN2008/072593
1. A sentence component device, comprising a CPU, input, output, and related index tables stored for responding to queries, characterized in that it further comprises:
a sentence component storage unit (101), containing sentence component libraries of sentence components constituted in the form of electronic data:
a sentence pattern library (300), for storing sentence pattern components, having a sentence pattern code field and multi-language sentence pattern fields and containing at least one record, the sentence patterns of different languages within the same record having the same meaning;
a cabin model library (400), for storing cabin model components, having a cabin model code field and multi-language cabin model fields and containing at least one record, the cabin models of different languages within the same record having the same meaning;
a sense-group string library (500, 502), for storing sense-group string components, having a sense-group code field and multi-language string fields and containing at least one record, the sense-group strings of different languages within the same record having the same meaning;
an idiom library (600), for storing small idiom components, having an idiom code field and multi-language idiom fields and containing at least one record, the small idioms of different languages within the same record having the same meaning;
a Yitong code (意通代码) compilation unit (103), connected to the sentence component storage unit (101), for compiling one Yitong code for each record of the sentence component libraries.
2. The sentence component device according to claim 1, characterized in that the sentence components of the sentence component libraries are parts for assembling sentences of a language, or standard parts for encoding sentences, and comprise the following four kinds:
① sentence pattern components (201, 301), which form the basic structural framework of a sentence, represent the basic semantic category of that class of sentences, determine the positions and number of the sentence cabins contained in that class of sentences, and take charge of the complex grammatical phenomena of that class of sentences;
② cabin model components (202, 401), which form the basic structural framework of a complex sentence cabin, represent the basic semantic category of that class of sentence cabins, determine the positions and number of the cabin eyes contained in that class of sentence cabins, and take charge of the complex grammatical phenomena of that class of sentence cabins;
③ sense-group string components (501, 503), which are components constituted by sense-group strings and are used to fill simple sentence cabins (203 to 204) or cabin eyes (205 to 207);
④ small idiom components (601), constituted by sentences too short to be divided into a sentence pattern and sentence cabins, which are used to form short sentences directly.
3. The sentence component device according to claim 1, characterized in that the multi-language component fields of the sentence component libraries are set up by language, one language corresponding to one language component field; from the four component libraries ([...] sense-group string or idiom), two language fields ([...] model, string or idiom) can be extracted to form a sub-library, a language library, a first-language library or a second-language library, used for language translation and text conversion.
4. The sentence component device according to claim 1, characterized in that the Yitong code compilation unit, only when a new record appears in any one of the above four libraries, takes the representative number of the current library as the high-order part, adds the record number of the current library to synthesize the Yitong code, and fills it into the corresponding code field of the new record of the current library, as the unified, fixed-length, double-word hexadecimal Yitong code of the sentence components; the Yitong code uniquely represents the common meaning of the components of every language within the current record of the current library.
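As an illustration of the code composition recited in claim 4, the following Python sketch packs a library representative number and a record number into one fixed-length hexadecimal code and splits it again. The bit layout is an assumption: the claim only says "double-word, fixed-length, hexadecimal" with the library number in the high-order part, so a 32-bit value with 8 bits for the library and 24 bits for the record is chosen here for illustration, and the library number 3 in the usage line is likewise hypothetical.

```python
RECORD_BITS = 24                 # assumed split: 8-bit library, 24-bit record
LIBRARY_BITS = 32 - RECORD_BITS

def make_code(library_no, record_no):
    """Combine library number (high part) and record number into 8 hex digits."""
    if not 0 <= library_no < (1 << LIBRARY_BITS):
        raise ValueError("library number out of range")
    if not 0 <= record_no < (1 << RECORD_BITS):
        raise ValueError("record number out of range")
    return f"{(library_no << RECORD_BITS) | record_no:08X}"

def split_code(code):
    """Recover (library number, record number) from a code made by make_code."""
    value = int(code, 16)
    return value >> RECORD_BITS, value & ((1 << RECORD_BITS) - 1)

code = make_code(3, 2131)        # hypothetical library 3, record 2131
print(code, split_code(code))
```

Splitting the code back into library and record number is the same decomposition that the readout procedure performs before fetching a field of the chosen language.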
5. A method, based on sentence components, for reading a foreign language in the native language and generating world text, characterized by comprising the following steps:
S4. Interface: the user specifies which of the languages contained in the component libraries are the native language and the source language;
S5. Sentence read-in: one source-language sentence is read in and displayed in the middle window as the current sentence; the previously processed sentence is appended to the native-language text display, and the current sentence is removed from the source-language text display;
S6. Using the four sentence component libraries, the Yitong codes for the current sentence are obtained by table lookup, the content of the native-language field of the same record is given, and the related data are filled into the feedback buffer, the world text buffer and the regret selection cache;
S7. Judgment processing: if all sentence cabins of the current sentence have been processed, the feedback buffer and the command buttons are checked:
when the feedback buffer is not empty, the information in the feedback buffer, together with the source language, the native language, the current source sentence and related information, is made into an email and fed back to the support website, the feedback buffer is cleared, and a "feedback sentence" flag is stored in the world text buffer;
when the regret operation command button is clicked, the corresponding content in the regret selection cache is taken out according to the word string clicked by the user, so that the user can re-select the word string and make the related modifications;
when the save-and-exit command is received, the content of the world text buffer is saved as world text, and if the source text is not finished, the source text offset is recorded in the file header;
when neither the regret operation button nor the save-and-exit button has been clicked, step S5 is executed.
6. The method, based on sentence components, for reading a foreign language in the native language according to claim 5, characterized in that step S6, in which the Yitong codes for the current sentence are obtained by table lookup using the four sentence component libraries and the content of the native-language field of the same record is given, further comprises the following steps:
S601. Idiom check: the source-language idiom field of the idiom library is queried with the current sentence; if nothing is found, step S602 is executed; if a match is found, the native-language small idiom in the native-language idiom field of the same record is taken out and displayed in the middle window, the idiom code of the same record is read into the world text buffer, and step S5 is executed;
S602. The sentence pattern matching subroutine is called: the source-language sentence pattern field of the sentence pattern library is queried with the current sentence; if one matching sentence pattern is found, step S603 is executed; if several matching sentence patterns are found, the corresponding native-language sentence patterns are displayed in the lower part of the middle window and step S603 is executed after the user has made a selection; if no matching sentence pattern is found, the representative number of the sentence pattern library is stored in the feedback buffer;
S603. The sentence pattern code, the native-language sentence pattern and the source-language sentence pattern of the same record are given; the native-language sentence pattern is displayed with emphasis in the upper part of the middle window; the source-language sentence is fitted, slot by slot, into the source-language sentence pattern and displayed as an annotation below the native-language text in that window; and the sentence pattern code is read into the world text buffer;
S604. Take sentence cabin: from left to right, the current sentence cabin in the native-language sentence pattern is marked in the middle window and the current sentence cabin label is stored in the world text buffer; at the same time the content of the corresponding source-language sentence cabin is marked and taken out as the current sentence cabin content; it is judged whether the current sentence cabin content is a simple sentence cabin; if not, step S605 is executed; if so, step S608 is executed;
S605. Cabin model matching: the source-language cabin model field of the cabin model library is queried with the current sentence cabin content; if one matching cabin model is found, step S606 is executed; if several matching cabin models are found, the middle window is extended downward, the corresponding native-language cabin models are displayed in the extension, and step S606 is executed after the user has made a selection; if no matching cabin model is found, the representative number of the cabin model library is stored in the feedback buffer;
S606. The cabin model code, the native-language cabin model and the source-language cabin model of the same record are given; the native-language cabin model is displayed with emphasis in the extension of the middle window; the source-language sentence cabin content is fitted, slot by slot, into the source-language cabin model and displayed as an annotation below the native-language cabin model in that window; and the cabin model code is read into the world text buffer;
S607. Take cabin eye: following the native-language cabin model from left to right, the current cabin eye is marked on the native-language cabin model one at a time and the current cabin eye label is stored in the world text buffer; at the same time the content of the corresponding source-language cabin eye is marked and taken out; step S608 is executed;
S608. Word meaning determination: from left to right, one word string is read from the source-language simple sentence cabin or cabin eye, and the source-language string field of the sense-group string library is queried; if exactly one identical word string is found, step S609 is executed; if several identical word strings are found, the contents of the native-language string fields of their records are taken out, backed up in the regret selection cache and displayed in the lower part of the extended middle window, and step S609 is executed after the user has made a selection; if no identical word string is found, the current source-language word string is stored in the feedback buffer;
S609. The content of the native-language string field of the current record is taken out and filled into the current native-language sentence cabin or current native-language cabin eye, and the sense-group code is taken out and stored in the world text buffer; step S608 is repeated until the current simple sentence cabin or current cabin eye is finished; personality loss compensation is performed on the current sentence cabin or cabin eye according to the information in the personality loss table; the native-language word order of the current sentence cabin or cabin eye is then corrected according to the information in the native-language word order table; finally the forward and backward word-order buttons are checked: when the forward button is clicked, the word string clicked by the user in the current sentence cabin or cabin eye is moved after the following word string; when the backward button is clicked, the word string clicked by the user in the current sentence cabin or cabin eye is moved before the preceding word string; at the same time the resulting word order is added to the native-language word order table for later use; step S610 follows;
S610. Judgment: if the current sentence cabin still has unprocessed cabin eyes, step S607 is executed; otherwise, if the current sentence still has unprocessed sentence cabins, step S604 is executed; if all sentence cabins of the current sentence have been processed, step S7 follows.
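To make the first lookups of claim 6 concrete, the following Python sketch covers only steps S601 to S603: an exact idiom lookup, then sentence pattern matching that also extracts the content of each sentence cabin. It is a sketch under assumptions, not the claimed method: turning a pattern with `{n}` slots into a regular expression is one possible way to "fit the sentence into the pattern", the English source strings are invented stand-ins, and where the method asks the user to choose among several matches this sketch simply takes the first.

```python
import re

def pattern_to_regex(pattern):
    """Compile 'A {1} B {2}' into a regex with one named group per sentence cabin."""
    parts = re.split(r"\{(\d+)\}", pattern)   # literals alternate with slot numbers
    body = "".join(
        f"(?P<c{part}>.+?)" if i % 2 else re.escape(part)
        for i, part in enumerate(parts)
    )
    return re.compile(body)

def encode(sentence, idioms, patterns, feedback):
    """Return the codes found for one source sentence (steps S601 to S603 only)."""
    # S601: an exact hit in the idiom library finishes this sentence.
    for code, record in idioms.items():
        if record["src"] == sentence:
            return {"idiom": code}
    # S602: try every source-language sentence pattern.
    hits = []
    for code, record in patterns.items():
        match = pattern_to_regex(record["src"]).fullmatch(sentence)
        if match:
            hits.append((code, match))
    if not hits:
        # Stands in for storing the library's representative number.
        feedback.append("sentence pattern library")
        return None
    # S603: the method lets the user choose among several hits; take the first here.
    code, match = hits[0]
    cabins = {int(name[1:]): text for name, text in match.groupdict().items()}
    return {"pattern": code, "cabins": cabins}

# Hypothetical source-language (English) fields for records 64 and 1061.
idioms = {64: {"src": "No children allowed!"}}
patterns = {1061: {"src": "{1} tells his {2} that if he can {4}, he will {3}."}}
feedback = []
print(encode("The doctor tells his patient that if he can follow his orders, "
             "he will prescribe some medicine.", idioms, patterns, feedback))
```

Each extracted cabin content would then go through the cabin model and word string lookups of steps S604 to S609, which follow the same find, choose, or feed back structure.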
7. The method, based on sentence components, for reading a foreign language in the native language and generating world text according to claim 5, characterized in that:
the interface for reading a foreign language in the native language may be a dedicated screen, a pop-up active window, or a floating bar that follows the cursor or can be moved by the user; it lets the user specify which of the languages contained in the component libraries are the native language and the source language, or select a fixed version; the screen or window may be divided into upper, middle and lower (or front, middle and rear) parts, the middle part displaying the current sentence and related information during operation, the lower or rear part displaying the source-language text still to be read, and the upper or front part displaying the native-language text already read; in addition, the regret operation and save-and-exit command buttons and the forward and backward word-order buttons are displayed below these parts or in a prompt line;
for the email feedback to the support website, when the support website receives a feedback email from a user, the website processes it in real time, the new component is added to the corresponding component library, the new component and related information are fed back to the user in real time, and the original "feedback sentence" flag is replaced with the user's participation.
8. The method, based on sentence components, for reading a foreign language in the native language and generating world text according to claims 5 to 6, characterized in that the generated world text:
is composed of component codes such as sentence pattern codes, idiom codes, cabin model codes and sense-group codes, together with sentence cabin labels and cabin eye labels;
is read out in any one of the languages contained in the sentence component libraries, as selected by the user;
is read out by a decoding process using the four component libraries, whose steps are:
① take out the codes one by one, in order;
② use a switch statement to classify the codes and handle each class separately;
③ a sentence cabin label or cabin eye label indicates the current sentence cabin or current cabin eye;
④ a sentence pattern code, idiom code, cabin model code or sense-group code is decomposed into a library and a record number, and the content of the readout-language field of that record in that library is given; for a sense-group code the content is placed, as indicated, into the current sentence cabin or current cabin eye;
⑤ return to ① and continue until the text ends.
9. A source-language to target-language text conversion method based on sentence components, characterized by comprising the following steps:
s5. Sentence read-in: one source-language sentence is read in as the current sentence;
s6. Using the four sentence component libraries, the current sentence is matched by table lookup, the content of the target-language component field of the same record is given, and the related data are filled into the feedback buffer and the regret selection cache;
s7. Judgment processing: when all sentence cabins of the current sentence have been processed, the feedback buffer and the command buttons are checked:
when the feedback buffer is not empty, the information in the feedback buffer, together with the source language, the target language, the current sentence and related information, is made into an email and fed back to the support website, and the feedback buffer is cleared;
[...] content, the user's click is responded to and the related modifications are made;
when the save-and-exit command is received, the given target-language components are saved as the target-language text, and if the source-language file is not finished, the source text offset is recorded in the file header;
when neither the regret operation button nor the save-and-exit button has been clicked, the current sentence is confirmed and step s5 is executed.
10. The source-language to target-language text conversion method based on sentence components according to claim 9, characterized in that the conversion step s6 further comprises the following steps:
s601. Idiom check: the source-language idiom field of the idiom library is queried with the current sentence; if nothing is found, step s602 is executed; if a match is found, the small idiom in the target-language idiom field of the same record is given, and step s5 is executed;
s602. The sentence pattern matching subroutine is called: the source-language sentence pattern field of the sentence pattern library is queried with the current sentence; if one matching sentence pattern is found, step s603 is executed; if several matching sentence patterns are found, step s603 is executed after the user has made a selection; if no matching sentence pattern is found, the representative number of the sentence pattern library is stored in the feedback buffer;
s603. The target-language sentence pattern of the same record is given, and the current sentence is fitted, slot by slot, into the source-language sentence pattern;
s604. Take sentence cabin: from left to right, the current sentence cabin in the target-language sentence pattern is marked; at the same time the content of the corresponding source-language sentence cabin is taken out as the current sentence cabin content; it is judged whether the current sentence cabin content is a simple sentence cabin; if not, step s605 is executed; if so, step s608 is executed;
s605. Cabin model matching: the source-language cabin model field of the cabin model library is queried with the current sentence cabin content; if one matching cabin model is found, step s606 is executed; if several matching cabin models are found, step s606 is executed after the user has made a selection; if no matching cabin model is found, the representative number of the cabin model library is stored in the feedback buffer;
s606. The target-language cabin model and the source-language cabin model of the same record are given, and the current sentence cabin content is fitted, slot by slot, into the source-language cabin model;
s607. Take cabin eye: from left to right, the current cabin eye is marked on the target-language cabin model one at a time, and the content of the corresponding source-language cabin eye is taken out; step s608 is executed;
s608. Word meaning determination: from left to right, one word string is read from the source-language simple sentence cabin or cabin eye, and the source-language string field of the sense-group string library is queried; if exactly one identical word string is found, step s609 is executed; if several identical word strings are found, the contents of the target-language string fields of their records are taken out and backed up in the regret selection cache, and step s609 is executed after the user has made a selection; if no identical word string is found, the current source-language string is stored in the feedback buffer;
s609. The content of the source-language string field of the current record is taken out and filled into the current target-language sentence cabin or current target-language cabin eye; step s608 is repeated until the current simple sentence cabin or current cabin eye is finished; personality loss compensation is performed on the current sentence cabin or cabin eye according to the information in the personality loss table; the target-language word order of the current sentence cabin or cabin eye is then corrected according to the information in the target-language word order table; finally the forward and backward word-order buttons are checked: when the forward button is clicked, the word string clicked by the user in the current sentence cabin or cabin eye is moved after the following word string; when the backward button is clicked, the word string clicked by the user in the current sentence cabin or cabin eye is moved before the preceding word string; at the same time the resulting word order is added to the native-language word order table for later use; step s610 follows;
s610. Judgment: if the current sentence cabin still has unprocessed cabin eyes, step s607 is executed; otherwise, if the current sentence still has unprocessed sentence cabins, step s604 is executed; if all sentence cabins of the current sentence have been processed, step s7 follows.
PCT/CN2008/072593 2008-02-18 2008-09-28 Sentence component device and reading foreign languages and producing universal language and text conversion method WO2009103208A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN200880128636.7A CN102007490B (en) 2008-02-18 2008-09-28 Sentence component making method, and method for reading a foreign language in the mother tongue and generating world text

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN200810081482.2 2008-02-18
CN200810081482 2008-02-18
CN2008100862296A CN101246474B (en) 2008-02-18 2008-03-13 Method for reading foreign language by mother tongue based on sentence component
CN200810086229.6 2008-03-13

Publications (1)

Publication Number Publication Date
WO2009103208A1 (en) 2009-08-27

Family

ID=39946934

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2008/072593 WO2009103208A1 (en) 2008-02-18 2008-09-28 Sentence component device and reading foreign languages and producing universal language and text conversion method

Country Status (2)

Country Link
CN (2) CN101246474B (en)
WO (1) WO2009103208A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112783923A (en) * 2020-11-25 2021-05-11 辽宁振兴银行股份有限公司 Implementation method for efficiently acquiring database based on Spark and Impala

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101510194B (en) * 2009-03-15 2015-09-09 刘树根 A kind of multilingual professional translation method based on sentence component
CN102236645B (en) * 2010-05-06 2016-03-30 上海五和际软件信息有限公司 Based on the natural language man-machine conversation device of semantic logic
CN102043849B (en) * 2010-12-20 2015-03-25 惠州市表意软件有限公司 Realization method for electronic dictionary system with ideographic components as elements
CN103106195B (en) * 2013-01-21 2018-12-11 刘树根 Component identification of expressing the meaning extracts and the machine translation people school based on component of expressing the meaning interacts interpretation method
CN103218353B (en) * 2013-03-05 2018-12-11 刘树根 Mother tongue personage learns the artificial intelligence implementation method with other Languages text
CN105989060A (en) * 2015-02-09 2016-10-05 阿里巴巴集团控股有限公司 Data management method and device
CN106383819A (en) * 2016-01-11 2017-02-08 陈勇 Speech convertor
TWI688969B (en) * 2018-10-24 2020-03-21 大仁科技大學 Dialogue system for medical product recommendation

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1910574A (en) * 2004-01-06 2007-02-07 李仁燮 The auto translator and the method thereof and the recording medium to program it
CN1955953A (en) * 2005-10-27 2007-05-02 株式会社东芝 Apparatus and method for optimum translation based on semantic relation between words

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1050629A (en) * 1990-08-05 1991-04-10 王麟祥 World language sign indicating number and encoding law thereof
CN1617133A (en) * 2003-11-14 2005-05-18 高庆狮 Forming method for sentence meaning expression machine translation and electronic dictionary
CN100555270C (en) * 2004-01-13 2009-10-28 中国科学院计算技术研究所 A kind of machine automatic testing method and system thereof

Also Published As

Publication number Publication date
CN102007490A (en) 2011-04-06
CN102007490B (en) 2016-09-21
CN101246474B (en) 2012-01-11
CN101246474A (en) 2008-08-20

Legal Events

Date Code Title Description
WWE WIPO information: entry into national phase. Ref document number: 200880128636.7. Country of ref document: CN.
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 08872576. Country of ref document: EP. Kind code of ref document: A1.
NENP Non-entry into the national phase. Ref country code: DE.
122 EP: PCT application non-entry in European phase. Ref document number: 08872576. Country of ref document: EP. Kind code of ref document: A1.