WO2011021479A1 - Information processing apparatus, display control method, and program
- Publication number
- WO2011021479A1 (PCT/JP2010/062600)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- template
- phrase
- data
- sentence
- unit
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/58—Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
Definitions
- JP-A-1-207873 discloses a translation support device that includes a word cutout processing unit, a changed-word input processing unit, a translated-word determination processing unit, and a translated-sentence rewriting processing unit, whose functions are described below.
- The word cutout processing unit cuts out a word in the original sentence, based on the position specified in the original sentence of the parallel translation, and cuts out the corresponding word in the translated sentence.
- The changed-word input processing unit accepts a new replacement word in the language of the original sentence.
- The translated-word determination processing unit determines the translated word corresponding to the input word using a parallel translation dictionary.
- The translated-sentence rewriting processing unit places the determined translated word at the word position in the translated sentence cut out by the word cutout processing unit.
- The electronic dictionary of Patent Document 1 is not a translation device that translates sentences (original texts). For this reason, given a parallel translation consisting of a sentence in a first language (the original sentence) and a translated sentence in a second language, the electronic dictionary of Patent Document 1 cannot identify the part of the translated sentence that corresponds to a word or phrase included in the original sentence.
- The present invention has been made in view of the above problems, and its purpose is to provide an information processing apparatus, a display control method, and a program with which a user can confirm the range of a translation that corresponds to the range of the original sentence selected by the user.
- FIG. 6 is a diagram showing co-occurrence information stored in a temporary first co-occurrence buffer 81.
- FIG. 10 is a diagram showing a configuration of other co-occurrence information stored in a temporary first co-occurrence buffer 81. It is the figure which showed the structure of the slot information memorize
- An embodiment of a translation apparatus according to the present invention is described below with reference to FIGS. 1 to 89.
- the translation apparatus 1 includes an input unit 10, an output unit 11, a control unit 12, a storage device 13, and a memory 14, as shown in FIG.
- the output unit 11 is a device for displaying data input via the input unit 10 and results of various processes in the control unit 12 based on instructions from the control unit 12.
- The control unit 12 and each unit within the control unit 12 are functional blocks, and processing in these blocks is realized by software executed by a CPU (Central Processing Unit) described later.
- Here, the term “phrase” (word/phrase) covers both a word (generally defined as the minimum unit that constitutes a sentence and has a specific meaning or grammatical function) and a compound word (generally defined as two or more words joined to express a single meaning).
- FIG. 2 is a diagram showing data stored in the storage device 13. As shown in the figure, the storage device 13 stores a template database 60, a dictionary database 61, a Japanese inflection table 62, a category database 63, thesaurus data 64, and a co-occurrence relation database 65.
- FIG. 3 is a diagram showing a configuration of one template data included in the template database 60.
- The template data includes a template ID, a template in Japanese (hereinafter referred to as the Japanese template), a template in English (hereinafter referred to as the English template), and a template in Chinese (hereinafter referred to as the Chinese template).
- the above Japanese template includes a fixed part composed of a predetermined word and a variable part that can be replaced with any one of a plurality of predetermined words.
- In the Japanese template, the word W15 (a postpositional word functioning as an auxiliary to a main word) and the sentence W16 correspond to the fixed part, and {1:&HUMAN-SUBJ} and {2:&VB_EAT+v.ren1} correspond to the variable part.
- the above English template includes a fixed part and a variable part, similar to the Japanese template.
- “What” and “?” correspond to the fixed part, and {-i:be_AUX+pres}, {-i:#DET_MY-NULL}, {1-i:&HUMAN-SUBJ}, and {2:&VB_EAT+ing} correspond to the variable part.
- In other words, the Japanese template and the English template each consist of fixed parts, made up of predetermined words or phrases, and variable parts at corresponding positions, each of which can be replaced with any one of a plurality of predetermined words or phrases.
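The split into fixed and variable parts can be illustrated with a short Python sketch. It assumes that slots are written between braces as `{id:LABEL+form}`. The function name `split_template` and the spacing of the sample English template are illustrative assumptions, not taken from the specification.

```python
import re
from typing import List, Tuple

# Assumed slot syntax: "{" slot id ":" label, optionally "+" form, "}".
SLOT = re.compile(r"\{([^{}:]*):([^{}+]+)(?:\+([^{}]+))?\}")

def split_template(template: str) -> List[Tuple[str, ...]]:
    """Split a template into ("fixed", text) and ("slot", id, label, form) parts."""
    parts: List[Tuple[str, ...]] = []
    pos = 0
    for m in SLOT.finditer(template):
        if m.start() > pos:
            # Text between slots belongs to the fixed part.
            parts.append(("fixed", template[pos:m.start()]))
        parts.append(("slot", m.group(1), m.group(2), m.group(3) or ""))
        pos = m.end()
    if pos < len(template):
        parts.append(("fixed", template[pos:]))
    return parts

# Sample modeled on the English template described above (spacing is assumed).
parts = split_template(
    "What {-i:be_AUX+pres} {-i:#DET_MY-NULL} {1-i:&HUMAN-SUBJ} {2:&VB_EAT+ing}?"
)
for part in parts:
    print(part)
```

Each slot tuple keeps the text after “+” separately, so a later step can use it as conjugation information.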
- In the dictionary ID column, an identifier (ID) for distinguishing the dictionary data from other dictionary data is described.
- The heading column contains a Japanese word/phrase W14 (verb), the English word “drink” corresponding to that word/phrase, and a Chinese word/phrase W17 (verb).
- The part of speech of the heading is described in the part-of-speech column.
- In the utilization (conjugation) column, information on the conjugated forms of the word/phrase is described for each language.
- The phrase W18 indicates that the phrase W14 follows the conjugation pattern indicated by the phrase W19 (see FIG. 77), that is, the five-grade (godan) conjugation in the “ma” column of the kana syllabary. The meaning code will be described later.
- The recording medium is not limited to a CD-ROM, an FD (Flexible Disk), or a hard disk. It may be any medium that carries the program in a fixed manner, such as magnetic tape, cassette tape, an optical disc (MO (Magneto-Optical Disc), MD (Mini Disc), or DVD (Digital Versatile Disc)), an IC (Integrated Circuit) card (including a memory card), an optical card, or semiconductor memory such as mask ROM, EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), or flash ROM.
- the above configuration is merely one aspect of a specific configuration, and may be a configuration in which the mouse is not provided and a keyboard, a monitor, and a hard disk are provided in the translation apparatus 1.
- The translation apparatus 1 can also be configured as a portable information terminal such as an electronic dictionary or a mobile phone.
- In the above, the template data includes a Japanese template, an English template, and a Chinese template, and a sentence example in each language is created using the template in that language.
- However, the present invention is not limited to this configuration.
- Details of the template search in step S2 will be described based on FIG. 13 and FIG.
- The determination unit 22 determines whether the read word/phrase (WX), or a word/phrase (WX′) that is a conjugated form of the word/phrase (WX), is included in the fixed part of the Japanese template in the read template data (S204). When determining whether the word/phrase (WX′) exists, the information in the utilization column of the dictionary data and the Japanese language utilization table are used.
- If it is determined in step S205 that it exists, the control unit 12 advances the process to step S206. On the other hand, if it is determined in step S205 that it does not exist, the control unit 12 advances the process to step S208.
- In step S206, the control unit 12 determines whether a word/phrase that has not yet been read exists in the extracted word/phrase buffer 70. If it is determined in step S206 that one exists, the control unit 12 returns the process to step S203. On the other hand, if it is determined in step S206 that none exists, the selection unit 23 stores the template data in the search result template buffer 71 (S207). In this way, the selection unit 23 selects template data that satisfies a certain condition from a plurality of template data, and stores the selected template data in the search result template buffer 71. After step S207, the control unit 12 advances the process to step S208.
- In step S208, the control unit 12 determines whether there is template data that has not yet been read in the template database 60. If it is determined in step S208 that some exists, the control unit 12 returns the process to step S202. On the other hand, if it is determined in step S208 that none exists, the control unit 12 advances the process to step S13 in FIG.
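The loop over templates and extracted words can be sketched as follows in Python. This is a minimal sketch under assumptions: the dictionary keys `id` and `fixed_text`, the `conjugations` mapping, and the sample data are hypothetical, and substring matching stands in for whatever matching the apparatus actually performs.

```python
from typing import Dict, Iterable, List

def search_templates(
    templates: Iterable[Dict[str, str]],
    extracted_words: List[str],
    conjugations: Dict[str, List[str]],
) -> List[Dict[str, str]]:
    """Keep templates whose fixed text contains every extracted word or one of its conjugated forms."""
    results = []  # plays the role of the search result template buffer 71
    for tpl in templates:  # read template data one by one (S202, S208)
        fixed = tpl["fixed_text"]

        def present(word: str) -> bool:
            # The word (WX) itself or a conjugated form (WX') must appear.
            forms = [word] + conjugations.get(word, [])
            return any(form in fixed for form in forms)

        if all(present(w) for w in extracted_words):  # every word checked (S203 to S206)
            results.append(tpl)  # store the template (S207)
    return results

# Hypothetical data for illustration only.
templates = [
    {"id": "T1", "fixed_text": "he went home"},
    {"id": "T2", "fixed_text": "where is the station"},
]
hits = search_templates(templates, ["go", "home"], {"go": ["goes", "went"]})
print([t["id"] for t in hits])
```

A template is dropped as soon as one extracted word is missing, which mirrors the jump to step S208 on a negative determination.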
- The control unit 12 extracts information related to the slot portion from the processed sentence storage buffer 75 for each of the above-described categories (that is, labels beginning with “&”), and arranges the extracted data as a table in a predetermined format (S403).
- The control unit 12 writes the character string after the “+” in the slot part for each language in the utilization information column of FIG. For example, the control unit 12 writes “v.ren1” from {2:&VB_EAT+v.ren1} in the utilization information column for the Japanese template, and writes “ing” from {2:&VB_EAT+ing} in the utilization information column for the English template.
- In step S407, the control unit 12 determines whether data is written in the co-occurrence unit buffer 73. If it is determined in step S407 that data has been written, the first replacement unit 24 executes the process of the co-occurrence unit (S408). After step S408, the control unit 12 advances the process to step S409. Details of step S408 will be described later.
- In step S409, the output sentence shaping unit 26 performs post-processing on the sentence example created by processing the slot part and the co-occurrence part. Details of the process will be described later. The sentence examples include sentence examples in Japanese, English, and Chinese. That is, the first replacement unit 24 and the output sentence shaping unit 26 create sentence examples corresponding to the templates in each language.
- Steps S604 and S605 will be described below with specific examples.
- the dictionary search unit 40 writes the dictionary data including the phrase W14 (WX) in the temporary dictionary buffer 78 as shown in FIG.
- FIG. 22 is a diagram showing dictionary data stored in the temporary dictionary buffer 78.
- the English data written in the heading column is “drink”, and the Chinese data is the phrase W17. Therefore, in step S605, the slot replacing unit 41 writes “drink” and the word / phrase W17 in the English word / phrase column and the Chinese word / phrase column of the temporary word / phrase buffer 79, respectively, as shown in FIG.
- FIG. 23 shows data stored in temporary word / phrase buffer 79.
- In step S606, the variation search unit 43 determines whether data is written in the utilization information column of the slot information (SX) written in the temporary slot buffer 80. If it is determined in step S606 that data has been written, the slot replacing unit 41 changes the word form of the word/phrase (WX) in the temporary word/phrase buffer 79 using the utilization information, the data in the utilization column of the temporary dictionary buffer 78, and the Japanese utilization form table 62 (S607). Further, in step S607, the slot replacing unit 41 changes the form of the English phrase and the Chinese phrase corresponding to the phrase (WX) in the temporary phrase buffer 79 using the utilization information. After step S607, the process proceeds to step S608.
- FIG. 24 is a diagram showing the data after the word form change stored in the temporary word / phrase buffer 79.
- The slot replacing unit 41 replaces {2:&VB_EAT+v.ren1} of the Japanese template shown in FIG. 17 with the word/phrase W33. Further, the slot replacing unit 41 replaces {2:&VB_EAT+ing} of the English template shown in FIG. Further, the slot replacing unit 41 replaces {2:&VB_EAT} of the Chinese template shown in FIG.
- FIG. 25 is a diagram showing template data in the middle of the replacement process in the slot portion, which is stored in the processed sentence storage buffer 75.
- In step S612, the slot replacement unit 41 erases the flag indicating that all the slot information has been extracted. After step S612, the slot replacement unit 41 again extracts one piece of slot information from the slot portion buffer 72, and writes the extracted slot information in the temporary slot buffer 80 again (S613).
- The slot replacement unit 41 determines whether a flag indicating that replacement processing has been performed is set in the extracted slot information (S614). If it is determined in step S614 that the flag is set, the control unit 12 advances the process to step S616. On the other hand, if it is determined in step S614 that the flag is not set, the non-input location replacement unit 44 replaces the slot portion of the template of each language corresponding to the slot information with a predetermined word (S615). Further, the control unit 12 writes the same word/phrase as the replaced word/phrase in the replacement word/phrase field of the slot information. After step S615, the control unit 12 advances the process to step S616.
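The handling of slots that received no input can be sketched in Python as below. The brace syntax, the function name `fill_slots`, and the default words are assumptions made for illustration. Conjugation of the inserted words is deliberately left out of this sketch.

```python
import re
from typing import Dict

# Assumed slot syntax: "{" number, optional "-i" style suffix, ":" label, optional "+form", "}".
SLOT = re.compile(r"\{(\d+)(?:-[a-z])?:([^{}+]+)(?:\+[^{}]+)?\}")

def fill_slots(template: str, chosen: Dict[str, str], defaults: Dict[str, str]) -> str:
    """Replace each numbered slot with the chosen word, or with a default word if none was chosen."""
    def replace(match: "re.Match[str]") -> str:
        number, label = match.group(1), match.group(2)
        if number in chosen:
            # Corresponds to a slot whose "replaced" flag is already set (S614).
            return chosen[number]
        # Slot without input: fall back to a predetermined word for its label (S615).
        return defaults[label]
    return SLOT.sub(replace, template)

# Hypothetical defaults; the words actually used by the apparatus are not given here.
defaults = {"&HUMAN-SUBJ": "he", "&VB_EAT": "eating"}
sentence = fill_slots("What is {1-i:&HUMAN-SUBJ} {2:&VB_EAT+ing}?", {"2": "drinking"}, defaults)
print(sentence)
```

Keeping the defaults keyed by label rather than by slot number lets one table serve every template that uses the same category.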
- FIG. 29A is a flowchart showing the first half of the process flow of the co-occurrence unit.
- FIG. 29B is a flowchart showing the latter half of the process flow of the co-occurrence unit.
- the co-occurrence replacing unit 42 extracts one piece of the above-described co-occurrence information from the co-occurrence part buffer 73 and writes it to the temporary first co-occurrence buffer 81 (S801).
- The co-occurrence replacing unit 42 first extracts the co-occurrence information related to “be_AUX” from the data in the co-occurrence unit buffer 73 shown in FIG. 19, and writes that co-occurrence information to the temporary first co-occurrence buffer 81 as shown in FIG.
- The control unit 12 sets, for the extracted co-occurrence information, a flag (not shown) indicating that it has been extracted by the co-occurrence replacing unit 42.
- FIG. 30 is a diagram showing the co-occurrence information stored in the temporary first co-occurrence buffer 81. In the following, for convenience of explanation, the information written in the temporary first co-occurrence buffer 81 as described above is represented as co-occurrence information (CX).
- After step S801, the control unit 12 determines whether a priority processing flag is set in the co-occurrence information written in the temporary first co-occurrence buffer 81 (S802). If it is determined in step S802 that the priority processing flag is set, the control unit 12 advances the process to step S803. On the other hand, if it is determined in step S802 that the priority processing flag is not set, the control unit 12 advances the process to step S807. For example, for the co-occurrence information related to “be_AUX”, the priority processing flag is not set, as shown in FIG. 30, so in this case the control unit 12 advances the processing to step S808.
- The control unit 12 determines in step S802 that the priority processing flag is set and, as a result, advances the process to step S803.
- In step S804, the co-occurrence replacing unit 42 reads the co-occurrence relation data shown in FIG. 9B.
- In step S805, the co-occurrence replacing unit 42 writes the phrase in the temporary phrase buffer 79 based on the slot information shown in FIG. 32 and the read co-occurrence relation data.
- The category (a type of label) to which the word/phrase W12 (see FIG. 7) to be replaced according to the slot information belongs is “&HUMAN-PRON_SUBJ (&HUMAN-SUBJ)”.
- The co-occurrence replacing unit 42 writes the character used when a label is specified as a condition.
- In step S809, the co-occurrence replacing unit 42 erases the flag indicating that all the co-occurrence information has been extracted.
- The co-occurrence replacing unit 42 extracts one piece of co-occurrence information from the co-occurrence part buffer 73, and writes the extracted co-occurrence information in the temporary first co-occurrence buffer 81 (S810).
- Step S817 is explained before steps S812 to S816.
- In step S817, the co-occurrence replacing unit 42 determines whether there is co-occurrence information that has not yet been extracted. If it is determined in step S817 that such co-occurrence information exists, the control unit 12 advances the process to step S810. On the other hand, if it is determined in step S817 that no such co-occurrence information exists, the control unit 12 advances the processing to step S409.
- Based on the co-occurrence information (CX) written in the temporary first co-occurrence buffer 81 and the co-occurrence information (CHX) written in the temporary second co-occurrence buffer 82, the co-occurrence replacing unit 42 reads the co-occurrence relation correspondence data from the co-occurrence relation correspondence database (S814).
- The co-occurrence replacing unit 42 writes the phrase to the temporary phrase buffer 79 based on the read co-occurrence relation correspondence data and the slot information (SX) written to the temporary slot buffer 80 (S815). After step S815, the control unit 12 advances the process to step S816.
- In step S816, the co-occurrence replacing unit 42 replaces the co-occurrence part related to the co-occurrence information in the template stored in the processed sentence storage buffer 75 with the phrase stored in the temporary phrase buffer 79. Specifically, the co-occurrence replacing unit 42 replaces {-i:be_AUX+pres} of the English template shown in FIG. 34 with the above “is”. As a result, template data as shown in FIG. 39 is stored in the processed sentence storage buffer 75.
- FIG. 39 is a diagram showing template data stored in the processed sentence storage buffer 75 in a state in which the co-occurrence portion replacement process has been completed.
- In step S409, the output sentence shaping unit 26 changes the phrase W35 (a syllable) of the Japanese template in the template data shown in FIG. 39, which is stored in the processed sentence storage buffer 75, to the phrase W36 (the voiced form of the syllable W35), as shown in FIG.
- FIG. 40 is a diagram showing example sentence data stored in the processed sentence storage buffer 75.
- the translation device 1 stores rules for formatting a sentence in this way. For example, after all variable parts are replaced, the sentence is formatted according to the rules.
- Under this rule, when a verb with the conjugation pattern indicated by the phrase W19 (see FIG. 77) takes the conjugated form indicated by the phrase W26 (that is, ren1), a phrase W35 immediately after the verb is changed to the phrase W36, and a phrase W38 (a syllable) shown in FIG. 77 immediately after the verb is changed to the phrase W39 (the voiced form of the syllable W38) shown in FIG.
- The translation apparatus 1 includes a first replacement unit 24 that, in the second-language template corresponding to the selected template, replaces the variable part corresponding to the variable part replaceable with the matched word/phrase with the second-language word/phrase corresponding to the matched word/phrase.
- When a conventional apparatus whose templates have no variable part and the translation apparatus 1 according to the present embodiment hold the same number of templates, the translation apparatus 1 according to the present embodiment can clearly create more sentence examples. For this reason, the translation apparatus 1 according to the present embodiment can select more accurate sentence examples than the conventional apparatus.
- the translation device 1 is also configured to include at least the display control unit 25 that causes the output unit 11 to display an image based on the template in the second language after the replacement by the first replacement unit 24. Therefore, with this configuration, the user of the translation apparatus 1 can check the sentence example in the second language.
- FIG. 43 is a diagram showing dictionary data stored in the dictionary database 61.
- the dictionary data stores Japanese words and English words corresponding to the words in association with each other. Further, in the dictionary data, information related to the classifier is stored in association with each English phrase.
- FIG. 44 is a flowchart showing a flow for creating an English sentence example using such template data and dictionary data.
- The co-occurrence replacing unit 42 determines, based on the reading of the determined phrase, whether to replace the co-occurrence part “{-i:DET_A}” with “a” or with “an” (S904). For example, in the case of the input example A, the co-occurrence replacing unit 42 determines to replace the co-occurrence part “{-i:DET_A}” with “a”. In the case of the input example B, the co-occurrence replacing unit 42 determines to replace the co-occurrence part “{-i:DET_A}” with “an”.
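The choice between “a” and “an” in step S904 can be sketched as a small Python function. It assumes that the reading is available as a rough phonetic spelling whose first character shows whether the word begins with a vowel sound. The function name and the sample readings are hypothetical.

```python
def choose_indefinite_article(reading: str) -> str:
    """Pick "a" or "an" from a phrase's reading (pronunciation), not from its spelling."""
    vowel_sounds = ("a", "e", "i", "o", "u")  # assumed encoding of vowel sounds
    return "an" if reading[:1].lower() in vowel_sounds else "a"

# Hypothetical readings: "buk" for "book", "awr" for "hour".
print(choose_indefinite_article("buk"), choose_indefinite_article("awr"))
```

Working from the reading rather than the spelling is what allows a word such as “hour” to receive “an” even though it is spelled with a leading consonant.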
- In step S906, the co-occurrence replacing unit 42 replaces the co-occurrence part “{-i#CLASSIFIER}” and the co-occurrence part “{-i:DET_A}” each with the phrase determined as its replacement. However, since “(NULL)” is a symbol meaning that nothing is written, as described above, the co-occurrence replacing unit 42 deletes “{-i#CLASSIFIER}” in that case.
- In the above, the word/phrase to be replaced in each slot portion has been described using a configuration in which it can be determined independently for each slot portion. That is, for example, in FIG. 3, either “{1:&HUMAN-SUBJ}” or “{2:&VB_EAT+v.ren1}” can be determined first.
- Two examples in which the slot portions form a co-occurrence relationship with each other are described below with reference to FIGS. 46 to 55.
- FIG. 46 shows the configuration of template data stored in the template database 60.
- a Japanese template and an English template are shown.
- The Japanese template includes two slot portions, “{1:THIS-THAT}” and “{2:GOODS}”.
- The English template includes two slot portions, “{1-i:THIS-THAT}” and “{2-i:GOODS}”. Both slots of this English template are labeled with “-i”. That is, both slot portions have a co-occurrence relationship with each other.
- FIG. 47 is a diagram showing an example of words/phrases replaced in the slot portions of the Japanese template in the template data shown in FIG. 46.
- Input example A, input example B, input example C, and input example D are each examples of phrases that replace the slot portions “{1:THIS-THAT}” and “{2:GOODS}”.
- This shows the case in which the slot portion “{1-i:THIS-THAT}” is replaced with the phrase W40 (a demonstrative pronoun) and the slot portion “{2-i:GOODS}” is replaced with the phrase W41 (a noun).
- FIG. 48 is a diagram showing dictionary data stored in the dictionary database 61.
- The dictionary data stores Japanese words and the English words corresponding to them in association with each other.
- In addition, information related to the plural form of each word is stored in association with each English word.
- The slot replacing unit 41 determines whether the word/phrase to be replaced in the slot portion “{1-i:THIS-THAT}” of the English template is plural (S1001).
- The slot replacing unit 41 makes this determination by referring to the dictionary database 61.
- The dictionary database 61 stores information indicating whether a phrase is singular or plural.
- In step S1004, the slot replacing unit 41 determines whether the word/phrase to be replaced in the slot portion “{2-i:GOODS}” of the English template has a plural form. If it is determined in step S1004 that it has a plural form, the slot replacing unit 41 determines whether that word/phrase is plural (S1005). On the other hand, if it is determined in step S1004 that it has no plural form, the slot replacing unit 41 refers to the dictionary data and converts the word/phrase used in the slot portion “{1-i:THIS-THAT}” of the English template into its singular form (S1006). After step S1006, the process proceeds to step S1008.
- The slot replacing unit 41 determines that the word/phrase replaced in the slot portion “{2-i:GOODS}” of the English template has a plural form.
- The slot replacing unit 41 converts the word/phrase of the slot portion “{1-i:THIS-THAT}” into its singular form. Specifically, in the case of the input example C, there is no substantial change, but the slot replacing unit 41 converts “this” to “this”. In the case of the input example F, the slot replacing unit 41 converts “these” into “this”.
- In step S1005, of the input example A, the input example B, the input example D, and the input example E, the slot replacing unit 41 determines for the input example B, the input example D, and the input example E that the word/phrase replaced in the slot portion “{2-i:GOODS}” is plural. On the other hand, in the same step, for the input example A, the slot replacing unit 41 determines that the word/phrase replaced in the slot portion “{2-i:GOODS}” is not plural.
- If it is determined in step S1005 that it is plural, the slot replacing unit 41 converts the word/phrase used in the slot portion “{1-i:THIS-THAT}” of the English template into its plural form (S1007). After step S1007, the control unit 12 advances the process to step S1008. On the other hand, if it is determined in step S1005 that it is not plural, the control unit 12 advances the process to step S1008 without converting the word/phrase into its plural form.
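The branch structure of steps S1004 to S1007 can be sketched in Python as follows. Only those steps are covered. The function name `agree_demonstrative`, the dictionary layout with `has_plural_form` and `is_plural`, and the sample nouns are assumptions made for illustration.

```python
from typing import Dict

TO_PLURAL = {"this": "these", "that": "those"}
TO_SINGULAR = {plural: singular for singular, plural in TO_PLURAL.items()}

def agree_demonstrative(demonstrative: str, noun: str, noun_info: Dict[str, Dict[str, bool]]) -> str:
    """Adjust the demonstrative so that its number agrees with the noun."""
    info = noun_info[noun]
    if not info["has_plural_form"]:
        # S1004 "no plural form" -> S1006: force the singular form.
        return TO_SINGULAR.get(demonstrative, demonstrative)
    if info["is_plural"]:
        # S1005 "plural" -> S1007: convert to the plural form.
        return TO_PLURAL.get(demonstrative, demonstrative)
    # S1005 "not plural": leave the demonstrative unchanged.
    return demonstrative

# Hypothetical dictionary entries.
noun_info = {
    "water": {"has_plural_form": False, "is_plural": False},
    "books": {"has_plural_form": True, "is_plural": True},
    "book": {"has_plural_form": True, "is_plural": False},
}
print(agree_demonstrative("these", "water", noun_info))
print(agree_demonstrative("this", "books", noun_info))
```

The first call mirrors the conversion of “these” into “this” described for a noun without a plural form.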
- FIG. 51 shows the configuration of template data stored in the template database 60.
- a Japanese template and an English template are shown.
- The Japanese template includes three slot portions, “{1-i:NOUN}”, “{2:NUM}”, and “{3-i:CLASSIFIER}”.
- The English template includes three slot portions, “{2-j:NUM}”, “{3-j:CLASSIFIER}”, and “{1-j:NOUN}”.
- A symbol “-j” indicating a co-occurrence relationship is added to each of these slot portions. That is, the three slot portions have a co-occurrence relationship with each other.
- FIG. 52 is a diagram showing an example of words/phrases that are replaced in the slot portions of the Japanese template in the template data shown in FIG.
- Input example A and input example B are each examples of phrases that replace the slot portions “{1-i:NOUN}”, “{2:NUM}”, and “{3-i:CLASSIFIER}”.
- W42: noun
- The slot portion “{2:NUM}” is replaced with “2”.
- FIG. 53 is a diagram showing dictionary data stored in the dictionary database 61. As shown in the drawing, the dictionary data stores Japanese words and English words corresponding to the words in association with each other.
- FIG. 54 is a flowchart showing a flow for creating an English sentence example using such template data and dictionary data.
- The slot replacing unit 41 refers to the dictionary data and, according to the word/phrase replaced in the slot portion “{1-i:NOUN}” of the Japanese template, selects candidate words/phrases that can replace the slot portion “{3-i:CLASSIFIER}” (S1101). For example, in the case of the input example A shown in FIG. 52, the slot replacing unit 41 selects the word/phrase W43 and the word/phrase W41 (see FIG. 53) as the candidates. Further, in the case of the input example B shown in the figure, the slot replacing unit 41 selects the word/phrase W44 (noun) as the candidate.
- If it is determined in step S1104 that there is a translation, the slot replacing unit 41 converts the word/phrase of the slot portion “{3-j:CLASSIFIER}” of the English template into its plural form (S1105). After step S1105, the control unit 12 advances the process to step S1107. On the other hand, if it is determined in step S1104 that there is no translation, the slot replacing unit 41 converts the word/phrase of the slot portion “{1-j:NOUN}” of the English template into its plural form (S1106). After step S1106, the control unit 12 advances the process to step S1107.
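The choice made in steps S1104 to S1106 can be sketched in Python. Several points are assumptions: pluralization is applied only when the number is greater than 1, plural forms are made by appending “s”, and a classifier is joined to the noun with “of”. The function names are hypothetical.

```python
from typing import Optional

def pluralize(word: str) -> str:
    """Naive plural form (assumption: a real dictionary lookup would be used instead)."""
    return word + "s"

def render_count(num: int, noun: str, classifier: Optional[str]) -> str:
    """Render NUM CLASSIFIER NOUN, pluralizing the classifier if it has a translation, else the noun."""
    if num > 1:  # assumed precondition for reaching the pluralization branch
        if classifier:
            classifier = pluralize(classifier)  # classifier has a translation (S1105)
        else:
            noun = pluralize(noun)  # no classifier translation (S1106)
    words = [str(num)]
    if classifier:
        words += [classifier, "of"]  # joining with "of" is an assumption
    words.append(noun)
    return " ".join(words)

print(render_count(2, "coffee", "cup"))
print(render_count(2, "book", None))
```

Only one of the two co-occurring words changes form, which is why the classifier and the noun must be decided together rather than independently.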
- the translation apparatus 1 generates an English sentence example in which each slot is replaced with a phrase for each of the input example A and the input example B.
- FIG. 56 is a diagram showing a schematic configuration of translation apparatus 1A.
- translation apparatus 1A includes an input unit 10, an output unit 11, a control unit 12A, a storage device 13A, and a memory 14A.
- the translation device 1A is different from the translation device 1 having the control unit 12 and the storage device 13 in that the translation device 1A has the control unit 12A and the storage device 13A.
- The storage device 13A stores a template database 60A, a dictionary database 61, a Japanese language utilization type table 62, a category database 63, thesaurus data 64, and a co-occurrence relation database 65.
- The memory 14A includes an extracted word buffer 70, a search result template buffer 71, a slot part buffer 72, a co-occurrence part buffer 73, a priority co-occurrence buffer 74, a processed sentence storage buffer 75, a translation result buffer 76, a decompressed data storage buffer (not shown), a process waiting buffer (not shown), and an element buffer (not shown).
- The outline of the processing in the translation apparatus 1A is the same as the flowchart shown in FIG. However, the template search (step S2) processing in the translation apparatus 1A is different from the flowchart shown in FIG.
- In step S2115, the translation apparatus 1A determines whether a label exists in the element. If it is determined that a label exists (YES in step S2115), in step S2117 the translation apparatus 1A determines whether “()” is present in the read element. If it is determined that “()” exists (YES in step S2117), in step S2119 the translation apparatus 1A changes “(” to “{” and “)” to “}” in the element buffer, and adds a slot number after “{”. On the other hand, when it is determined that “()” does not exist (NO in step S2117), the translation apparatus 1A wraps the entire element with “()” in the element buffer and adds a slot number after “{”.
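The conversion of an element into a numbered slot can be sketched in Python as below. The sketch assumes that a label is recognized by the character “&”, that both branches end with a brace-delimited slot, and that the slot number is followed by “:”. The function name `element_to_slot` is hypothetical.

```python
def element_to_slot(element: str, slot_number: int) -> str:
    """Turn an element containing a label into a numbered, brace-delimited slot."""
    if "&" not in element:
        # No label in the element (NO in S2115): leave it as fixed text.
        return element
    prefix = "{" + str(slot_number) + ":"
    if "(" in element and ")" in element:
        # Parenthesized label (YES in S2117): swap the delimiters and add the number (S2119).
        return element.replace("(", prefix, 1).replace(")", "}", 1)
    # Bare label (NO in S2117): wrap the whole element.
    return prefix + element + "}"

print(element_to_slot("(&VB_EXPLAIN+inf) it in", 2))
print(element_to_slot("&HUMAN-SUBJ", 1))
```

Only the first pair of parentheses is rewritten, so any surrounding fixed text such as “it in” stays outside the slot.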
- FIG. 64 is a view showing another template data (hereinafter referred to as “template data (1971-1_2)”) after being stored in the processing waiting buffer.
- the template data (1971-1_2) is data obtained by the translation apparatus 1 by rewriting the character string W47 of the template data (ID1971-1) using the word / phrase W49 (element).
- FIG. 66 is a diagram showing the state of the element buffer after the element is written in the process of step S2113 for the second time.
- translation apparatus 1A writes “(&VB_EXPLAIN+v.kanou)”, “(&VB_PRONOUNCE+v.kanou)”, and “(&VB_INTERPRET+v.kanou)” in the Japanese column of the element buffer. It also writes “(&VB_EXPLAIN+inf) it in”, “(&VB_PRONOUNCE+inf) it in”, and “(&VB_INTERPRET+inf) it in” in the English column of the element buffer. The translation apparatus 1A performs the same processing for the Chinese column of the element buffer as for the Japanese and English columns.
- FIG. 67 is a diagram showing the state of the element buffer after the first processing in step S2117.
- the translation apparatus 1A can generate a plurality of Japanese templates from the template TJ. Furthermore, the translation apparatus 1A can generate a plurality of English templates corresponding to the Japanese template from the template TE.
- FIG. 80 shows a bilingual sentence generated using the template data (parallel translation template) shown in FIG. That is, FIG. 80 is a diagram in which the sentence indicated by the Japanese template in FIG. 79 is associated with the sentence indicated by the English template.
- “translated sentence” means a sentence written in Japanese and a sentence written in English.
- the “translation” is to indicate a translation corresponding to the original sentence.
- translation apparatus 1 displays an original sentence W101 (first sentence) and a translated sentence W201 (second sentence) for the original sentence W101 on output unit 11.
- FIG. 81 (b) is a diagram for explaining a case where the user selects the element W121 of the original text W101.
- “element” refers to a part of a sentence.
- the element is a concept including a phrase.
- An element may be a single phrase.
- the element W121 of the original sentence W101 includes a plurality of continuous words.
- translation apparatus 1 changes the display mode of element W221 of translated sentence W201 corresponding to element W121 of original sentence W101.
- the element W221 of the translated sentence W201 includes a plurality of continuous words.
- the translation apparatus 1 displays the comment sentence W321 of the element W221 on the output unit 11.
- FIG. 81 (c) is a diagram for explaining a case where the user selects the element W131 of the original text W101.
- the element W131 of the original sentence W101 includes a plurality of consecutive words / phrases W131a and W131b.
- translation apparatus 1 changes the display mode of element W231, phrase W232, and phrase W233 of translated sentence W201 corresponding to element W131 of original sentence W101.
- the element W231 of the translated sentence W201 includes a plurality of continuous words.
- the translation apparatus 1 displays commentary sentences W331, W332, and W333 on the output unit 11 for the element W231 and the words and phrases W232 and W233, respectively.
- FIG. 82 (a) is a diagram showing the upper category data CD1 whose category ID is “01001”.
- the upper category data CD1 includes information of a label name, expanded data, upper data, and an explanation sentence.
- the expanded data includes a Japanese template (third template) and an English template (fourth template). That is, in the upper category data, the Japanese template of the expanded data (hereinafter also referred to as the “first expanded template”) and the corresponding English template (hereinafter also referred to as the “second expanded template”) are associated with each other.
- the label name of the upper category data CD1 is “TEMPL_NP-AND2”.
- the label name is used as data (replacement data) for replacing the expanded data.
- the Japanese template of the development data includes two variable parts and one fixed part.
- the English template of the development data includes two variable parts and three fixed parts.
- the variable parts “{1:&NOUN}” and “{2:&NOUN}” indicate that the words included in the category “&NOUN” in the thesaurus data (see FIG. 7) are candidates that can be substituted into the variable part.
- “{1:&NOUN}” and “{2:&NOUN}” indicate that the part is replaced with a noun.
- the upper data indicates the group to which the label name belongs.
- the explanation text describes the expanded data associated with the label name. More specifically, it is a sentence indicating the grammatical meaning that each template has after the variable part of the Japanese template and the variable part of the English template in the expanded data are replaced. For example, information such as “parallel expression of nouns” is included in the upper category data as the explanation text.
- the label name of the upper category data CD2 is “TEMPL_NP-COMPLETE”.
- the Japanese template (third template) of the development data includes two variable parts and a plurality of fixed parts.
- the expanded data English template (fourth template) includes two variable parts and two fixed parts.
- the variable part “{1:&TEMPL_NP}” indicates that a phrase included in a category such as “&TEMPL_NP” in the thesaurus data is a candidate that can be substituted into the variable part.
- the variable part “{2:&NOUN}” indicates that the part is replaced with a noun.
- FIG. 82 (c) is a diagram showing the upper category data CD3 whose category ID is “02001”.
- the upper category data CD3 includes information on the label name, the expanded data, the upper data, and the commentary, similarly to the upper category data CD1 and CD2.
- the label name of the upper category data CD3 is “TEMPL_PLACE-VCL”.
- the Japanese template (third template) of the expanded data includes one variable part and one fixed part.
- the expanded data English template (fourth template) includes three variable portions.
- the variable part “{1:&VIHECLE}” indicates that a phrase included in a category such as “&VIHECLE” in the thesaurus data is a candidate that can be substituted into the variable part.
- “{-i:LOC-PREP}” and “{-i:DEF-DET}” are co-occurrence parts, which are one type of variable part.
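The variable-part notation used in these templates (a slot number, an optional co-occurrence suffix such as "-i", and a category or label after the colon) can be read with a small parser. The sketch below is an illustration only; the `VariablePart` type, the `variable_parts` function, and the exact grammar accepted by the regular expression are assumptions inferred from the examples in the text.

```python
import re
from typing import List, NamedTuple, Optional


class VariablePart(NamedTuple):
    number: Optional[int]   # slot number; None for a pure co-occurrence part
    group: Optional[str]    # co-occurrence suffix such as "i", if present
    category: str           # e.g. "&NOUN" or "LOC-PREP"


# Assumed grammar: "{" [number] ["-" group] ":" category "}"
_PART = re.compile(r"\{(\d+)?(?:-([a-z]))?:([^{}]+)\}")


def variable_parts(template: str) -> List[VariablePart]:
    """Return the variable parts of a template in order of appearance."""
    return [
        VariablePart(int(number) if number else None, group or None, category.strip())
        for number, group, category in _PART.findall(template)
    ]


parts = variable_parts("{-i:LOC-PREP} {-i:DEF-DET} {1:&VIHECLE}")
```

Here `parts` holds two co-occurrence parts that share the group "i" and one numbered slot part; everything outside the braces is treated as fixed text.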
- FIG. 83 is a diagram showing an example of words included in a category such as “& NOUN” in the thesaurus data.
- the category includes, for example, a phrase included in a category such as “& WALLET”, a phrase included in a category such as “& TICKET”, and a phrase included in a category such as “& MONEY”.
- FIG. 84 is a diagram showing co-occurrence relation data.
- FIG. 84A shows the co-occurrence relation data KD1 with the label name “my_DET”.
- the phrase for example, “I”
- the co-occurrence relation data KD1 includes commentary information.
- the commentary sentence explains the grammatical meaning of the characters used in the co-occurrence part. Specifically, content such as “possessive pronoun corresponding to the replacement word” is given as the commentary.
- FIG. 85 is a diagram showing details of the change instruction unit 30 in the translation apparatus 1.
- change instruction unit 30 includes a data generation unit 31, a detection unit 32, and a specification unit 33.
- the specifying unit 33 includes a second replacement unit 331, a second extraction unit 332, a setting unit 333, a first determination unit 334, a third replacement unit 335, a second determination unit 336, and a third determination unit 337.
- based on the input via the input unit 10, the detection unit 32 detects that at least one phrase (for example, two or more consecutive phrases) included in a Japanese sentence (first sentence) has been selected.
- the detection unit 32 detects that the element W121 in FIG. 81B is selected by a pointing device such as the mouse 120.
- the display control unit 25 changes the display mode of the corresponding phrase based on the identification of the corresponding phrase.
- the second extraction unit 332 extracts the variable part phrase as a keyword from the selected phrase (element).
- the setting unit 333 sets the extracted keyword combination and the extracted keyword alone as search candidates.
- the first determination unit 334 determines, for each search candidate, for each first expansion template, whether or not the search candidate satisfies the condition indicated by the first expansion template.
- the third replacement unit 335 replaces the variable part of the first development template with the search candidate keyword based on the first determination unit 334 determining that the above condition is satisfied. Also, the third replacement unit 335 replaces the variable part of the second development template with a word (English) corresponding to the search candidate based on the first determination unit 334 determining that the above condition is satisfied.
- the second determination unit 336 determines whether or not the first expansion template after replacement with the search candidate keyword matches at least a part of the selected word / phrase.
- the second replacement unit 331 replaces, in the data based on the English template in the processing data, the location corresponding to the second expansion template that corresponds to the first expansion template matching at least a part of the selected phrase, with the label name associated with the first expansion template and the second expansion template (that is, the label name of the upper category data).
- the second replacement unit 331 likewise replaces, in the data based on the Japanese template in the processing data, the location corresponding to the first expansion template matching at least a part of the selected phrase, with the label name associated with that template (that is, the label name of the upper category data).
- after the replacement by the third replacement unit 335, and based on the second determination unit 336 determining a match, the second replacement unit 331 replaces, in the data based on the English template in the processing data, the location of the second expansion template corresponding to the first expansion template with the label name (replacement data).
- based on a determination that none of the search candidates satisfies the condition, the third determination unit 337 determines whether the number of keywords used to set the search candidates is more than one.
- the setting unit 333 sets the combinations of the three phrases W401, W402, and W403, and each of the phrases W401, W402, and W403 alone, as search candidates.
- the setting unit 333 sets a total of seven search candidates.
- the combination of the phrase W401, the phrase W402, and the phrase W403 is referred to as “search candidate-1”.
- the combination of the phrase W401 and the phrase W402 is referred to as “search candidate-2”.
- the combination of the phrase W401 and the phrase W403 is referred to as “search candidate-3”.
- the combination of the phrase W402 and the phrase W403 is referred to as “search candidate-4”.
- the phrase W401 alone is referred to as “search candidate-5”.
- the phrase W402 alone is referred to as “search candidate-6”.
- the phrase W403 alone is referred to as “search candidate-7”.
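The seven search candidates listed above are all non-empty combinations of the three keywords, ordered from the largest combination to the single keywords. A minimal sketch of that enumeration follows; the function name `search_candidates` and the use of reference signs as stand-ins for the actual phrases are assumptions made for illustration.

```python
from itertools import combinations


def search_candidates(keywords):
    """Return every non-empty combination, largest combinations first."""
    return [
        combo
        for size in range(len(keywords), 0, -1)
        for combo in combinations(keywords, size)
    ]


candidates = search_candidates(["W401", "W402", "W403"])
```

With this ordering, `candidates[0]` corresponds to search candidate-1 and `candidates[6]` to search candidate-7. The same function yields three candidates for two keywords.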
- the combination of the three words shown as search candidate-1 does not match the type of the first expansion template (that is, the Japanese template of the expansion data) of the upper category data CD1. Further, the combination of the three words does not match the type of the first development template of the upper category data CD2 and CD3. Accordingly, the first determination unit 334 determines that the search candidate-1 does not satisfy the condition indicated by the first development template in each of the upper category data CD1, CD2, and CD3.
- the third replacement unit 335 replaces each variable part of the first development template with each of the words W401 and W402 of the search candidate-2. Also, the third replacement unit 335 replaces each variable part of the second expansion template associated with the first expansion template with the corresponding word (that is, “ticket” and “money”) of the search candidate-2.
- based on the second determination unit 336 determining a match, the second replacement unit 331 performs the following processing. Referring to FIG. 86 (d), the second replacement unit 331 replaces, in the data based on the English template in the processing data, the location corresponding to the second development template (“a {3:ticket} and {4:money}”) that corresponds to the first development template (see FIG. 82 (a)) matching at least part of the element W121 (the location shown in FIG. 86 (c)), with “{&TEMPL_NP-AND2}”, which contains the label name of the upper category data CD1.
- the second replacement unit 331 performs the same processing as the English template for the Japanese template in the processing data.
- the first determination unit 334 determines that the search candidate-2 does not satisfy the condition indicated by the first development template in the upper category data CD2 and CD3.
- search candidate-3, search candidate-4, search candidate-5, search candidate-6, and search candidate-7 are the same as search candidate-1. There is no need for processing. Therefore, the specifying unit 33 proceeds with the next processing based on the processing data shown in FIG.
- the setting unit 333 resets search candidates based on the keyword extraction process performed again by the second extraction unit 332.
- the setting unit 333 sets, as search candidates, the combination of “TEMPL_NP-AND2” and the phrase W403, “TEMPL_NP-AND2” alone, and the phrase W403 alone.
- the combination of “TEMPL_NP-AND2” and the phrase W403 is referred to as “search candidate-11”.
- “TEMPL_NP-AND2” alone is referred to as “search candidate-12”.
- the phrase W403 alone is referred to as “search candidate-13”.
- the first determination unit 334 first determines, for the search candidate-11, whether the search candidate-11 satisfies the condition indicated by the first development template in each of the upper category data CD1, CD2, and CD3.
- the combination of “TEMPL_NP-AND2” and the word / phrase W403 shown as the search candidate-11 does not match the type of the first expansion template of the upper category data CD1 and CD3.
- the combination matches the type of the first expansion template of the upper category data CD2.
- the second determination unit 336 determines whether the first expansion template after replacement with “TEMPL_NP-AND2” and the word / phrase W403 of the search candidate-11 matches at least a part of the processing data after replacement (see FIG. 86 (d)). In this case, the second determination unit 336 determines that they match.
- based on the second determination unit 336 determining a match, the second replacement unit 331 performs the following processing. Referring to FIG. 86 (h), the second replacement unit 331 replaces “{wallet} complete with {&TEMPL_NP-AND2}”, in the data based on the English template in the processing data, with “{&TEMPL_NP-COMPLETE}”, which contains the label name of the upper category data CD2. The second replacement unit 331 performs the same processing as for the English template on the Japanese template in the processing data.
- the specifying unit 33 proceeds with the next processing based on the processing data shown in FIG.
- the second extraction unit 332 extracts “TEMPL_NP-COMPLETE” as a new keyword from the processing data (see FIG. 86 (h)). However, there is only one extracted keyword. For this reason, the specifying unit 33 performs the following processing without performing the above-described processing by the setting unit 333, the above-described processing by the first determination unit 334, and the like.
- the identifying unit 33 identifies a location corresponding to “{&TEMPL_NP-COMPLETE}” in the English template (see FIG. 79) of the display data as the corresponding phrase.
- “{&TEMPL_NP-COMPLETE}” corresponds to “{wallet} complete with {&TEMPL_NP-AND2}”, and “{&TEMPL_NP-AND2}” corresponds to “a {3:ticket} and {4:money}”.
- the identifying unit 33 identifies “{5:wallet} complete with a {3:ticket} and {4:money}” as the corresponding phrase.
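Resolving a label back to the span of the translated sentence amounts to expanding nested labels until only slot parts and fixed words remain. The following sketch illustrates this with the two correspondences given above. The `EXPANSIONS` table and the `expand` function are hypothetical names, and the slot number in "{5:wallet}" is taken from the final corresponding phrase.

```python
import re

# Label name -> English expansion, copied from the example in the text.
EXPANSIONS = {
    "TEMPL_NP-COMPLETE": "{5:wallet} complete with {&TEMPL_NP-AND2}",
    "TEMPL_NP-AND2": "a {3:ticket} and {4:money}",
}

# A label reference is a brace pair whose content starts with "&".
_LABEL = re.compile(r"\{&([^{}]+)\}")


def expand(text: str) -> str:
    """Replace every "{&LABEL}" by its expansion, recursively."""
    def substitute(match):
        label = match.group(1)
        if label not in EXPANSIONS:
            # Unknown labels are left untouched rather than guessed.
            return match.group(0)
        return expand(EXPANSIONS[label])
    return _LABEL.sub(substitute, text)


corresponding_phrase = expand("{&TEMPL_NP-COMPLETE}")
```

Numbered slot parts such as "{3:ticket}" do not start with "&" and are therefore never treated as labels.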
- the display control unit 25 changes the display mode of the corresponding phrase. At this time, it is preferable from the viewpoint of visual effect that the display mode of the element W121 (FIG. 81 (b)) selected by the user is the same as the display mode of the corresponding phrase.
- the display control unit 25 displays on the output unit 11 the explanatory text of the upper category data CD2, whose label name is “TEMPL_NP-COMPLETE”, shown in FIG.
- the translation apparatus 1 can display the parallel translation shown in FIG. 81 (b) on the output unit 11. Therefore, the user can easily determine the phrase of the translation corresponding to the phrase (original phrase) selected by the user in the original sentence. Furthermore, since the translation apparatus 1 displays the explanatory text in association with the corresponding phrase, the user can confirm the explanatory text of the corresponding phrase.
- the user can compose a sentence based on the parallel translation. That is, when the original sentence in the bilingual sentence (example sentence) differs from the original sentence of the content the user wants to write, the user can use the translation apparatus 1 to easily identify the location in the translated sentence that corresponds to the difference (that is, the location to be replaced). The user can therefore create the desired sentence by replacing that location with the translation of the differing part.
- FIG. 87 is a diagram for explaining the operation of the translation apparatus 1 when the element W131 is selected from the original text W101 shown in FIG. 81 (c).
- the second extraction unit 332 identifies the variable parts in the selected element W131 in the processing data (see FIG. 79) and extracts their words as keywords. That is, the second extraction unit 332 extracts the word / phrase W501 and the word / phrase W502 as keywords from the processing data.
- the setting unit 333 sets the combination of the two phrases W501 and W502, and each of the phrases W501 and W502 alone, as search candidates. That is, the setting unit 333 sets a total of three search candidates.
- the combination of the phrase W501 and the phrase W502 is referred to as “search candidate-21”.
- the phrase W501 alone is referred to as “search candidate-22”.
- the phrase W502 alone is referred to as “search candidate-23”.
- the first determination unit 334 first determines, for the search candidate-21, whether the search candidate-21 satisfies the condition indicated by the first development template in each of the upper category data CD1, CD2, and CD3.
- the combination of the two words shown as the search candidate-21 does not match the type of the first expansion template (that is, the Japanese template of the expansion data) of the upper category data CD1. Further, the combination of the two words does not match the type of the first development template of the upper category data CD2 and CD3. Accordingly, the first determination unit 334 determines that the search candidate-21 does not satisfy the condition indicated by the first development template in each of the upper category data CD1, CD2, and CD3.
- the first determination unit 334 performs the same processing for the search candidate-22 as for the search candidate-21.
- the word / phrase W501 alone shown as the search candidate-22 does not match the type of the first expansion template of the upper category data CD1. Further, the phrase W501 alone does not match the type of the first expansion template of the upper category data CD2. Accordingly, the first determination unit 334 determines that the search candidate-22 does not satisfy the condition indicated by the first development template in the upper category data CD1 and CD2.
- the first determination unit 334 determines that the search candidate-22 satisfies the condition indicated by the first development template in the upper category data CD3.
- the third replacement unit 335 replaces the variable part of the first expanded template with the word / phrase W501 of the search candidate-22. Further, the third replacement unit 335 replaces the variable part (slot part) of the second development template associated with the first development template with the word (that is, “train”) corresponding to the search candidate-22.
- the third replacement unit 335 replaces the other variable parts (that is, the co-occurrence parts) of the second development template using the co-occurrence relation data shown in FIGS. 84 (b) and 84 (c). Specifically, because the third replacement unit 335 replaces the slot part “1:&VIHECLE” with “train”, it replaces the co-occurrence part “{-i:LOC-PREP}” with “on” based on the co-occurrence condition of FIG. 86 (c). Further, the third replacement unit 335 replaces the co-occurrence part “-i:DEF-DET” with “the” based on the co-occurrence condition of FIG. 86 (b).
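The slot replacement followed by co-occurrence replacement can be sketched as below. This is an illustration under stated assumptions: the `CO_OCCURRENCE` table is a hypothetical stand-in for the co-occurrence relation data in the figures, with only the entries needed for the "train" example, and the function name `fill` and the way the "-i" suffix is attached to the filled slot are inferred from the replaced template quoted in the text.

```python
import re

# Hypothetical co-occurrence tables: for each co-occurrence label, the word
# placed in the slot part determines the word used in the co-occurrence part.
CO_OCCURRENCE = {
    "LOC-PREP": {"train": "on"},
    "DEF-DET": {"train": "the"},
}


def fill(template: str, category: str, word: str) -> str:
    """Fill the slot part of `category` with `word`, then resolve co-occurrence parts."""
    # Replace the slot part {n:category} by {n-i:word}.
    slot = re.compile(r"\{(\d+):" + re.escape(category) + r"\}")
    filled = slot.sub(lambda m: "{%s-i:%s}" % (m.group(1), word), template)
    # Replace every co-occurrence part {-i:LABEL} by the word the table requires.
    co_part = re.compile(r"\{-i:([^{}]+)\}")
    return co_part.sub(lambda m: "{-i:%s}" % CO_OCCURRENCE[m.group(1)][word], filled)


english = fill("{-i:LOC-PREP} {-i:DEF-DET} {1:&VIHECLE}", "&VIHECLE", "train")
```

Capitalizing the first word of the sentence ("On") is left out of this sketch.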
- the second determination unit 336 determines whether the first development template after replacement with the word / phrase W501 of the search candidate-22 matches at least a part of the selected element W131 (see FIG. 81 (c)). In this case, the second determination unit 336 determines that they match.
- based on the second determination unit 336 determining a match, the second replacement unit 331 performs the following processing. Referring to FIG. 87 (d), the second replacement unit 331 replaces, in the data based on the English template in the processing data, the location corresponding to the second expansion template (“{-i:On} {-i:the} {1-i:train}”) that corresponds to the first expansion template (see FIG. 82 (c)) matching at least a part of the element W131 (the location shown in FIG. 87 (c)), with “{&TEMPL_PLACE-VCL}”, which contains the label name of the upper category data CD3.
- the second replacement unit 331 performs the same processing as the English template for the Japanese template in the processing data.
- the second extraction unit 332 extracts “TEMPL_PLACE-VCL” (word / phrase) and the word / phrase W502 as new keywords from the processing data (see FIG. 87 (d)).
- the setting unit 333 resets search candidates based on the keyword extraction process performed again by the second extraction unit 332.
- the setting unit 333 sets, as search candidates, the combination of “TEMPL_PLACE-VCL” and the phrase W502, “TEMPL_PLACE-VCL” alone, and the phrase W502 alone.
- the combination of “TEMPL_PLACE-VCL” and the phrase W502 is referred to as “search candidate-31”.
- “TEMPL_PLACE-VCL” alone is referred to as “search candidate-32”.
- the phrase W502 alone is referred to as “search candidate-33”.
- the first determination unit 334 first determines, for the search candidate-31, whether the search candidate-31 satisfies the condition indicated by the first development template in each of the upper category data CD1, CD2, and CD3. In this case, the combination of “TEMPL_PLACE-VCL” and the word / phrase W502 shown as the search candidate-31 does not match the type of the first expansion template of the upper category data CD1, CD2, or CD3.
- the first determination unit 334 determines, for the search candidate-32, whether the search candidate-32 satisfies the condition indicated by the first development template in each of the upper category data CD1, CD2, and CD3. Also in this case, “TEMPL_PLACE-VCL” shown as the search candidate-32 does not match the type of the first expansion template of the upper category data CD1, CD2, or CD3.
- the first determination unit 334 determines, for the search candidate-33, whether the search candidate-33 satisfies the condition indicated by the first development template in each of the upper category data CD1, CD2, and CD3. Also in this case, the word / phrase W502 shown as the search candidate-33 does not match the type of the first development template of the upper category data CD1, CD2, or CD3.
- the specifying unit 33 does not perform the replacement process by the third replacement unit 335 as shown in FIG. 87 (c).
- the third determination unit 337 of the specifying unit 33 determines that each of the search candidate-31, the search candidate-32, and the search candidate-33 does not match the type of the first development template of the upper category data CD1, CD2, and CD3. Perform the following process. That is, the third determination unit 337 determines whether or not the number of keywords used to set each search candidate is plural. In the example of FIG. 87, since the number of keywords is two as shown in FIG. 87 (e), the third determination unit 337 determines that the number of keywords is plural.
- the identifying unit 33 identifies the portion of the translation corresponding to each keyword as the corresponding phrase. Specifically, the specifying unit 33 specifies the locations corresponding to “{&TEMPL_PLACE-VCL}” and the word / phrase W502 in the English template (see FIG. 79) of the display data as the corresponding phrases.
- “{&TEMPL_PLACE-VCL}” corresponds to “{-i:LOC-PREP} {-i:DEF-DET} {1:&VIHECLE}”
- the word / phrase W502 is “{2-j:I}”
- the specifying unit 33 identifies “{-i:On} {-i:the} {1-i:train}”, “{2-j:I}”, and “{-j:my}” as the corresponding phrases.
- the display control unit 25 changes the display mode of the corresponding phrase.
- the display control unit 25 performs display control based on the determination of the third determination unit 337. That is, based on the determination that there are a plurality of keywords, the display control unit 25 displays the translated parts corresponding to the keywords in a different display mode for each keyword (see FIG. 81 (c)). Specifically, the display control unit 25 displays “On the train”, which relates to the keyword “TEMPL_PLACE-VCL”, and “I” and “my”, which relate to the keyword W502, in different display modes on the output unit 11. Further, the display control unit 25 performs the display mode changing process for the original sentence (Japanese) in the same manner as for the translated sentence.
- In step S3001, translation apparatus 1 extracts keywords from the selected range.
- In step S3002, translation apparatus 1 generates search candidates using the extracted keywords.
- In step S3003, the translation apparatus 1 selects one search candidate from the generated search candidates.
- In step S3004, the translation apparatus 1 searches the upper category data using the selected search candidate. That is, for each upper category data, the translation apparatus 1 determines whether the selected search candidate satisfies the condition indicated by the first development template included in that upper category data, and extracts the upper category data for which the condition is satisfied.
- In step S3005, translation apparatus 1 selects one upper category data from the retrieved upper category data. For example, when translation apparatus 1 extracts a plurality of upper category data in step S3004, translation apparatus 1 selects one upper category data from the plurality of upper category data.
- In step S3008, translation apparatus 1 determines whether the generated sentence (the first expanded template after replacement) matches at least a part of the selected word or phrase, or of the processing data after replacement (for example, FIG. 86 (d)).
- If translation apparatus 1 determines that they match in step S3008 (YES in step S3008), it performs replacement on the bilingual example sentence in the processing data in step S3011. That is, translation apparatus 1 performs the above-described replacement process by second replacement unit 331.
- In step S3012, translation apparatus 1 updates the selection data to be processed. For example, the translation apparatus 1 updates the selection data to be processed from the selection data shown in FIG. 86 (a) to the selection data shown in FIG. 86 (e).
- If translation apparatus 1 determines that there is no match in step S3008 (NO in step S3008), translation apparatus 1 determines in step S3009 whether there is unselected upper category data.
- If translation apparatus 1 determines in step S3010 that an unselected search candidate exists (YES in step S3010), it selects one search candidate from the unselected search candidates in step S3014. If translation apparatus 1 determines that none exists (NO in step S3010), it advances the process to step S303 in FIG.
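The overall loop (generate candidates, look for a matching upper category, replace the matched keywords with the label name, repeat) can be sketched in simplified form. This is not the flowchart itself: the match against the selected text in step S3008 is omitted, keywords are modelled as (name, category) pairs, and the `UPPER_CATEGORIES` table, including the categories assigned to the labels and to the phrase W502, is a hypothetical reduction of the upper category data CD1 to CD3.

```python
from itertools import combinations

# Hypothetical, simplified upper category data: categories required by the
# variable parts -> (label name, category the label itself belongs to).
UPPER_CATEGORIES = {
    ("&NOUN", "&NOUN"): ("TEMPL_NP-AND2", "&TEMPL_NP"),
    ("&TEMPL_NP", "&NOUN"): ("TEMPL_NP-COMPLETE", "&TEMPL_NP"),
    ("&VIHECLE",): ("TEMPL_PLACE-VCL", "&TEMPL_PLACE"),
}


def reduce_keywords(keywords):
    """Fold keywords into labels until one keyword is left or nothing matches."""
    keywords = list(keywords)
    while len(keywords) > 1:
        # Try candidates from the largest combination down to single keywords.
        for size in range(len(keywords), 0, -1):
            for combo in combinations(keywords, size):
                key = tuple(category for _, category in combo)
                if key in UPPER_CATEGORIES:
                    # Replace the matched keywords by the label name.
                    position = keywords.index(combo[0])
                    keywords = [k for k in keywords if k not in combo]
                    keywords.insert(position, UPPER_CATEGORIES[key])
                    break
            else:
                continue   # no match at this size, try smaller combinations
            break          # a replacement was made, regenerate the candidates
        else:
            return keywords  # no candidate matched any upper category
    return keywords


first = reduce_keywords([("W401", "&NOUN"), ("W402", "&NOUN"), ("W403", "&NOUN")])
second = reduce_keywords([("W501", "&VIHECLE"), ("W502", "&PERSON")])
```

In the first example a single label remains, so the whole selection maps to one corresponding phrase. In the second, two keywords remain, which is the case where each keyword's translated part is shown in its own display mode. The sketch assumes no single-keyword match ever produces a label that matches again on its own, since that would loop indefinitely.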
Abstract
Description
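FIG. 1 is a diagram showing the schematic configuration of the translation apparatus 1 according to an embodiment of the present invention.
As shown in the figure, the storage device 13 stores a template database 60, a dictionary database 61, a Japanese conjugation table 62, a category database 63, thesaurus data 64, and a co-occurrence relation database 65.
FIG. 4 is a diagram showing the structure of one dictionary data item included in the dictionary database 61. As shown in the figure, the dictionary data associates a dictionary ID, a headword, a part of speech, a conjugation, a classifier code, and a meaning code with one another.
As shown in the figure, the memory 14 includes an extracted phrase buffer 70, a search result template buffer 71, a slot part buffer 72, a co-occurrence part buffer 73, a priority co-occurrence buffer 74, a processed sentence storage buffer 75, a translation result buffer 76, a temporary template buffer 77, a temporary dictionary buffer 78, a temporary phrase buffer 79, a temporary slot buffer 80, a temporary first co-occurrence buffer 81, a temporary second co-occurrence buffer 82, and an input sentence buffer 83. The data stored in each of these buffers is described later.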
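First, a sentence in Japanese is input to the translation apparatus 1 via the input unit 10 (S1). The input sentence is temporarily stored in the input sentence buffer 83 of the memory 14. After step S1, the control unit 12 searches the template database 60 for template data that satisfies a predetermined condition (S2).
First, the control unit 12 reads one template data item from the search result template buffer 71 and stores it in the temporary template buffer 77 (S401). For example, when template data such as that shown in FIG. 3 is stored in the search result template buffer 71, the control unit 12 stores the template data in the temporary template buffer 77 with a result number attached, as shown in FIG. 16. The result number is a number for identifying a template data item among a plurality of template data items, and is assigned when the template data is stored in the search result template buffer 71.
First, the processed sentence storage buffer 75 stores the template data shown in FIG. 17. In the temporary phrase buffer 79 (see FIG. 24), the phrase W33 is written in the Japanese column, “drinking” in the English column, and the phrase W17 in the Chinese column.
Accordingly, the co-occurrence replacement unit 42 reads the co-occurrence information for “DET_MY-NULL” and writes it to the temporary second co-occurrence buffer 82, as shown in FIG. 37. FIG. 37 is a diagram showing the co-occurrence information stored in the temporary second co-occurrence buffer. In the following, the co-occurrence information written to the temporary second co-occurrence buffer 82 in this way is denoted co-occurrence information (CHX).
The processing in step S409 of FIG. 15 is now described.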
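FIG. 46 shows the structure of template data stored in the template database 60. For convenience of explanation, a Japanese template and an English template are shown. As shown in the figure, the Japanese template contains two slot parts, “{1:THIS-THAT}” and “{2:GOODS}”. The English template contains two slot parts, “{1-i:THIS-THAT}” and “{2-i:GOODS}”. Both slot parts of this English template carry the suffix “-i”. That is, the two slot parts have a co-occurrence relation with each other.
FIG. 51 shows the structure of template data stored in the template database 60. For convenience of explanation, a Japanese template and an English template are shown. As shown in the figure, the Japanese template contains three slot parts, “{1-i:NOUN}”, “{2:NUM}”, and “{3-i:CLASSIFIER}”. The English template contains three slot parts, “{2-j:NUM}”, “{3-j:CLASSIFIER}”, and “{1-j:NOUN}”. The three slot parts of this English template carry the suffix “-j”, which indicates a co-occurrence relation. That is, the three slot parts have a co-occurrence relation with one another.
In the above, the translation apparatus 1 stores Japanese templates and English templates in the format shown in FIG. 3 in the template database 60 of the storage device 13 even when it is not performing translation processing. The following describes a configuration in which the translation apparatus, when performing translation processing, generates the Japanese template and the English template from templates stored in advance in the template database of the storage device 13. In the following description, identical components are given identical reference numerals. Their names and functions are also the same. Detailed descriptions of them are therefore not repeated.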
<1.概要>
翻訳装置1の具体的機能の概要について、図78から図81に基づいて説明する。図78は、テンプレートデータベース60に含まれる一つのテンプレートデータの構成を示した図である。図78を参照して、説明の便宜上、日本語テンプレート(第1テンプレート)と英語テンプレート(第2テンプレート)とを示しており、中国語テンプレートは示していない。
図82は、記憶装置13に格納された複数の上位カテゴリデータ(関連付けデータ)のうち、3つの上位カテゴリデータを示した図である。
図85は、翻訳装置1における変更指示部30の詳細を示した図である。図85を参照して、変更指示部30は、データ生成部31と、検知部32と、特定部33とを備える。特定部33は、第2置換部331と、第2抽出部332と、設定部333と、第1判断部334と、第3置換部335と、第2判断部336と、第3判断部337とを備える。
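<4. First specific example>
In the following, to simplify the explanation, it is assumed that only three items of upper category data, CD1, CD2, and CD3, are stored in the storage device 13 as the multiple items of upper category data.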
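In the following as well, as in "<4. First specific example>", it is assumed that only three items of upper category data, CD1, CD2, and CD3, are stored in the storage device 13 as the multiple items of upper category data.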
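FIG. 88 is a flowchart showing the processing performed in the translation device 1. Referring to FIG. 88, in step S301 the translation device 1 accepts, as input, the user's selection of a range in the source sentence (that is, the selection of at least one phrase). In step S302, the translation device 1 identifies the range of the translated sentence that corresponds to the range selected in the source sentence. In step S303, the translation device 1 changes, as output, the display mode of the identified range of the translated sentence.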
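The three steps S301 to S303 can be illustrated with a deliberately simplified sketch. It assumes that each sentence was produced by filling numbered slots in a template and that source and target slots with the same number correspond; it ignores the upper category data and the processing data that the device itself uses. The function names, the sample sentences, and the bracket notation that stands in for a changed display mode are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple, Union

Span = Tuple[int, int]    # half-open character range [start, end)
Part = Union[str, int]    # str = fixed text, int = slot number


def fill(parts: Sequence[Part], fillers: Dict[int, str]) -> Tuple[str, Dict[int, Span]]:
    """Fill the slots and record where each slot's phrase landed."""
    pieces: List[str] = []
    spans: Dict[int, Span] = {}
    pos = 0
    for part in parts:
        text = fillers[part] if isinstance(part, int) else part
        if isinstance(part, int):
            spans[part] = (pos, pos + len(text))
        pieces.append(text)
        pos += len(text)
    return "".join(pieces), spans


def corresponding_spans(selection: Span,
                        src_spans: Dict[int, Span],
                        tgt_spans: Dict[int, Span]) -> List[Span]:
    """S302: map a selected source range to the matching target ranges."""
    start, end = selection
    hit = [i for i, (a, b) in src_spans.items() if a < end and start < b]
    return sorted(tgt_spans[i] for i in hit if i in tgt_spans)


def highlight(sentence: str, spans: List[Span]) -> str:
    """S303: mark the ranges; brackets stand in for a changed display mode."""
    out: List[str] = []
    pos = 0
    for a, b in spans:
        out.append(sentence[pos:a])
        out.append("[" + sentence[a:b] + "]")
        pos = b
    out.append(sentence[pos:])
    return "".join(out)


# Hypothetical filled templates (slot 1 = quantity, slot 2 = goods).
source, src_spans = fill([2, "を", 1, "ください"], {1: "二つ", 2: "りんご"})
target, tgt_spans = fill(["Please give me ", 1, " ", 2, "."], {1: "two", 2: "apples"})

# S301: the user selects characters 0..3 of the source ("りんご").
one = highlight(target, corresponding_spans((0, 3), src_spans, tgt_spans))
# Selecting characters 0..6 ("りんごを二つ") touches both slots.
both = highlight(target, corresponding_spans((0, 6), src_spans, tgt_spans))
```

In this sketch a selection that covers more than one slot yields several target ranges, which is one simple way to obtain multiple corresponding phrases from a single selection.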
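(1) Although "<<2. Specific functions of the translation device>>" described the specific functions of the translation device 1, the translation device 1A may be configured to have those specific functions.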
Claims (10)
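- An information processing device (1, 1A) that translates a first sentence in a first language into a second sentence in a second language using a bilingual template, comprising:
a display control unit (25) that causes a display device (11) to display the first sentence and the second sentence;
a detection unit (32) that detects that one or more phrases included in the first sentence have been selected; and
a specifying unit (33) that specifies, based on at least the bilingual template, a plurality of corresponding phrases that are included in the second sentence and correspond to the selected phrases,
wherein the display control unit changes the display mode of the corresponding phrases based on the corresponding phrases having been specified.
- The information processing device according to claim 1, wherein
the bilingual template includes a first template in the first language and a second template in the second language that corresponds to the first template,
the first template and the second template each include, at mutually corresponding positions, a fixed part composed of predetermined phrases and a variable part that can be replaced with any one of a plurality of predetermined phrases,
the information processing device further comprises a storage device (13, 13A) that stores a plurality of items of association data, each associating a third template in the first language with a fourth template in the second language that corresponds to the third template,
each third template includes two or more of the variable parts, or at least one of the variable parts and at least one of the fixed parts, and
the specifying unit specifies the corresponding phrases based on the bilingual template and the association data.
- The information processing device according to claim 2, wherein
each item of the association data further stores replacement data in association with the third template and the fourth template, and
the specifying unit specifies the corresponding phrases based on a third template, among the plurality of third templates, that corresponds to at least part of the selected phrases, the fourth template that corresponds to that third template, and the replacement data associated with that third template and that fourth template.
- The information processing device according to claim 3, further comprising:
a first replacement unit (24) that replaces the variable part of the first template and the variable part of the second template with any one of the plurality of predetermined phrases; and
a generation unit (31) that generates, based on the replacement, processing data for changing the display mode of the corresponding phrases, the processing data being different from display data for causing the display device to display the first sentence and the second sentence,
wherein the specifying unit
further includes a second replacement unit (331) that replaces, within the data based on the second template in the processing data, the portion corresponding to the fourth template that corresponds to the third template corresponding to at least a contiguous part of the selected phrases, with the replacement data associated with that third template and that fourth template, and
specifies, as the corresponding phrases, at least the portion of the second sentence that corresponds to the portion replaced with the replacement data in the processing data, and
the display control unit changes the display mode of the specified portion of the second sentence.
- The information processing device according to claim 4, wherein the specifying unit further comprises:
an extraction unit (332) that extracts, from the selected phrases, the phrases of the variable parts as keywords;
a setting unit (333) that sets combinations of the extracted keywords, and each extracted keyword on its own, as search candidates;
a first determination unit (334) that determines, for each search candidate and for each third template, whether the search candidate satisfies the condition indicated by the third template;
a third replacement unit (335) that, based on a determination that the condition is satisfied, replaces the variable part of the third template with the keywords of the search candidate; and
a second determination unit (336) that determines whether the third template, after replacement with the keywords of the search candidate, matches at least part of the selected phrases,
wherein the second replacement unit, based on the second determination unit having determined that there is a match, replaces, within the data based on the second template in the processing data, the portion of the fourth template that corresponds to the replaced third template with the replacement data.
- The information processing device according to claim 5, wherein
after the second replacement unit has replaced the portion of the fourth template with the replacement data, the extraction unit newly extracts, as keywords, the replacement data and those keywords that are not included in the replaced third template,
the information processing device performs again, based on the newly extracted keywords, the setting by the setting unit, the determination by the first determination unit, and the replacement by the third replacement unit,
the second determination unit determines, based on the repeated replacement by the third replacement unit, whether the third template after that replacement matches at least part of the second template in the processing data after replacement with the replacement data, and
the second replacement unit, based on the second determination unit having determined that there is a match, again replaces, within the data based on the second template in the processing data, the portion of the fourth template that corresponds to the replaced third template with the replacement data.
- The information processing device according to claim 6, wherein the specifying unit
further includes a third determination unit (337) that, based on the first determination unit having determined that the condition is not satisfied for each of the search candidates, determines whether more than one keyword was used to set the search candidates, and
specifies, as the corresponding phrases, at least the portions of the second sentence that correspond to each of the keywords, and
the display control unit, based on a determination that there are multiple keywords, displays the portions of the second sentence corresponding to the keywords in display modes that differ from keyword to keyword.
- The information processing device according to any one of claims 2 to 7, wherein
each item of the association data further stores an explanatory text that explains the content of the third template, and
the display control unit displays the explanatory text in association with the corresponding phrases.
- A display control method in an information processing device (1, 1A) that translates a first sentence in a first language into a second sentence in a second language using a bilingual template, comprising the steps of:
a processor (110) of the information processing device causing a display device (11) to display the first sentence and the second sentence;
the processor detecting that one or more phrases included in the first sentence have been selected (S301);
the processor specifying, based on at least the bilingual template, a plurality of corresponding phrases that are included in the second sentence and correspond to the selected phrases (S302); and
the processor changing the display mode of the corresponding phrases based on having specified the corresponding phrases (S303).
- A program executed in an information processing device (1, 1A) that translates a first sentence in a first language into a second sentence in a second language using a bilingual template, the program causing the information processing device to execute the steps of:
causing a display device (11) to display the first sentence and the second sentence;
detecting that one or more phrases included in the first sentence have been selected (S301);
specifying, based on at least the bilingual template, a plurality of corresponding phrases that are included in the second sentence and correspond to the selected phrases (S302); and
changing the display mode of the corresponding phrases based on the corresponding phrases having been specified (S303).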
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/391,528 US20120150530A1 (en) | 2009-08-21 | 2010-07-27 | Information processing device and display control method |
CN2010800473170A CN102625935A (zh) | 2009-08-21 | 2010-07-27 | 信息处理装置、显示控制方法以及程序 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2009192131A JP2011044023A (ja) | 2009-08-21 | 2009-08-21 | 情報処理装置、表示制御方法、およびプログラム |
JP2009-192131 | 2009-08-21 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2011021479A1 (ja) | 2011-02-24 |
Family
ID=43606934
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2010/062600 WO2011021479A1 (ja) | 2009-08-21 | 2010-07-27 | 情報処理装置、表示制御方法、およびプログラム |
Country Status (4)
Country | Link |
---|---|
US (1) | US20120150530A1 (ja) |
JP (1) | JP2011044023A (ja) |
CN (1) | CN102625935A (ja) |
WO (1) | WO2011021479A1 (ja) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8301437B2 (en) * | 2008-07-24 | 2012-10-30 | Yahoo! Inc. | Tokenization platform |
US9460082B2 (en) * | 2012-05-14 | 2016-10-04 | International Business Machines Corporation | Management of language usage to facilitate effective communication |
JP6096489B2 (ja) * | 2012-11-30 | 2017-03-15 | 株式会社東芝 | 外国語文章作成支援装置、方法、及びプログラム |
RU2632137C2 (ru) * | 2015-06-30 | 2017-10-02 | Общество С Ограниченной Ответственностью "Яндекс" | Способ и сервер транскрипции лексической единицы из первого алфавита во второй алфавит |
CN114091483B (zh) * | 2021-10-27 | 2023-02-28 | 北京百度网讯科技有限公司 | 翻译处理方法、装置、电子设备及存储介质 |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS6154565A (ja) * | 1984-08-24 | 1986-03-18 | Hitachi Ltd | 多品詞訳語の校正方法 |
JPH10116286A (ja) * | 1996-10-09 | 1998-05-06 | Nippon Telegr & Teleph Corp <Ntt> | 自然言語翻訳方法及び装置 |
JP2004220616A (ja) * | 2003-01-14 | 2004-08-05 | Cross Language Inc | 3つ以上の対訳画面を同時に表示し編集可能にする機械翻訳装置 |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5696916A (en) * | 1985-03-27 | 1997-12-09 | Hitachi, Ltd. | Information storage and retrieval system and display method therefor |
JP2815714B2 (ja) * | 1991-01-11 | 1998-10-27 | シャープ株式会社 | 翻訳装置 |
JPH05151260A (ja) * | 1991-11-29 | 1993-06-18 | Hitachi Ltd | 翻訳テンプレート学習方法および翻訳テンプレート学習システム |
JPH09251462A (ja) * | 1996-03-18 | 1997-09-22 | Sharp Corp | 機械翻訳装置 |
US6275789B1 (en) * | 1998-12-18 | 2001-08-14 | Leo Moser | Method and apparatus for performing full bidirectional translation between a source language and a linked alternative language |
US7383320B1 (en) * | 1999-11-05 | 2008-06-03 | Idom Technologies, Incorporated | Method and apparatus for automatically updating website content |
JP2001297083A (ja) * | 2000-04-13 | 2001-10-26 | Hitachi Ltd | 文書作成方法、装置、文書作成プログラムを格納した記録媒体及び文書作成サービス提供システム |
JP3813911B2 (ja) * | 2002-08-22 | 2006-08-23 | 株式会社東芝 | 機械翻訳システム、機械翻訳方法及び機械翻訳プログラム |
CN101034394B (zh) * | 2007-03-30 | 2010-05-26 | 传神联合(北京)信息技术有限公司 | 一种提高翻译效率的系统及方法 |
JP5239307B2 (ja) * | 2007-11-20 | 2013-07-17 | 富士ゼロックス株式会社 | 翻訳装置及び翻訳プログラム |
WO2009107456A1 (ja) * | 2008-02-29 | 2009-09-03 | シャープ株式会社 | 情報処理装置、方法、およびプログラム |
CN101359330B (zh) * | 2008-05-04 | 2015-05-06 | 索意互动(北京)信息技术有限公司 | 内容扩展的方法和系统 |
- 2009
  - 2009-08-21 JP JP2009192131A patent/JP2011044023A/ja active Pending
- 2010
  - 2010-07-27 WO PCT/JP2010/062600 patent/WO2011021479A1/ja active Application Filing
  - 2010-07-27 US US13/391,528 patent/US20120150530A1/en not_active Abandoned
  - 2010-07-27 CN CN2010800473170A patent/CN102625935A/zh active Pending
Also Published As
Publication number | Publication date |
---|---|
JP2011044023A (ja) | 2011-03-03 |
US20120150530A1 (en) | 2012-06-14 |
CN102625935A (zh) | 2012-08-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5319655B2 (ja) | 情報処理装置、情報処理方法、プログラム、およびプログラムを記録したコンピュータ読取り可能な記録媒体 | |
Sin-Wai | Routledge encyclopedia of translation technology | |
JP4050755B2 (ja) | コミュニケーション支援装置、コミュニケーション支援方法およびコミュニケーション支援プログラム | |
US8055495B2 (en) | Apparatus and method for translating input speech sentences in accordance with information obtained from a pointing device | |
Osborn et al. | African languages in a digital age: Challenges and opportunities for indigenous language computing | |
WO2011021479A1 (ja) | 情報処理装置、表示制御方法、およびプログラム | |
van Esch et al. | Future directions in technological support for language documentation | |
JP2009140466A (ja) | 使用者製作問答データに基づいた会話辞書サービスの提供方法及びシステム | |
Lyons | A review of Thai–English machine translation | |
Hannay | 3.1 Types of bilingual dictionaries | |
JP6144458B2 (ja) | 手話翻訳装置及び手話翻訳プログラム | |
US20090024382A1 (en) | Language information system | |
Verma et al. | Toward machine translation linguistic issues of Indian Sign Language | |
JP5688884B2 (ja) | 情報処理装置、訳文接続方法、およびプログラム | |
Efthimiou et al. | Sign search and sign synthesis made easy to end user: the paradigm of building a SL oriented interface for accessing and managing educational content | |
JP4643183B2 (ja) | 翻訳装置および翻訳プログラム | |
Jadhav et al. | Study of machine transliteration for cross language retrieval | |
Vijayanand et al. | Named entity recognition and transliteration for Telugu language | |
JP5632213B2 (ja) | 機械翻訳装置及び機械翻訳プログラム | |
Popovych et al. | Ukrainian Redaction of Church Slavonic (URCS): Needs for Digitalization and Text Corpora Platform Generation. Part I. | |
Hurskainen | Can machine translation assist in Bible translation? | |
Jamoussi et al. | Road sign romanization in Oman: The linguistic landscape close-up | |
Divate | KTM-POP: Transliteration of K-POP Lyrics to Marathi | |
Jancso et al. | A web application for geolocalized signs in synthesized swiss german sign language | |
JP2010039864A (ja) | 形態素解析装置、形態素解析方法及びコンピュータプログラム |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | WWE | Wipo information: entry into national phase | Ref document number: 201080047317.0; Country of ref document: CN |
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 10809824; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | WWE | Wipo information: entry into national phase | Ref document number: 13391528; Country of ref document: US |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 10809824; Country of ref document: EP; Kind code of ref document: A1 |