WO2021186892A1 - Translated sentence calculation device - Google Patents

Translated sentence calculation device

Info

Publication number
WO2021186892A1
Authority
WO
WIPO (PCT)
Prior art keywords
sentence
word
translated
input
created
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2021/002226
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
松岡 保静
俊允 中村
聡一朗 村上
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NTT Docomo Inc
Original Assignee
NTT Docomo Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NTT Docomo Inc
Priority to JP2022508098A (patent JP7691411B2)
Publication of WO2021186892A1

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/42 Data-driven translation
    • G06F40/44 Statistical methods, e.g. probability models
    • G06F40/47 Machine-assisted translation, e.g. using translation memory

Definitions

  • One aspect of the present disclosure relates to a translated sentence calculation device that calculates a sentence obtained by translating a sentence in a first language into a second language.
  • in neural machine translation, which is machine translation using a neural network of an encoder-decoder model composed of an encoder and a decoder,
  • the encoder takes as input a sentence in the first language (for example, Japanese), and
  • the decoder outputs, as the translation result, the sentence in the second language (for example, English) corresponding to the sentence in the first language.
  • Patent Document 1 discloses an automatic interpreter including an encoder and a decoder composed of a neural network.
  • the above automatic interpreter outputs the second-language sentence "How do I get to Gangnam?" as the translation result for the first-language sentence "How do you get to Gangnam Station?" entered by the user. However, it does not contemplate, for example, a case in which the user translates partway, as far as "Would you please", and has the automatic interpreter output the rest of the translation.
  • the translated sentence calculation device uses a translator composed of a recurrent neural network of an encoder-decoder model in which the encoder takes a sentence in the first language as input and the decoder sequentially outputs word candidates of the sentence in the second language corresponding to the sentence in the first language.
  • the translated sentence calculation device includes a first input unit that inputs a target sentence, which is a sentence in the first language, into the encoder; a second input unit that sequentially inputs into the decoder the words of a partial sentence, which is a part of a created sentence obtained by translating the target sentence into the second language; and a calculation unit that calculates a translated sentence, which is a sentence based on the word candidates output by the decoder on the basis of the inputs by the first input unit and the second input unit.
  • with this configuration, a translated sentence in the second language that corresponds to the content of the target sentence in the first language and takes the partial sentence in the second language into account is calculated. That is, it is possible to calculate a translated sentence that takes a partially translated sentence into account.
  • FIG. 1 is a diagram showing an example of the functional configuration of the translated sentence calculation system 3 including the translated sentence calculation device 1 according to the embodiment.
  • the translated sentence calculation system 3 includes a translated sentence calculation device 1 and a translator 2.
  • the translated sentence calculation device 1 and the translator 2 are communicatively connected to each other via a network and can transmit and receive information to and from each other.
  • the translated sentence calculation device 1 and the translator 2 need not be independent of each other; the translator 2 may be included inside the translated sentence calculation device 1.
  • the translated sentence calculation device 1 and the translator 2 in the translated sentence calculation system 3 may have any system configuration as long as the translated sentence calculation device 1 uses the translator 2.
  • the translated sentence calculation device 1 is a computer device that, based on a target sentence, which is a sentence in the first language, and a partial sentence, which is a part of a created sentence obtained by translating the target sentence into the second language, calculates a translated sentence in the second language that corresponds to the content of the target sentence and takes the partial sentence into account.
  • the first language is, for example, Japanese, but any other language may be used.
  • the second language is a language different from the first language, for example English, but may be any other language.
  • the first language and the second language may be different local dialects (for example, standard language and Kansai dialect in Japan).
  • the language is not limited to a natural language, and may be an artificial language or a formal language (such as a computer programming language).
  • the created sentence is assumed to be created by a person, such as a user of the translated sentence calculation device 1, but may be a sentence created by something other than a person.
  • the translated sentence calculation device 1 uses the translator 2. The details of the translated sentence calculation device 1 will be described later.
  • the translator 2 is a computer device that translates a sentence in the first language into a sentence in the second language.
  • the translator 2 may evaluate the translation quality of each word of the translated second language sentence and the entire sentence.
  • the translator 2 includes an encoder 20 and a decoder 21.
  • the translator 2 is composed of a recurrent neural network (RNN) of an encoder-decoder model (also called an encoder-decoder translation model or a Sequence to Sequence Model) in which the encoder 20 takes a sentence in the first language as input and the decoder 21 sequentially outputs word candidates of the sentence in the second language corresponding to the sentence in the first language.
  • the neural network is, for example, a recurrent neural network called LSTM (Long Short-Term Memory).
  • the translator 2 performs neural machine translation.
  • the decoder 21 may sequentially output the likelihood for the word candidate together with the word candidate of the sentence of the second language.
  • the encoder 20 takes a sentence in the first language as input and outputs a vector of an intermediate layer (hidden layer). More specifically, the encoder 20 divides the sentence in the first language into words by morphological analysis or the like, converts the word ID corresponding to each word into a word vector (input-layer vector), inputs the word vectors sequentially (from the first word to the last word of the sentence), and sequentially outputs the vector of the intermediate layer based on the content input so far (that is, it performs the neural network calculation). When "<EOS>", which indicates the end of the sentence, is input, the encoder 20 outputs (passes) to the decoder 21 the vector of the intermediate layer based on the content input up to that point. Conceptually, the encoder 20 can be said to analyze the meaning of the sentence in the first language and extract a semantic representation.
  • the decoder 21 takes as input the vector of the intermediate layer output from the encoder 20 and sequentially calculates and outputs vectors of the output layer based on the vector of the intermediate layer, or based on the vector of the intermediate layer and the second-language words input to the decoder 21.
  • the vector of the output layer is, for example, information indicating a list of second-language word candidates and their likelihoods. An example of the list is (word candidate "It" and its likelihood "0.66", word candidate "Tomorrow" and its likelihood "0.33", ...).
  • when the vector of the intermediate layer output from the encoder 20 is input, the decoder 21 calculates and outputs, based on that vector, the vector of the output layer corresponding to the first word of the second-language sentence to be finally output. After that, the decoder 21 repeats the following process until the last word of the second-language sentence: it extracts the word with the highest likelihood from the word candidates indicated by the vector of the output layer for the Nth word (N is an integer of 1 or more), inputs the extracted word into itself (the decoder 21), and outputs the vector of the output layer for the (N + 1)th word based on the input word and the vector of the intermediate layer used when outputting the vector of the output layer for the Nth word. Conceptually, the decoder 21 can be said to generate a sentence (in a second language different from the first language) from the semantic representation extracted by the encoder 20.
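
The word-by-word loop of the decoder 21 described above can be sketched as follows. This is only an illustration under stated assumptions: `toy_decoder_step` and `TOY_TABLE` are hypothetical stand-ins for a trained decoder, the history of decoded words stands in for the intermediate-layer vector, and every likelihood other than "It" 0.66 and "Tomorrow" 0.33 is invented.

```python
from typing import Callable, Dict, List, Tuple

EOS = "<EOS>"

# Hypothetical stand-in for the decoder 21: maps the words decoded so far
# to next-word candidates and their likelihoods. Only the first row uses
# values from the text; the rest are made up for illustration.
TOY_TABLE: Dict[Tuple[str, ...], Dict[str, float]] = {
    (): {"It": 0.66, "Tomorrow": 0.33},
    ("It",): {"is": 0.9, "will": 0.1},
    ("It", "is"): {"fine": 0.8, "sunny": 0.2},
    ("It", "is", "fine"): {"tomorrow": 0.7, EOS: 0.3},
    ("It", "is", "fine", "tomorrow"): {EOS: 1.0},
}


def toy_decoder_step(history: List[str]) -> Dict[str, float]:
    """Return the candidate list (the 'output layer') for the next word."""
    return TOY_TABLE.get(tuple(history), {EOS: 1.0})


def greedy_decode(step: Callable[[List[str]], Dict[str, float]],
                  max_len: int = 20) -> List[str]:
    """Pick the highest-likelihood candidate, feed it back, repeat until <EOS>."""
    words: List[str] = []
    for _ in range(max_len):
        candidates = step(words)
        best = max(candidates, key=candidates.get)
        if best == EOS:
            break
        words.append(best)  # the chosen word becomes input for word N + 1
    return words


print(" ".join(greedy_decode(toy_decoder_step)))
```

A real decoder would carry an LSTM state instead of a lookup table; the table is used here only so that the control flow can be run and inspected.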
  • FIG. 2 is a diagram showing a display example by the translated sentence calculation device 1.
  • the display example shows a scene in which the user translates into the second language the target sentence (problem sentence) "This is the fastest way to the post office.", which is a sentence in the first language (Japanese) displayed on the screen. The initially assumed model answer is "This is the shortest way to go to the post office."
  • the main body of each process is the translated sentence calculation device 1 except for the process by the user.
  • the target sentence A, the empty text box B, and the scoring button C are displayed on the screen.
  • the user translates the target sentence A into English on his / her own (that is, translates the target sentence A in the first language (Japanese) into the second language (English) unaided) and enters the created sentence D, which is the resulting English sentence, in the text box B. As shown in FIG.
  • the created sentence D created by the user is "This is the best road to go to the office.”
  • the translation quality of the created sentence D is evaluated (scored) by using an existing technique such as the created sentence evaluation device of the above-mentioned reference document.
  • the evaluation sentence E reflecting the evaluation result and the first evaluation value F indicating the evaluation value (score) for the entire sentence of the evaluation sentence E are displayed on the screen.
  • in the evaluation sentence E, a word whose evaluation does not meet the predetermined standard (whose evaluation value is lower than the predetermined standard value), that is, a word that should be corrected, is displayed in a form different from the other words, such as bold or colored, to indicate to the user that it does not meet the predetermined standard.
  • the evaluation value may be displayed near each word that does not meet the predetermined criteria, the evaluation value of each word may be displayed near that word for all the words in the evaluation sentence E, or the degree of boldness or coloring (brightness, saturation, hue, etc.) may be varied according to the evaluation value.
  • the words "best", "road", and "office” in the evaluation sentence E are displayed in bold because the evaluation does not meet the predetermined criteria.
  • the evaluation sentence E may be displayed in the text box B in a format that overwrites the created sentence D. Further, in the present embodiment, since the created sentence D and the evaluation sentence E are the same in content, the created sentence D and the evaluation sentence E may be equated. Further, the first evaluation value F does not have to be displayed.
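
As a rough illustration of how the evaluation sentence E could be rendered, the sketch below marks every word whose evaluation value falls below a threshold. The function name `mark_low_words`, the threshold of 0.5, and all numeric evaluation values are assumptions for illustration; the text only states which words ("best", "road", "office") fail the criterion, and `**...**` merely stands in for bold display.

```python
from typing import List, Tuple


def mark_low_words(scored: List[Tuple[str, float]], threshold: float) -> str:
    """Wrap words whose evaluation value is below the threshold in ** ** (stand-in for bold)."""
    return " ".join(f"**{word}**" if score < threshold else word
                    for word, score in scored)


# Evaluation values are invented; only which words fall below the
# criterion ("best", "road", "office") follows the text.
scored_sentence = [
    ("This", 0.9), ("is", 0.9), ("the", 0.9), ("best", 0.4), ("road", 0.3),
    ("to", 0.9), ("go", 0.9), ("to", 0.9), ("the", 0.9), ("office.", 0.2),
]

print(mark_low_words(scored_sentence, threshold=0.5))
```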
  • when a word whose evaluation does not meet the predetermined criterion among the words of the evaluation sentence E (or any one word included in the created sentence D) is designated (pointed out), the translated sentence calculation device 1 treats that word as the target word.
  • for the corrected sentence, which is the combination of the partial sentence before the target word in the evaluation sentence E or the created sentence D and the translated sentence, if the evaluation (reliability) of the translation quality of the sentence as a whole does not satisfy the predetermined standard, the correction is repeated while going back to words before the target word. Once the reliability meets the predetermined standard (the reliability is not lower than the predetermined standard value), the corrected sentence at that point is displayed as the final translated sentence.
  • the translated sentence G "post office." is calculated and displayed. That is, the translated sentence calculation device 1 proposes that "office" should be changed to "post office".
  • the evaluation value regarding the translated sentence G may be displayed together with the translated sentence G.
  • the corrected sentence is "This is the best road to go to the post office."
  • suppose that the evaluation (reliability) of the translation quality of this sentence as a whole is determined not yet to meet the predetermined criterion (the sentence as a whole has room for improvement). In that case, correction is attempted by going back to an earlier word.
  • when the process goes back to the word "road" (the next word that does not meet the predetermined standard) and "shortcut to the post office." is obtained, the reliability meets the predetermined standard.
  • the corrected sentence H, which is the correction result for the entire sentence, is then displayed.
  • the second evaluation value I indicating the evaluation value (score) for the entire sentence of the corrected sentence H is displayed.
  • the second evaluation value I does not have to be displayed.
  • the corrected sentence H differs from the above-mentioned model answer "This is the shortest way to go to the post office.", but it keeps "best" rather than "shortest". The correction therefore respects the created sentence D (the user's answer) and makes use of it.
  • Target text "This is the fastest way to the post office.”
  • Written text “This is the best road to go to the office.”
  • Target words "office”, “road”, etc.
  • as a correction function, when the user designates an arbitrary word (for example, an incorrect word), for example by clicking it on the screen, the translated sentence calculation device 1 creates the continuation of the sentence from that word. If the sentence before the designated word is also incorrect, the device corrects it by going back to an earlier incorrect word. That is, the translated sentence calculation device 1 translates while respecting the user's answer as much as possible. In other words, the translated sentence calculation device 1 uses AI (Artificial Intelligence) to compose and display the sentence from the word to be corrected onward while making as much use of the user's answer as possible.
  • the translated sentence calculation device 1 includes a storage unit 10, an input / output unit 11 (reception unit, display unit), a first input unit 12, a second input unit 13, and a calculation unit 14.
  • Each functional block of the translated sentence calculation device 1 is assumed to function in the translated sentence calculation device 1, but is not limited to this.
  • for example, some of the functional blocks of the translated sentence calculation device 1 may function in a computer device that is different from the translated sentence calculation device 1 and is connected to it via a network, while transmitting and receiving information to and from the translated sentence calculation device 1 as appropriate.
  • some functional blocks of the translated sentence calculation device 1 may be omitted, a plurality of functional blocks may be integrated into one functional block, or one functional block may be decomposed into a plurality of functional blocks.
  • the storage unit 10 stores arbitrary information used for calculation in the translated sentence calculation device 1 and the calculation result in the translated sentence calculation device 1.
  • the information stored by the storage unit 10 is appropriately referred to by each function of the translated sentence calculation device 1.
  • the storage unit 10 may store the target sentence in advance.
  • the input / output unit 11 outputs (displays) the target sentence.
  • for example, when the input / output unit 11 acquires, from the user or the translated sentence calculation device 1, an instruction to display the target sentence on the screen, it acquires the target sentence stored by the storage unit 10 and displays the acquired target sentence on the screen of the translated sentence calculation device 1 (hereinafter simply referred to as "screen"), which is an output device 1006 described later.
  • the output of the target sentence by the input / output unit 11 corresponds to the display of the target sentence A in the display example of FIG.
  • the "output" in the present embodiment is not limited to display, and includes, for example, transmission to another computer via a network, output by voice, and the like.
  • the input / output unit 11 outputs (displays) the created text.
  • the input / output unit 11 displays on the screen a created sentence input by the user using a keyboard or a microphone, which is an input device 1005 described later.
  • the output of the created sentence by the input / output unit 11 corresponds to the display of the created sentence D or the evaluation sentence E in the display example of FIG.
  • the input / output unit 11 (display unit) inputs (acquires) the created text. For example, when the input / output unit 11 obtains an instruction from the user to input the created sentence, the input / output unit 11 inputs the created sentence. The input / output unit 11 outputs the input created text to the second input unit 13 and the calculation unit 14. The input of the created sentence by the input / output unit 11 corresponds to the input of the created sentence D when the scoring button C is pressed in the display example of FIG.
  • the input / output unit 11 outputs (displays) the evaluation text.
  • the input / output unit 11 evaluates the translation quality of the created text input by the input / output unit 11 using the translator 2, and displays the evaluation text reflecting the evaluation result on the screen.
  • the input / output unit 11 may also output the evaluation value of the translation quality for the entire sentence of the evaluation sentence.
  • the output of the evaluation text by the input / output unit 11 corresponds to the output of the evaluation text E and the first evaluation value F in the display example of FIG.
  • the input / output unit 11 accepts the designation of one word included in the created sentence as the target word.
  • the input / output unit 11 may accept the designation of the word designated by the user among the created sentences displayed by the input / output unit 11 (display unit) as the target word.
  • the input / output unit 11 outputs the target word that has received the designation to the second input unit 13 and the calculation unit 14.
  • the reception of the designation of the target word by the input / output unit 11 corresponds to the reception of the designation (click, etc.) of the word "office" in the created sentence D or the evaluation sentence E in the display example of FIG.
  • the input / output unit 11 outputs (displays) the translated text. More specifically, the input / output unit 11 (display unit) displays the translated text calculated by the calculation unit 14. For example, the input / output unit 11 displays the translated text calculated by the calculation unit 14 on the screen. When the input / output unit 11 outputs the translated text, the input / output unit 11 may also output the value of the evaluation of the translation quality of the translated text.
  • the output of the translated text by the input / output unit 11 corresponds to the output of the translated text G in the display example of FIG.
  • the input / output unit 11 outputs (displays) the corrected text.
  • for example, the input / output unit 11 displays on the screen the corrected sentence, which is the combination of the partial sentence (the part of the created sentence input by the input / output unit 11 that precedes the target word whose designation was accepted by the input / output unit 11) and the translated sentence calculated by the calculation unit 14.
  • the input / output unit 11 may also output the value of the evaluation of the translation quality for the entire sentence of the corrected sentence.
  • the output of the corrected text by the input / output unit 11 corresponds to the output of the corrected text H and the second evaluation value I in the display example of FIG.
  • the first input unit 12 inputs the target sentence, which is the sentence of the first language, into the encoder 20 of the translator 2. More specifically, the first input unit 12 inputs the words constituting the target sentence to the encoder 20 in the order in which they appear in the sentence. A specific example of the processing of the first input unit 12 will be described later.
  • the second input unit 13 sequentially inputs into the decoder 21 the words of a partial sentence, which is a part of the created sentence (the created sentence input by the input / output unit 11) obtained by translating the target sentence into the second language.
  • the created sentence may be a sentence created by the user translating the target sentence (output by the input / output unit 11) into a second language.
  • the partial sentence may be a partial sentence from the beginning of the created sentence.
  • the second input unit 13 sequentially inputs the words of a part of the sentence (in the order in which the words constituting the part of the sentence appear in the sentence) into the decoder 21, and after the input is completed, the words that the decoder 21 sequentially outputs. Candidates may be sequentially input to the decoder 21.
  • the second input unit 13 may treat the part of the created sentence before the target word (whose designation was accepted by the input / output unit 11) as the partial sentence and sequentially input the words of that partial sentence into the decoder 21. A specific example of the processing of the second input unit 13 will be described later.
  • the input by the first input unit 12 and the second input unit 13 may be triggered when the input / output unit 11 accepts the designation of the target word, at a timing instructed by the user or the administrator of the translated sentence calculation device 1, periodically (for example, once a minute), or at any other timing.
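
The behaviour of the second input unit 13, feeding the words of the partial sentence to the decoder and then letting the decoder continue on its own output, might look like the sketch below. `toy_decoder_step`, `TOY_TABLE`, and `continue_from_prefix` are hypothetical names; the likelihoods 0.66, 0.33, 0.5 and 0.4 come from the text, and all other values are invented.

```python
from typing import Callable, Dict, List, Tuple

EOS = "<EOS>"

# Hypothetical decoder stand-in, keyed by the words fed in so far.
TOY_TABLE: Dict[Tuple[str, ...], Dict[str, float]] = {
    (): {"It": 0.66, "Tomorrow": 0.33},
    ("Tomorrow",): {"is": 0.5, "will": 0.4},
    ("Tomorrow", "will"): {"be": 0.9, "rain": 0.1},          # invented
    ("Tomorrow", "will", "be"): {"fine": 0.8, "sunny": 0.2},  # invented
    ("Tomorrow", "will", "be", "fine"): {EOS: 1.0},           # invented
}


def toy_decoder_step(history: List[str]) -> Dict[str, float]:
    return TOY_TABLE.get(tuple(history), {EOS: 1.0})


def continue_from_prefix(step: Callable[[List[str]], Dict[str, float]],
                         prefix: List[str],
                         max_len: int = 20) -> List[str]:
    """Feed the partial sentence as-is, then extend it greedily until <EOS>."""
    history = list(prefix)        # the user's words are forced, not re-chosen
    continuation: List[str] = []
    while len(history) < max_len:
        candidates = step(history)
        best = max(candidates, key=candidates.get)
        if best == EOS:
            break
        history.append(best)      # decoder output is fed back as input
        continuation.append(best)
    return continuation


partial_sentence = ["Tomorrow", "will"]
translated = continue_from_prefix(toy_decoder_step, partial_sentence)
print(" ".join(partial_sentence + translated))
```

Note that an unconstrained decoder would have started with "It"; forcing the prefix is what lets the continuation respect the user's wording.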
  • the calculation unit 14 calculates a translated sentence which is a sentence based on the word candidate output by the decoder 21 based on the input by the first input unit 12 and the second input unit 13.
  • the translated sentence may be a sentence following a part of the sentence.
  • until the sentence evaluation, which is the evaluation of the sentence based on the partial sentence and the calculated translated sentence, meets a predetermined criterion, the calculation unit 14 may set a word before the target word in the created sentence as a new target word and calculate a new translated sentence by having the first input unit 12 and the second input unit 13 perform their inputs with the part of the created sentence before the new target word as a new partial sentence.
  • the calculation unit 14 may use a word that precedes the target word in the created sentence and whose evaluation of the word in the created sentence does not satisfy a predetermined criterion as a new target word.
  • the sentence evaluation may be based on the likelihoods, among the word candidates and likelihoods output by the decoder 21 on the basis of the inputs by the first input unit 12 and the second input unit 13, of the word candidates that are the same as the words of the partial sentence.
  • the calculation unit 14 may output the calculated translated text to the input / output unit 11.
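
Choosing the new target word, that is, the nearest earlier word whose evaluation fails the criterion, can be sketched as a backward scan. The helper name `previous_failing_index`, the threshold, and the per-word evaluation values are assumptions for illustration only.

```python
from typing import List, Optional


def previous_failing_index(word_scores: List[float],
                           target_idx: int,
                           threshold: float) -> Optional[int]:
    """Return the index of the nearest word before target_idx whose
    evaluation is below the threshold, or None if there is none."""
    for i in range(target_idx - 1, -1, -1):
        if word_scores[i] < threshold:
            return i
    return None


created = "This is the best road to go to the office.".split()
# Invented evaluation values; "best", "road" and "office." fall below 0.5.
scores = [0.9, 0.9, 0.9, 0.4, 0.3, 0.9, 0.9, 0.9, 0.9, 0.2]

target = created.index("office.")
new_target = previous_failing_index(scores, target, threshold=0.5)
print(created[new_target])
```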
  • FIG. 3 is a diagram showing a usage example of the translator 2 by the translated sentence calculation device 1.
  • a schematic diagram of the encoder 20 and the decoder 21 of the translator 2 is shown in the upper part of FIG.
  • the translator 2 uses an LSTM and is divided into an encoder 20 side and a decoder 21 side.
  • the first input unit 12 divides the target sentence (input sentence) into words, and sequentially inputs the word ID corresponding to each word into the encoder 20.
  • the encoder 20 converts each input word ID into a word vector and performs the neural network calculation, and when "<EOS>" (end of sentence) is input, the encoder 20 passes the vector of the intermediate layer to the decoder 21.
  • the decoder 21 calculates the output layer from the passed vector of the intermediate layer, and calculates the likelihood of the word output by the Softmax function.
  • for each word of the partial sentence input by the second input unit 13, the decoder 21 compares the likelihood of the word output from the output layer with the likelihood of the word in the partial sentence and calculates the reliability of each word in the partial sentence.
  • for example, because the decoder 21 would output "It is fine tomorrow.", the word with the highest likelihood as the first word is "It", with a likelihood of "0.66".
  • next, the decoder 21 calculates the word following "Tomorrow", and the likelihoods of the output words are calculated in the same way as for the first word.
  • "is" has the highest likelihood, which is "0.5".
  • the likelihood of "will", the word in the user's partial sentence, is "0.4". Therefore, the reliability of "will" is "0.8", which is the likelihood of "will" divided by the likelihood of "is".
  • for the word following "will", the second input unit 13 takes the word with the highest likelihood among the output words of the decoder 21, inputs it to the decoder 21, and thereby complements the sentence (partial sentence).
  • the second input unit 13 adds "be" to the translated sentence because the word having the highest likelihood as the word following "will" is "be".
  • the reliability of "be" is "1" because it is a word output by the decoder 21.
  • "be" is followed by "fine"
  • "fine" is followed by "<EOS>"
  • the reliability of the entire sentence (corrected sentence) is the average value of the reliability of each word.
  • the calculation unit 14 calculates a translated sentence output by the decoder 21 or a corrected sentence which is a combination of a part of the sentence and the translated sentence output by the decoder 21.
  • the evaluation in the present embodiment is not limited to the one based on the likelihood output by the decoder 21 of the translator 2, and may be an evaluation of translation quality or the like by other existing techniques.
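
The reliability arithmetic described above (a word's likelihood divided by the highest likelihood at that position, then averaged over the sentence) can be written out as below. The function names are hypothetical. Treating a word absent from the candidate list as likelihood 0.0 is an assumption, as is the candidate "rain" with 0.1; the value for "Tomorrow" is not stated in the text and is simply what the same ratio rule yields from 0.33 and 0.66.

```python
from typing import Dict, List


def word_reliability(candidates: Dict[str, float], word: str) -> float:
    """Likelihood of `word` divided by the highest likelihood at this position."""
    best = max(candidates.values())
    return candidates.get(word, 0.0) / best  # missing word -> 0.0 (assumption)


def sentence_reliability(word_reliabilities: List[float]) -> float:
    """Average of the per-word reliabilities."""
    return sum(word_reliabilities) / len(word_reliabilities)


first_word = {"It": 0.66, "Tomorrow": 0.33}
second_word = {"is": 0.5, "will": 0.4, "rain": 0.1}  # "rain" is invented

r_tomorrow = word_reliability(first_word, "Tomorrow")  # derived, not stated in the text
r_will = word_reliability(second_word, "will")         # 0.4 / 0.5, as in the text
r_be = 1.0    # words chosen by the decoder itself have reliability 1
r_fine = 1.0

print(r_will)
print(sentence_reliability([r_tomorrow, r_will, r_be, r_fine]))
```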
  • FIG. 4 is a flowchart showing an example of the processing executed by the translated sentence calculation device 1 (or the translated sentence calculation system 3).
  • the input / output unit 11 displays the target sentence on the screen (step S1).
  • the input / output unit 11 acquires a created sentence that the user has produced by translating the target sentence displayed in S1 by himself / herself (step S2).
  • the input / output unit 11 evaluates the created sentence acquired in S2 using a translator 2 or the like, and displays the evaluation result on the screen (step S3).
  • the input / output unit 11 accepts the designation of the target word in the created sentence by the user based on the evaluation result displayed in S3 (step S4).
  • the first input unit 12 inputs the target sentence into the encoder 20 of the translator 2, and the second input unit 13 inputs into the decoder 21 of the translator 2 the partial sentence based on the created sentence acquired in S2 and the target word designated in S4 (step S5).
  • the calculation unit 14 calculates a translated sentence based on the word candidates output by the decoder 21 of the translator 2 based on the input of S5 (step S6).
  • the calculation unit 14 evaluates a corrected sentence, which is the combination of the partial sentence of S5 and the translated sentence calculated in S6, based on the likelihood output by the decoder 21 in S6 (step S7).
  • the calculation unit 14 determines whether or not the evaluation of S7 satisfies a predetermined criterion (step S8).
  • if the criterion is not satisfied (S8: NO), the calculation unit 14 sets a word before the target word as a new target word and sets the part of the created sentence acquired in S2 that precedes the new target word as a new partial sentence (step S9), and the process returns to S5 (S5 to S9 are repeated until S8: YES is obtained).
  • if the criterion is satisfied (S8: YES), the input / output unit 11 displays the corrected sentence of S7 (in the final loop) on the screen (step S10).
  • S1 to S4 may be omitted.
  • in that case, the target word (the first target word in the loop starting from S5) may be any word in the created sentence, for example the last word in the created sentence, a word in the created sentence whose evaluation does not meet the predetermined criteria, or, among the words in the created sentence whose evaluation does not meet the predetermined criteria, the word closest to the end of the created sentence.
  • S7 to S9 may be omitted and S10 may be executed after S6.
  • the input / output unit 11 may display the translated sentence on the screen (as an intermediate result). Further, when the evaluation is performed in S7, the input / output unit 11 may display the evaluation result on the screen.
  • the input / output unit 11 may display the new target word on the screen, or may display the new target word in a different form (e.g., bold) within the created sentence already displayed in S3 or the like. Further, in S10, the translated sentence finally calculated in S6 (in the loop) may be displayed instead of the corrected sentence or together with the corrected sentence.
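
The loop from S5 to S10 can be sketched as follows, under stated assumptions. `toy_translate_after` is a hypothetical stand-in for steps S5 to S7 (translator call plus evaluation); its reliabilities, the per-word evaluation values, and the single shared threshold of 0.8 are invented. Returning the last corrected sentence when no earlier failing word remains is also an assumption, since the text does not cover that case.

```python
from typing import Callable, List, Tuple

Translate = Callable[[List[str]], Tuple[List[str], float]]


def correct_with_backtracking(created: List[str], target_idx: int,
                              word_scores: List[float],
                              translate_after: Translate,
                              threshold: float) -> List[str]:
    idx = target_idx
    while True:
        prefix = created[:idx]                                # partial sentence
        continuation, reliability = translate_after(prefix)   # S5 to S7
        corrected = prefix + continuation
        if reliability >= threshold:                          # S8: YES
            return corrected                                  # S10
        # S9: go back to the nearest earlier word that fails the criterion
        earlier = [i for i in range(idx) if word_scores[i] < threshold]
        if not earlier:
            return corrected  # assumption: nothing left to go back to
        idx = earlier[-1]


def toy_translate_after(prefix: List[str]) -> Tuple[List[str], float]:
    """Hypothetical translator + evaluator; reliabilities are invented."""
    table = {
        "This is the best road to go to the": (["post", "office."], 0.6),
        "This is the best": (["shortcut", "to", "the", "post", "office."], 0.9),
    }
    return table[" ".join(prefix)]


created = "This is the best road to go to the office.".split()
scores = [0.9, 0.9, 0.9, 0.4, 0.3, 0.9, 0.9, 0.9, 0.9, 0.2]  # invented
result = correct_with_backtracking(created, created.index("office."),
                                   scores, toy_translate_after, threshold=0.8)
print(" ".join(result))
```

Because the index of the target word strictly decreases on every pass, the loop terminates even if the criterion is never met.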
  • the translated sentence calculation device 1 uses a translator composed of a recurrent neural network of an encoder-decoder model in which the encoder 20 takes a sentence in the first language as input and the decoder 21 sequentially outputs word candidates of the sentence in the second language corresponding to the sentence in the first language. It includes the first input unit 12, which inputs the target sentence, a sentence in the first language, into the encoder 20; the second input unit 13, which sequentially inputs into the decoder 21 the words of a partial sentence that is a part of the created sentence obtained by translating the target sentence into the second language; and the calculation unit 14, which calculates a translated sentence based on the word candidates output by the decoder 21 on the basis of the inputs by the first input unit 12 and the second input unit 13.
  • with this configuration, a translated sentence in the second language that corresponds to the content of the target sentence in the first language and takes the partial sentence in the second language into account is calculated. That is, it is possible to calculate a translated sentence that takes a partially translated sentence into account.
  • the partial sentence is the part of the created sentence starting from its beginning
  • the translated sentence is the sentence that follows the partial sentence.
  • the second input unit 13 sequentially inputs the words of the partial sentence into the decoder 21 and, once that input is complete, sequentially feeds the word candidates output by the decoder 21 back into the decoder 21.
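The two-phase input performed by the second input unit 13 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the disclosure: `next_word` is a hypothetical stand-in for the decoder 21 (the encoder 20 and the target sentence are assumed to be folded into it), `toy_decoder` and its likelihood values are invented for the example, and always taking the most likely candidate (greedy selection) is an assumption.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for decoder 21: maps the words fed so far to a
# likelihood for each candidate next word.
NextWordFn = Callable[[List[str]], Dict[str, float]]

EOS = "</s>"  # assumed end-of-sentence marker


def complete_from_prefix(next_word: NextWordFn, prefix: List[str],
                         max_len: int = 50) -> List[str]:
    """Return the continuation (translated sentence) that follows the prefix."""
    fed = list(prefix)  # phase 1: the partial sentence is fed in unchanged
    continuation: List[str] = []
    while len(fed) < max_len:
        candidates = next_word(fed)
        best = max(candidates, key=candidates.get)  # greedy choice (assumption)
        if best == EOS:
            break
        continuation.append(best)
        fed.append(best)  # phase 2: the decoder's own output is fed back in
    return continuation


# Invented toy likelihoods, indexed by how many words have been fed so far.
TOY_TABLE = [
    {"I": 0.9, "We": 0.1},
    {"went": 0.6, "walked": 0.3, "go": 0.1},
    {"to": 0.9, "the": 0.1},
    {"school": 0.9, "work": 0.1},
    {"yesterday": 0.8, "today": 0.2},
    {EOS: 1.0},
]


def toy_decoder(fed: List[str]) -> Dict[str, float]:
    return TOY_TABLE[min(len(fed), len(TOY_TABLE) - 1)]


if __name__ == "__main__":
    print(complete_from_prefix(toy_decoder, ["I", "walked", "to"]))
```

The user's word "walked" is kept even though the toy decoder would have preferred "went", which mirrors how the partial sentence is respected while only the continuation is generated.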
  • the created sentence is a sentence created by the user translating the target sentence into a second language.
  • the input / output unit 11 that accepts the designation of one word included in the created sentence as the target word is further provided, and the second input unit 13 uses the part of the created sentence before the target word as the partial sentence and sequentially inputs the words of that partial sentence into the decoder 21. With this configuration, a partial sentence can be generated, and a translated sentence calculated, simply by specifying one word.
  • until the sentence evaluation, which is the evaluation of the sentence formed from the partial sentence and the calculated translated sentence, satisfies a predetermined criterion, the calculation unit 14 repeatedly sets a word that precedes the target word in the created sentence as the new target word.
  • the calculation unit 14 sets, as the new target word, a word in the created sentence that precedes the target word and whose word evaluation does not meet a predetermined criterion.
  • the decoder 21 sequentially outputs, together with each word candidate of the second-language sentence, the likelihood of that candidate, and the sentence evaluation is based on the likelihoods that the decoder 21 outputs, given the inputs by the first input unit 12 and the second input unit 13, for the word candidates that are the same as the words of the partial sentence.
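A likelihood-based evaluation of this kind might look like the sketch below. Several points are assumptions rather than statements from the disclosure: each word is scored as the ratio of its likelihood to that of the decoder's top candidate, the sentence evaluation is taken as the lowest per-word ratio, and `next_word` and `toy_decoder` are hypothetical stand-ins for the decoder 21 with invented likelihood values.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for decoder 21 (encoder 20 assumed folded in).
NextWordFn = Callable[[List[str]], Dict[str, float]]


def word_scores(next_word: NextWordFn, partial: List[str]) -> List[float]:
    """Score each word of the partial sentence against the decoder's candidates."""
    scores: List[float] = []
    for i, word in enumerate(partial):
        candidates = next_word(partial[:i])      # candidates given the words so far
        best = max(candidates.values())          # likelihood of the top candidate
        scores.append(candidates.get(word, 0.0) / best)  # 1.0 if the word is the top choice
    return scores


def sentence_evaluation(next_word: NextWordFn, partial: List[str]) -> float:
    """Lowest per-word ratio (assumption); an empty partial sentence scores 1.0."""
    return min(word_scores(next_word, partial), default=1.0)


# Invented toy likelihoods, indexed by how many words have been fed so far.
TOY_TABLE = [
    {"I": 0.9, "We": 0.1},
    {"went": 0.6, "walked": 0.3, "go": 0.1},
    {"to": 0.9, "at": 0.05, "the": 0.05},
]


def toy_decoder(fed: List[str]) -> Dict[str, float]:
    return TOY_TABLE[min(len(fed), len(TOY_TABLE) - 1)]


if __name__ == "__main__":
    print(word_scores(toy_decoder, ["I", "walked", "to"]))
    print(sentence_evaluation(toy_decoder, ["I", "walked", "to"]))
```

Under this ratio-based assumption, words that the decoder itself would have generated always score 1.0, so only the user's words can lower the evaluation.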
  • the input / output unit 11 for displaying the created sentence is further provided, and the input / output unit 11 accepts, as the designation of the target word, the word specified by the user in the created sentence displayed by the input / output unit 11.
  • the user can specify the target word from the displayed created sentence, so that usability is improved.
  • the input / output unit 11 further displays the translated sentence calculated by the calculation unit 14.
  • the user can easily confirm the new translated sentence because the sentences after the specified target word are newly displayed as the translated sentences calculated by the translated sentence calculation device 1. Usability is improved.
  • the translated sentence calculation device 1 relates to a composition correction system using a neural network.
  • learning of "writing” and “speaking” is attracting attention in learning English.
  • These skills can be learned effectively when someone is available to correct or guide the learner, but they are difficult to acquire through self-study. In addition, having a person correct or give guidance is costly. Therefore, there is a demand for a technique in which AI performs the correction or guidance in place of a human.
  • there are various answers in English composition and there is a problem that neither correction nor guidance can be made simply by comparing with the model answer.
  • In English composition for Japanese-to-English translation, there are various ways of expressing the same meaning, and if correction or guidance is to be based on the learner's own way of expressing it, the correction or guidance must take the meaning of the question sentence into account.
  • the translated sentence calculation device 1 is an English composition correction system that solves the above-mentioned problems by using a neural network learned by neural machine translation and making corrections with an emphasis on the meaning of sentences.
  • the user can have the machine correct the English sentence created by the user from any word onward, and the device corrects from the most suitable word, respecting the user's composition as much as possible, so that the reliability of the entire sentence increases. Therefore, the user can learn correct English sentences while trying a wide variety of expressions.
  • the scoring criteria for scoring English composition are diverse, and especially in scoring with an emphasis on meaning, there are various methods of expression, and there is a problem that it is difficult to compare with the model answer.
  • For a given question sentence, there are many possible expressions, and it is difficult to prepare a model answer for each.
  • the word string of the sentence written by the user is input to the decoder 21, and whether the next word is suitable is calculated based on the likelihood output by the decoder 21. Since the encoder 20 captures the meaning of the question sentence, each word can be scored with an emphasis on meaning while allowing freedom of expression in the English composition. This makes it possible to score an English composition, point out poorly chosen words, present the optimum word, and so on, and thus to automate the scoring of, or feedback on, English composition.
  • the processing of the translated sentence calculation device 1 can be said to be an AI correction method for English composition.
  • the processing of the translated sentence calculation device 1 is similar to the evaluation (scoring) of a translation result, but it uses the user's composition up to a midpoint and has the translator 2 output the continuation to complete the composition. Up to that midpoint, the output of the translator 2 is replaced with the words written by the user.
  • the translated sentence calculation device 1 composes and displays the sentences after the word to be corrected by AI while respecting the user's answer as much as possible (using the user's answer as much as possible).
  • the translated text calculation device 1 may have the following configuration: a composition correction system equipped with a neural network trained for machine translation from a first language to a second language; a means of dividing a problem sentence in the first language into words and sequentially inputting the words into the encoder of the neural network; a means of dividing a part of a sentence written in the second language, from its beginning, into words and sequentially inputting the words into the decoder of the neural network; a sentence correction means that complements the continuation of the sentence input to the decoder with the output words of the decoder; and a means for calculating the score of a corrected sentence by comparing the likelihood of a word output from the decoder of the neural network with the likelihood of a word in the partially input composition.
  • the translated text calculation device 1 may have the following configuration.
  • An essay correction display screen provided with a means for correcting one or more previous words using the essay correction system when the score of the essay does not exceed a predetermined threshold value.
  • the correction result display algorithm of the translated text calculation device 1 may be the following procedure.
    (1) Divide the user's English composition into words.
    (2) The target word is a word specified by the user.
    (3) Present the correction sentence from the target word.
    (4) It is determined whether or not the reliability of the corrected sentence from the target word is equal to or higher than the threshold value.
    (5) If it is determined in (4) above that it is equal to or higher than the threshold value, a correction sentence from the target word is presented.
    (6) If it is determined in (4) above that it is less than the threshold value, the target word is set to the word immediately before the target word, and the process returns to (4) above.
  • the English sentence (created sentence) written by the user is divided into word units.
  • the user can request correction from any word onward; the translator 2 composes the sentence from the specified word (target word) onward and presents the corrected sentence (translated sentence). If the reliability of the corrected sentence produced by the translator 2 is lower than a predetermined value, the word immediately before the specified word is set as the new target word, and the translator 2 is requested to correct the sentence from that target word onward. If the reliability of this corrected sentence is higher than the predetermined value, it is presented as the corrected sentence for the entire sentence.
  • If the reliability is still not higher than the predetermined value, the translator 2 is likewise requested to correct with the preceding word as the target word. If the process goes back all the way to the first word, the entire corrected sentence consists of words generated by the translator 2, so the reliability is necessarily "1" (the highest), and the process ends there.
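The back-off procedure described above can be sketched as follows. This is an illustration under assumptions, not the implementation of the disclosure: the corrected sentence is regenerated each time the target word moves back, reliability is taken as the lowest ratio between a kept word's likelihood and the top candidate's likelihood, and `next_word`, `toy_decoder`, and all likelihood values are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for translator 2 (encoder 20 + decoder 21).
NextWordFn = Callable[[List[str]], Dict[str, float]]
EOS = "</s>"  # assumed end-of-sentence marker


def complete(next_word: NextWordFn, prefix: List[str], max_len: int = 50) -> List[str]:
    """Greedily generate the continuation that follows the kept prefix."""
    fed, continuation = list(prefix), []
    while len(fed) < max_len:
        candidates = next_word(fed)
        best = max(candidates, key=candidates.get)
        if best == EOS:
            break
        continuation.append(best)
        fed.append(best)
    return continuation


def reliability(next_word: NextWordFn, prefix: List[str]) -> float:
    """Lowest likelihood ratio among the kept words (assumption); 1.0 if none are kept."""
    ratios = []
    for i, word in enumerate(prefix):
        candidates = next_word(prefix[:i])
        ratios.append(candidates.get(word, 0.0) / max(candidates.values()))
    return min(ratios, default=1.0)


def correct_with_backoff(next_word: NextWordFn, created: List[str],
                         target_index: int, threshold: float) -> Tuple[List[str], int]:
    """Move the target word back until the corrected sentence is reliable enough."""
    while True:
        prefix = created[:target_index]              # words before the target word
        corrected = prefix + complete(next_word, prefix)
        if reliability(next_word, prefix) >= threshold or target_index == 0:
            return corrected, target_index
        target_index -= 1                            # back off by one word


# Invented toy likelihoods, indexed by how many words have been fed so far.
TOY_TABLE = [
    {"I": 0.9, "We": 0.1},
    {"went": 0.6, "walked": 0.3, "go": 0.1},
    {"to": 0.9, "at": 0.05, "the": 0.05},
    {"school": 0.9, "work": 0.1},
    {"yesterday": 0.8, "today": 0.2},
    {EOS: 1.0},
]


def toy_decoder(fed: List[str]) -> Dict[str, float]:
    return TOY_TABLE[min(len(fed), len(TOY_TABLE) - 1)]


if __name__ == "__main__":
    created = ["I", "walked", "at", "school", "today"]
    print(correct_with_backoff(toy_decoder, created, target_index=4, threshold=0.4))
```

With the toy values, the user's "at" keeps the reliability low until the target word has moved back far enough to replace it, while the acceptable "walked" is preserved.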
  • the processing algorithm of the translated text calculation device 1 may be the following procedure.
    (1) Obtain the scoring result of English composition.
    (2) For each word with a low score (wrong word), make a correction request after that word.
    (3) Set the target word as the last error word.
    (4) It is determined whether or not the score of the corrected sentence from the target word is equal to or higher than the threshold value.
    (5) If it is determined in (4) above that it is equal to or higher than the threshold value, a correction sentence from the target word is presented.
    (6) If it is determined in (4) above that it is less than the threshold value, the target word is set as the error word immediately before the target word, and the process returns to (4) above.
  • error words are picked up starting from the end of the sentence, and the sentence is corrected from the picked word onward. If the score of the corrected sentence is not higher than the specified value, the preceding error word is picked up and the correction and scoring check are performed in the same way. The check moves back through the preceding error words until the score exceeds the specified value, and the corrected sentence is presented once it meets the condition.
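The error-word variant differs from stepping back one word at a time in that only low-scoring words are tried as target words. A minimal sketch is shown below; the callables `correct_from` and `sentence_score`, the thresholds, the per-word scores, and the behaviour of returning `None` when no candidate qualifies are all hypothetical choices made for illustration.

```python
from typing import Callable, List, Optional


def correct_from_error_words(
    word_scores: List[float],
    word_threshold: float,
    correct_from: Callable[[int], List[str]],
    sentence_score: Callable[[List[str]], float],
    sentence_threshold: float,
) -> Optional[List[str]]:
    """Try each error word, last one first, until a corrected sentence scores high enough."""
    # Error words are the words whose individual score is below the word threshold.
    error_indices = [i for i, s in enumerate(word_scores) if s < word_threshold]
    for index in reversed(error_indices):        # start from the last error word
        corrected = correct_from(index)          # correct the sentence from that word onward
        if sentence_score(corrected) >= sentence_threshold:
            return corrected
    return None  # no candidate met the threshold (behaviour assumed here)


if __name__ == "__main__":
    created = ["I", "walked", "at", "school", "today"]
    reference = ["I", "went", "to", "school", "yesterday"]  # invented stub data

    def stub_correct_from(index: int) -> List[str]:
        # Stub: keep the user's words before the index, take the rest from the reference.
        return created[:index] + reference[index:]

    def stub_sentence_score(sentence: List[str]) -> float:
        # Stub: fraction of words that agree with the reference.
        return sum(a == b for a, b in zip(sentence, reference)) / len(reference)

    scores = [1.0, 0.5, 0.06, 1.0, 0.25]  # invented per-word scoring result
    print(correct_from_error_words(scores, 0.6, stub_correct_from,
                                   stub_sentence_score, 0.75))
```

In this toy run the error words are at positions 1, 2, and 4; correcting from position 4 does not reach the sentence threshold, so the procedure moves back to position 2, skipping the correctly scored word at position 3.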
  • the processing algorithm of the translated text calculation device 1 may be the following procedure.
    (1) Receive the designation of any word in the translated text.
    (2) Correct the part of the translated sentence from the designated word onward and create a corrected sentence as the correction result.
    (3) Evaluate the created corrected sentence.
    (4) If the evaluation of (3) above does not meet the predetermined criteria, the word that precedes the designated word in the translated sentence is regarded as a new designated word and the process returns to (2) above.
    (5) If the evaluation in (3) above meets the predetermined criteria, the corrected text is output.
    According to the above configuration, it is possible to obtain a corrected sentence in which the part of the translated sentence (for example, an English sentence written by the user) before the designated word is respected and whose evaluation satisfies a predetermined criterion.
  • each functional block may be realized by using one device that is physically or logically coupled, or by using two or more physically or logically separate devices that are connected directly or indirectly (for example, by wire or wirelessly).
  • the functional block may be realized by combining the software with the one device or the plurality of devices.
  • Functions include judging, deciding, determining, calculating, computing, processing, deriving, investigating, searching, confirming, receiving, transmitting, outputting, accessing, resolving, selecting, choosing, establishing, comparing, assuming, expecting, considering, broadcasting, notifying, communicating, forwarding, configuring, reconfiguring, allocating, mapping, and assigning, but are not limited to these.
  • a functional block (component) that performs transmission is called a transmitting unit or a transmitter.
  • the method of realizing each of them is not particularly limited.
  • the translated sentence calculation device 1 in the embodiment of the present disclosure may function as a computer that processes the translated sentence calculation method of the present disclosure.
  • FIG. 5 is a diagram showing an example of the hardware configuration of the translated text calculation device 1 according to the embodiment of the present disclosure.
  • the above-mentioned translated sentence calculation device 1 may be physically configured as a computer device including a processor 1001, a memory 1002, a storage 1003, a communication device 1004, an input device 1005, an output device 1006, a bus 1007, and the like.
  • the word “device” can be read as a circuit, device, unit, etc.
  • the hardware configuration of the translated text calculation device 1 may be configured to include one or more of the devices shown in the figure, or may be configured not to include some of the devices.
  • Each function of the translated text calculation device 1 is realized by loading predetermined software (programs) onto hardware such as the processor 1001 and the memory 1002, so that the processor 1001 performs calculations, controls communication by the communication device 1004, and controls at least one of reading and writing of data in the memory 1002 and the storage 1003.
  • Processor 1001 operates, for example, an operating system to control the entire computer.
  • the processor 1001 may be configured by a central processing unit (CPU: Central Processing Unit) including an interface with a peripheral device, a control device, an arithmetic unit, a register, and the like.
  • the above-mentioned input / output unit 11, the first input unit 12, the second input unit 13, the calculation unit 14, and the like may be realized by the processor 1001.
  • the processor 1001 reads a program (program code), a software module, data, etc. from at least one of the storage 1003 and the communication device 1004 into the memory 1002, and executes various processes according to these.
  • the storage unit 10 may be realized by a control program that is stored in the memory 1002 and operates in the processor 1001, and may be realized in the same manner for other functional blocks.
  • Processor 1001 may be implemented by one or more chips.
  • the program may be transmitted from the network via a telecommunication line.
  • the memory 1002 is a computer-readable recording medium, and may be composed of at least one of, for example, a ROM (Read Only Memory), an EPROM (Erasable Programmable ROM), an EEPROM (Electrically Erasable Programmable ROM), and a RAM (Random Access Memory).
  • the memory 1002 may be referred to as a register, a cache, a main memory (main storage device), or the like.
  • the memory 1002 can store a program (program code), a software module, or the like that can be executed to implement the wireless communication method according to the embodiment of the present disclosure.
  • the storage 1003 is a computer-readable recording medium, and may be composed of at least one of, for example, an optical disk such as a CD-ROM (Compact Disc ROM), a hard disk drive, a flexible disk, a magneto-optical disk (for example, a compact disk, a digital versatile disk, or a Blu-ray (registered trademark) disk), a smart card, a flash memory (for example, a card, stick, or key drive), a floppy (registered trademark) disk, a magnetic strip, and the like.
  • the storage 1003 may be referred to as an auxiliary storage device.
  • the storage medium described above may be, for example, a database, server or other suitable medium containing at least one of the memory 1002 and the storage 1003.
  • the communication device 1004 is hardware (transmission / reception device) for communicating between computers via at least one of a wired network and a wireless network, and is also referred to as, for example, a network device, a network controller, a network card, a communication module, or the like.
  • the communication device 1004 may be configured to include, for example, a high frequency switch, a duplexer, a filter, a frequency synthesizer, and the like in order to realize at least one of frequency division duplex (FDD: Frequency Division Duplex) and time division duplex (TDD: Time Division Duplex).
  • the input device 1005 is an input device (for example, a keyboard, a mouse, a microphone, a switch, a button, a sensor, etc.) that receives an input from the outside.
  • the output device 1006 is an output device (for example, a display, a speaker, an LED lamp, etc.) that outputs to the outside.
  • the input device 1005 and the output device 1006 may have an integrated configuration (for example, a touch panel).
  • each device such as the processor 1001 and the memory 1002 is connected by the bus 1007 for communicating information.
  • the bus 1007 may be configured by using a single bus, or may be configured by using a different bus for each device.
  • the translated text calculation device 1 may be configured to include hardware such as a microprocessor, a digital signal processor (DSP: Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), and an FPGA (Field Programmable Gate Array), and a part or all of each functional block may be realized by the hardware. For example, processor 1001 may be implemented using at least one of these hardware.
  • the notification of information is not limited to the mode / embodiment described in the present disclosure, and may be performed by using other methods.
  • the input / output information and the like may be stored in a specific location (for example, memory) or may be managed using a management table. Input / output information and the like can be overwritten, updated, or added. The output information and the like may be deleted. The input information or the like may be transmitted to another device.
  • the determination may be made by a value represented by 1 bit (0 or 1), by a Boolean value (true or false), or by comparing numerical values (for example, comparison with a predetermined value).
  • the notification of predetermined information (for example, the notification of "being X") is not limited to explicit notification, and may be performed implicitly (for example, by not notifying the predetermined information).
  • Software, whether referred to as software, firmware, middleware, microcode, hardware description language, or by other names, should be broadly interpreted to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executable files, execution threads, procedures, functions, and the like.
  • software, instructions, information, etc. may be transmitted and received via a transmission medium.
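  • when software is transmitted from a website, server, or other remote source using at least one of wired technology (coaxial cable, optical fiber cable, twisted pair, digital subscriber line (DSL: Digital Subscriber Line), etc.) and wireless technology (infrared, microwave, etc.), at least one of these wired and wireless technologies is included within the definition of a transmission medium.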
  • the information, signals, etc. described in this disclosure may be represented using any of a variety of different techniques.
  • data, instructions, commands, information, signals, bits, symbols, chips, etc. that may be referred to throughout the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or magnetic particles, light fields or photons, or any combination of these.
  • The terms "system" and "network" used in this disclosure are used interchangeably.
  • information, parameters, etc. described in the present disclosure may be expressed using absolute values, relative values from predetermined values, or other corresponding information.
  • The terms "judgment" and "decision" used in this disclosure may encompass a wide variety of actions.
  • "Judgment" and "decision" may include, for example, regarding judging, calculating, computing, processing, deriving, investigating, looking up (search, inquiry; for example, searching in a table, database, or another data structure), and ascertaining as having made a "judgment" or "decision".
  • "Judgment" and "decision" may also include regarding receiving (for example, receiving information), transmitting (for example, transmitting information), input, output, and accessing (for example, accessing data in memory) as having made a "judgment" or "decision".
  • "Judgment" and "decision" may further include regarding resolving, selecting, choosing, establishing, comparing, and the like as having made a "judgment" or "decision". That is, "judgment" and "decision" may include regarding some action as a "judgment" or "decision". Further, "judgment (decision)" may be read as "assuming", "expecting", "considering", and the like.
  • The terms "connected" and "coupled", and any variations thereof, mean any direct or indirect connection or coupling between two or more elements, and can include the presence of one or more intermediate elements between two elements that are "connected" or "coupled" to each other.
  • the connection or coupling between the elements may be physical, logical, or a combination thereof.
  • For example, "connection" may be read as "access".
  • As used in this disclosure, two elements can be considered to be "connected" or "coupled" to each other using at least one of one or more wires, cables, and printed electrical connections, and, as some non-limiting and non-exhaustive examples, using electromagnetic energy having wavelengths in the radio frequency, microwave, and light (both visible and invisible) regions.
  • Any reference to elements using designations such as "first" and "second" as used in this disclosure does not generally limit the quantity or order of those elements. These designations can be used in the present disclosure as a convenient way to distinguish between two or more elements. Thus, references to the first and second elements do not mean that only two elements can be adopted, or that the first element must somehow precede the second element.
  • the term "A and B are different" may mean "A and B are different from each other".
  • the term may also mean that "A and B are each different from C".
  • Terms such as “separate” and “combined” may be interpreted in the same way as “different”.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Machine Translation (AREA)
PCT/JP2021/002226 2020-03-19 2021-01-22 Translated sentence calculation device Ceased WO2021186892A1 (ja)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2022508098A JP7691411B2 (ja) 2020-03-19 2021-01-22 Translated sentence calculation device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2020049150 2020-03-19
JP2020-049150 2020-03-19

Publications (1)

Publication Number Publication Date
WO2021186892A1 true WO2021186892A1 (ja) 2021-09-23

Family

ID=77770802

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/002226 Ceased WO2021186892A1 (ja) 2020-03-19 2021-01-22 Translated sentence calculation device

Country Status (2)

Country Link
JP (1) JP7691411B2
WO (1) WO2021186892A1

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023079911A1 (ja) * 2021-11-04 2023-05-11 NTT Docomo, Inc. Sentence generation model generating device, sentence generation model, and sentence generating device

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019225154A1 (ja) * 2018-05-23 2019-11-28 NTT Docomo, Inc. Created sentence evaluation device

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019225154A1 (ja) * 2018-05-23 2019-11-28 NTT Docomo, Inc. Created sentence evaluation device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HOSEI MATSUOKA, TOSHIMITSU NAKAMURA, SOICHIRO MURAKAMI, ATSUKI SAWAYAMA: "Technology to Grade and Correct Compositions in English", NTT DOCOMO TECHNICAL JOURNAL, vol. 21, no. 4, 1 April 2020 (2020-04-01), pages 61 - 66, XP055859343 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023079911A1 (ja) * 2021-11-04 2023-05-11 NTT Docomo, Inc. Sentence generation model generating device, sentence generation model, and sentence generating device
JPWO2023079911A1 * 2021-11-04 2023-05-11

Also Published As

Publication number Publication date
JPWO2021186892A1 2021-09-23
JP7691411B2 (ja) 2025-06-11

Similar Documents

Publication Publication Date Title
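JP7222082B2 (ja) Recognition error correction device and correction model
JP7062056B2 (ja) Created sentence evaluation device
JP7103957B2 (ja) Data generation device
WO2021070819A1 (ja) Scoring model learning device, scoring model, and determination device
JP2022029273A (ja) Sentence similarity calculation device, trained model generation device, and distributed representation model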
US12190073B2 (en) Internal state modifying device
JP7691411B2 (ja) Translated sentence calculation device
US12248758B2 (en) Generation device and normalization model
WO2021199654A1 (ja) Division device
US20230401384A1 (en) Translation device
US11663420B2 (en) Dialogue system
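JP7575894B2 (ja) Created sentence evaluation device
JP7682862B2 (ja) Period deletion model learning device, period deletion model, and determination device
WO2020070943A1 (ja) Pattern recognition device and trained model
JP2020177387A (ja) Sentence output device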
US20250005647A1 (en) Recommendation device
JP6976448B2 (ja) Machine translation control device
WO2020166125A1 (ja) Translation data generation system
US5974370A (en) System for reviewing its processing and method therefor
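CN119806625A (zh) Code recommendation method, apparatus, device, and computer-readable storage medium
JP7547077B2 (ja) Sentence translation device and translation model
WO2023079911A1 (ja) Sentence generation model generating device, sentence generation model, and sentence generating device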
US20210012067A1 (en) Sentence matching system
US12333267B2 (en) Text generation model generating device, text generation model, and text generating device
US20230410795A1 (en) Information processing device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21771340

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2022508098

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21771340

Country of ref document: EP

Kind code of ref document: A1