WO2020044509A1 - Generation method, generation program, and information processing apparatus - Google Patents
Generation method, generation program, and information processing apparatus
- Publication number
- WO2020044509A1 (PCT/JP2018/032206; application JP2018032206W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- vector
- word
- information
- vector information
- words
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/58—Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/42—Data-driven translation
- G06F40/44—Statistical methods, e.g. probability models
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/247—Thesauruses; Synonyms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/51—Translation evaluation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Definitions
- the present invention relates to a generation method and the like.
- neural machine translation (NMT) has been used to translate a first language into a second language different from the first language.
- NMT: Neural Machine Translation
- RNN: Recurrent Neural Network
- the encoder is a processing unit that encodes a word included in a character string of an input sentence and assigns a vector to the encoded word.
- the RNN converts a word vector input from the encoder based on a Softmax function and outputs the converted vector.
- the decoder is a processing unit that decodes an output sentence based on a word vector output from the RNN.
- the number of words in the input/output layer used in RNN machine learning is reduced in order to reduce the amount of calculation of the Softmax function.
- the calculation of the Softmax function is performed by referring to a vector table.
- the above-described conventional technique has a problem that the data amount of vector information used for generating a conversion model cannot be reduced.
- if a word with a low appearance frequency included in the text to be translated is not registered in the vector table, the translation is not performed properly and the translation accuracy is reduced.
- an object of the present invention is to provide a generation method, a generation program, and an information processing apparatus capable of reducing the data amount of vector information used for generating a conversion model.
- the computer executes the following processing.
- the computer receives the first text information and the second text information.
- the computer extracts words whose appearance frequency is lower than a reference among words included in the first text information and words whose appearance frequency is lower than a reference among words included in the second text information.
- the computer specifies an attribute associated with the extracted word by referring to a storage unit that stores information in which one attribute is assigned to a plurality of words whose appearance frequency is lower than a reference.
- the computer refers to a storage unit that stores vector information corresponding to the attribute of the word in association with the attribute.
- the computer specifies first vector information associated with the attribute of the word extracted from the first text information and second vector information associated with the attribute of the word extracted from the second text information.
- the computer generates the conversion model by learning the parameters of the conversion model so that the vector information output when the first vector information is input to the conversion model approaches the second vector information.
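- as a concrete illustration of the vector-specification step above, the following is a minimal Python sketch; the frequencies, attribute names, vectors, and words are all hypothetical stand-ins for the storage units described here, and the point it shows is only that low-frequency synonyms resolve to one shared vector.

```python
# Minimal sketch of the vector-specification step (all data is hypothetical).
# Words below the frequency reference are mapped to a shared attribute, and
# every word in the same attribute shares one vector in the vector table.

FREQ_REFERENCE = 100

# Hypothetical appearance frequencies, e.g. counted over a general corpus.
frequency = {"he": 9000, "history": 4000, "tsuugyou": 12, "seitsuu": 8}

# One attribute is assigned to several low-frequency synonyms.
attribute_of = {"tsuugyou": "attr_versed", "seitsuu": "attr_versed"}

# Vector table: high-frequency words get unique vectors; an attribute gets
# one shared vector for all of its low-frequency members.
vector_table = {
    "he": [0.1, 0.3],
    "history": [0.7, 0.2],
    "attr_versed": [0.5, 0.9],  # shared by "tsuugyou", "seitsuu", ...
}

def specify_vector(word):
    """Return the vector for a word, falling back to its attribute vector."""
    if frequency.get(word, 0) < FREQ_REFERENCE:
        return vector_table[attribute_of[word]]
    return vector_table[word]

print(specify_vector("tsuugyou") == specify_vector("seitsuu"))  # True: shared
```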
- FIG. 1 is a diagram (1) illustrating a process performed by the information processing apparatus according to the embodiment.
- FIG. 2 is a diagram (2) illustrating a process performed by the information processing apparatus according to the embodiment.
- FIG. 3 is a diagram (3) illustrating a process performed by the information processing apparatus according to the embodiment.
- FIG. 4 is a functional block diagram illustrating the configuration of the information processing apparatus according to the embodiment.
- FIG. 5 is a diagram illustrating an example of a data structure of the first vector table according to the embodiment.
- FIG. 6 is a diagram illustrating an example of a data structure of the second vector table according to the present embodiment.
- FIG. 7 is a diagram illustrating an example of a data structure of the teacher data table according to the embodiment.
- FIG. 8 is a diagram illustrating an example of a data structure of the code conversion table according to the embodiment.
- FIG. 9 is a diagram illustrating an example of the data structure of the dictionary information according to the embodiment.
- FIG. 10 is a diagram illustrating an example of the data structure of the RNN data according to the embodiment.
- FIG. 11 is a diagram for supplementarily explaining the parameters of the intermediate layer.
- FIG. 12 is a flowchart illustrating a process in which the information processing apparatus according to the present embodiment generates RNN data.
- FIG. 13 is a flowchart illustrating a process in which the information processing apparatus according to the present embodiment translates input sentence data.
- FIG. 14 is a diagram illustrating an example of a hardware configuration of a computer that realizes the same function as the information processing apparatus according to the embodiment.
- FIGS. 1 to 3 are diagrams for explaining the processing of the information processing apparatus according to the present embodiment.
- FIG. 1 illustrates an example of a process in which the information processing apparatus assigns a vector to each word included in the input sentence.
- when an input sentence 10 is given, the information processing apparatus performs morphological analysis to divide the character string included in the input sentence 10 into words, and generates a divided input sentence 10a.
- in the divided input sentence 10a, each word is separated by a delimiter (space).
- the information processing apparatus converts each word (the code corresponding to the word) into a static code or a dynamic code based on the dictionary information 150e.
- the dictionary information 150e includes a static dictionary and a dynamic dictionary.
- the static dictionary is dictionary information that associates a static code with a word.
- the dynamic dictionary is dictionary information that holds a code (dynamic code) dynamically assigned to a word that is not included in the static dictionary.
- the information processing device converts each word of the divided input sentence 10a into a static code or a dynamic code based on each word (code) of the divided input sentence 10a and the dictionary information 150e, and generates an encoded sentence 10b. For example, static codes corresponding to the words "he", "ha", "history", "ni", and "shiteiru" are registered in the static dictionary, while the low-frequency word "tsuugyou" (to be well versed) is not registered in the static dictionary. It is assumed that a dynamic code corresponding to the word "tsuugyou" is registered in the dynamic dictionary.
- the static codes assigned to the words "he", "ha", "history", "ni", and "shiteiru" are denoted by "(he)", "(ha)", "(history)", "(ni)", and "(shiteiru)".
- the dynamic code assigned to the word "tsuugyou" is denoted by "(tsuugyou)".
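- a minimal sketch of this two-stage lookup follows; the mechanics are an assumption-level illustration, with code values chosen to echo the examples given later (6002h for "he", E005h for "tsuugyou"): a word registered in the static dictionary keeps its fixed static code, and an unregistered word is assigned a fresh dynamic code on first appearance.

```python
# Sketch of static/dynamic encoding (illustrative values, not the patent's).
static_dict = {"he": 0x6002, "ha": 0x6003, "history": 0x6004,
               "ni": 0x6005, "shiteiru": 0x6006}
dynamic_dict = {}          # filled in as unknown words are encountered
next_dynamic = 0xE005      # first free dynamic code (assumed value)

def encode(word):
    """Return the static code if registered, else assign a dynamic code."""
    global next_dynamic
    if word in static_dict:
        return static_dict[word]
    if word not in dynamic_dict:
        dynamic_dict[word] = next_dynamic
        next_dynamic += 1
    return dynamic_dict[word]

sentence = ["he", "ha", "history", "ni", "tsuugyou", "shiteiru"]
print([hex(encode(w)) for w in sentence])  # "tsuugyou" gets 0xe005
```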
- the information processing device compares each static code and dynamic code of the encoded sentence 10b with the first vector table 150a, and identifies the vector assigned to each static code and each dynamic code.
- the first vector table 150a holds a static code and a vector corresponding to the static code.
- the first vector table 150a holds a dynamic code and a vector corresponding to the dynamic code.
- the first vector table 150a classifies the dynamic codes assigned to words whose appearance frequency is lower than the reference by attribute, and assigns the same vector to each dynamic code belonging to the same attribute.
- words that are synonyms and whose appearance frequency is less than the reference (the dynamic codes of those words) are classified into the same attribute.
- for example, the vector "Vec1-1a" is assigned to the dynamic codes "(tsuugyou)", "(familiar)", and "(detailed)", which correspond to low-frequency synonymous words.
- the appearance frequency of each word is specified in advance based on general text information such as Aozora Bunko. Note that synonyms are words having different word forms but the same meaning, and the same vector can be assigned to them using a synonym dictionary or a thesaurus.
- the information processing apparatus assigns "Vec1-1" to "(he)" of the encoded sentence 10b, "Vec1-2" to "(ha)", "Vec1-3" to "(history)", "Vec1-4" to "(ni)", and "Vec1-5" to "(shiteiru)".
- the information processing device assigns "Vec1-1a" to "(tsuugyou)" in the encoded sentence 10b.
- the information processing apparatus includes an encoder 50, a recurrent neural network (RNN) 60, and a decoder 70.
- an input sentence in the first language is input to the encoder 50
- an output sentence in the second language is output from the decoder 70 via the RNN 60.
- the first language is Japanese and the second language is English, but the present invention is not limited to this.
- Vectors assigned to words in the first language are referred to as “first vectors”, and vectors assigned to words in the second language are referred to as “second vectors”.
- the encoder 50 is a processing unit that divides an input sentence into words constituting the sentence and converts them into first vectors.
- the RNN 60 is a processing unit that, when a plurality of first vectors are input, converts the plurality of first vectors into a second vector using parameters set therein.
- the parameters of the RNN 60 include a bias value and a weight of the activation function.
- the decoder 70 is a processing unit that decodes an output sentence based on each word corresponding to the second vector output from the RNN 60.
- the encoder 50 converts a plurality of words included in the input sentence 51 into compressed codes capable of uniquely identifying each word and its meaning, using a first-language code conversion table (not shown). For example, each word included in the input sentence 51 is converted into compressed codes 52-1 to 52-n.
- the encoder 50 converts the compressed codes 52-1 to 52-n into static codes or dynamic codes 53-1 to 53-n based on dictionary information (not shown) of the first language.
- the encoder 50 converts a compressed code corresponding to a high-frequency word into a static code, and converts a compressed code corresponding to a low-frequency word into a dynamic code.
- the static codes or dynamic codes 53-1 to 53-n generated by the encoder 50 are information corresponding to a local representation.
- the encoder 50 converts each static code or dynamic code into a corresponding first vector by referring to a first vector table (not shown).
- the first vector is information corresponding to a distributed representation.
- the encoder 50 outputs the converted first vectors to the RNN 60.
- the RNN 60 includes intermediate layers (hidden layers) 61-1 to 61-n and 63-1 to 63-n, and a conversion mechanism 62. Each of the intermediate layers 61-1 to 61-n and 63-1 to 63-n calculates a value based on the parameters set for itself and an input vector, and outputs the calculated value.
- the intermediate layer 61-1 receives the first vector of the static code or dynamic code 53-1, calculates a value based on the received vector and the parameters set therein, and outputs the calculated value to the conversion mechanism 62.
- the intermediate layers 61-2 to 61-n similarly receive the first vector of the corresponding static code or dynamic code, calculate a value based on the received vector and the parameters set therein, and output the calculated value to the conversion mechanism 62.
- the conversion mechanism 62 plays the role of determining which part to attend to when translating the next word, using the values input from the intermediate layers 61-1 to 61-n, the internal state of the decoder 70, and the like as its inputs. For example, the probability of focusing on the value of the intermediate layer 61-1 is 0.2, the probability of focusing on the intermediate layer 61-2 is 0.3, and so on.
- the conversion mechanism 62 calculates a weighted sum of the distributed representations by adding the values output from the intermediate layers 61-1 to 61-n, each multiplied by its attention (probability). This weighted sum is called a context vector.
- the conversion mechanism 62 inputs the context vectors to the intermediate layers 63-1 to 63-n.
- the probabilities used when calculating the context vector input to each of the intermediate layers 63-1 to 63-n are recalculated each time, so the part being attended to changes for each output.
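- in other words, the context vector is a probability-weighted average of the encoder-side values; the numpy sketch below uses invented state values and attention probabilities purely for illustration, not values produced by an actual trained RNN 60.

```python
import numpy as np

# Toy encoder-side hidden values (one row per intermediate layer 61-i).
encoder_states = np.array([[0.2, 0.5],
                           [0.8, 0.1],
                           [0.4, 0.9]])

# Attention probabilities for one output step (they sum to 1).
attn = np.array([0.2, 0.3, 0.5])

# Context vector: attention-weighted sum of the distributed representations.
context = attn @ encoder_states
print(context)  # recomputed with new probabilities at every output step
```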
- the intermediate layer 63-1 receives the context vector from the conversion mechanism 62, calculates a value based on the received context vector and the parameters set therein, and outputs the calculated value to the decoder 70.
- the intermediate layers 63-2 to 63-n similarly receive the corresponding context vectors, calculate values based on the received vectors and the parameters set therein, and output the calculated values to the decoder 70.
- the decoder 70 refers to a second vector table (not shown) for the values (second vectors) output from the intermediate layers 63-1 to 63-n, and converts the second vectors into static codes or dynamic codes 71-1 to 71-n.
- the second vector table is a table that associates a static code or a dynamic code with a second vector.
- the second vector is information corresponding to the distributed representation.
- the decoder 70 converts static codes or dynamic codes 71-1 to 71-n into compressed codes 72-1 to 72-n based on dictionary information (not shown) in the second language.
- the dictionary information of the second language is information in which a compressed code is associated with a static code or a dynamic code of the second language.
- the decoder 70 generates an output sentence 73 by converting the compressed codes 72-1 to 72-n into words in the second language using a code conversion table (not shown) in the second language.
- when learning the parameters of the RNN 60, the information processing apparatus receives a set of an input sentence in the first language and an output sentence in the second language as teacher data.
- the information processing device learns the parameters of the RNN 60 so that when the input sentence of the teacher data is input to the encoder 50, the output sentence of the teacher data is output from the decoder 70.
- FIG. 3 is a diagram for explaining processing when the information processing apparatus according to the present embodiment learns RNN parameters.
- here, an input sentence in the first language meaning "He is familiar with history" and the output sentence "He is familiar with history." are used as teacher data.
- the information processing device calculates, based on the input sentence of the teacher data, each first vector to be input to each of the intermediate layers 61-1 to 61-n of the RNN 60.
- the information processing apparatus divides the character string included in the input sentence 51a into words to generate a divided input sentence (not shown). For example, the appearance frequency of each of the words "he", "ha", "history", "ni", and "shiteiru" included in the input sentence 51a is equal to or higher than the reference, while the appearance frequency of the word "tsuugyou" is less than the reference.
- the information processing device converts the word "he" into a compressed code 52-1, and converts the compressed code 52-1 into a static code 54-1.
- the information processing device specifies the first vector of "he" based on the static code 54-1 of "he" and the first vector table; this is the first vector to be input to the intermediate layer 61-1.
- the information processing device converts the word "ha" into a compressed code 52-2, and converts the compressed code 52-2 into a static code 54-2.
- the information processing device specifies the first vector of "ha" based on the static code 54-2 of "ha" and the first vector table; this is the first vector to be input to the intermediate layer 61-2.
- the information processing device converts the word "history" into a compressed code 52-3, and converts the compressed code 52-3 into a static code 54-3.
- the information processing device specifies the first vector of "history" based on the static code 54-3 of "history" and the first vector table; this is the first vector to be input to the intermediate layer 61-3.
- the information processing device converts the word "ni" into a compressed code 52-4, and converts the compressed code 52-4 into a static code 54-4.
- the information processing device specifies the first vector of "ni" based on the static code 54-4 of "ni" and the first vector table; this is the first vector to be input to the intermediate layer 61-4.
- the information processing device converts the word "tsuugyou" into a compressed code 52-5, and converts the compressed code 52-5 into a dynamic code 54-5 because the appearance frequency of "tsuugyou" is less than the reference.
- the information processing device specifies the first vector of "tsuugyou" based on the dynamic code 54-5 of "tsuugyou" and the first vector table; this is the first vector to be input to the intermediate layer 61-5.
- the information processing device converts the word "shiteiru" into a compressed code 52-6, and converts the compressed code 52-6 into a static code 54-6.
- the information processing device specifies the first vector of "shiteiru" based on the static code 54-6 of "shiteiru" and the first vector table; this is the first vector to be input to the intermediate layer 61-6.
- the first vector assigned to "tsuugyou" is the same vector as the first vectors assigned to its synonyms (the words meaning "familiar" and "detailed") belonging to the same attribute.
- the information processing apparatus performs the following processing based on the output sentence "He is familiar with history." of the teacher data, and calculates the ideal ("optimal") second vector to be output from each of the intermediate layers 63-1 to 63-n of the RNN 60. For example, the appearance frequencies of the words "He", "is", "with", and "history" are equal to or higher than the reference, while the appearance frequency of the word "familiar" is lower than the reference.
- the information processing device divides the character string included in the output sentence 53a for each word, and generates a divided output sentence (not shown).
- the information processing device converts the word "He" into a compressed code 72-1, and converts the compressed code 72-1 into a static code 71-1.
- the information processing device specifies the second vector of "He" based on the static code 71-1 of "He" and the second vector table; this is the value of the ideal second vector to be output from the intermediate layer 63-1.
- the information processing device converts the word "is" into a compressed code 72-2, and converts the compressed code 72-2 into a static code 71-2.
- the information processing device specifies the second vector of "is" based on the static code 71-2 of "is" and the second vector table; this is the value of the ideal second vector to be output from the intermediate layer 63-2.
- the information processing device converts the word "familiar" into a compressed code 72-3, and converts the compressed code 72-3 into a dynamic code 71-3 because the appearance frequency of "familiar" is less than the reference.
- the information processing device specifies the second vector of "familiar" based on the dynamic code 71-3 of "familiar" and the second vector table; this is the value of the ideal second vector to be output from the intermediate layer 63-3.
- the information processing device converts the word "with" into a compressed code 72-4, and converts the compressed code 72-4 into a static code 71-4.
- the information processing device specifies the second vector of "with" based on the static code 71-4 of "with" and the second vector table; this is the value of the ideal second vector to be output from the intermediate layer 63-4.
- the information processing device converts the word "history" into a compressed code 72-5, and converts the compressed code 72-5 into a static code 71-5.
- the information processing device specifies the second vector of "history" based on the static code 71-5 of "history" and the second vector table; this is the value of the ideal second vector to be output from the intermediate layer 63-5.
- the information processing apparatus uses the teacher data to specify each first vector to be input to the intermediate layers 61-1 to 61-n of the RNN 60 and each ideal second vector to be output from the intermediate layers 63-1 to 63-n of the RNN 60.
- the information processing apparatus inputs each of the specified first vectors to the intermediate layers 61-1 to 61-n of the RNN 60, and performs a process of adjusting the parameters of the RNN 60 so that the second vectors output from the intermediate layers 63-1 to 63-n approach the ideal second vectors.
- the information processing apparatus of the present embodiment assigns a unique vector to high-frequency and medium-frequency words, and assigns one shared vector to low-frequency synonyms, thereby reducing the data amount. This makes it possible to reduce the data amount of the vector information used for generating the conversion model without lowering the translation accuracy.
- FIG. 4 is a functional block diagram illustrating the configuration of the information processing apparatus according to the embodiment.
- the information processing apparatus 100 includes a communication unit 110, an input unit 120, a display unit 130, a storage unit 150, and a control unit 160.
- the communication unit 110 is a processing unit that executes data communication with an external device via a network.
- the communication unit 110 is an example of a communication device.
- the information processing device 100 may be connected to an external device via a network, and may receive the teacher data table 150c and the like from the external device.
- the input unit 120 is an input device for inputting various types of information to the information processing device 100.
- the input unit 120 corresponds to a keyboard, a mouse, a touch panel, and the like.
- the display unit 130 is a display device for displaying various information output from the control unit 160.
- the display unit 130 corresponds to a liquid crystal display, a touch panel, or the like.
- the storage unit 150 has a first vector table 150a, a second vector table 150b, a teacher data table 150c, a code conversion table 150d, dictionary information 150e, and RNN data 150f.
- the storage unit 150 has input sentence data 150g and output sentence data 150h.
- the storage unit 150 corresponds to a semiconductor memory device such as a RAM (Random Access Memory), a ROM (Read Only Memory), or a flash memory, or a storage device such as an HDD (Hard Disk Drive).
- FIG. 5 is a diagram illustrating an example of the data structure of the first vector table according to the present embodiment.
- the first vector table 150a associates words (static codes and dynamic codes of words) in the first language with the first vectors.
- for example, the first vector assigned to the static code "6002h" of the word "he" in the first language is "Vec1-1".
- the same first vector is assigned to each dynamic code corresponding to a low-frequency synonym.
- Each dynamic code corresponding to a low-frequency synonym can be said to belong to the same attribute.
- for example, the first vector "Vec1-1a" is assigned to the dynamic code "E005h" of the word "tsuugyou", the dynamic code "E006h" of its synonym meaning "familiar", and the dynamic code "E007h" of its synonym meaning "detailed".
- FIG. 6 is a diagram illustrating an example of a data structure of the second vector table according to the present embodiment.
- the second vector table 150b associates words (static codes and dynamic codes of words) in the second language with the second vectors.
- for example, the second vector assigned to the static code "7073h" of the word "He" in the second language is "Vec2-1".
- a second vector is also allocated to the dynamic code "F034h (familiar)" of the low-frequency word "familiar".
- the same second vector is assigned to each dynamic code corresponding to the low-frequency synonym.
- Each dynamic code corresponding to a low-frequency synonym can be said to belong to the same attribute.
- the teacher data table 150c is a table that holds a set of an input sentence and an output sentence that is teacher data.
- FIG. 7 is a diagram illustrating an example of a data structure of the teacher data table according to the embodiment. As shown in FIG. 7, the teacher data table 150c associates input sentences with output sentences. For example, when an input sentence written in the first language meaning "He is familiar with history" is translated into the second language, the appropriate output sentence is "He is familiar with history.".
- the code conversion table 150d is a table that associates words with compressed codes.
- FIG. 8 is a diagram illustrating an example of a data structure of the code conversion table according to the embodiment. As shown in FIG. 8, the code conversion table 150d has a table 151a and a table 151b.
- the table 151a associates words in the first language with compressed codes. For example, the word "he" is associated with the compressed code "C101".
- the table 151b associates words in the second language with compressed codes. For example, the word "He" is associated with the compressed code "C201". Note that one compressed code may be assigned to a collocation consisting of a plurality of words. In the example shown in FIG. 8, the compressed code "C205" is associated with the word "familiar".
- the dictionary information 150e is a table that associates compressed codes with static codes and dynamic codes.
- FIG. 9 is a diagram illustrating an example of the data structure of the dictionary information according to the embodiment. As shown in FIG. 9, the dictionary information 150e has a table 152a, a table 152b, a table 153a, and a table 153b.
- the table 152a is a static dictionary that associates compressed codes of words in the first language with static codes.
- for example, the compressed code "C101" is associated with the static code "6002h (he)".
- the table 152b is a dynamic dictionary that associates compressed codes of words in the first language with dynamic codes. As shown in FIG. 9, the table 152b associates a dynamic code with a pointer to a compressed code. For example, a unique dynamic code is assigned to a compressed code that does not hit the compressed code in the table 152a, and is set as the dynamic code in the table 152b.
- the compressed code to which the dynamic code is assigned is stored in a storage area (not shown), and a pointer to the stored position is set in the table 152b.
- for example, the dynamic code "E005h (tsuugyou)" is assigned to the compressed code "C105" and is set in the table 152b.
- the compression code “C105” is stored in a storage area (not shown), and a pointer to a position where the compression code “C105” is stored is set in the table 152b.
- the table 153a is a static dictionary that associates compressed codes of words in the second language with static codes.
- the compressed code “C201” is associated with the static code “7073h (He ⁇ )”.
- the table 153b is a dynamic dictionary that associates compressed codes of words in the second language with dynamic codes. As shown in FIG. 9, the table 153b associates a dynamic code with a pointer to a compressed code. For example, a unique dynamic code is assigned to a compressed code that does not hit any compressed code in the table 153a, and is set as the dynamic code in the table 153b.
- the compressed code to which the dynamic code is assigned is stored in a storage area (not shown), and a pointer to the stored position is set in the table 153b.
- the dynamic code “F034h (familiar)” is assigned to the compression code “C203” and set in the table 153b.
- the compression code “C203” is stored in a storage area (not shown), and a pointer to a position where the compression code “C203” is stored is set in the table 153b.
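- a small sketch of this kind of dynamic-dictionary registration follows, under assumed values (the byte buffer, the offset-based pointer, and the starting code 0xF034 echoing "F034h" are illustrative choices, not the patent's exact layout): the dictionary row holds the dynamic code together with a pointer into a separate storage area holding the registered compressed code.

```python
# Sketch of a dynamic dictionary like tables 152b/153b (assumed layout):
# the compressed-code bytes live in a separate buffer, and the dictionary
# row stores the dynamic code plus a pointer (offset) into that buffer.

buffer = bytearray()          # storage area for registered compressed codes
dynamic_rows = {}             # dynamic code -> (pointer, length)
code_of = {}                  # compressed code -> dynamic code
next_code = 0xF034            # first free dynamic code (assumed value)

def register(compressed: bytes) -> int:
    """Assign a dynamic code to a compressed code on first registration."""
    global next_code
    if compressed in code_of:
        return code_of[compressed]
    pointer = len(buffer)
    buffer.extend(compressed)             # store the compressed code itself
    dynamic_rows[next_code] = (pointer, len(compressed))
    code_of[compressed] = next_code
    next_code += 1
    return code_of[compressed]

print(hex(register(b"C203")))  # 0xf034; re-registering returns the same code
```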
- the RNN data 150f is a table that holds the parameters and the like set in each intermediate layer of the RNN 60 described with reference to FIGS. 2 and 3.
- FIG. 10 is a diagram illustrating an example of the data structure of the RNN data according to the embodiment. As shown in FIG. 10, the RNN data 150f associates RNN identification information with parameters.
- the RNN identification information is information for uniquely identifying the intermediate layer of the RNN 60.
- the parameter indicates a parameter set for the corresponding intermediate layer. The parameter corresponds to a bias value, a weight, and the like of the activation function set for the intermediate layer.
- FIG. 11 is a diagram for supplementarily explaining the parameters of the intermediate layer.
- the model in FIG. 11 has an input layer "x", an intermediate layer (hidden layer) "h", and an output layer "y".
- the intermediate layer "h" corresponds to the intermediate layers 61-1 to 61-n and 63-1 to 63-n shown in FIG. 2.
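- as a supplementary illustration of how one intermediate layer computes "a value based on the parameters set for itself and an input vector", here is a minimal recurrent-cell sketch; the weight matrices, bias, and dimensions are arbitrary stand-ins rather than the patent's actual parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
dim_in, dim_h = 4, 3
W = rng.normal(size=(dim_h, dim_in))  # input weight (a learned parameter)
U = rng.normal(size=(dim_h, dim_h))   # recurrent weight (a learned parameter)
b = np.zeros(dim_h)                   # bias value (a learned parameter)

def hidden_step(x, h_prev):
    """One intermediate-layer update: activation of weighted input + state."""
    return np.tanh(W @ x + U @ h_prev + b)

h = np.zeros(dim_h)
for x in rng.normal(size=(5, dim_in)):  # five input vectors in sequence
    h = hidden_step(x, h)
print(h)
```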
- the input sentence data 150g is data of an input sentence to be translated.
- the output sentence data 150h is data obtained by translating the input sentence data 150g.
- the control unit 160 includes a receiving unit 160a, a vector specifying unit 160b, a generating unit 160c, and a translating unit 160d.
- the control unit 160 can be realized by a CPU (Central Processing Unit), an MPU (Micro Processing Unit), or the like.
- the control unit 160 can also be realized by hard wired logic such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array).
- the processing of the encoder 50, the RNN 60, and the decoder 70 described with reference to FIGS. 2 and 3 is realized by the control unit 160.
- the vector specifying unit 160b, the generation unit 160c, and the translation unit 160d are examples of a generation processing unit.
- the information processing apparatus 100 learns RNN data 150f serving as a parameter of the RNN 60.
- at this time, the receiving unit 160a, the vector specifying unit 160b, and the generating unit 160c among the processing units of the control unit 160 operate.
- the receiving unit 160a is a processing unit that receives the teacher data table 150c from an external device via a network.
- the receiving unit 160a stores the received teacher data table 150c in the storage unit 150.
- the receiving unit 160a may receive the teacher data table 150c from the input unit 120.
- the vector specifying unit 160b is a processing unit that specifies a first vector to be assigned to each word of the input sentence of the teacher data table 150c and a second vector to be assigned to each word of the output sentence.
- the vector specifying unit 160b outputs information on the first vector and the second vector to the generating unit 160c.
- for a word in the input sentence whose appearance frequency is less than the reference, the vector specifying unit 160b identifies the attribute associated with the word, and identifies the first vector assigned to the identified attribute.
- for a word in the output sentence whose appearance frequency is less than the reference, the vector specifying unit 160b identifies the attribute associated with the word, and identifies the second vector assigned to the identified attribute.
- the vector specifying unit 160b performs a process of converting to a compressed code, a process of converting to a static code or a dynamic code, and a process of specifying a vector.
- the vector specifying unit 160b obtains the information of the input sentence from the teacher data table 150c, performs morphological analysis on the input sentence to divide the character string included in the input sentence into words, and generates a divided input sentence.
- the vector specifying unit 160b compares each word included in the divided input sentence with the table 151a of the code conversion table 150d, and converts each word into a compressed code. For example, the vector specifying unit 160b converts the word "he" into a compressed code "C101".
- the vector specifying unit 160b obtains the information of the output sentence from the teacher data table 150c, performs morphological analysis on the output sentence to divide the character string included in the output sentence into words, and generates a divided output sentence.
- the vector specifying unit 160b compares each word included in the divided output sentence with the table 151b of the code conversion table 150d, and converts each word into a compressed code. For example, the vector specifying unit 160b converts the word "He" into the compressed code "C201".
- the vector specifying unit 160b compares each compressed code converted from the divided input sentence with a table (static dictionary) 152a.
- the vector specifying unit 160b converts a compressed code that hits the compressed code in the table 152a among the compressed codes of the divided input sentence into a static code.
- a static code generated from the words of the divided input sentence is referred to as a “first static code”.
- the vector specifying unit 160b converts a compressed code that does not hit the compressed code in the table 152a among the compressed codes of the divided input sentence into a dynamic code.
- the vector specifying unit 160b compares the compressed code with a table (dynamic dictionary) 152b, and converts a compressed code already registered in the table 152b into a dynamic code registered in the table 152b.
- when the compressed code is not registered in the table 152b, the vector specifying unit 160b generates a dynamic code, registers the dynamic code in the table 152b, and converts the compressed code into the registered dynamic code.
- the dynamic code generated from the words of the divided input sentence is referred to as “first dynamic code”.
- the vector specifying unit 160b compares each compressed code converted from the divided output sentence with a table (static dictionary) 153a.
- the vector specifying unit 160b converts a compressed code that hits the compressed code in the table 153a among the compressed codes of the divided output sentence into a static code.
- a static code generated from the words of the divided output sentence is referred to as a “second static code”.
- the vector specifying unit 160b converts a compressed code that does not hit the compressed code in the table 153a among the compressed codes of the divided output sentence into a dynamic code.
- the vector specifying unit 160b compares the compressed code with a table (dynamic dictionary) 153b, and converts a compressed code already registered in the table 153b into a dynamic code registered in the table 153b.
- when the compressed code is not registered in the table 153b, the vector specifying unit 160b generates a dynamic code, registers the dynamic code in the table 153b, and converts the compressed code into the registered dynamic code.
- the dynamic code generated from the words of the divided output sentence is referred to as “second dynamic code”.
- the vector specifying unit 160b compares the first static code with the first vector table 150a, and specifies a first vector corresponding to the first static code. Further, the vector specifying unit 160b compares the first dynamic code with the first vector table 150a, and specifies the first vector corresponding to the attribute to which the first dynamic code belongs.
- a unique first vector is specified for each first static code.
- on the other hand, for each first dynamic code belonging to the same attribute, one first vector assigned to the attribute is specified.
- the vector specifying unit 160b compares the second static code with the second vector table 150b, and specifies a second vector corresponding to the second static code. Further, the vector specifying unit 160b compares the second dynamic code with the second vector table 150b, and specifies the second vector corresponding to the attribute to which the second dynamic code belongs. Here, a unique second vector is specified for each second static code. On the other hand, for each second dynamic code belonging to the same attribute, one second vector assigned to the attribute is specified.
- the vector specifying unit 160b generates the first vector corresponding to each word of the input sentence and the second vector corresponding to each word of the output sentence by executing the above processing.
- the vector specifying unit 160b outputs the generated information on the first vector and the second vector to the generating unit 160c.
- the generation unit 160c is a processing unit that generates a conversion model by learning parameters of the conversion model based on the first vector and the second vector specified by the vector specification unit 160b.
- the learning of the parameters is performed by the following processing, and the learned parameters are registered in the RNN data 150f.
- the RNN 60 that calculates a value based on the parameters of the RNN data 150f corresponds to a conversion model.
- the generating unit 160c inputs each first vector to the intermediate layers 61-1 to 61-n of the RNN 60 using the parameters of each intermediate layer registered in the RNN data 150f, and calculates each vector output from the intermediate layers 63-1 to 63-n.
- the generation unit 160c learns parameters of each intermediate layer registered in the RNN data 150f such that each vector output from the intermediate layers 63-1 to 63-n of the RNN 60 approaches each second vector.
- for example, the generation unit 160c may perform the learning by using a cost function that defines the difference between each vector output from the intermediate layers 63-1 to 63-n and each second vector, and adjusting the parameters of each intermediate layer so that the difference is minimized.
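- one concrete (assumed) choice for such a cost function is the mean squared difference between the output vectors and the ideal second vectors; the sketch below minimizes it by gradient descent for a single stand-in linear layer, which is a simplification of adjusting every intermediate layer of the RNN 60.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 4))   # first vectors (toy stand-ins)
Y = rng.normal(size=(6, 3))   # ideal second vectors (toy stand-ins)
W = np.zeros((4, 3))          # parameters of one stand-in layer

for _ in range(500):
    out = X @ W                       # vectors the model currently outputs
    grad = X.T @ (out - Y) / len(X)   # gradient of the squared-error cost
    W -= 0.1 * grad                   # adjust parameters to shrink the cost

print(np.mean((X @ W - Y) ** 2))  # cost is (near) minimized
```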
- the information processing apparatus 100 uses the learned RNN data 150f (the generated conversion model) to generate output sentence data obtained by translating the input sentence data.
- at this time, among the processing units of the control unit 160, the receiving unit 160a, the vector specifying unit 160b, and the translating unit 160d operate.
- the receiving unit 160a receives the input sentence data 150g from an external device via a network.
- the receiving unit 160a stores the received input sentence data 150g in the storage unit 150.
- the vector specifying unit 160b specifies the first vector corresponding to each word of the input sentence included in the input sentence data 150g. When a word whose appearance frequency is lower than the reference is included, the vector specifying unit 160b specifies the attribute associated with the word whose appearance frequency is lower than the reference, and specifies the first vector assigned to the specified attribute. The vector specifying unit 160b outputs information of the first vector specified based on the input sentence data 150g to the translating unit 160d.
- the process in which the vector specifying unit 160b specifies the first vector of the input sentence of the input sentence data 150g is the same as the process of specifying the first vector of the input sentence in the teacher data table 150c.
- the translation unit 160d inputs each first vector to the respective intermediate layers 61-1 to 61-n of the RNN 60 using the parameters of the respective intermediate layers registered in the RNN data 150f.
- the translating unit 160d converts each first vector into each second vector by obtaining each second vector output from the intermediate layers 63-1 to 63-n of the RNN 60.
- the translating unit 160d generates output sentence data 150h using each second vector converted from each first vector.
- the translation unit 160d compares each second vector with the second vector table 150b, and specifies a static code and a dynamic code corresponding to each second vector.
- the translation unit 160d specifies words corresponding to the static code and the dynamic code, respectively, based on the static code and the dynamic code, the dictionary information 150e, and the code conversion table 150d.
- the translation unit 160d generates the output sentence data 150h by arranging the specified words, and stores the output sentence data 150h in the storage unit 150.
- the translating unit 160d may notify an external device of the output sentence data 150h, or may output the output sentence data 150h to the display unit 130 for display.
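- the comparison of each output second vector with the second vector table 150b can be implemented as a nearest-neighbor search; in the sketch below the vectors are invented, while the codes follow the examples given above (7073h for "He", F034h for "familiar").

```python
import numpy as np

# Hypothetical second vector table: code -> vector, plus code -> word.
second_vectors = {0x7073: np.array([0.9, 0.1]),   # "He"
                  0xF034: np.array([0.2, 0.8])}   # "familiar"
word_of_code = {0x7073: "He", 0xF034: "familiar"}

def decode(vec):
    """Pick the code whose table vector is closest to the RNN output."""
    best = min(second_vectors,
               key=lambda c: np.linalg.norm(second_vectors[c] - vec))
    return word_of_code[best]

print(decode(np.array([0.85, 0.15])))  # "He"
```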
- FIG. 12 is a flowchart illustrating a process in which the information processing apparatus according to the present embodiment generates RNN data.
- the receiving unit 160a of the information processing device 100 receives a teacher data table 150c from an external device (Step S101).
- the vector specifying unit 160b of the information processing apparatus 100 acquires teacher data from the teacher data table 150c (Step S102).
- the vector specifying unit 160b assigns a compression code to each word included in the input sentence (Step S103).
- the vector specifying unit 160b assigns a static code or a dynamic code to each compressed code (Step S104).
- the vector specifying unit 160b specifies each first vector corresponding to each static code based on the first vector table 150a (Step S105).
- the vector specifying unit 160b specifies the attribute of the dynamic code based on the first vector table 150a, and specifies the first vector corresponding to the attribute (Step S106).
- the vector specifying unit 160b allocates a compression code to each word included in the output sentence (Step S107).
- the vector specifying unit 160b assigns a static code or a dynamic code to each compressed code (Step S108).
- the vector specifying unit 160b specifies a second vector corresponding to each static code based on the second vector table 150b (Step S109).
- the vector specifying unit 160b specifies the attribute of the dynamic code based on the second vector table 150b, and specifies the second vector corresponding to the attribute (Step S110).
- the generation unit 160c of the information processing apparatus 100 inputs each first vector to each intermediate layer, and adjusts parameters so that each vector output from each intermediate layer of the RNN approaches each second vector (step S111).
- the information processing apparatus 100 determines whether to continue learning (step S112). If the information processing device 100 does not continue the learning (No at Step S112), the information processing device 100 ends the process. When continuing the learning (Yes at Step S112), the information processing apparatus 100 proceeds to Step S113.
- the vector specifying unit 160b acquires new teacher data from the teacher data table 150c (Step S113), and proceeds to Step S103.
- FIG. 13 is a flowchart illustrating a process in which the information processing apparatus according to the present embodiment translates input sentence data.
- the receiving unit 160a of the information processing device 100 receives input sentence data 150g from an external device (Step S201).
- the vector specifying unit 160b of the information processing apparatus 100 assigns a compression code to each word included in the input sentence data 150g (Step S202).
- the vector specifying unit 160b assigns a static code or a dynamic code to each compressed code based on the dictionary information 150e (Step S203).
- the vector specifying unit 160b specifies each first vector corresponding to each static code with reference to the first vector table 150a (Step S204).
- the vector specifying unit 160b specifies the first vector corresponding to the attribute of the dynamic code with reference to the first vector table 150a (Step S205).
- the translation unit 160d of the information processing apparatus 100 inputs each first vector to each intermediate layer of the RNN, and acquires each second vector output from each intermediate layer (Step S206).
- the translating unit 160d converts each second vector into a static code and a dynamic code with reference to the second vector table 150b (Step S207).
- the translating unit 160d converts the static code and the dynamic code into a compressed code based on the dictionary information 150e (Step S208).
- the translation unit 160d converts the compressed code into words based on the code conversion table 150d, and generates output sentence data 150h (Step S209).
- the translation unit 160d notifies the external device of the output sentence data 150h (Step S210).
- in the present embodiment, the case where both the input sentence and the output sentence serving as the teacher data include a low-frequency word has been described, but the present invention is not limited to this; a conversion model (RNN data 150f) can be generated even when only one of them includes a low-frequency word.
- the information processing apparatus 100 assigns a unique vector to each word included in the input sentence whose appearance frequency is equal to or higher than the reference. On the other hand, a word whose appearance frequency is less than the reference is assigned the same vector as its other synonyms.
- the information processing apparatus 100 can generate an appropriate output sentence by inputting the vector assigned to each word of the input sentence by the above processing to the RNN 60 and using the vector output from the RNN 60.
- the information processing device assigns one vector to each attribute of low-frequency words. This makes it possible to reduce the amount of data in the vector table while easily classifying low-frequency words by attribute.
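- as a back-of-the-envelope illustration of this reduction, consider the estimate below; every number in it (vocabulary sizes, synonym-group size, vector width) is an assumption for illustration and is not taken from the patent.

```python
# Rough size estimate for the vector table (all numbers are assumptions).
dim, bytes_per_float = 200, 4
high_freq_words = 50_000          # each keeps a unique vector
low_freq_words = 950_000          # grouped into synonym attributes
words_per_attribute = 10          # assumed average synonym-group size

baseline = (high_freq_words + low_freq_words) * dim * bytes_per_float
grouped = (high_freq_words + low_freq_words // words_per_attribute) \
          * dim * bytes_per_float
print(f"{baseline / 1e9:.2f} GB -> {grouped / 1e9:.2f} GB")  # 0.80 -> 0.12
```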
- FIG. 14 is a diagram illustrating an example of a hardware configuration of a computer that realizes the same function as the information processing apparatus according to the embodiment.
- the computer 200 includes a CPU 201 that executes various arithmetic processing, an input device 202 that receives input of data from a user, and a display 203. Further, the computer 200 includes a reading device 204 that reads a program or the like from a storage medium, and an interface device 205 that exchanges data with an external device or the like via a wired or wireless network.
- the computer 200 includes a RAM 206 for temporarily storing various information, and a hard disk device 207. Each of the devices 201 to 207 is connected to the bus 208.
- the hard disk device 207 has a reception program 207a, a vector identification program 207b, a generation program 207c, and a translation program 207d.
- the CPU 201 reads out the reception program 207a, the vector identification program 207b, the generation program 207c, and the translation program 207d and expands them on the RAM 206.
- the receiving program 207a functions as the receiving process 206a.
- the vector specifying program 207b functions as a vector specifying process 206b.
- the generation program 207c functions as a generation process 206c.
- the translation program 207d functions as the translation process 206d.
- the processing of the reception process 206a corresponds to the processing of the reception unit 160a.
- the processing of the vector specifying process 206b corresponds to the processing of the vector specifying unit 160b.
- the processing of the generation process 206c corresponds to the processing of the generation unit 160c.
- the processing of the translation process 206d corresponds to the processing of the translation unit 160d.
- each program does not necessarily have to be stored in the hard disk device 207 from the beginning.
- for example, each program may be stored in a "portable physical medium" such as a flexible disk (FD), a CD-ROM, a DVD, a magneto-optical disk, or an IC card inserted into the computer 200, and the computer 200 may read out and execute each of the programs 207a to 207d.
- REFERENCE SIGNS LIST 100 information processing device 110 communication unit 120 input unit 130 display unit 150 storage unit 150a first vector table 150b second vector table 150c teacher data table 150d code conversion table 150e dictionary information 150f RNN data 150g input sentence data 150h output sentence data 160 control Unit 160a receiving unit 160b vector specifying unit 160c generating unit 160d translating unit
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Software Systems (AREA)
- Data Mining & Analysis (AREA)
- Computing Systems (AREA)
- Evolutionary Computation (AREA)
- Mathematical Physics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Biophysics (AREA)
- Probability & Statistics with Applications (AREA)
- Molecular Biology (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Machine Translation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Claims (12)
- 1. A generation method in which a computer executes processing of: receiving first text information and second text information; extracting a word whose appearance frequency is less than a reference among words included in the first text information and a word whose appearance frequency is less than the reference among words included in the second text information; specifying an attribute associated with each extracted word by referring to a storage unit that stores information in which one attribute is assigned to a plurality of words whose appearance frequency is less than the reference; specifying, by referring to a storage unit that stores vector information corresponding to an attribute of a word in association with the attribute, first vector information associated with the attribute of the word extracted from the first text information and second vector information associated with the attribute of the word extracted from the second text information; and generating a conversion model by learning parameters of the conversion model so that vector information output when the first vector information is input to the conversion model approaches the second vector information.
- 2. The generation method according to claim 1, wherein the receiving receives third text information, the extracting extracts a word included in the third text information, the storage unit stores vector information corresponding to words whose appearance frequency is equal to or higher than the reference, the specifying refers to the storage unit to specify third vector information associated with the word included in the third text information, and the generating generates the conversion model by learning the parameters of the conversion model so that the vector information output when the first vector information is input to the conversion model approaches the third vector information.
- 3. The generation method according to claim 1, wherein the receiving receives text information to be translated, the extracting extracts a plurality of words included in the text information to be translated, the specifying refers to the storage unit to specify, for a word whose appearance frequency is less than the reference among the plurality of words, vector information associated with the attribute of that word and, for a word whose appearance frequency is equal to or higher than the reference, vector information associated with that word, and the method further executes processing of generating text information based on a plurality of pieces of vector information output from the conversion model by inputting the plurality of pieces of specified vector information to the conversion model.
- 4. The generation method according to claim 1, 2 or 3, wherein the storage unit associates one piece of vector information with synonyms whose appearance frequency is less than the reference.
- 5. A generation program causing a computer to execute processing of: receiving first text information and second text information; extracting a word whose appearance frequency is less than a reference among words included in the first text information and a word whose appearance frequency is less than the reference among words included in the second text information; specifying an attribute associated with each extracted word by referring to a storage unit that stores information in which one attribute is assigned to a plurality of words whose appearance frequency is less than the reference; specifying, by referring to a storage unit that stores vector information corresponding to an attribute of a word in association with the attribute, first vector information associated with the attribute of the word extracted from the first text information and second vector information associated with the attribute of the word extracted from the second text information; and generating a conversion model by learning parameters of the conversion model so that vector information output when the first vector information is input to the conversion model approaches the second vector information.
- 6. The generation program according to claim 5, wherein the receiving receives third text information, the extracting extracts a word included in the third text information, the storage unit stores vector information corresponding to words whose appearance frequency is equal to or higher than the reference, the specifying refers to the storage unit to specify third vector information associated with the word included in the third text information, and the generating generates the conversion model by learning the parameters of the conversion model so that the vector information output when the first vector information is input to the conversion model approaches the third vector information.
- 7. The generation program according to claim 5, wherein the receiving receives text information to be translated, the extracting extracts a plurality of words included in the text information to be translated, the specifying refers to the storage unit to specify, for a word whose appearance frequency is less than the reference among the plurality of words, vector information associated with the attribute of that word and, for a word whose appearance frequency is equal to or higher than the reference, vector information associated with that word, and the program further causes the computer to execute processing of generating text information based on a plurality of pieces of vector information output from the conversion model by inputting the plurality of pieces of specified vector information to the conversion model.
- 8. The generation program according to claim 5, 6 or 7, wherein the storage unit associates one piece of vector information with synonyms whose appearance frequency is less than the reference.
- 9. An information processing apparatus comprising: a receiving unit that receives first text information and second text information; and a generation processing unit that extracts a word whose appearance frequency is less than a reference among words included in the first text information and a word whose appearance frequency is less than the reference among words included in the second text information, specifies an attribute associated with each extracted word by referring to a storage unit that stores information in which one attribute is assigned to a plurality of words whose appearance frequency is less than the reference, specifies, by referring to a storage unit that stores vector information corresponding to an attribute of a word in association with the attribute, first vector information associated with the attribute of the word extracted from the first text information and second vector information associated with the attribute of the word extracted from the second text information, and generates a conversion model by learning parameters of the conversion model so that vector information output when the first vector information is input to the conversion model approaches the second vector information.
- 10. The information processing apparatus according to claim 9, wherein the receiving unit receives third text information, the storage unit stores vector information corresponding to words whose appearance frequency is equal to or higher than the reference, and the generation processing unit extracts a word included in the third text information, refers to the storage unit to specify third vector information associated with the word included in the third text information, and generates the conversion model by learning the parameters of the conversion model so that the vector information output when the first vector information is input to the conversion model approaches the third vector information.
- 11. The information processing apparatus according to claim 9, wherein the receiving unit receives text information to be translated, and the generation processing unit extracts a plurality of words included in the text information to be translated, refers to the storage unit to specify, for a word whose appearance frequency is less than the reference among the plurality of words, vector information associated with the attribute of that word and, for a word whose appearance frequency is equal to or higher than the reference, vector information associated with that word, and further executes processing of generating text information based on a plurality of pieces of vector information output from the conversion model by inputting the plurality of pieces of specified vector information to the conversion model.
- 12. The information processing apparatus according to claim 9, 10 or 11, wherein the storage unit associates one piece of vector information with synonyms whose appearance frequency is less than the reference.
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP18932179.7A EP3846070A4 (en) | 2018-08-30 | 2018-08-30 | GENERATION PROCESS AND PROGRAM, AND INFORMATION PROCESSING DEVICE |
AU2018438250A AU2018438250B2 (en) | 2018-08-30 | 2018-08-30 | Generating method, generating program, and information processing apparatus |
PCT/JP2018/032206 WO2020044509A1 (ja) | 2018-08-30 | 2018-08-30 | Generation method, generation program, and information processing apparatus |
JP2020539961A JP7173149B2 (ja) | 2018-08-30 | 2018-08-30 | Generation method, generation program, and information processing apparatus |
US17/178,877 US20210192152A1 (en) | 2018-08-30 | 2021-02-18 | Generating method, non-transitory computer readable recording medium, and information processing apparatus |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2018/032206 WO2020044509A1 (ja) | 2018-08-30 | 2018-08-30 | Generation method, generation program, and information processing apparatus |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/178,877 Continuation US20210192152A1 (en) | 2018-08-30 | 2021-02-18 | Generating method, non-transitory computer readable recording medium, and information processing apparatus |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020044509A1 true WO2020044509A1 (ja) | 2020-03-05 |
Family
ID=69643992
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2018/032206 WO2020044509A1 (ja) | 2018-08-30 | 2018-08-30 | 生成方法、生成プログラムおよび情報処理装置 |
Country Status (5)
Country | Link |
---|---|
US (1) | US20210192152A1 (ja) |
EP (1) | EP3846070A4 (ja) |
JP (1) | JP7173149B2 (ja) |
AU (1) | AU2018438250B2 (ja) |
WO (1) | WO2020044509A1 (ja) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2022041130A (ja) * | 2020-08-31 | 2022-03-11 | Yahoo Japan Corporation | Information processing device, information processing method, and information processing program |
JP2022041089A (ja) * | 2020-08-31 | 2022-03-11 | Yahoo Japan Corporation | Information processing device, information processing method, and information processing program |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109271497B (zh) * | 2018-08-31 | 2021-10-26 | South China University of Technology | Event-driven service matching method based on word vectors |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2005135217A (ja) | 2003-10-31 | 2005-05-26 | Advanced Telecommunication Research Institute International | Bilingual pair extraction device and computer program therefor |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5835893A (en) * | 1996-02-15 | 1998-11-10 | Atr Interpreting Telecommunications Research Labs | Class-based word clustering for speech recognition using a three-level balanced hierarchical similarity |
JP6641857B2 (ja) * | 2015-10-05 | 2020-02-05 | Fujitsu Limited | Encoding program, encoding method, encoding device, decoding program, decoding method, and decoding device |
KR20190022439A (ko) * | 2016-06-30 | 2019-03-06 | Panasonic IP Management Co., Ltd. | Information processing device, information processing method for time-series data, and program |
CN107870901B (zh) * | 2016-09-27 | 2023-05-12 | Panasonic Intellectual Property Management Co., Ltd. | Method, recording medium, device, and system for generating a similar sentence from a translation source sentence |
CN107357775A (zh) * | 2017-06-05 | 2017-11-17 | Baidu Online Network Technology (Beijing) Co., Ltd. | Text error correction method and device using an artificial-intelligence recurrent neural network |
CN107292528A (zh) * | 2017-06-30 | 2017-10-24 | Alibaba Group Holding Ltd. | Vehicle insurance risk prediction method, device, and server |
KR102449842B1 (ko) * | 2017-11-30 | 2022-09-30 | Samsung Electronics Co., Ltd. | Language model training method and device using the same |
-
2018
- 2018-08-30 AU AU2018438250A patent/AU2018438250B2/en active Active
- 2018-08-30 JP JP2020539961A patent/JP7173149B2/ja active Active
- 2018-08-30 EP EP18932179.7A patent/EP3846070A4/en not_active Withdrawn
- 2018-08-30 WO PCT/JP2018/032206 patent/WO2020044509A1/ja unknown
-
2021
- 2021-02-18 US US17/178,877 patent/US20210192152A1/en not_active Abandoned
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2005135217A (ja) | 2003-10-31 | 2005-05-26 | Advanced Telecommunication Research Institute International | Bilingual pair extraction device and computer program therefor |
Non-Patent Citations (3)
Title |
---|
KOMATSU, HIROYA ET AL.: "Use of Distributed Word Representation in shift- reduce-Type Syntactic Analysis", IPSJ SIG TECHNICAL REPORT, vol. 2015, no. 2015-SLP-106, 18 May 2015 (2015-05-18), pages 1 - 8, XP009525678 * |
MASUDA, TAKASHI ET AL.: "Use of word class information in neural network Japanese - English Machine Translation", PROCEEDINGS OF THE 22ND ANNUAL MEETING OF THE NATURAL LANGUAGE PROCESSING SOCIETY, 29 February 2016 (2016-02-29), pages 294 - 297, XP009525171 * |
See also references of EP3846070A4 |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2022041130A (ja) * | 2020-08-31 | 2022-03-11 | Yahoo Japan Corporation | Information processing device, information processing method, and information processing program |
JP2022041089A (ja) * | 2020-08-31 | 2022-03-11 | Yahoo Japan Corporation | Information processing device, information processing method, and information processing program |
JP7239531B2 (ja) | 2020-08-31 | 2023-03-14 | Yahoo Japan Corporation | Information processing device, information processing method, and information processing program |
JP7280227B2 (ja) | 2020-08-31 | 2023-05-23 | Yahoo Japan Corporation | Information processing device, information processing method, and information processing program |
Also Published As
Publication number | Publication date |
---|---|
JP7173149B2 (ja) | 2022-11-16 |
AU2018438250B2 (en) | 2022-04-14 |
EP3846070A1 (en) | 2021-07-07 |
JPWO2020044509A1 (ja) | 2021-08-10 |
EP3846070A4 (en) | 2021-09-08 |
AU2018438250A1 (en) | 2021-03-18 |
US20210192152A1 (en) | 2021-06-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11657799B2 (en) | Pre-training with alignments for recurrent neural network transducer based end-to-end speech recognition | |
JP6493866B2 (ja) | Information processing device, information processing method, and program | |
WO2020044509A1 (ja) | Generation method, generation program, and information processing device | |
KR20220007160A (ko) | Large-scale multilingual speech recognition using a streaming end-to-end model | |
WO2019024050A1 (en) | CORRECTION OF GRAMMAR ERRORS BASED ON DEEP CONTEXT AND USING ARTIFICIAL NEURAL NETWORKS | |
US20090192781A1 (en) | System and method of providing machine translation from a source language to a target language | |
KR101326354B1 (ko) | Character conversion processing device, recording medium, and method | |
JP2008203469A (ja) | Speech recognition device and method | |
US20210233510A1 (en) | Language-agnostic Multilingual Modeling Using Effective Script Normalization | |
CN113948060A (zh) | Network training method, data processing method, and related device | |
US11893813B2 (en) | Electronic device and control method therefor | |
JP7367754B2 (ja) | Identification method and information processing device | |
JP7121791B2 (ja) | Language generation method, device, and electronic apparatus | |
CN112836523B (zh) | Word translation method, device, equipment, and readable storage medium | |
JP7230915B2 (ja) | Learning method, translation method, learning program, translation program, and information processing device | |
JP2019215660A (ja) | Processing program, processing method, and information processing device | |
WO2020021609A1 (ja) | Generation method, generation program, and information processing device | |
US20210097242A1 (en) | Electronic device and controlling method of electronic device | |
JPWO2018066083A1 (ja) | Learning program, information processing device, and learning method | |
JP2019012455A (ja) | Word-sense vector generation program, word-sense vector generation method, and word-sense vector generation device | |
US11120222B2 (en) | Non-transitory computer readable recording medium, identification method, generation method, and information processing device | |
JP7435740B2 (ja) | Speech recognition device, control method, and program | |
US20240161747A1 (en) | Electronic device including text to speech model and method for controlling the same | |
JP2007072663A (ja) | Example-based translation device and example-based translation method | |
JP2023181109A (ja) | Model training method, device, and computer-readable storage medium | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
ENP | Entry into the national phase |
Ref document number: 2020539961 Country of ref document: JP Kind code of ref document: A |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
ENP | Entry into the national phase |
Ref document number: 2018438250 Country of ref document: AU Date of ref document: 20180830 Kind code of ref document: A |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 18932179 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2018932179 Country of ref document: EP Effective date: 20210330 |