WO2023079911A1 - Sentence generation model generation device, sentence generation model, and sentence generation device - Google Patents
- Publication number
- WO2023079911A1 (application PCT/JP2022/037899)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sentence
- data
- input
- output
- word
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/40—Processing or translation of natural language
- G06F40/42—Data-driven translation
- G06F40/44—Statistical methods, e.g. probability models
Definitions
- the present invention relates to a sentence generation model generation device, a sentence generation model, and a sentence generation device.
- Patent Literature 1 discloses a technique of generating a document corresponding to an input document using a machine learning model.
- Ordinary translation engines and scoring engines, which consist of models trained on input sentences and their parallel translations, output parallel translations that correspond in semantic content, so they could not be made to output parallel translations that use specific expressions. For example, when a machine-learned model is applied to a correction and scoring engine, even if the parallel translation input by the user uses the desired specific expression correctly, an error in a part other than the specific expression could cause the translation to be corrected on the basis of a parallel translation unrelated to the specific expression.
- the present invention has been made in view of the above problems, and it is an object of the present invention to obtain an output sentence in a second language using specific expressions in response to an input sentence in the first language.
- A sentence generation model generation device is a device that generates, by machine learning, a sentence generation model that generates an output sentence in a second language different from a first language in response to an input sentence in the first language. The sentence generation model is an encoder-decoder model that includes a neural network and is composed of an encoder and a decoder.
- The learning data used for machine learning of the sentence generation model includes first data, constraint data, and second data. The first data includes an array of a plurality of words forming the input sentence, and the second data includes an array of a plurality of words forming the output sentence corresponding to the input sentence.
- The sentence generation model generation device includes: an encoder input unit that inputs the first data to the encoder according to the arrangement order of the words; a decoder input unit that inputs the constraint data, a start symbol, and the words constituting the second data to the decoder according to the arrangement order; an updating unit that updates the weighting coefficients constituting the encoder and the decoder based on the error for each word between the array of words output from the decoder after the input of the start symbol and the array of words included in the second data; and a model output unit that outputs the sentence generation model with the weighting coefficients updated by the updating unit.
- The sentence generation model is composed of an encoder-decoder model including an encoder and a decoder. In training a sentence generation model in which first data corresponding to an input sentence is input to the encoder and second data corresponding to an output sentence is input to the decoder, constraint data including one or more constraint words identified from the sequence of words forming the output sentence is input to the decoder along with the second data. The arrangement order of the constraint words in the constraint data maintains their arrangement order in that word sequence. By inputting to the decoder constraint data that specifies, as constraint words, the words constituting a desired specific expression, the sentence generation model learns the relationship between the constraint data and the second data. As a result, a sentence generation model can be obtained that outputs an output sentence using a specific expression consisting of the constraint words contained in the constraint data.
- FIG. 1 is a block diagram showing the functional configuration of the sentence generation model generation device of this embodiment.
- FIG. 2 is a block diagram showing the functional configuration of the sentence generation device.
- FIG. 3 is a hardware block diagram of the sentence generation model generation device and the sentence generation device.
- FIG. 4 is a diagram showing the structure of the sentence generation model.
- It is a figure which shows an example of the production
- FIG. 7 is a diagram for explaining the schematic configuration of a Transformer, which is an example of an encoder-decoder model. FIG. 8 is a diagram schematically showing sentence generation processing by the sentence generation model.
- FIG. 13 is a diagram showing the configuration of the sentence generation model generation program.
- FIG. 14 is a diagram showing the configuration of the sentence generation program.
- the sentence generation model of the present embodiment is constructed by machine learning to cause a computer to function and generate an output sentence in a second language different from the first language in response to an input sentence in the first language.
- the sentence generation model includes a neural network and is composed of an encoder-decoder model including an encoder and a decoder.
- the sentence generation model generation device of this embodiment is a device that generates a sentence generation model by machine learning.
- a sentence generation device is a device that generates an output sentence in a second language according to an input sentence in a first language using a sentence generation model constructed by machine learning.
- FIG. 1 is a diagram showing the functional configuration of the sentence generative model generation device according to this embodiment.
- the sentence generation model generation device 10 is a device that generates, by machine learning, a sentence generation model that generates an output sentence in a second language different from the first language according to an input sentence in the first language.
- the sentence generative model generation device 10 functionally includes a constraint data generation unit 11 , an encoder input unit 12 , a decoder input unit 13 , an update unit 14 and a model output unit 15 .
- Each of these functional units 11 to 15 may be configured in one device, or may be configured by being distributed in a plurality of devices.
- the sentence generation model generation device 10 is configured to be able to access storage means such as the model storage unit 30 and the corpus storage unit 40 .
- The model storage unit 30 and the corpus storage unit 40 may be configured within the sentence generation model generation device 10 or, as shown in FIG. 1, may be configured as separate devices that the sentence generation model generation device 10 can access.
- the model storage unit 30 is storage means that stores sentence generation models such as those that have been learned or that are in the process of learning, and can be composed of storage, memory, and the like.
- The corpus storage unit 40 is storage means for storing learning data used for machine learning of the sentence generation model and a corpus for generating the learning data.
- FIG. 2 is a diagram showing the functional configuration of the sentence generation device according to this embodiment.
- the sentence generation device 20 is a device that uses a sentence generation model built by machine learning to generate an output sentence in a second language different from the first language in response to an input sentence in the first language.
- the sentence generation device 20 functionally includes an input unit 21, a constraint data input unit 22, a word input unit 23, and an output unit 24.
- the sentence generation device 20 may further include a created sentence acquisition unit 25 , a created sentence input unit 26 and a created sentence evaluation unit 27 .
- Each of these functional units 21 to 27 may be configured in one device, or may be configured by being distributed in a plurality of devices.
- the sentence generation device 20 is configured to be able to access the model storage unit 30 that stores learned sentence generation models.
- the model storage unit 30 may be configured within the sentence generation device 20, or may be configured in another external device.
- In this embodiment, the sentence generation model generation device 10 and the sentence generation device 20 are configured as different devices (computers), but they may be integrated into one device.
- Each functional block may be implemented using one device that is physically or logically coupled, or using two or more physically or logically separated devices that are connected directly or indirectly (for example, by wire or wirelessly).
- a functional block may be implemented by combining software in the one device or the plurality of devices.
- Functions include, but are not limited to, judging, deciding, determining, calculating, computing, processing, deriving, investigating, searching, checking, receiving, transmitting, outputting, accessing, resolving, selecting, choosing, establishing, comparing, assuming, expecting, deeming, broadcasting, notifying, communicating, forwarding, configuring, reconfiguring, allocating, mapping, and assigning.
- a functional block (component) that performs transmission is called a transmitting unit or transmitter.
- the implementation method is not particularly limited.
- the sentence generative model generation device 10 and the sentence generation device 20 in one embodiment of the present invention may function as computers.
- FIG. 3 is a diagram showing an example of the hardware configuration of the sentence generative model generation device 10 and the sentence generation device 20 according to this embodiment.
- the sentence generation model generation device 10 and the sentence generation device 20 are each physically configured as a computer device including a processor 1001, a memory 1002, a storage 1003, a communication device 1004, an input device 1005, an output device 1006, a bus 1007, and the like.
- The term "apparatus" can be read as a circuit, device, unit, or the like.
- The hardware configuration of the sentence generation model generation device 10 and the sentence generation device 20 may include one or more of each device shown in the figure, or may omit some of the devices.
- Each function of the sentence generation model generation device 10 and the sentence generation device 20 is realized by loading predetermined software (a program) onto hardware such as the processor 1001 and the memory 1002, so that the processor 1001 performs computation and controls communication by the communication device 1004 and the reading and/or writing of data in the memory 1002 and the storage 1003.
- The processor 1001, for example, runs an operating system and controls the entire computer.
- the processor 1001 may be configured with a central processing unit (CPU) including an interface with peripheral devices, a control device, an arithmetic device, registers, and the like.
- the functional units 11 to 15 and 21 to 27 shown in FIGS. 1 and 2 may be realized by the processor 1001.
- the processor 1001 also reads programs (program codes), software modules and data from the storage 1003 and/or the communication device 1004 to the memory 1002, and executes various processes according to them.
- As the program, a program that causes a computer to execute at least part of the operations described in the above embodiments is used.
- the functional units 11 to 15 and 21 to 27 of the sentence generation model generation device 10 and the sentence generation device 20 may be stored in the memory 1002 and implemented by a control program running on the processor 1001 .
- Although the above-described various processes have been described as being executed by one processor 1001, they may be executed by two or more processors 1001 simultaneously or sequentially.
- Processor 1001 may be implemented with one or more chips.
- the program may be transmitted from a network via an electric communication line.
- The memory 1002 is a computer-readable recording medium and may be composed of at least one of, for example, ROM (Read Only Memory), EPROM (Erasable Programmable ROM), EEPROM (Electrically Erasable Programmable ROM), and RAM (Random Access Memory).
- the memory 1002 may also be called a register, cache, main memory (main storage device), or the like.
- the memory 1002 can store executable programs (program codes), software modules, etc. for implementing the sentence generation model generation method and the sentence generation method according to an embodiment of the present invention.
- The storage 1003 is a computer-readable recording medium and may be composed of at least one of, for example, an optical disc such as a CD-ROM (Compact Disc ROM), a hard disk drive, a flexible disc, a magneto-optical disc (for example, a compact disc, a digital versatile disc, or a Blu-ray disc), a smart card, a flash memory (for example, a card, stick, or key drive), a floppy disk, and a magnetic strip.
- Storage 1003 may also be called an auxiliary storage device.
- the storage medium described above may be, for example, a database, server, or other suitable medium including memory 1002 and/or storage 1003 .
- the communication device 1004 is hardware (transmitting/receiving device) for communicating between computers via a wired and/or wireless network, and is also called a network device, network controller, network card, communication module, etc., for example.
- the input device 1005 is an input device (for example, keyboard, mouse, microphone, switch, button, sensor, etc.) that receives input from the outside.
- the output device 1006 is an output device (eg, display, speaker, LED lamp, etc.) that outputs to the outside. Note that the input device 1005 and the output device 1006 may be integrated (for example, a touch panel).
- Each device such as the processor 1001 and the memory 1002 is connected by a bus 1007 for communicating information.
- the bus 1007 may be composed of a single bus, or may be composed of different buses between devices.
- The sentence generation model generation device 10 and the sentence generation device 20 may include hardware such as a microprocessor, a digital signal processor (DSP), an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), and an FPGA (Field Programmable Gate Array), and the hardware may implement part or all of each functional block.
- processor 1001 may be implemented with at least one of these hardware.
- FIG. 4 is a diagram showing the configuration of the sentence generation model of this embodiment.
- the sentence generation model MD is an encoder-decoder model including a neural network and composed of an encoder en and a decoder de.
- a neural network that configures the encoder-decoder model is not limited, but is, for example, a recurrent neural network (RNN).
- the sentence generation model MD may be a neural network called a transformer.
- the learning data used for machine learning of the sentence generation model MD of this embodiment includes first data a, second data b, and constraint data c.
- the first data a includes an arrangement of a plurality of words forming an input sentence in the first language.
- the second data b includes an arrangement of a plurality of words forming an output sentence in the second language corresponding to the input sentence.
- the output sentence is, for example, a parallel translation of the input sentence.
- the constraint data c is data containing one or more constraint words, which are words specified from the arrangement of words forming the second data b.
- the arrangement order of the constraint words in the constraint data c maintains the arrangement order of the words in the second data b.
- the first data a that constitutes the input sentence in the first language is input to the encoder en.
- the first data a is divided into words by, for example, morphological analysis. Each divided word is converted (embedded) into a corresponding word vector and input to the encoder en according to the arrangement order in the first data a (input sentence).
- the encoder en outputs to the decoder de a vector indicating the result of calculation based on the first data a (for example, the output of the hidden layer, the source target attention, etc.).
- the decoder sequentially outputs a sequence of words based on the input of a vector from the encoder and a predetermined start symbol (vector) that indicates the start of the output.
- the constraint data c is input to the decoder de of the sentence generation model MD of the present embodiment before the start symbol ss is input.
- the decoder de outputs a sequence of words (vectors) of the output sentence t based on the output from the encoder en, the constraint data c and the input of the start symbol ss.
- the output sentence t is composed of the sequence of words output up to that point.
- The second data b, which corresponds to the output sentence (the parallel translation in the second language) for the first data a (the input sentence), is input to the decoder de word by word, according to the arrangement order, after the start symbol ss is input.
- The constraint data c is data that includes the constraint words, which are words specified from the arrangement of words forming the second data b, while maintaining the arrangement order of the words in the second data b.
- The generation of the constraint data c will be described in detail later, but the constraint data c may be data consisting of an array of constraint words and replacement symbols, in which words or word strings other than the constraint words in the array of words forming the output sentence are replaced with predetermined replacement symbols.
- the functional units of the sentence generative model generation device 10 will be described with reference to FIG. 1 again.
- The constraint data generation unit 11 generates constraint data based on the corpus. Generation of the constraint data, and of learning data including the constraint data, will be described with reference to FIGS. 5 and 6.
- FIG. 5 is a diagram showing an example of generating first data, second data, and constraint data based on a corpus.
- the constraint data generation unit 11 acquires the corpus cp0 from the corpus storage unit 40, for example.
- the corpus cp0 consists of a first sentence cp01 written in the first language and a second sentence cp02 written in the second language.
- the first sentence cp01 is the Japanese sentence "He studies English so that he can speak to foreigners (kare ha gaikokujin to hanaseruyoninarutameni eigo wo benkyo suru)".
- the second sentence cp02 is a sentence "He studies English so that he can talk with foreigners.”
- the constraint data generation unit 11 identifies the constraint word cx from the array of words forming the second sentence cp02.
- the specification of the constraint word may be based on, for example, a specified input by a user or the like. For example, a word that constitutes an expression that should be used in the second sentence cp02 as a parallel translation of the first sentence cp01 may be specified as a constraint word by specifying input. Also, the specification of the constraint word may be performed at random. In the example shown in FIG. 5, three words or word strings of "He", "so that", and "talk with” are specified as the constraint word cx.
- Based on the identified constraint word cx, the constraint data generation unit 11 generates constraint data c01 that includes the constraint word cx while maintaining the order of words in the second sentence. As shown in FIG. 5, the constraint data c01 is data containing "He", "so that", and "talk with" while maintaining their arrangement order in the second sentence cp02.
- the constraint data may be data consisting of an arrangement of constraint words and replacement symbols, in which words or word strings other than the constraint words in the arrangement of words forming the output sentence are replaced with predetermined replacement symbols.
- The constraint data generation unit 11 replaces the words or word strings other than those specified as the constraint word cx in the word array forming the second sentence cp02 with the replacement symbol rs, and generates constraint data c01 consisting of an array of the constraint words cx and the replacement symbols rs. In the example shown in FIG. 5, the replacement symbol rs is "* (asterisk)", and the constraint data generation unit 11 generates the constraint data c01 "He * so that * talk with *", in which the three constraint words cx "He", "so that", and "talk with" and the replacement symbols rs are arranged while maintaining the arrangement order in the second sentence cp02.
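The replacement step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the patent's implementation: the function name `make_constraint_data`, the whitespace tokenization, and the case-sensitive matching are all assumptions made for the example.

```python
from typing import List, Sequence

REPLACEMENT = "*"  # replacement symbol rs, as in the FIG. 5 example


def make_constraint_data(output_sentence: str,
                         constraint_words: Sequence[str],
                         replacement: str = REPLACEMENT) -> str:
    """Keep constraint words in sentence order; collapse every other run to one symbol."""
    tokens: List[str] = output_sentence.split()  # assumption: whitespace tokenization
    keep = [False] * len(tokens)
    search_from = 0
    for phrase in constraint_words:
        words = phrase.split()
        # Search only to the right of the previous match so that order is preserved.
        for i in range(search_from, len(tokens) - len(words) + 1):
            if tokens[i:i + len(words)] == words:  # case-sensitive match
                keep[i:i + len(words)] = [True] * len(words)
                search_from = i + len(words)
                break
        else:
            raise ValueError(f"constraint {phrase!r} not found in order")
    result: List[str] = []
    for token, kept in zip(tokens, keep):
        if kept:
            result.append(token)
        elif not result or result[-1] != replacement:
            result.append(replacement)  # one symbol per run of replaced words
    return " ".join(result)


constraint = make_constraint_data(
    "He studies English so that he can talk with foreigners.",
    ["He", "so that", "talk with"],
)
print(constraint)  # He * so that * talk with *
```

Because matching is case-sensitive in this sketch, the lowercase "he" in "so that he can" is not mistaken for the constraint word "He".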
- the constraint data generation unit 11 generates the first data a01 and the second data b01 in the learning data based on the first sentence cp01 and the second sentence cp02, respectively.
- As a result, learning data consisting of the first data a01, the constraint data c01, the start symbol ss, and the second data b01 is generated.
- the constraint data generation unit 11 may include information indicating the relationship with the second data b01 in the constraint data.
- the constraint data generator 11 includes a symbol cl01 in the constraint data c01 indicating that the constraint data c01 contains the constraint word cx to be used in the second data b01 (output sentence).
- the constraint data can be easily generated based on the corpus, thus preventing an increase in cost for obtaining learning data including the constraint data.
- FIG. 6 is a diagram showing an example of the first data, second data, and constraint data used for learning the sentence generation model.
- the learning data for the sentence generation model MD may include, as first data, arbitrary symbols, which are predetermined symbols having no linguistic meaning and content, instead of the arrangement of a plurality of words forming the input sentence.
- The learning data includes first data consisting of an arbitrary symbol a03 having no semantic content, constraint data c03 containing constraint words specified as expressions to be used in the output sentence in the second language, and second data b03 consisting of an output sentence in the second language.
- Based on a corpus of sentence examples in the second language, the constraint data generation unit 11 may generate learning data by generating constraint data c03 that includes the constraint words specified from the arrangement of the words forming a sentence example, in the same manner as in the example of FIG. 5, extracting the sentence example as the second data b03, and adding the arbitrary symbol a03.
- With such learning data, the decoder can learn the relationship between the constraint data and the second data. It is therefore possible to expand the learning data at low cost and to improve the accuracy with which the decoder produces the desired output sentence.
- the encoder input unit 12 inputs the first data a to the encoder en according to the arrangement order of the words.
- the decoder input unit 13 inputs the constraint data c, the start symbol ss, which is a predetermined symbol indicating the start of output of the output sentence, and the second data b to the decoder de word by word according to the arrangement order.
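The order of the decoder inputs described above (constraint data, then the start symbol, then the second data) can be illustrated with a small Python sketch. The spelling of the start symbol, the end marker `</s>`, the placeholder `<ignore>`, and the function name are assumptions introduced for the example; the text itself does not describe an end marker.

```python
from typing import List, Tuple

START = "<s>"         # start symbol ss (this spelling is an assumption)
END = "</s>"          # hypothetical end marker, not described in the text
IGNORE = "<ignore>"   # hypothetical placeholder where no target word exists


def build_decoder_example(constraint: List[str],
                          second_data: List[str]
                          ) -> Tuple[List[str], List[str], List[int]]:
    """Return the decoder input, the per-position target word, and a 0/1 error mask."""
    decoder_input = constraint + [START] + second_data
    # Targets are shifted by one: the start symbol should predict the first word.
    targets = [IGNORE] * len(constraint) + second_data + [END]
    # Only positions from the start symbol onward contribute to the error.
    error_mask = [0] * len(constraint) + [1] * (len(second_data) + 1)
    return decoder_input, targets, error_mask


constraint_c = "He * so that * talk with *".split()
second_b = "He studies English so that he can talk with foreigners .".split()
dec_in, dec_targets, dec_mask = build_decoder_example(constraint_c, second_b)
print(dec_in)
print(dec_targets)
print(dec_mask)
```

The mask reflects the statement that the error is taken over the words output after the start symbol is input, so the constraint positions are excluded.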
- The updating unit 14 updates the weighting factors that make up the encoder en and the decoder de based on the error for each word between the word array output from the decoder de after the input of the start symbol ss and the word array included in the second data b.
- The encoder input unit 12 sequentially inputs the word vectors of the words that make up the first data a into the input layer of the RNN that makes up the encoder en, according to the word order. The output of the hidden layer of the encoder en based on the input of the last word vector of the first data a is output to the decoder de.
- the decoder input unit 13 sequentially inputs the word vectors of the words that make up the constraint data c into the input layer of the RNN that makes up the decoder de according to the word order. Further, the decoder input unit 13 sequentially inputs the start symbol ss and the second data b to the decoder de according to word order. When the start symbol ss is input to the decoder de, the decoder de sequentially outputs the sequence of word vectors of the output sentence t together with the likelihood (for example, by the softmax function).
- The updating unit 14 calculates an error for each word between the word sequence output from the decoder de and the word sequence of the second data b, and updates the weighting factors that constitute the neural networks of the encoder en and the decoder de by, for example, the error backpropagation method.
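One common way to express a per-word error of this kind is the negative log-likelihood of the reference word at each output step. The sketch below assumes that choice; the text only says "error for each word" and does not name a loss function. The 0/1 mask that excludes positions before the start symbol, and all names, are likewise assumptions for illustration.

```python
import math
from typing import Dict, List


def masked_word_errors(step_probs: List[Dict[str, float]],
                       targets: List[str],
                       error_mask: List[int]) -> List[float]:
    """Negative log-likelihood of each target word, for masked-in positions only."""
    errors: List[float] = []
    for probs, target, use in zip(step_probs, targets, error_mask):
        if use:
            # A small floor avoids log(0) when the target got no probability mass.
            errors.append(-math.log(max(probs.get(target, 0.0), 1e-12)))
    return errors


# Toy decoder outputs: one probability table per decoder position.
toy_probs = [
    {"anything": 1.0},                    # constraint position, masked out
    {"He": 0.5, "She": 0.5},              # output after the start symbol
    {"studies": 0.25, "learns": 0.75},    # output after "He"
]
toy_targets = ["<ignore>", "He", "studies"]
toy_mask = [0, 1, 1]

word_errors = masked_word_errors(toy_probs, toy_targets, toy_mask)
mean_error = sum(word_errors) / len(word_errors)
print(word_errors, mean_error)
```

In an actual training loop the mean error would be backpropagated to update the encoder and decoder weights; that step depends on the framework used and is omitted here.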
- FIG. 7 is a diagram for explaining the schematic configuration of a transformer, which is an example of an encoder-decoder model.
- When the sentence generation model MD1 (MD) is composed of a Transformer, the encoder input unit 12 inputs the word vectors aw11, aw12, ..., aw1n (n is an integer equal to or greater than 2) of the first data to the input layer ila of the encoder en1 according to the arrangement order of the words.
- Transformers allow parallel processing of incoming data rather than sequential word entry as in RNNs.
- the encoder en1 calculates the self-attention sa1 from the input layer ila to the middle layer mla, and converts the word vector into a vector corresponding to the self-attention sa1. Similarly, the self-attention sa2 from the middle layer mla to the output layer ola is calculated and the word vector is further transformed. Further, the source-target attention ta for the input layer ilb of the decoder de1 from the output layer ola of the encoder en1 is calculated.
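The text does not give a formula for self-attention. As a rough illustration only, the sketch below uses the widely known scaled dot-product form and, as a deliberate simplification, omits the learned query, key, and value projections that a real Transformer layer would apply. All names are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors: List[Vector]) -> List[Vector]:
    """Re-express each word vector as an attention-weighted mix of all word vectors."""
    dim = len(vectors[0])
    outputs: List[Vector] = []
    for query in vectors:
        # Similarity of this word to every word in the sequence, scaled by sqrt(dim).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        weights = softmax(scores)
        outputs.append([sum(w * value[d] for w, value in zip(weights, vectors))
                        for d in range(dim)])
    return outputs


word_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(word_vectors)
print(attended)
```

Every position attends to every other position in a single pass, which is why the whole word sequence can be processed in parallel rather than one word at a time.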
- In the learning phase, the decoder input unit 13 inputs the word vectors cw11, ... of the constraint data, the start symbol ss, and the word vectors ..., bw1n (where n is an integer equal to or greater than 2) of the second data b1 in parallel to the input layer ilb of the decoder de1, according to the arrangement order of the words.
- the self-attention sa3 from the input layer ilb to the intermediate layer mlb is calculated, and the vector is converted according to the self-attention sa3.
- the self-attention sa4 for the output layer olb is calculated from the intermediate layer mlb, and vector conversion is performed according to the self-attention sa4.
- t1n (where n is an integer equal to or greater than 2) based on the word vector wv output after the input of the start symbol ss, and the words constituting the second data b1.
- the model output unit 15 outputs the sentence generation model MD obtained after machine learning based on the required amount of learning data.
- the model output unit 15 may cause the model storage unit 30 to store the sentence generation model MD.
- FIG. 8 is a diagram schematically showing sentence generation processing by the sentence generation model.
- the sentence generation model MD2 is a model learned and constructed by the sentence generation model generation device 10.
- the sentence generation model MD2 includes an encoder en2 and a decoder de2.
- The sentence generation model MD (MD1, MD2), which is a model including a trained neural network, can be regarded as a program that is read or referred to by a computer and causes the computer to execute predetermined processing and realize predetermined functions.
- the trained sentence generation models MD (MD1, MD2) of this embodiment are used in a computer having a processor and memory.
- In accordance with instructions from the trained sentence generation model MD (MD1, MD2) stored in the memory, the processor of the computer operates to perform, on the input data input to the input layer of the neural network, calculations based on the learned weighting coefficients (parameters) and functions corresponding to each layer, and to output results (likelihoods) from the output layer.
- The input unit 21 inputs the words aw21, aw22, ... forming the input sentence to the encoder en2 according to their arrangement order.
- the encoder en2 outputs the calculation result to the decoder de2.
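- The constraint data input unit 22 inputs the input constraint data c2, consisting of the symbol ct2, the words cw21 to cw24, ..., to the decoder de2 according to the arrangement order.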
- the input constraint data c2 is data containing an input constraint word arbitrarily specified as a word to be used in an output sentence.
- the input constraint words are included in the input constraint data c2 while maintaining the arrangement order of the words in the output sentence.
- the identification of the input constraint word may be based on, for example, a specified input by a user or the like.
- the input constraint data c2 is data consisting of an array of input constraint words and replacement symbols in which words or word strings other than the input constraint words in the word array constituting the output sentence are replaced with predetermined replacement symbols rs.
- For example, the input constraint data c2 consists of input constraint words or word strings such as "He" and "so that" and the replacement symbol rs "* (asterisk)", arranged in order.
- the input constraint data c2 may include the symbol ct2.
- Symbol ct2 indicates, for example, that input constraint data c2 is data containing an input constraint word to be used in output sentence t2.
- the word input unit 23 inputs the start symbol ss to the decoder de2 after the input of the input constraint data c2.
- the decoder de2 outputs a word tw21 at the beginning of the output sentence t2 according to the start symbol ss.
- the word input unit 23 sequentially inputs the words output from the decoder de2 in the previous stage to the decoder de2.
- The decoder de2 sequentially outputs a series of words tw21, tw22, ... of the output sentence t2.
- The output unit 24 arranges the words tw21, tw22, ... sequentially output from the decoder de2 to generate the output sentence t2. Then, the output unit 24 outputs the generated output sentence t2.
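The generation procedure described above (input constraint data first, then the start symbol, then each previously output word fed back in) can be sketched as a simple loop. The sketch assumes greedy selection of the most likely word and a hypothetical end marker `</s>`; `next_word_probs` stands in for the trained encoder and decoder, and `toy_model` is a stub that exists only to make the example runnable.

```python
from typing import Callable, Dict, List

START = "<s>"   # start symbol ss (this spelling is an assumption)
END = "</s>"    # hypothetical end marker, not described in the text

NextWordFn = Callable[[List[str], List[str]], Dict[str, float]]


def generate(input_words: List[str],
             input_constraint: List[str],
             next_word_probs: NextWordFn,
             max_len: int = 50) -> List[str]:
    """Feed constraint data, then the start symbol, then each output word back in."""
    decoder_input = list(input_constraint) + [START]
    output: List[str] = []
    for _ in range(max_len):
        probs = next_word_probs(input_words, decoder_input)
        word = max(probs, key=probs.get)  # greedy choice of the most likely word
        if word == END:
            break
        output.append(word)
        decoder_input.append(word)        # previous output becomes the next input
    return output


def toy_model(input_words: List[str], decoder_input: List[str]) -> Dict[str, float]:
    """Stub standing in for a trained model: always continues one fixed sentence."""
    reference = ["He", "studies", "English", END]
    produced = len(decoder_input) - decoder_input.index(START) - 1
    return {reference[produced]: 0.9, "the": 0.1}


generated = generate(["kare", "ha", "eigo", "wo", "benkyo", "suru"],
                     ["He", "*"], toy_model)
print(" ".join(generated))  # He studies English
```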
- the form of output of the output sentence t2 is not limited, but may be, for example, storage in a predetermined storage means, display on a display means, output by voice, or the like.
- FIG. 9 is a diagram showing an example of input constraint data and an output sentence that can be output based on the input constraint data.
- The input sentence "He studies English so that he can speak to foreigners (kare ha gaikokujin to hanaseruyoninarutameni eigo wo benkyo suru)" is input to the encoder en2 as input data a2. If this input sentence is translated by an ordinary translation engine without any constraints, an output sentence such as "He studies English to become able to speak with foreigners." is output, for example.
- different output sentences can be output according to the input constraint data input to the decoder de2.
- FIG. 10 is a diagram showing processing for evaluating a created sentence in the evaluation system configured by the sentence generation device 20. As shown in FIG.
- At each stage after the start symbol ss is input, the decoder de3 outputs, for each word, a likelihood indicating how likely that word is to be output as one of the words tw31, tw32, ..., tw3n (where n is an integer of 2 or more) forming the output sentence t3.
- the evaluation system configured by the sentence generation device 20 evaluates the created sentence created and input by the user with the output sentence t3 as the correct answer.
- the user inputs, for example, a parallel translation of the input sentence in the second language as a created sentence.
- the created sentence acquisition unit 25 acquires the created sentence r3 that was created in the second language by the user and entered into the evaluation system.
- the created sentence r3 consists of an array of words rw31, rw32, .
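- The created sentence input unit 26 sequentially inputs the words rw31, rw32, ... constituting the created sentence r3 to the decoder de3, in place of the words tw31, tw32, ... output from the decoder de3 at the previous stage.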
- the decoder de3 outputs the likelihood of each word of the entire vocabulary handled by the sentence generation model generation device 10 and the sentence generation device 20 at each output stage.
- the output sentence t3 is constructed by arranging the words with the highest likelihood at each output stage.
- From the likelihoods of each vocabulary word output in response to the input of the start symbol ss and of the words rw31, rw32, ..., rw3n at the previous stages, the created sentence evaluation unit 27 acquires the likelihood associated with each word rw31, rw32, ..., rw3n of the created sentence r3.
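- The created sentence evaluation unit 27 compares the likelihood of each word tw31, tw32, ... of the output sentence t3 with the likelihood of each word rw31, rw32, ... of the created sentence r3, and thereby calculates and outputs the evaluation value of the created sentence r3.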
- The method of calculating the evaluation value is not limited. It may be based on, for example, the ratio of the likelihoods for each word in the sentences t3 and r3, and the sum or average of the likelihoods for each of the sentences t3 and r3.
- The created sentence is evaluated based on a comparison between the likelihood of each word forming the output sentence and the likelihood of each word obtained by sequentially inputting each word forming the created and input sentence into the decoder. This makes it possible to configure an evaluation system that evaluates the likelihood of a created sentence as a parallel translation corresponding to an input sentence.
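The likelihood comparison can be sketched as follows. This is only an illustration under stated assumptions: the decoder is replaced by a hypothetical lookup table `TABLE` keyed on the last fed word, `<s>` stands in for the start symbol ss, and the evaluation value is taken to be the ratio of average likelihoods, which is just one of the options the text allows.

```python
from typing import Callable, Dict, List, Sequence

START = "<s>"  # stands in for the start symbol ss

# A decoder step maps the words fed so far to a likelihood per vocabulary word.
StepFn = Callable[[Sequence[str]], Dict[str, float]]


def forced_likelihoods(step: StepFn, words: Sequence[str]) -> List[float]:
    """Feed `words` one stage at a time and read off the likelihood of each."""
    fed: List[str] = [START]
    likelihoods: List[float] = []
    for word in words:
        distribution = step(fed)          # likelihoods over the vocabulary
        likelihoods.append(distribution.get(word, 0.0))
        fed.append(word)                  # the given word, not the decoder's own output
    return likelihoods


def evaluation_value(output_lik: Sequence[float], created_lik: Sequence[float]) -> float:
    """Ratio of average likelihoods, created sentence over output sentence."""
    if not output_lik or not created_lik:
        raise ValueError("both sentences must contain at least one word")
    avg_output = sum(output_lik) / len(output_lik)
    avg_created = sum(created_lik) / len(created_lik)
    return avg_created / avg_output


# Hypothetical toy decoder: likelihoods depend only on the last fed word.
TABLE: Dict[str, Dict[str, float]] = {
    "<s>": {"He": 0.9, "She": 0.1},
    "He": {"studies": 0.8, "learns": 0.2},
    "studies": {"English": 1.0},
    "learns": {"English": 1.0},
}


def toy_step(fed: Sequence[str]) -> Dict[str, float]:
    return TABLE[fed[-1]]


t3 = ["He", "studies", "English"]   # output sentence (highest likelihood at each stage)
r3 = ["He", "learns", "English"]    # created sentence entered by the user
score = evaluation_value(forced_likelihoods(toy_step, t3), forced_likelihoods(toy_step, r3))
print(round(score, 3))
```

A value close to 1 would suggest the created sentence is nearly as likely as the model's own output; how the value is thresholded or displayed is outside what the text specifies.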
- FIG. 11 is a flow chart showing the processing contents of the sentence generative model generation method in the sentence generative model generation device 10.
- In step S1, the sentence generative model generation device 10 acquires learning data including first data a, second data b, and constraint data c.
- The constraint data in the learning data may be data generated in advance based on the corpus and stored in the corpus storage unit 40, or may be data generated by the constraint data generation unit 11 based on the corpus.
- In step S2, the first data a is input to the encoder en according to the arrangement order of the words.
- In step S3, the decoder input unit 13 inputs the constraint data c to the decoder de. Subsequently, in step S4, the decoder input unit 13 inputs the start symbol ss to the decoder de. Furthermore, in step S5, the decoder input unit 13 inputs the second data b to the decoder de word by word in accordance with the arrangement order.
- In step S6, the update unit 14 calculates the error for each word between the word array output from the decoder de after the input of the start symbol ss and the word array included in the second data b.
- The weighting factors that make up the encoder en and the decoder de are then updated by error backpropagation.
- In step S7, the update unit 14 determines whether or not machine learning based on the required amount of learning data has been completed. If it is determined that learning has ended, the process proceeds to step S8. On the other hand, if it is determined that the learning has not ended, the processing of steps S1 to S6 is repeated.
- In step S8, the model output unit 15 outputs the learned sentence generation model MD.
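As a rough sketch of how the decoder-side sequences in steps S3 to S6 might be arranged, the snippet below lines up the decoder inputs (constraint data c, start symbol ss, second data b) with the words against which the error is computed. Several points are assumptions rather than statements from the text: the symbols `<s>` and `</s>`, the use of an end symbol as the final target, and cross-entropy as the per-word error. The neural network itself is omitted.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

START = "<s>"   # stands in for the start symbol ss
END = "</s>"    # hypothetical terminal symbol marking the end of the sentence


def build_decoder_example(
    constraint: Sequence[str], second: Sequence[str]
) -> Tuple[List[str], List[Optional[str]]]:
    """Arrange decoder inputs as c, ss, b (steps S3 to S5) with aligned targets."""
    inputs = list(constraint) + [START] + list(second)
    # No error is taken while the constraint data is being read (target None);
    # errors start with the word predicted right after the start symbol.
    targets: List[Optional[str]] = [None] * len(constraint) + list(second) + [END]
    return inputs, targets


def per_word_errors(
    predicted: Sequence[Dict[str, float]], targets: Sequence[Optional[str]]
) -> List[float]:
    """Per-word cross-entropy, skipping positions with no target (step S6)."""
    errors: List[float] = []
    for distribution, target in zip(predicted, targets):
        if target is not None:
            probability = max(distribution.get(target, 0.0), 1e-12)
            errors.append(-math.log(probability))
    return errors


inputs, targets = build_decoder_example(
    ["He", "*", "so", "that", "*"], ["He", "studies", "English"]
)
for pair in zip(inputs, targets):
    print(pair)
```

In a real training loop these errors would be summed or averaged and backpropagated through both the encoder and the decoder; that part depends on the framework and is not shown.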
- FIG. 12 is a flow chart showing the processing contents of the sentence generation method using the learned sentence generation model MD in the sentence generation device 20.
- In step S11, the input unit 21 inputs the words of the input data that make up the input sentence to the encoder of the sentence generation model, word by word according to the arrangement order.
- The encoder outputs the calculation result to the decoder according to the input of the input data.
- In step S12, the constraint data input unit 22 inputs the input constraint data to the decoder word by word according to the arrangement order. Subsequently, in step S13, the word input unit 23 inputs the start symbol ss to the decoder after the input constraint data has been input.
- In step S14, the output unit 24 acquires the word (or symbol) output from the output layer of the decoder.
- In step S15, the output unit 24 determines whether or not the output from the decoder is a terminal symbol indicating the end of the output sentence. If the output from the decoder is determined to be a terminal symbol, the process proceeds to step S17. On the other hand, if the output from the decoder is not determined to be a terminal symbol, the process proceeds to step S16.
- In step S16, the word input unit 23 inputs the word output from the previous-stage output layer of the decoder to the current-stage input layer of the decoder. Then, the process returns to step S14.
- In step S17, the output unit 24 arranges the words sequentially output from the output layer at each stage of the decoder to generate an output sentence. Then, in step S18, the output unit 24 outputs the output sentence.
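The loop formed by steps S12 to S17 can be sketched as below. The trained model is abstracted as a callable `next_word`, and `toy_next_word` is a hypothetical stub that returns canned words purely so the loop can run; the symbols `<s>` and `</s>` and the `max_len` safety limit are likewise assumptions that do not appear in the text.

```python
from typing import Callable, List, Sequence

START = "<s>"   # stands in for the start symbol ss
END = "</s>"    # hypothetical terminal symbol

# Given the encoder input and everything fed to the decoder so far,
# return the single most likely next word (or the terminal symbol).
NextWordFn = Callable[[Sequence[str], Sequence[str]], str]


def generate(
    next_word: NextWordFn,
    input_words: Sequence[str],
    input_constraint: Sequence[str],
    max_len: int = 50,
) -> List[str]:
    decoder_inputs = list(input_constraint) + [START]    # steps S12 and S13
    output: List[str] = []
    while len(output) < max_len:                         # guard against endless loops
        word = next_word(input_words, decoder_inputs)    # step S14
        if word == END:                                  # step S15
            break
        output.append(word)
        decoder_inputs.append(word)                      # step S16
    return output                                        # step S17


def toy_next_word(input_words: Sequence[str], decoder_inputs: Sequence[str]) -> str:
    """Stub standing in for a trained model; picks a canned sentence by constraint."""
    start = list(decoder_inputs).index(START)
    constraint = decoder_inputs[:start]
    produced = len(decoder_inputs) - start - 1
    if "so" in constraint:
        canned = ["He", "studies", "English", "so", "that", "he", "can", "speak"]
    else:
        canned = ["He", "studies", "English"]
    return canned[produced] if produced < len(canned) else END


print(" ".join(generate(toy_next_word, ["kare", "ha"], ["He", "*", "so", "that", "*"])))
```

With a real model, `next_word` would run the encoder once on the input words and then select the highest-likelihood word from the decoder output at each stage.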
- FIG. 13 is a diagram showing the configuration of the sentence generation model generation program.
- The sentence generative model generation program P1 includes a main module m10 for overall control of sentence generative model generation processing in the sentence generative model generation device 10, a constraint data generation module m11, an encoder input module m12, a decoder input module m13, an update module m14, and a model output module m15.
- The modules m11 to m15 implement the functions of the constraint data generation unit 11, the encoder input unit 12, the decoder input unit 13, the update unit 14, and the model output unit 15, respectively.
- The sentence generation model generation program P1 may be transmitted via a transmission medium such as a communication line, or may be stored in a recording medium M1 as shown in FIG.
- FIG. 14 is a diagram showing the configuration of the sentence generation program.
- The sentence generation program P2 is composed of a main module m20 that performs overall control of sentence generation processing in the sentence generation device 20, an input module m21, a constraint data input module m22, a word input module m23, and an output module m24.
- The sentence generation program P2 may further include a created sentence acquisition module m25, a created sentence input module m26, and a created sentence evaluation module m27. The functions of the input unit 21, the constraint data input unit 22, the word input unit 23, the output unit 24, the created sentence acquisition unit 25, the created sentence input unit 26, and the created sentence evaluation unit 27 are realized by the respective modules m21 to m27.
- the sentence generation program P2 may be transmitted via a transmission medium such as a communication line, or may be stored in a recording medium M2 as shown in FIG.
- The sentence generative model is composed of an encoder-decoder model including an encoder and a decoder. In training of the sentence generation model, in which first data corresponding to an input sentence is input to the encoder and second data corresponding to an output sentence is input to the decoder, constraint data including one or more constraint words identified from the sequence of words forming the output sentence is input to the decoder along with the second data. The arrangement order of the constraint words in the constraint data maintains their order in that word sequence.
- the sentence generation model learns the relationship between the constraint data and the second data.
- As a result, a sentence generation model can be obtained that outputs an output sentence using a specific expression consisting of the constraint words contained in the constraint data.
- The constraint data may consist of an array of constraint words and replacement symbols.
- Because the constraint data is composed of an array of the constraint words and the replacement symbols substituted for the words or word strings other than the constraint words, the words in the second data that correspond to the constraint words are output.
- words corresponding to replacement symbols in the second data are learned as arbitrary expressions in output sentences. Therefore, it is possible to generate a sentence generation model capable of outputting an output sentence using a specific expression composed of constraint words.
- A sentence generative model generation device may include a constraint data generation unit that generates constraint data including constraint words identified from the arrangement of words that make up the second sentence, based on a corpus consisting of a first sentence composed in a first language and a second sentence that is a parallel translation of the first sentence composed in a second language.
- constraint data for designating words corresponding to desired specific expressions to be used in output sentences can be obtained as learning data based on the corpus.
- The constraint data generation unit may replace words or word strings other than the words specified as constraint words in the word sequence forming the second sentence with replacement symbols, to generate constraint data consisting of an array of constraint words and replacement symbols.
- Constraint data that specifies the words corresponding to a desired specific expression to be used in an output sentence, and that also specifies where arbitrary expressions may appear in the output sentence, can thus be obtained as learning data.
- The first data may be an arbitrary symbol, that is, a predetermined symbol having no linguistic meaning, instead of the array of a plurality of words constituting the input sentence.
- the decoder can learn the relationship between the constraint data and the second data. Therefore, it is possible to expand the learning data at a low cost, and to improve the accuracy of the desired output of the output sentence output by the decoder.
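One way to picture this expansion of the learning data is shown below. The token `<any>`, the dictionary layout, and the function name are all hypothetical; the sketch assumes that such examples are built from second-language sentences and their constraint data alone, with the encoder input replaced by the meaningless symbol.

```python
from typing import Dict, List, Sequence

ARBITRARY_SYMBOL = "<any>"  # hypothetical symbol with no linguistic meaning


def make_placeholder_example(
    second: Sequence[str], constraint: Sequence[str]
) -> Dict[str, List[str]]:
    """Build a learning example whose first data is only the arbitrary symbol."""
    return {
        "first": [ARBITRARY_SYMBOL],      # fed to the encoder instead of an input sentence
        "second": list(second),           # fed to the decoder word by word
        "constraint": list(constraint),   # fed to the decoder before the start symbol
    }


example = make_placeholder_example(
    ["He", "studies", "English"], ["He", "*", "English"]
)
print(example)
```

Because the encoder input carries no information in such examples, the decoder can only rely on the constraint data, which is presumably why they help it learn the relationship between constraint data and second data.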
- a sentence generation model operates a computer to generate an output sentence in a second language different from the first language in response to an input sentence in a first language.
- a sentence generation model that has been learned by machine learning for generating a sentence generation model, and learning data used for machine learning of the sentence generation model includes first data including an array of a plurality of words that constitute an input sentence, an input sentence second data including a sequence of a plurality of words forming an output sentence corresponding to and constraint data, wherein the constraint data includes a constraint word that is a word specified from the sequence of words forming the output sentence including one or more, the arrangement order of the constraint words in the constraint data maintains the arrangement order of the word arrangement, the sentence generation model is an encoder-decoder model that includes a neural network and is composed of an encoder and a decoder, and the first data is input to the encoder according to the arrangement order of the words, and the constraint data, the start symbol, which is a pre
- The sentence generation model is composed of an encoder-decoder model including an encoder and a decoder. In training in which the first data corresponding to the input sentence is input to the encoder and the second data corresponding to the output sentence is input to the decoder, constraint data including one or more constraint words identified from the sequence of words constituting the output sentence is input to the decoder together with the second data. The arrangement order of the constraint words in the constraint data maintains their order in that word sequence. By inputting to the decoder constraint data that specifies the words constituting a desired specific expression as constraint words, the relationship between the constraint data and the second data is learned. It is therefore possible to output sentences using specific expressions consisting of the constraint words contained in the constraint data.
- a sentence generation device uses a sentence generation model constructed by machine learning to generate an input sentence in a first language.
- a sentence generation device for generating output sentences in different second languages wherein learning data used for machine learning of a sentence generation model includes first data including an array of a plurality of words corresponding to an input sentence, second data including a sequence of a plurality of words corresponding to the corresponding output sentence;
- the arrangement order of the constraint words in the constraint data maintains the arrangement order of the word arrangement
- the sentence generation model is an encoder-decoder model that includes a neural network and is composed of an encoder and a decoder
- the first data is Input to the encoder according to the arrangement order of words, constraint data, a start symbol that is a predetermined symbol signifying the start of output of an output sentence
- second data are words of the constraint data, the start symbol, and the second data
- the encoder and the The sentence generator is constructed by machine learning that updates the weighting
- An arbitrarily specified input constraint word is included while maintaining the arrangement order in the output sentence.
- a word input unit for sequentially inputting the words output from the decoder at the previous stage into the decoder, and generating an output sentence by arranging the words sequentially output at each stage of the decoder. and an output unit for outputting the generated output sentence.
- The sentence generation model is composed of an encoder-decoder model including an encoder and a decoder. In training in which the first data corresponding to the input sentence is input to the encoder and the second data corresponding to the output sentence is input to the decoder, constraint data including one or more constraint words identified from the sequence of words constituting the output sentence is input to the decoder together with the second data. The arrangement order of the constraint words in the constraint data maintains their order in that word sequence.
- The learned sentence generation model has learned the relationship between the constraint data and the second data. Therefore, by inputting input data constituting an input sentence to the encoder and input constraint data for specifying constraint conditions in the output sentence to the decoder, an output sentence using a desired specific expression can be output.
- The decoder outputs, for each word, a likelihood indicating how likely that word is to be output as a word forming the output sentence at each stage after the input of the start symbol. The sentence generation device may further include a created sentence input unit that, at each stage after the input of the start symbol, sequentially inputs to the decoder the words constituting a sentence created in the second language instead of the words output from the decoder at the previous stage.
- The sentence generation device may also include a created sentence evaluation unit that evaluates the created sentence based on a comparison between the likelihood of each word composing the created sentence, output from the decoder at each stage in response to the input of the start symbol and the sequential input of each word composing the created sentence, and the likelihood of each word constituting the output sentence.
- the likelihood of each word that constitutes the output sentence is compared with the likelihood of each word that is obtained by sequentially inputting each word that constitutes the created and input sentence to the decoder. Based on this, the written sentence is evaluated. This makes it possible to configure an evaluation system that evaluates the likelihood of a created sentence as a parallel translation corresponding to an input sentence.
- LTE Long Term Evolution
- LTE-A Long Term Evolution-Advanced
- SUPER 3G IMT-Advanced
- 4G 5G
- FRA Full Radio Access
- W-CDMA registered trademark
- GSM registered trademark
- CDMA2000 Code Division Multiple Access 2000
- UMB Universal Mobile Broadband
- IEEE 802.11 Wi-Fi
- IEEE 802.16 WiMAX
- IEEE 802.20 UWB (Ultra-WideBand)
- Input and output information may be saved in a specific location (for example, memory) or managed in a management table. Input/output information and the like may be overwritten, updated, or appended. The output information and the like may be deleted. The entered information and the like may be transmitted to another device.
- The determination may be made by a value represented by one bit (0 or 1), by a true/false value (Boolean: true or false), or by numerical comparison (for example, comparison with a predetermined value).
- Notification of predetermined information is not limited to being performed explicitly, but may be performed implicitly (for example, by not notifying the predetermined information).
- Software, whether referred to as software, firmware, middleware, microcode, hardware description language, or otherwise, should be interpreted broadly to include instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executables, threads of execution, procedures, functions, and the like.
- software, instructions, etc. may be transmitted and received via a transmission medium.
- When the software is transmitted from a website, server, or other remote source using wired and/or wireless technologies, these wired and/or wireless technologies are included within the definition of transmission media.
- Data, instructions, commands, information, signals, bits, symbols, chips, etc. may be represented by voltages, currents, electromagnetic waves, magnetic fields or magnetic particles, light fields or photons, or any combination of these.
- The terms "system" and "network" used herein are used interchangeably.
- Information, parameters, etc. described in this specification may be represented by absolute values, may be represented by relative values from a predetermined value, or may be represented by corresponding other information.
- The terms "judgment" and "determination" used in this disclosure may encompass a wide variety of actions.
- "Judgment" and "determination" may include, for example, regarding judging, calculating, computing, processing, deriving, investigating, looking up, searching, or inquiring (e.g., looking up in a table, database, or other data structure), or ascertaining, as having "judged" or "determined".
- "Judgment" and "determination" may also include regarding receiving (e.g., receiving information), transmitting (e.g., transmitting information), input, output, or accessing (e.g., accessing data in memory) as having "judged" or "determined".
- "Judgment" and "determination" may also include regarding resolving, selecting, choosing, establishing, comparing, and the like as having "judged" or "determined".
- That is, "judgment" and "determination" may include regarding some action as having "judged" or "determined".
- "Judgment (determination)" may be read as "assuming", "expecting", "considering", or the like.
- any reference to the elements does not generally limit the quantity or order of those elements. These designations may be used herein as a convenient method of distinguishing between two or more elements. Thus, references to first and second elements do not imply that only two elements may be employed therein or that the first element must precede the second element in any way.
- model output module M2... recording medium, m20... main module, m21... input module, m22... constraint data input module, m23... word input module, m24... output module, m25... written sentence acquisition module, m26... Created sentence input module, m27... Created sentence evaluation module, MD, MD1, MD2... Sentence generation model, P1... Sentence generation model generation program, P2... Sentence generation program.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Probability & Statistics with Applications (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2023557914A JPWO2023079911A1 | 2021-11-04 | 2022-10-11 | |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2021180102 | 2021-11-04 | | |
| JP2021-180102 | 2021-11-04 | | |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2023079911A1 true WO2023079911A1 (ja) | 2023-05-11 |
Family
ID=86241320
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| PCT/JP2022/037899 Ceased WO2023079911A1 (ja) | 2021-11-04 | 2022-10-11 | 文生成モデル生成装置、文生成モデル及び文生成装置 |
Country Status (2)
| Country | Link |
|---|---|
| JP (1) | JPWO2023079911A1 |
| WO (1) | WO2023079911A1 |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2024235214A1 (zh) * | 2023-05-17 | 2024-11-21 | 腾讯科技(深圳)有限公司 | 语句生成方法和装置、存储介质及电子设备 |
Citations (4)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2019036093A (ja) * | 2017-08-14 | 2019-03-07 | 日本電信電話株式会社 | モデル学習装置、変換装置、方法、及びプログラム |
| WO2019225154A1 (ja) * | 2018-05-23 | 2019-11-28 | 株式会社Nttドコモ | 作成文章評価装置 |
| CN111160049A (zh) * | 2019-12-06 | 2020-05-15 | 华为技术有限公司 | 文本翻译方法、装置、机器翻译系统和存储介质 |
| WO2021186892A1 (ja) * | 2020-03-19 | 2021-09-23 | 株式会社Nttドコモ | 翻訳文章算出装置 |
Family Cites Families (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US12159122B2 (en) * | 2019-08-23 | 2024-12-03 | Sony Group Corporation | Electronic device, method and computer program |
-
2022
- 2022-10-11 WO PCT/JP2022/037899 patent/WO2023079911A1/ja not_active Ceased
- 2022-10-11 JP JP2023557914A patent/JPWO2023079911A1/ja active Pending
Also Published As
| Publication number | Publication date |
|---|---|
| JPWO2023079911A1 | 2023-05-11 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22889734; Country of ref document: EP; Kind code of ref document: A1 |
| | ENP | Entry into the national phase | Ref document number: 2023557914; Country of ref document: JP; Kind code of ref document: A |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 22889734; Country of ref document: EP; Kind code of ref document: A1 |