WO2023079911A1 - Sentence generation model generator, sentence generation model, and sentence generator - Google Patents


Info

Publication number
WO2023079911A1
Authority
WO
WIPO (PCT)
Prior art keywords
sentence
data
input
output
word
Prior art date
Application number
PCT/JP2022/037899
Other languages
French (fr)
Japanese (ja)
Inventor
保静 松岡
Original Assignee
NTT DOCOMO, INC. (株式会社NTTドコモ)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NTT DOCOMO, INC.
Publication of WO2023079911A1 publication Critical patent/WO2023079911A1/en

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/40Processing or translation of natural language
    • G06F40/42Data-driven translation
    • G06F40/44Statistical methods, e.g. probability models

Definitions

  • the present invention relates to a sentence generation model generation device, a sentence generation model, and a sentence generation device.
  • Patent Literature 1 discloses a technique of generating a document corresponding to an input document using a machine learning model.
  • Ordinary translation engines and scoring engines, which consist of models trained on input sentences and their parallel translations, output parallel sentences that correspond only in semantic content, and therefore could not output parallel sentences that use a specific desired expression. For example, when such a machine-learned model is applied to a correction and scoring engine, even if the parallel translation entered by the user correctly uses the desired specific expression, an error in a part other than the specific expression could cause the sentence to be corrected toward a parallel translation unrelated to that expression.
  • The present invention has been made in view of the above problems, and it is an object of the present invention to obtain an output sentence in a second language that uses a specific expression in response to an input sentence in a first language.
  • A sentence generation model generation device according to one aspect of the present invention generates, by machine learning, a sentence generation model that generates an output sentence in a second language different from a first language in response to an input sentence in the first language.
  • The sentence generation model is an encoder-decoder model that includes a neural network and is composed of an encoder and a decoder.
  • The learning data used for machine learning of the sentence generation model includes first data, constraint data, and second data. The first data includes an array of the plurality of words forming an input sentence, and the second data includes an array of the plurality of words forming the output sentence corresponding to the input sentence.
  • The sentence generation model generation device includes: an encoder input unit that inputs the first data to the encoder according to the arrangement order of its words; a decoder input unit that inputs the constraint data, a start symbol, and the words constituting the second data to the decoder according to their arrangement order; an updating unit that updates the weighting coefficients constituting the encoder and the decoder based on the per-word error between the array of words output from the decoder after the input of the start symbol and the array of words included in the second data; and a model output unit that outputs the sentence generation model whose weighting coefficients have been updated by the updating unit.
  • In training of the sentence generation model, in which the first data corresponding to the input sentence is input to the encoder and the second data corresponding to the output sentence is input to the decoder, constraint data including one or more constraint words identified from the array of words forming the output sentence is input to the decoder along with the second data. The arrangement order of the constraint words in the constraint data maintains the arrangement order of those words in the output sentence. By inputting to the decoder constraint data that specifies the words constituting a desired specific expression as constraint words, the sentence generation model learns the relationship between the constraint data and the second data, and a sentence generation model that outputs an output sentence using a specific expression made up of the constraint words contained in the constraint data can be obtained.
  • FIG. 1 is a block diagram showing a functional configuration of a sentence generative model generation device of this embodiment;
  • FIG. 2 is a block diagram showing the functional configuration of the sentence generation device of this embodiment.
  • FIG. 3 is a hardware configuration diagram of the sentence generation model generation device and the sentence generation device.
  • FIG. 4 is a diagram showing the configuration of the sentence generation model.
  • FIG. 5 is a diagram showing an example of generating the first data, second data, and constraint data based on a corpus.
  • FIG. 7 is a diagram for explaining the schematic configuration of a Transformer, which is an example of an encoder-decoder model. FIG. 8 is a diagram schematically showing sentence generation processing by the sentence generation model.
  • FIG. 13 is a diagram showing the configuration of the sentence generation model generation program.
  • FIG. 14 is a diagram showing the configuration of the sentence generation program.
  • the sentence generation model of the present embodiment is constructed by machine learning to cause a computer to function and generate an output sentence in a second language different from the first language in response to an input sentence in the first language.
  • the sentence generation model includes a neural network and is composed of an encoder-decoder model including an encoder and a decoder.
  • the sentence generation model generation device of this embodiment is a device that generates a sentence generation model by machine learning.
  • a sentence generation device is a device that generates an output sentence in a second language according to an input sentence in a first language using a sentence generation model constructed by machine learning.
  • FIG. 1 is a diagram showing the functional configuration of the sentence generative model generation device according to this embodiment.
  • the sentence generation model generation device 10 is a device that generates, by machine learning, a sentence generation model that generates an output sentence in a second language different from the first language according to an input sentence in the first language.
  • the sentence generative model generation device 10 functionally includes a constraint data generation unit 11 , an encoder input unit 12 , a decoder input unit 13 , an update unit 14 and a model output unit 15 .
  • Each of these functional units 11 to 15 may be configured in one device, or may be configured by being distributed in a plurality of devices.
  • the sentence generation model generation device 10 is configured to be able to access storage means such as the model storage unit 30 and the corpus storage unit 40 .
  • The model storage unit 30 and the corpus storage unit 40 may be configured within the sentence generative model generation device 10 or, as shown in FIG. 1, may be configured as separate external devices accessible from it.
  • the model storage unit 30 is storage means that stores sentence generation models such as those that have been learned or that are in the process of learning, and can be composed of storage, memory, and the like.
  • The corpus storage unit 40 is storage means for storing the learning data used for machine learning of the sentence generation model and a corpus for generating the learning data, and can be composed of storage, memory, and the like.
  • FIG. 2 is a diagram showing the functional configuration of the sentence generation device according to this embodiment.
  • the sentence generation device 20 is a device that uses a sentence generation model built by machine learning to generate an output sentence in a second language different from the first language in response to an input sentence in the first language.
  • the sentence generation device 20 functionally includes an input unit 21, a constraint data input unit 22, a word input unit 23, and an output unit 24.
  • the sentence generation device 20 may further include a created sentence acquisition unit 25 , a created sentence input unit 26 and a created sentence evaluation unit 27 .
  • Each of these functional units 21 to 27 may be configured in one device, or may be configured by being distributed in a plurality of devices.
  • the sentence generation device 20 is configured to be able to access the model storage unit 30 that stores learned sentence generation models.
  • the model storage unit 30 may be configured within the sentence generation device 20, or may be configured in another external device.
  • sentence generation model generation device 10 and the sentence generation device 20 are configured in different devices (computers), but they may be integrated.
  • Each functional block may be implemented using one device that is physically or logically coupled, or using two or more devices that are physically or logically separated and connected directly or indirectly (e.g., by wire or wirelessly).
  • a functional block may be implemented by combining software in the one device or the plurality of devices.
  • Functions include judging, determining, calculating, processing, deriving, investigating, searching, checking, receiving, transmitting, outputting, accessing, resolving, selecting, choosing, establishing, comparing, assuming, expecting, broadcasting, notifying, communicating, forwarding, configuring, reconfiguring, allocating, mapping, and assigning, but are not limited to these.
  • a functional block (component) that performs transmission is called a transmitting unit or transmitter.
  • the implementation method is not particularly limited.
  • the sentence generative model generation device 10 and the sentence generation device 20 in one embodiment of the present invention may function as computers.
  • FIG. 3 is a diagram showing an example of the hardware configuration of the sentence generative model generation device 10 and the sentence generation device 20 according to this embodiment.
  • the sentence generation model generation device 10 and the sentence generation device 20 are each physically configured as a computer device including a processor 1001, a memory 1002, a storage 1003, a communication device 1004, an input device 1005, an output device 1006, a bus 1007, and the like.
  • the term "apparatus” can be read as a circuit, device, unit, or the like.
  • The hardware configuration of the sentence generation model generation device 10 and the sentence generation device 20 may include one or more of each device shown in the figure, or may be configured without some of the devices.
  • Each function of the sentence generation model generation device 10 and the sentence generation device 20 is realized by loading predetermined software (a program) onto hardware such as the processor 1001 and the memory 1002, with the processor 1001 performing computation and controlling communication by the communication device 1004 and the reading and/or writing of data in the memory 1002 and the storage 1003.
  • The processor 1001, for example, runs an operating system and controls the entire computer.
  • the processor 1001 may be configured with a central processing unit (CPU) including an interface with peripheral devices, a control device, an arithmetic device, registers, and the like.
  • the functional units 11 to 15 and 21 to 27 shown in FIGS. 1 and 2 may be realized by the processor 1001.
  • the processor 1001 also reads programs (program codes), software modules and data from the storage 1003 and/or the communication device 1004 to the memory 1002, and executes various processes according to them.
  • As the program, a program that causes a computer to execute at least part of the operations described in the present embodiment is used.
  • the functional units 11 to 15 and 21 to 27 of the sentence generation model generation device 10 and the sentence generation device 20 may be stored in the memory 1002 and implemented by a control program running on the processor 1001 .
  • Although the various processes described above may be executed by one processor 1001, they may also be executed by two or more processors 1001 simultaneously or sequentially.
  • Processor 1001 may be implemented with one or more chips.
  • the program may be transmitted from a network via an electric communication line.
  • The memory 1002 is a computer-readable recording medium and may be composed of at least one of, for example, ROM (Read Only Memory), EPROM (Erasable Programmable ROM), EEPROM (Electrically Erasable Programmable ROM), and RAM (Random Access Memory).
  • the memory 1002 may also be called a register, cache, main memory (main storage device), or the like.
  • the memory 1002 can store executable programs (program codes), software modules, etc. for implementing the sentence generation model generation method and the sentence generation method according to an embodiment of the present invention.
  • The storage 1003 is a computer-readable recording medium and may be composed of at least one of, for example, an optical disc such as a CD-ROM (Compact Disc ROM), a hard disk drive, a flexible disc, a magneto-optical disc (e.g., a compact disc, a digital versatile disc, or a Blu-ray disc), a smart card, a flash memory (e.g., a card, stick, or key drive), a floppy disk, a magnetic strip, and/or the like.
  • Storage 1003 may also be called an auxiliary storage device.
  • the storage medium described above may be, for example, a database, server, or other suitable medium including memory 1002 and/or storage 1003 .
  • the communication device 1004 is hardware (transmitting/receiving device) for communicating between computers via a wired and/or wireless network, and is also called a network device, network controller, network card, communication module, etc., for example.
  • the input device 1005 is an input device (for example, keyboard, mouse, microphone, switch, button, sensor, etc.) that receives input from the outside.
  • the output device 1006 is an output device (eg, display, speaker, LED lamp, etc.) that outputs to the outside. Note that the input device 1005 and the output device 1006 may be integrated (for example, a touch panel).
  • Each device such as the processor 1001 and the memory 1002 is connected by a bus 1007 for communicating information.
  • the bus 1007 may be composed of a single bus, or may be composed of different buses between devices.
  • The sentence generation model generation device 10 and the sentence generation device 20 may include hardware such as a microprocessor, a digital signal processor (DSP), an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), and an FPGA (Field Programmable Gate Array), and such hardware may implement part or all of each functional block.
  • processor 1001 may be implemented with at least one of these hardware.
  • FIG. 4 is a diagram showing the configuration of the sentence generation model of this embodiment.
  • the sentence generation model MD is an encoder-decoder model including a neural network and composed of an encoder en and a decoder de.
  • a neural network that configures the encoder-decoder model is not limited, but is, for example, a recurrent neural network (RNN).
  • the sentence generation model MD may be a neural network called a transformer.
  • the learning data used for machine learning of the sentence generation model MD of this embodiment includes first data a, second data b, and constraint data c.
  • the first data a includes an arrangement of a plurality of words forming an input sentence in the first language.
  • the second data b includes an arrangement of a plurality of words forming an output sentence in the second language corresponding to the input sentence.
  • the output sentence is, for example, a parallel translation of the input sentence.
  • the constraint data c is data containing one or more constraint words, which are words specified from the arrangement of words forming the second data b.
  • the arrangement order of the constraint words in the constraint data c maintains the arrangement order of the words in the second data b.
  • the first data a that constitutes the input sentence in the first language is input to the encoder en.
  • the first data a is divided into words by, for example, morphological analysis. Each divided word is converted (embedded) into a corresponding word vector and input to the encoder en according to the arrangement order in the first data a (input sentence).
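The word-splitting and embedding step described above can be sketched as follows. The hash-based vector is only a deterministic stand-in for a learned embedding, and whitespace splitting stands in for morphological analysis; both are assumptions for illustration.

```python
import hashlib

def toy_embed(word, dim=4):
    """Hypothetical stand-in for a learned embedding: derive a fixed
    pseudo-random vector from the word's bytes."""
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dim]]

def encode_input_sentence(sentence):
    """Split the input sentence into words (a stand-in for morphological
    analysis) and convert each word to a vector, preserving word order."""
    return [toy_embed(w) for w in sentence.split()]

vectors = encode_input_sentence("kare ha eigo wo benkyo suru")
```

The resulting list of vectors is then fed to the encoder in the order of the words of the first data a.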
  • the encoder en outputs to the decoder de a vector indicating the result of calculation based on the first data a (for example, the output of the hidden layer, the source target attention, etc.).
  • the decoder sequentially outputs a sequence of words based on the input of a vector from the encoder and a predetermined start symbol (vector) that indicates the start of the output.
  • the constraint data c is input to the decoder de of the sentence generation model MD of the present embodiment before the start symbol ss is input.
  • the decoder de outputs a sequence of words (vectors) of the output sentence t based on the output from the encoder en, the constraint data c and the input of the start symbol ss.
  • the output sentence t is composed of the sequence of words output up to that point.
  • The second data b, corresponding to the output sentence (the parallel translation in the second language of the input sentence corresponding to the first data a), is input to the decoder de word by word, according to its arrangement order, after the start symbol ss is input.
  • The constraint data c is data in which the constraint words, which are words specified from the array of words forming the second data b, are included while maintaining the arrangement order of the words of the second data b.
  • The generation of the constraint data c will be described in detail later. The constraint data c may be data consisting of an array of constraint words and replacement symbols, in which words or word strings other than the constraint words in the array of words forming the output sentence are replaced with a predetermined replacement symbol.
  • the functional units of the sentence generative model generation device 10 will be described with reference to FIG. 1 again.
  • The constraint data generation unit 11 generates constraint data based on the corpus. Generation of the constraint data and of the learning data including the constraint data will be described with reference to FIGS. 5 and 6.
  • FIG. 5 is a diagram showing an example of generating first data, second data, and constraint data based on a corpus.
  • the constraint data generation unit 11 acquires the corpus cp0 from the corpus storage unit 40, for example.
  • the corpus cp0 consists of a first sentence cp01 written in the first language and a second sentence cp02 written in the second language.
  • the first sentence cp01 is the Japanese sentence "He studies English so that he can speak to foreigners (kare ha gaikokujin to hanaseruyoninarutameni eigo wo benkyo suru)".
  • the second sentence cp02 is a sentence "He studies English so that he can talk with foreigners.”
  • the constraint data generation unit 11 identifies the constraint word cx from the array of words forming the second sentence cp02.
  • The specification of a constraint word may be based, for example, on a designation input by a user or the like. For example, a word constituting an expression that should be used in the second sentence cp02 as a parallel translation of the first sentence cp01 may be designated as a constraint word by such an input. The specification of constraint words may also be performed at random. In the example shown in FIG. 5, the three words or word strings "He", "so that", and "talk with" are specified as the constraint words cx.
  • Based on the identified constraint words cx, the constraint data generation unit 11 generates constraint data c01 that includes the constraint words cx while maintaining the order of the words in the second sentence. As shown in FIG. 5, the constraint data c01 contains "He", "so that", and "talk with" while maintaining their arrangement order in the second sentence cp02.
  • the constraint data may be data consisting of an arrangement of constraint words and replacement symbols, in which words or word strings other than the constraint words in the arrangement of words forming the output sentence are replaced with predetermined replacement symbols.
  • The constraint data generation unit 11 replaces words or word strings other than those specified as constraint words cx in the array of words forming the second sentence cp02 with the replacement symbol rs, and generates constraint data c01 consisting of an array of the constraint words cx and the replacement symbols rs. In the example shown in FIG. 5, the replacement symbol rs is indicated by "*" (asterisk), and the constraint data generation unit 11 generates the constraint data c01 "He * so that * talk with *", in which the three constraint words cx "He", "so that", and "talk with" and the replacement symbols rs are arranged while maintaining their order in the second sentence cp02.
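The replacement procedure described above can be sketched as follows. This is a minimal illustration, not the patented implementation; the function name and the span-matching strategy are assumptions for the example.

```python
def make_constraint_data(output_words, constraint_spans, replacement="*"):
    """Replace every word not covered by a constraint word/phrase with the
    replacement symbol, collapsing consecutive replacements into one,
    while preserving the original word order of the output sentence."""
    # Mark which word positions belong to a constraint word or phrase.
    keep = [False] * len(output_words)
    for span in constraint_spans:
        n = len(span)
        for i in range(len(output_words) - n + 1):
            if output_words[i:i + n] == span:
                for j in range(i, i + n):
                    keep[j] = True
    result = []
    for word, kept in zip(output_words, keep):
        if kept:
            result.append(word)
        elif not result or result[-1] != replacement:
            result.append(replacement)  # collapse runs of replaced words
    return result

sentence = "He studies English so that he can talk with foreigners .".split()
constraints = [["He"], ["so", "that"], ["talk", "with"]]
print(" ".join(make_constraint_data(sentence, constraints)))
# → He * so that * talk with *
```

Note that the capitalized "He" matches only the sentence-initial word, so the lowercase "he" in the middle of the sentence is replaced, reproducing the example c01 from FIG. 5.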
  • The constraint data generation unit 11 generates the first data a01 and the second data b01 of the learning data based on the first sentence cp01 and the second sentence cp02, respectively, and generates learning data consisting of the first data a01, the constraint data c01, the start symbol ss, and the second data b01.
  • the constraint data generation unit 11 may include information indicating the relationship with the second data b01 in the constraint data.
  • Specifically, the constraint data generation unit 11 includes, in the constraint data c01, a symbol cl01 indicating that the constraint data c01 contains the constraint words cx to be used in the second data b01 (the output sentence).
  • In this way, the constraint data can be easily generated based on a corpus, which prevents an increase in the cost of obtaining learning data that includes constraint data.
  • FIG. 6 is a diagram showing an example of the first data, second data, and constraint data used for learning the sentence generation model.
  • The learning data for the sentence generation model MD may include, as the first data, an arbitrary symbol, which is a predetermined symbol having no linguistic meaning or content, instead of the array of the plurality of words forming an input sentence.
  • Specifically, such learning data includes first data consisting of an arbitrary symbol a03 having no semantic content, constraint data c03 containing constraint words specified as expressions to be used in the output sentence in the second language, and second data b03 consisting of an output sentence in the second language.
  • Based on a corpus of example sentences in the second language, the constraint data generation unit 11 may generate constraint data c03 including the constraint words specified from the array of words forming an example sentence, in the same manner as in the example of FIG. 5, extract the example sentence as the second data b03, and add the arbitrary symbol a03 to generate the learning data.
  • With such learning data, the decoder can learn the relationship between the constraint data and the second data. The learning data can therefore be expanded at low cost, improving the accuracy with which the decoder produces the desired output sentence.
  • the encoder input unit 12 inputs the first data a to the encoder en according to the arrangement order of the words.
  • the decoder input unit 13 inputs the constraint data c, the start symbol ss, which is a predetermined symbol indicating the start of output of the output sentence, and the second data b to the decoder de word by word according to the arrangement order.
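The ordering of the decoder's training-time input described above can be sketched as a small helper. This is an illustrative sketch only; the marker symbol `<constraint>` is a hypothetical analogue of the relation symbol cl01, and `<s>` stands in for the start symbol ss.

```python
def build_decoder_input(constraint_tokens, second_data_tokens,
                        constraint_marker="<constraint>", start_symbol="<s>"):
    """Assemble the decoder's training input in the described order:
    a marker symbol, the constraint data, the start symbol, and then
    the words of the second data, each in their arrangement order."""
    return [constraint_marker, *constraint_tokens,
            start_symbol, *second_data_tokens]

seq = build_decoder_input(
    ["He", "*", "so", "that", "*", "talk", "with", "*"],
    ["He", "studies", "English", "so", "that", "he", "can",
     "talk", "with", "foreigners", "."])
```

During training, the words after the start symbol serve as the reference against which the decoder's outputs are compared.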
  • The updating unit 14 updates the weighting coefficients constituting the encoder en and the decoder de based on the per-word error between the array of words output from the decoder de after the input of the start symbol ss and the array of words included in the second data b.
  • Specifically, the encoder input unit 12 inputs the word vectors of the words constituting the first data a to the input layer of the RNN constituting the encoder en, in order according to the word arrangement. The output of the hidden layer of the encoder en based on the input of the last word vector of the first data a is output to the decoder de.
  • the decoder input unit 13 sequentially inputs the word vectors of the words that make up the constraint data c into the input layer of the RNN that makes up the decoder de according to the word order. Further, the decoder input unit 13 sequentially inputs the start symbol ss and the second data b to the decoder de according to word order. When the start symbol ss is input to the decoder de, the decoder de sequentially outputs the sequence of word vectors of the output sentence t together with the likelihood (for example, by the softmax function).
  • The updating unit 14 calculates a per-word error between the word sequence output from the decoder de and the word sequence of the second data b, and updates the weighting coefficients constituting the neural networks of the encoder en and the decoder de by, for example, error backpropagation.
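The per-word error described here is, in typical encoder-decoder training, the cross-entropy between the decoder's softmax distribution and the reference word at each position. The following is a minimal sketch of that computation under this assumption (the patent does not fix a loss function); the function names are illustrative.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability (likelihood) distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def per_word_errors(step_logits, reference_ids):
    """For each output position, the error is the negative log-likelihood
    the decoder assigns to the reference word from the second data b."""
    return [-math.log(softmax(logits)[ref])
            for logits, ref in zip(step_logits, reference_ids)]

# Two output steps over a toy 3-word vocabulary; references are ids 0 and 1.
errors = per_word_errors([[2.0, 0.1, 0.1], [0.3, 1.5, 0.2]], [0, 1])
```

The gradients of these errors with respect to the weighting coefficients would then be propagated back through decoder and encoder.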
  • FIG. 7 is a diagram for explaining the schematic configuration of a transformer, which is an example of an encoder-decoder model.
  • When the sentence generation model MD1 (MD) is composed of a Transformer, the encoder input unit 12 inputs the word vectors aw11, aw12, ..., aw1n (where n is an integer equal to or greater than 2) of the words constituting the first data a1 to the input layer ila of the encoder en1 according to the arrangement order of the words.
  • Unlike an RNN, which receives words sequentially, a Transformer allows the input data to be processed in parallel.
  • the encoder en1 calculates the self-attention sa1 from the input layer ila to the middle layer mla, and converts the word vector into a vector corresponding to the self-attention sa1. Similarly, the self-attention sa2 from the middle layer mla to the output layer ola is calculated and the word vector is further transformed. Further, the source-target attention ta for the input layer ilb of the decoder de1 from the output layer ola of the encoder en1 is calculated.
  • In the learning phase, the decoder input unit 13 inputs the word vectors cw11, ..., bw1n (where n is an integer equal to or greater than 2) of the constraint data c1 and the second data b1 in parallel to the input layer ilb of the decoder de1, according to the arrangement order of the words.
  • the self-attention sa3 from the input layer ilb to the intermediate layer mlb is calculated, and the vector is converted according to the self-attention sa3.
  • the self-attention sa4 for the output layer olb is calculated from the intermediate layer mlb, and vector conversion is performed according to the self-attention sa4.
  • The decoder de1 outputs the words tw11, ..., t1n (where n is an integer equal to or greater than 2) of the output sentence based on the word vector wv output after the input of the start symbol ss, and the weighting coefficients are updated based on the error between these words and the words constituting the second data b1.
  • the model output unit 15 outputs the sentence generation model MD obtained after machine learning based on the required amount of learning data.
  • the model output unit 15 may cause the model storage unit 30 to store the sentence generation model MD.
  • FIG. 8 is a diagram schematically showing sentence generation processing by the sentence generation model.
  • the sentence generation model MD2 is a model learned and constructed by the sentence generation model generation device 10.
  • the sentence generation model MD2 includes an encoder en2 and a decoder de2.
  • The sentence generation model MD (MD1, MD2), which is a model including a trained neural network, is read or referred to by a computer and can be regarded as a program that causes the computer to execute predetermined processing and realize predetermined functions.
  • the trained sentence generation models MD (MD1, MD2) of this embodiment are used in a computer having a processor and memory.
  • In accordance with instructions from the trained sentence generation model MD (MD1, MD2) stored in the memory, the processor of the computer performs, on the input data input to the input layer of the neural network, calculations based on the learned weighting coefficients (parameters) and functions corresponding to each layer, and outputs a result (likelihoods) from the output layer.
  • The input unit 21 inputs the words aw21, aw22, ... forming the input sentence in the first language to the encoder en2 as the input data a2, according to their arrangement order.
  • the encoder en2 outputs the calculation result to the decoder de2.
  • The constraint data input unit 22 inputs the symbol ct2 and the words cw21 to cw24, ... of the input constraint data c2 to the decoder de2.
  • the input constraint data c2 is data containing an input constraint word arbitrarily specified as a word to be used in an output sentence.
  • the input constraint words are included in the input constraint data c2 while maintaining the arrangement order of the words in the output sentence.
  • the identification of the input constraint word may be based on, for example, a specified input by a user or the like.
  • the input constraint data c2 is data consisting of an array of input constraint words and replacement symbols in which words or word strings other than the input constraint words in the word array constituting the output sentence are replaced with predetermined replacement symbols rs.
  • Here, the input constraint data c2 consists of words or word strings such as "He" and "so that" and the replacement symbol rs "*" (asterisk), arranged in the order in which they appear in the output sentence.
  • the input constraint data c2 may include the symbol ct2.
  • Symbol ct2 indicates, for example, that input constraint data c2 is data containing an input constraint word to be used in output sentence t2.
  • the word input unit 23 inputs the start symbol ss to the decoder de2 after the input of the input constraint data c2.
  • the decoder de2 outputs a word tw21 at the beginning of the output sentence t2 according to the start symbol ss.
  • the word input unit 23 sequentially inputs the words output from the decoder de2 in the previous stage to the decoder de2.
  • The decoder de2 sequentially outputs the series of words tw21, tw22, ... constituting the output sentence t2.
  • The output unit 24 arranges the words tw21, tw22, ... output from the decoder de2 to generate the output sentence t2. Then, the output unit 24 outputs the generated output sentence t2.
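The generation loop described above, in which each word output by the decoder at the previous stage is fed back in as the next input, can be sketched as follows. The toy lookup table merely stands in for the trained decoder de2; the function and symbol names are assumptions for the example.

```python
def generate_sentence(step_fn, start_symbol="<s>", end_symbol="</s>",
                      max_len=50):
    """Starting from the start symbol, feed the word output at the previous
    stage back in as the next input, collecting words until the end symbol
    (or a length cap) is reached."""
    words, prev = [], start_symbol
    for _ in range(max_len):
        nxt = step_fn(prev)  # the decoder's highest-likelihood next word
        if nxt == end_symbol or nxt is None:
            break
        words.append(nxt)
        prev = nxt
    return words

# Toy decoder: a lookup table standing in for the trained decoder de2.
toy_decoder = {"<s>": "He", "He": "studies",
               "studies": "English", "English": "</s>"}
sentence = generate_sentence(toy_decoder.get)
# → ["He", "studies", "English"]
```

In the real model the next word also depends on the encoder output and the input constraint data, not only on the previous word; the single-argument `step_fn` is a simplification.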
  • the form of output of the output sentence t2 is not limited, but may be, for example, storage in a predetermined storage means, display on a display means, output by voice, or the like.
  • FIG. 9 is a diagram showing an example of input constraint data and an output sentence that can be output based on the input constraint data.
  • the input sentence "He studies English so that he can speak to foreigners (kare ha gaikokujin to hanaseruyoninarutameni eigo wo benkyo suru)" is input to the encoder en2 as input data a2. If this input sentence is translated by a normal translation engine without any restrictions, for example, the output sentence "He studies English to become able to speak with foreigners." is obtained.
  • different output sentences can be output according to the input constraint data input to the decoder de2.
  • FIG. 10 is a diagram showing processing for evaluating a created sentence in the evaluation system configured by the sentence generation device 20.
  • for each of the words tw31, tw32, ..., tw3n (where n is an integer of 2 or more) output at each stage after the start symbol ss is input, the decoder de3 shown in FIG. 10 outputs a likelihood indicating how likely each word is to be output as a word constituting the output sentence t3.
  • the evaluation system configured by the sentence generation device 20 evaluates the created sentence created and input by the user with the output sentence t3 as the correct answer.
  • the user inputs, for example, a parallel translation of the input sentence in the second language as a created sentence.
  • the created sentence acquisition unit 25 acquires the created sentence r3 that was created in the second language by the user and entered into the evaluation system.
  • the created sentence r3 consists of an array of words rw31, rw32, ..., rw3n.
  • the created sentence input unit 26 sequentially inputs the words rw31, rw32, ..., rw3n of the created sentence r3 to the decoder de3 in place of the words tw31, tw32, ..., tw3n output from the decoder at each previous stage.
  • the decoder de3 outputs the likelihood of each word of the entire vocabulary handled by the sentence generation model generation device 10 and the sentence generation device 20 at each output stage.
  • the output sentence t3 is constructed by arranging the words with the highest likelihood at each output stage.
  • the created sentence evaluation unit 27 acquires, from the likelihoods of the entire vocabulary output in response to the input of the start symbol ss and of the words rw31, rw32, ..., rw3n input at each previous stage, the likelihood associated with each word rw31, rw32, ..., rw3n of the created sentence r3.
  • the created sentence evaluation unit 27 compares the likelihood of each word tw31, tw32, ..., tw3n of the output sentence t3 with the likelihood of each word rw31, rw32, ..., rw3n of the created sentence r3, thereby calculating and outputting an evaluation value of the created sentence r3.
  • the method of calculating the evaluation value is not limited; it may be based on, for example, the ratio of the likelihoods for each word in the sentences t3 and r3, or the sum or average of the likelihoods over each of the sentences t3 and r3.
  • the created sentence is evaluated based on the comparison between the likelihood of each word constituting the output sentence and the likelihood of each word obtained by sequentially inputting the words of the created and input sentence into the decoder. This makes it possible to configure an evaluation system that evaluates the likelihood of a created sentence as a parallel translation corresponding to the input sentence.
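The evaluation of FIG. 10 can be sketched numerically. This is a simplified, hypothetical illustration: it assumes that per-stage likelihood distributions (one per decoder stage, obtained while the created sentence's words are fed back in) are available as dictionaries, that the output-sentence word is taken as the argmax of each distribution, and that the evaluation value is the average of per-word likelihood ratios, which is one of the options the text names.

```python
def evaluate_created_sentence(stage_likelihoods, created_words):
    """stage_likelihoods: one {word: likelihood} dict per decoder stage.
    Scores each created-sentence word by the ratio of its likelihood to the
    likelihood of the output-sentence (argmax) word at the same stage, then
    averages the ratios over the sentence."""
    ratios = []
    for probs, rw in zip(stage_likelihoods, created_words):
        t = max(probs.values())     # likelihood of the output-sentence word tw3i
        r = probs.get(rw, 0.0)      # likelihood of the created-sentence word rw3i
        ratios.append(r / t if t > 0 else 0.0)
    return sum(ratios) / len(ratios)

stages = [
    {"He": 0.8, "She": 0.1},
    {"studies": 0.6, "learns": 0.3},
    {"English": 0.9, "French": 0.05},
]
score = evaluate_created_sentence(stages, ["He", "learns", "English"])
print(round(score, 3))
# (0.8/0.8 + 0.3/0.6 + 0.9/0.9) / 3 → 0.833
```

A created sentence identical to the output sentence would score 1.0 under this scheme; substituting lower-likelihood words lowers the score.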
  • FIG. 11 is a flow chart showing the processing contents of the sentence generative model generation method in the sentence generative model generation device 10.
  • In step S1, the sentence generation model generation device 10 acquires learning data including first data a, second data b, and constraint data c.
  • The constraint data in the learning data may be data generated in advance based on the corpus and stored in the corpus storage unit 40, or may be data generated by the constraint data generation unit 11 based on the corpus.
  • In step S2, the first data a is input to the encoder en according to the arrangement order of the words.
  • In step S3, the decoder input unit 13 inputs the constraint data c to the decoder de. Subsequently, in step S4, the decoder input unit 13 inputs the start symbol ss to the decoder de. Furthermore, in step S5, the decoder input unit 13 inputs the second data b to the decoder de word by word in accordance with the arrangement order.
  • In step S6, the update unit 14 calculates the error for each word between the word array output from the decoder de after the input of the start symbol ss and the word array included in the second data b, and updates the weighting coefficients constituting the encoder en and the decoder de by backpropagation.
  • In step S7, the update unit 14 determines whether or not machine learning based on the required amount of learning data has been completed. If it is determined that learning has ended, the process proceeds to step S8. On the other hand, if it is determined that learning has not ended, the processing of steps S1 to S6 is repeated.
  • In step S8, the model output unit 15 outputs the learned sentence generation model MD.
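The way a single training example is laid out for steps S1 to S6 can be sketched in a framework-neutral way. The helper name and the symbols `<ss>`/`<eos>` are assumptions; the actual error computation and backpropagation of step S6 depend on the neural-network framework and are not shown.

```python
SS, EOS = "<ss>", "<eos>"  # start symbol ss and an assumed terminal symbol

def make_training_example(first_data, second_data, constraint_data):
    """Returns (encoder_input, decoder_input, decoder_target).
    The decoder receives the constraint data, then the start symbol, then
    the output-sentence words (teacher forcing, steps S3-S5). The per-step
    targets are the output-sentence words shifted by one position, ending
    with the terminal symbol; the word-by-word error between the decoder's
    outputs after the start symbol and these targets drives the weight
    update (step S6)."""
    decoder_input = constraint_data + [SS] + second_data
    decoder_target = second_data + [EOS]   # no loss on constraint-data positions
    return first_data, decoder_input, decoder_target

enc_in, dec_in, dec_tgt = make_training_example(
    "kare ha eigo wo benkyo suru".split(),   # first data a (input sentence)
    "He studies English".split(),            # second data b (output sentence)
    ["He", "*"],                             # constraint data c
)
print(dec_in)   # → ['He', '*', '<ss>', 'He', 'studies', 'English']
print(dec_tgt)  # → ['He', 'studies', 'English', '<eos>']
```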
  • FIG. 12 is a flow chart showing the processing contents of the sentence generation method using the learned sentence generation model MD in the sentence generation device 20.
  • In step S11, the input unit 21 inputs the words of the input data that make up the input sentence to the encoder of the sentence generation model, word by word, according to the arrangement order.
  • The encoder outputs the calculation result to the decoder according to the input of the input data.
  • In step S12, the constraint data input unit 22 inputs the input constraint data to the decoder word by word according to the arrangement order. Subsequently, in step S13, the word input unit 23 inputs the start symbol ss to the decoder after inputting the input constraint data.
  • In step S14, the output unit 24 acquires the word (or symbol) output from the output layer of the decoder.
  • In step S15, the output unit 24 determines whether or not the output from the decoder is a terminal symbol indicating the end of the output sentence. If the output from the decoder is determined to be a terminal symbol, the process proceeds to step S17. On the other hand, if the output is not determined to be a terminal symbol, the process proceeds to step S16.
  • In step S16, the word input unit 23 inputs the word output from the previous-stage output layer of the decoder to the current-stage input layer of the decoder. Then, the process returns to step S14.
  • In step S17, the output unit 24 arranges the words sequentially output from the output layer at each stage of the decoder to generate an output sentence. Then, in step S18, the output unit 24 outputs the output sentence.
  • FIG. 13 is a diagram showing the configuration of the sentence generation model generation program.
  • the sentence generation model generation program P1 is configured with a main module m10 that performs overall control of the sentence generation model generation processing in the sentence generation model generation device 10, a constraint data generation module m11, an encoder input module m12, a decoder input module m13, an update module m14, and a model output module m15.
  • Each of the modules m11 to m15 implements the functions of the constraint data generation unit 11, the encoder input unit 12, the decoder input unit 13, the update unit 14, and the model output unit 15.
  • the sentence generation model generation program P1 may be transmitted via a transmission medium such as a communication line, or may be stored in a recording medium M1 as shown in FIG. 13.
  • FIG. 14 is a diagram showing the configuration of the sentence generation program.
  • the sentence generation program P2 is configured with a main module m20 that performs overall control of the sentence generation processing in the sentence generation device 20, an input module m21, a constraint data input module m22, a word input module m23, and an output module m24.
  • The sentence generation program P2 may further include a created sentence acquisition module m25, a created sentence input module m26, and a created sentence evaluation module m27. The functions of the input unit 21, the constraint data input unit 22, the word input unit 23, the output unit 24, the created sentence acquisition unit 25, the created sentence input unit 26, and the created sentence evaluation unit 27 are realized by the respective modules m21 to m27.
  • the sentence generation program P2 may be transmitted via a transmission medium such as a communication line, or may be stored in a recording medium M2 as shown in FIG. 14.
  • the sentence generation model is composed of an encoder-decoder model including an encoder and a decoder. In the training of the sentence generation model, in which the first data corresponding to the input sentence is input to the encoder and the second data corresponding to the output sentence is input to the decoder, constraint data including one or more constraint words identified from the sequence of words forming the output sentence is input to the decoder along with the second data. The arrangement order of the constraint words in the constraint data maintains the arrangement order of the word arrangement.
  • the sentence generation model learns the relationship between the constraint data and the second data.
  • as a result, a sentence generation model can be obtained that outputs an output sentence using a specific expression consisting of the constraint words contained in the constraint data.
  • the constraint data may consist of an array of the constraint words and replacement symbols with which words or word strings other than the constraint words are replaced.
  • since the constraint data is composed of the constraint words and the replacement symbols substituted for the words or word strings other than the constraint words, the words in the second data corresponding to the constraint words are learned as expressions to be used in the output sentence, and the words corresponding to the replacement symbols are learned as arbitrary expressions in the output sentence. Therefore, it is possible to generate a sentence generation model capable of outputting an output sentence using a specific expression composed of the constraint words.
  • the sentence generation model generation device may include a constraint data generation unit that, based on a corpus consisting of a first sentence composed in the first language and a second sentence that is a parallel translation of the first sentence composed in the second language, generates constraint data including constraint words identified from the arrangement of words that make up the second sentence.
  • constraint data for designating words corresponding to desired specific expressions to be used in output sentences can be obtained as learning data based on the corpus.
  • the constraint data generation unit may replace words or word strings other than the words specified as constraint words in the word sequence forming the second sentence with replacement symbols, thereby generating constraint data consisting of an array of the constraint words and the replacement symbols.
  • with this configuration, constraint data that specifies the words corresponding to a desired specific expression to be used in the output sentence, while leaving the remaining expressions in the output sentence arbitrary, can be obtained as learning data.
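One possible way for a constraint data generation unit to produce such learning data from a corpus is sketched below. The sampling policy (take one random contiguous span of the second sentence as the constraint words) is purely an assumption for illustration; the text only requires that the constraint words keep their order in the sentence and that everything else be replaced with the replacement symbol.

```python
import random

RS = "*"  # replacement symbol

def sample_constraint_data(second_sentence_words, rng):
    """Pick a random contiguous span of the second sentence as constraint
    words and replace the words before and after it with one replacement
    symbol each (hypothetical sampling policy)."""
    n = len(second_sentence_words)
    start = rng.randrange(n)
    end = rng.randrange(start, n)          # inclusive span [start, end]
    data = []
    if start > 0:
        data.append(RS)
    data.extend(second_sentence_words[start:end + 1])
    if end < n - 1:
        data.append(RS)
    return data

words = "He studies English so that he can speak to foreigners".split()
print(sample_constraint_data(words, random.Random(0)))
```

Repeating this over a corpus with different random spans yields many constraint-data variants per sentence pair, each preserving the word order of the second sentence.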
  • the first data may be an arbitrary symbol, that is, a predetermined symbol having no linguistic meaning, instead of the arrangement of the plurality of words constituting the input sentence.
  • the decoder can learn the relationship between the constraint data and the second data. Therefore, it is possible to expand the learning data at a low cost, and to improve the accuracy of the desired output of the output sentence output by the decoder.
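The low-cost data expansion just described can be sketched as follows: training triples are built from second-language sentences alone, with the encoder side reduced to a single arbitrary symbol, so only the decoder-side relation between constraint data and output sentence is learned. The symbol name `<any>` and the helper names are assumptions.

```python
ANY = "<any>"  # assumed arbitrary symbol with no linguistic meaning

def expand_from_second_language(sentences, make_constraint):
    """Build (first_data, constraint_data, second_data) triples in which the
    encoder input is just the arbitrary symbol; make_constraint is any
    policy that derives constraint data from the output-sentence words."""
    examples = []
    for sent in sentences:
        words = sent.split()
        examples.append(([ANY], make_constraint(words), words))
    return examples

ex = expand_from_second_language(
    ["He studies English"],
    lambda ws: [ws[0], "*"],   # toy policy: keep the first word as a constraint
)
print(ex[0])
# → (['<any>'], ['He', '*'], ['He', 'studies', 'English'])
```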
  • a sentence generation model operates a computer to generate an output sentence in a second language different from the first language in response to an input sentence in a first language.
  • a sentence generation model that has been trained by machine learning, wherein the learning data used for the machine learning of the sentence generation model includes first data including an array of a plurality of words constituting the input sentence, second data including an array of a plurality of words constituting the output sentence corresponding to the input sentence, and constraint data; the constraint data includes one or more constraint words, which are words specified from the array of words constituting the output sentence; the arrangement order of the constraint words in the constraint data maintains the arrangement order in the word array; the sentence generation model is an encoder-decoder model that includes a neural network and is composed of an encoder and a decoder; the first data is input to the encoder according to the arrangement order of the words; the constraint data, a start symbol, which is a predetermined symbol signifying the start of output of the output sentence, and the words constituting the second data are input to the decoder according to the arrangement order; and the sentence generation model is constructed by machine learning that updates the weighting coefficients constituting the encoder and the decoder based on the error, for each word, between the word array output from the decoder at each stage after the input of the start symbol and the word array included in the second data.
  • the sentence generation model is composed of an encoder-decoder model including an encoder and a decoder.
  • the first data corresponding to the input sentence is input to the encoder
  • the second data corresponding to the output sentence is input to the decoder
  • constraint data including one or more constraint words specified from the sequence of words constituting the output sentence is input to the decoder together with the second data.
  • the arrangement order of the constraint words in the constraint data maintains the arrangement order of the word arrangement. By inputting to the decoder constraint data specifying the words constituting a desired specific expression as constraint words, the relevance between the constraint data and the second data is learned, so it is possible to output sentences using specific expressions consisting of the constraint words contained in the constraint data.
  • a sentence generation device uses a sentence generation model constructed by machine learning to generate an output sentence in a second language different from a first language in response to an input sentence in the first language.
  • the learning data used for the machine learning of the sentence generation model includes first data including an array of a plurality of words constituting the input sentence, second data including an array of a plurality of words constituting the output sentence corresponding to the input sentence, and constraint data including one or more constraint words specified from the array of words constituting the output sentence.
  • the arrangement order of the constraint words in the constraint data maintains the arrangement order of the word arrangement.
  • the sentence generation model is an encoder-decoder model that includes a neural network and is composed of an encoder and a decoder.
  • in the machine learning, the first data is input to the encoder according to the arrangement order of the words, and the constraint data, a start symbol that is a predetermined symbol signifying the start of output of the output sentence, and the words of the second data are input to the decoder according to the arrangement order.
  • the sentence generation model is constructed by machine learning that updates the weighting coefficients constituting the encoder and the decoder based on the error, for each word, between the word array output from the decoder at each stage after the input of the start symbol and the word array included in the second data.
  • the sentence generation device includes an input unit that inputs input data constituting the input sentence to the encoder, and a constraint data input unit that inputs to the decoder input constraint data in which arbitrarily specified input constraint words are included while maintaining their arrangement order in the output sentence.
  • the sentence generation device further includes a word input unit that inputs the start symbol to the decoder and then sequentially inputs the words output from the decoder at the previous stage into the decoder, and an output unit that generates an output sentence by arranging the words sequentially output at each stage of the decoder and outputs the generated output sentence.
  • the sentence generation model is composed of an encoder-decoder model including an encoder and a decoder.
  • the first data corresponding to the input sentence is input to the encoder
  • the second data corresponding to the output sentence is input to the decoder
  • constraint data including one or more constraint words specified from the sequence of words constituting the output sentence is input to the decoder together with the second data.
  • the arrangement order of the constraint words in the constraint data maintains the arrangement order of the word arrangement.
  • the learned sentence generation model learns the relationship between the constraint data and the second data. Therefore, by inputting input data constituting an input sentence to the encoder and input constraint data for specifying constraint conditions in the output sentence to the decoder, an output sentence using a desired specific expression can be output.
  • the decoder outputs, for each word, a likelihood indicating how likely each word is to be output as a word constituting the output sentence at each stage after the input of the start symbol.
  • the sentence generation device may further include a created sentence input unit that, at each stage after the input of the start symbol, sequentially inputs to the decoder the words constituting a sentence created in the second language instead of the words output from the decoder at the previous stage, and a created sentence evaluation unit that evaluates the created sentence based on the comparison between the likelihood of each word constituting the created sentence, output from the decoder at each stage in response to the input of the start symbol and the sequential input of each word of the created sentence, and the likelihood of each word constituting the output sentence.
  • the created sentence is evaluated based on the comparison between the likelihood of each word constituting the output sentence and the likelihood of each word obtained by sequentially inputting to the decoder the words constituting the created and input sentence. This makes it possible to configure an evaluation system that evaluates the likelihood of a created sentence as a parallel translation corresponding to the input sentence.
  • LTE (Long Term Evolution)
  • LTE-A (LTE-Advanced)
  • SUPER 3G
  • IMT-Advanced
  • 4G
  • 5G
  • FRA (Future Radio Access)
  • W-CDMA (registered trademark)
  • GSM (registered trademark)
  • CDMA2000 (Code Division Multiple Access 2000)
  • UMB (Ultra Mobile Broadband)
  • IEEE 802.11 (Wi-Fi)
  • IEEE 802.16 (WiMAX)
  • IEEE 802.20
  • UWB (Ultra-WideBand)
  • Input and output information may be saved in a specific location (for example, memory) or managed in a management table. Input/output information and the like may be overwritten, updated, or appended. The output information and the like may be deleted. The entered information and the like may be transmitted to another device.
  • the determination may be made by a value represented by one bit (0 or 1), by a true/false value (Boolean: true or false), or by numerical comparison (for example, comparison with a predetermined value).
  • notification of predetermined information is not limited to being performed explicitly, but may be performed implicitly (for example, by not notifying the predetermined information).
  • Software, whether referred to as software, firmware, middleware, microcode, hardware description language, or by another name, should be interpreted broadly to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executables, threads of execution, procedures, functions, and the like.
  • software, instructions, etc. may be transmitted and received via a transmission medium.
  • for example, when software is transmitted from a website, server, or other remote source using wired and/or wireless technologies, these wired and/or wireless technologies are included within the definition of transmission media.
  • data, instructions, commands, information, signals, bits, symbols, chips, and the like may be represented by voltages, currents, electromagnetic waves, magnetic fields or magnetic particles, optical fields or photons, or any combination of these.
  • the terms "system" and "network" used herein are used interchangeably.
  • information, parameters, and the like described in this specification may be represented by absolute values, may be represented by relative values from a predetermined value, or may be represented by other corresponding information.
  • the terms "judgment" and "determination" used in this disclosure may encompass a wide variety of actions.
  • "Judgment" and "determination" may include, for example, regarding judging, calculating, computing, processing, deriving, investigating, looking up, searching, or inquiring (e.g., looking up in a table, database, or other data structure), and ascertaining, as having made a "judgment" or "determination".
  • "Judgment" and "determination" may also include regarding receiving (e.g., receiving information), transmitting (e.g., transmitting information), input, output, or accessing (e.g., accessing data in memory) as having made a "judgment" or "determination".
  • "Judgment" and "determination" may further include regarding resolving, selecting, choosing, establishing, comparing, and the like as having made a "judgment" or "determination".
  • in other words, "judgment" and "determination" may include regarding some action as having made a "judgment" or "determination".
  • "judgment (determination)" may also be read as "assuming", "expecting", "considering", and the like.
  • any reference to elements using designations such as "first" and "second" does not generally limit the quantity or order of those elements. These designations may be used herein as a convenient method of distinguishing between two or more elements. Thus, references to first and second elements do not imply that only two elements may be employed or that the first element must precede the second element in some way.
  • model output module M2... recording medium, m20... main module, m21... input module, m22... constraint data input module, m23... word input module, m24... output module, m25... written sentence acquisition module, m26... Created sentence input module, m27... Created sentence evaluation module, MD, MD1, MD2... Sentence generation model, P1... Sentence generation model generation program, P2... Sentence generation program.


Abstract

This sentence generation model generator generates, through machine learning, a sentence generation model for generating an output sentence in a second language in response to input of an input sentence in a first language, the sentence generation model generator comprising: an encoder input unit for inputting first data that constitutes the input sentence to an encoder; a decoder input unit for inputting constraint data that includes a constraint word specified as a word equivalent to an expression that should be used in the output sentence, and second data that constitutes a start symbol and the output sentence, to a decoder; an update unit for updating weight coefficients that constitute the encoder and the decoder on the basis of a difference, for each word, between the array of words output from the decoder in stages subsequent to input of the start symbol and the array of words included in the second data; and a model output unit for outputting a sentence generation model in which the weight coefficients are updated.

Description

Sentence generation model generation device, sentence generation model, and sentence generation device
The present invention relates to a sentence generation model generation device, a sentence generation model, and a sentence generation device.
A technique is known in which a model that generates, in response to an input sentence in a first language, an output sentence consisting of, for example, a parallel translation in a second language is generated by machine learning, and a translation engine, a scoring engine, and the like are configured by the generated model. For example, Patent Literature 1 discloses a technique of generating a document corresponding to an input document using a machine learning model.
JP 2020-135457 A
Ordinary translation engines and scoring engines, which are composed of models learned based on input sentences and their parallel translations, output parallel sentences that correspond in semantic content, and therefore could not output parallel sentences using a specific expression. For example, when a machine-learned model is applied to a correction and scoring engine, even if a parallel translation input by the user is correct with regard to the use of the desired specific expression, an error in a part other than the specific expression could cause the sentence to be corrected based on a parallel translation unrelated to the specific expression.
Therefore, the present invention has been made in view of the above problems, and an object of the present invention is to obtain an output sentence in a second language using a specific expression in response to an input sentence in a first language.
In order to solve the above problems, a sentence generation model generation device according to one aspect of the present invention is a device that generates, by machine learning, a sentence generation model that generates an output sentence in a second language different from a first language in response to an input sentence in the first language. The sentence generation model is an encoder-decoder model that includes a neural network and is composed of an encoder and a decoder. The learning data used for the machine learning of the sentence generation model includes first data, constraint data, and second data; the first data includes an array of a plurality of words constituting the input sentence, and the second data includes an array of a plurality of words constituting the output sentence corresponding to the input sentence. The constraint data includes one or more constraint words, which are words specified from the array of words constituting the output sentence, and the arrangement order of the constraint words in the constraint data maintains the arrangement order in the word array. The sentence generation model generation device includes: an encoder input unit that inputs the first data to the encoder according to the arrangement order of the words; a decoder input unit that inputs the constraint data, a start symbol, which is a predetermined symbol signifying the start of output of the output sentence, and the words constituting the second data to the decoder according to the arrangement order; an update unit that updates the weighting coefficients constituting the encoder and the decoder based on the error, for each word, between the word array output from the decoder at each stage after the input of the start symbol and the word array included in the second data; and a model output unit that outputs the sentence generation model with the weighting coefficients updated by the update unit.
According to the above aspect, the sentence generation model is composed of an encoder-decoder model including an encoder and a decoder. In the training of the sentence generation model, in which the first data corresponding to the input sentence is input to the encoder and the second data corresponding to the output sentence is input to the decoder, constraint data including one or more constraint words specified from the array of words constituting the output sentence is input to the decoder together with the second data. The arrangement order of the constraint words in the constraint data maintains the arrangement order in the word array. By inputting to the decoder constraint data that specifies the words constituting a desired specific expression as constraint words, the sentence generation model learns the relationship between the constraint data and the second data, so a sentence generation model can be obtained that outputs an output sentence using a specific expression consisting of the constraint words contained in the constraint data.
According to the present invention, it is possible to obtain an output sentence in a second language using a specific expression in response to an input sentence in a first language.
FIG. 1 is a block diagram showing the functional configuration of the sentence generation model generation device of the present embodiment. FIG. 2 is a block diagram showing the functional configuration of the sentence generation device of the present embodiment. FIG. 3 is a hardware block diagram of the sentence generation model generation device and the sentence generation device. FIG. 4 is a diagram showing the configuration of a sentence generation model. FIG. 5 is a diagram showing an example of generation of first data, second data, and constraint data based on a corpus. FIG. 6 is a diagram showing examples of first data, second data, and constraint data used for model learning. FIG. 7 is a diagram for explaining the schematic configuration of a Transformer, which is an example of an encoder-decoder model. FIG. 8 is a diagram schematically showing sentence generation processing by the sentence generation model. FIG. 9 is a diagram showing examples of input constraint data and output sentences that can be output based on the input constraint data. FIG. 10 is a diagram showing processing for evaluating a created sentence in the evaluation system configured by the sentence generation device. FIG. 11 is a flowchart showing the processing contents of the sentence generation model generation method in the sentence generation model generation device. FIG. 12 is a flowchart showing the processing contents of the sentence generation method in the sentence generation device.
図13は、文生成モデル生成プログラムの構成を示す図である。FIG. 13 is a diagram showing the configuration of the sentence generation model generation program. 図14は、文生成プログラムの構成を示す図である。FIG. 14 is a diagram showing the configuration of the sentence generation program.
Embodiments of a sentence generation model generation device, a sentence generation device, and a sentence generation model according to the present invention will be described with reference to the drawings. Where possible, the same parts are denoted by the same reference numerals, and duplicate descriptions are omitted.
The sentence generation model of this embodiment is a model constructed by machine learning that causes a computer to function so as to generate, in response to the input of an input sentence in a first language, an output sentence in a second language different from the first language. The sentence generation model includes a neural network and is configured as an encoder-decoder model comprising an encoder and a decoder.
The sentence generation model generation device of this embodiment is a device that generates the sentence generation model by machine learning. The sentence generation device is a device that uses the sentence generation model constructed by machine learning to generate an output sentence in the second language in response to the input of an input sentence in the first language.
An example of the problem solved by the sentence generation model generation device, the sentence generation model, and the sentence generation device of this embodiment will now be described.
In a system in which an English composition based on a given Japanese sentence is corrected by AI (artificial intelligence), for example, words with low scores are extracted from the words contained in the input English sentence, and the correction is based on a corrected sentence in which the words following each extracted low-score word have been rewritten. Suppose, for example, that an ordinary translation engine translates the Japanese sentence 「彼は外国人と話せるようになるために英語を勉強する。」 (kare ha gaikokujin to hanaseru yoni naru tameni eigo wo benkyo suru) into the English sentence "He studies English to become able to speak with foreigners."
When the English sentence "He study English so that can talk with American." composed by a user is input to a correction system built on such a translation engine, the second input word "study" is an obvious error and receives a low score. The correction system therefore replaces the second word "study" with the higher-scoring word "studies", further rewrites the third and subsequent words with higher-scoring words, and outputs the rewritten English sentence "He studies English to become able to speak with foreigners." as the corrected sentence.
However, the word strings "so that" and "talk with" in the English sentence input by the user are correct expressions for the English translation, so it is preferable that the correction be based on an English translation that uses these expressions, such as "He studies English so that he can talk with foreigners." This embodiment makes it possible to generate an output sentence that uses a desired expression, and to appropriately evaluate a composed sentence on the basis of an output sentence that uses the desired expression.
FIG. 1 is a diagram showing the functional configuration of the sentence generation model generation device according to this embodiment. The sentence generation model generation device 10 is a device that generates, by machine learning, a sentence generation model that generates an output sentence in a second language different from a first language in response to the input of an input sentence in the first language. As shown in FIG. 1, the sentence generation model generation device 10 functionally includes a constraint data generation unit 11, an encoder input unit 12, a decoder input unit 13, an update unit 14, and a model output unit 15. These functional units 11 to 15 may be configured in a single device or may be distributed across a plurality of devices.
The sentence generation model generation device 10 is also configured to be able to access storage means such as a model storage unit 30 and a corpus storage unit 40. The model storage unit 30 and the corpus storage unit 40 may be configured within the sentence generation model generation device 10 or, as shown in FIG. 1, may be configured outside it as separate devices accessible from the sentence generation model generation device.
The model storage unit 30 is storage means that stores sentence generation models, whether trained or in the course of training, and can be composed of storage, memory, and the like.
The corpus storage unit 40 is storage means that stores the learning data used for the machine learning of the sentence generation model and the corpora from which the learning data is generated, and can be composed of storage, memory, and the like.
FIG. 2 is a diagram showing the functional configuration of the sentence generation device according to this embodiment. The sentence generation device 20 is a device that uses a sentence generation model constructed by machine learning to generate an output sentence in a second language different from a first language in response to the input of an input sentence in the first language. As shown in FIG. 2, the sentence generation device 20 functionally includes an input unit 21, a constraint data input unit 22, a word input unit 23, and an output unit 24. The sentence generation device 20 may further include a composed-sentence acquisition unit 25, a composed-sentence input unit 26, and a composed-sentence evaluation unit 27. These functional units 21 to 27 may be configured in a single device or may be distributed across a plurality of devices.
The sentence generation device 20 is also configured to be able to access the model storage unit 30, which stores trained sentence generation models. The model storage unit 30 may be configured within the sentence generation device 20 or in another external device.
Although this embodiment shows an example in which the sentence generation model generation device 10 and the sentence generation device 20 are configured as separate devices (computers), they may be integrated into one device.
The block diagrams shown in FIGS. 1 and 2 show blocks in functional units. These functional blocks (components) are realized by an arbitrary combination of at least one of hardware and software. The method of realizing each functional block is not particularly limited. That is, each functional block may be realized using one physically or logically coupled device, or using two or more physically or logically separated devices connected directly or indirectly (for example, by wire or wirelessly). A functional block may also be realized by combining software with the one device or the plurality of devices.
Functions include, but are not limited to, judging, determining, calculating, computing, processing, deriving, investigating, searching, confirming, receiving, transmitting, outputting, accessing, resolving, selecting, choosing, establishing, comparing, assuming, expecting, regarding, broadcasting, notifying, communicating, forwarding, configuring, reconfiguring, allocating (mapping), and assigning. For example, a functional block (component) that performs transmission is called a transmitting unit or a transmitter. In any case, as described above, the method of realization is not particularly limited.
For example, the sentence generation model generation device 10 and the sentence generation device 20 in one embodiment of the present invention may function as computers. FIG. 3 is a diagram showing an example of the hardware configuration of the sentence generation model generation device 10 and the sentence generation device 20 according to this embodiment. Each of the sentence generation model generation device 10 and the sentence generation device 20 may be physically configured as a computer device including a processor 1001, a memory 1002, a storage 1003, a communication device 1004, an input device 1005, an output device 1006, a bus 1007, and the like.
In the following description, the word "device" can be read as a circuit, a unit, or the like. The hardware configurations of the sentence generation model generation device 10 and the sentence generation device 20 may each include one or more of the devices shown in the figure, or may be configured without some of the devices.
Each function of the sentence generation model generation device 10 and the sentence generation device 20 is realized by loading predetermined software (a program) onto hardware such as the processor 1001 and the memory 1002, whereby the processor 1001 performs computation and controls communication by the communication device 1004 and the reading and/or writing of data in the memory 1002 and the storage 1003.
The processor 1001, for example, operates an operating system and controls the entire computer. The processor 1001 may be configured as a central processing unit (CPU) including an interface with peripheral devices, a control device, an arithmetic device, registers, and the like. For example, the functional units 11 to 15 and 21 to 27 shown in FIGS. 1 and 2 may be realized by the processor 1001.
The processor 1001 also reads programs (program code), software modules, and data from the storage 1003 and/or the communication device 1004 into the memory 1002 and executes various processes in accordance with them. As the programs, programs that cause a computer to execute at least part of the operations described in the above embodiments are used. For example, the functional units 11 to 15 and 21 to 27 of the sentence generation model generation device 10 and the sentence generation device 20 may be realized by a control program stored in the memory 1002 and operating on the processor 1001. Although the various processes described above have been explained as being executed by one processor 1001, they may be executed simultaneously or sequentially by two or more processors 1001. The processor 1001 may be implemented on one or more chips. The programs may be transmitted from a network via an electric communication line.
The memory 1002 is a computer-readable recording medium and may be composed of, for example, at least one of a ROM (Read Only Memory), an EPROM (Erasable Programmable ROM), an EEPROM (Electrically Erasable Programmable ROM), a RAM (Random Access Memory), and the like. The memory 1002 may also be called a register, a cache, a main memory (main storage device), or the like. The memory 1002 can store executable programs (program code), software modules, and the like for implementing the sentence generation model generation method and the sentence generation method according to an embodiment of the present invention.
The storage 1003 is a computer-readable recording medium and may be composed of, for example, at least one of an optical disc such as a CD-ROM (Compact Disc ROM), a hard disk drive, a flexible disc, a magneto-optical disc (for example, a compact disc, a digital versatile disc, or a Blu-ray (registered trademark) disc), a smart card, a flash memory (for example, a card, a stick, or a key drive), a floppy (registered trademark) disk, a magnetic strip, and the like. The storage 1003 may also be called an auxiliary storage device. The storage medium described above may be, for example, a database, a server, or another suitable medium including the memory 1002 and/or the storage 1003.
The communication device 1004 is hardware (a transmitting/receiving device) for communicating between computers via a wired and/or wireless network, and is also called, for example, a network device, a network controller, a network card, or a communication module.
The input device 1005 is an input device (for example, a keyboard, a mouse, a microphone, a switch, a button, or a sensor) that receives input from the outside. The output device 1006 is an output device (for example, a display, a speaker, or an LED lamp) that performs output to the outside. The input device 1005 and the output device 1006 may be integrated (for example, as a touch panel).
The devices such as the processor 1001 and the memory 1002 are connected by a bus 1007 for communicating information. The bus 1007 may be composed of a single bus or of different buses between devices.
The sentence generation model generation device 10 and the sentence generation device 20 may also be configured to include hardware such as a microprocessor, a digital signal processor (DSP), an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), or an FPGA (Field Programmable Gate Array), and part or all of each functional block may be realized by such hardware. For example, the processor 1001 may be implemented with at least one of these pieces of hardware.
FIG. 4 is a diagram showing the configuration of the sentence generation model of this embodiment. As shown in FIG. 4, the sentence generation model MD is an encoder-decoder model that includes a neural network and is composed of an encoder en and a decoder de. The neural network constituting the encoder-decoder model is not limited, but is, for example, a recurrent neural network (RNN). The sentence generation model MD may also be a neural network known as a Transformer.
The learning data used for the machine learning of the sentence generation model MD of this embodiment includes first data a, second data b, and constraint data c. The first data a includes an array of a plurality of words constituting an input sentence in the first language. The second data b includes an array of a plurality of words constituting an output sentence in the second language corresponding to the input sentence. The output sentence is, for example, a parallel translation of the input sentence. The constraint data c is data containing one or more constraint words, which are words specified from the array of words constituting the second data b. The order of the constraint words in the constraint data c preserves their order in the word array of the second data b.
The first data a constituting the input sentence in the first language is input to the encoder en. Specifically, the first data a is divided into words by, for example, morphological analysis. Each divided word is converted (embedded) into a corresponding word vector and input to the encoder en according to its position in the first data a (the input sentence). The encoder en outputs to the decoder de vectors indicating computation results based on the first data a (for example, the output of the intermediate layer, source-target attention, and the like).
In a general encoder-decoder model, the decoder sequentially outputs a series of words based on the vectors from the encoder and the input of a predetermined start symbol (vector) indicating the start of output. In contrast, in the decoder de of the sentence generation model MD of this embodiment, the constraint data c is input before the start symbol ss is input. The decoder de outputs a series of words (vectors) of the output sentence t based on the output from the encoder en and the input of the constraint data c and the start symbol ss. When the decoder de outputs the end symbol es, which signifies the end of the output sentence, the output sentence t is composed of the series of words output up to that point. In the learning phase, the second data b, which corresponds to the output sentence for the first data a (the parallel translation of the input sentence into the second language), is input to the decoder de word by word in array order after the start symbol ss is input.
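The decoder input ordering described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the token names `<s>` and `</s>` for the start symbol ss and end symbol es are assumptions, and whitespace tokenization stands in for the model's actual embedding pipeline.

```python
# Sketch of assembling the decoder's training-time input: the constraint
# data c precedes the start symbol ss, and the reference output sentence
# (second data b) follows it word by word, in array order.

START = "<s>"   # start symbol ss (token name assumed)
END = "</s>"    # end symbol es (token name assumed)

def build_decoder_input(constraint_tokens, target_tokens):
    """Concatenate constraint data, start symbol, and target words in order."""
    return list(constraint_tokens) + [START] + list(target_tokens)

def build_decoder_target(target_tokens):
    """The decoder is trained to emit the target words followed by the end symbol."""
    return list(target_tokens) + [END]

constraint = ["He", "*", "so", "that", "*", "talk", "with", "*"]
target = "He studies English so that he can talk with foreigners .".split()

dec_in = build_decoder_input(constraint, target)    # fed to the decoder
dec_out = build_decoder_target(target)              # supervision signal
```

Because the constraint tokens are consumed before `<s>`, every word the decoder emits after the start symbol is conditioned on the constraint expressions as well as on the encoder output.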
The constraint data c is data that contains the constraint words, which are words specified from the array of words constituting the second data b, while preserving their order in the word array of the second data b. The generation of the constraint data c is described in detail later; the constraint data c may be data consisting of an array of the constraint words and a predetermined replacement symbol, in which each word or word string other than the constraint words in the word array of the output sentence has been replaced with the replacement symbol.
Referring again to FIG. 1, the functional units of the sentence generation model generation device 10 will be described. The constraint data generation unit 11 generates constraint data based on a corpus. The generation of constraint data and the learning data that includes it will be described with reference to FIGS. 5 and 6.
FIG. 5 is a diagram showing an example of generating the first data, the second data, and the constraint data based on a corpus. The constraint data generation unit 11 acquires a corpus cp0 from, for example, the corpus storage unit 40. The corpus cp0 consists of a first sentence cp01 in the first language and a second sentence cp02 in the second language. In the example of FIG. 5, the first sentence cp01 is the Japanese sentence 「彼は外国人と話せるようになるために英語を勉強する」 (kare ha gaikokujin to hanaseru yoni naru tameni eigo wo benkyo suru), and the second sentence cp02 is its English parallel translation, "He studies English so that he can talk with foreigners."
The constraint data generation unit 11 specifies constraint words cx from the array of words constituting the second sentence cp02. The constraint words may be specified, for example, based on designation input by a user or the like. For example, words constituting expressions that must be used in the second sentence cp02 as the parallel translation of the first sentence cp01 may be designated as constraint words by such input. The constraint words may also be specified at random. In the example shown in FIG. 5, the three words and word strings "He", "so that", and "talk with" are specified as the constraint words cx.
Based on the specified constraint words cx, the constraint data generation unit 11 generates constraint data c01 that contains the constraint words cx while preserving their word order in the second sentence. As shown in FIG. 5, the constraint data c01 is data containing "He", "so that", and "talk with" in the order in which they appear in the second sentence cp02.
The constraint data may also be data consisting of an array of the constraint words and a replacement symbol, in which each word or word string other than the constraint words in the word array of the output sentence has been replaced with a predetermined replacement symbol. The constraint data generation unit 11 replaces each word or word string in the word array of the second sentence cp02 other than the words specified as the constraint words cx with a replacement symbol rs, and generates the constraint data c01 consisting of the array of the constraint words cx and the replacement symbols rs. In the example shown in FIG. 5, the replacement symbol rs is "*" (asterisk), and the constraint data generation unit 11 generates the constraint data c01 "He * so that * talk with *", in which the three constraint words cx "He", "so that", and "talk with" and the replacement symbols rs are arranged in the order in which they appear in the second sentence cp02.
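The replacement step above can be sketched as a small function. This is an illustrative reimplementation under simplifying assumptions (whitespace tokenization, case-sensitive exact phrase matching), not the patented algorithm itself: each run of non-constraint words collapses into a single "*".

```python
# Generate constraint data of the form "He * so that * talk with *" from
# the output sentence and the specified constraint words/phrases.

def make_constraint_data(sentence_tokens, constraint_phrases):
    """Keep constraint phrases in their original order; collapse each run
    of non-constraint words into a single '*' replacement symbol."""
    out, i = [], 0
    while i < len(sentence_tokens):
        for phrase in constraint_phrases:
            if sentence_tokens[i:i + len(phrase)] == phrase:
                out.extend(phrase)          # keep the constraint phrase verbatim
                i += len(phrase)
                break
        else:
            if not out or out[-1] != "*":   # start of a non-constraint run
                out.append("*")
            i += 1
    return out

tokens = "He studies English so that he can talk with foreigners .".split()
phrases = [["He"], ["so", "that"], ["talk", "with"]]
c01 = make_constraint_data(tokens, phrases)
```

With the constraint words of the FIG. 5 example, this yields the array `He * so that * talk with *`, matching the constraint data c01 described in the text.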
The constraint data generation unit 11 also generates the first data a01 and the second data b01 of the learning data based on the first sentence cp01 and the second sentence cp02, respectively, and generates learning data consisting of the first data a01, the constraint data c01, the start symbol ss, and the second data b01.
The constraint data generation unit 11 may also include in the constraint data information indicating its relationship to the second data b01. In the example shown in FIG. 5, the constraint data generation unit 11 includes in the constraint data c01 a symbol cl01 indicating that the constraint data c01 contains the constraint words cx to be used in the second data b01 (the output sentence).
As described with reference to FIG. 5, the constraint data can easily be generated from a corpus, which prevents an increase in the cost of obtaining learning data that includes constraint data.
FIG. 6 is a diagram showing examples of the first data, the second data, and the constraint data used for training the sentence generation model. The learning data for the sentence generation model MD may include as the first data, instead of an array of the words constituting an input sentence, an arbitrary symbol, which is a predetermined symbol having no linguistic meaning.
As shown in FIG. 6, the learning data may consist of first data composed of an arbitrary symbol a03 having no semantic content, constraint data c03 containing the constraint words specified as expressions to be used in the output sentence in the second language, and second data b03 consisting of the output sentence in the second language. Based on a corpus of example sentences in the second language, the constraint data generation unit 11 may generate the constraint data c03 containing the constraint words specified from the word array of an example sentence, in the same manner as in the example of FIG. 5, extract the example sentence as the second data b03, and further add the arbitrary symbol a03 to generate the learning data.
According to the learning data example shown in FIG. 6, the relationship between the constraint data and the second data can be learned in the decoder even when there is no first data corresponding to an input sentence that is the parallel translation of the output sentence. The learning data can therefore be expanded at low cost, and the accuracy with which the decoder produces the desired output sentence can be increased.
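The two kinds of training example just described (a parallel pair as in FIG. 5, and a monolingual example with an arbitrary symbol as in FIG. 6) can be sketched together. The token names `<any>` and `<s>`, and the example sentence used for the monolingual case, are illustrative assumptions.

```python
# Assemble (encoder input, decoder input) pairs for both kinds of example:
# when no parallel source sentence exists, the first data is only the
# arbitrary symbol, which carries no linguistic content.

ARBITRARY = "<any>"  # arbitrary symbol a03 (token name assumed)

def make_example(constraint_tokens, target_tokens, source_tokens=None):
    """Return (encoder input, decoder input) for one training example."""
    encoder_in = list(source_tokens) if source_tokens else [ARBITRARY]
    decoder_in = list(constraint_tokens) + ["<s>"] + list(target_tokens)
    return encoder_in, decoder_in

# Monolingual example in the style of FIG. 6: no source sentence available.
enc, dec = make_example(
    ["so", "that", "*"],
    "I left early so that I could rest .".split(),
)
```

Mixing both example types in one training set lets the decoder learn the constraint-to-output relationship even from second-language text that has no parallel translation.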
Referring again to FIG. 1, the encoder input unit 12 inputs the first data a to the encoder en according to the word order of its array.
The decoder input unit 13 inputs to the decoder de, word by word and in array order, the constraint data c, the start symbol ss, which is a predetermined symbol signifying the start of output of the output sentence, and the second data b.
The update unit 14 updates the weight coefficients constituting the encoder en and the decoder de based on the word-by-word error between the word array output from the decoder de after the input of the start symbol ss and the word array contained in the second data b.
When the sentence generation model MD is composed of, for example, a recurrent neural network (RNN), the encoder input unit 12 sequentially inputs the word vectors of the words constituting the first data a to the input layer of the RNN constituting the encoder en according to word order. The output of the intermediate layer of the encoder en based on the input of the last word vector of the first data a is output to the decoder de.
Subsequently, the decoder input unit 13 sequentially inputs the word vectors of the words constituting the constraint data c to the input layer of the RNN constituting the decoder de according to word order. The decoder input unit 13 further inputs the start symbol ss and the second data b sequentially to the decoder de according to word order. When the start symbol ss is input to the decoder de, the decoder de sequentially outputs the series of word vectors of the output sentence t together with their likelihoods (for example, computed by a softmax function).
 更新部14は、デコーダdeから出力された単語の系列と、第2データbの単語の系列との誤差を単語ごとに計算し、例えば誤差逆伝搬法によりエンコーダen及びデコータdeのニューラルネットワークを構成する重み係数を更新する。 The update unit 14 calculates an error for each word between the word sequence output from the decoder de and the word sequence of the second data b, and constructs a neural network of the encoder en and the decoder de by, for example, the error back propagation method. update the weighting factors to
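The training-time arrangement of the decoder inputs described above (constraint data c, then the start symbol ss, then the second data b under teacher forcing, with the error computed only on the positions after ss) can be sketched as follows. This is an illustrative sketch, not the patent's implementation; the token markers and function name are hypothetical.

```python
# Hypothetical sketch of assembling the decoder-side training sequence.
START = "<ss>"  # start symbol ss
END = "<es>"    # end symbol es, terminating the output sentence

def build_decoder_io(constraint_data, second_data):
    """Assemble (decoder inputs, loss targets) for one training example.

    The decoder consumes the constraint data c, then the start symbol ss,
    then the second data b (teacher forcing).  The word-by-word error is
    computed only at the positions after ss, whose targets are the words
    of b followed by the end symbol.
    """
    decoder_inputs = list(constraint_data) + [START] + list(second_data)
    targets = list(second_data) + [END]  # aligned with positions from ss onward
    return decoder_inputs, targets

c = ["He", "*", "so that", "*", "talk with", "*"]
b = ["He", "studies", "English", "so that", "he", "can",
     "talk with", "foreigners"]
inputs, targets = build_decoder_io(c, b)
# inputs is c + ["<ss>"] + b; targets is b + ["<es>"]
```

The constraint tokens thus act as a prefix the decoder conditions on, while the loss never penalizes the constraint positions themselves.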
FIG. 7 is a diagram for explaining the schematic configuration of a transformer, which is one example of an encoder-decoder model. As shown in FIG. 7, when the sentence generation model MD1 (MD) is configured as a transformer, the encoder input unit 12 inputs the word vectors aw11, aw12, ..., aw1n (n is an integer of 2 or more) of the words constituting the first data a1 into the input layer ila of the encoder en1 according to the arrangement order of the words. Unlike an RNN, which consumes words sequentially, a transformer can process the input data in parallel.
In the encoder en1, self-attention sa1 is computed from the input layer ila to the intermediate layer mla, and each word vector is transformed into a vector according to the self-attention sa1. Similarly, self-attention sa2 is computed from the intermediate layer mla to the output layer ola, and the word vectors are transformed further. In addition, source-target attention ta is computed from the output layer ola of the encoder en1 to the input layer ilb of the decoder de1.
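The attention computations mentioned above follow the standard scaled dot-product form; the following minimal sketch (an assumption for illustration, not the patent's exact formulation) shows how a set of query vectors attends over key/value vectors. In self-attention, queries, keys, and values all come from the same layer; in source-target attention, the keys and values come from the encoder's output layer instead.

```python
import math

def attention(queries, keys, values):
    """Minimal scaled dot-product attention sketch: each query is scored
    against all keys, the scores are normalized with a softmax, and the
    output is the attention-weighted sum of the value vectors."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        m = max(scores)                      # numerically stable softmax
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        out.append([sum(w * v[i] for w, v in zip(weights, values))
                    for i in range(len(values[0]))])
    return out

# A query equidistant from both keys mixes the two values evenly.
ctx = attention([[0.0, 0.0]], [[1.0, 0.0], [0.0, 1.0]],
                [[1.0, 0.0], [0.0, 1.0]])
# ctx == [[0.5, 0.5]]
```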
The decoder input unit 13 inputs the word vectors cw11, ..., cw1n (n is an integer of 2 or more) of the words constituting the constraint data c1, the start symbol ss, and the word vectors bw11, bw12, ..., bw1n (n is an integer of 2 or more) of the words constituting the second data b1 into the input layer ilb of the decoder de1 according to the arrangement order of the words; in the learning phase, these are input in parallel.
In the decoder de1, as in the encoder en1, self-attention sa3 is computed from the input layer ilb to the intermediate layer mlb, and the vectors are transformed according to the self-attention sa3. Similarly, self-attention sa4 is computed from the intermediate layer mlb to the output layer olb, and the vectors are transformed according to the self-attention sa4.
The updating unit 14 computes the word-by-word error between the word sequence t11, ..., t1n (n is an integer of 2 or more), based on the word vectors wv output at the stages following the input of the start symbol ss, and the word sequence bw11, ..., bw1n of the words constituting the second data b1, and updates the weighting coefficients used to compute the self-attention and the source-target attention by backpropagation.
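The word-by-word error used by the updating unit 14 is, in practice, a cross-entropy term per output position: the negative log of the likelihood the decoder assigns to the correct word of the second data. A minimal sketch under that assumption (the vocabulary, scores, and function names here are illustrative, not from the patent):

```python
import math

def softmax(logits):
    """Convert raw scores over the vocabulary into likelihoods."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def word_errors(step_logits, vocab, target_words):
    """Per-word cross-entropy between the decoder's output distribution
    at each stage after ss and the corresponding target word of the
    second data; these are the errors driving backpropagation."""
    errors = []
    for logits, target in zip(step_logits, target_words):
        probs = softmax(logits)
        errors.append(-math.log(probs[vocab.index(target)]))
    return errors

vocab = ["he", "studies", "English", "<es>"]
step_logits = [[4.0, 0.0, 0.0, 0.0],   # stage 1: strongly favors "he"
               [0.0, 0.1, 0.0, 0.0]]   # stage 2: nearly uniform
errs = word_errors(step_logits, vocab, ["he", "studies"])
# errs[0] is small (confident and correct); errs[1] is near log(4)
```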
Referring again to FIG. 1, the model output unit 15 outputs the sentence generation model MD obtained after machine learning based on the required amount of learning data. The model output unit 15 may store the sentence generation model MD in the model storage unit 30.
Next, with reference to FIGS. 2 and 8, the functional units of the sentence generation device 20 and the processing in the sentence generation phase using the trained sentence generation model will be described. FIG. 8 is a diagram schematically showing sentence generation processing by the sentence generation model.
As shown in FIG. 8, the sentence generation model MD2 is a model trained and constructed by the sentence generation model generation device 10. The sentence generation model MD2 includes an encoder en2 and a decoder de2.
The sentence generation model MD (MD1, MD2), which is a model including a trained neural network, can be regarded as a program that is read or referenced by a computer and causes the computer to execute predetermined processing and realize predetermined functions.
That is, the trained sentence generation model MD (MD1, MD2) of the present embodiment is used in a computer having a processor and a memory. Specifically, in accordance with instructions from the trained sentence generation model MD (MD1, MD2) stored in the memory, the processor of the computer performs, on the input data supplied to the input layer of the neural network, computations based on the trained weighting coefficients (parameters), functions, and the like corresponding to each layer, and outputs results (likelihoods) from the output layer.
The input unit 21 inputs the words aw21, aw22, ..., aw2n (n is an integer of 2 or more) constituting the input data a2, which constitutes the input sentence, into the encoder en2 according to their arrangement order. The encoder en2 outputs its computation result to the decoder de2.
The constraint data input unit 22 inputs the symbol ct2 and the words cw21 to cw24, ..., cw2n (n is an integer of 2 or more) constituting the input constraint data c2 into the decoder de2 according to their arrangement order. The input constraint data c2 is data containing one or more input constraint words, each arbitrarily specified as a word to be used in the output sentence. The input constraint words are included in the input constraint data c2 while maintaining the arrangement order of the words in the output sentence. The input constraint words may be specified, for example, by a designation input from a user or the like.
The input constraint data c2 may also be data consisting of an arrangement of input constraint words and replacement symbols, in which each word or word string other than the input constraint words in the arrangement of words constituting the output sentence is replaced with a predetermined replacement symbol rs. In the example shown in FIG. 8, the input constraint data c2 consists of input constraint words that are words or word strings such as "He" and "so that", and the replacement symbol rs "*" (asterisk), arranged so as to maintain their arrangement order in the output sentence.
The input constraint data c2 may also include the symbol ct2. The symbol ct2 indicates, for example, that the input constraint data c2 is data containing input constraint words to be used in the output sentence t2.
The word input unit 23 inputs the start symbol ss to the decoder de2 after the input of the input constraint data c2. In response to the start symbol ss, the decoder de2 outputs the word tw21 at the beginning of the output sentence t2. At each stage after the input of the start symbol ss, the word input unit 23 sequentially feeds the word output from the decoder de2 at the previous stage back into the decoder de2. In response to the sequentially input words, the decoder de2 sequentially outputs the sequence of words tw21, tw22, ..., tw2n (n is an integer of 2 or more) constituting the output sentence t2.
When the end symbol es, which indicates the end of the output of the output sentence, is output, the output unit 24 arranges the words tw21, tw22, ..., tw2n sequentially output at the respective stages of the decoder de2 to generate the output sentence t2. The output unit 24 then outputs the generated output sentence t2. The manner of outputting the output sentence t2 is not limited; it may be, for example, storage in a predetermined storage means, display on a display means, or output by voice.
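The generation loop described above (constraint data, then the start symbol, then feeding each emitted word back in until the end symbol appears) can be sketched as follows. This is an illustrative sketch only; `next_word` is a hypothetical stand-in for one decoder step returning the highest-likelihood word.

```python
START = "<ss>"  # start symbol ss
END = "<es>"    # end symbol es

def generate(next_word, input_constraint_data, max_len=50):
    """Greedy generation sketch: feed the input constraint data, then the
    start symbol ss; then append each emitted word to the decoder's input
    prefix until the end symbol es is produced (or max_len is reached)."""
    prefix = list(input_constraint_data) + [START]
    sentence = []
    for _ in range(max_len):
        word = next_word(prefix)
        if word == END:
            break
        sentence.append(word)
        prefix.append(word)
    return sentence

# A stub decoder replaying a canned continuation, for illustration only.
canned = iter(["He", "studies", "English", "<es>"])
result = generate(lambda prefix: next(canned), ["He", "*"])
# result == ["He", "studies", "English"]
```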
FIG. 9 is a diagram showing examples of input constraint data and output sentences that can be output based on the input constraint data. In the example shown in FIG. 9, the input sentence "kare ha gaikokujin to hanaseruyoninarutameni eigo wo benkyo suru" ("He studies English so that he can speak with foreigners") is input to the encoder en2 as the input data a2. If this input sentence were translated by an ordinary translation engine without any constraints, an output sentence such as "He studies English to become able to speak with foreigners." would be output. In the sentence generation device 20 of the present embodiment, different output sentences can be output depending on the input constraint data input to the decoder de2.
As shown in FIG. 9, when the input constraint data "He * so that * talk with *" is input, an output sentence such as "He studies English so that he can talk with foreigners." is output. When the input constraint data "He * so that * chat *" is input, an output sentence such as "He studies English so that he can chat foreigners." is output. When the input constraint data "He * in order to * speak *" is input, an output sentence such as "He studies English in order to become able to speak with foreigners." is output.
Further, when the input constraint data "He * to be able to *" is input, an output sentence such as "He studies English to be able to speak with foreigners." is output. When the input constraint data "He * for being *" is input, an output sentence such as "He studies English for being able to speak with foreigners." is output. When the input constraint data "His study of English * talk with *" is input, an output sentence such as "His study of English is to be able to talk with foreigners." is output.
An evaluation system configured by the sentence generation device 20 for evaluating a created sentence will be described with reference to FIG. 10 together with FIG. 2. FIG. 10 is a diagram showing the processing for evaluating a created sentence in the evaluation system configured by the sentence generation device 20.
For each of the words tw31, tw32, ..., tw3n (n is an integer of 2 or more) output at the respective stages following the input of the start symbol ss, the decoder de3 shown in FIG. 10 outputs, word by word, a likelihood indicating how plausible that word is as a word constituting the output sentence t3. The evaluation system configured by the sentence generation device 20 evaluates a created sentence, created and input by a user, against the output sentence t3 as the correct answer. In response to the presentation of the input sentence corresponding to the output sentence t3, the user inputs, for example, a parallel translation of the input sentence in the second language as the created sentence. Although the present embodiment assumes that a created sentence input by a user is evaluated, a created sentence created and input by a person other than the user, a device, or the like may also be evaluated.
The created sentence acquisition unit 25 acquires the created sentence r3, which was created in the second language by the user and input to the evaluation system. The created sentence r3 consists of an arrangement of words rw31, rw32, ..., rw3n (n is an integer of 2 or more).
At each stage after the input of the start symbol ss, the created sentence input unit 26 sequentially inputs into the decoder de3 the word vectors of the words rw31, rw32, ..., rw3n constituting the created sentence r3, created in the second language, instead of the words tw31, tw32, ..., tw3n output from the decoder de3 at the previous stage.
Based on the input of the start symbol ss and the sequential input of the words rw31, rw32, ..., rw3n constituting the created sentence r3, the created sentence evaluation unit 27 evaluates the created sentence r3 by comparing the likelihoods of the words rw31, rw32, ..., rw3n output from the decoder de3 at the respective stages following the input of the start symbol ss with the likelihoods of the words tw31, tw32, ..., tw3n constituting the output sentence t3.
Specifically, at each output stage, the decoder de3 outputs a likelihood for each word of the entire vocabulary handled by the sentence generation model generation device 10 and the sentence generation device 20. In the sentence generation phase, the output sentence t3 is constructed by arranging the word having the highest likelihood at each output stage.
At each stage of the decoder de3, the created sentence evaluation unit 27 obtains, from the likelihoods of the vocabulary output in response to the input of the start symbol ss and the words (rw31, rw32, ..., rw3n) of the preceding stages, the likelihoods associated with the words rw31, rw32, ..., rw3n of the created sentence r3 and the end symbol es.
The created sentence evaluation unit 27 calculates and outputs an evaluation value of the created sentence r3 by comparing the likelihoods of the words tw31, tw32, ..., tw3n constituting the output sentence t3 with the likelihoods of the words rw31, rw32, ..., rw3n of the created sentence r3. The method of calculating the evaluation value is not limited; it may be based on, for example, the ratio of the likelihoods of corresponding words in the sentences t3 and r3, or the sum or average of the likelihoods of each of the sentences t3 and r3.
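As one concrete instance of the per-word likelihood-ratio option mentioned above, the evaluation value could be computed as follows. This is a sketch of one of the possibilities the text leaves open, not the patent's prescribed formula; the function name is hypothetical.

```python
def evaluate(created_likelihoods, reference_likelihoods):
    """Average, over aligned positions, of the ratio between the
    likelihood the decoder assigns to each word of the user's created
    sentence r3 and the likelihood of the corresponding word of the
    model's own output sentence t3.  A value near 1.0 means the created
    sentence is about as plausible as the model's best output."""
    ratios = [c / r
              for c, r in zip(created_likelihoods, reference_likelihoods)]
    return sum(ratios) / len(ratios)

# The reference words were the argmax at each stage, so their likelihoods
# bound those of the created sentence's words from above.
score = evaluate([0.40, 0.70, 0.90], [0.80, 0.70, 0.90])
# score == (0.5 + 1.0 + 1.0) / 3
```

In practice the two sentences may differ in length, so a real implementation would also need an alignment or length-normalization policy, which the source leaves unspecified.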
According to the evaluation system described with reference to FIG. 10, the created sentence is evaluated based on a comparison between the likelihood of each word constituting the output sentence and the likelihood of each word obtained by sequentially inputting the words of the created and input sentence into the decoder. This makes it possible to configure an evaluation system that evaluates the plausibility of the created sentence as a parallel translation corresponding to the input sentence.
FIG. 11 is a flowchart showing the processing of the sentence generation model generation method in the sentence generation model generation device 10.
In step S1, the sentence generation model generation device 10 acquires learning data including the first data a, the second data b, and the constraint data c. The constraint data in the learning data may be data generated in advance based on the corpus and stored in the corpus storage unit 40, or data generated by the constraint data generation unit 11 based on the corpus.
In step S2, the encoder input unit 12 inputs the first data a into the encoder en according to the arrangement order of the words.
In step S3, the decoder input unit 13 inputs the constraint data c into the decoder de. Next, in step S4, the decoder input unit 13 inputs the start symbol ss into the decoder de. Then, in step S5, the decoder input unit 13 inputs the second data b into the decoder de, word by word, according to the arrangement order.
In step S6, the updating unit 14 computes the word-by-word error between the sequence of words output from the decoder de at the stages following the input of the start symbol ss and the sequence of words included in the second data b, and updates the weighting coefficients constituting the encoder en and the decoder de by backpropagation.
In step S7, the updating unit 14 determines whether machine learning based on the required amount of learning data has been completed. If it is determined that learning has been completed, the processing proceeds to step S8. Otherwise, the processing of steps S1 to S6 is repeated.
In step S8, the model output unit 15 outputs the trained sentence generation model MD.
FIG. 12 is a flowchart showing the processing of the sentence generation method using the trained sentence generation model MD in the sentence generation device 20.
In step S11, the input unit 21 inputs the words of the input data constituting the input sentence into the encoder of the sentence generation model, word by word, according to their arrangement order. In response to the input data, the encoder outputs its computation result to the decoder.
In step S12, the constraint data input unit 22 inputs the input constraint data into the decoder, word by word, according to the arrangement order. Next, in step S13, the word input unit 23 inputs the start symbol ss into the decoder after the input of the input constraint data.
In step S14, the output unit 24 acquires the word (or symbol) output from the output layer of the decoder. In step S15, the output unit 24 determines whether the output from the decoder is the end symbol, which indicates the end of the output of the output sentence. If the output from the decoder is determined to be the end symbol, the processing proceeds to step S17; otherwise, the processing proceeds to step S16.
In step S16, the word input unit 23 inputs the word output from the output layer at the previous stage of the decoder into the input layer at the current stage of the decoder. The processing then returns to step S14.
In step S17, the output unit 24 arranges the words sequentially output from the output layer at the respective stages of the decoder to generate the output sentence. Then, in step S18, the output unit 24 outputs the output sentence.
Next, a sentence generation model generation program for causing a computer to function as the sentence generation model generation device 10 of the present embodiment will be described with reference to FIG. 13.
FIG. 13 is a diagram showing the configuration of the sentence generation model generation program. The sentence generation model generation program P1 comprises a main module m10 that performs overall control of the sentence generation model generation processing in the sentence generation model generation device 10, a constraint data generation module m11, an encoder input module m12, a decoder input module m13, an updating module m14, and a model output module m15. The modules m11 to m15 realize the functions of the constraint data generation unit 11, the encoder input unit 12, the decoder input unit 13, the updating unit 14, and the model output unit 15, respectively.
The sentence generation model generation program P1 may be transmitted via a transmission medium such as a communication line, or may be stored in a recording medium M1 as shown in FIG. 13.
Next, a sentence generation program for causing a computer to function as the sentence generation device 20 of the present embodiment will be described with reference to FIG. 14.
FIG. 14 is a diagram showing the configuration of the sentence generation program. The sentence generation program P2 comprises a main module m20 that performs overall control of the sentence generation processing in the sentence generation device 20, an input module m21, a constraint data input module m22, a word input module m23, and an output module m24. The sentence generation program P2 may further comprise a created sentence acquisition module m25, a created sentence input module m26, and a created sentence evaluation module m27. The modules m21 to m27 realize the functions of the input unit 21, the constraint data input unit 22, the word input unit 23, the output unit 24, the created sentence acquisition unit 25, the created sentence input unit 26, and the created sentence evaluation unit 27, respectively.
The sentence generation program P2 may be transmitted via a transmission medium such as a communication line, or may be stored in a recording medium M2 as shown in FIG. 14.
According to the sentence generation model generation device 10, the sentence generation model generation method, and the sentence generation model generation program P1 of the present embodiment described above, the sentence generation model is configured as an encoder-decoder model including an encoder and a decoder. In training the sentence generation model, in which the first data corresponding to the input sentence is input to the encoder and the second data corresponding to the output sentence is input to the decoder, constraint data including one or more constraint words identified from the arrangement of words constituting the output sentence is input to the decoder together with the second data. The arrangement order of the constraint words in the constraint data maintains their arrangement order in the word arrangement. By inputting to the decoder constraint data in which the words constituting a desired specific expression are specified as constraint words, the sentence generation model learns the relationship between the constraint data and the second data, so that a sentence generation model that outputs output sentences using the specific expression consisting of the constraint words included in the constraint data can be obtained.
In a sentence generation model generation device according to another aspect, the constraint data may consist of an arrangement of constraint words and replacement symbols in which each word or word string other than the constraint words in the arrangement of words constituting the output sentence is replaced with a predetermined replacement symbol.
According to the above aspect, since the constraint data consists of an arrangement of the constraint words and the replacement symbols substituted for the words or word strings other than the constraint words, the words in the second data corresponding to the constraint words are learned as words to be used in the output sentence, while the words in the second data corresponding to the replacement symbols are learned as arbitrary expressions in the output sentence. Therefore, a sentence generation model capable of outputting output sentences using a specific expression composed of the constraint words can be generated.
A sentence generation model generation device according to another aspect may further include a constraint data generation unit that, based on a corpus consisting of a first sentence composed in the first language and a second sentence that is a parallel translation of the first sentence composed in the second language, generates constraint data that includes constraint words identified from the arrangement of words constituting the second sentence while maintaining the arrangement order of the words in the second sentence.
According to the above aspect, constraint data for designating the words corresponding to the desired specific expression to be used in the output sentence can be obtained as learning data based on the corpus.
In a sentence generation model generation device according to another aspect, the constraint data generation unit may replace the words or word strings other than the words specified as constraint words in the word arrangement constituting the second sentence with replacement symbols, thereby generating constraint data consisting of an arrangement of constraint words and replacement symbols.
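One way such a constraint data generation unit could derive the constraint data from the second sentence of the corpus is sketched below: the specified constraint words are kept in their original order, and since a whole word string may be replaced by a single symbol, consecutive replaced words are collapsed into one replacement symbol. This is an illustrative sketch under those assumptions; the function name and collapsing policy are not prescribed by the source.

```python
REPLACEMENT = "*"  # replacement symbol rs

def make_constraint_data(second_sentence_words, constraint_words):
    """Keep the words specified as constraint words, preserving their
    arrangement order in the second sentence, and replace every other
    word with the replacement symbol, collapsing each run of replaced
    words into a single symbol."""
    out = []
    for word in second_sentence_words:
        if word in constraint_words:
            out.append(word)
        elif not out or out[-1] != REPLACEMENT:
            out.append(REPLACEMENT)
    return out

sentence = "He studies English so that he can talk with foreigners".split()
constraint = make_constraint_data(sentence,
                                  {"He", "so", "that", "talk", "with"})
# constraint == ["He", "*", "so", "that", "*", "talk", "with", "*"]
```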
According to the above aspect, constraint data for designating the words corresponding to the desired specific expression to be used in the output sentence, and for designating arbitrary expressions in the output sentence, can be obtained as learning data based on the corpus.
In a sentence generation model generation device according to another aspect, the first data may be an arbitrary symbol, which is a predetermined symbol having no linguistic meaning, instead of the arrangement of the plurality of words constituting the input sentence.
According to the above aspect, the decoder can learn the relationship between the constraint data and the second data even when no first data corresponding to an input sentence that is a parallel translation of the output sentence exists. Therefore, the learning data can be expanded at low cost, and the accuracy of the output sentences output by the decoder with respect to the desired output can be improved.
To solve the above problem, a sentence generation model according to one aspect of the present invention is a sentence generation model trained by machine learning that causes a computer to function so as to generate, in response to the input of an input sentence in a first language, an output sentence in a second language different from the first language, wherein: the learning data used for the machine learning of the sentence generation model includes first data including an arrangement of a plurality of words constituting the input sentence, second data including an arrangement of a plurality of words constituting the output sentence corresponding to the input sentence, and constraint data; the constraint data includes one or more constraint words, which are words identified from the arrangement of words constituting the output sentence, and the arrangement order of the constraint words in the constraint data maintains their arrangement order in the word arrangement; the sentence generation model is an encoder-decoder model including a neural network and composed of an encoder and a decoder; and the sentence generation model is constructed by machine learning in which the first data is input to the encoder according to the arrangement order of the words, the constraint data, a start symbol that is a predetermined symbol indicating the start of output of the output sentence, and the second data are input to the decoder according to the arrangement order of the words and symbols of the constraint data, the start symbol, and the second data, and the weighting coefficients constituting the encoder and the decoder are updated based on the word-by-word error between the arrangement of words output from the decoder at the stages following the input of the start symbol and the arrangement of words included in the second data.
According to the above aspect, the sentence generation model is configured as an encoder-decoder model including an encoder and a decoder. In training the sentence generation model, the first data corresponding to the input sentence is input to the encoder, the second data corresponding to the output sentence is input to the decoder, and constraint data including one or more constraint words identified from the arrangement of words constituting the output sentence is input to the decoder together with the second data. The arrangement order of the constraint words in the constraint data maintains their arrangement order in the word arrangement. Since the relationship between the constraint data and the second data is learned by inputting to the decoder constraint data in which the words constituting a desired specific expression are specified as constraint words, the sentence generation model can output output sentences using the specific expression consisting of the constraint words included in the constraint data.
 To solve the above problems, a sentence generation device according to one aspect of the present invention uses a sentence generation model built by machine learning to generate, in response to an input sentence in a first language, an output sentence in a second language different from the first language. The training data used for machine learning of the sentence generation model includes first data containing a sequence of a plurality of words corresponding to the input sentence, second data containing a sequence of a plurality of words corresponding to the output sentence corresponding to the input sentence, and constraint data. The constraint data contains one or more constraint words, i.e., words selected from the sequence of words constituting the output sentence, and the order of the constraint words in the constraint data preserves their order in that word sequence. The sentence generation model is an encoder-decoder model comprising an encoder and a decoder that include a neural network, and is built by machine learning in which the first data is input to the encoder in word order; the constraint data, a start symbol (a predetermined symbol indicating the start of output of the output sentence), and the second data are input to the decoder in the order of the words and symbols of the constraint data, the start symbol, and the second data; and the weighting coefficients constituting the encoder and the decoder are updated based on the word-by-word error between the word sequence output by the decoder at the stages following the input of the start symbol and the word sequence contained in the second data. The sentence generation device comprises: an input unit that inputs input data constituting the input sentence to the encoder in word order; a constraint data input unit that inputs to the decoder input constraint data containing input constraint words arbitrarily specified as words to be used in the output sentence, while preserving their order in the output sentence; a word input unit that inputs the start symbol to the decoder after the input of the input constraint data and, at each stage after the input of the start symbol, sequentially inputs to the decoder the word output from the decoder at the preceding stage; and an output unit that generates the output sentence by arranging the words sequentially output at the respective stages of the decoder, and outputs the generated output sentence.
 According to this aspect, the sentence generation model is an encoder-decoder model comprising an encoder and a decoder. In training the model, the first data corresponding to the input sentence is fed to the encoder, the second data corresponding to the output sentence is fed to the decoder, and constraint data containing one or more constraint words selected from the word sequence of the output sentence is fed to the decoder together with the second data. The order of the constraint words in the constraint data preserves their order in the word sequence. The trained sentence generation model has thus learned the relationship between the constraint data and the second data. Accordingly, by inputting the input data constituting the input sentence to the encoder and inputting to the decoder input constraint data that specifies the constraint conditions on the output sentence, an output sentence using the desired specific expression can be produced.
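The generation procedure just described (constraint data fed to the decoder, then the start symbol, then greedy feedback of each emitted word) can be sketched as a simple loop. `encode` and `decoder_step` stand in for the trained encoder and decoder; all names here are illustrative assumptions, not an actual API.

```python
START, END = "<s>", "</s>"   # assumed start/end symbols

def generate(encode, decoder_step, src_words, input_constraint_words, max_len=50):
    """Generate an output sentence under the given input constraint data.

    decoder_step(memory, consumed) returns the next word given the source
    encoding and all decoder tokens consumed so far.
    """
    memory = encode(src_words)                  # input unit: source into encoder
    consumed = list(input_constraint_words)     # constraint data input unit
    consumed.append(START)                      # word input unit: start symbol
    output = []
    for _ in range(max_len):
        word = decoder_step(memory, consumed)   # next word from the decoder
        if word == END:
            break
        output.append(word)                     # output unit collects words
        consumed.append(word)                   # feed back at the next stage
    return output
```

A practical implementation would typically replace the greedy `decoder_step` with beam search, but the structure of the units (input, constraint data input, word input, output) is unchanged.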
 In a sentence generation device according to another aspect, the decoder outputs, for each word it emits at each stage after the input of the start symbol, a likelihood indicating that word's plausibility as a word constituting the output sentence. The sentence generation device further comprises: a created sentence input unit that, at each stage after the input of the start symbol, sequentially inputs to the decoder, in place of the word output from the decoder at the preceding stage, the words constituting a created sentence composed in the second language; and a created sentence evaluation unit that evaluates the created sentence by comparing the likelihood of each word of the created sentence, output from the decoder at each stage after the input of the start symbol based on the input of the start symbol and the sequential input of the words of the created sentence, with the likelihood of each word constituting the output sentence.
 According to this aspect, the created sentence is evaluated by comparing the likelihood of each word constituting the output sentence with the likelihood of each word obtained by sequentially feeding the words of the composed and input created sentence to the decoder. This makes it possible to build an evaluation system that assesses the plausibility of a created sentence as a parallel translation of the input sentence.
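A minimal sketch of this evaluation: the created sentence is force-fed to the decoder word by word after the constraint data and start symbol, and its summed word likelihoods are compared with those of the model's own output sentence. The scoring formula below (a ratio of exponentiated log-likelihood sums, clamped at 1) is an illustrative choice, not one specified in the application; `word_likelihood` is an assumed stand-in for the decoder's per-word likelihood output.

```python
import math

def score_created_sentence(word_likelihood, memory, constraint_words,
                           created_words, output_words, start="<s>"):
    """Return a score in (0, 1]; close to 1 means the created sentence is
    about as plausible to the decoder as the model's own output sentence."""
    def total_log_likelihood(words):
        consumed = list(constraint_words) + [start]
        total = 0.0
        for w in words:                               # teacher-force each word
            p = word_likelihood(memory, consumed, w)  # likelihood of w next
            total += math.log(p)
            consumed.append(w)
        return total

    created = total_log_likelihood(created_words)
    reference = total_log_likelihood(output_words)
    # Both sums are negative; a created sentence as likely as the model's
    # own output yields a score near 1.
    return math.exp(created - reference) if created <= reference else 1.0
```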
 Although the present embodiment has been described in detail above, it will be apparent to those skilled in the art that the present embodiment is not limited to the embodiments described in this specification. The present embodiment can be practiced with modifications and variations without departing from the spirit and scope of the present invention defined by the claims. Accordingly, the description in this specification is intended to be illustrative and has no restrictive meaning with respect to the present embodiment.
 Each aspect/embodiment described in this specification may be applied to systems using LTE (Long Term Evolution), LTE-A (LTE-Advanced), SUPER 3G, IMT-Advanced, 4G, 5G, FRA (Future Radio Access), W-CDMA (registered trademark), GSM (registered trademark), CDMA2000, UMB (Ultra Mobile Broadband), IEEE 802.11 (Wi-Fi), IEEE 802.16 (WiMAX), IEEE 802.20, UWB (Ultra-WideBand), Bluetooth (registered trademark), or other appropriate systems, and/or to next-generation systems extended on the basis of these.
 The order of the processing procedures, sequences, flowcharts, and the like of each aspect/embodiment described in this specification may be rearranged as long as no contradiction arises. For example, the methods described in this specification present the elements of the various steps in an exemplary order and are not limited to the specific order presented.
 Input and output information and the like may be stored in a specific location (for example, a memory) or managed in a management table. Input and output information and the like may be overwritten, updated, or appended. Output information and the like may be deleted. Input information and the like may be transmitted to another device.
 A determination may be made by a value represented by one bit (0 or 1), by a Boolean value (true or false), or by a numerical comparison (for example, comparison with a predetermined value).
 Each aspect/embodiment described in this specification may be used alone, in combination, or switched in accordance with execution. Notification of predetermined information (for example, notification of "being X") is not limited to being performed explicitly, and may be performed implicitly (for example, by not notifying the predetermined information).
 Although the present disclosure has been described in detail above, it will be apparent to those skilled in the art that the present disclosure is not limited to the embodiments described in the present disclosure. The present disclosure can be practiced with modifications and variations without departing from the spirit and scope of the present disclosure defined by the claims. Accordingly, the description of the present disclosure is intended to be illustrative and has no restrictive meaning with respect to the present disclosure.
 Software, whether referred to as software, firmware, middleware, microcode, hardware description language, or by any other name, should be interpreted broadly to mean instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules, applications, software applications, software packages, routines, subroutines, objects, executable files, threads of execution, procedures, functions, and the like.
 Software, instructions, and the like may also be transmitted and received via a transmission medium. For example, when software is transmitted from a website, a server, or another remote source using wired technologies such as coaxial cable, optical fiber cable, twisted pair, and digital subscriber line (DSL) and/or wireless technologies such as infrared, radio, and microwave, these wired and/or wireless technologies are included within the definition of transmission medium.
 The information, signals, and the like described in the present disclosure may be represented using any of a variety of different technologies. For example, the data, instructions, commands, information, signals, bits, symbols, chips, and the like that may be referred to throughout the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or magnetic particles, optical fields or photons, or any combination thereof.
 The terms described in the present disclosure and/or the terms necessary for understanding this specification may be replaced with terms having the same or similar meanings.
 The terms "system" and "network" used in this specification are used interchangeably.
 The information, parameters, and the like described in this specification may be represented by absolute values, by values relative to a predetermined value, or by other corresponding information.
 The terms "judging" and "determining" used in the present disclosure may encompass a wide variety of operations. "Judging" and "determining" may include, for example, regarding judging, calculating, computing, processing, deriving, investigating, looking up (searching, inquiring) (for example, looking up in a table, a database, or another data structure), or ascertaining as having "judged" or "determined". "Judging" and "determining" may also include regarding receiving (for example, receiving information), transmitting (for example, transmitting information), input, output, or accessing (for example, accessing data in a memory) as having "judged" or "determined". "Judging" and "determining" may further include regarding resolving, selecting, choosing, establishing, comparing, and the like as having "judged" or "determined". That is, "judging" and "determining" may include regarding some operation as having "judged" or "determined". "Judging (determining)" may also be read as "assuming", "expecting", "considering", and the like.
 The phrase "based on" used in the present disclosure does not mean "based only on" unless otherwise specified. In other words, the phrase "based on" means both "based only on" and "based at least on".
 Where designations such as "first" and "second" are used in this specification, any reference to those elements does not generally limit the quantity or order of those elements. These designations may be used in this specification as a convenient way of distinguishing between two or more elements. Accordingly, references to first and second elements do not mean that only two elements may be employed, or that the first element must precede the second element in some manner.
 To the extent that "include", "including", and variations thereof are used in this specification or the claims, these terms, like the term "comprising", are intended to be inclusive. Furthermore, the term "or" used in this specification or the claims is not intended to be an exclusive OR.
 In this specification, a plurality of devices is also included unless, contextually or technically, only one device can exist.
 Throughout the present disclosure, the plural is included unless the singular is clearly indicated from the context.
 REFERENCE SIGNS LIST: 10 sentence generation model generation device; 11 constraint data generation unit; 12 encoder input unit; 13 decoder input unit; 14 update unit; 15 model output unit; 20 sentence generation device; 21 input unit; 22 constraint data input unit; 23 word input unit; 24 output unit; 25 created sentence acquisition unit; 26 created sentence input unit; 27 created sentence evaluation unit; 30 model storage unit; 40 corpus storage unit; de, de1, de2, de3 decoder; en, en1, en2 encoder; M1 recording medium; m10 main module; m11 constraint data generation module; m12 encoder input module; m13 decoder input module; m14 update module; m15 model output module; M2 recording medium; m20 main module; m21 input module; m22 constraint data input module; m23 word input module; m24 output module; m25 created sentence acquisition module; m26 created sentence input module; m27 created sentence evaluation module; MD, MD1, MD2 sentence generation model; P1 sentence generation model generation program; P2 sentence generation program.

Claims (8)

  1.  A sentence generation model generation device that generates, by machine learning, a sentence generation model for generating, in response to an input sentence in a first language, an output sentence in a second language different from the first language, wherein
     the sentence generation model is an encoder-decoder model comprising an encoder and a decoder that include a neural network,
     the training data used for machine learning of the sentence generation model includes first data, constraint data, and second data,
     the first data includes a sequence of a plurality of words constituting the input sentence,
     the second data includes a sequence of a plurality of words constituting the output sentence corresponding to the input sentence,
     the constraint data includes one or more constraint words, each being a word selected from the sequence of words constituting the output sentence, and the order of the constraint words in the constraint data preserves their order in that word sequence, and
     the sentence generation model generation device comprises:
     an encoder input unit that inputs the first data to the encoder in accordance with the word order;
     a decoder input unit that inputs, to the decoder in accordance with their order, the constraint data, a start symbol that is a predetermined symbol indicating the start of output of the output sentence, and the words constituting the second data;
     an update unit that updates the weighting coefficients constituting the encoder and the decoder based on the word-by-word error between the sequence of words output from the decoder at the stages following the input of the start symbol and the sequence of words contained in the second data; and
     a model output unit that outputs the sentence generation model whose weighting coefficients have been updated by the update unit.
  2.  The sentence generation model generation device according to claim 1, wherein the constraint data consists of a sequence of the constraint words and predetermined replacement symbols, in which each word or word string other than the constraint words in the sequence of words constituting the output sentence is replaced with a replacement symbol.
  3.  The sentence generation model generation device according to claim 1 or 2, further comprising a constraint data generation unit that, based on a corpus consisting of a first sentence in the first language and a second sentence that is a parallel translation of the first sentence in the second language, generates the constraint data containing the constraint words selected from the sequence of words constituting the second sentence while preserving the word order of the second sentence.
  4.  The sentence generation model generation device according to claim 3, wherein the constraint data generation unit generates the constraint data, consisting of a sequence of the constraint words and predetermined replacement symbols, by replacing each word or word string other than the words selected as the constraint words in the word sequence constituting the second sentence with a replacement symbol.
  5.  The sentence generation model generation device according to any one of claims 1 to 4, wherein the first data is, in place of the sequence of the plurality of words constituting the input sentence, an arbitrary symbol that is a predetermined symbol having no linguistic meaning.
  6.  A sentence generation model trained by machine learning for causing a computer to function so as to generate, in response to an input sentence in a first language, an output sentence in a second language different from the first language, wherein
     the training data used for machine learning of the sentence generation model includes first data containing a sequence of a plurality of words constituting the input sentence, second data containing a sequence of a plurality of words constituting the output sentence corresponding to the input sentence, and constraint data,
     the constraint data includes one or more constraint words, each being a word selected from the sequence of words constituting the output sentence, and the order of the constraint words in the constraint data preserves their order in that word sequence, and
     the sentence generation model
     is an encoder-decoder model comprising an encoder and a decoder that include a neural network, and
     is built by machine learning in which the first data is input to the encoder in accordance with the word order; the constraint data, a start symbol that is a predetermined symbol indicating the start of output of the output sentence, and the second data are input to the decoder in the order of the words and symbols of the constraint data, the start symbol, and the second data; and the weighting coefficients constituting the encoder and the decoder are updated based on the word-by-word error between the sequence of words output from the decoder at the stages following the input of the start symbol and the sequence of words contained in the second data.
  7.  A sentence generation device that uses a sentence generation model built by machine learning to generate, in response to an input sentence in a first language, an output sentence in a second language different from the first language, wherein
     the training data used for machine learning of the sentence generation model includes first data containing a sequence of a plurality of words corresponding to the input sentence, second data containing a sequence of a plurality of words corresponding to the output sentence corresponding to the input sentence, and constraint data,
     the constraint data includes one or more constraint words, each being a word selected from the sequence of words constituting the output sentence, and the order of the constraint words in the constraint data preserves their order in that word sequence,
     the sentence generation model is an encoder-decoder model comprising an encoder and a decoder that include a neural network, and is built by machine learning in which the first data is input to the encoder in accordance with the word order; the constraint data, a start symbol that is a predetermined symbol indicating the start of output of the output sentence, and the second data are input to the decoder in the order of the words and symbols of the constraint data, the start symbol, and the second data; and the weighting coefficients constituting the encoder and the decoder are updated based on the word-by-word error between the sequence of words output from the decoder at the stages following the input of the start symbol and the sequence of words contained in the second data, and
     the sentence generation device comprises:
     an input unit that inputs input data constituting the input sentence to the encoder in accordance with the word order;
     a constraint data input unit that inputs to the decoder input constraint data containing input constraint words arbitrarily specified as words to be used in the output sentence, while preserving their order in the output sentence;
     a word input unit that inputs the start symbol to the decoder after the input of the input constraint data and, at each stage after the input of the start symbol, sequentially inputs to the decoder the word output from the decoder at the preceding stage; and
     an output unit that generates the output sentence by arranging the words sequentially output at the respective stages of the decoder, and outputs the generated output sentence.
  8.  The sentence generation device according to claim 7, wherein
     the decoder outputs, for each word output at each stage after the input of the start symbol, a likelihood indicating that word's plausibility as a word constituting the output sentence, and
     the sentence generation device further comprises:
     a created sentence input unit that, at each stage after the input of the start symbol, sequentially inputs to the decoder, in place of the word output from the decoder at the preceding stage, the words constituting a created sentence composed in the second language; and
     a created sentence evaluation unit that evaluates the created sentence based on a comparison between the likelihood of each word constituting the created sentence, output from the decoder at each stage after the input of the start symbol based on the input of the start symbol and the sequential input of the words constituting the created sentence, and the likelihood of each word constituting the output sentence.
PCT/JP2022/037899 2021-11-04 2022-10-11 Sentence generation model generator, sentence generation model, and sentence generator WO2023079911A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2021-180102 2021-11-04
JP2021180102 2021-11-04

Publications (1)

Publication Number Publication Date
WO2023079911A1 true WO2023079911A1 (en) 2023-05-11

Family

ID=86241320

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/037899 WO2023079911A1 (en) 2021-11-04 2022-10-11 Sentence generation model generator, sentence generation model, and sentence generator

Country Status (1)

Country Link
WO (1) WO2023079911A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2019036093A (en) * 2017-08-14 2019-03-07 日本電信電話株式会社 Model learning device, conversion device, method, and program
WO2019225154A1 (en) * 2018-05-23 2019-11-28 株式会社Nttドコモ Created text evaluation device
CN111160049A (en) * 2019-12-06 2020-05-15 华为技术有限公司 Text translation method, device, machine translation system and storage medium
WO2021186892A1 (en) * 2020-03-19 2021-09-23 株式会社Nttドコモ Translated sentence computation device

Similar Documents

Publication Publication Date Title
WO2021070819A1 (en) Scoring model learning device, scoring model, and determination device
JPWO2020021845A1 (en) Document classification device and trained model
JP7062056B2 (en) Creation text evaluation device
JP7222082B2 (en) Recognition error correction device and correction model
US10657203B2 (en) Predicting probability of occurrence of a string using sequence of vectors
WO2019133676A1 (en) System and method for domain-and language-independent definition extraction using deep neural networks
CN115034201A (en) Augmenting textual data for sentence classification using weakly supervised multi-reward reinforcement learning
US11361170B1 (en) Apparatus and method for accurate translation reviews and consistency across multiple translators
JP7103957B2 (en) Data generator
WO2023079911A1 (en) Sentence generation model generator, sentence generation model, and sentence generator
WO2022102364A1 (en) Text generation model generating device, text generation model, and text generating device
JP2024077792A (en) Sentence Generator
US20230223017A1 (en) Punctuation mark delete model training device, punctuation mark delete model, and determination device
WO2020166125A1 (en) Translation data generating system
WO2021215352A1 (en) Voice data creation device
JP6924636B2 (en) Information processing equipment and programs
WO2021020299A1 (en) Popularity evaluation system and geographical feature generation model
JP7229347B2 (en) internal state changer
WO2019098185A1 (en) Dialog text generation system and dialog text generation program
WO2022130940A1 (en) Presentation device
JP2022029273A (en) Sentence similarity calculation device, trained model generation device, and variance expression model
WO2019235191A1 (en) Model learning device, method and program
WO2021125101A1 (en) Translation device
JP2020177387A (en) Sentence output device
CN113515959B (en) Training method of machine translation model, machine translation method and related equipment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22889734

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2023557914

Country of ref document: JP

Kind code of ref document: A