WO2020166125A1 - Translation data generating system


Info

Publication number
WO2020166125A1
Authority
WO
WIPO (PCT)
Prior art keywords
noise
language text
source language
label
translation
Application number
PCT/JP2019/039337
Other languages
French (fr)
Japanese (ja)
Inventor
Soichiro Murakami (村上 聡一朗)
Original Assignee
NTT DOCOMO, INC. (株式会社NTTドコモ)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by NTT DOCOMO, INC. (株式会社NTTドコモ)
Priority to JP2020572078A (granted as JP7194759B2)
Publication of WO2020166125A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning

Description

  • One aspect of the present invention relates to a translation data generation system.
  • In some cases, translation accuracy is reduced because the user's natural utterance input includes hesitations, rewordings, or fillers (hereinafter sometimes collectively referred to as "noise").
  • As in Patent Document 1 and Patent Document 2, there are known techniques for identifying a reworded portion in an utterance and correcting the user's utterance content.
  • One aspect of the present invention has been made in view of the above situation, and an object thereof is to enable highly accurate translation even of natural utterances that include noise.
  • A translation data generation system according to one aspect of the present invention includes a noise adding unit that adds noise to a source language text to obtain a noise-added source language text, and a corpus construction unit that constructs a pseudo bilingual corpus in which the noise-added source language text is associated with the target language text corresponding to the source language text before the noise was added.
  • In this aspect, noise is added to the source language text, and a pseudo bilingual corpus is constructed in which the noise-added source language text is associated with the target language text corresponding to the source language text before the noise was added.
  • By using a bilingual corpus in which the noise-added source language text is associated with the target language text corresponding to the source language text before the noise was added, the target language text corresponding to the noise-free source language text can be derived appropriately even when the natural utterance input contains noise such as fillers.
  • According to the translation data generation system of one aspect of the present invention, a corpus (pseudo bilingual corpus) that is robust to natural utterances containing noise can be constructed, and even natural utterances containing noise can be translated with high accuracy.
  • The translation data generation system may further include a translation model learning unit that learns a translation model using the pseudo bilingual corpus. By learning the translation model based on the constructed corpus, natural utterances containing noise can be translated with even higher accuracy.
  • The translation data generation system may further include a noise model learning unit that learns a noise model relating to the addition of noise to source language text, using training data consisting of a group of source language texts that contain noise, and the noise adding unit may use the noise model to add noise to the source language text.
  • Since the noise model is learned from a group of source language texts that already contain noise, and noise is added based on that noise model, noise that is likely to occur in practice can readily be added, and translation accuracy can be further improved.
  • The noise adding unit may add noise to the source language text by giving each word of the source language text a noise label indicating the type of noise and then replacing the noise label with a word corresponding to that noise label. Since a noise label is first assigned to each word of the source language text and the word (noise) corresponding to the noise label is then derived, both the ease and the validity of the noise addition can be ensured.
  • The noise adding unit may derive a plurality of patterns of replacement words for one noise label, and thereby obtain a plurality of patterns of noise-added source language text from one source language text. As a result, the pseudo bilingual corpus can be efficiently enlarged from one source language text, further improving translation accuracy.
  • the noise adding unit may derive a plurality of patterns of noise labels corresponding to each word and obtain a plurality of patterns of noise-added source language text from one source language text. As a result, it is possible to efficiently enhance the pseudo bilingual corpus from one source language text and further improve the translation accuracy.
  • the noise adding unit may add a noise label according to the characteristics of each word in the source language text. This makes it possible to appropriately give each word a noise label relating to noise that is likely to be included in association with each word.
  • the noise adding unit may add a noise label according to at least one of the morpheme, the part of speech, and the reading of the word, which are the features of each word of the source language text. This makes it possible to appropriately give each word a noise label relating to noise that is likely to be included in association with each word.
  • The noise adding unit may sample noise labels according to the probability distribution over noise labels based on the score of each noise label output from the noise model, taking the features of each word of the source language text as input, and may thereby determine the noise labels to be given to the source language text.
  • This makes it possible to assign noise labels with high scores output from the noise model, and thus to appropriately give each word a noise label for the noise likely to occur in association with that word.
  • The noise model may be constructed by a method using a conditional random field or a neural network. The noise model can thereby be appropriately constructed by machine learning.
  • FIG. 1 is a diagram schematically showing a processing image of the translation data generation system 1 according to the present embodiment.
  • The translation data generation system 1 is a system that adds noise to the source-language-side text (source language text) of an existing bilingual corpus (a commonly used bilingual corpus), constructs a pseudo bilingual corpus in which the noise-added source language text is associated with the target-language-side text (target language text) corresponding to the source language text before the noise was added, and learns (constructs) a machine translation model (for example, an NMT (Neural Machine Translation) model) using the pseudo bilingual corpus.
  • Noise here refers to hesitations, rewordings, fillers, and the like that may be included in the user's natural utterance input.
  • FIG. 2 is a diagram showing a functional configuration of the translation data generation system 1 according to the present embodiment.
  • The translation data generation system 1 includes a translation data generation device 10, a parallel translation corpus DB 20, a training information DB 30, a noise model learning device 40 (noise model learning unit), and a translation model learning device 50 (translation model learning unit).
  • The translation data generation system 1 does not necessarily have to include all of the above components; for example, it may consist of only the translation data generation device 10, of only the translation data generation device 10 and the noise model learning device 40, or of only the translation data generation device 10, the noise model learning device 40, and the translation model learning device 50.
  • the parallel translation corpus DB 20 is a database that stores the parallel translation corpus.
  • a bilingual corpus is a structured combination of source language text and target language text.
  • the bilingual corpus stored in the bilingual corpus DB 20 may be a commonly used bilingual corpus, and is, for example, a Japanese/English bilingual corpus such as KFTT (Kyoto Free Translation Task) or BTEC.
  • the translation data generation device 10 adds noise to the source language text of the bilingual corpus stored in the bilingual corpus DB 20 to generate a pseudo bilingual corpus (details will be described later).
  • the training information DB 30 is a database that stores training information (training data) for learning a noise model (details will be described later).
  • The training information is a group of source language texts in which noise has been annotated in advance (a transcription corpus of natural utterances; utterance data for learning).
  • Such training information is constructed, for example, by annotating a source language text included in a normal corpus with noise.
  • the noise model learning device 40 uses the training information (training data that is a source language text group including noise) stored in the training information DB 30 to learn a noise model related to adding noise to the source language text.
  • For the learning data (training data) of the noise model, a transcription corpus of spontaneous speech such as the Corpus of Spontaneous Japanese (CSJ) or the Switchboard Corpus may be used, for example.
  • the noise model outputs noise label information related to the source language text when the source language text is input.
  • the noise label is information indicating the type of noise.
  • FIG. 4 is a diagram illustrating noise labels. As shown in FIG. 4, in this embodiment there are three types of noise labels: <F>, <D>, and 0.
  • <F> is a noise label indicating a filler.
  • <D> is a noise label indicating a hesitation or rewording.
  • 0 is a noise label indicating that there is no noise.
  • The noise label information is a noise label sequence in which the types of noise labels (<F>, <D>, and 0 described above) are associated with the words (specifically, morphemes) of the source language text.
  • FIG. 3 is a diagram for explaining the outline of the noise model.
  • The noise model is constructed using, for example, a bidirectional recurrent neural network (BiRNN), which is widely used in part-of-speech tagging and named entity recognition tasks.
  • The noise model may also be constructed by a method using another neural network such as a unidirectional RNN, or by a method using a conditional random field (CRF).
  • The noise model is trained to predict, for each element (morpheme) of the input source language text, the appropriate noise label when noise immediately follows that element.
  • A morpheme sequence w = (w_0, w_1, ..., w_n) of a source language text is given as input. A noise label sequence l = (<F>, 0, 0, <D>, <F>, 0, 0) is generated from the same learning utterance data in which noise has been annotated. Finally, the BiRNN is trained as a sequence labeling problem that predicts the noise label sequence l from the morpheme sequence w. In the BiRNN, the parameters are learned using the prediction error of the output sequence with respect to the input sequence.
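As an illustration of the data format assumed above, the following sketch derives a (morpheme sequence, noise label sequence) training pair from a noise-annotated utterance. The tokens, tags, and the helper `make_training_pair` are hypothetical; the convention that a label marks the noise immediately following a clean morpheme follows the description above (leading noise is dropped in this simple sketch).

```python
def make_training_pair(annotated):
    """annotated: list of (morpheme, tag), where tag is 'F' (filler),
    'D' (hesitation/rewording), or None (clean morpheme).
    Returns (w, l): the clean morpheme sequence and, for each clean
    morpheme, the label of the noise that immediately follows it."""
    w, l = [], []
    for morpheme, tag in annotated:
        if tag is None:            # clean morpheme: keep it, default label 0
            w.append(morpheme)
            l.append("0")
        elif l:                    # noise morpheme: label the preceding position
            l[-1] = "<" + tag + ">"
    return w, l

# Hypothetical annotated utterance: "I uh want to go- drive"
pair = make_training_pair([
    ("I", None), ("uh", "F"), ("want", None), ("to", None),
    ("go", "D"), ("drive", None),
])
```

Here `pair` is `(["I", "want", "to", "drive"], ["<F>", "0", "<D>", "0"])`, i.e. one (w, l) pair for the sequence labeling problem.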
  • the translation model learning device 50 learns a translation model using the pseudo-parallel translation corpus constructed in the translation data generation device 10.
  • For the translation model, a Transformer or an RNN-based sequence-to-sequence model may be used, for example.
  • the translation data generation device 10 includes, as its functions, an analysis unit 11, a noise addition unit 12, a corpus construction unit 13, and a storage unit 14.
  • the noise adding unit 12 adds noise to the source language text (specifically, the morpheme sequence extracted by the analyzing unit 11) to obtain the noise adding source language text.
  • the noise imparting unit 12 imparts noise to the source language text using the noise model learned by the noise model learning device 40.
  • The noise imparting unit 12 gives each morpheme a noise label according to the features (specifically, the morpheme itself) of each word of the source language text, and adds noise to the source language text by replacing the noise label with the word corresponding to that noise label (the word serving as noise).
  • the noise adding unit 12 predicts a noise label sequence corresponding to the morpheme sequence of the input source language text by using the noise model, and inserts a noise label next to the corresponding morpheme sequence.
  • the noise adding unit 12 replaces the inserted noise label with a word representing noise, and obtains a noise-added source language text that is the final output and is a source language text with noise added.
  • The noise imparting unit 12 has been described as imparting noise labels according to the morphemes of the source language text, but the present invention is not limited to this; the noise labels may be imparted according to the part of speech or the reading (pronunciation) of each word of the source language text. Further, the noise adding unit 12 may add noise labels according to two or more pieces of information among the morpheme, part of speech, and reading of a word.
  • The noise adding unit 12 first inputs the morpheme sequence of the source language text into the noise model and acquires the output vector h_t of the noise model at each time step (each position in the morpheme sequence).
  • The noise label at each time step is not simply taken to be the label with the maximum posterior probability; instead, it is determined by sampling from the multinomial distribution defined by the exponentiated output vector exp(h_t / τ). That is, the noise label l_t at each time step is estimated based on the following equation (1):
  • l_t ~ Multinomial(softmax(h_t / τ)) ... (1)
  • where l_t is the estimation result of the noise label, h_t is the output vector of the noise model, and τ is the temperature parameter.
  • The output vector h_t is represented by a three-dimensional vector for the three label types (<F>, <D>, 0).
  • The temperature parameter τ controls how strongly the noise labels vary. When the value of the temperature parameter τ is increased (τ → ∞), the noise label probability distribution approaches a uniform distribution; when τ is decreased (τ → 0), the noise label with the highest probability is always selected.
  • First, the determination of the noise label when the temperature parameter τ is relatively small will be described.
  • Suppose h_t / τ = (-0.6666..., 2, -2).
  • Then exp(h_t / τ) = (0.51, 7.39, 0.13).
  • Normalizing these values, the probability distribution becomes (0.06 (probability that 0 is selected as the noise label), 0.92 (probability that <F> is selected as the noise label), 0.02 (probability that <D> is selected as the noise label)).
  • Sampling (drawing) the noise label only once based on such a probability distribution corresponds to sampling from a categorical distribution. In this case, the probability of the noise label <F> is extremely high at 92%, so it is very likely to be selected as the sampling result.
  • Next, the determination of the noise label when the temperature parameter τ is relatively large will be described.
  • In this case, the probability distribution becomes, for example, (0.30 (probability that 0 is selected as the noise label), 0.45 (probability that <F> is selected as the noise label), 0.25 (probability that <D> is selected as the noise label)).
  • As τ is increased further, the probability of each noise label approaches 33.333...%, and the probability distribution approaches a uniform distribution.
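The behavior of the temperature parameter in equation (1) can be sketched as follows. The numbers reuse h_t / τ = (-0.6666..., 2, -2) from the worked example above (with τ folded into the vector, so tau=1.0 reproduces it); the label order (0, <F>, <D>) is an assumption of this sketch.

```python
import math
import random

def softmax_temperature(h, tau):
    """p_i = exp(h_i / tau) / sum_j exp(h_j / tau), as in equation (1)."""
    z = [v / tau for v in h]
    m = max(z)                            # subtract max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

labels = ["0", "<F>", "<D>"]
h = [-2 / 3, 2.0, -2.0]                   # h_t / tau from the worked example

p_small = softmax_temperature(h, tau=1.0)   # sharp: roughly (0.06, 0.92, 0.02)
p_large = softmax_temperature(h, tau=10.0)  # flatter, closer to uniform

rng = random.Random(0)
label = rng.choices(labels, weights=p_small, k=1)[0]  # one categorical draw
```

With the small temperature the <F> label dominates the distribution, while a large temperature flattens it, matching the two cases described above.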
  • In this way, the noise adding unit 12 samples noise labels according to the probability distribution based on the score of each noise label output from the noise model, taking the features (morpheme sequence) of the words of the source language text as input, and determines the noise labels to be given to the source language text.
  • The probability distribution defined based on the output values of the noise model has been described as a multinomial distribution, but the present invention is not limited to this; the probability distribution may represent a Poisson distribution, a normal distribution, or the like.
  • the noise adding unit 12 then replaces the noise label sequence predicted using the noise model with a word representing noise.
  • The noise adding unit 12 performs sampling from the vocabulary set V_type corresponding to each noise label based on, for example, unigram probabilities. For example, when replacing the filler noise label <F> with a word representing a filler, the word is determined based on the following equation (2):
  • w_t' ~ Unigram(V_<F>) ... (2)
  • In the above equation (2), V_<F> is the vocabulary set for the noise label <F>, and w_t' is the word representing the filler (noise) inserted at time step t.
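A minimal sketch of the unigram sampling in equation (2) follows. The filler vocabulary `V_F` and its counts are hypothetical stand-ins; in the system described here, the vocabulary sets and counts would come from the noise-annotated corpus.

```python
import random

# Hypothetical vocabulary set V_<F> with unigram counts.
V_F = {"uh": 50, "um": 30, "well": 20}

def sample_noise_word(vocab, rng):
    """Draw a replacement word with unigram probability
    p(w) = count(w) / total count, as in equation (2)."""
    words = list(vocab)
    total = sum(vocab.values())
    weights = [vocab[w] / total for w in words]
    return rng.choices(words, weights=weights, k=1)[0]

rng = random.Random(0)
word = sample_noise_word(V_F, rng)  # word replacing an inserted <F> label
```

More frequent fillers are drawn proportionally more often, so the inserted noise mirrors the distribution observed in the training data.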
  • the noise imparting unit 12 obtains a plurality of patterns of noise imparting source language text from one source language text.
  • the noise adding unit 12 may derive, for example, a plurality of patterns of words (words representing noise) to be replaced with respect to one noise label, and obtain a plurality of patterns of noise added source language text from one source language text. Further, the noise adding unit 12 may derive a plurality of patterns of noise labels corresponding to each morpheme, and obtain a plurality of patterns of noise-added source language text from one source language text.
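As an illustration of obtaining a plurality of noise-added patterns from one source language text, the following sketch repeatedly samples replacement words for a fixed label sequence. The label-to-word table and the English tokens are hypothetical stand-ins.

```python
import random

# Hypothetical replacement candidates per noise label.
NOISE_WORDS = {"<F>": ["uh", "um"], "<D>": ["I mean"]}

def add_noise(morphemes, labels, rng):
    """Insert a sampled noise word after each morpheme whose label is not 0."""
    out = []
    for morpheme, label in zip(morphemes, labels):
        out.append(morpheme)
        if label in NOISE_WORDS:
            out.append(rng.choice(NOISE_WORDS[label]))
    return " ".join(out)

rng = random.Random(42)
src = ["I", "want", "to", "drive"]
labels = ["<F>", "0", "0", "0"]
# Sampling repeatedly yields multiple noise-added patterns from one source text.
variants = {add_noise(src, labels, rng) for _ in range(10)}
```

Each distinct sample becomes its own noise-added source language text, enlarging the pseudo bilingual corpus from a single original sentence.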
  • The corpus construction unit 13 constructs a pseudo bilingual corpus in which the noise-added source language text is associated with the target language text corresponding to the source language text before the noise was added.
  • FIG. 5 is a diagram showing an image of constructing a pseudo bilingual corpus.
  • As shown in FIG. 5, a pseudo bilingual corpus (a set of translation pairs) is constructed in which the noise-added source language text (obtained by inserting noise into a source language text meaning roughly "I want to take the tourist route rather than the main expressway") is associated with the target language text corresponding to the source language text before the noise was added: "I would rather take a scenic route than a main highway."
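The pairing step can be sketched as follows: every noise-added variant of a source sentence is paired with the unchanged target translation. The romanized example sentence, the trivial noisifier, and `build_pseudo_corpus` are hypothetical stand-ins for the actual corpus construction.

```python
def build_pseudo_corpus(bilingual_pairs, noisify, n_variants=3):
    """Pair each noise-added variant of a source text with the ORIGINAL
    target language text, forming pseudo translation pairs."""
    corpus = []
    for src, tgt in bilingual_pairs:
        for _ in range(n_variants):
            corpus.append((noisify(src), tgt))
    return corpus

pairs = [("shuyou na kousokudouro yori kankou route wo hashiritai",
          "I would rather take a scenic route than a main highway.")]
# Trivial stand-in noisifier: prepend one filler-like token.
corpus = build_pseudo_corpus(pairs, lambda s: "eeto " + s)
```

The key point is that only the source side changes; the target side stays the clean translation of the noise-free source text.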
  • the storage unit 14 is a DB that stores the pseudo bilingual corpus constructed by the corpus construction unit 13.
  • the translation model learning device 50 learns a translation model using the pseudo-parallel translation corpus stored in the storage unit 14.
  • FIG. 6 is a flowchart showing the processing executed by the translation data generation system 1. It is assumed that, as a precondition for executing the processing shown in FIG. 6, the noise model has been constructed (learned) by the noise model learning device 40.
  • the analysis unit 11 of the translation data generation device 10 acquires the source language text from the parallel translation corpus DB 20 (step S1). Subsequently, the analysis unit 11 executes morphological analysis on the acquired source language text (step S2).
  • the noise adding unit 12 of the translation data generating device 10 adds noise to the morpheme sequence extracted by the analyzing unit 11 to obtain a noise-added source language text (step S3). Specifically, the noise adding unit 12 predicts a noise label sequence corresponding to the morpheme sequence of the input source language text by using the noise model, and inserts a noise label next to the corresponding morpheme sequence. Then, the noise adding unit 12 replaces the inserted noise label with a word representing noise, and obtains a noise-added source language text that is the final output and is a source language text with noise added.
  • The corpus construction unit 13 of the translation data generation device 10 associates the noise-added source language text with the target language text corresponding to the source language text before the noise was added, thereby constructing a pseudo bilingual corpus (step S4).
  • the translation model learning device 50 learns a translation model using the pseudo bilingual corpus constructed by the corpus construction unit 13 (step S5).
  • the above is an example of the processing executed by the translation data generation system 1.
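Steps S1 to S4 above can be sketched end to end as follows. The tokenizer, noise adder, and example sentence pair are hypothetical stand-ins for the analysis unit 11, noise adding unit 12, and corpus construction unit 13; step S5 (translation model learning) is omitted.

```python
def generate_translation_data(bilingual_corpus, analyze, add_noise):
    pseudo_corpus = []
    for src, tgt in bilingual_corpus:           # S1: read source language text
        morphemes = analyze(src)                # S2: morphological analysis
        noisy_src = add_noise(morphemes)        # S3: noise-added source text
        pseudo_corpus.append((noisy_src, tgt))  # S4: pair with original target
    return pseudo_corpus                        # S5 would train a model on this

corpus = generate_translation_data(
    [("kankou route wo hashiritai", "I want to take the scenic route.")],
    analyze=str.split,                              # stand-in tokenizer
    add_noise=lambda ms: " ".join(["eeto"] + ms),   # stand-in noise adder
)
```

In the actual system, `analyze` would be a morphological analyzer and `add_noise` the noise-model-based label prediction and word replacement described above.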
  • The translation data generation system 1 includes the noise adding unit 12, which adds noise to a source language text to obtain a noise-added source language text, and the corpus construction unit 13, which constructs a pseudo bilingual corpus in which the noise-added source language text is associated with the target language text corresponding to the source language text before the noise was added.
  • In the translation data generation system 1, noise is added to the source language text, and a pseudo bilingual corpus is constructed that associates the noise-added source language text with the target language text corresponding to the source language text before the noise was added. By using such a bilingual corpus, the target language text corresponding to the noise-free source language text can be derived appropriately even when the natural utterance input contains noise such as fillers.
  • In this way, a corpus (pseudo bilingual corpus) that is robust to natural utterances containing noise can be constructed, and even non-fluent natural utterances containing noise can be translated with high accuracy.
  • When the information generated by the translation data generation system 1 is used for translation, it is not necessary to correct the user's utterance content before inputting it into the translation model; the user's utterance content can be input into the translation model unchanged.
  • Conventionally, a speech recognition device is used to sequentially receive the user's utterance and to determine rewordings.
  • In contrast, the translation data generation system 1 does not require such processing in the speech recognition device and only needs to use the text information of the recognition result. As described above, the translation data generation system 1 according to the present embodiment can dispense with utterance correction processing and rewording determination processing, and therefore also has the effect of reducing the processing load on processing units such as the CPU.
  • FIG. 7 is a table showing a translation example of this embodiment and a comparative example.
  • As shown in the upper part of FIG. 7, in the comparative example there are omissions in the translation of a natural utterance input containing noise.
  • In the comparative example, a natural utterance input containing noise is translated with the noise still included, and the desired translation cannot be obtained.
  • As the comparative examples show, it has conventionally been difficult to accurately translate natural utterances containing noise.
  • In contrast, when translation is performed using the pseudo bilingual corpus constructed by the translation data generation system 1 of the present embodiment, translation errors are unlikely to occur even for naturally uttered input containing noise, and translation can be performed with high accuracy.
  • the translation data generation system 1 includes a translation model learning device 50 that learns a translation model using a pseudo bilingual corpus. By learning the translation model based on the constructed corpus, it is possible to translate the natural utterance containing noise with higher accuracy.
  • The translation data generation system 1 includes the noise model learning device 40, which learns a noise model relating to the addition of noise to source language text using training data consisting of a group of source language texts containing noise, and the noise adding unit 12 adds noise to the source language text using the noise model.
  • a noise model is learned based on a source language text group that includes noise in advance, and noise is added based on the noise model, so that noise that is likely to be actually included is easily added. The accuracy can be further improved.
  • The noise imparting unit 12 gives each word of the source language text a noise label indicating the type of noise, and adds noise to the source language text by replacing the noise label with the word corresponding to that noise label. Since a noise label is first assigned to each word of the source language text and the word (noise) corresponding to the noise label is then derived, both the ease and the validity of the noise addition can be ensured.
  • The noise adding unit 12 derives a plurality of patterns of replacement words for one noise label, and obtains a plurality of patterns of noise-added source language text from one source language text. As a result, the pseudo bilingual corpus can be efficiently enlarged from one source language text, further improving translation accuracy.
  • the noise adding unit 12 derives a plurality of patterns of noise labels corresponding to each word, and obtains a plurality of patterns of noise-added source language text from one source language text. As a result, it is possible to efficiently enhance the pseudo bilingual corpus from one source language text and further improve the translation accuracy.
  • the noise adding unit 12 adds a noise label according to the characteristics of each word in the source language text. This makes it possible to appropriately give each word a noise label relating to noise that is likely to be included in association with each word.
  • the noise adding unit 12 adds a noise label according to at least one of a morpheme, a part of speech, and a word reading, which are characteristics of each word in the source language text. This makes it possible to appropriately give each word a noise label relating to noise that is likely to be included in association with each word.
  • The noise adding unit 12 samples noise labels according to the probability distribution over noise labels based on the score of each noise label output from the noise model, taking the features of the words of the source language text as input, and determines the noise labels to be given to the source language text.
  • This makes it possible to assign noise labels with high scores output from the noise model, and thus to appropriately give each word a noise label for the noise likely to occur in association with that word.
  • the translation data generation device 10 described above may be physically configured as a computer device including a processor 1001, a memory 1002, a storage 1003, a communication device 1004, an input device 1005, an output device 1006, a bus 1007, and the like.
  • the word “device” can be read as a circuit, device, unit, or the like.
  • the hardware configuration of the translation data generating device 10 may be configured to include one or a plurality of each device illustrated in the figure, or may be configured not to include some devices.
  • Each function in the translation data generation device 10 is realized by loading predetermined software (a program) onto hardware such as the processor 1001 and the memory 1002, with the processor 1001 performing operations and controlling communication by the communication device 1004 and the reading and/or writing of data in the memory 1002 and the storage 1003.
  • the processor 1001 operates an operating system to control the entire computer, for example.
  • the processor 1001 may be configured by a central processing unit (CPU) including an interface with peripheral devices, a control device, a calculation device, a register, and the like.
  • the control function of the noise adding unit 12 and the like of the translation data generating device 10 may be realized by the processor 1001.
  • the processor 1001 reads a program (program code), a software module and data from the storage 1003 and/or the communication device 1004 into the memory 1002, and executes various processes according to these.
  • For the program, a program that causes a computer to execute at least part of the operations described in the above embodiment is used.
  • The control functions of the noise adding unit 12 and the like of the translation data generation device 10 may be realized by a control program that is stored in the memory 1002 and runs on the processor 1001, and the other functional blocks may be realized in the same way.
  • Although the various processes described above have been explained as being executed by one processor 1001, they may be executed simultaneously or sequentially by two or more processors 1001.
  • the processor 1001 may be implemented by one or more chips.
  • the program may be transmitted from the network via an electric communication line.
  • The memory 1002 is a computer-readable recording medium and may be configured by at least one of, for example, ROM (Read Only Memory), EPROM (Erasable Programmable ROM), EEPROM (Electrically Erasable Programmable ROM), and RAM (Random Access Memory).
  • the memory 1002 may be called a register, a cache, a main memory (main storage device), or the like.
  • the memory 1002 can store an executable program (program code), a software module, etc. for implementing the wireless communication method according to the embodiment of the present invention.
  • The storage 1003 is a computer-readable recording medium and may be configured by at least one of, for example, an optical disc such as a CD-ROM (Compact Disc ROM), a hard disk drive, a flexible disk, a magneto-optical disk (for example, a compact disc, a digital versatile disc, or a Blu-ray (registered trademark) disc), a smart card, a flash memory (for example, a card, a stick, or a key drive), a floppy (registered trademark) disk, and a magnetic strip.
  • the storage 1003 may be called an auxiliary storage device.
  • the storage medium described above may be, for example, a database including the memory 1002 and/or the storage 1003, a server, or another appropriate medium.
  • the communication device 1004 is hardware (transmission/reception device) for performing communication between computers via a wired and/or wireless network, and is also called, for example, a network device, a network controller, a network card, a communication module, or the like.
  • the input device 1005 is an input device (for example, a keyboard, a mouse, a microphone, a switch, a button, a sensor, etc.) that receives an input from the outside.
  • the output device 1006 is an output device (for example, a display, a speaker, an LED lamp, etc.) that performs output to the outside.
  • the input device 1005 and the output device 1006 may be integrated (for example, a touch panel).
  • Each device such as the processor 1001 and the memory 1002 is connected by a bus 1007 for communicating information.
  • the bus 1007 may be configured with a single bus or different buses among devices.
  • the translation data generation device 10 may include hardware such as a microprocessor, a digital signal processor (DSP), an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), or an FPGA (Field Programmable Gate Array), and some or all of the functional blocks may be realized by such hardware.
  • processor 1001 may be implemented with at least one of these hardware.
  • the translation data generation system may add noise words defined in advance (fillers, hesitations, restatements, etc.) at random positions in the source language text.
  • the word (noise) to be added at a random position may be randomly selected from noise word candidates, for example.
  • a noise model that randomly adds noise can be constructed as long as the noise words can be defined.
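A minimal sketch of this random baseline is shown below; the noise-word list and the insertion probability are illustrative assumptions, not values taken from this description.

```python
import random

# Hypothetical noise-word candidates; in the system these would be defined in advance.
NOISE_WORDS = ["えー", "あー", "まー", "えーっと"]

def add_random_noise(tokens, p=0.2, rng=random):
    """Insert a randomly chosen noise word before each token with probability p."""
    noisy = []
    for tok in tokens:
        if rng.random() < p:
            noisy.append(rng.choice(NOISE_WORDS))
        noisy.append(tok)
    return noisy

tokens = ["主要な", "高速道路", "よりも", "観光ルート", "の方を", "走りたいです"]
print("".join(add_random_noise(tokens, p=0.3, rng=random.Random(1))))
```

With `p=0` the text is returned unchanged; with `p=1` every token is preceded by one randomly chosen noise word.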
  • LTE (Long Term Evolution)
  • LTE-A (LTE-Advanced)
  • SUPER 3G
  • IMT-Advanced
  • 4G
  • 5G
  • FRA (Future Radio Access)
  • W-CDMA (Wideband Code Division Multiple Access)
  • GSM (Global System for Mobile Communications)
  • CDMA2000 (Code Division Multiple Access 2000)
  • UMB (Ultra Mobile Broadband)
  • IEEE 802.11 (Wi-Fi)
  • IEEE 802.16 (WiMAX)
  • IEEE 802.20
  • UWB (Ultra-WideBand)
  • Bluetooth (registered trademark)
  • a system using other appropriate systems and/or a next-generation system extended based on the above.
  • Information that has been input and output may be stored in a specific location (for example, memory), or may be managed in a management table. Information that is input/output may be overwritten, updated, or added. The output information and the like may be deleted. The input information and the like may be transmitted to another device.
  • the determination may be made by a value represented by one bit (0 or 1), by a Boolean value (true or false), or by a numerical comparison (for example, comparison with a predetermined value).
  • notification of predetermined information (for example, notification of "being X") is not limited to explicit notification, and may be performed implicitly (for example, by not performing notification of the predetermined information).
  • software, instructions, etc. may be transmitted and received via a transmission medium.
  • when software is transmitted from a website, server, or other remote source using wired technology such as coaxial cable, fiber optic cable, twisted pair, and digital subscriber line (DSL) and/or wireless technology such as infrared, radio, and microwave, these wired and/or wireless technologies are included within the definition of a transmission medium.
  • the information, signals, etc. described herein may be represented using any of a variety of different technologies.
  • data, instructions, commands, information, signals, bits, symbols, chips, and the like that may be referred to throughout the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or magnetic particles, optical fields or photons, or any combination thereof.
  • information, parameters, and the like described in this specification may be represented by absolute values, by values relative to predetermined values, or by other corresponding information.
  • a user terminal may also be referred to by those skilled in the art as a mobile communication terminal, subscriber station, mobile unit, subscriber unit, wireless unit, remote unit, mobile device, wireless device, wireless communication device, remote device, mobile subscriber station, access terminal, mobile terminal, wireless terminal, remote terminal, handset, user agent, mobile client, client, or some other suitable term.
  • the terms "judgment" and "decision" may encompass a wide variety of actions.
  • "judgment" and "decision" may include, for example, regarding calculating, computing, processing, deriving, investigating, looking up (for example, looking up in a table, database, or another data structure), and ascertaining as "judging" and "deciding".
  • "judgment" and "decision" may include regarding receiving (for example, receiving information), transmitting (for example, transmitting information), input, output, and accessing (for example, accessing data in a memory) as "judging" and "deciding".
  • "judgment" and "decision" may include regarding resolving, selecting, choosing, establishing, comparing, and the like as "judging" and "deciding". That is, "judgment" and "decision" may include regarding some action as "judging" and "deciding".
  • the phrase “based on” does not mean “based only on,” unless expressly specified otherwise. In other words, the phrase “based on” means both "based only on” and “based at least on.”
  • any reference to elements using designations such as "first" and "second" as used herein does not generally limit the quantity or order of those elements. These designations may be used herein as a convenient way of distinguishing between two or more elements. Thus, a reference to first and second elements does not mean that only two elements may be employed, or that the first element must precede the second element in some way.

Abstract

A translation data generating system 1 is provided with a noise imparting unit 12 that imparts noise to a source language text to yield a noise-imparted source language text, and a corpus construction unit 13 that constructs a pseudo parallel corpus in which the noise-imparted source language text is associated with the target language text corresponding to the source language text before the noise was imparted.

Description

Translation data generation system
 One aspect of the present invention relates to a translation data generation system.
 In a machine translation system, translation accuracy may be reduced when a user's natural utterance input includes hesitations, restatements, fillers, or the like (hereinafter sometimes collectively referred to as "noise").
 To address this problem, techniques are known that identify restated portions and the like in an utterance and correct the content of the user's utterance, as disclosed in, for example, Patent Document 1 and Patent Document 2.
Patent Document 1: JP 2010-079647 A
Patent Document 2: JP 2007-057844 A
 However, identifying and correcting noise portions is not easy, and it is difficult to sufficiently ensure translation accuracy even with the techniques described above.
 One aspect of the present invention has been made in view of the above circumstances, and an object thereof is to translate natural utterances containing noise with high accuracy.
 A translation data generation system according to one aspect of the present invention includes a noise adding unit that adds noise to a source language text to obtain a noise-added source language text, and a corpus construction unit that constructs a pseudo parallel corpus in which the noise-added source language text is associated with a target language text corresponding to the source language text before noise addition.
 In the translation data generation system according to one aspect of the present invention, noise is added to a source language text, and a pseudo parallel corpus is constructed in which the noise-added source language text is associated with the target language text corresponding to the source language text before noise addition. By constructing a parallel corpus in which the noise-added source language text is associated with the target language text of the original, noise-free source language text, it becomes possible, using such a corpus, to appropriately derive the target language text corresponding to the source language text before noise addition even when the natural utterance input contains noise such as fillers. That is, according to the translation data generation system of one aspect of the present invention, a corpus (pseudo parallel corpus) that is robust to natural utterances containing noise can be constructed, and natural utterances containing noise can be translated with high accuracy.
 The translation data generation system may further include a translation model learning unit that trains a translation model using the pseudo parallel corpus. By training the translation model on the constructed corpus, natural utterances containing noise can be translated with even higher accuracy.
 The translation data generation system may further include a noise model learning unit that uses training data, which is a group of source language texts containing noise, to learn a noise model for adding noise to source language texts, and the noise adding unit may add noise to the source language text using the noise model. Since the noise model is learned from source language texts that already contain noise, and noise is added based on that model, noise that is likely to actually occur is more readily added, further improving translation accuracy.
 In the translation data generation system, the noise adding unit may add, to each word of the source language text, a noise label indicating the type of noise, and add noise to the source language text by replacing the noise label with a word corresponding to that label. Since a word (noise) corresponding to each noise label is derived after the label is assigned according to each word of the source language text, both the ease and the validity of the noise addition can be ensured.
 In the translation data generation system, the noise adding unit may derive a plurality of patterns of replacement words for one noise label, thereby obtaining a plurality of patterns of noise-added source language text from one source language text. This efficiently enriches the pseudo parallel corpus from a single source language text and further improves translation accuracy.
 In the translation data generation system, the noise adding unit may derive a plurality of patterns of noise labels corresponding to each word, thereby obtaining a plurality of patterns of noise-added source language text from one source language text. This efficiently enriches the pseudo parallel corpus from a single source language text and further improves translation accuracy.
 In the translation data generation system, the noise adding unit may assign noise labels according to the features of the words of the source language text. This makes it possible to appropriately assign to each word a noise label for noise that is likely to occur in association with that word.
 In the translation data generation system, the noise adding unit may assign noise labels according to at least one of the morpheme, the part of speech, and the reading of each word, which are features of the words of the source language text. This makes it possible to appropriately assign to each word a noise label for noise that is likely to occur in association with that word.
 In the translation data generation system, the noise adding unit may determine the noise label to be added to the source language text by sampling a noise label according to a probability distribution over the noise labels based on the scores of the noise labels output from the noise model, which takes the features of each word of the source language text as input. This makes it possible, for example, to assign a noise label with a high score output from the noise model, so that each word is appropriately given a noise label for noise that is likely to occur in association with it.
 In the translation data generation system, the noise model may be constructed by a technique using a conditional random field or a neural network. This allows the noise model to be appropriately constructed by machine learning.
 According to one aspect of the present invention, natural utterances containing noise can be translated with high accuracy.
FIG. 1 is a diagram schematically showing a processing image of the translation data generation system according to the present embodiment.
FIG. 2 is a diagram showing the functional configuration of the translation data generation system according to the present embodiment.
FIG. 3 is a diagram explaining the outline of the noise model.
FIG. 4 is a diagram explaining noise labels.
FIG. 5 is a diagram showing an image of construction of the pseudo parallel corpus.
FIG. 6 is a flowchart showing processing executed by the translation data generation system.
FIG. 7 is a table showing translation examples of the present embodiment and a comparative example.
FIG. 8 is a diagram showing the hardware configuration of the translation data generation device.
 Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings. In the description of the drawings, the same or equivalent elements are denoted by the same reference signs, and redundant description is omitted.
 First, a processing image of the translation data generation system 1 according to the present embodiment will be described with reference to FIG. 1. FIG. 1 is a diagram schematically showing a processing image of the translation data generation system 1 according to the present embodiment. The translation data generation system 1 is a system that adds noise to the source-language-side texts (source language texts) of an existing parallel corpus (a commonly used parallel corpus), constructs a pseudo parallel corpus in which the noise-added source language texts are associated with the target-language-side texts (target language texts) corresponding to the source language texts before noise addition, and trains (constructs) a machine translation model (for example, an NMT (Neural Machine Translation) model) using the pseudo parallel corpus. Noise here refers to hesitations, restatements, fillers, and the like that may be included in a user's natural utterance input.
 In the example shown in FIG. 1, noise is added according to predetermined rules (described in detail later) to the source language text 「主要な高速道路よりも観光ルートの方を走りたいです」 ("I would rather take a scenic route than a main highway") in an existing parallel corpus, yielding three patterns of noise-added source language text: 「えー主要な高速道路よりもまー観光ルートの方を走りたいです」, 「あ主要な高速道路よりもえー観光ルートの方を走りたいです」, and 「えーっと主要な高速道路よりもまー観光ルートの方を走りたいです」, in which fillers such as 「えー」 ("uh") have been inserted. A pseudo parallel corpus is then constructed in which the target language text "I would rather take a scenic route than a main highway." corresponding to the source language text before noise addition is associated with each of the three noise-added patterns, and a machine translation model is trained (constructed) using this pseudo parallel corpus. By constructing a pseudo parallel corpus in which the noise-added source language texts are associated with the target language text corresponding to the original source language text, it becomes possible to appropriately derive the target language text corresponding to the noise-free source language text even when the natural utterance input contains noise such as fillers. The functions of the translation data generation system 1 are described in detail below.
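The pairing scheme in FIG. 1 — each noise-added variant is paired with the target text of the original, noise-free source — can be sketched as follows. The `add_noise` function here is a trivial stand-in for the noise addition described later.

```python
def build_pseudo_parallel_corpus(parallel_corpus, add_noise, n_variants=3):
    """For each (source, target) pair, generate n_variants noise-added sources
    and pair each with the UNCHANGED target of the original source."""
    pseudo = []
    for src, tgt in parallel_corpus:
        for _ in range(n_variants):
            pseudo.append((add_noise(src), tgt))
    return pseudo

corpus = [("主要な高速道路よりも観光ルートの方を走りたいです",
           "I would rather take a scenic route than a main highway.")]
# Stand-in noise function: prepend a filler to the source text.
pseudo = build_pseudo_parallel_corpus(corpus, add_noise=lambda s: "えー" + s)
for noisy_src, tgt in pseudo:
    print(noisy_src, "=>", tgt)
```

The essential property is that the target side stays clean: every noisy source maps to the translation of its noise-free original.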
 FIG. 2 is a diagram showing the functional configuration of the translation data generation system 1 according to the present embodiment. As shown in FIG. 2, the translation data generation system 1 includes a translation data generation device 10, a parallel corpus DB 20, a training information DB 30, a noise model learning device 40 (noise model learning unit), and a translation model learning device 50 (translation model learning unit). The translation data generation system 1 does not necessarily have to include all of the above components; for example, it may consist of only the translation data generation device 10, of only the translation data generation device 10 and the noise model learning device 40, or of only the translation data generation device 10, the noise model learning device 40, and the translation model learning device 50.
 The parallel corpus DB 20 is a database that stores a parallel corpus. A parallel corpus is a structured collection of pairs of source language texts and target language texts. The parallel corpus stored in the parallel corpus DB 20 may be a commonly used one, for example a Japanese-English parallel corpus such as KFTT (Kyoto Free Translation Task) or BTEC. In the present embodiment, the translation data generation device 10 adds noise to the source language texts of the parallel corpus stored in the parallel corpus DB 20 to generate a pseudo parallel corpus (described in detail later).
 The training information DB 30 is a database that stores training information (training data) for learning the noise model (described in detail later). The training information is a group of source language texts annotated with noise in advance (a transcription corpus of natural utterances; utterance data for learning). Such training information is constructed, for example, by annotating source language texts contained in an ordinary corpus with noise.
 The noise model learning device 40 uses the training information stored in the training information DB 30 (training data that is a group of source language texts containing noise) to learn a noise model for adding noise to source language texts. As the learning data (training data) for the noise model, a transcription corpus of spontaneous speech, such as the Corpus of Spontaneous Japanese (CSJ) or the Switchboard Corpus, may be used, for example. Given a source language text as input, the noise model outputs noise label information for that text. A noise label is information indicating the type of noise. FIG. 4 is a diagram explaining noise labels. As shown in FIG. 4, in this embodiment there are three types of noise labels: <F>, <D>, and 0. <F> is a noise label indicating a filler. <D> is a noise label indicating a hesitation or restatement. 0 is a noise label indicating no noise. The noise label information is information in which the noise label types (<F>, <D>, and 0 described above) are associated with the words (more precisely, morphemes) to which each label applies, for example the noise label sequence described later.
 FIG. 3 is a diagram explaining the outline of the noise model. As shown in FIG. 3, the noise model is constructed using, for example, bi-directional recurrent neural networks (BiRNN), which are widely used in part-of-speech tagging and named entity recognition tasks. The noise model may also be constructed by techniques using other neural networks such as plain RNNs, or by techniques using conditional random fields (CRF). The noise model is trained so that, when noise follows an input element (word; more precisely, morpheme) of the input source language text, it predicts an appropriate noise label for that element. Noise addition using the noise model is treated as a sequence labeling problem of predicting a noise label sequence l=(l_0, l_1, ..., l_n) from the morpheme sequence w=(w_0, w_1, ..., w_n) of the source language text.
 A method of training the noise model will now be described using the learning utterance data 「<Fえー>それでは会議を<Dを>始め<Fあー>ます」 as an example. Here, 「<Fえー>」 in the learning utterance data indicates that 「えー」 corresponds to the filler type <F>. In this case, first, the morpheme sequence w=(<BOS>, それでは, 会議, を, 始め, ます, <EOS>) is extracted from the learning utterance data. As shown in FIG. 3, each morpheme in the sequence (including <BOS> and <EOS>) is associated with a time step from t=0 to 6. Next, based on the same learning utterance data with its noise annotations, the noise label sequence l=(<F>, 0, 0, <D>, <F>, 0, 0) is generated. Finally, the BiRNN is trained on the sequence labeling problem of predicting the noise label sequence l from the morpheme sequence w. In the BiRNN, parameter learning is performed using the prediction error of the output sequence with respect to the input sequence.
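The conversion from an annotated utterance to a (w, l) training pair described above can be sketched as follows. The segment-level input format is an illustrative assumption; a real pipeline would derive it from the annotated transcription and a morphological analyzer.

```python
def make_training_pair(segments):
    """segments: list of (morpheme, noise_type) pairs, where noise_type is None
    for content morphemes and "F" or "D" for annotated noise morphemes.
    Returns (w, l): the clean morpheme sequence (with <BOS>/<EOS>) and the
    noise label sequence, where l[t] labels the noise that follows w[t]."""
    w, l = ["<BOS>"], ["0"]
    for morpheme, noise in segments:
        if noise is None:
            w.append(morpheme)
            l.append("0")
        else:
            l[-1] = f"<{noise}>"  # the noise follows the preceding clean morpheme
    w.append("<EOS>")
    l.append("0")
    return w, l

# 「<Fえー>それでは会議を<Dを>始め<Fあー>ます」 as segmented, annotated morphemes:
segments = [("えー", "F"), ("それでは", None), ("会議", None), ("を", None),
            ("を", "D"), ("始め", None), ("あー", "F"), ("ます", None)]
w, l = make_training_pair(segments)
print(w)  # ['<BOS>', 'それでは', '会議', 'を', '始め', 'ます', '<EOS>']
print(l)  # ['<F>', '0', '0', '<D>', '<F>', '0', '0']
```

This reproduces the w and l of the worked example: the noise morphemes are removed from w, and their types are recorded as labels on the positions they follow.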
 Returning to FIG. 2, the translation model learning device 50 trains a translation model using the pseudo parallel corpus constructed by the translation data generation device 10. As the translation model, a Transformer or an RNN-based sequence-to-sequence model, for example, may be used.
 The translation data generation device 10 includes, as its functions, an analysis unit 11, a noise adding unit 12, a corpus construction unit 13, and a storage unit 14.
 The analysis unit 11 acquires a source language text from the parallel corpus DB 20 and performs morphological analysis on the acquired text. For example, when the analysis unit 11 acquires the source language text 「主要な高速道路よりも観光ルートの方を走りたいです。」, it extracts the morpheme sequence w=(主要, な, 高速, 道路, より, も, 観光, ルート, の, 方, を, 走り, たい, です) for that text.
 The noise adding unit 12 adds noise to the source language text (more precisely, to the morpheme sequence extracted by the analysis unit 11) to obtain a noise-added source language text. The noise adding unit 12 adds noise to the source language text using the noise model learned by the noise model learning device 40. The noise adding unit 12 assigns a noise label to each morpheme according to the features of each word of the source language text (specifically, its morpheme), and adds noise to the source language text by replacing each noise label with a word corresponding to that label (a word serving as noise). Using the noise model, the noise adding unit 12 predicts the noise label sequence corresponding to the morpheme sequence of the input source language text and inserts each noise label after the corresponding morpheme. The noise adding unit 12 then replaces each inserted noise label with a word representing noise, obtaining the final output: a source language text with noise added. Although the noise adding unit 12 has been described as assigning noise labels according to the morphemes of the source language text, this is not limiting; noise labels may be assigned according to the part of speech or the reading (pronunciation) of each word of the source language text. The noise adding unit 12 may also assign noise labels according to two or more kinds of information, such as the morpheme, part of speech, and reading of a word.
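A sketch of this label-to-noise replacement step is shown below. The filler vocabulary and the rule of realizing a <D> label as a repetition of the preceding morpheme are illustrative assumptions; the label sequence is hard-coded here, whereas in the system it would come from the BiRNN noise model.

```python
import random

# Hypothetical filler words for the <F> label.
FILLERS = ["えー", "あー", "まー"]

def apply_noise_labels(w, l, rng=random):
    """w: morpheme sequence including <BOS>/<EOS>; l: predicted noise labels,
    where l[t] names the noise that follows morpheme w[t].
    Returns the noise-added morpheme sequence."""
    out = []
    for morpheme, label in zip(w, l):
        if morpheme not in ("<BOS>", "<EOS>"):
            out.append(morpheme)
        if label == "<F>":
            out.append(rng.choice(FILLERS))
        elif label == "<D>" and out:
            out.append(out[-1])  # realize a restatement as a repeated fragment
    return out

w = ["<BOS>", "それでは", "会議", "を", "始め", "ます", "<EOS>"]
l = ["<F>", "0", "0", "<D>", "0", "0", "0"]
print("".join(apply_noise_labels(w, l, rng=random.Random(0))))
```

A <F> label after <BOS> places a filler at the start of the utterance, and the <D> label after 「を」 duplicates it, yielding text like 「えーそれでは会議をを始めます」.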
 Specifically, the noise adding unit 12 first inputs the morpheme sequence of the source language text into the noise model and obtains the output vector h_t of the noise model at each time step (each morpheme). In this embodiment, the noise label at each time step is not simply estimated as the label whose posterior probability is maximal; instead, it is determined by sampling based on the multinomial distribution defined by exponentiating the output vector, exp(h_t/τ). That is, the noise label l_t at each time step is estimated according to the following equation (1):

    l_t ~ exp(h_t/τ)   (1)

In equation (1), l_t is the estimated noise label, h_t is the output vector of the noise model, and τ is a temperature parameter. The output vector h_t is a three-dimensional vector over the three label types (<F>, <D>, 0). The temperature parameter τ controls how much the noise labels vary: as τ is made larger (τ→∞), the probability distribution over noise labels approaches a uniform distribution, and as it is made smaller (τ→0), the noise label with the highest probability comes to be selected.
 As an example, the determination of the noise label when the temperature parameter τ is relatively small will be described. Suppose the output vector of the noise model is h_t = (-0.1 (weight score of 0), 0.3 (weight score of <F>), -0.3 (weight score of <D>)) and the temperature parameter is τ = 0.15. In this case, h_t/τ = (-0.666..., 2, -2). Taking the exponential so that every weight score becomes positive gives exp(h_t/τ) = (0.51, 7.39, 0.13). Normalizing these scores so that they lie in [0, 1] and sum to 1, so that they can be treated as probabilities, yields the probability distribution (0.06 (probability that 0 is selected as the noise label), 0.92 (probability that <F> is selected), 0.02 (probability that <D> is selected)). Sampling (trying) a noise label just once from such a probability distribution (a multinomial distribution) corresponds to sampling from a categorical distribution. In this case, the probability of the noise label <F> is extremely high at 92%, and <F> is therefore very likely to be selected as the sampling result.
 Next, the determination of the noise label when the temperature parameter τ is relatively large will be described. Suppose again that the output vector of the noise model is h_t = (-0.1 (weight score of 0), 0.3 (weight score of <F>), -0.3 (weight score of <D>)), but that the temperature parameter is τ = 1.0. In this case, h_t/τ = (-0.1, 0.3, -0.3), and taking the exponential so that every weight score becomes positive gives exp(h_t/τ) = (0.90, 1.35, 0.74). Normalizing as above yields the probability distribution (0.30 (probability that 0 is selected as the noise label), 0.45 (probability that <F> is selected), 0.25 (probability that <D> is selected)). Thus, increasing the temperature parameter τ makes the noise labels 0 and <D> easier to select than in the case of τ = 0.15 described above. As τ approaches ∞, the probability of each noise label approaches 33.333...% and the probability distribution approaches a uniform distribution.
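The two worked examples above can be checked numerically. The following short sketch recomputes both probability distributions, rounded to two decimal places as in the text:

```python
import math

def label_probs(h, tau):
    """Normalize exp(h / tau) into a probability distribution."""
    exps = [math.exp(x / tau) for x in h]
    z = sum(exps)
    return [e / z for e in exps]

h = (-0.1, 0.3, -0.3)  # weight scores for the labels (0, <F>, <D>)

low_tau = [round(p, 2) for p in label_probs(h, 0.15)]
high_tau = [round(p, 2) for p in label_probs(h, 1.0)]
print(low_tau)   # [0.06, 0.92, 0.02] -- sharp distribution, <F> dominates
print(high_tau)  # [0.3, 0.45, 0.25]  -- flatter, closer to uniform
```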
 In this way, the noise adding unit 12 samples noise labels according to the probability distribution based on the score of each noise label output from the noise model, which takes the feature (morpheme sequence) of each word of the source language text as input, and thereby determines the noise labels to be added to the source language text. Although the probability distribution defined from the output values of the noise model has been described above as a multinomial distribution, this is not limiting; the probability distribution may instead be, for example, a Poisson distribution or a normal distribution.
 Subsequently, the noise adding unit 12 replaces the noise label sequence predicted with the noise model with words representing noise. For example, the noise adding unit 12 samples a word from the vocabulary set V_type corresponding to each noise label, based on unigram probabilities. For example, when the filler noise label <F> is replaced with a word representing a filler, the word is determined based on the following equation (2).

    w_t′ ~ V_<F> ... (2)

In equation (2), V_<F> is the vocabulary set for the noise label <F>, and w_t′ is the word representing the filler (noise) to be inserted at time step t. Through the above, a sequence w′ = (w_0, w_1, w_1′, w_2, w_2′, ..., w_n) containing words representing noise is obtained from the morpheme sequence w = (w_0, w_1, ..., w_n) of the source language text.
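A minimal sketch of this substitution step follows. The vocabulary sets and unigram counts below are invented for illustration; in the actual system the sets V_type would be estimated from the noise-annotated training data:

```python
import random

# Hypothetical vocabulary sets V_type with unigram frequency counts.
VOCAB = {
    "<F>": {"えー": 7, "あのー": 2, "まあ": 1},  # filler words
    "<D>": {"その": 2, "ちょっと": 1},           # disfluency words (assumed)
}

def substitute_noise_labels(tokens, rng=random):
    """Replace each inserted noise label with a word sampled from its
    vocabulary set according to unigram probability, as in Eq. (2)."""
    out = []
    for tok in tokens:
        if tok in VOCAB:
            words = list(VOCAB[tok])
            freqs = list(VOCAB[tok].values())  # unigram frequencies as weights
            out.append(rng.choices(words, weights=freqs, k=1)[0])
        else:
            out.append(tok)
    return out
```

Running this on a label-annotated morpheme sequence such as `["主要", "な", "<F>", "高速道路"]` yields the same sequence with `<F>` replaced by one sampled filler word.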
 The noise adding unit 12 obtains a plurality of patterns of noise-added source language text from a single source language text. For example, the noise adding unit 12 may derive a plurality of patterns of replacement words (words representing noise) for a single noise label, thereby obtaining a plurality of patterns of noise-added source language text from one source language text. Alternatively, the noise adding unit 12 may derive a plurality of patterns of noise labels for each morpheme, thereby obtaining a plurality of patterns of noise-added source language text from one source language text.
 The corpus construction unit 13 constructs a pseudo-parallel corpus in which each noise-added source language text is associated with the target language text corresponding to the source language text before noise addition. FIG. 5 shows an image of constructing the pseudo-parallel corpus. In the example shown in FIG. 5, seven noise-added variants of the source language text "主要な高速道路よりも観光ルートの方を走りたいです" before noise addition (for example, a variant beginning with the filler "えー") are each associated (as translation pairs) with "I would rather take a scenic route than a main highway.", the target language text corresponding to the source language text before noise addition.
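The pairing itself is simple: every noisy variant shares the target sentence of its clean source. The following sketch illustrates this; the `noisify` callable stands in for the noise-addition procedure described above and is an assumption of this example:

```python
def build_pseudo_corpus(parallel_pairs, noisify, n_variants=3):
    """Associate each noisy variant of a source text with the target text
    of the ORIGINAL (pre-noise) source, as in the Fig. 5 example."""
    corpus = []
    for source, target in parallel_pairs:
        for _ in range(n_variants):
            corpus.append((noisify(source), target))
    return corpus

pairs = [("主要な高速道路よりも観光ルートの方を走りたいです",
          "I would rather take a scenic route than a main highway.")]
# Toy noisify: prefix a filler; a real one would use the noise model.
corpus = build_pseudo_corpus(pairs, lambda s: "えー " + s, n_variants=7)
```

Every entry of `corpus` keeps the clean English target, so a translation model trained on it learns to ignore the injected noise.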
 The storage unit 14 is a database (DB) that stores the pseudo-parallel corpus constructed by the corpus construction unit 13. The translation model learning device 50 trains a translation model using the pseudo-parallel corpus stored in the storage unit 14.
 Next, the processing executed by the translation data generation system 1 will be described with reference to FIG. 6, which is a flowchart of that processing. It is assumed, as a precondition for the processing shown in FIG. 6, that the noise model has already been constructed (trained) by the noise model learning device 40.
 As shown in FIG. 6, in the translation data generation system 1, the analysis unit 11 of the translation data generation device 10 first acquires a source language text from the parallel corpus DB 20 (step S1). Subsequently, the analysis unit 11 performs morphological analysis on the acquired source language text (step S2).
 Subsequently, the noise adding unit 12 of the translation data generation device 10 adds noise to the morpheme sequence extracted by the analysis unit 11 to obtain the noise-added source language text (step S3). Specifically, the noise adding unit 12 predicts, using the noise model, the noise label sequence corresponding to the morpheme sequence of the input source language text and inserts each noise label immediately after the corresponding morpheme. The noise adding unit 12 then replaces each inserted noise label with a word representing noise, and obtains the final output, the noise-added source language text.
 Subsequently, the corpus construction unit 13 of the translation data generation device 10 constructs a pseudo-parallel corpus in which the noise-added source language text is associated with the target language text corresponding to the source language text before noise addition (step S4).
 Finally, the translation model learning device 50 trains a translation model using the pseudo-parallel corpus constructed by the corpus construction unit 13 (step S5). The above is an example of the processing executed by the translation data generation system 1.
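Steps S1 through S4 can be summarized as the following pipeline sketch. The `analyze` and `add_noise` callables stand in for the analysis unit 11 and the noise adding unit 12; their concrete implementations are assumptions of this example:

```python
def generate_translation_data(bilingual_corpus, analyze, add_noise):
    """S1: read source/target pairs; S2: morphological analysis;
    S3: noise addition; S4: build the pseudo-parallel corpus."""
    pseudo_corpus = []
    for source, target in bilingual_corpus:               # S1: acquire text
        morphemes = analyze(source)                       # S2: tokenize
        noisy_morphemes = add_noise(morphemes)            # S3: add noise
        pseudo_corpus.append((" ".join(noisy_morphemes), target))  # S4: pair
    return pseudo_corpus
```

Step S5 (training the translation model on the returned pairs) is then handled by the translation model learning device 50 and is outside this sketch.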
 Next, the functions and effects of the present embodiment will be described.
 The translation data generation system 1 according to the present embodiment includes the noise adding unit 12, which adds noise to a source language text to obtain a noise-added source language text, and the corpus construction unit 13, which constructs a pseudo-parallel corpus in which the noise-added source language text is associated with the target language text corresponding to the source language text before noise addition.
 In the translation data generation system 1 according to the present embodiment, noise is added to the source language text, and a pseudo-parallel corpus is constructed in which each noise-added source language text is associated with the target language text corresponding to the source language text before noise addition. Because the noise-added source language text is paired with the target language text of the clean source language text, such a corpus makes it possible to appropriately derive the target language text corresponding to the pre-noise source language text even when, for example, a natural utterance input contains noise such as fillers. That is, according to the translation data generation system 1 of the present embodiment, a corpus (pseudo-parallel corpus) that is robust to noisy natural utterances can be constructed, and even non-fluent natural utterances containing noise can be translated with high accuracy. Furthermore, when information generated by the translation data generation system 1 is used for translation, there is no need to correct the content of the user's utterance before inputting it into the translation model; the utterance can be input as it is. Moreover, whereas the systems described in, for example, JP 2010-079647 A and JP 2007-057844 A use a speech recognition device to receive the user's utterances one by one and determine whether each is a restatement, the translation data generation system 1 according to the present embodiment does not require a speech recognition device and needs only the text information of the recognition result. In this way, the translation data generation system 1 according to the present embodiment can avoid carrying out utterance correction processing and restatement determination processing, and therefore also has the technical effect of reducing the processing load on a processing unit such as a CPU.
 FIG. 7 is a table showing translation examples of the present embodiment and a comparative example. As shown in the upper part of FIG. 7, for a natural utterance input containing noise, the comparative example omits part of the content from the translation. As shown in the lower part of FIG. 7, for another natural utterance input containing noise, the comparative example translates the noise along with the content, and the desired translation cannot be obtained. As the comparative example shows, it has conventionally been difficult to translate noisy natural utterances with high accuracy. In contrast, as shown in the upper and lower parts of FIG. 7, when translation is performed taking into account the pseudo-parallel corpus constructed by the translation data generation system 1 of the present embodiment, translation errors are unlikely to occur even for noisy natural utterances, and translation can be performed with high accuracy.
 The translation data generation system 1 includes the translation model learning device 50, which trains a translation model using the pseudo-parallel corpus. By training the translation model on the constructed corpus, noisy natural utterances can be translated with even higher accuracy.

 The translation data generation system 1 includes the noise model learning device 40, which learns a noise model for adding noise to source language texts, using training data consisting of a group of source language texts containing noise, and the noise adding unit 12 adds noise to the source language text using that noise model. Because the noise model is learned from a group of source language texts that already contain noise, and noise is added based on that model, the kinds of noise likely to occur in practice are readily added, which further improves translation accuracy.

 In the translation data generation system 1, the noise adding unit 12 assigns to each word of the source language text a noise label indicating a type of noise, and replaces the noise label with a word corresponding to that label, thereby adding noise to the source language text. Since a word (noise) corresponding to the noise label is derived only after a label appropriate to each word has been assigned, both the ease and the validity of the noise addition are ensured.

 In the translation data generation system 1, the noise adding unit 12 derives a plurality of patterns of replacement words for a single noise label, and obtains a plurality of patterns of noise-added source language text from one source language text. This efficiently enriches the pseudo-parallel corpus from a single source language text and further improves translation accuracy.

 In the translation data generation system 1, the noise adding unit 12 derives a plurality of patterns of noise labels for each word, and obtains a plurality of patterns of noise-added source language text from one source language text. This likewise efficiently enriches the pseudo-parallel corpus from a single source language text and further improves translation accuracy.

 In the translation data generation system 1, the noise adding unit 12 assigns noise labels according to the features of each word of the source language text. This makes it possible to assign to each word an appropriate noise label for the noise likely to occur in association with that word.

 In the translation data generation system 1, the noise adding unit 12 assigns noise labels according to at least one feature of each word of the source language text: its morpheme, its part of speech, or its reading. This makes it possible to assign to each word an appropriate noise label for the noise likely to occur in association with that word.

 In the translation data generation system 1, the noise adding unit 12 samples noise labels according to the probability distribution over noise labels based on the score of each noise label output from the noise model, which takes the features of each word of the source language text as input, and thereby determines the noise labels to be added to the source language text. This makes it possible, for example, to assign noise labels given high scores by the noise model, and thus to assign to each word an appropriate noise label for the noise likely to occur in association with that word.
 Finally, the hardware configuration of the translation data generation device 10 will be described with reference to FIG. 8. The translation data generation device 10 described above may be physically configured as a computer device including a processor 1001, a memory 1002, a storage 1003, a communication device 1004, an input device 1005, an output device 1006, a bus 1007, and the like.

 In the following description, the term "device" can be read as a circuit, a device, a unit, or the like. The hardware configuration of the translation data generation device 10 may include one or more of each of the devices shown in the figure, or may be configured without some of the devices.

 Each function of the translation data generation device 10 is realized by loading predetermined software (a program) onto hardware such as the processor 1001 and the memory 1002, whereby the processor 1001 performs computation and controls communication by the communication device 1004 and the reading and/or writing of data in the memory 1002 and the storage 1003.

 The processor 1001 controls the entire computer by, for example, running an operating system. The processor 1001 may be configured as a central processing unit (CPU) including an interface with peripheral devices, a control device, an arithmetic device, registers, and the like. For example, the control functions of the noise adding unit 12 and other units of the translation data generation device 10 may be realized by the processor 1001.

 The processor 1001 also reads programs (program code), software modules, and data from the storage 1003 and/or the communication device 1004 into the memory 1002 and executes various kinds of processing in accordance with them. The program used is one that causes a computer to execute at least part of the operations described in the above embodiment. For example, the control functions of the noise adding unit 12 and other units of the translation data generation device 10 may be realized by a control program that is stored in the memory 1002 and runs on the processor 1001, and the other functional blocks may be realized in the same manner. Although the various kinds of processing described above have been explained as being executed by a single processor 1001, they may be executed simultaneously or sequentially by two or more processors 1001. The processor 1001 may be implemented as one or more chips. The program may be transmitted from a network via a telecommunication line.

 The memory 1002 is a computer-readable recording medium and may be configured as, for example, at least one of ROM (Read Only Memory), EPROM (Erasable Programmable ROM), EEPROM (Electrically Erasable Programmable ROM), and RAM (Random Access Memory). The memory 1002 may be called a register, a cache, a main memory (main storage device), or the like. The memory 1002 can store executable programs (program code), software modules, and the like for carrying out the method according to an embodiment of the present invention.

 The storage 1003 is a computer-readable recording medium and may be configured as, for example, at least one of an optical disk such as a CD-ROM (Compact Disc ROM), a hard disk drive, a flexible disk, a magneto-optical disk (for example, a compact disc, a digital versatile disc, or a Blu-ray (registered trademark) disc), a smart card, a flash memory (for example, a card, a stick, or a key drive), a floppy (registered trademark) disk, and a magnetic strip. The storage 1003 may be called an auxiliary storage device. The above-mentioned storage medium may be, for example, a database, a server, or another appropriate medium including the memory 1002 and/or the storage 1003.

 The communication device 1004 is hardware (a transmitting/receiving device) for communication between computers via a wired and/or wireless network, and is also called, for example, a network device, a network controller, a network card, or a communication module.

 The input device 1005 is an input device that receives input from the outside (for example, a keyboard, a mouse, a microphone, a switch, a button, or a sensor). The output device 1006 is an output device that performs output to the outside (for example, a display, a speaker, or an LED lamp). The input device 1005 and the output device 1006 may be integrated into a single device (for example, a touch panel).

 The devices such as the processor 1001 and the memory 1002 are connected by the bus 1007 for communicating information. The bus 1007 may be configured as a single bus or as different buses between devices.

 The translation data generation device 10 may also be configured to include hardware such as a microprocessor, a digital signal processor (DSP), an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), or an FPGA (Field Programmable Gate Array), and part or all of each functional block may be realized by such hardware. For example, the processor 1001 may be implemented with at least one of these kinds of hardware.
 Although the present embodiment has been described above in detail, it will be apparent to those skilled in the art that the present embodiment is not limited to the embodiment described in this specification. The present embodiment can be carried out in modified and altered forms without departing from the spirit and scope of the present invention as defined by the claims. The description in this specification is therefore intended as an illustrative explanation and in no way limits the present embodiment. For example, the translation data generation system according to one aspect of the present invention may add predefined noise words (fillers, hesitations, restatements, and the like) at random positions in the source language text. The word (noise) to be added at a random position may be selected at random from, for example, a set of noise word candidates. With such a configuration, as long as noise words can be defined, a noise model that adds noise at random can be constructed even without training data (labeled data) for the noise model.
 本明細書で説明した各態様/実施形態は、LTE(Long Term Evolution)、LTE-A(LTE-Advanced)、SUPER 3G、IMT-Advanced、4G、5G、FRA(Future Radio Access)、W-CDMA(登録商標)、GSM(登録商標)、CDMA2000、UMB(Ultra Mobile Broad-band)、IEEE 802.11(Wi-Fi)、IEEE 802.16(WiMAX)、IEEE 802.20、UWB(Ultra-Wide Band)、Bluetooth(登録商標)、その他の適切なシステムを利用するシステム及び/又はこれらに基づいて拡張された次世代システムに適用されてもよい。 Each aspect/embodiment described in this specification is LTE (Long Term Evolution), LTE-A (LTE-Advanced), SUPER 3G, IMT-Advanced, 4G, 5G, FRA (Future Radio Access), W-CDMA. (Registered trademark), GSM (registered trademark), CDMA2000, UMB (Ultra Mobile Broad-band), IEEE 802.11 (Wi-Fi), IEEE 802.16 (WiMAX), IEEE 802.20, UWB (Ultra-Wide) Band), Bluetooth (registered trademark), a system using other appropriate systems, and/or a next-generation system extended based on the above.
 本明細書で説明した各態様/実施形態の処理手順、フローチャートなどは、矛盾の無い限り、順序を入れ替えてもよい。例えば、本明細書で説明した方法については、例示的な順序で様々なステップの要素を提示しており、提示した特定の順序に限定されない。 As long as there is no contradiction, the order of the processing procedures, flowcharts, etc. of each aspect/embodiment described in this specification may be changed. For example, the methods described herein present elements of the various steps in a sample order, and are not limited to the specific order presented.
 入出力された情報等は特定の場所(例えば、メモリ)に保存されてもよいし、管理テーブルで管理してもよい。入出力される情報等は、上書き、更新、または追記され得る。出力された情報等は削除されてもよい。入力された情報等は他の装置へ送信されてもよい。 Information that has been input and output may be stored in a specific location (for example, memory), or may be managed in a management table. Information that is input/output may be overwritten, updated, or added. The output information and the like may be deleted. The input information and the like may be transmitted to another device.
 判定は、1ビットで表される値(0か1か)によって行われてもよいし、真偽値(Boolean:trueまたはfalse)によって行われてもよいし、数値の比較(例えば、所定の値との比較)によって行われてもよい。 The determination may be performed by a value represented by 1 bit (whether 0 or 1), may be performed by a Boolean value (Boolean: true or false), and may be performed by comparing numerical values (for example, a predetermined value). Value comparison).
 本明細書で説明した各態様/実施形態は単独で用いてもよいし、組み合わせて用いてもよいし、実行に伴って切り替えて用いてもよい。また、所定の情報の通知(例えば、「Xであること」の通知)は、明示的に行うものに限られず、暗黙的(例えば、当該所定の情報の通知を行わない)ことによって行われてもよい。 Each aspect/embodiment described in the present specification may be used alone, may be used in combination, or may be switched according to execution. Further, the notification of the predetermined information (for example, the notification of “being X”) is not limited to the explicit notification, and is performed implicitly (for example, the notification of the predetermined information is not performed). Good.
 ソフトウェアは、ソフトウェア、ファームウェア、ミドルウェア、マイクロコード、ハードウェア記述言語と呼ばれるか、他の名称で呼ばれるかを問わず、命令、命令セット、コード、コードセグメント、プログラムコード、プログラム、サブプログラム、ソフトウェアモジュール、アプリケーション、ソフトウェアアプリケーション、ソフトウェアパッケージ、ルーチン、サブルーチン、オブジェクト、実行可能ファイル、実行スレッド、手順、機能などを意味するよう広く解釈されるべきである。 Software, whether called software, firmware, middleware, microcode, hardware description language, or any other name, instructions, instruction sets, code, code segments, program code, programs, subprograms, software modules. , Application, software application, software package, routine, subroutine, object, executable, thread of execution, procedure, function, etc. should be construed broadly.
 Software, instructions, and the like may also be transmitted and received via a transmission medium. For example, when software is transmitted from a website, server, or other remote source using wired technology such as coaxial cable, fiber-optic cable, twisted pair, and digital subscriber line (DSL), and/or wireless technology such as infrared, radio, and microwave, these wired and/or wireless technologies are included within the definition of transmission medium.
 The information, signals, and the like described herein may be represented using any of a variety of different technologies. For example, data, instructions, commands, information, signals, bits, symbols, chips, and the like that may be referred to throughout the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or magnetic particles, optical fields or photons, or any combination thereof.
 The terms described in this specification and/or the terms necessary for understanding this specification may be replaced with terms having the same or similar meanings.
 The information, parameters, and the like described in this specification may be represented by absolute values, by relative values from predetermined values, or by other corresponding information.
 A user terminal may also be referred to by those skilled in the art as a mobile communication terminal, subscriber station, mobile unit, subscriber unit, wireless unit, remote unit, mobile device, wireless device, wireless communication device, remote device, mobile subscriber station, access terminal, mobile terminal, wireless terminal, remote terminal, handset, user agent, mobile client, client, or by some other suitable term.
 As used herein, the terms "judging (determining)" and "deciding (determining)" may encompass a wide variety of operations. "Judging" and "deciding" may include, for example, regarding calculating, computing, processing, deriving, investigating, looking up (for example, looking up in a table, a database, or another data structure), or ascertaining as having "judged" or "decided". "Judging" and "deciding" may also include regarding receiving (for example, receiving information), transmitting (for example, transmitting information), input, output, or accessing (for example, accessing data in a memory) as having "judged" or "decided". Furthermore, "judging" and "deciding" may include regarding resolving, selecting, choosing, establishing, comparing, and the like as having "judged" or "decided". That is, "judging" and "deciding" may include regarding some operation as having been "judged" or "decided".
 As used herein, the phrase "based on" does not mean "based only on" unless expressly specified otherwise. In other words, the phrase "based on" means both "based only on" and "based at least on".
 Where designations such as "first" and "second" are used herein, any reference to those elements does not generally limit the quantity or order of those elements. These designations may be used herein as a convenient way of distinguishing between two or more elements. Thus, a reference to first and second elements does not imply that only two elements may be employed there, or that the first element must precede the second element in some way.
 To the extent that the terms "include", "including", and variations thereof are used in this specification or the claims, these terms, like the term "comprising", are intended to be inclusive. Furthermore, the term "or" as used in this specification or the claims is not intended to be an exclusive OR.
 In this specification, a plurality of devices is also included, unless the context or the technology clearly indicates that only one device exists.
 Throughout this disclosure, the plural is included unless the context clearly indicates the singular.
 1... translation data generation system, 12... noise adding unit, 13... corpus construction unit, 40... noise model learning device (noise model learning unit), 50... translation model learning device (translation model learning unit).

Claims (10)

  1.  A translation data generation system comprising: a noise adding unit that adds noise to a source language text to obtain a noise-added source language text; and a corpus construction unit that constructs a pseudo bilingual corpus in which the noise-added source language text is associated with a target language text corresponding to the source language text before noise addition.
  2.  The translation data generation system according to claim 1, further comprising a translation model learning unit that learns a translation model using the pseudo bilingual corpus.
  3.  The translation data generation system according to claim 1 or 2, further comprising a noise model learning unit that learns, using training data that is a group of source language texts containing noise, a noise model for adding noise to a source language text, wherein the noise adding unit adds noise to the source language text using the noise model.
  4.  The translation data generation system according to claim 3, wherein the noise adding unit adds noise to the source language text by assigning, to each word of the source language text, a noise label indicating a type of noise, and replacing the noise label with a word corresponding to the noise label.
  5.  The translation data generation system according to claim 4, wherein the noise adding unit derives a plurality of patterns of replacement words for one noise label, thereby obtaining a plurality of patterns of the noise-added source language text from one source language text.
  6.  The translation data generation system according to claim 4 or 5, wherein the noise adding unit derives a plurality of patterns of the noise labels corresponding to each word, thereby obtaining a plurality of patterns of the noise-added source language text from one source language text.
  7.  The translation data generation system according to any one of claims 4 to 6, wherein the noise adding unit assigns the noise label according to features of each word of the source language text.
  8.  The translation data generation system according to claim 7, wherein the noise adding unit assigns the noise label according to at least one of a morpheme, a part of speech, and a reading of the word, which are features of each word of the source language text.
  9.  The translation data generation system according to claim 7 or 8, wherein the noise adding unit determines the noise label to be assigned to the source language text by sampling noise labels according to a probability distribution over the noise labels based on scores of the noise labels output from the noise model, with the features of each word of the source language text as input.
  10.  The translation data generation system according to any one of claims 3 to 9, wherein the noise model is constructed by a method using a conditional random field or a neural network.
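Read together, claims 1, 4, 5, 9, and 10 describe a pipeline: a noise model scores noise labels per word, labels are sampled from the resulting distribution, each label is replaced by a concrete noise token, and every noisy variant is paired with the original clean target translation. The following is a minimal sketch of that pipeline; the label inventory ("O" for no noise, "F" for filler, "D" for disfluent repetition), the filler word list, and the toy per-word distributions are all illustrative assumptions — the patent leaves these unspecified and uses a trained conditional random field or neural network as the noise model rather than a lookup table.

```python
import random

# Hypothetical filler inventory; the patent does not specify concrete tokens.
FILLERS = ["uh", "um", "well"]


def assign_noise_labels(words, label_probs):
    """Sample one noise label per word from a per-word probability
    distribution (a toy stand-in for the CRF/NN scores of claims 9-10)."""
    labels = []
    for word in words:
        dist = label_probs.get(word, {"O": 1.0})  # default: no noise
        label = random.choices(list(dist), weights=list(dist.values()))[0]
        labels.append(label)
    return labels


def add_noise(words, labels):
    """Replace each noise label with a concrete noise token (claim 4):
    'F' inserts a filler before the word, 'D' repeats the word."""
    noisy = []
    for word, label in zip(words, labels):
        if label == "F":
            noisy.append(random.choice(FILLERS))
        elif label == "D":
            noisy.append(word)  # stammered repetition
        noisy.append(word)
    return noisy


def build_pseudo_corpus(pairs, label_probs, k=3):
    """Pair each noisy source variant with the ORIGINAL clean target
    (claims 1 and 5: k noisy patterns per clean source sentence)."""
    corpus = []
    for src, tgt in pairs:
        words = src.split()
        for _ in range(k):
            labels = assign_noise_labels(words, label_probs)
            corpus.append((" ".join(add_noise(words, labels)), tgt))
    return corpus
```

In the sampled-label step, replacing the argmax label with sampling is what lets one clean sentence yield several distinct noisy variants, matching the multiple-patterns language of claims 5 and 6.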
PCT/JP2019/039337 2019-02-12 2019-10-04 Translation data generating system WO2020166125A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2020572078A JP7194759B2 (en) 2019-02-12 2019-10-04 Translation data generation system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2019-022411 2019-02-12
JP2019022411 2019-02-12

Publications (1)

Publication Number Publication Date
WO2020166125A1 true WO2020166125A1 (en) 2020-08-20

Family

ID=72043903

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2019/039337 WO2020166125A1 (en) 2019-02-12 2019-10-04 Translation data generating system

Country Status (2)

Country Link
JP (1) JP7194759B2 (en)
WO (1) WO2020166125A1 (en)

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2018055671A (en) * 2016-09-21 2018-04-05 パナソニックIpマネジメント株式会社 Paraphrase identification method, paraphrase identification device, and paraphrase identification program

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
* IMADE, MASAHIRO ET AL.: "Automatic corpora generation applied to neural machine translation", The 31st Annual Conference of the Japanese Society for Artificial Intelligence, 26 May 2017 (2017-05-26), pages 1-4 *
* OHTA, KENGO ET AL.: "Construction of Language Model with Fillers from Corpus without Fillers", IPSJ SIG Technical Report, vol. 2007, no. 75, 21 July 2007 (2007-07-21), pages 1-6 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113378586A (en) * 2021-07-15 2021-09-10 北京有竹居网络技术有限公司 Speech translation method, translation model training method, device, medium, and apparatus
CN113378586B (en) * 2021-07-15 2023-03-28 北京有竹居网络技术有限公司 Speech translation method, translation model training method, device, medium, and apparatus

Also Published As

Publication number Publication date
JP7194759B2 (en) 2022-12-22
JPWO2020166125A1 (en) 2021-10-21

Similar Documents

Publication Publication Date Title
US11157698B2 (en) Method of training a descriptive text generating model, and method and apparatus for generating descriptive text
CN108091328B (en) Speech recognition error correction method and device based on artificial intelligence and readable medium
CN107729313B (en) Deep neural network-based polyphone pronunciation distinguishing method and device
JP5901001B1 (en) Method and device for acoustic language model training
EP3792789A1 (en) Translation model training method, sentence translation method and apparatus, and storage medium
JP6493866B2 (en) Information processing apparatus, information processing method, and program
US20220092276A1 (en) Multimodal translation method, apparatus, electronic device and computer-readable storage medium
CN111079432B (en) Text detection method and device, electronic equipment and storage medium
CN112528637B (en) Text processing model training method, device, computer equipment and storage medium
CN110415679B (en) Voice error correction method, device, equipment and storage medium
CN113470619B (en) Speech recognition method, device, medium and equipment
CN111563390B (en) Text generation method and device and electronic equipment
US20180277145A1 (en) Information processing apparatus for executing emotion recognition
CN111414745A (en) Text punctuation determination method and device, storage medium and electronic equipment
WO2021070819A1 (en) Scoring model learning device, scoring model, and determination device
WO2020166125A1 (en) Translation data generating system
CN110569030B (en) Code recommendation method and device
WO2021215262A1 (en) Punctuation mark delete model training device, punctuation mark delete model, and determination device
CN113066510B (en) Vowel weak reading detection method and device
CN111626059B (en) Information processing method and device
US11842165B2 (en) Context-based image tag translation
CN114580446A (en) Neural machine translation method and device based on document context
CN110728137B (en) Method and device for word segmentation
CN113627155A (en) Data screening method, device, equipment and storage medium
CN113838456A (en) Phoneme extraction method, voice recognition method, device, equipment and storage medium

Legal Events

Date Code Title Description
ENP Entry into the national phase — Ref document number: 2020572078; Country of ref document: JP; Kind code of ref document: A
NENP Non-entry into the national phase — Ref country code: DE
122 Ep: pct application non-entry in european phase — Ref document number: 19915465; Country of ref document: EP; Kind code of ref document: A1