JP2015069359A

JP2015069359A - Translation device and translation program

Info

Publication number: JP2015069359A
Application number: JP2013202405A
Authority: JP
Inventors: Taro Miyazaki; Naoto Kato
Original assignee: Nippon Hoso Kyokai NHK; Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2013-09-27
Filing date: 2013-09-27
Publication date: 2015-04-13
Anticipated expiration: 2033-09-27
Also published as: JP6325789B2

Abstract

PROBLEM TO BE SOLVED: To achieve highly accurate sign language translation. SOLUTION: A translation device that performs sign language translation on input data includes: input character dividing means for dividing the input data into predetermined characters; translation means for translating the characters divided by the input character dividing means using a translation model in which combinations of learning data in a translation target language and sign language have been learned in advance in predetermined phrase units; and translation result output means for outputting the translation result produced by the translation means.

Description

The present application relates to a translation device and a translation program, and more particularly to a translation device and a translation program for realizing highly accurate sign language translation of proper nouns.

Techniques exist for translating from a source language into various target languages according to the intended use. A proper noun, for example, is translated using its reading: when 福島 is translated into English, its reading is written in Roman letters as "Fukushima". Translation techniques based on the readings of proper nouns and the like have thus been available conventionally (see, for example, Patent Document 1 and Patent Document 2).

In recent years, translation into sign language has attracted attention as one such target language. Sign language is an important means of communication for hearing-impaired people. In particular, for people who are congenitally deaf or who lost their hearing in early childhood, sign language is their first language and is easier to understand than Japanese. For this reason, presenting information in sign language is considered preferable to presenting it in written Japanese. Sign language can be presented using video such as CG (computer graphics). Accordingly, when translating from a source language into sign language, the source text is first converted into a sign language word string, the CG corresponding to each word is extracted from that word string, and the extracted CG clips are concatenated to generate a sign language video.

Patent Document 1: JP 2005-92682 A
Patent Document 2: JP-T-2005-520251

Incidentally, many proper nouns and the like are translated into sign language character by character. For example, in 松江 (Matsue), the character 松 is translated into the sign language word {松} (pine), and the character 江 is translated into the finger character {エ} (e). A part enclosed in {} represents one sign language word; the same applies in the following description.

Therefore, because the reading of a proper noun is seldom used when translating it into sign language, conventional translation techniques cannot be applied as they are.

In one aspect, an object of the present invention is to provide a translation device and a translation program for realizing highly accurate sign language translation.

To solve the above problem, the present invention employs means having the following features.

A translation device in one aspect performs sign language translation on input data and includes: input character dividing means for dividing the input data into predetermined characters; translation means for translating the characters divided by the input character dividing means using a translation model in which combinations of learning data in a translation target language and sign language have been learned in advance in predetermined phrase units; and translation result output means for outputting the translation result produced by the translation means.

A translation program in one aspect causes a computer to function as each of the means included in the translation device described above.

In sign language translation, highly accurate translation can be realized, particularly for proper nouns.

FIG. 1 is a diagram showing an example of the functional configuration of the translation device.
FIG. 2 is a diagram showing an example of function expansion for finger character conversion.
FIG. 3 is a flowchart showing an example of the translation model learning process.
FIG. 4 is a flowchart showing an example of the translation process.
FIG. 5 is a diagram showing an example of the replacement model.
FIG. 6 is a diagram showing an example of the language model.
FIG. 7 is a diagram showing a comparison with a conventional method.

<About this embodiment>
This embodiment provides a technique for translating between a translation target language (for example, Japanese) and sign language. The translation may run from Japanese (source language) to sign language (target language) or from sign language (source language) to Japanese (target language); Japanese, as the translation target language, can thus be either the source language or the target language. The following describes, as an example, translation with Japanese as the source language and the corresponding sign language (Japanese Sign Language) as the target language.

In general, four methods are used to express proper nouns in sign language: (1) kanji sign language, (2) finger characters, (3) kanji sign language + finger characters, and (4) fixed translation.

(1) Kanji sign language
Kanji sign language refers to a translation in which, for example, a proper noun is divided into characters and each character is replaced with the sign corresponding to it. For example, the kanji sign for the Japanese kanji 福 is {幸せ} (happiness), a sign language word that is close in meaning. Using this, 福島 (Fukushima) is divided into 福 and 島 and expressed with two words: the kanji sign {幸せ} for 福 and the kanji sign {島} (island) for 島. Such expressions are often more concise than the finger characters described in (2) below, and are frequently used when, for example, no fixed translation exists.

(2) Finger characters
Finger characters refer to a translation in which, for example, the kana reading of a proper noun is spelled out with finger characters. In sign language, all of the Japanese kana (the 50 sounds of the syllabary) are defined as finger characters. Finger characters are highly expressive but take a long time to express a single word, so they are not used much when translating Japanese. They are, however, often used where translation with kanji sign language is difficult, and for foreign place names, katakana loanwords, and the like.

(3) Kanji sign language + finger characters
Kanji sign language + finger characters refers to a translation method that combines (1) kanji sign language and (2) finger characters described above. For example, in 長野 (Nagano), 長 is expressed in kanji sign language with the sign language word {長い} (long), and 野 is expressed with the finger character {ノ} (no). Finger characters are often used for kanji with short kana readings, such as 野.

(4) Fixed translation
A fixed translation refers to a case in which the sign language word for a given proper noun is already established. For example, 広島 (Hiroshima) is expressed in sign language by tracing the shape of the torii gate of Itsukushima Shrine with the hands and fingers. Fixed translations are often highly distinctive, convey the meaning reliably, and are concise, so when one exists it is usually used in preference to the other methods.

Thus, except in the special case where a fixed translation exists, a proper noun is translated into sign language by dividing it into characters and replacing each character with the corresponding sign language word or finger character.

In translating proper nouns from Japanese into Japanese Sign Language, the kanji contained in the Japanese proper noun are replaced with sign language words or finger characters. It is therefore preferable to divide the proper noun into characters and to determine, one character at a time, which sign language word or finger character each character corresponds to.

In this embodiment, therefore, machine translation techniques are used to learn the sign language word or finger character corresponding to each character. Machine learning of the correspondence between Japanese characters and sign language words does not, however, reach high accuracy on its own. The causes include the large difference between the number of Japanese characters and the number of sign language words appearing in a sentence, and the large difference between the number of distinct Japanese characters and the number of distinct sign language words.

In this embodiment, therefore, the sentence pairs used for learning (for example, corresponding sentence pairs in Japanese and sign language) are divided into phrase units in advance to shorten the training pairs, which makes it easier to establish appropriate correspondences.

Preferred embodiments of the translation device and the translation program are described in detail below with reference to the drawings.

<Functional configuration example of the translation device>
FIG. 1 is a diagram illustrating an example of the functional configuration of the translation device. The translation device 10 in FIG. 1 includes translation model learning means 11, word input means 12, fixed-translation translation means 13, input character dividing means 14, translation means 15, finger character conversion means 16, and translation result output means 17.

The translation model learning means 11 learns a translation model in advance and stores the translation model used by the translation means 15. For example, the translation model learning means 11 learns the translation model after dividing the sentence pairs used for learning (for example, corresponding sentence pairs in Japanese and sign language) into phrase units. This allows the translation model to be learned with appropriate correspondences between the source language (Japanese) and the target language (Japanese Sign Language).

The word input means 12 accepts, as an example of input data, a character string to be translated (a character string (text) consisting of one or more words). The words input via the word input means 12 are output to the fixed-translation translation means 13 and the input character dividing means 14. The word input means 12 can accept a word character string through various input methods, such as character input using a keyboard or operation buttons on a touch panel, or voice input using a microphone, but is not limited to these. For example, the word input means 12 may accept news manuscript data or script data for films, dramas, and the like.

The fixed-translation translation means 13 uses a preset fixed translation dictionary 21 for sign language translation: if the word string input via the word input means 12 contains a word that is in the fixed translation dictionary 21, that word is translated into the corresponding fixed-translation sign. The fixed translation dictionary 21 stores proper nouns such as place names and personal names in association with their Japanese Sign Language signs, for example 神戸 → {神戸} (Kobe), 東京 → {東京} (Tokyo), and 横浜 → {横浜} (Yokohama). The result of translation by the fixed-translation translation means 13 is output to the input character dividing means 14.

The input character dividing means 14 divides into individual characters those words in the input word character string that were not translated by the fixed-translation translation means 13, and outputs the divided characters to the translation means 15.

The translation means 15 translates the divided character string using the translation model learned by the translation model learning means 11, and outputs the translation result to the finger character conversion means 16. A known statistical machine translation toolkit such as Moses can be used for this translation, but the translation is not limited to this.

Based on the result from the translation means 15, the finger character conversion means 16 converts into finger characters any characters of the input word character string that neither the fixed-translation translation means 13 nor the translation means 15 was able to translate. The finger character conversion means 16 outputs the finger character conversion result to the translation result output means 17.

The translation result output means 17 integrates the translation result of the fixed-translation translation means 13, the translation result of the translation means 15, and the output of the finger character conversion means 16, and outputs the translation result corresponding to the word character string input via the word input means 12.

For example, the translation result output means 17 can convert the translation result into sign language CG performed by a preset CG character and output it to a screen or the like, but is not limited to this; it may instead output the translation result as a character string (for example, 釜石 → {カマ (finger characters)}{石}).

<Functional configuration example of the translation model learning means 11>
An example of the functional configuration of the translation model learning means 11 is now described in detail. As shown in the example of FIG. 1, the translation model learning means 11 includes, for example, learning data storage means (sentence pairs) 31, sentence dividing means 32, learning data storage means (phrase pairs) 33, character unit dividing means 34, replacement model learning means 35, language model learning means 36, and translation model storage means 37. The learning data storage means (sentence pairs) 31 and the learning data storage means (phrase pairs) 33 may be configured as a single learning data storage means.

The learning data storage means (sentence pairs) 31 stores, for example, sentence pairs in the source and target languages, segmented into predetermined word units. In this embodiment it stores, for example, learning data consisting of Japanese and Japanese Sign Language pairs for each sentence (clause).

The sentence dividing means 32 takes the sentence pairs stored in the learning data storage means (sentence pairs) 31 as input and, using the result of word alignment (for example, word order and word-to-word correspondences), divides them into shorter phrase pairs. It then stores each sentence, phrase by phrase, in the learning data storage means (phrase pairs) 33. A known toolkit such as GIZA++ can be used to obtain the word alignment, but the means is not limited to this. GIZA++ is a tool that computes word probability values for use in statistical translation and can compute probabilities for word correspondences. The sentence dividing means 32 can therefore run GIZA++, use the result to divide the sentences into phrase pairs, and store them in the learning data storage means (phrase pairs) 33.
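The division of a sentence pair into shorter phrase pairs can be sketched as follows. This is only an illustration under stated assumptions: the alignment links are written by hand rather than produced by GIZA++, the function name `split_into_phrase_pairs` is hypothetical, and the cut rule (cut wherever no alignment link crosses the boundary) is one simple choice that the source does not specify.

```python
def split_into_phrase_pairs(src, tgt, links):
    """Cut a sentence pair into shorter phrase pairs.

    src, tgt: lists of tokens (Japanese words, sign language words).
    links: set of (i, j) pairs meaning src[i] is aligned with tgt[j].
    A cut is made after a source position when no link crosses the cut.
    """
    pairs = []
    s_start = t_start = 0
    for s_end in range(1, len(src) + 1):
        # Target positions linked to the current source segment.
        inside = [j for i, j in links if s_start <= i < s_end]
        if not inside:
            continue  # unaligned so far; keep growing the segment
        t_end = max(inside) + 1
        crosses = (any(i >= s_end and j < t_end for i, j in links)
                   or any(j < t_start for j in inside))
        if crosses:
            continue
        if s_end == len(src):
            t_end = len(tgt)  # attach trailing unaligned target words
        pairs.append((src[s_start:s_end], tgt[t_start:t_end]))
        s_start, t_start = s_end, t_end
    if s_start < len(src) or t_start < len(tgt):
        pairs.append((src[s_start:], tgt[t_start:]))  # leftover segment
    return pairs


# Toy input built from the examples in the text; the links are hand-written.
src = ["福島", "長野"]
tgt = ["{幸せ}", "{島}", "{長い}", "{ノ}"]
links = {(0, 0), (0, 1), (1, 2), (1, 3)}
phrase_pairs = split_into_phrase_pairs(src, tgt, links)
```

With this toy input the sentence pair is cut into one phrase pair for 福島 and one for 長野. Alignments that cross heavily yield few cuts, so the output stays close to the original sentence pair in that case.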

The character unit dividing means 34 divides the Japanese (source language) side of each phrase pair stored in the learning data storage means (phrase pairs) 33 into individual characters. Each resulting phrase pair, consisting of the Japanese divided into characters and the corresponding sign language word string, is output to the replacement model learning means 35.

The replacement model learning means 35 learns a replacement model from the phrase pairs whose Japanese side has been divided into characters by the character unit dividing means 34. The replacement model collects the translation probabilities computed between sequences of Japanese characters and sequences of sign language words. A specific example of the replacement model is described later.
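As a rough illustration of what a translation probability between Japanese character sequences and sign language word sequences can look like, the sketch below estimates it by relative frequency over whole phrase pairs. This is an assumption made for illustration: toolkits such as Moses estimate several scores from word alignments, and the source does not state the estimator. The function name and the toy counts are hypothetical.

```python
from collections import Counter


def learn_replacement_model(phrase_pairs):
    """Estimate P(sign sequence | Japanese character sequence) by counting.

    phrase_pairs: iterable of (japanese_string, list_of_sign_words).
    Returns a dict keyed by (character tuple, sign tuple).
    """
    pair_counts = Counter()
    source_counts = Counter()
    for japanese, signs in phrase_pairs:
        chars = tuple(japanese)  # character-unit division of the Japanese side
        pair_counts[(chars, tuple(signs))] += 1
        source_counts[chars] += 1
    return {key: count / source_counts[key[0]]
            for key, count in pair_counts.items()}


# Hypothetical training data: 園 seen twice with reading エン, once with ソノ.
toy_pairs = [("園", ["{エン}"]), ("園", ["{エン}"]), ("園", ["{ソノ}"]),
             ("福島", ["{幸せ}", "{島}"])]
model = learn_replacement_model(toy_pairs)
```

Here `model[(("園",), ("{エン}",))]` comes out as 2/3 and the 福島 entry as 1.0, because each probability is the share of a source character sequence's occurrences that were paired with a given sign sequence.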

The language model learning means 36 learns a language model from the contents of the learning data storage means (sentence pairs) 31; that is, the language model is generated from the data stored there. A known statistical language modeling toolkit such as SRILM can be used to generate the language model, but generation is not limited to this.
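For readers unfamiliar with statistical language models, a minimal bigram model is sketched below. It assumes the model is trained on the sign language side of the sentence pairs and uses unsmoothed maximum-likelihood estimates; neither point is stated in the source, and toolkits such as SRILM add smoothing and higher-order n-grams. The function name and the toy sentences are hypothetical.

```python
from collections import Counter


def train_bigram_lm(sentences):
    """Return maximum-likelihood bigram probabilities P(cur | prev)."""
    history_counts = Counter()
    bigram_counts = Counter()
    for words in sentences:
        seq = ["<s>"] + list(words) + ["</s>"]  # sentence boundary markers
        for prev, cur in zip(seq, seq[1:]):
            history_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return {bigram: count / history_counts[bigram[0]]
            for bigram, count in bigram_counts.items()}


# Toy sign language word sequences reusing the examples from the text.
toy_sentences = [["{幸せ}", "{島}"], ["{幸せ}", "{島}"], ["{長い}", "{ノ}"]]
lm = train_bigram_lm(toy_sentences)
```

During translation, such a model lets the decoder prefer sign word sequences that resemble those seen in the learning data.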

The translation model storage means 37 stores, as the translation model, the replacement model obtained from the replacement model learning means 35 and the language model obtained from the language model learning means 36.

As described above, the translation model learning means 11 divides the sentence pairs used for learning into phrase units in advance. This shortens the training pairs, allows them to be associated appropriately and easily with the word character string to be translated, and enables highly accurate translation.

<Function expansion example for finger character conversion (word-reading association)>
The finger character conversion means 16 of this embodiment obtains finger characters based on the reading of a character when, for example, the input contains a character that is not in the translation model, or when the translation yields a result such as "エン (finger characters)". This conversion is performed using, for example, a preset finger character conversion dictionary, but it requires each kanji to be associated with its reading. The translation device 10 may therefore be extended with a function that associates readings with the word character string input to the finger character conversion means 16.

FIG. 2 is a diagram illustrating an example of function expansion for finger character conversion. FIG. 2 shows only the part of the configuration of the translation device 10 that relates to input to and output from the finger character conversion means 16, and represents another embodiment of the translation device.

The example in FIG. 2 includes the word input means 12, word reading input means 41, word-reading association means 42, and the finger character conversion means 16.

The word reading input means 41 inputs the reading corresponding to the word (for example, the kanji notation of a proper noun) input by the word input means 12. The word-reading association means 42 associates kanji with readings using, for example, the kanji notation of the proper noun from the word input means 12 and the reading of the proper noun from the word reading input means 41.

The word-reading association means 42 is not limited to proper nouns and may associate readings with other words as well. The reading handled by the word reading input means 41 may be entered manually or obtained automatically from the kanji notation using a dictionary or the like. For example, when 園田 (ソノダ) is input, the word-reading association means 42 outputs which reading corresponds to which character, as in 園 (ソノ) 田 (ダ).

Using this result, when the translation means 15 has returned, for example, "エン (finger characters)" as the translation of 園, the finger character conversion means 16 can use the fact that 園 is read ソノ in this case and convert the result to "ソノ (finger characters)".
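The word-reading association and the subsequent correction can be sketched as follows. The candidate reading table `CANDIDATE_READINGS` and both function names are hypothetical, and the backtracking alignment is just one simple way to decide which part of the reading belongs to which kanji; the source does not describe the algorithm used.

```python
# Hypothetical table of candidate readings per kanji.
CANDIDATE_READINGS = {"園": ["エン", "ソノ"], "田": ["タ", "ダ", "デン"]}


def align_readings(word, reading):
    """Return [(kanji, reading part), ...], or None if no alignment exists."""
    if not word:
        return [] if not reading else None
    for candidate in CANDIDATE_READINGS.get(word[0], []):
        if reading.startswith(candidate):
            rest = align_readings(word[1:], reading[len(candidate):])
            if rest is not None:
                return [(word[0], candidate)] + rest
    return None


def correct_finger_reading(char, translated_reading, alignment):
    """Replace the reading chosen by the translator with the aligned one."""
    return dict(alignment).get(char, translated_reading)


alignment = align_readings("園田", "ソノダ")
corrected = correct_finger_reading("園", "エン", alignment)
```

For the 園田 example, the alignment pairs 園 with ソノ and 田 with ダ, so a translation result of エン for 園 is corrected to ソノ.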

<Example of the translation model learning process>
Next, an example of the translation model learning process in this embodiment is described using a flowchart. FIG. 3 is a flowchart illustrating an example of the translation model learning process. In the example of FIG. 3, the translation model learning means 11 divides the sentence pairs of the source language (Japanese) and the target language (Japanese Sign Language) contained in the previously stored learning data, and generates and stores phrase pairs (S01).

Next, the translation model learning means 11 divides the Japanese side of the phrase pairs into individual characters (S02). It then learns a replacement model from the phrase pairs using the divided characters (S03).

Next, the translation model learning means 11 learns a language model from the sentence pairs contained in the learning data (S04). The timing of S04 is not limited to this; it may, for example, be performed before S01.

Next, the translation model learning means 11 stores the replacement model obtained in S03 and the language model obtained in S04 (S05). The translation model learning process described above is performed before the translation process described below.

<Example of the translation process>
Next, an example of the translation process in this embodiment is described using a flowchart. FIG. 4 is a flowchart illustrating an example of the translation process. In the example of FIG. 4, the translation device 10 accepts input of a word to be translated (including character strings and the like) via the word input means 12 or the like (S11).

Next, the translation device 10 translates the word input in S11 into fixed-translation signs using the preset fixed translation dictionary 21 for sign language translation or the like (S12). The translation device 10 then divides into individual characters any part of the input word that cannot be translated with the fixed translation dictionary 21 (for example, newly coined words or new proper nouns) (S13). Next, the translation device 10 translates the text into sign language character by character using the translation model stored in the translation model storage means 37 (S14).

Next, if any characters remain untranslated after the processing up to S14, the translation device 10 converts them into finger characters (S15) and outputs the translation result obtained by the above processing, that is, the final translation result for the word input in S11 (S16).

The translation device 10 then determines whether to continue translating other words (S17). If translation continues (YES in S17), the process returns to S11. If translation does not continue (NO in S17), the translation process ends. The processing described above makes it possible to translate appropriately from Japanese into sign language.
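The flow S11 to S17 can be summarized in a short sketch. The three tables below are hypothetical stand-ins for the fixed translation dictionary 21, the learned translation model, and a reading table for finger character conversion. The sketch also simplifies S12 by matching a whole word against the dictionary, and S14 by a plain per-character lookup instead of a statistical decoder.

```python
# Hypothetical stand-ins for the resources used in FIG. 4.
FIXED_DICT = {"神戸": ["{神戸}"], "東京": ["{東京}"]}      # fixed translations
CHAR_MODEL = {"石": "{石}", "福": "{幸せ}", "島": "{島}"}  # per-character signs
READINGS = {"釜": "カマ"}                                  # kana readings


def translate_word(word):
    if word in FIXED_DICT:                    # S12: fixed translation
        return list(FIXED_DICT[word])
    signs = []
    for ch in word:                           # S13: divide into characters
        if ch in CHAR_MODEL:                  # S14: translate with the model
            signs.append(CHAR_MODEL[ch])
        else:                                 # S15: finger character fallback
            signs.append("{" + READINGS.get(ch, ch) + " (finger)}")
    return signs


def translate_all(words):
    # S11, S16, S17: take each input word in turn and collect its result.
    return [(word, translate_word(word)) for word in words]


results = translate_all(["神戸", "釜石"])
```

With these tables, 神戸 is resolved by the fixed dictionary, while 釜石 is split into 釜, which falls through to finger characters, and 石, which is translated by the per-character table, matching the 釜石 example given for the translation result output means 17.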

<Examples of various data>
Next, examples of the various data used in this embodiment are described with reference to the drawings.

<Example of the replacement model>
FIG. 5 is a diagram illustrating an example of the replacement model. The replacement model is the data learned by the replacement model learning means 35. In the example of FIG. 5, the items of the replacement model include, for example, "Japanese (source language) notation", "sign language (target language) notation", "various probabilities", "number of rows with the same sign language expression (within the replacement model)", and "number of rows with the same Japanese expression (within the replacement model)", but are not limited to these.

For the replacement model, the phrase pairs in the learning data are used to learn the combinations of Japanese (source language) and Japanese Sign Language (target language), together with the following probability values: "translation probability (likelihood) of sign language word → Japanese", "product of the per-word co-occurrence probabilities of the Japanese words for sign language word → Japanese", "translation probability (likelihood) of Japanese word → sign language word", "product of the per-word co-occurrence probabilities of the sign language words for Japanese word → sign language word", and "uniformly assigned value (e)".

In the "sign language notation" shown in FIG. 5, {pt} indicates a pointing action by the signer, such as a CG character, and {pt3} indicates pointing at a thing or person other than the signer or the addressee. {N} in the "sign language notation" represents a nod by the signer, such as a CG character performing sign language. In this embodiment, nods are not used within proper nouns, so they can be ignored during translation.

In this embodiment, the goal is to generate a translation model used for translating Japanese words into sign language words, so only the "translation probability (likelihood) from Japanese word to sign language word" data in the example of FIG. 5 is required; the other probabilities need not be included in the replacement model. Parameters such as these probability values are obtained by machine learning from data such as the "number of rows having the same sign language expression" and the "number of rows having the same Japanese expression". The "uniformly assigned numerical value (e)" is a value used to adjust scores during translation; it is not limited to the example of FIG. 5 and need not be included in the replacement model.

Translation using the translation model in this embodiment can be performed character by character, but is not limited to this; multiple characters or a phrase can also be translated as a unit. For example, because the learning data contains "福島" (Fukushima), "福島" also appears in the replacement model shown in FIG. 5, and this entry is also consulted during translation. Consequently, in the translation process of this embodiment, a character string that does not appear in that order in the learning data, such as "島福", is translated one character at a time.

<Example of language model>
FIG. 6 is a diagram showing an example of the language model. The language model is data learned by the language model learning means 36. The items of the language model shown in the example of FIG. 6 include, for example, "probability of the word sequence (log likelihood)", "word sequence", and "back-off probability", but are not limited to these. The translation process of this embodiment uses only the "probability of the word sequence (log likelihood)" and the "word sequence", so the "back-off probability" need not be included in the language model.

The "back-off probability" is used, for example when a 3-gram model that considers sequences of up to three words has been learned, to calculate the probability of a word string that did not occur as a three-word sequence.

In the example of FIG. 6, the back-off probability for {幸せ} ({happy}) is -0.1856697. This value represents the probability that, among the three-word strings following {幸せ}, one that did not appear in the learning data occurs. For a three-word sequence that does not appear in the learning data, the three-word model cannot be used, so the probability is expressed using a shorter model, such as the two-word or one-word model. The back-off probability is one example of a coefficient used in operations such as multiplication when falling back to such a lower-order model.
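The fall-back to a lower-order model described above can be sketched as follows. This is a minimal illustration that assumes the common convention of storing log10 values, so that multiplying by the back-off coefficient becomes an addition. The table `NGRAMS`, the function `log_prob`, the floor value, and every number except the back-off value of {幸せ} are hypothetical and are not taken from FIG. 6.

```python
# Hypothetical toy model: n-gram tuple -> (log10 probability, log10 back-off weight).
# Only the back-off weight of {幸せ} (-0.1856697) comes from the text; the rest is invented.
NGRAMS = {
    ("{幸せ}",): (-2.0, -0.1856697),
    ("{島}",): (-1.5, 0.0),
    ("{岸}",): (-2.5, 0.0),
    ("{幸せ}", "{島}"): (-0.4, 0.0),
}

UNSEEN_FLOOR = -99.0  # assumed log10 score for a word the model has never seen


def log_prob(word, context=()):
    """Return log10 P(word | context), backing off to shorter contexts."""
    ngram = tuple(context) + (word,)
    if ngram in NGRAMS:
        return NGRAMS[ngram][0]
    if not context:
        return UNSEEN_FLOOR
    # Back-off weight of the context; 0.0 (a factor of 1) if the context is unseen.
    backoff = NGRAMS.get(tuple(context), (0.0, 0.0))[1]
    # Drop the oldest context word and retry with the lower-order model.
    return backoff + log_prob(word, tuple(context)[1:])


# Seen bigram: the stored value is used directly.
print(log_prob("{島}", ("{幸せ}",)))
# Unseen bigram: back-off weight of {幸せ} plus the unigram score of {岸}.
print(log_prob("{岸}", ("{幸せ}",)))
```

In the probability domain, the second call corresponds to multiplying the unigram probability of {岸} by the back-off coefficient of {幸せ}.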

<Specific example of translation model learning and translation>
Next, a specific example of translation model learning and translation in this embodiment is described. Suppose, for example, that the learning data storage means (sentence pairs) 31 stores the following sentence pair:
"Japanese: 長野/は/朝/から/晴れる/でしょ/う" (Nagano / [topic marker] / morning / from / be sunny / probably)
"Sign language: {長い}/{ノ[指文字]}/{朝}/{から}/{晴れ}/{夢}" ({long}/{no [finger character]}/{morning}/{from}/{clear}/{dream})
Here, "/" is a label marking the boundaries between the divided words.

From this data, word alignments are obtained using, for example, "GIZA++", and phrase pairs are created on that basis. This yields the following three phrase pairs:
1. "Japanese: 長野/は", "Sign language: {長い}/{ノ[指文字]}"
2. "Japanese: 朝/から", "Sign language: {朝}/{から}"
3. "Japanese: 晴れる/でしょ/う", "Sign language: {晴れ}/{夢}"
In this embodiment, the Japanese side of these phrase pairs is divided into characters before the replacement model described above is learned.

Examples of the correspondences learned are "長 → {長い}", "野 → {ノ[指文字]}", "は → (no corresponding sign language word)", "朝 → {朝}", "か/ら → {から}", "晴/れ/る → {晴れ}", and "で/し/ょ/う → {夢}". Here, "/" is a label marking the boundaries between the divided characters.
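The character-unit division applied to the Japanese side of the phrase pairs before learning can be sketched as below. The function name `split_source_into_chars` and the list `PHRASE_PAIRS` are hypothetical; the data is the three phrase pairs from the example, and the learning step itself (alignment of characters to sign words) is not shown.

```python
def split_source_into_chars(phrase_pairs):
    """Split the Japanese side of each phrase pair into single characters.

    The sign language side is left as a sequence of sign words.
    """
    result = []
    for japanese_words, sign_words in phrase_pairs:
        chars = [char for word in japanese_words for char in word]
        result.append((chars, list(sign_words)))
    return result


# The three phrase pairs from the example (word-segmented Japanese, sign words).
PHRASE_PAIRS = [
    (["長野", "は"], ["{長い}", "{ノ[指文字]}"]),
    (["朝", "から"], ["{朝}", "{から}"]),
    (["晴れる", "でしょ", "う"], ["{晴れ}", "{夢}"]),
]

for chars, signs in split_source_into_chars(PHRASE_PAIRS):
    print("/".join(chars), "->", "/".join(signs))
```

Each resulting pair contains only a handful of characters and sign words, which is the property the embodiment relies on to make the character-to-sign correspondences easier to learn.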

In this embodiment, translating with the learning result described above (the translation model) makes it possible to achieve appropriate translation that exploits, for example, the way sign language expresses proper nouns.

Next, examples of the translation process in this embodiment are described.

<Example 1>
When, for example, "加藤" (Kato) is input as the input word, the translation apparatus 10 performs translation using the fixed translation dictionary 21. If "加藤 → {加藤}" exists in the fixed translation dictionary 21, the translation apparatus 10 outputs {加藤} as the translation result.

In Example 1, the fixed translation dictionary 21 contains sign language translations for all characters of the input word, so the translation apparatus 10 performs neither translation by the translation means 15 nor finger character conversion by the finger character conversion means 16.

<Example 2>
When, for example, "福島" (Fukushima) is input as the input word, the translation apparatus 10 first attempts translation using the fixed translation dictionary 21. Because the fixed translation dictionary 21 does not contain the input word, the translation apparatus 10 translates it using the translation rules "福 → {幸せ}" ({happy}) and "島 → {島}" ({island}) from the translation model stored in the translation model storage means 37, and outputs {幸せ}{島} as the translation result.

In Example 2, the translation model stored in the translation model storage means 37 contains sign language translations for the entire input word, so the translation apparatus 10 does not perform finger character conversion by the finger character conversion means 16.

<Example 3>
When, for example, "園田" (Sonoda) is input as the input word, the translation apparatus 10 first attempts translation using the fixed translation dictionary 21. Because the fixed translation dictionary 21 does not contain the input word, the translation apparatus 10 translates it character by character and obtains "園 → {エン(指文字)}" and "田 → {田}".

Because the "園 → {エン(指文字)}" part carries an instruction to convert to finger characters, the translation apparatus 10 processes it with the finger character conversion means 16.

In Example 3, the word-reading association means 42 described above associates "園田 (ソノダ)" as "園 (ソノ) 田 (ダ)". The translation apparatus 10 therefore converts the translation of "園" into {ソノ(指文字)}, and the final translation result is {ソノ(指文字)}{田}.

<Example 4>
When, for example, "釜石" (Kamaishi) is input as the input word, the translation apparatus 10 first attempts translation using the fixed translation dictionary 21. The fixed translation dictionary 21 contains "石 → {石}" but not "釜", so the translation apparatus 10 attempts translation using the translation model stored in the translation model storage means 37. However, the character "釜" has not been learned, and no translation rule exists for it.

The translation apparatus 10 therefore uses the finger character conversion means 16 to convert "釜" into finger characters. The input to the finger character conversion means 16 is thus "釜 → (指文字)" and "石 → {石}".

Because the reading of the whole word "釜石" is "カマイシ", the reading "釜 (カマ)" can be obtained using, for example, a known mono-ruby assignment technique. Alternatively, the word-reading association means 42 described above may associate "釜" with "カマ".

Thus "釜 → (指文字)" becomes "釜 → カマ(指文字)". The finger character conversion means 16 then converts "釜" into finger characters, and the final translation result is {カマ(指文字)}{石}.

The translation results shown in Examples 1 to 4 can be output, for example, as CG sign language performed by a CG character. Although Examples 1 to 4 each deal with a single word, the same processing can be applied to a character string (sentence) containing multiple words.
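The order of fallbacks across Examples 1 to 4 (fixed translation dictionary, then the learned translation model, then finger character conversion using the aligned reading) can be sketched as follows. This is a simplified illustration, not the apparatus itself: the tables `FIXED_DICTIONARY` and `CHAR_MODEL`, the marker `FINGERSPELL`, and the function `translate_word` are hypothetical, the model holds only one candidate per character, and the per-character readings are assumed to have been aligned beforehand.

```python
# Illustrative tables based on Examples 1 to 4; contents are assumptions.
FIXED_DICTIONARY = {"加藤": ["{加藤}"], "石": ["{石}"]}
FINGERSPELL = None  # hypothetical marker: render this character with finger characters
CHAR_MODEL = {"福": "{幸せ}", "島": "{島}", "園": FINGERSPELL, "田": "{田}"}


def translate_word(word, readings):
    """Translate one word; readings[i] is the katakana reading aligned to word[i]."""
    # 1. Whole-word lookup in the fixed translation dictionary (Example 1).
    if word in FIXED_DICTIONARY:
        return list(FIXED_DICTIONARY[word])
    signs = []
    for char, reading in zip(word, readings):
        if char in FIXED_DICTIONARY:
            # Character covered by the fixed dictionary (石 in Example 4).
            signs.extend(FIXED_DICTIONARY[char])
        elif CHAR_MODEL.get(char) is not FINGERSPELL:
            # 2. Character covered by the learned translation model (Example 2).
            signs.append(CHAR_MODEL[char])
        else:
            # 3. Marked for finger characters (Example 3) or unlearned (Example 4):
            #    spell out the reading aligned to this character.
            signs.append(f"{{{reading}(指文字)}}")
    return signs


print(translate_word("加藤", ["カ", "トウ"]))
print(translate_word("福島", ["フク", "シマ"]))
print(translate_word("園田", ["ソノ", "ダ"]))
print(translate_word("釜石", ["カマ", "イシ"]))
```

The sketch treats "marked for finger characters" and "not learned" identically; in the embodiment the former comes from the translation model and the latter from the absence of any rule.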

<Translation procedure>
In summary, when a word such as a proper noun is input, the translation method of this embodiment first divides the proper noun into characters and obtains the sign language words corresponding to each character from the replacement model. Because multiple translation candidates can be obtained for each character, the candidates are then combined in every possible pattern.

For example, "P(S)・P(T|S)" is calculated from the "score of the character conversion (translation probability)" obtained from the replacement model and the "plausibility of the word sequence of the translation result (word sequence probability)" obtained from the language model, and the candidate that maximizes it is output as the translation result. Here, P(S) is the likelihood obtained from the language model and P(T|S) is the likelihood obtained from the replacement model. P(S) represents the plausibility of the output sentence, and P(T|S) represents the probability that the original sentence is the input sentence T given the translation result S.

Taking "福島" as an example, suppose that computing the translation candidates and scores (translation probabilities) for each character from the replacement model yields the following Japanese → sign language candidates and likelihoods:
福 → {幸せ}: likelihood 0.6
福 → {フク[指文字]}: likelihood 0.5
島 → {島}: likelihood 0.7
島 → {岸}{島}: likelihood 0.2

Next, suppose the language model gives the following likelihoods for each combination:
{幸せ}{島}: likelihood 0.6
{フク[指文字]}{島}: likelihood 0.3
{幸せ}{岸}{島}: likelihood 0.2
{フク[指文字]}{岸}{島}: likelihood 0.1
From these results,
P({幸せ}{島}) = 0.6 * 0.7 * 0.6 = 0.252
P({フク[指文字]}{島}) = 0.5 * 0.7 * 0.3 = 0.105
P({幸せ}{岸}{島}) = 0.6 * 0.2 * 0.2 = 0.024
P({フク[指文字]}{岸}{島}) = 0.5 * 0.2 * 0.1 = 0.01
Because {幸せ}{島} has the highest score (0.252), it is output as the final translation result.

<Comparative example>
Next, a comparison of the translation results of this embodiment and a conventional method is described. FIG. 7 is a diagram showing the comparison with the conventional method. In the comparison, "GIZA++" and "grow-diag-final-and" were used for word alignment and translation model generation. "Moses" was used as the decoder, and "SRILM" was used to learn the language model.

Because Japanese proper nouns are character strings only a few characters long, a 3-gram was adopted for the language model, and 21,995 sentence pairs from an existing sign language news corpus were used as the learning data. Personal names and place names were used as the proper nouns.

First, a total of 200 proper nouns, consisting of 100 personal names randomly extracted from a Japanese surname database and 100 place names randomly extracted from Japanese city names, were translated by three native signers. The 96 personal names and 82 place names for which at least two signers produced the same expression were adopted as comparison data.

As shown in FIG. 7, the conventional method (the baseline method: a translation model learned directly from sentence pairs without dividing them into phrases) was correct for 74 of the 96 personal names (accuracy 77.1%) and 52 of the 82 place names (accuracy 63.4%). This embodiment (the proposed method) was correct for 79 of the 96 personal names (accuracy 82.3%) and 56 of the 82 place names (accuracy 68.3%).

That is, the proposed method improved translation accuracy by 5.2 points for personal names, 4.9 points for place names, and 5.0 points for proper nouns overall (total). This embodiment therefore achieves highly accurate sign language translation.
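The accuracy figures and point gains can be checked from the counts reported for FIG. 7. In this sketch the names `RESULTS`, `accuracy`, and `summarize` are hypothetical, and the gains are taken as differences of the percentages after rounding to one decimal place, which is the assumption under which the overall gain comes out as 5.0 points.

```python
# (correct, total) counts reported for FIG. 7.
RESULTS = {
    "baseline": {"person": (74, 96), "place": (52, 82)},
    "proposed": {"person": (79, 96), "place": (56, 82)},
}


def accuracy(correct, total):
    """Accuracy in percent, rounded to one decimal place."""
    return round(100.0 * correct / total, 1)


def summarize(results):
    """Return per-category and overall accuracy for each method."""
    table = {}
    for method, rows in results.items():
        table[method] = {name: accuracy(c, t) for name, (c, t) in rows.items()}
        correct = sum(c for c, _ in rows.values())
        total = sum(t for _, t in rows.values())
        table[method]["total"] = accuracy(correct, total)
    return table


table = summarize(RESULTS)
for category in ("person", "place", "total"):
    gain = table["proposed"][category] - table["baseline"][category]
    print(category, table["baseline"][category], table["proposed"][category],
          f"gain {gain:.1f} points")
```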

<Execution program>
The translation apparatus 10 described above can be configured as a computer that includes, for example, a CPU (Central Processing Unit), a volatile storage device such as RAM (Random Access Memory), a nonvolatile storage device such as ROM (Read Only Memory), input devices such as a mouse, keyboard, and pointing device, a display device that displays images, data, and the like, and an interface device for communicating with the outside.

Each of the functions of the translation apparatus 10 described above can therefore be realized by causing the CPU to execute a program that describes the function. These programs can also be stored on and distributed via a recording medium such as a magnetic disk (floppy (registered trademark) disk, hard disk, etc.), an optical disc (CD-ROM, DVD, etc.), or a semiconductor memory.

In other words, the translation model learning process and the translation process described above can be realized by generating an execution program (translation program) that causes a computer to execute the processing of each component described above and installing that program on, for example, a general-purpose personal computer or server. The processing performed by the execution program in this embodiment is not limited to this.

As described above, this embodiment achieves highly accurate translation of proper nouns and the like in sign language translation. For example, Japanese words (an example of the source language) are aligned with sign language words (an example of the target language), sentences are divided into phrases based on that alignment, and the correspondence between Japanese characters and sign language words is machine-learned from the divided result. This reduces the number of characters and sign language words in each learning data pair and improves the accuracy with which the correspondences are learned. Using the translation model obtained by this learning method therefore achieves more accurate sign language translation.

Japanese words and sign language words are easy to align, because the number of words in sentences expressing the same meaning is reasonably similar. In this embodiment, when the learning data is divided, a division result may be judged erroneous and removed if the number of words clearly differs between the Japanese and the Japanese Sign Language. For example, here a pair is removed as an error when one side has at least twice as many words as the other and the difference is five or more words, which enables more appropriate translation.
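The removal rule stated above (one side at least twice as long as the other, and a difference of five or more words) can be written as a small predicate. The function names `is_bad_split` and `filter_phrase_pairs` are hypothetical, and the sketch assumes that both conditions must hold together, as the text states.

```python
def is_bad_split(japanese_words, sign_words):
    """Return True if the pair should be removed as an erroneous division."""
    longer = max(len(japanese_words), len(sign_words))
    shorter = min(len(japanese_words), len(sign_words))
    # Both conditions must hold: at least a 2x ratio AND a gap of 5 or more words.
    return longer >= 2 * shorter and longer - shorter >= 5


def filter_phrase_pairs(phrase_pairs):
    """Keep only the phrase pairs whose word counts are plausibly matched."""
    return [pair for pair in phrase_pairs if not is_bad_split(*pair)]


pairs = [
    (["晴れる", "でしょ", "う"], ["{晴れ}", "{夢}"]),  # 3 vs 2 words: kept
    (["w"] * 10, ["s"] * 4),                           # 10 vs 4 words: removed
    (["w"] * 3, ["s"] * 1),                            # 3 vs 1 words: kept (gap below 5)
]
print(len(filter_phrase_pairs(pairs)))
```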

The approach of this embodiment is not limited to sign language: learning the translation model after dividing sentences may improve translation performance more generally. It is effective, for example, when the learning data is not a complete "literal translation corpus".

Conventionally, some programs are broadcast in sign language, but sign language interpreters are difficult to secure, and in a sudden disaster, for example at night, information sometimes cannot be presented in sign language. Applying the technique of this embodiment makes it possible to provide sign language video with improved translation accuracy.

The embodiment described above shows an example of translation from Japanese (source language) into Japanese Sign Language (target language), but the technique is not limited to this. It can also be applied, for example, to translation from another language such as English into Japanese Sign Language or another sign language, and to translation from Japanese Sign Language into Japanese.

Preferred embodiments of the present invention have been described in detail above. However, the disclosed technique is not limited to these specific embodiments, and various modifications and changes are possible within the scope of the gist of the present invention as set forth in the claims. All or some of the components of the examples described above may also be combined.

DESCRIPTION OF SYMBOLS
10 Translation apparatus
11 Translation model learning means
12 Word input means
13 Fixed-translation translating means
14 Input character dividing means
15 Translation means
16 Finger character conversion means
17 Translation result output means
21 Fixed translation dictionary
31 Learning data storage means (sentence pairs)
32 Sentence dividing means
33 Learning data storage means (phrase pairs)
34 Character unit dividing means
35 Replacement model learning means
36 Language model learning means
37 Translation model storage means
41 Word reading input means
42 Word-reading association means

Claims

In a translation device that performs sign language translation of input data,
Input character dividing means for dividing the input data into predetermined characters;
Translation means for translating a character divided by the input character dividing means using a translation model in which a combination of learning data of a translation target language and sign language is learned in a predetermined phrase unit,
A translation apparatus comprising translation result output means for outputting a translation result by the translation means.

A translation model learning means for learning the translation model;
The translation model learning means includes sentence dividing means for dividing the sentence-level learning data of the language to be translated and the sign language, included in preset learning data, into phrase units;
Character unit dividing means for dividing the phrase pair learning data obtained by the sentence dividing means into character units;
The translation apparatus according to claim 1, wherein the translation model is learned using a phrase divided by the character unit dividing unit and a language model corresponding to learning data for each sentence.

Before the input data is translated by the translation means, there is a fixed translation translation means for translating the input data using a preset fixed translation dictionary,
The translation apparatus according to claim 1, wherein the translation unit performs translation using the translation model for a word that cannot be translated by the fixed translation translation unit in the input data.

4. A finger character conversion means for converting the word into a sign language of a finger character when there is a word that could not be translated by the fixed translation means and the translation means in the input data. The translation device described in 1.

A translation program for causing a computer to function as each means of the translation apparatus according to any one of claims 1 to 4.