JP6306899B2

JP6306899B2 - Non-finger movement detection device and program

Info

Publication number: JP6306899B2
Application number: JP2014040707A
Authority: JP
Inventors: 加藤 直人; 宮崎 太郎
Original assignee: Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2014-03-03
Filing date: 2014-03-03
Publication date: 2018-04-04
Anticipated expiration: 2034-03-03
Also published as: JP2015166902A

Description

The present invention relates to a non-finger movement detection device and a program.

Research on automatic translation from Japanese into sign language is under way with the aim of expanding sign language services. Sign language conveys linguistic information not only through finger movements (movements of the hands and fingers) but also through non-finger movements such as facial expressions and head movements. Translation from text to sign language is performed based on a bilingual corpus of Japanese sentences and sign language sentences. In general, sign language sentences mainly transcribe finger movements, and only some of the non-finger movements are written down. This is because words are easy to identify from finger movements but hard to identify from non-finger movements. Non-finger movements serve both a linguistic role and a role like that of prosody in spoken language, and transcribing them while distinguishing between the two is not easy even for humans.

Conventional techniques for detecting non-finger movements use face recognition to automatically recognize facial expressions and detect the corresponding portions as non-finger movements (see, for example, Patent Documents 1 to 3).

Patent Document 1: JP 2014-021896 A
Patent Document 2: JP 2014-021846 A
Patent Document 3: JP 2011-150381 A

Non-Patent Document 1: Japanese Sign Language Research Institute, National Sign Language Training Center (social welfare corporation), ed., "New Japanese-Sign Language Dictionary" (新日本語−手話辞典), Japanese Federation of the Deaf, June 10, 2011.

However, in sign language, facial expressions also play a role like that of prosody in spoken language, so facial expressions can appear for reasons other than non-finger movements. A facial expression therefore does not necessarily carry linguistic meaning, and non-finger movements may not be detected correctly.

Accordingly, an object of the present invention, made in view of the above, is to provide a non-finger movement detection device and a program that detect words that should be expressed by non-finger movement, based on a bilingual corpus of Japanese sentences and sign language sentences.

To solve the problems described above, a non-finger movement detection device according to the present invention comprises: a storage unit that stores a bilingual corpus including a Japanese sentence and a sign language sentence corresponding to the Japanese sentence, the storage unit storing the Japanese sentence divided into Japanese words, each associated with a number indicating its position in the word order, and storing the sign language sentence divided into sign language words corresponding to finger movements, each associated with a number indicating its position in the word order; an alignment unit that associates the Japanese words in the Japanese sentence with the sign language words in the sign language sentence, the association including associating a Japanese word with a sign language word based on a match of a character or character string between the two and on the in-sentence distance between them; and a non-finger movement detection unit that detects, from among the Japanese words not associated with any sign language word, content words expressing lexical meaning as the Japanese words subject to non-finger movement.

Preferably, when a Japanese word composed of multiple kanji is translated into a number of words in another language that is equal to or greater than its number of kanji, the alignment unit divides that Japanese word into its individual kanji and further associates each kanji with a sign language word based on the translation probability between the kanji and the sign language word.

To solve the above problems, a program according to the present invention causes a computer to function as the non-finger movement detection device described above.

According to the non-finger movement detection device and program of the present invention, words that should be expressed by non-finger movement can be detected based on a bilingual corpus of Japanese sentences and sign language sentences.

FIG. 1 is a diagram showing the schematic configuration of a non-finger movement detection device according to an embodiment of the present invention.
FIG. 2 is a flowchart showing the processing of the non-finger movement detection device.

Embodiments of the present invention are described in detail below with reference to the drawings. Sign language has no established notation, so this document uses Japanese labels, a commonly used notation (see Non-Patent Document 1). In Japanese labels, a sign language word is written by enclosing a Japanese word that is close to it in meaning in braces "{}", as in "{雨}" ({rain}). Because Japanese labels are a word-by-word transcription, they have the advantage of being easy to handle in machine translation. An example of a bilingual corpus of a Japanese sentence and a sign language sentence (Japanese labels) is shown below.
[Bilingual Corpus]
Japanese sentence: 大雪や強い雷雨に注意するよう呼びかけています ("We are calling for caution against heavy snow and severe thunderstorms")
Sign language sentence: {とても} {降雪} {雷} {雨} {注意} {宣伝} {中} ({very} {snowfall} {thunder} {rain} {caution} {publicize} {in progress})
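The Japanese label notation above can be read mechanically. The following is a minimal sketch, assuming each sign language word is a non-empty run of characters inside one pair of braces; the function names `parse_sign_sentence` and `number_words` are illustrative and do not come from the patent.

```python
import re

# Assumption: a Japanese label is any non-empty text enclosed in one pair of braces.
LABEL_PATTERN = re.compile(r"\{([^{}]+)\}")


def parse_sign_sentence(sign_sentence: str) -> list[str]:
    """Return the sign language words of a Japanese-label sentence, in order."""
    return LABEL_PATTERN.findall(sign_sentence)


def number_words(words: list[str]) -> list[tuple[int, str]]:
    """Attach a 1-based word-order number to each word."""
    return list(enumerate(words, start=1))


sign_words = parse_sign_sentence("{とても} {降雪} {雷} {雨} {注意} {宣伝} {中}")
numbered = number_words(sign_words)
print(numbered)
```

The 1-based numbers correspond to the word-order numbers that the storage unit keeps alongside each word.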

FIG. 1 is a diagram showing the schematic configuration of a non-finger movement detection device according to an embodiment of the present invention. The non-finger movement detection device comprises a control unit 10 and a storage unit 20. The control unit 10 has a morphological analysis unit 11, a word pre-alignment unit 12, a character pre-alignment unit 13, an in-sentence distance calculation unit 14, a kanji division unit 15, a translation probability estimation unit 16, a translation probability alignment unit 17, and a non-finger movement detection unit 18. The storage unit 20 has a bilingual corpus storage unit 21 and a translation probability storage unit 22. Within the control unit 10, the word pre-alignment unit 12, the character pre-alignment unit 13, the in-sentence distance calculation unit 14, the kanji division unit 15, the translation probability estimation unit 16, and the translation probability alignment unit 17 constitute an alignment unit A. Each of the processing units 11 to 18 of the control unit 10 is implemented by a suitable processor; the units may share a common processor or be implemented as separate processors. The storage unit 20 is a suitable storage device. It need not be built into the non-finger movement detection device and may instead be an external storage device accessed via a communication interface.

The morphological analysis unit 11 performs morphological analysis on the Japanese sentences in the bilingual corpus and divides them into words. When a sign language sentence in the bilingual corpus is not already separated into words, as it is with Japanese labels, the morphological analysis unit 11 can also divide the sign language sentence into words by morphological analysis. Hereinafter, each word in a Japanese sentence is referred to as a Japanese word, and each word in a sign language sentence is referred to as a sign language word.

The alignment unit A associates (aligns) the Japanese words in a Japanese sentence with the sign language words in the corresponding sign language sentence.

The word pre-alignment unit 12 associates Japanese words and sign language words that match on the surface. That is, it associates a Japanese word with a sign language word whose characters (character string) are exactly the same.

The character pre-alignment unit 13 associates the characters that make up a Japanese word with sign language words that match them on the surface. That is, it associates a Japanese word with a sign language word that is identical to one of the characters making up that Japanese word.

The in-sentence distance calculation unit 14 calculates how close the position at which a Japanese word appears in the Japanese sentence is to the position at which a sign language word appears in the sign language sentence (the in-sentence distance). Because Japanese sentences and sign language sentences generally have similar word order, a Japanese word and a sign language word with a small in-sentence distance are more likely to correspond to each other, and a pair with a large in-sentence distance is less likely to correspond. Pairs whose in-sentence distance exceeds a predetermined threshold can therefore be excluded from alignment. The in-sentence distance dist(j_p, s_q) calculated by the in-sentence distance calculation unit 14 is defined by Equation (1), where N_J is the number of Japanese words in the Japanese sentence, j_p is the p-th Japanese word, N_S is the number of sign language words in the sign language sentence, and s_q is the q-th sign language word.
dist(j_p, s_q) = |p/N_J - q/N_S|   (1)
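Equation (1) translates directly into code. The sketch below is illustrative only: the function names are assumptions, and the default threshold of 0.3 is the value used in the worked example later in this description, not a value fixed by the definition.

```python
def in_sentence_distance(p: int, n_j: int, q: int, n_s: int) -> float:
    """Equation (1): |p/N_J - q/N_S|, with 1-based word positions p and q."""
    if n_j <= 0 or n_s <= 0:
        raise ValueError("sentence lengths must be positive")
    return abs(p / n_j - q / n_s)


def is_alignment_candidate(p: int, n_j: int, q: int, n_s: int,
                           threshold: float = 0.3) -> bool:
    """A pair stays a candidate only if its distance does not exceed the threshold."""
    return in_sentence_distance(p, n_j, q, n_s) <= threshold


# 6th of 12 Japanese words against the 5th of 7 sign language words.
print(round(in_sentence_distance(6, 12, 5, 7), 2))
```

Because both positions are normalized by sentence length, the distance always lies between 0 and 1 regardless of how long the two sentences are.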

When a Japanese word composed of multiple kanji is translated into a number of words in another language that is equal to or greater than its number of kanji, the kanji division unit 15 divides that Japanese word into its individual kanji. The other language is, for example, English: when a Japanese word composed of two kanji is translated into two or more English words, the kanji division unit 15 divides the Japanese word into its individual kanji. Various known techniques can be used to translate Japanese words into the other language; for example, the kanji division unit 15 is provided with a translation dictionary from Japanese words into the other language.

The translation probability estimation unit 16 estimates the translation probability between a Japanese word and a sign language word. Various known techniques can be used for this estimation; for example, the translation probability estimation unit 16 can estimate translation probabilities with the EM algorithm. The Japanese words referred to here include the individual kanji produced by the kanji division unit 15 (that is, Japanese words consisting of a single kanji).

The translation probability alignment unit 17 associates Japanese words with sign language words based on the translation probabilities between them. Various known techniques can be used for this association; for example, the translation probability alignment unit 17 can associate words with a greedy algorithm. Because the Japanese words include the individual kanji produced by the kanji division unit 15, the translation probability alignment unit 17 also associates each kanji with a sign language word based on the translation probability between that kanji and the sign language word.

The non-finger movement detection unit 18 detects, from among the Japanese words that are not associated with any sign language word, the Japanese words to be expressed by non-finger movement. Specifically, among the Japanese words not associated with a sign language word, it detects content words that mainly express lexical meaning, such as nouns, verbs, and adjectives, as the Japanese words subject to non-finger movement.

The bilingual corpus storage unit 21 stores a bilingual corpus including Japanese sentences and the sign language sentences corresponding to them. It also stores, as appropriate, the results of the association between Japanese words and sign language words produced by the alignment unit A.

The translation probability storage unit 22 stores the translation probabilities between Japanese words and sign language words estimated by the translation probability estimation unit 16.

FIG. 2 is a flowchart showing the processing of the non-finger movement detection device. The processing of each functional unit is described below using the following Bilingual Corpus 1 as an example.
[Bilingual Corpus 1]
Japanese sentence: 大雪や強い雷雨に注意するよう呼びかけています ("We are calling for caution against heavy snow and severe thunderstorms")
Sign language sentence: {とても} {降雪} {雷} {雨} {注意} {宣伝} {中} ({very} {snowfall} {thunder} {rain} {caution} {publicize} {in progress})

In S1, when a bilingual corpus containing a Japanese sentence and a sign language sentence is input, the morphological analysis unit 11 performs morphological analysis on the Japanese sentence to divide it into words and stores the result in the bilingual corpus storage unit 21. For example, Bilingual Corpus 1 is divided into words by the morphological analysis unit 11 as follows.
[Bilingual Corpus 1 (word division example)]
Japanese sentence: 大雪₁/や₂/強い₃/雷雨₄/に₅/注意₆/する₇/よう₈/呼びかけ₉/て₁₀/い₁₁/ます₁₂
Sign language sentence: {とても}₁ {降雪}₂ {雷}₃ {雨}₄ {注意}₅ {宣伝}₆ {中}₇

In S2, the word pre-alignment unit 12 finds Japanese words and sign language words from S1 that match on the surface. When the in-sentence distance calculated by the in-sentence distance calculation unit 14 is at or below the threshold, it associates the two words and stores the alignment result in the bilingual corpus storage unit 21. For example, among the Japanese words and sign language words of Bilingual Corpus 1 (word division example), the words that match on the surface are "注意₆" ("caution") and {注意}₅.

Here, the in-sentence distance calculation unit 14 calculates the in-sentence distance dist(注意₆, {注意}₅) between "注意₆" and {注意}₅. With the in-sentence distance threshold set to 0.3, dist(注意₆, {注意}₅) does not exceed the threshold, as shown below, so the word pre-alignment unit 12 associates "注意₆" with {注意}₅ and stores the alignment result in the bilingual corpus storage unit 21.
dist(注意₆, {注意}₅) = |6/12 - 5/7| = 0.21 ≤ 0.3
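The S2 step can be sketched as an exact string comparison gated by the in-sentence distance. This is a minimal illustration under stated assumptions: the word lists are the segmentation given in the example, each sign language word is used at most once (a simplification not stated in the text), and `word_prealign` is a hypothetical name.

```python
JA_WORDS = ["大雪", "や", "強い", "雷雨", "に", "注意",
            "する", "よう", "呼びかけ", "て", "い", "ます"]
SIGN_WORDS = ["とても", "降雪", "雷", "雨", "注意", "宣伝", "中"]


def in_sentence_distance(p: int, n_j: int, q: int, n_s: int) -> float:
    """Equation (1) with 1-based positions."""
    return abs(p / n_j - q / n_s)


def word_prealign(ja_words, sign_words, threshold=0.3):
    """Return (p, q) pairs whose surface strings are identical and close in position."""
    pairs = []
    used_signs = set()  # assumption: each sign language word is aligned at most once
    for p, ja in enumerate(ja_words, start=1):
        for q, sign in enumerate(sign_words, start=1):
            if q in used_signs or ja != sign:
                continue
            if in_sentence_distance(p, len(ja_words), q, len(sign_words)) <= threshold:
                pairs.append((p, q))
                used_signs.add(q)
                break
    return pairs


word_pairs = word_prealign(JA_WORDS, SIGN_WORDS)
print(word_pairs)
```

Only 注意₆ and {注意}₅ are identical strings in this example, so the sketch yields a single pair.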

In S3, the character pre-alignment unit 13 takes the Japanese words that have not yet been aligned and looks for cases in which a character making up a Japanese word matches a sign language word on the surface. When the in-sentence distance calculated by the in-sentence distance calculation unit 14 is at or below the threshold, it aligns the pair and stores the alignment result in the bilingual corpus storage unit 21. For example, among the Japanese words of Bilingual Corpus 1 (word division example) that were not aligned in S2, the following characters match sign language words.
雷 ("thunder") of 雷雨₄ = {雷}₃
雨 ("rain") of 雷雨₄ = {雨}₄

Here, the in-sentence distance calculation unit 14 calculates the in-sentence distance dist(雷雨₄, {雷}₃) between "雷雨₄" ("thunderstorm") and {雷}₃, and the in-sentence distance dist(雷雨₄, {雨}₄) between "雷雨₄" and {雨}₄. As shown below, neither distance exceeds the threshold of 0.3, so the character pre-alignment unit 13 associates the "雷" of 雷雨₄ with {雷}₃ and the "雨" of 雷雨₄ with {雨}₄, and stores the alignment result in the bilingual corpus storage unit 21.
dist(雷雨₄, {雷}₃) = |4/12 - 3/7| = 0.10 ≤ 0.3
dist(雷雨₄, {雨}₄) = |4/12 - 4/7| = 0.24 ≤ 0.3
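The S3 step differs from S2 only in that each character of an unaligned Japanese word is compared with whole sign language words. The sketch below assumes the segmentation from the example and that 注意₆ and {注意}₅ are already aligned; `char_prealign` is a hypothetical name.

```python
JA_WORDS = ["大雪", "や", "強い", "雷雨", "に", "注意",
            "する", "よう", "呼びかけ", "て", "い", "ます"]
SIGN_WORDS = ["とても", "降雪", "雷", "雨", "注意", "宣伝", "中"]


def char_prealign(ja_words, sign_words, aligned_ja, aligned_sign, threshold=0.3):
    """Return (p, character, q) where a character of Japanese word p equals sign word q."""
    n_j, n_s = len(ja_words), len(sign_words)
    matches = []
    for p, ja in enumerate(ja_words, start=1):
        if p in aligned_ja:
            continue  # already handled by word pre-alignment
        for ch in ja:
            for q, sign in enumerate(sign_words, start=1):
                if q in aligned_sign or sign != ch:
                    continue
                # The distance uses the position of the whole Japanese word.
                if abs(p / n_j - q / n_s) <= threshold:
                    matches.append((p, ch, q))
    return matches


char_matches = char_prealign(JA_WORDS, SIGN_WORDS, aligned_ja={6}, aligned_sign={5})
print(char_matches)
```

Note that 雪 in 大雪₁ does not match {降雪}₂ here, because the comparison is between a single character and the entire sign language word; that pair is left for the later probability-based steps.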

In S4, the kanji division unit 15 examines the Japanese words that have not yet been aligned. When a Japanese word composed of multiple kanji is translated into a number of words in another language that is equal to or greater than its number of kanji, the unit divides that word into its individual kanji and stores the resulting kanji in the bilingual corpus storage unit 21. For example, among the Japanese words of Bilingual Corpus 1 (word division example) that were not aligned in S2 and S3, the word composed of multiple kanji is "大雪₁". Referring to, for example, a Japanese-English dictionary, the kanji division unit 15 finds that "大雪₁" is translated into the two English words "heavy snow". It therefore divides "大雪₁" into the kanji "大" and "雪" and stores the result in the bilingual corpus storage unit 21.
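The division rule in S4 can be sketched as follows. This is an illustration under several assumptions that are not in the text: kanji are detected by the CJK Unified Ideographs range U+4E00 to U+9FFF, a word qualifies only if every character is a kanji, the number of translated words is counted by splitting on whitespace, and `TOY_DICTIONARY` is a hypothetical two-entry stand-in for a real Japanese-English dictionary.

```python
# Hypothetical dictionary for illustration only.
TOY_DICTIONARY = {"大雪": "heavy snow", "注意": "caution"}


def is_kanji(ch: str) -> bool:
    """Assumption: treat the CJK Unified Ideographs block as kanji."""
    return "\u4e00" <= ch <= "\u9fff"


def divide_if_compound(word: str, dictionary: dict[str, str]) -> list[str]:
    """Split a multi-kanji word into kanji when its translation has at least as many words."""
    if len(word) < 2 or not all(is_kanji(ch) for ch in word):
        return [word]
    translation = dictionary.get(word)
    if translation is None:
        return [word]  # no dictionary entry: leave the word untouched
    if len(translation.split()) >= len(word):
        return list(word)
    return [word]


print(divide_if_compound("大雪", TOY_DICTIONARY))
print(divide_if_compound("注意", TOY_DICTIONARY))
```

The word-count test is a cheap proxy for compositionality: when the translation needs one word per kanji, each kanji probably contributes its own meaning and can be aligned separately.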

In S5, the translation probability estimation unit 16 estimates translation probabilities for pairs of unaligned Japanese words and unaligned sign language words whose in-sentence distance is at or below the threshold, and stores them in the translation probability storage unit 22. In Bilingual Corpus 1 (word division example), the Japanese words and sign language words that remain unaligned after S2 to S4 are as follows.
[Bilingual Corpus 1 (unaligned words)]
Japanese sentence: 大₁/雪₁/や₂/強い₃/に₅/する₇/よう₈/呼びかけ₉/て₁₀/い₁₁/ます₁₂
Sign language sentence: {とても}₁ {降雪}₂ {宣伝}₆ {中}₇

First, the in-sentence distance calculation unit 14 calculates the in-sentence distance between each Japanese word and each sign language word. For example, the in-sentence distances between the Japanese word "大₁" ("large") and each sign language word are calculated as follows. As a result, the sign language words for which the translation probability estimation unit 16 estimates a translation probability with "大₁" are {とても}₁ and {降雪}₂, whose in-sentence distance is at or below the threshold (0.3).
dist(大₁, {とても}₁) = |1/12 - 1/7| = 0.06 ≤ 0.3
dist(大₁, {降雪}₂) = |1/12 - 2/7| = 0.20 ≤ 0.3
dist(大₁, {宣伝}₆) = |1/12 - 6/7| = 0.77 > 0.3
dist(大₁, {中}₇) = |1/12 - 7/7| = 0.92 > 0.3
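The candidate filtering for 大₁ can be checked numerically. The sketch below evaluates Equation (1) with p = 1 and N_J = 12 against the four unaligned sign language words at positions q = 1, 2, 6, 7 with N_S = 7; evaluated this way, the distance to {とても}₁ is 0.06 and the distance to {中}₇ uses 7/7. The variable names are illustrative.

```python
N_J, N_S = 12, 7
THRESHOLD = 0.3

# Unaligned sign language words and their original 1-based positions.
UNALIGNED_SIGNS = {"とても": 1, "降雪": 2, "宣伝": 6, "中": 7}


def dist(p: int, q: int) -> float:
    """Equation (1) for the example sentence pair."""
    return abs(p / N_J - q / N_S)


# 大 keeps position 1, the position of the word 大雪 it was divided from.
raw = {sign: dist(1, q) for sign, q in UNALIGNED_SIGNS.items()}
distances = {sign: round(d, 2) for sign, d in raw.items()}
candidates = [sign for sign, d in raw.items() if d <= THRESHOLD]

print(distances)
print(candidates)
```

The threshold comparison is done on the unrounded values; rounding is only for display.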

For the pairs of Japanese words and sign language words whose in-sentence distance is at or below the threshold, the translation probability estimation unit 16 estimates translation probabilities such as the following, for example with the EM algorithm, and stores the estimated translation probabilities in the translation probability storage unit 22.
大₁ : {とても}₁ (translation probability = 0.54)
雪₁ : {降雪}₂ (translation probability = 0.83)
呼びかけ₉ : {宣伝}₆ (translation probability = 0.65)
い₁₁ : {中}₇ (translation probability = 0.16)
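The text names only "the EM algorithm" for this step. As one possible reading, the sketch below uses an IBM Model 1 style lexical translation model, a common way to apply EM to word alignment; it is an assumption, not necessarily the inventors' method. The four-sentence toy corpus is invented for illustration, the in-sentence distance filter is omitted for brevity, and the probabilities it produces are not the values listed above.

```python
from collections import defaultdict


def train_translation_probs(pairs, iterations=10):
    """EM for t[(sign, ja)] = P(sign word | Japanese word), IBM Model 1 style."""
    sign_vocab = {s for _, signs in pairs for s in signs}
    uniform = 1.0 / len(sign_vocab)
    t = defaultdict(lambda: uniform)  # uniform initialization
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: distribute each sign word over the Japanese words of its sentence.
        for ja_words, signs in pairs:
            for s in signs:
                z = sum(t[(s, j)] for j in ja_words)
                for j in ja_words:
                    c = t[(s, j)] / z
                    count[(s, j)] += c
                    total[j] += c
        # M-step: renormalize expected counts per Japanese word.
        t = defaultdict(float, {(s, j): count[(s, j)] / total[j] for (s, j) in count})
    return t


# Hypothetical toy corpus: (Japanese words or kanji, sign language words).
TOY_PAIRS = [
    (["大", "雪"], ["とても", "降雪"]),
    (["雪"], ["降雪"]),
    (["大", "雨"], ["とても", "雨"]),
    (["雨"], ["雨"]),
]

probs = train_translation_probs(TOY_PAIRS)
print(round(probs[("降雪", "雪")], 2), round(probs[("とても", "大")], 2))
```

The single-word sentences pin 雪 to {降雪} and 雨 to {雨}, which in turn pushes the remaining probability mass for 大 toward {とても}; this is the mechanism by which co-occurrence statistics separate the kanji of a compound.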

In S6, the translation probability alignment unit 17 uses the translation probabilities estimated in S5 to align the words that have not yet been aligned, and stores the alignment result in the bilingual corpus storage unit 21. For example, the translation probability alignment unit 17 performs the following alignment with a greedy algorithm that uses only translation probabilities at or above a certain threshold.
大₁ = {とても}₁
雪₁ = {降雪}₂
強い₃ = (no alignment)
に₅ = (no alignment)
する₇ = (no alignment)
呼びかけ₉ = {宣伝}₆
い₁₁ = {中}₇
ます₁₂ = (no alignment)
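A greedy alignment of this kind can be sketched as follows. The assumptions are labeled in the code: the probability threshold of 0.1 is hypothetical (the text says only "a certain threshold"), each Japanese token and each sign language word is used at most once, and the last candidate in the list is an invented competitor added to show how conflicts are resolved.

```python
def greedy_align(candidates, min_prob):
    """Accept candidate pairs in descending probability; each side is used at most once."""
    alignment = {}
    used_signs = set()
    for ja, sign, prob in sorted(candidates, key=lambda c: c[2], reverse=True):
        if prob < min_prob:
            break  # all remaining candidates are below the threshold
        if ja in alignment or sign in used_signs:
            continue  # one side is already taken by a more probable pair
        alignment[ja] = sign
        used_signs.add(sign)
    return alignment


# Tokens are (surface, word-order number); 大 and 雪 share number 1.
CANDIDATES = [
    (("大", 1), ("とても", 1), 0.54),
    (("雪", 1), ("降雪", 2), 0.83),
    (("呼びかけ", 9), ("宣伝", 6), 0.65),
    (("い", 11), ("中", 7), 0.16),
    (("雪", 1), ("とても", 1), 0.12),  # hypothetical competing candidate
]

result = greedy_align(CANDIDATES, min_prob=0.1)  # 0.1 is an assumed threshold
print(result)
```

Raising `min_prob` above 0.16 would leave い₁₁ unaligned, which shows how sensitive the final set of unaligned words is to this threshold.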

In S7, the non-finger movement detection unit 18 detects, from among the Japanese words that are still not associated with any sign language word after the alignment of S2 to S6, the Japanese words to be expressed by non-finger movement. Specifically, among the unaligned Japanese words, it detects content words that mainly express lexical meaning, such as nouns, verbs, and adjectives, as the Japanese words subject to non-finger movement. In the example above, the Japanese words determined to have no alignment are "強い₃" ("strong"), "に₅", "する₇", and "ます₁₂". Of these, the non-finger movement detection unit 18 detects the content word "強い₃" as the Japanese word subject to non-finger movement.
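The content-word filter can be sketched with part-of-speech tags. Everything specific below is an assumption for illustration: the tags are assigned by hand rather than taken from a morphological analyzer, and する is excluded through a hypothetical light-verb list so that the result matches the example in the text, where only 強い₃ is detected even though する is grammatically a verb.

```python
CONTENT_POS = {"noun", "verb", "adjective"}

# Assumption: verbs with little lexical meaning of their own are not content words.
LIGHT_VERBS = {"する"}


def detect_nonmanual_targets(unaligned):
    """Keep unaligned words whose part of speech marks them as content words."""
    return [
        (surface, index)
        for surface, index, pos in unaligned
        if pos in CONTENT_POS and surface not in LIGHT_VERBS
    ]


# (surface, word-order number, hand-assigned part of speech)
UNALIGNED = [
    ("強い", 3, "adjective"),
    ("に", 5, "particle"),
    ("する", 7, "verb"),
    ("ます", 12, "auxiliary"),
]

targets = detect_nonmanual_targets(UNALIGNED)
print(targets)
```

In practice the treatment of する would depend on the tag set of the morphological analyzer in use, so the exclusion list should be read as a placeholder for that decision.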

In S8, the non-finger movement detection unit 18 outputs the Japanese words subject to non-finger movement detected in S7 to a downstream device such as a sign language display device (not shown). In the case of Bilingual Corpus 1, for example, "強い₃" in the Japanese sentence modifies "雷雨₄", and such dependency information can be obtained from the morphological analysis in S1. By outputting the dependency information together with the Japanese words subject to non-finger movement, the non-finger movement detection unit 18 enables the downstream sign language display device or the like to perform the non-finger movement for "強い" at the timing of the finger movements for "雷雨".
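One way to picture the S8 output is a record that ties each detected word to the sign language words of the word it modifies. The sketch below is purely illustrative: the `NonManualCue` record, the `build_cues` function, and the dictionary-based inputs are assumptions, since the text does not specify an output format.

```python
from dataclasses import dataclass


@dataclass
class NonManualCue:
    word: str                  # Japanese word to express by non-finger movement
    head: str                  # Japanese word it modifies
    sign_positions: list[int]  # word-order numbers of the head's sign language words


def build_cues(targets, heads, alignment):
    """Attach each detected word to the sign positions of the word it modifies."""
    cues = []
    for word in targets:
        head = heads.get(word)
        if head is None:
            continue  # no dependency information for this word
        cues.append(NonManualCue(word, head, sorted(alignment.get(head, []))))
    return cues


# Dependency and alignment results from the worked example.
HEADS = {"強い": "雷雨"}
ALIGNMENT = {"雷雨": [3, 4], "注意": [5]}

cues = build_cues(["強い"], HEADS, ALIGNMENT)
print(cues)
```

With this record, a display device could perform the non-finger movement for 強い while signing {雷}₃ and {雨}₄.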

As described above, according to this embodiment, the alignment unit A associates the Japanese words in a Japanese sentence with the sign language words in the sign language sentence, and the non-finger movement detection unit 18 detects, from among the Japanese words not associated with any sign language word, the Japanese words to be expressed by non-finger movement. This makes it possible to detect words that should be expressed by non-finger movement based on a bilingual corpus of Japanese sentences and sign language sentences.

In addition, when a Japanese word composed of multiple kanji is translated into a number of words in another language that is equal to or greater than its number of kanji, the alignment unit A can divide that Japanese word into its individual kanji and associate each kanji with a sign language word based on the translation probability between them. As a result, even when a Japanese word and a sign language word do not match on the surface, they can be associated with each other if a kanji making up the Japanese word is close in meaning to the sign language word. This improves the accuracy of word correspondences in the bilingual corpus and reduces the number of words left unaligned, so that words that should be expressed by non-finger movement can be detected more accurately.

In addition, among the Japanese words not associated with any sign language word, the non-finger movement detection unit 18 detects content words expressing lexical meaning as the Japanese words subject to non-finger movement. This makes it possible to express content words that mainly carry lexical meaning, such as nouns, verbs, and adjectives, by non-finger movement.

A computer can be used to function as the non-finger movement detection device described above. Such a computer can be realized by storing, in its storage unit, a program describing the processing that implements each function of the non-finger movement detection device, and having its CPU read and execute the program. The program can be recorded on a computer-readable recording medium.

Although the present invention has been described based on the drawings and embodiments, it should be noted that those skilled in the art can easily make various variations and modifications based on the present disclosure. Such variations and modifications are therefore included in the scope of the present invention. For example, the functions included in each functional unit, each step, and so on can be rearranged as long as no logical contradiction arises, and multiple means or steps can be combined into one or divided.

Description of reference signs
10 Control unit
11 Morphological analysis unit
12 Word pre-alignment unit
13 Character pre-alignment unit
14 In-sentence distance calculation unit
15 Kanji division unit
16 Translation probability estimation unit
17 Translation probability alignment unit
18 Non-finger movement detection unit
A Alignment unit
20 Storage unit
21 Bilingual corpus storage unit
22 Translation probability storage unit

Claims

1. A non-finger movement detection device comprising:
a storage unit that stores a bilingual corpus including a Japanese sentence and a sign language sentence corresponding to the Japanese sentence, the storage unit storing the Japanese sentence divided into Japanese words, each associated with a number indicating its position in the word order, and storing the sign language sentence divided into sign language words corresponding to finger movements, each associated with a number indicating its position in the word order;
an alignment unit that associates the Japanese words included in the Japanese sentence with the sign language words included in the sign language sentence, the association including associating a Japanese word with a sign language word based on a match of a character or character string between the Japanese word and the sign language word and on the in-sentence distance between the Japanese word and the sign language word; and
a non-finger movement detection unit that detects, from among the Japanese words not associated with any sign language word, content words expressing lexical meaning as the Japanese words subject to non-finger movement.

2. The non-finger movement detection device according to claim 1, wherein, when a Japanese word composed of multiple kanji is translated into a number of words in another language that is equal to or greater than the number of kanji in the Japanese word, the alignment unit divides the Japanese word composed of multiple kanji into individual kanji and further associates each kanji with a sign language word based on the translation probability between the kanji and the sign language word.

3. A program for causing a computer to function as the non-finger movement detection device according to claim 1 or 2.