JPH02183371A - Automatic interpreting device - Google Patents
Info
- Publication number
- JPH02183371A
- Authority
- JP
- Japan
- Prior art keywords
- speaker
- sound
- emotional information
- emotion
- change
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 230000008451 emotion Effects 0.000 claims abstract description 25
- 230000002996 emotional effect Effects 0.000 claims abstract description 20
- 238000000605 extraction Methods 0.000 claims abstract description 17
- 230000008921 facial expression Effects 0.000 claims description 16
- 230000015572 biosynthetic process Effects 0.000 claims description 9
- 238000003786 synthesis reaction Methods 0.000 claims description 9
- 230000014509 gene expression Effects 0.000 abstract description 6
- 210000004709 eyebrow Anatomy 0.000 abstract description 3
- 239000000284 extract Substances 0.000 description 2
- 230000001815 facial effect Effects 0.000 description 2
- 238000010586 diagram Methods 0.000 description 1
- 230000000694 effects Effects 0.000 description 1
- 238000000034 method Methods 0.000 description 1
- 230000002194 synthesizing effect Effects 0.000 description 1
Landscapes
- Machine Translation (AREA)
Abstract
Description
DETAILED DESCRIPTION OF THE INVENTION

[Field of Industrial Application]

The present invention relates to an automatic interpreting device that interprets a speaker's spoken language into another spoken language.
[Prior Art]

Recently, against the background of advances in speech recognition, speech synthesis, and machine translation technologies, automatic speech translation has become possible at a certain level and is approaching practical use.
In a known automatic interpreting device of this type, a machine translation section performs machine translation based on the recognition result of a speech recognition section that recognizes the speaker's voice, and a speech synthesis section synthesizes the translation result into speech and outputs it.
[Problems to be Solved by the Invention]

However, the conventional automatic interpreting device described above cannot output translated speech that incorporates the speaker's actual feelings and emotions, and because these emotions cannot be conveyed, the effective level of translation accuracy is correspondingly lowered.
An object of the present invention is therefore to make it possible to output translated speech that incorporates the speaker's emotions.
[Means for Solving the Problems]

The technical means of the present invention for solving these problems is an automatic interpreting device comprising: voice recognition means for recognizing a speaker's voice; facial expression recognition means for recognizing the speaker's facial expression; emotion extraction means for extracting emotional information corresponding to changes in the facial expression recognized by the facial expression recognition means; machine translation means for performing machine translation based on the voice recognition result and the emotional information; and speech synthesis means for synthesizing speech based on the translation result of the machine translation means and the emotional information.
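For illustration only (the patent itself contains no code), the claimed arrangement can be sketched as a small pipeline in which the recognized text and the extracted emotional information both feed the translation step, and the translation result and the emotional information both feed the synthesis step. Every name below is a hypothetical placeholder, not part of the disclosure:

```python
# Hypothetical sketch of the claimed data flow; every name here is a
# placeholder invented for illustration, not part of the patent.
def interpret_utterance(audio_frame, video_frame,
                        recognize_speech, recognize_expression,
                        extract_emotion, translate, synthesize):
    """Run one utterance through the five claimed means."""
    text = recognize_speech(audio_frame)            # voice recognition means
    expression = recognize_expression(video_frame)  # facial expression recognition means
    emotion = extract_emotion(expression)           # emotion extraction means
    translated = translate(text, emotion)           # machine translation means uses text + emotion
    return synthesize(translated, emotion)          # speech synthesis means uses translation + emotion


if __name__ == "__main__":
    # Toy stand-ins that only demonstrate the wiring.
    result = interpret_utterance(
        audio_frame=b"...", video_frame=b"...",
        recognize_speech=lambda a: "nice to meet you",
        recognize_expression=lambda v: {"mouth": "laughter"},
        extract_emotion=lambda e: "joy",
        translate=lambda t, emo: f"({emo}) {t} -> translated text",
        synthesize=lambda t, emo: f"<speech emotion='{emo}'>{t}</speech>",
    )
    print(result)
```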
[Embodiment]

An automatic interpreting device according to an embodiment of the present invention will now be described with reference to the accompanying drawing.
As shown in Fig. 1, the automatic interpreting device according to the embodiment comprises a speech recognition device 1 that recognizes the speaker's voice input from a microphone 10 or the like, and facial expression recognition means that recognizes the speaker's facial expression. The facial expression recognition means is constituted by an image recognition device 5 that is connected to a video device 11, which exchanges video between the speaker and the listener, and that receives and recognizes images of the speaker from a camera 11a. The image recognition device 5 is in turn connected to emotion extraction means that extracts emotional information corresponding to the changes in the speaker's facial expression recognized by the device 5. This emotion extraction means comprises a change extraction device 6 that extracts changes in the speaker's expression, for example hand movements, facial movements, and movements of the eyebrows, eyes, and mouth; a knowledge base 8 that stores emotional information predetermined in correspondence with these expression changes; and an emotion extraction device 7 that extracts and outputs the emotional information in the knowledge base 8 corresponding to the observed expression changes. The automatic interpreting device further comprises a machine translation device 2 that performs a predetermined machine translation based on the speech recognition result of the speech recognition device 1 and the emotional information, and a speech synthesis device 3 that synthesizes speech based on the translation result of the machine translation device 2 and the emotional information and outputs it to a loudspeaker. The machine translation device 2 selects the relevant data from a knowledge base 4 that stores phrasings corresponding to speech recognition results and emotional information.
Accordingly, in the automatic interpreting device of this embodiment, the speaker's voice is recognized by the speech recognition device 1 and sent to the machine translation device 2, usually in units of words or phrases. The machine translation device 2 performs syntactic analysis, semantic analysis, context analysis, and so on to carry out the translation. At this stage, common phrasings stored in the knowledge base 4 and phrasings corresponding to the emotional information extracted by the emotion extraction device 7 are selected. The selected phrasing is synthesized into speech by the speech synthesis device 3; here too, synthesis is performed while adjusting the intensity and pitch of the voice to match the emotional information extracted by the emotion extraction device 7.
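Read concretely, and purely as an assumption since the patent specifies no numeric values, the intensity and pitch adjustment could amount to mapping each emotion label to prosody parameters before rendering; the names and scales below are invented for illustration:

```python
# Hypothetical sketch of the emotion-dependent intensity/pitch adjustment
# in the synthesis step; the numeric scales are invented for illustration.
PROSODY = {
    "joy":     {"gain": 1.2, "pitch_scale": 1.15},
    "anger":   {"gain": 1.4, "pitch_scale": 0.95},
    "sadness": {"gain": 0.8, "pitch_scale": 0.90},
    "neutral": {"gain": 1.0, "pitch_scale": 1.00},
}

def synthesize_with_emotion(text, emotion, tts_engine):
    """Render `text` with loudness and pitch matched to the extracted emotion.
    `tts_engine` is a placeholder for any synthesizer exposing a render()
    call that accepts gain and pitch scaling factors."""
    params = PROSODY.get(emotion, PROSODY["neutral"])
    return tts_engine.render(text, gain=params["gain"],
                             pitch_scale=params["pitch_scale"])
```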
More specifically, the process by which emotional information is extracted in the emotion extraction device 7 is as follows. The speaker's facial expression is recognized by the image recognition device 5, via the camera 11a, on an appropriate time scale. From the recognized images, the change extraction device 6 reads individual changes in major targets, for example hand movements, facial movements (nodding, a negative shake of the head, etc.), eyebrow movements, changes in the eyes (tears), and changes in the mouth (laughter). The individual targets thus read are judged comprehensively in the emotion extraction device 7, and predetermined emotional information is selected. This emotional information is stored in the knowledge base 8 as conditional knowledge for each target.
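A minimal sketch, assuming the conditional knowledge in knowledge base 8 takes the form of simple if-then rules over the observed targets; the rules and emotion labels below are invented for illustration:

```python
# Hypothetical sketch of knowledge base 8: conditional rules mapping the
# individual change targets to an emotion label. Rules are invented examples.
RULES = [
    (lambda t: t.get("mouth") == "laughter",  "joy"),
    (lambda t: t.get("eyes") == "tears",      "sadness"),
    (lambda t: t.get("head") == "shake",      "denial"),
    (lambda t: t.get("head") == "nod",        "agreement"),
    (lambda t: t.get("eyebrows") == "raised", "surprise"),
]

def extract_emotion(targets: dict) -> str:
    """Judge the observed targets as a whole: return the first emotion whose
    condition matches, or 'neutral' when no rule fires."""
    for condition, emotion in RULES:
        if condition(targets):
            return emotion
    return "neutral"

print(extract_emotion({"mouth": "laughter", "head": "nod"}))  # -> joy
```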
As a result, the translation is rendered in phrasing that fits the speaker's emotions, and the translated speech is output with intonation and pitch that match those emotions.
[Effects of the Invention]

As described above, the automatic interpreting device of the present invention extracts emotional information from images of the speaker and performs translation and speech synthesis corresponding to this emotional information. The speaker's emotions can therefore be expressed, and translation accuracy improves to the extent that these emotions can be conveyed.
Fig. 1 is a block diagram showing the configuration of an automatic interpreting device according to an embodiment of the present invention.
1: speech recognition device
2: machine translation device
3: speech synthesis device
4: knowledge base for machine translation
5: image recognition device
6: change extraction device
7: emotion extraction device
8: knowledge base for emotion extraction
Claims (1)

An automatic interpreting device comprising: voice recognition means for recognizing a speaker's voice; facial expression recognition means for recognizing the speaker's facial expression; emotion extraction means for extracting emotional information corresponding to changes in the facial expression recognized by the facial expression recognition means; machine translation means for performing machine translation based on the voice recognition result and the emotional information; and speech synthesis means for synthesizing speech based on the translation result of the machine translation means and the emotional information.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP1003581A JPH02183371A (en) | 1989-01-10 | 1989-01-10 | Automatic interpreting device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP1003581A JPH02183371A (en) | 1989-01-10 | 1989-01-10 | Automatic interpreting device |
Publications (1)
Publication Number | Publication Date |
---|---|
JPH02183371A (en) | 1990-07-17 |
Family
ID=11561421
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP1003581A Pending JPH02183371A (en) | 1989-01-10 | 1989-01-10 | Automatic interpreting device |
Country Status (1)
Country | Link |
---|---|
JP (1) | JPH02183371A (en) |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5887069A (en) * | 1992-03-10 | 1999-03-23 | Hitachi, Ltd. | Sign recognition apparatus and method and sign translation system using same |
EP0585098A3 (en) * | 1992-08-24 | 1995-01-11 | Hitachi Ltd | Sign recognition apparatus and method and sign translation system using same. |
EP0585098A2 (en) * | 1992-08-24 | 1994-03-02 | Hitachi, Ltd. | Sign recognition apparatus and method and sign translation system using same |
US5659764A (en) * | 1993-02-25 | 1997-08-19 | Hitachi, Ltd. | Sign language generation apparatus and sign language translation apparatus |
US5953693A (en) * | 1993-02-25 | 1999-09-14 | Hitachi, Ltd. | Sign language generation apparatus and sign language translation apparatus |
US7962345B2 (en) | 2001-04-11 | 2011-06-14 | International Business Machines Corporation | Speech-to-speech generation system and method |
US8238566B2 (en) * | 2004-03-15 | 2012-08-07 | Samsung Electronics Co., Ltd. | Apparatus for providing sound effects according to an image and method thereof |
US20050201565A1 (en) * | 2004-03-15 | 2005-09-15 | Samsung Electronics Co., Ltd. | Apparatus for providing sound effects according to an image and method thereof |
JP2007148039A (en) * | 2005-11-28 | 2007-06-14 | Matsushita Electric Ind Co Ltd | Speech translation device and speech translation method |
JP2008021058A (en) * | 2006-07-12 | 2008-01-31 | Nec Corp | Portable telephone apparatus with translation function, method for translating voice data, voice data translation program, and program recording medium |
JP6290479B1 (en) * | 2017-03-02 | 2018-03-07 | 株式会社リクルートライフスタイル | Speech translation device, speech translation method, and speech translation program |
JP2020134719A (en) * | 2019-02-20 | 2020-08-31 | ソフトバンク株式会社 | Translation device, translation method, and translation program |
CN109949794A (en) * | 2019-03-14 | 2019-06-28 | 合肥科塑信息科技有限公司 | A kind of intelligent sound converting system based on Internet technology |
CN112102831A (en) * | 2020-09-15 | 2020-12-18 | 海南大学 | Cross-data, information and knowledge modal content encoding and decoding method and component |
Similar Documents
Publication | Title |
---|---|
CN113454708A (en) | Linguistic style matching agent | |
US8224652B2 (en) | Speech and text driven HMM-based body animation synthesis | |
US8131551B1 (en) | System and method of providing conversational visual prosody for talking heads | |
CN112650831A (en) | Virtual image generation method and device, storage medium and electronic equipment | |
KR102098734B1 (en) | Method, apparatus and terminal for providing sign language video reflecting appearance of conversation partner | |
WO2021196645A1 (en) | Method, apparatus and device for driving interactive object, and storage medium | |
CN107972028A (en) | Man-machine interaction method, device and electronic equipment | |
US20230082830A1 (en) | Method and apparatus for driving digital human, and electronic device | |
KR20190114150A (en) | Method and apparatus for translating speech of video and providing lip-synchronization for translated speech in video | |
KR102174922B1 (en) | Interactive sign language-voice translation apparatus and voice-sign language translation apparatus reflecting user emotion and intention | |
WO2017195775A1 (en) | Sign language conversation assistance system | |
KR20200090355A (en) | Multi-Channel-Network broadcasting System with translating speech on moving picture and Method thererof | |
US20240022772A1 (en) | Video processing method and apparatus, medium, and program product | |
JPH02183371A (en) | Automatic interpreting device | |
CN113689879A (en) | Method, device, electronic equipment and medium for driving virtual human in real time | |
CN110162598A (en) | A kind of data processing method and device, a kind of device for data processing | |
WO2024088321A1 (en) | Virtual image face driving method and apparatus, electronic device and medium | |
WO2022072752A1 (en) | Voice user interface using non-linguistic input | |
US20240221753A1 (en) | System and method for using gestures and expressions for controlling speech applications | |
Hrúz et al. | Automatic fingersign-to-speech translation system | |
CN113112575A (en) | Mouth shape generation method and device, computer equipment and storage medium | |
CN116129852A (en) | Training method of speech synthesis model, speech synthesis method and related equipment | |
CN117275485B (en) | Audio and video generation method, device, equipment and storage medium | |
CN117351929A (en) | Translation method, translation device, electronic equipment and storage medium | |
Eguchi et al. | Development of Mobile Device-Based Speech Enhancement System Using Lip-Reading |