JP2004272363A

JP2004272363A - Voice input/output device

Info

Publication number: JP2004272363A
Application number: JP2003058722A
Authority: JP
Inventors: Hiroki Yamamoto; 寛樹山本
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2003-03-05
Filing date: 2003-03-05
Publication date: 2004-09-30

Abstract

PROBLEM TO BE SOLVED: To provide a multilingual voice input/output device that lets a user easily recognize the language actually in use and the languages that can be used.
SOLUTION: This voice input/output device 100 includes a central processing unit (CPU) 102, an input device 105, a display device 106, a voice output device (speaker) 107, and a voice input device (microphone) 108. The display device 106 represents each language by the national flag mark of a country where the language is mainly spoken. It displays the flag mark corresponding to the language that can be handled at that time (in this case, Japanese) together with the flag marks corresponding to the languages that can be used by changing the setting (English, French, German). Furthermore, the flag mark of the language that can be handled at that time (Japanese) and those of the languages that can be used by changing the setting are displayed in different ways.
COPYRIGHT: (C)2004, JPO&NCIPI

Description

[0001]
[Technical field of the invention]
The present invention relates to a voice input/output device, and more particularly to a voice input/output device that displays the languages in which voice can be input and the languages in which voice can be output.
[0002]
[Prior art]
Conventionally, speech recognition devices and speech synthesis devices supported only a single language, such as the user's native language, and users used devices that supported their native language. There was therefore no particular need to display, during operation, the languages that a speech recognition device or speech synthesis device could handle. In recent years, multilingual speech recognition devices and multilingual speech synthesis devices that recognize and output speech in a plurality of languages have been developed, and voice input/output in a plurality of languages is becoming possible with a single device. Such a device needs to show the user which languages are available. For example, for a device that provides sightseeing guidance in English or Japanese according to the selected language, the user must first be shown which language the guidance is currently given in and which languages it can be switched to.
[0003]
Conventionally, as an example of notifying the user of the languages that such a multilingual device can handle, a method for language selection in a bank cash dispenser (CD) terminal that displays messages in a plurality of languages has been disclosed (for example, Patent Document 1). In this method, flag marks symbolizing the languages that the CD terminal supports are displayed on a touch panel, and when the user touches the flag mark corresponding to a desired language, the language of the messages on the display screen of the CD terminal is switched to that language. Languages among those displayed that are not in use are either made unselectable or removed from the display screen, so that the CD terminal display screen shows the flag marks corresponding to the languages that can be switched to.
[0004]
A similar example is found on web pages on the Internet. On a page that publishes the same content translated into a plurality of languages, a flag icon or a language name icon is displayed for each language in which the content exists, as in the CD terminal example. The displayed icon is often a link to the content written in the corresponding language; as with the CD terminal, clicking a flag icon or language name icon switches to the page of the content written in that language.
[0005]
In the CD terminal and web page examples described here, the user can determine the set language from the displayed content. Even if the user cannot tell which language is displayed, the user can at least determine that it is not the desired language. In other words, the user can judge from the display screen whether to switch languages before use. Therefore, there was no need to explicitly indicate which language was set at that time.
[0006]
[Patent Document 1]
JP-A-6-161692
[0007]
[Problems to be solved by the invention]
However, since a voice input/output device can perform both input and output by voice alone, text display may not be particularly necessary. When such a device supports multiple languages and, as in the conventional examples, only the selectable languages are displayed, the user cannot determine in which language, other than those displayed, to speak, or in which language, other than those displayed, the voice will be output. In other words, a multilingual voice input/output device needs to explicitly indicate the languages in which input is possible at that time.
[0008]
Further, among speech recognition devices that handle multiple languages, devices that can recognize a plurality of languages simultaneously have also been developed. For example, a device that supports English and Japanese can perform speech recognition whether the user speaks in English or in Japanese. If such a device can additionally support other languages, for example French and Chinese, through a language setting, it must show the user, by some means, the languages in use at that time and the languages that become available by changing the setting.
[0009]
An object of the present invention is to provide a multilingual voice input/output device that allows a user to easily recognize the language currently in use and the languages available.
[0010]
[Means for solving the problems]
In order to achieve the above object, the voice input/output device according to claim 1 is a voice input/output device capable of voice input/output in a plurality of languages, characterized by comprising display means for displaying the languages in which voice can be input and the languages in which voice can be output.
[0011]
[Best mode for carrying out the invention]
Hereinafter, a voice input/output device according to an embodiment of the present invention will be described in detail with reference to the drawings.
[0012]
FIG. 1 is a block diagram showing a schematic configuration of a voice input/output device according to an embodiment of the present invention.
[0013]
Although the voice input/output device according to the present embodiment is assumed to be implemented as a voice dialogue system, the present invention is not limited to this, and may be a stand-alone voice input device, a stand-alone speech synthesis device, or another type of device.
[0014]
In FIG. 1, a voice input/output device 100 includes a control memory (ROM) 101, a central processing unit (CPU) 102, a memory (RAM) 103, an external storage device 104, an input device 105, a display device 106, a voice output device (speaker) 107, and a voice input device (microphone) 108, which are connected to one another via a bus 109.
[0015]
A control program for implementing the voice input/output device 100 and the data used by the control program are stored in the external storage device 104. Under the control of the central processing unit 102, the control program and data are loaded into the memory 103 via the bus 109 as needed and executed by the central processing unit 102. The control program and data may instead be stored in the control memory 101.
[0016]
FIG. 2 is a block diagram showing a module configuration of the voice input/output device of FIG. 1.
[0017]
The voice input/output device 100 of FIG. 1 includes a language setting unit 201, a set language display unit 202, and a voice dialogue control unit 203. The language setting unit 201 sets the language of the voice input/output device and changes the set language. The set language display unit 202 displays the language set by the language setting unit 201 to the user. The voice dialogue control unit 203 controls speech recognition and speech synthesis, and controls the dialogue with the user using voice input/output.
[0018]
Further, when changing the set language, the language setting unit 201 switches the data required for speech recognition and speech synthesis, for example the acoustic model for speech recognition, the speech recognition grammar, and the waveform dictionary for speech synthesis, to the data for the newly set language.
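The data switch performed by the language setting unit 201 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation of the embodiment: the class name `LanguageSettingUnit`, the `LanguageData` record, the language codes, and the file names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LanguageData:
    """Per-language resources named in the description (file names are made up)."""
    acoustic_model: str       # acoustic model for speech recognition
    recognition_grammar: str  # speech recognition grammar
    waveform_dictionary: str  # waveform dictionary for speech synthesis


# Hypothetical resource table for the four languages of the embodiment.
RESOURCES = {
    lang: LanguageData(f"{lang}.am", f"{lang}.grammar", f"{lang}.wav.dict")
    for lang in ("ja", "en", "fr", "de")
}


class LanguageSettingUnit:
    """Illustrative stand-in for the language setting unit 201."""

    def __init__(self, initial: str = "ja") -> None:
        self.language = initial
        self.data = RESOURCES[initial]

    def change_language(self, language: str) -> LanguageData:
        # Switch every recognition/synthesis resource to the newly set language.
        if language not in RESOURCES:
            raise ValueError(f"unsupported language: {language}")
        self.language = language
        self.data = RESOURCES[language]
        return self.data


unit = LanguageSettingUnit()   # starts in Japanese, as in the embodiment
unit.change_language("en")
print(unit.data.acoustic_model)
```

Keeping the three resources in one record means a language change replaces them together, so recognition and synthesis cannot end up configured for different languages by accident.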
[0019]
The voice input/output device of FIG. 1 supports voice input/output in English, French, and German in addition to Japanese, and performs voice input/output in Japanese at startup.
[0020]
FIG. 3 is a flowchart showing the set language display process executed by the voice input/output device of FIG. 1.
[0021]
In FIG. 3, when the device is started, initialization is first performed (step S301). At startup, this device supports voice input/output in Japanese only, so here the voice dialogue control unit 203 sets the data required for Japanese speech recognition and speech synthesis.
[0022]
Next, as shown in FIG. 4, the display device 106 displays the set language, that is, the language that can be handled at that time (step S302). In addition to the language that can be handled at that time, the languages that become usable by changing the setting are displayed together. The language that can be handled at that time and the languages that can be handled after a setting change are displayed in different ways.
[0023]
In the example of FIG. 4, the display device 106 represents each language by the flag mark of a country where the language is mainly spoken. Together with the flag mark corresponding to the language that can be handled at that time (Japanese in this example), it displays the flag marks corresponding to the languages that can be used by changing the setting (English, French, German). Furthermore, the flag mark of the language that can be handled at that time (Japanese) is displayed differently from those of the languages that can be used by changing the setting. In the example of FIG. 4, the Japanese flag, corresponding to Japanese, which can be handled at that time, is displayed larger than the flags of the other languages and is further emphasized with a thick outline. As another display method, as shown in FIG. 5, a symbol indicating unavailability may be added to the flags of the languages that cannot be used unless the setting is changed, to distinguish them from the language that can be handled at that time.
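The two display methods of FIG. 4 and FIG. 5 can be approximated in plain text. The following Python sketch is illustrative only: the function `render_languages`, the language codes, and the text stand-ins for flags (brackets and upper case for the enlarged, outlined flag; a trailing `*` for the added symbol) are assumptions, not part of the embodiment.

```python
SUPPORTED = ["ja", "en", "fr", "de"]  # languages of the embodiment


def render_languages(active, supported=SUPPORTED, mark_unavailable=False):
    """Return one text line showing every supported language.

    Languages usable right now are emphasized (upper case in brackets, a
    stand-in for the larger, thick-outlined flag of FIG. 4). With
    mark_unavailable=True, the other languages get a trailing '*' instead,
    a stand-in for the added symbol of FIG. 5.
    """
    cells = []
    for lang in supported:
        if lang in active:
            cells.append(lang if mark_unavailable else f"[{lang.upper()}]")
        else:
            cells.append(f"{lang}*" if mark_unavailable else lang)
    return " ".join(cells)


print(render_languages({"ja"}))                         # [JA] en fr de
print(render_languages({"ja"}, mark_unavailable=True))  # ja en* fr* de*
```

Either way, every supported language stays visible, and only the styling tells the user which ones need a setting change first.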
[0024]
In the following step S303, it is determined whether the user changes the set language. If the language is not changed, a voice dialogue is carried out using the Japanese speech recognition and Japanese speech synthesis set in the voice dialogue control unit 203 (step S304). The processing from step S303 onward is then repeated until operation of the voice input/output device ends (NO in step S305); when operation ends, this process terminates.
[0025]
If the determination in step S303 is that the user changes the set language, the process proceeds as follows. Since the startup language of this voice input/output device is set to Japanese only, suppose the user changes the set language to, for example, English in order to carry out a voice dialogue in English. In step S306, the language setting unit 201 changes the set language to English, and the voice dialogue control unit 203 sets the data required for English voice input/output. The process then returns to step S302, which displays that English can now be handled. If the device can handle only one language at a time, only the United States flag, representing English, is emphasized, as shown in FIG. 6. If the device can handle a plurality of languages simultaneously, the flags of Japan and the United States are both emphasized, as shown in FIG. 7. In the latter case, the voice input/output device accepts voice input in Japanese and English, responding to Japanese voice input with Japanese voice output and to English voice input with English voice output. To change the set language, a dedicated change button may be provided, for example, or the display device 106 may be a liquid crystal touch panel on which the user switches to a desired language by touching its displayed flag.
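The loop through steps S301 to S306 can be traced with a small simulation. This Python sketch rests on assumptions that are not in the description: user actions are modeled as a list of events, and in the multi-language case a change request adds the requested language to the active set, which is one way to arrive at the FIG. 7 state. The function name `run_display_flow` is hypothetical.

```python
def run_display_flow(events, multi_language=False):
    """Replay the flow of FIG. 3 over a list of user events.

    Each event is either a language code (the user requests a change) or
    None (the user just continues the dialogue). Returns the sequence of
    active-language lists shown at each pass through step S302.
    """
    active = {"ja"}                 # S301: initial setting is Japanese only
    shown = [sorted(active)]        # S302: display the languages in use
    for event in events:            # loop until the device stops (S305)
        if event is None:           # S303: no change requested
            continue                # S304: dialogue in the current language(s)
        if multi_language:          # S306: change the set language
            active = active | {event}
        else:
            active = {event}
        shown.append(sorted(active))  # back to S302: refresh the display
    return shown


print(run_display_flow([None, "en"]))                       # [['ja'], ['en']]
print(run_display_flow([None, "en"], multi_language=True))  # [['ja'], ['en', 'ja']]
```

The first call corresponds to the single-language device of FIG. 6, the second to the simultaneous Japanese and English case of FIG. 7.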
[0026]
In the voice input/output device according to the embodiment of the present invention, each language is displayed using the flag mark of a country where the language is mainly spoken, but the present invention is not limited to this, and a language may instead be displayed using a corresponding abbreviation or symbol. FIG. 8 shows an example using abbreviations. In FIG. 8, Japanese, English, French, and German are represented by the abbreviations JPN, ENG, FRA, and GER, respectively. To indicate that Japanese is available, JPN is shown in bold and further emphasized with an underline.
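The abbreviation display of FIG. 8 can be imitated on a text terminal. The sketch below is illustrative: it assumes a terminal that honors ANSI escape sequences for bold and underline, and the function `render_abbreviations` and the language codes used as keys are hypothetical. Only the four abbreviations come from the description.

```python
ABBREVIATIONS = {"ja": "JPN", "en": "ENG", "fr": "FRA", "de": "GER"}

BOLD_UNDERLINE = "\033[1;4m"  # ANSI SGR: bold (1) and underline (4)
RESET = "\033[0m"             # ANSI SGR: reset all attributes


def render_abbreviations(active):
    """Show all abbreviations, emphasizing the languages usable right now."""
    cells = []
    for lang, abbr in ABBREVIATIONS.items():
        cells.append(f"{BOLD_UNDERLINE}{abbr}{RESET}" if lang in active else abbr)
    return " ".join(cells)


print(render_abbreviations({"ja"}))
```

On a display without such attributes, any other consistent emphasis would serve the same purpose of separating the language in use from the rest.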
[0027]
The case where the voice input language and the voice output language are the same has been described for the voice input/output device according to the embodiment of the present invention, but the present invention is not limited to this. Ideally, the languages of voice input and voice output are the same, but in some voice input/output devices, for example, voice input supports Japanese, English, French, and German while voice output supports only Japanese and English. In such a case, the languages supported for voice input and those supported for voice output need to be displayed separately. FIG. 9 shows an example. In FIG. 9, the languages available for voice input are displayed after a microphone symbol, and the languages available for voice output are displayed after a speaker symbol. The figure shows that voice input is possible in all four languages, that voice output supports only Japanese and English, and that the current voice output language is English.
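The separate display of FIG. 9 amounts to keeping two language lists and marking the current output language. A minimal Python sketch, assuming text labels `MIC` and `SPK` in place of the microphone and speaker symbols and brackets in place of visual emphasis; the function `render_io_status` and the language codes are hypothetical.

```python
ABBREVIATIONS = {"ja": "JPN", "en": "ENG", "fr": "FRA", "de": "GER"}


def render_io_status(inputs, outputs, current_output):
    """Return two lines: input languages after 'MIC', output after 'SPK'.

    The output language currently in use is wrapped in brackets.
    """
    if current_output not in outputs:
        raise ValueError("current output language must be an output language")
    mic = " ".join(ABBREVIATIONS[lang] for lang in inputs)
    spk = " ".join(
        f"[{ABBREVIATIONS[lang]}]" if lang == current_output else ABBREVIATIONS[lang]
        for lang in outputs
    )
    return f"MIC: {mic}\nSPK: {spk}"


print(render_io_status(["ja", "en", "fr", "de"], ["ja", "en"], "en"))
# MIC: JPN ENG FRA GER
# SPK: JPN [ENG]
```

The call reproduces the state described for FIG. 9: four input languages, two output languages, and English as the current output.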
[0028]
The present invention can be achieved by supplying a computer or CPU with a software program (the flowchart of FIG. 3) that realizes the functions of the above-described embodiment, and having the computer or CPU read and execute the supplied program.
[0029]
In this case, the program is supplied directly from a storage medium on which the program is recorded, or by downloading it from another computer, database, or the like (not shown) connected to the Internet, a commercial network, a local area network, or the like.
[0030]
The form of the program may be in the form of object code, program code executed by an interpreter, script data supplied to an OS (Operating System), or the like.
[0031]
The present invention can also be achieved by supplying a computer or CPU with a storage medium storing a software program that realizes the functions of the above-described embodiment, and having the computer or CPU read and execute the program stored in the storage medium.
[0032]
In this case, the program code itself read from the storage medium realizes the functions of the above-described embodiments, and the storage medium storing the program code constitutes the present invention.
[0033]
Examples of storage media for storing the program code include ROM, RAM, NV-RAM, floppy (registered trademark) disk, hard disk, optical disk (registered trademark), magneto-optical disk, CD-ROM, MO, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, DVD+RW, magnetic tape, and nonvolatile memory card.
[0034]
The functions of the above-described embodiment can be realized not only by executing the program code read by the computer, but also by an OS or the like running on the computer performing part or all of the actual processing based on the instructions of the program code.
[0035]
[Embodiment 1] A voice input/output device capable of voice input/output in a plurality of languages, comprising display means for displaying the languages in which voice can be input and the languages in which voice can be output.
[0036]
[Embodiment 2] The voice input/output device according to claim 1, wherein the display means displays the languages in which voice can be input and the languages in which voice can be output separately.
[0037]
[Embodiment 3] The voice input/output device according to claim 1 or 2, wherein the display means distinguishably displays the languages that can be handled at that time and the languages that can be handled by changing a setting.
[0038]
[Embodiment 4] The voice input/output device according to any one of claims 1 to 3, wherein the display means displays each language using the national flag mark of a country where the language is mainly used.
[0039]
[Embodiment 5] The voice input/output device according to any one of claims 1 to 3, wherein the display means displays each language using an abbreviation of a country where the language is mainly used.
[0040]
[Embodiment 6] A voice input/output method capable of voice input/output in a plurality of languages, comprising a display step of displaying the languages in which voice can be input and the languages in which voice can be output.
[0041]
[Embodiment 7] The voice input/output method according to claim 6, wherein the display step displays the languages in which voice can be input and the languages in which voice can be output separately.
[0042]
[Embodiment 8] The voice input/output method according to claim 6 or 7, wherein the display step distinguishably displays the languages that can be handled at that time and the languages that can be handled by changing a setting.
[0043]
[Embodiment 9] The voice input/output method according to any one of claims 6 to 8, wherein the display step displays each language using the national flag mark of a country where the language is mainly used.
[0044]
[Embodiment 10] The voice input/output method according to any one of claims 6 to 8, wherein the display step displays each language using an abbreviation of a country where the language is mainly used.
[0045]
[Embodiment 11] A voice input/output program for causing a computer to execute a voice input/output method capable of voice input/output in a plurality of languages, comprising a display module for displaying the languages in which voice can be input and the languages in which voice can be output.
[0046]
[Embodiment 12] A computer-readable storage medium storing the voice input/output program according to claim 11.
[0047]
[Effect of the invention]
As described in detail above, the voice input/output device according to claim 1 displays the languages in which voice can be input and the languages in which voice can be output, so that the user can easily recognize the language currently in use and the languages available.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a schematic configuration of a voice input/output device according to an embodiment of the present invention.
FIG. 2 is a block diagram showing a module configuration of the voice input/output device of FIG. 1.
FIG. 3 is a flowchart showing the set language display process executed by the voice input/output device of FIG. 1.
FIG. 4 is a diagram illustrating a display example of the usable languages on the display device 106 in FIG. 1.
FIG. 5 is a diagram illustrating a display example of the display device 106 in FIG. 1 when Japanese is available.
FIG. 6 is a diagram illustrating a display example of the display device 106 in FIG. 1 when English is available.
FIG. 7 is a diagram illustrating a display example of the display device 106 in FIG. 1 when English and Japanese are available.
FIG. 8 is a diagram illustrating a display example of the display device 106 in FIG. 1 when languages are displayed using abbreviations in the voice input/output device according to the embodiment of the present invention.
FIG. 9 is a diagram illustrating a display example of the display device 106 in FIG. 1 when the languages supported for voice input and the languages supported for voice output are displayed separately.
[Explanation of symbols]
100 voice input/output device
101 control memory (ROM)
102 central processing unit (CPU)
103 memory (RAM)
104 external storage device
105 input device
106 display device
107 voice output device (speaker)
108 voice input device (microphone)
201 language setting unit
202 set language display unit
203 voice dialogue control unit

Claims

A voice input/output device capable of voice input/output in a plurality of languages, comprising display means for displaying the languages in which voice can be input and the languages in which voice can be output.