JP2014062944A

JP2014062944A - Information processing devices

Info

Publication number: JP2014062944A
Application number: JP2012206597A
Authority: JP
Inventors: Hirofumi Shimada; 裕文嶋田
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2012-09-20
Filing date: 2012-09-20
Publication date: 2014-04-10

Abstract

PROBLEM TO BE SOLVED: To correctly and quickly perform voice recognition of a user's voice command by connecting an information processing device such as a car navigation device and an external device such as a mobile phone.

SOLUTION: Provided are first and second information processing devices. The first information processing device comprises: an input unit which receives input of a voice command; a transmission unit which transmits the voice command to the second information processing device; a first voice recognition unit which performs voice recognition at a low level at the same time as the transmission of the voice command; an output unit; a determination unit; and a command execution unit. The second information processing device comprises: a second voice recognition unit which performs voice recognition at a high level; and a transmission unit which transmits a voice recognition result to the first information processing device. When the determination unit of the first information processing device determines that the reliability of the voice recognition performed by the first voice recognition unit is high, the first information processing device executes the command recognized by the first voice recognition unit; when the determination unit determines that this reliability is low, the first information processing device executes the command recognized by the second voice recognition unit.

Description

The present invention relates to an information processing apparatus, and more particularly to an information processing apparatus having a voice recognition function, such as a car navigation device.

Among information processing apparatuses such as the car navigation device mentioned above, devices are known that recognize a voice command uttered by a user and perform the corresponding processing. For mobile phones, a so-called concierge service is also known, in which voice recognition and sentence analysis are performed on a server over a communication line.

Known related technologies include a technology for linking a mobile terminal with an in-vehicle terminal inside a vehicle (see Patent Document 1), a technology that allows mobile phone applications to be used from an in-vehicle device (see Patent Document 2), and a technology relating to a voice recognition device that achieves sufficient recognition even on equipment whose voice recognition resources are limited (see Patent Document 3).

Patent Document 1: JP 2012-27070 A
Patent Document 2: JP 2010-130223 A
Patent Document 3: JP 2005-266192 A

However, a voice recognition device built into a car navigation device or the like has limited recognition capability. Although it provides the minimum necessary functions, it often fails to recognize a voice command when the inflection at the end of the sentence changes or when the utterance is phrased differently. The concierge service used with mobile phones, on the other hand, uses a communication line and is capable of highly accurate voice recognition and sentence analysis, but its voice recognition takes time.

Consequently, if the voice recognition of a car navigation device or the like is delegated to a mobile phone concierge service, recognition takes longer than with the voice recognition device built into the car navigation device, and the device cannot respond quickly.

The present invention has been made in view of these circumstances. Its object is to provide an information processing apparatus in which an information processing device such as a car navigation device and an external device such as a mobile phone are connected and used together, voice recognition on the external device is executed at the same time as voice recognition on the information processing device, and the recognition result of the external device is used when the recognition reliability on the information processing device side is low.

The present invention provides an information processing apparatus comprising a first information processing device having a voice recognition function of a first level and a second information processing device having a voice recognition function of a second level higher than that of the first information processing device. The first information processing device includes an input unit through which a user's voice command can be input, a transmission unit that transmits the input voice command to the second information processing device, a first voice recognition unit that performs voice recognition of the input voice command at the first level simultaneously with the transmission by the transmission unit, and an output unit that outputs the result of the voice recognition by the first voice recognition unit. The second information processing device includes a second voice recognition unit that performs voice recognition of the voice command received from the first information processing device at the second level, and a transmission unit that transmits the voice recognition result of the second voice recognition unit to the first information processing device. The first information processing device further includes a receiving unit that receives the voice recognition result of the second voice recognition unit, a determination unit that determines the reliability of the voice recognition by the first voice recognition unit, and a command execution unit that executes the command recognized by the first voice recognition unit when the determination unit determines that the reliability is high, and executes the command recognized by the second voice recognition unit when the determination unit determines that the reliability is low.

With this configuration, the first and second information processing devices execute voice recognition of the voice command simultaneously. If the reliability of the voice recognition result in the first information processing device is high, that result is adopted and the command is executed promptly; if the reliability is low, the voice recognition result from the second information processing device is adopted and the correct command is executed. Commands can therefore be executed both quickly and accurately.
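The selection between the two recognition results can be sketched as follows. This is a minimal illustration rather than the patented implementation: the names `LocalResult`, `recognize_and_select`, `local_recognize`, and `remote_recognize` are hypothetical, and a thread pool stands in for the link to the second information processing device.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class LocalResult:
    """Result of the first (built-in, lower-level) recognizer. Hypothetical type."""
    command: Optional[str]  # None when the speech could not be recognized
    reliable: bool          # outcome of the reliability determination


def recognize_and_select(
    audio: bytes,
    local_recognize: Callable[[bytes], LocalResult],
    remote_recognize: Callable[[bytes], str],
) -> Tuple[str, str]:
    """Start remote recognition, run local recognition, and pick one result."""
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        # Hand the audio to the remote recognizer first so both run concurrently.
        remote_future = pool.submit(remote_recognize, audio)
        local = local_recognize(audio)
        if local.reliable and local.command is not None:
            # Best-effort cancel: a task that is already running is not interrupted.
            remote_future.cancel()
            return ("local", local.command)
        # Low reliability: wait for the slower, higher-level result.
        return ("remote", remote_future.result())
    finally:
        # Do not block the fast path on the remote recognizer.
        pool.shutdown(wait=False)
```

The fast path returns as soon as the built-in recognizer is judged reliable; the remote result is awaited only when it is needed.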

The determination unit of the first information processing device is characterized in that it determines that the reliability of the voice recognition is low if the first voice recognition unit cannot recognize the user's voice command. This prevents unnecessary command execution.

The determination unit of the first information processing device is also characterized in that, after the first voice recognition unit has recognized the user's voice command, it determines that the reliability of the voice recognition is low in response to a user instruction given on the basis of the voice recognition result output from the output unit.
This allows the user's instruction to prevent execution of a wrong command when the voice has been recognized incorrectly.

According to the present invention, a voice command is given and voice recognition is executed simultaneously by the first and second information processing devices. If the reliability of the voice recognition result in the first information processing device is high, that result is adopted and the command is executed promptly; if the reliability is low, the voice recognition result from the second information processing device is adopted and the correct command is executed. Commands can therefore be executed both quickly and accurately.

FIG. 1 is a configuration block diagram of an embodiment of the information processing apparatus of the present invention.
FIG. 2 is an explanatory diagram showing the processing operation of the information processing apparatus of the present invention.
FIG. 3 is an explanatory diagram showing the detailed processing operation of the information processing apparatus of the present invention.

The best mode for carrying out the present invention is described below with reference to the drawings. The present invention is not limited by this description.

<Configuration of the Information Processing Apparatus of the Invention>
FIG. 1 is a block diagram showing the configuration of an embodiment of the information processing apparatus of the present invention. In this embodiment, the information processing apparatus consists of a car navigation device 1 and a mobile phone 2. The car navigation device 1 is a commercially available device that is mounted in a car and performs route guidance using GPS (Global Positioning System). This is only an example; any electronic device with a relatively low-level voice recognition function may be used. For instance, instead of a vehicle-mounted type, it may be a small, lightweight electronic device with a route guidance function that an individual can carry, or a household electrical appliance with a built-in voice recognition function that works together with a mobile phone.

In this example the mobile phone 2 is a conventional mobile phone handset, but it may be any terminal device that can use a voice recognition service (a so-called concierge service). For example, it may be a small, lightweight portable terminal device such as a smartphone or a tablet terminal. The mobile phone 2 can connect to a voice recognition server 3 over a communication line. The voice recognition server 3 has a priority recognition word database.

The car navigation device 1 mainly includes a navigation control unit 11, a voice input unit 12, a voice data transmission unit 13, a navigation built-in voice recognition unit 14, a recognition reliability determination unit 15, a command execution unit 16, a portable voice recognition data receiving unit 17, and a command execution/answer voice reproduction unit 18. In addition, it has various operation buttons, a data input/output unit in which a touch panel and a display panel are integrated, and the various functions needed for navigation.

The navigation control unit 11 controls the operation of each functional block of the car navigation device 1 and consists mainly of a microcomputer including a CPU, a ROM, a RAM, an I/O controller, a timer, and the like. The CPU carries out each function of the car navigation device 1 by operating the various hardware components in coordination according to a program stored in the ROM or the like.

The voice input unit 12 receives the user's voice and converts it into voice data. The voice data transmission unit 13 sends the voice data to the navigation built-in voice recognition unit 14 and, at the same time, transmits it to the voice data receiving unit 22 of the mobile phone 2. The navigation built-in voice recognition unit 14 performs voice recognition on the voice data and converts it into voice recognition data.

The recognition reliability determination unit 15 determines whether the voice recognition data produced by the navigation built-in voice recognition unit 14 is valid. The command execution unit 16 executes a command based on the voice recognition data produced by the navigation built-in voice recognition unit 14. The portable voice recognition data receiving unit 17 receives the voice recognition data sent from the portable voice recognition data transmission unit 24 of the mobile phone 2. The command execution/answer voice reproduction unit 18 executes a command based on the voice recognition data received by the portable voice recognition data receiving unit 17 and reproduces the answer voice obtained in response to the voice data.

The mobile phone 2 mainly includes a mobile phone control unit 21, a voice data receiving unit 22, a portable voice recognition unit 23, and a portable voice recognition data transmission unit 24. In addition, it has various operation buttons, a data input/output unit in which a touch panel and a display panel are integrated, and the various functions needed to operate as a mobile phone.

The mobile phone control unit 21 controls the operation of each functional block of the mobile phone 2 and consists mainly of a microcomputer including a CPU, a ROM, a RAM, an I/O controller, a timer, and the like. The CPU carries out each function of the mobile phone 2 by operating the various hardware components in coordination according to a program stored in the ROM or the like.

The voice data receiving unit 22 receives the voice data sent from the voice data transmission unit 13 of the car navigation device 1. The portable voice recognition unit 23 transmits the data received by the voice data receiving unit 22 to the voice recognition server 3 and receives voice recognition data from the voice recognition server 3. The portable voice recognition data transmission unit 24 transmits the voice recognition data received from the voice recognition server 3 to the portable voice recognition data receiving unit 17 of the car navigation device 1.

<Operation of the Information Processing Apparatus of the Invention>
The processing operation of the information processing apparatus of the present invention is described with reference to FIG. 2, which shows the processing for the car navigation device 1, the mobile phone 2, and the voice recognition server 3 separately. The processing in the car navigation device 1 is performed by the navigation control unit 11, and the processing in the mobile phone 2 by the mobile phone control unit 21. The car navigation device 1 and the mobile phone 2 are connected by Bluetooth (registered trademark) or by a wired connection 4. The mobile phone 2 and the voice recognition server 3 are connected by a communication line 5 such as a mobile phone network or WiMAX/Wi-Fi (registered trademark).

The processing on the car navigation device 1 side is described first. In step S1, the user's voice is input as a command to the voice input unit 12 of the car navigation device 1. In step S2, the voice data transmission unit 13 sends the resulting voice data to the voice data receiving unit 22 of the mobile phone 2 and, at the same time, to the navigation built-in voice recognition unit 14. Voice recognition thus starts simultaneously in the car navigation device 1 and the mobile phone 2.

In step S3, the navigation built-in voice recognition unit 14 performs voice recognition on the voice data, and in step S4 the recognition reliability determination unit 15 determines the recognition reliability. This determination is described later. If the reliability is high, the command execution unit 16 executes the command obtained by the built-in voice recognition in step S5, and in step S6 the voice recognition on the mobile phone 2 side is cancelled and the processing ends. If the reliability is low, the portable voice recognition data receiving unit 17 receives the portable voice recognition data from the portable voice recognition data transmission unit 24 of the mobile phone 2 in step S7. The portable voice recognition data contains the voice recognition data serving as the answer and a command for the car navigation device. In step S8, the command execution/answer voice reproduction unit 18 executes the command obtained by the portable voice recognition and reproduces the answer voice.

The recognition reliability determination unit 15 determines the recognition reliability in step S4 as follows. If the navigation built-in voice recognition unit 14 cannot recognize the voice, the recognition reliability is determined to be low. The recognition reliability is also determined to be low if, after the recognition result has been shown on the display panel or played back as audio, the user judges the result to be inappropriate and cancels it. In all other cases the recognition reliability is determined to be high.
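The determination rule in the paragraph above reduces to two checks. The following is a minimal sketch; the function name and parameters are hypothetical and are not taken from the patent.

```python
from typing import Optional


def recognition_reliability_is_high(
    recognized_text: Optional[str],  # None when the built-in recognizer failed
    user_cancelled: bool,            # True when the user rejected the shown or spoken result
) -> bool:
    """Return True only when the built-in recognition result should be used."""
    if recognized_text is None:
        # Case 1: the voice could not be recognized at all.
        return False
    if user_cancelled:
        # Case 2: the user judged the presented result inappropriate and cancelled it.
        return False
    # All other cases are treated as high reliability.
    return True
```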

The processing on the mobile phone 2 side is described next. When the voice data receiving unit 22 receives voice data from the car navigation device 1, the portable voice recognition unit 23 performs portable voice recognition in step S9. This is what is generally called a concierge service. In this processing, the voice data and the model name of the car navigation device are transmitted to the voice recognition server 3 over the communication line 5, and the voice recognition result is received from the voice recognition server 3. The voice recognition data and the command are then transmitted to the portable voice recognition data receiving unit 17 of the car navigation device 1.
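The data exchanged in this step can be pictured with two small record types. This is only a sketch: the type names, field names, and the `relay_to_server` helper are hypothetical, and the patent does not specify a wire format.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class RecognitionRequest:
    """What the mobile phone sends to the voice recognition server."""
    audio: bytes      # voice data received from the car navigation device
    model_name: str   # model name of the car navigation device


@dataclass(frozen=True)
class RecognitionResponse:
    """What comes back and is forwarded to the car navigation device."""
    answer_text: str  # voice recognition data serving as the answer
    command: str      # command for the car navigation device


def relay_to_server(
    audio: bytes,
    model_name: str,
    server: Callable[[RecognitionRequest], RecognitionResponse],
) -> RecognitionResponse:
    """Attach the model name to the audio and pass the request to the server."""
    return server(RecognitionRequest(audio=audio, model_name=model_name))
```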

The voice recognition server 3 has a priority recognition word database 31. Using this database, it obtains the priority recognition words for the model name of the car navigation device and uses them to create the voice recognition data and the command.
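One way to picture the lookup is a table keyed by model name. The patent does not describe how the priority words are applied, so the matching rule below (first priority phrase found in the transcript) is an assumption, as are the table contents and the names `PRIORITY_WORDS` and `build_command`.

```python
from typing import Dict, Optional

# Hypothetical database: model name -> {priority phrase: device command}.
PRIORITY_WORDS: Dict[str, Dict[str, str]] = {
    "NAVI-A": {"go home": "GO_HOME", "set destination": "SET_DESTINATION"},
    "NAVI-B": {"go home": "ROUTE_HOME"},
}


def build_command(model_name: str, transcript: str) -> Optional[str]:
    """Map a recognized transcript to a command supported by the given model."""
    phrases = PRIORITY_WORDS.get(model_name, {})
    text = transcript.lower()
    for phrase, command in phrases.items():
        # Assumed rule: the first priority phrase contained in the transcript wins.
        if phrase in text:
            return command
    return None  # no command for this model matches the utterance
```

Keying the table by model name lets the same utterance map to different commands on different devices.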

In this way, voice recognition is performed simultaneously by the car navigation device and the mobile phone. If the reliability of the result from the voice recognition built into the car navigation device is high, that result is used; if its reliability is low or the result is wrong, the voice recognition result from the mobile phone is used.

FIG. 3 shows the detailed processing operation of the information processing apparatus of the present invention. The figure shows the overall processing of the car navigation device 1, the mobile phone 2, and the voice recognition server 3 in relation to one another.

As initial setup at connection time, the car navigation device 1 performs the initial connection procedure with the mobile phone 2 in step T1 and transmits its model information to the mobile phone 2 in step T2. The mobile phone 2 performs the initial connection procedure with the car navigation device 1 in step T21 and acquires the model information of the car navigation device in step T22.

The car navigation device 1 starts voice recognition processing for the user's voice input as a command in step T3, acquires the voice data in step T4, and transmits the voice data to the mobile phone 2 in step T5. It then performs voice recognition in step T6. If the reliability of this voice recognition is found to be high in step T7, the device executes the command obtained by the built-in voice recognition in step T8 and, in step T9, cancels the mobile phone's voice recognition data and navigation device command. If the reliability is low, the device instructs the mobile phone 2 in step T10 to transmit its recognition result and receives the voice recognition data and the navigation device command; in step T11 it executes the command obtained by the portable voice recognition and reproduces the answer voice.

The mobile phone 2 receives the voice data from the car navigation device 1 in step T23 and transmits the voice data and the model name of the navigation device to the voice recognition server 3 in step T24.

The voice recognition server 3 receives the voice data and the model name of the navigation device from the mobile phone 2 in step T31. In step T32, its voice recognition engine performs the voice recognition processing known as a concierge service. In this processing, the priority recognition word database 31 is used to obtain the list of supported commands from the model name of the navigation device. In step T33, the server transmits the answer voice recognition data and the navigation device command to the mobile phone 2.

When the mobile phone 2 receives the answer voice recognition data and the navigation device command from the voice recognition server 3 in step T25, it checks in step T26 whether the car navigation device 1 has issued an instruction to cancel the mobile phone's voice recognition data or an instruction to transmit it. If there is a cancel instruction, the mobile phone's voice recognition data is discarded. If there is a transmit instruction, the answer voice recognition data and the navigation device command are transmitted to the car navigation device 1 in step T27.
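Steps T25 to T27 amount to holding the server result until the car navigation device says what to do with it. The sketch below is one possible reading: the class and method names are hypothetical, and letting the instruction arrive either before or after the server result is an assumption, since the patent states only that the phone checks for an instruction once the result is in.

```python
from enum import Enum
from typing import Optional


class Instruction(Enum):
    CANCEL = "cancel"  # built-in recognition was reliable; discard the phone result
    SEND = "send"      # built-in recognition was unreliable; forward the phone result


class PhoneRelay:
    """Holds the server result and the navigation device's instruction until both exist."""

    def __init__(self) -> None:
        self._result: Optional[dict] = None
        self._instruction: Optional[Instruction] = None

    def on_server_result(self, result: dict) -> Optional[dict]:
        self._result = result
        return self._resolve()

    def on_navi_instruction(self, instruction: Instruction) -> Optional[dict]:
        self._instruction = instruction
        return self._resolve()

    def _resolve(self) -> Optional[dict]:
        """Return the data to forward, or None if still waiting or cancelled."""
        if self._result is None or self._instruction is None:
            return None
        result, instruction = self._result, self._instruction
        self._result = self._instruction = None  # reset for the next utterance
        return result if instruction is Instruction.SEND else None
```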

As described above, the car navigation device 1 receives the portable voice recognition data and the navigation device command in step T10, and in step T11 executes the command obtained by the portable voice recognition and reproduces the answer voice.

As described above, in the present invention the user's voice command is given to the car navigation device and the mobile phone at the same time. When the voice command is simple, voice recognition is performed at high speed by the voice recognition function built into the car navigation device. When the voice command is complex, for example when it contains several words or uses a non-standard inflected form, the voice recognition result from the mobile phone is used so that recognition is accurate. This makes quick and accurate voice recognition and command execution possible.

Description of Symbols
1 Car navigation device
2 Mobile phone
3 Voice recognition server
4 Wired connection
5 Communication line
11 Navigation control unit
12 Voice input unit
13 Voice data transmission unit
14 Navigation built-in voice recognition unit
15 Recognition reliability determination unit
16 Command execution unit
17 Portable voice recognition data receiving unit
18 Command execution/answer voice reproduction unit
21 Mobile phone control unit
22 Voice data receiving unit
23 Portable voice recognition unit
24 Portable voice recognition data transmission unit
31 Priority recognition word database

Claims

An information processing apparatus comprising a first information processing device having a voice recognition function of a first level and a second information processing device having a voice recognition function of a second level higher than that of the first information processing device, wherein
the first information processing device includes an input unit through which a user's voice command can be input, a transmission unit that transmits the input voice command to the second information processing device, a first voice recognition unit that performs voice recognition of the input voice command at the first level simultaneously with the transmission by the transmission unit, and an output unit that outputs the result of the voice recognition by the first voice recognition unit,
the second information processing device includes a second voice recognition unit that performs voice recognition of the voice command received from the first information processing device at the second level, and a transmission unit that transmits the voice recognition result of the second voice recognition unit to the first information processing device, and
the first information processing device further includes a receiving unit that receives the voice recognition result of the second voice recognition unit, a determination unit that determines the reliability of the voice recognition by the first voice recognition unit, and a command execution unit that executes the command recognized by the first voice recognition unit when the determination unit determines that the reliability is high, and executes the command recognized by the second voice recognition unit when the determination unit determines that the reliability is low.

The information processing apparatus according to claim 1, wherein the determination unit of the first information processing device determines that the reliability of the voice recognition is low if the first voice recognition unit cannot recognize the user's voice command.

The information processing apparatus according to claim 2, wherein, after the first voice recognition unit has recognized the user's voice command, the determination unit of the first information processing device determines that the reliability of the voice recognition is low in response to a user instruction given on the basis of the voice recognition result output from the output unit.