JP2020187163A - Voice operation system, voice operation control method and voice operation control program

Info

Publication number: JP2020187163A
Application number: JP2019089627A
Authority: JP
Inventors: 孝浩田中; Takahiro Tanaka; 将郎小池; Masaro Koike
Original assignee: Honda Motor Co Ltd
Current assignee: Honda Motor Co Ltd
Priority date: 2019-05-10
Filing date: 2019-05-10
Publication date: 2020-11-19

Abstract

To provide a voice operation system in which instruction content recognized or estimated from a user's utterance can be corrected easily.
SOLUTION: An instruction candidate determination unit 12 determines a first instruction candidate by recognizing or estimating the content of an instruction by a user U, based on an instruction utterance of the user U recognized by an utterance recognition unit 11. An instruction candidate correction unit 13 performs a first candidate notification, which outputs by voice the first instruction candidate or the execution status of a first predetermined process corresponding to the first instruction candidate. When a correction instruction utterance instructing correction of the instruction content of the first instruction candidate is recognized in response to the first candidate notification, the instruction candidate correction unit 13 determines a second instruction candidate by correcting the first instruction candidate in accordance with the correction instruction utterance, and outputs by voice the content of the second instruction candidate or the execution status of a second predetermined process corresponding to the second instruction candidate.
SELECTED DRAWING: Figure 1

Description

本発明は、音声操作システム、音声操作制御方法、及び音声操作制御プログラムに関する。 The present invention relates to a voice operation system, a voice operation control method, and a voice operation control program.

従来、ナビゲーション装置において、利用者の音声入力を解析して発話内容を複数の構成要素に分割して表示し、各構成要素を個別に選択して、音声の再入力による構成要素の修正ができるようにした構成が提案されている（例えば、特許文献１参照）。 Conventionally, in a navigation device, it is possible to analyze a user's voice input, divide the utterance content into a plurality of components and display them, select each component individually, and modify the component by re-inputting the voice. Such a configuration has been proposed (see, for example, Patent Document 1).

Patent Document 1: Japanese Unexamined Patent Publication No. 2013-11466

With the conventional configuration described above, when the utterance content is misrecognized because of a slip of the tongue or an erroneous analysis, the user has the inconvenience of having to select the misrecognized component and then re-enter the voice input. In addition, when the user utters only part of an instruction rather than the complete instruction and the instruction content is estimated by AI, the estimated instruction content may be contrary to the user's intention. In this case as well, the user has the inconvenience of having to select the unintended part and re-enter the voice input.
The present invention has been made in view of this background, and its object is to provide a voice operation system, a voice operation control method, and a voice operation control program with which a user can easily correct instruction content recognized or estimated based on the user's utterance.

As a first aspect for achieving the above object, there is provided a voice operation system comprising: an utterance recognition unit that recognizes a user's utterance; an instruction candidate determination unit that determines a first instruction candidate by recognizing or estimating the content of the user's instruction based on an instruction utterance of the user recognized by the utterance recognition unit; and an instruction candidate correction unit. The instruction candidate correction unit performs a first candidate notification, which outputs by voice the first instruction candidate or the execution status of a first predetermined process corresponding to the first instruction candidate. When a correction instruction utterance instructing correction of the instruction content of the first instruction candidate is recognized by the utterance recognition unit in response to the first candidate notification, the instruction candidate correction unit determines a second instruction candidate by correcting the first instruction candidate in accordance with the correction instruction utterance, and performs a second candidate notification, which outputs by voice the content of the second instruction candidate or the execution status of a second predetermined process corresponding to the second instruction candidate.

In the above voice operation system, when the first instruction candidate includes a first instruction element of a predetermined genre and the correction instruction utterance includes a second instruction element of the same predetermined genre, the instruction candidate correction unit may determine the second instruction candidate by correcting the first instruction element based on the second instruction element.

In the above voice operation system, the voice operation system may be used to specify search conditions for a destination in a navigation device. The instruction candidate determination unit may determine a first search condition for the destination as the first instruction candidate, and the instruction candidate correction unit may determine, as the second instruction candidate, a second search condition obtained by correcting the first search condition in accordance with the correction instruction utterance. The predetermined genre may be any one of the location of the destination, the date and time of departure for the destination, the evaluation rank of the facility that is the destination, the type of the facility that is the destination, and, when there are a plurality of users, identification information of a user.

In the above voice operation system, the instruction candidate correction unit may perform an error confirmation notification between the time the correction instruction utterance is recognized by the utterance recognition unit and the time the second candidate notification is performed. The error confirmation notification reports by voice that the first instruction candidate differed from the instruction content intended by the user.

The above voice operation system may include a behavior habit estimation unit that estimates the user's behavioral habits. When the instruction content intended by the user cannot be identified from the instruction utterance, the instruction candidate determination unit may determine the first instruction candidate based on an instruction element included in the instruction utterance and the user's behavioral habits estimated by the behavior habit estimation unit.

As a second aspect for achieving the above object, there is provided a voice operation control method implemented by a single computer or a plurality of computers having an utterance recognition unit that recognizes a user's utterance. The method includes: an utterance recognition step in which the user's utterance is recognized by the utterance recognition unit; a first instruction candidate determination step of determining a first instruction candidate by recognizing or estimating the content of the user's instruction based on the user's instruction utterance recognized in the utterance recognition step; a first candidate notification step of outputting by voice the first instruction candidate or the execution status of a first predetermined process corresponding to the first instruction candidate; a second instruction candidate determination step of determining, when a correction instruction utterance instructing correction of the instruction content of the first instruction candidate is recognized by the utterance recognition unit in response to the first candidate notification step, a second instruction candidate corrected from the first instruction candidate in accordance with the correction instruction utterance; and a second candidate notification step of outputting by voice the content of the second instruction candidate or the execution status of a second predetermined process corresponding to the second instruction candidate.

As a third aspect for achieving the above object, there is provided a voice operation control program that is installed on a single computer or a plurality of computers and causes the computer to execute: an utterance recognition process of recognizing a user's utterance; a first instruction candidate determination process of determining a first instruction candidate by recognizing or estimating the content of the user's instruction based on the user's instruction utterance recognized by the utterance recognition process; a first candidate notification process of outputting by voice the first instruction candidate or the execution status of a first predetermined process corresponding to the first instruction candidate; a second instruction candidate determination process of determining, when a correction instruction utterance instructing correction of the instruction content of the first instruction candidate is recognized by the utterance recognition process in response to the first candidate notification process, a second instruction candidate corrected from the first instruction candidate in accordance with the correction instruction utterance; and a second candidate notification process of outputting by voice the content of the second instruction candidate or the execution status of a second predetermined process corresponding to the second instruction candidate.

According to the above voice operation system, the instruction candidate determination unit determines the first instruction candidate based on the user's instruction utterance, and the instruction candidate correction unit performs the first candidate notification, which outputs by voice the first instruction candidate or the execution status of the first predetermined process corresponding to the first instruction candidate. When a correction instruction utterance made in response to the first candidate notification is recognized, the instruction candidate correction unit determines the second instruction candidate by correcting the first instruction candidate in accordance with the correction instruction utterance, and performs the second candidate notification, which outputs by voice the content of the second instruction candidate or the execution status of the second predetermined process corresponding to the second instruction candidate. As a result, when the first instruction candidate determined by the voice operation system is not what the user intended, the user can correct the instruction content through the simple operation of uttering a correction instruction utterance, and can confirm the corrected content from the voice output.
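The notification sequence summarized above can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the names `run_dialogue`, `listen_for_correction`, and `apply_correction` and the wording of the spoken strings do not appear in the publication, and the recognition and correction logic is passed in as plain callables rather than implemented.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class DialogueResult:
    candidate: str      # instruction candidate finally in effect
    spoken: List[str]   # voice outputs, in the order they were issued


def run_dialogue(
    first_candidate: str,
    listen_for_correction: Callable[[], Optional[str]],
    apply_correction: Callable[[str, str], str],
) -> DialogueResult:
    """Hypothetical sketch: first notification, optional correction, second notification."""
    # First candidate notification: report the first instruction candidate by voice.
    spoken = [f"First candidate: {first_candidate}"]

    # A correction instruction utterance may or may not follow the notification.
    correction = listen_for_correction()
    if correction is None:
        return DialogueResult(first_candidate, spoken)

    # Second instruction candidate: the first candidate corrected per the utterance.
    second_candidate = apply_correction(first_candidate, correction)

    # Second candidate notification: report the corrected candidate by voice.
    spoken.append(f"Second candidate: {second_candidate}")
    return DialogueResult(second_candidate, spoken)
```

In this sketch the user never selects a component on a screen; the only user action after the first notification is the correction utterance itself, which is the simplification the paragraph above describes.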

FIG. 1 is a configuration diagram of a navigation device including the functions of a voice operation system.
FIG. 2 is an explanatory diagram of user data.
FIG. 3 is a flowchart of a process of determining the first search condition for a destination based on the user's utterance and behavioral habits.
FIG. 4 is a flowchart of a process of correcting the search condition for a destination based on the user's utterance.
An explanatory diagram of a first dialogue example between the user and the voice operation system.
An explanatory diagram of a second dialogue example between the user and the voice operation system.
An explanatory diagram of a third dialogue example between the user and the voice operation system.
An explanatory diagram of a fourth dialogue example between the user and the voice operation system.
An explanatory diagram of a fifth dialogue example between the user and the voice operation system.
An explanatory diagram of a sixth dialogue example between the user and the voice operation system.

[1. Configuration of the voice operation system]
The configuration of the voice operation system 2 of the present embodiment will be described with reference to FIG. 1. The voice operation system 2 is configured as part of the functions of the navigation device 1 mounted on a vehicle (not shown). Although the present embodiment shows the navigation device 1 mounted on a vehicle, a portable navigation device may be used instead. The navigation device may also be implemented by running a navigation app (application program) on a mobile terminal such as a smartphone.

ナビゲーション装置１は、ＣＰＵ（Central Processing Unit）１０、メモリ２０、通信部３０、マイク３１、スピーカー３２、タッチパネル３３、スイッチ３４、及びＧＰＳ（Global Positioning System）ユニット３５を備えている。通信部３０は、通信ネットワーク１００を介して、操作支援サーバー１１０等の外部システムとの間で通信を行う。また、通信部３０は、通信ネットワーク１００を介して或いは直接、ナビゲーション装置１の利用者Ｕが使用する利用者端末９０との間で通信を行う。利用者端末９０は、スマートフォン、タブレット端末、携帯電話等の携帯型の通信端末である。 The navigation device 1 includes a CPU (Central Processing Unit) 10, a memory 20, a communication unit 30, a microphone 31, a speaker 32, a touch panel 33, a switch 34, and a GPS (Global Positioning System) unit 35. The communication unit 30 communicates with an external system such as the operation support server 110 via the communication network 100. Further, the communication unit 30 communicates with the user terminal 90 used by the user U of the navigation device 1 via the communication network 100 or directly. The user terminal 90 is a portable communication terminal such as a smartphone, a tablet terminal, or a mobile phone.

The microphone 31 receives the voice of the user U. The speaker 32 outputs voice guidance and the like to the user U. The touch panel 33 consists of a flat display such as a liquid crystal panel and touch switches arranged on the surface of the display. The switch 34 is operated by being pressed by the user U. The GPS unit 35 detects the current position of the navigation device 1 (the current position of the vehicle) by receiving radio waves transmitted from GPS satellites.

The navigation device 1 sets a destination in response to a touch operation on the touch panel 33 by the user U or a voice operation by the user input to the microphone 31. The navigation device 1 then provides route guidance to the destination based on the current position of the navigation device 1 detected by the GPS unit 35 (the current position of the vehicle on which the navigation device 1 is mounted) and the map data 23 stored in the memory 20. The map data may instead be acquired by using the communication unit 30 to access an external server such as the operation support server 110.

The voice operation system 2 is composed of the CPU 10, the memory 20, and the like. The CPU 10 reads and installs the control program 21 of the voice operation system 2 held in the memory 20 and, by executing the control program 21, functions as the utterance recognition unit 11, the instruction candidate determination unit 12, the instruction candidate correction unit 13, the predetermined process execution unit 14, the action history storage unit 15, and the behavior habit estimation unit 16. The CPU 10 corresponds to the single computer or plurality of computers of the present invention and carries out the voice operation control method. The control program 21 includes the voice operation control program of the present invention.

The action history storage unit 15 stores, in the user data 22, an action history indicating the places to which the user U has traveled so far and the corresponding dates and times. As shown in FIG. 2, the user data 22 records, for each of the plurality of users who use the vehicle on which the navigation device 1 is mounted, a user ID 22a, biometric data 22b for identifying the user, and an action history 22c. The biometric data 22b holds data used for biometric authentication, such as a face image, voiceprint, iris, or fingerprint. FIG. 2 illustrates the user data 22 for the user U.
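As a rough illustration of the per-user record described above, the following Python sketch models the user data 22 with dataclasses. The class and field names (`UserRecord`, `ActionRecord`, `record_visit`) and the choice of a dictionary for biometric data are assumptions for illustration only; the publication does not specify a storage layout.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List


@dataclass
class ActionRecord:
    place: str              # place the user traveled to
    visited_at: datetime    # date and time of the visit


@dataclass
class UserRecord:
    user_id: str                                                        # user ID 22a
    biometric_data: Dict[str, bytes] = field(default_factory=dict)      # biometric data 22b
    action_history: List[ActionRecord] = field(default_factory=list)    # action history 22c

    def record_visit(self, place: str, visited_at: datetime) -> None:
        """Append one entry to the action history (hypothetical helper)."""
        self.action_history.append(ActionRecord(place, visited_at))


# One record per user of the vehicle, keyed by user ID.
user_data: Dict[str, UserRecord] = {"U": UserRecord(user_id="U")}
user_data["U"].record_visit("Supermarket XX a-machi store", datetime(2019, 5, 10, 18, 30))
```

Keying the records by user ID mirrors the statement that the data is kept separately for each user of the same vehicle.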

The action history storage unit 15 recognizes the places to which the user U has traveled, based on the transition of the current position of the navigation device 1 detected by the GPS unit 35, the destinations set by the user U, and the like, and records them in the action history 22c. The action history storage unit 15 also communicates with the user terminal 90 and records in the action history 22c the activity of the user U recognized from, for example, the user U's schedule set in a schedule app running on the user terminal 90 and the payment history processed by a payment app running on the user terminal 90.

発話認識部１１は、マイク３１に入力された利用者Ｕの音声を解析して、利用者Ｕの発話内容を認識する。指示候補決定部１２は、ＡＩ（Artificial Intelligence）エンジンを用いて構成され、発話認識部１１により認識された利用者Ｕの発話内容、及び利用者データ２２に記録された行動履歴２２ｃに基づいて、利用者Ｕが意図している目的地の第１探索条件（本発明の第１指示候補に相当する）を決定する。指示候補訂正部１３は、第１探索条件の音声出力に応じて、利用者Ｕよる探索条件の訂正を指示する発話（本発明の訂正指示発話に相当する）が発話認識部１１により認識された場合に、第１探索条件を訂正した第２探索条件（本発明の第２指示候補に相当する）を決定する。 The utterance recognition unit 11 analyzes the voice of the user U input to the microphone 31 and recognizes the utterance content of the user U. The instruction candidate determination unit 12 is configured by using an AI (Artificial Intelligence) engine, and is based on the utterance content of the user U recognized by the utterance recognition unit 11 and the action history 22c recorded in the user data 22. The first search condition of the destination intended by the user U (corresponding to the first instruction candidate of the present invention) is determined. In the instruction candidate correction unit 13, the utterance (corresponding to the correction instruction utterance of the present invention) instructing the correction of the search condition by the user U is recognized by the utterance recognition unit 11 in response to the voice output of the first search condition. In this case, the second search condition (corresponding to the second instruction candidate of the present invention) obtained by correcting the first search condition is determined.

所定処理実行部１４は、第１探索条件に従った目的地の第１探索処理（本発明の第１所定処理に相当する）、及び第２探索条件に従った目的地の第２探索処理（本発明の第２所定処理に相当する）を実行する。 The predetermined process execution unit 14 performs a first search process for the destination according to the first search condition (corresponding to the first predetermined process of the present invention) and a second search process for the destination according to the second search condition (corresponding to the first predetermined process of the present invention). The second predetermined process of the present invention) is executed.

The behavior habit estimation unit 16 uses the AI engine to estimate behavioral habits of the user U, such as the following, based on the action history 22c recorded in the user data 22.
(1) On weekdays, the user U often stops at the a-machi store of Supermarket XX near home on the way home from work.
(2) Whenever the user U returns to the parents' home in the user's hometown, the user U leaves home after dinner on Friday, stops at the a-machi store of Supermarket XX near home, and then heads for the parents' home.
(3) When traveling by car, the user U does not have dinner at the hotel, but instead eats at a two-star French restaurant near the hotel that is listed in the AA guidebook.
(4) The user U is a married couple, and the wife commutes by car every weekday to FF City Hall, her workplace.
(5) The user U is a fan of the GG team and attends every game between the GG team and another team that is held, about once a year, at the local Kiyohara Stadium.
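The publication attributes habit estimation to an AI engine and does not describe its internals. As a deliberately simple stand-in, the sketch below counts weekday visits in an action history and reports places visited at least a threshold number of times, roughly in the spirit of habit (1). The function name `frequent_weekday_stops`, the `(place, datetime)` input shape, and the `min_visits` threshold are all assumptions, and frequency counting is a substitute technique, not the method of the publication.

```python
from collections import Counter
from datetime import datetime
from typing import List, Tuple


def frequent_weekday_stops(
    history: List[Tuple[str, datetime]], min_visits: int = 3
) -> List[str]:
    """Return places visited on weekdays at least min_visits times, most frequent first.

    Hypothetical stand-in for the AI-based estimation; it only counts visits.
    """
    # datetime.weekday() returns 0 for Monday through 6 for Sunday.
    counts = Counter(place for place, when in history if when.weekday() < 5)
    return [place for place, n in counts.most_common() if n >= min_visits]


history = [
    ("Supermarket XX a-machi store", datetime(2019, 5, 6, 18, 0)),   # Monday
    ("Supermarket XX a-machi store", datetime(2019, 5, 7, 18, 5)),   # Tuesday
    ("Supermarket XX a-machi store", datetime(2019, 5, 8, 17, 50)),  # Wednesday
    ("Kiyohara Stadium", datetime(2019, 5, 11, 13, 0)),              # Saturday
]
habitual_places = frequent_weekday_stops(history)
```

A real estimator would also need time-of-day and route context (for example, "on the way home from work"), which this count ignores.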

[2. Process of determining the destination search condition]
Following the flowcharts shown in FIGS. 3 and 4, the process of determining the destination search condition, which the voice operation system 2 executes when the user U makes an utterance V10 (instruction utterance) specifying a destination as in the first dialogue example of FIG. 5, will now be described. In step S1 of FIG. 3, when the utterance recognition unit 11 recognizes an utterance of the user U from the voice input to the microphone 31, the process proceeds to step S2. The process of recognizing the utterance of the user U in step S2 corresponds to the utterance recognition step in the voice operation control method of the present invention and to the utterance recognition process in the voice operation control program of the present invention.

ステップＳ２で、発話認識部１１は、発話内容から目的地の探索条件（利用者Ｕによる指示内容）が特定できるか否かを判断する。そして、発話認識部１１は、探索条件が特定できるときはステップＳ２０に処理を進め、探索条件が特定できないときにはステップＳ３に処理を進める。ステップＳ２０で、所定処理実行部１４は、特定された探索条件による目的地の探索処理を実行し、図４のステップＳ１３に処理を進める。 In step S2, the utterance recognition unit 11 determines whether or not the search condition for the destination (content of instruction by the user U) can be specified from the utterance content. Then, the utterance recognition unit 11 proceeds to step S20 when the search condition can be specified, and proceeds to step S3 when the search condition cannot be specified. In step S20, the predetermined process execution unit 14 executes the search process of the destination according to the specified search condition, and proceeds to step S13 of FIG.

図５の例では、「近所のスーパーを教えて」との発話Ｖ１０が発話認識部により認識され、ステップＳ３で、指示候補決定部１２は、「近所のスーパー」を指示要素として抽出する。続くステップＳ４で、指示候補決定部１２は、声紋による生体認証により、発話Ｖ１０を行ったのが、利用者Ｕであることを認識する。なお、声紋に代えて、顔画像、指紋、虹彩等による生体認証を行ってもよい。 In the example of FIG. 5, the utterance V10 saying "Tell me the supermarket in the neighborhood" is recognized by the utterance recognition unit, and in step S3, the instruction candidate determination unit 12 extracts the "supermarket in the neighborhood" as an instruction element. In the following step S4, the instruction candidate determination unit 12 recognizes that it is the user U who has performed the utterance V10 by biometric authentication using the voiceprint. In addition, instead of the voiceprint, biometric authentication may be performed using a face image, a fingerprint, an iris, or the like.

In the next step S5, the instruction candidate determination unit 12 determines the first search condition for the destination based on the instruction element "neighborhood supermarket" and the behavioral habits of the user U estimated by the behavior habit estimation unit 16. In the example of FIG. 5, it is assumed that the behavior habit estimation unit 16 has estimated habit (1) above for the user U: "On weekdays, the user U often stops at the a-machi store of Supermarket XX near home on the way home from work." Based on habit (1), the instruction candidate determination unit 12 estimates that the "neighborhood supermarket" is the supermarket at which the user U always stops on the way home from work, and sets the first search condition to "the usual supermarket". The process of determining the first search condition in step S5 corresponds to the first instruction candidate determination step in the voice operation control method of the present invention and to the first instruction candidate determination process in the voice operation control program of the present invention.
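Step S5 combines an extracted instruction element with an estimated habit. The sketch below shows one very reduced way to express that combination: a keyword lookup that maps an instruction element onto a habitual destination when a habit applies, and otherwise leaves the element unchanged. The function name, the `habits` mapping, and the substring match are assumptions for illustration; the publication assigns this inference to an AI engine and does not describe a lookup table.

```python
from typing import Dict


def decide_first_search_condition(instruction_element: str, habits: Dict[str, str]) -> str:
    """Return the first search condition for an instruction element.

    habits maps a keyword (hypothetical representation of an estimated habit)
    to the habitual destination label associated with it.
    """
    for keyword, habitual_destination in habits.items():
        if keyword in instruction_element:
            # A habit applies: interpret the element as the habitual destination.
            return habitual_destination
    # No habit applies: use the instruction element as the search condition.
    return instruction_element


habits_of_user_u = {"supermarket": "the usual supermarket"}
first_condition = decide_first_search_condition("neighborhood supermarket", habits_of_user_u)
```

Because the habit overrides the literal wording, the result can be wrong for the user, which is why the later correction steps exist.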

In the following step S6, the predetermined process execution unit 14 executes the first search process: in accordance with the first search condition "the usual supermarket", it refers to the map data 23 and searches for a route to the "Supermarket XX a-machi store" recorded in the action history 22c. The subsequent step S7 through step S10 of FIG. 4, and step S12, are performed by the instruction candidate correction unit 13.

ステップＳ７で、指示候補訂正部１３は、第１探索処理の実行状況を音声によりスピーカー３２から出力する（本発明の第１候補報知に相当する）。図５の例では、指示候補訂正部１３は、「いつものスーパーを探しています」という音声Ｗ１０を、スピーカー３２から出力する。ステップＳ７で第１探索処理の実行状況を音声によりスピーカー３２から出力する処理は、本発明の音声操作制御方法における第１候補報知ステップに相当すると共に、本発明の音声操作制御プログラムにおける第１候補報知処理に相当する。 In step S7, the instruction candidate correction unit 13 outputs the execution status of the first search process by voice from the speaker 32 (corresponding to the first candidate notification of the present invention). In the example of FIG. 5, the instruction candidate correction unit 13 outputs the voice W10 "I am looking for the usual supermarket" from the speaker 32. The process of outputting the execution status of the first search process from the speaker 32 by voice in step S7 corresponds to the first candidate notification step in the voice operation control method of the present invention, and is the first candidate in the voice operation control program of the present invention. Corresponds to notification processing.

続く図４のステップＳ８で、指示候補訂正部１３は、利用者Ｕによる第１探索条件の訂正を指示する発話（訂正指示発話）が、発話認識部１１により認識されたか否かを判断する。そして、指示候補訂正部１３は、訂正指示発話が認識されたときはステップＳ９に処理を進める。一方、訂正指示発話が認識されなかったときには、指示候補訂正部１３はステップＳ１３に処理を進め、この場合は第１探索条件の訂正は行われない。 In the following step S8 of FIG. 4, the instruction candidate correction unit 13 determines whether or not the utterance (correction instruction utterance) instructing the user U to correct the first search condition has been recognized by the utterance recognition unit 11. Then, when the instruction candidate correction unit 13 recognizes the correction instruction utterance, the process proceeds to step S9. On the other hand, when the correction instruction utterance is not recognized, the instruction candidate correction unit 13 proceeds to step S13, and in this case, the first search condition is not corrected.

In the example of FIG. 5, the correction instruction utterance V11 by the user U, "Not the usual supermarket, the supermarket near my office," is recognized. In step S9, the instruction candidate correction unit 13 recognizes the content of the correction instruction utterance V11 and estimates what the user U wants to correct by using a classification into 5W1H genres (When, Where, Who, What, Why, How many, How much), which correspond to the predetermined genre of the present invention.

ここで、Ｗｈｅｒｅジャンルには、地名、場所名、所在地、住所、緯度経度情報、地物情報等が含まれる。地物には、ランドマークや観光エリア（山、滝、湖等）、建築物（寺院、橋、ビル、家屋、店舗等）、テーマパークやショッピングモール等の商業施設が含まれる。さらに、地物には、信号機や標識、中央分離帯、フェンス、ガードレール、ポール、電柱、その他の物体が含まれてもよい。地物情報は、これらの地物の名称或いは位置の情報である。 Here, the Where genre includes place names, place names, locations, addresses, latitude / longitude information, feature information, and the like. Features include landmarks, tourist areas (mountains, waterfalls, lakes, etc.), buildings (temples, bridges, buildings, houses, stores, etc.), and commercial facilities such as theme parks and shopping malls. In addition, features may include traffic lights, signs, medians, fences, guardrails, poles, utility poles, and other objects. Feature information is information on the names or locations of these features.

図５の例は、訂正指示発話Ｖ１１で否定された「いつものスーパー」がＷｈｅｒｅジャンルである。そのため、指示候補訂正部１３は、第１探索条件におけるＷｈｅｒｅジャンルの指示要素である第１指示要素「いつものスーパー」を、訂正指示発話Ｖ１１により訂正が指示された指示要素である第２指示要素「会社の近所のスーパー」に置き換えることによって、第１探索条件を訂正した第２探索条件を決定する。ステップＳ９で第１探索条件を訂正した第２探索条件を決定する処理は、本発明の音声操作制御方法における第２指示候補決定ステップに相当すると共に、本発明の音声操作制御プログラムにおける第２指示候補決定処理に相当する。 In the example of FIG. 5, the "usual supermarket" negated in the correction instruction utterance V11 belongs to the Where genre. Therefore, the instruction candidate correction unit 13 determines a second search condition, which is a corrected version of the first search condition, by replacing the first instruction element "usual supermarket", which is the instruction element of the Where genre in the first search condition, with the second instruction element "supermarket in the neighborhood of the company", which is the instruction element whose correction was instructed by the correction instruction utterance V11. The process in step S9 of determining the second search condition by correcting the first search condition corresponds to the second instruction candidate determination step in the voice operation control method of the present invention, and also corresponds to the second instruction candidate determination process in the voice operation control program of the present invention.
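The genre-based replacement described above can be illustrated with a minimal sketch. It is not taken from the patent: the dictionary representation of a search condition, the toy lexicon `GENRE_LEXICON`, and the function name `correct_condition` are all assumptions introduced for illustration, and a real system would classify phrases with a language-understanding component rather than a lookup table.

```python
# Hypothetical toy lexicon mapping phrases to 5W1H genres (an assumption;
# the patent does not specify how a phrase's genre is determined).
GENRE_LEXICON = {
    "usual supermarket": "where",
    "supermarket in the neighborhood of the company": "where",
    "two-star": "how_many_how_much",
    "three-star": "how_many_how_much",
    "French cuisine": "what",
    "Italian cuisine": "what",
}


def correct_condition(first_condition, negated, replacement):
    """Return a second search condition in which only the instruction element
    of the negated phrase's genre is replaced; all other elements are kept."""
    genre = GENRE_LEXICON[negated]
    if first_condition.get(genre) != negated:
        raise ValueError("negated phrase is not an element of the first condition")
    second_condition = dict(first_condition)  # leave the first condition intact
    second_condition[genre] = replacement
    return second_condition


# Example corresponding to FIG. 5: only the Where element changes.
first = {"where": "usual supermarket"}
second = correct_condition(
    first, "usual supermarket", "supermarket in the neighborhood of the company"
)
```

Because only one genre slot is overwritten, the remaining elements of the first search condition carry over unchanged, which is what allows the user to avoid restating the whole condition.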

続くステップＳ１０で、指示候補訂正部１３は、第１探索条件が利用者Ｕの意図する探索条件と異なっていたことを確認するための音声を出力する（本発明の誤り確認報知に相当する）。図５の例では、指示候補訂正部１３は、「え、間違えましたか？」という音声Ｗ１１を、スピーカー３２から出力する。利用者Ｕは、音声Ｗ１１を聞くことにより、音声操作システム２が、利用者Ｕによる誤りの指摘を認識したことを確認することができる。 In the subsequent step S10, the instruction candidate correction unit 13 outputs a voice for confirming that the first search condition differed from the search condition intended by the user U (corresponding to the error confirmation notification of the present invention). In the example of FIG. 5, the instruction candidate correction unit 13 outputs the voice W11, "Oh, did I make a mistake?", from the speaker 32. By listening to the voice W11, the user U can confirm that the voice operation system 2 has recognized that the user U pointed out an error.

次のステップＳ１１で、指示候補訂正部１３は、訂正内容を確認するための音声をスピーカー３２から出力する（本発明の訂正確認報知に相当する）。図５の例では、指示候補訂正部１３は、「会社の近所のスーパーですよね」という音声Ｗ１２をスピーカー３２から出力する。利用者Ｕは、音声Ｗ１２を聞くことにより、音声操作システム２が、利用者が指示した訂正内容を認識したことを確認することができる。 In the next step S11, the instruction candidate correction unit 13 outputs a voice for confirming the correction content from the speaker 32 (corresponding to the correction confirmation notification of the present invention). In the example of FIG. 5, the instruction candidate correction unit 13 outputs the voice W12 saying "It is a supermarket in the neighborhood of the company" from the speaker 32. By listening to the voice W12, the user U can confirm that the voice operation system 2 has recognized the correction content instructed by the user.

次のステップＳ１２で、所定処理実行部１４は、第２探索条件「会社の近所のスーパー」に従って、地図データ２３を参照して、現在地から、行動履歴２２ｃに記録された利用者Ｕの勤務先の近所にあるスーパーまでの経路を探索する第２探索処理を実行する。続くステップＳ１３で、指示候補訂正部１３は、第２探索処理の実行状況である「会社の近所のスーパーを探します」という音声Ｗ１３を、スピーカー３２から出力する（本発明の第２候補報知に相当する）。ステップＳ１２で第２探索処理の実行状況を音声により出力する処理は、本発明の音声操作制御方法における第２候補報知ステップに相当すると共に、本発明の音声操作制御プログラムにおける第２候補報知処理に相当する。 In the next step S12, the predetermined processing execution unit 14 executes, in accordance with the second search condition "supermarket in the neighborhood of the company" and with reference to the map data 23, a second search process that searches for a route from the current location to a supermarket near the workplace of the user U recorded in the action history 22c. In the subsequent step S13, the instruction candidate correction unit 13 outputs from the speaker 32 the voice W13, "Searching for a supermarket in the neighborhood of the company," which is the execution status of the second search process (corresponding to the second candidate notification of the present invention). The process of outputting the execution status of the second search process by voice in step S12 corresponds to the second candidate notification step in the voice operation control method of the present invention, and also corresponds to the second candidate notification process in the voice operation control program of the present invention.
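The flow of steps S8 to S13 described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function `handle_correction`, the callbacks `run_search` and `speak`, the `(genre, replacement)` tuple, and the two boolean flags are hypothetical names, and the spoken strings are paraphrases rather than the system's actual output.

```python
def handle_correction(first_condition, correction, run_search, speak,
                      confirm_error=True, confirm_correction=True):
    """Sketch of steps S8 to S13.

    correction is None when no correction instruction utterance was
    recognized, otherwise a (genre, replacement) tuple (an assumed format).
    """
    if correction is None:                      # S8: nothing to correct
        return first_condition
    genre, replacement = correction             # S9: build the second condition
    second_condition = dict(first_condition)
    second_condition[genre] = replacement
    if confirm_error:                           # S10: error confirmation
        speak("Oh, did I make a mistake?")
    if confirm_correction:                      # S11: correction confirmation
        speak(f"{replacement}, right?")
    status = run_search(second_condition)       # S12: second search process
    speak(status)                               # S13: second candidate notification
    return second_condition


# Usage with stand-in callbacks that record what would be spoken.
spoken = []
result = handle_correction(
    {"where": "usual supermarket"},
    ("where", "supermarket in the neighborhood of the company"),
    run_search=lambda cond: f"Searching for a {cond['where']}",
    speak=spoken.append,
)
```

Passing the search and speech functions as callbacks keeps the sketch independent of any particular navigation or text-to-speech component.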

図３、図４の処理により、図５に示したように、利用者Ｕは、「近所のスーパーを教えて」という短い発話により、目的地の第１探索条件を指示することができる。また、利用者Ｕは、音声操作システム２による「いつものスーパーを探しています」という音声出力Ｗ１０から、第１探索条件が利用者Ｕが意図する探索条件と一致しているか否かを判断することができる。そして、利用者Ｕは、第１探索条件が意図していたものと異なっている場合には、第１探索条件の訂正を指示する「いつものスーパーじゃなくて、会社の近所のスーパーだよ」という簡易な発話を行うことで、第１探索条件のＷｈｅｒｅジャンルの指示要素を訂正した第２探索条件を設定することができる。 Through the processing of FIGS. 3 and 4, as shown in FIG. 5, the user U can specify the first search condition for the destination with the short utterance "Tell me a supermarket in the neighborhood." The user U can also judge, from the voice output W10 of the voice operation system 2, "Searching for the usual supermarket," whether the first search condition matches the search condition that the user U intended. When the first search condition differs from what was intended, the user U can set a second search condition, in which the instruction element of the Where genre of the first search condition is corrected, simply by making the short utterance "Not the usual supermarket, but the supermarket in the neighborhood of the company," which instructs correction of the first search condition.

これにより、探索条件を訂正するために、探索条件を最初から設定し直すことを不要として、利用者Ｕの意図と異なっている第１探索条件の指示要素「いつものスーパー」のみを、「会社の近所のスーパー」に置き換えることにより、第１探索条件を訂正した第２探索条件を容易に設定することができる。 As a result, the search condition does not have to be set again from the beginning in order to correct it. The second search condition, which is a corrected version of the first search condition, can be set easily by replacing only the instruction element "usual supermarket" of the first search condition, which differs from the intention of the user U, with "supermarket in the neighborhood of the company".

［３．Ｗｈｅｎジャンルの指示要素の修正］ [3. Correction of an instruction element of the When genre]
図６を参照して、第１探索条件におけるＷｈｅｎジャンル（出発日時等が含まれる）の指示要素を訂正する第２対話例の実施形態について説明する。図６の例では、利用者Ｕについて、行動習慣推定部１６により、上記（２）の「利用者Ｕは、郷里の実家に帰る場合はいつも、金曜日の夕飯を食べた後に自宅を出発し、近くのスーパーＸＸａ町店に立寄ってから実家に向かう」という行動習慣が推定されている。 With reference to FIG. 6, an embodiment of a second dialogue example, in which an instruction element of the When genre (which includes the departure date and time and the like) in the first search condition is corrected, will be described. In the example of FIG. 6, the behavioral habit estimation unit 16 has estimated, for the user U, the behavioral habit (2) above: "Whenever the user U returns to the parents' home in the hometown, the user U leaves home after having dinner on Friday, stops by the nearby supermarket XX a-machi store, and then heads for the parents' home."

図６では、先ず、発話認識部１１により、「郷里の実家に帰る、何時に着く」という利用者Ｕの発話Ｖ２０（指示発話）が認識されている。指示候補決定部１２は、上記（２）の利用者Ｕの行動習慣から、利用者Ｕは、夕食を食べた後に実家に向けて自宅を出発すると推定する。そして、指示候補決定部１２は、第１探索条件として、「夕食後の２１時頃に自宅を出発し、スーパーＸＸのａ町店に立寄って実家に向かった場合の実家への到着時刻」を設定する。 In FIG. 6, the utterance recognition unit 11 first recognizes the utterance V20 (instruction utterance) of the user U, "I'm going back to my parents' home in my hometown, what time will I arrive?" From the behavioral habit of the user U in (2) above, the instruction candidate determination unit 12 estimates that the user U will leave home for the parents' home after having dinner. The instruction candidate determination unit 12 then sets, as the first search condition, "the arrival time at the parents' home when leaving home at around 21:00 after dinner, stopping by the a-machi store of supermarket XX, and heading for the parents' home".

所定処理実行部１４は、第１探索条件に従った第１探索処理実行して、実家に到着する予測日時を算出する。指示候補訂正部１３は、実家に到着する予測日時である「金曜日の２３時」の音声Ｗ２０を、スピーカー３２から出力する（第１候補報知）。第１候補報知に応じて、利用者Ｕによる「いつもの出発時刻じゃなくて、今からだよ」という発話Ｖ２１（訂正指示発話）が、発話認識部１１により認識される。指示候補訂正部１３は、訂正指示発話Ｖ２１が、Ｗｈｅｎジャンルの「いつもの出発時刻」を否定して「今から」への訂正を指示するものであるため、第１探索条件におけるＷｈｅｎジャンルの第１指示要素「いつもの出発時刻」を、訂正指示発話Ｖ２１により訂正が指示されたＷｈｅｎジャンルの「今から」に置き換えることによって、第１探索条件を訂正した第２探索条件を決定する。 The predetermined processing execution unit 14 executes the first search process according to the first search condition and calculates the predicted date and time of arrival at the parents' home. The instruction candidate correction unit 13 outputs the voice W20, "Friday, 23:00," which is the predicted date and time of arrival at the parents' home, from the speaker 32 (first candidate notification). In response to the first candidate notification, the utterance V21 (correction instruction utterance) by the user U, "Not the usual departure time, from now," is recognized by the utterance recognition unit 11. Because the correction instruction utterance V21 negates the "usual departure time" of the When genre and instructs a correction to "from now", the instruction candidate correction unit 13 determines a second search condition, which is a corrected version of the first search condition, by replacing the first instruction element "usual departure time" of the When genre in the first search condition with "from now" of the When genre, the correction instructed by the correction instruction utterance V21.
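As a rough illustration of how replacing only the When element changes the result, the sketch below recomputes a predicted arrival time from a departure time. The function name `estimate_arrival`, the fixed two-hour travel time (inferred from the 21:00 departure and 23:00 arrival in the example), the calendar date, and the assumed current time of 20:00 are all illustrative assumptions, not values stated as such in the patent.

```python
from datetime import datetime, timedelta


def estimate_arrival(departure, travel_time):
    """Predicted arrival is simply departure plus travel time (including any
    stopover), which is a simplification of a real route search."""
    return departure + travel_time


# Assumed travel time, including the stop at the supermarket.
travel_time = timedelta(hours=2)

# First search condition: the usual departure time (21:00 after dinner).
usual_departure = datetime(2019, 5, 10, 21, 0)
first_arrival = estimate_arrival(usual_departure, travel_time)

# Second search condition: the When element is replaced with "from now".
# The current time of 20:00 is an assumption made for this example.
now = datetime(2019, 5, 10, 20, 0)
second_arrival = estimate_arrival(now, travel_time)
```

Only the departure time differs between the two calls; the destination and stopover implied by the rest of the search condition are left as they were.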

指示候補訂正部１３は、「え、間違えましたか？」という音声Ｗ２１をスピーカー３２から出力し（誤り確認報知）、続いて「今からですよね」という音声Ｗ２２をスピーカー３２から出力する（訂正確認報知）。そして、所定処理実行部１４は、第２探索条件に従った第２探索処理を実行し、指示候補訂正部１３は、第２探索処理の実行状況（実行結果）である「今からですと、今日の２２時です」という音声Ｗ２３を出力する（第２候補報知）。 The instruction candidate correction unit 13 outputs the voice W21, "Oh, did I make a mistake?", from the speaker 32 (error confirmation notification), and then outputs the voice W22, "From now, right?", from the speaker 32 (correction confirmation notification). The predetermined processing execution unit 14 then executes the second search process according to the second search condition, and the instruction candidate correction unit 13 outputs the voice W23, "Leaving now, it will be 22:00 today," which is the execution status (execution result) of the second search process (second candidate notification).

［４．Ｈｏｗｍａｎｙ、Ｈｏｗｍｕｃｈジャンルの指示要素の修正］ [4. Correction of an instruction element of the How many / How much genre]
図７を参照して、第１探索条件におけるＨｏｗｍａｎｙ、Ｈｏｗｍｕｃｈジャンルの指示要素を訂正する第３対話例の実施形態について説明する。図７の例では、利用者Ｕについて、行動習慣推定部１６により、上記（３）の「利用者Ｕは、車で旅行する際は、ホテルで夕食を取らずに、ＡＡガイドブックに載っている、ホテル近くの二つ星のフランス料理レストランに寄って食事をする。」という行動習慣が推定されている。 With reference to FIG. 7, an embodiment of a third dialogue example, in which an instruction element of the How many / How much genre in the first search condition is corrected, will be described. In the example of FIG. 7, the behavioral habit estimation unit 16 has estimated, for the user U, the behavioral habit (3) above: "When traveling by car, the user U does not have dinner at the hotel but stops to eat at a two-star French restaurant near the hotel that is listed in the AA guidebook."

図７では、先ず、発話認識部１１により、「近所のレストランを探して」という利用者Ｕの発話Ｖ３０（指示発話）が認識されている。指示候補決定部１２は、上記（３）の行動習慣から、利用者Ｕは、いつものように、宿泊するホテルの近くの二つ星のフランス料理のレストランで夕食を取ると推定し、第１探索条件として、「宿泊先のホテル付近の二つ星のフランス料理レストラン」を決定する。 In FIG. 7, first, the utterance recognition unit 11 recognizes the utterance V30 (instruction utterance) of the user U, "search for a restaurant in the neighborhood". From the behavioral habit of (3) above, the instruction candidate determination unit 12 estimates that the user U will have dinner at a two-star French restaurant near the hotel where he / she is staying, as usual. As a search condition, "a two-star French restaurant near the hotel where you are staying" is decided.

所定処理実行部１４は、第１探索条件に従った第１探索処理を実行して、二つ星のフランス料理レストランを探索する。指示候補訂正部１３は、第１探索処理の実行状況である「近くに、二つ星のフランス料理店のＢＢレストランがあります」という音声Ｗ３０を、スピーカー３２から出力する（第１候補報知）。図７では、音声Ｗ３０に応じて、利用者Ｕによる「二つ星じゃなくて、今日は、三つ星で探して」という発話Ｖ３１（訂正指示発話）が、発話認識部１１により認識される。ここで、二つ星、三つ星等は、施設の評価ランクに相当する。 The predetermined processing execution unit 14 executes the first search processing according to the first search condition to search for a two-star French restaurant. The instruction candidate correction unit 13 outputs a voice W30 from the speaker 32, which is the execution status of the first search process, "There is a BB restaurant of a two-star French restaurant nearby" (first candidate notification). In FIG. 7, the utterance V31 (correction instruction utterance) by the user U, "Look for three stars today, not two stars," is recognized by the utterance recognition unit 11 in response to the voice W30. .. Here, two stars, three stars, etc. correspond to the evaluation rank of the facility.

指示候補訂正部１３は、訂正指示発話Ｖ３１により否定された「二つ星」がＨｏｗｍａｎｙ、Ｈｏｗｍｕｃｈジャンルであるため、第１探索条件におけるＨｏｗｍａｎｙ、Ｈｏｗｍｕｃｈジャンルの第１指示要素である「二つ星」を、訂正指示発話Ｖ３１により訂正が指示された第２指示要素「三つ星」に置き換えて、第２探索条件を決定する。第２探索条件は、「宿泊先のホテル付近の三つ星のフランス料理レストラン」となる。 In the instruction candidate correction unit 13, since the "two stars" denied by the correction instruction utterance V31 are the How many and How much genres, the instruction candidate correction unit 13 is the first instruction element of the How many and How much genres in the first search condition. The second search condition is determined by replacing "two stars" with the second instruction element "three stars" whose correction is instructed by the correction instruction utterance V31. The second search condition is "a three-star French restaurant near the hotel where you are staying".

指示候補訂正部１３は、「え、間違えましたか？」という音声Ｗ３１をスピーカー３２から出力し（誤り確認報知）、続いて「今日は、三つ星ですよね」という音声Ｗ３２をスピーカー３２から出力する（訂正確認報知）。そして、所定処理実行部１４は、第２探索条件に従って第２探索処理を実行し、指示候補訂正部１３は、第２探索処理の実行状況である「三つ星ですと、フランス料理店で、ホテルの中にＣＣレストランがあります」という音声Ｗ３３を、スピーカー３２から出力する（第２候補報知）。 The instruction candidate correction unit 13 outputs the voice W31, "Oh, did I make a mistake?", from the speaker 32 (error confirmation notification), and then outputs the voice W32, "Three stars today, right?", from the speaker 32 (correction confirmation notification). The predetermined processing execution unit 14 then executes the second search process according to the second search condition, and the instruction candidate correction unit 13 outputs from the speaker 32 the voice W33, "For three stars, there is a French restaurant, the CC restaurant, inside the hotel," which is the execution status of the second search process (second candidate notification).

［５．Ｗｈａｔジャンルの指示要素の訂正］ [5. Correction of an instruction element of the What genre]
図８を参照して、第１探索条件におけるＷｈａｔジャンルの指示要素を訂正する第４対話例の実施形態について説明する。図８の例では、利用者Ｕについて、行動習慣推定部１６により、上記（３）の「利用者Ｕは、車で旅行する際は、ホテルで夕食を取らずに、ＡＡガイドブックに載っている、ホテル近くの二つ星のフランス料理レストランに寄って食事をする。」という行動習慣が推定されている。 With reference to FIG. 8, an embodiment of a fourth dialogue example, in which an instruction element of the What genre in the first search condition is corrected, will be described. In the example of FIG. 8, the behavioral habit estimation unit 16 has estimated, for the user U, the behavioral habit (3) above: "When traveling by car, the user U does not have dinner at the hotel but stops to eat at a two-star French restaurant near the hotel that is listed in the AA guidebook."

図８では、先ず、発話認識部１１により、「近所のレストランを探して」という利用者Ｕの発話Ｖ４０（指示発話）が認識されている。指示候補決定部１２は、上記（３）の利用者の行動習慣から、利用者Ｕは、いつものように、宿泊するホテルの近くの二つ星のフランス料理のレストランで夕食を取ると推定し、第１探索条件として、「宿泊先のホテル付近の二つ星のフランス料理レストラン」を決定する。 In FIG. 8, first, the utterance recognition unit 11 recognizes the utterance V40 (instruction utterance) of the user U, "search for a restaurant in the neighborhood". From the behavioral habits of the user in (3) above, the instruction candidate determination unit 12 estimates that the user U will have dinner at a two-star French restaurant near the hotel where he / she will stay, as usual. , As the first search condition, "a two-star French restaurant near the hotel where you are staying" is determined.

所定処理実行部１４は、第１探索条件に従った第１探索処理を実行して、二つ星のフランス料理レストランを探索する。指示候補訂正部１３は、第１探索処理の実行状況である「近くに、二つ星のフランス料理のＤＤレストランがあります」という音声Ｗ４０を、スピーカー３２から出力する（第１候補報知）。図８では、音声Ｗ４０に応じて、利用者Ｕによる「フランス料理じゃなくて、今日は、イタリア料理で探して」という発話Ｖ４１（訂正指示発話）が、発話認識部１１により認識される。 The predetermined processing execution unit 14 executes the first search processing according to the first search condition to search for a two-star French restaurant. The instruction candidate correction unit 13 outputs a voice W40 from the speaker 32, which is the execution status of the first search process, "There is a two-star French restaurant DD restaurant nearby" (first candidate notification). In FIG. 8, the utterance V41 (correction instruction utterance) by the user U, "Look for Italian food today, not French food," is recognized by the utterance recognition unit 11 in response to the voice W40.

指示候補訂正部１３は、訂正指示発話Ｖ４１により否定された「フランス料理」がＷｈａｔジャンルであるため、第１探索条件におけるＷｈａｔジャンルの第１指示要素である「フランス料理」を、訂正指示発話Ｖ４１により訂正が指示された第２指示要素である「イタリア料理」に置き換えることにより、第１探索条件を訂正した第２探索条件を決定する。第２探索条件は、「宿泊先のホテル付近のイタリア料理のレストラン」となる。 Since the "French cuisine" denied by the correction instruction utterance V41 is the What genre, the instruction candidate correction unit 13 changes the "French cuisine" which is the first instruction element of the What genre in the first search condition to the correction instruction utterance V41. The second search condition in which the first search condition is corrected is determined by replacing it with "Italian cuisine", which is the second indicator element whose correction is instructed by. The second search condition is "Italian restaurant near the hotel where you are staying".

指示候補訂正部１３は、「え、間違えましたか？」という音声Ｗ４１をスピーカー３２から出力し（誤り確認報知）、続いて「今日は、イタリア料理ですよね」という音声Ｗ４２をスピーカー３２から出力する（訂正確認報知）。そして、所定処理実行部１４は、第２探索条件に従って第２探索処理を実行し、指示候補訂正部１３は、第２探索処理の実行状況である「イタリア料理で、ホテルの中に、ＥＥレストランがあります」という音声Ｗ４３をスピーカー３２から出力する（第２候補報知）。 The instruction candidate correction unit 13 outputs the voice W41, "Oh, did I make a mistake?", from the speaker 32 (error confirmation notification), and then outputs the voice W42, "Italian food today, right?", from the speaker 32 (correction confirmation notification). The predetermined processing execution unit 14 then executes the second search process according to the second search condition, and the instruction candidate correction unit 13 outputs from the speaker 32 the voice W43, "For Italian food, there is the EE restaurant inside the hotel," which is the execution status of the second search process (second candidate notification).

［６．Ｗｈｏジャンルの指示要素の訂正］ [6. Correction of an instruction element of the Who genre]
図９を参照して、第１探索条件におけるＷｈｏジャンルの指示要素を訂正する第５対話例の実施形態について説明する。図９の例では、利用者Ｕについて、行動習慣推定部１６により、上記（４）の「利用者Ｕは夫妻であり、妻は、平日は毎日、職場であるＦＦ市役所に車で通勤している。」という行動習慣が推定されている。 With reference to FIG. 9, an embodiment of a fifth dialogue example, in which an instruction element of the Who genre in the first search condition is corrected, will be described. In the example of FIG. 9, the behavioral habit estimation unit 16 has estimated, for the user U, the behavioral habit (4) above: "The user U is a married couple, and the wife commutes by car every weekday to her workplace, the FF city hall."

図９では、先ず、発話認識部１１により、「職場を探して」という利用者Ｕ（ここでは夫）の発話Ｖ５０（指示発話）が認識されている。指示候補決定部１２は、上記（４）の行動習慣から、利用者Ｕは、いつものように、妻の職場であるＦＦ市役所に向かうと推定し、第１探索条件として「妻の職場のＦＦ市役所」を決定する。 In FIG. 9, first, the utterance recognition unit 11 recognizes the utterance V50 (instruction utterance) of the user U (here, the husband) who says "search for a workplace". From the behavioral habit of (4) above, the instruction candidate determination unit 12 estimates that the user U heads for the FF city hall, which is the wife's workplace, as usual, and as the first search condition, "FF of the wife's workplace". "City hall" is decided.

所定処理実行部１４は、第１探索条件に従った第１探索処理を実行して、ＦＦ市役所への経路を探索する。指示候補訂正部１３は、第１探索処理の実行内容である「職場の、ＦＦ市役所を探しています」という音声Ｗ５０を、スピーカー３２から出力する（第１候補報知）。花子は妻の名前である。図９では、音声Ｗ５０の出力に応じて、利用者Ｕ（ここでは夫）による「妻の職場じゃなくて、今日は、私の職場を探して」という発話Ｖ５１（訂正指示発話）が、発話認識部１１により認識される。 The predetermined processing execution unit 14 executes the first search process according to the first search condition to search for a route to the FF city hall. The instruction candidate correction unit 13 outputs from the speaker 32 the voice W50, "Searching for the workplace, the FF city hall," which is the content of the first search process being executed (first candidate notification). Hanako is the wife's name. In FIG. 9, in response to the output of the voice W50, the utterance V51 (correction instruction utterance) by the user U (here, the husband), "Not my wife's workplace; today, search for my workplace," is recognized by the utterance recognition unit 11.

ここで、指示候補訂正部１３は、妻による発話と夫による発話は、利用者データ２２に保存された妻と夫の声紋による生体認証によって識別する。妻と夫は、利用者Ｕが複数である場合の識別情報に相当する。 Here, the instruction candidate correction unit 13 identifies the utterance by the wife and the utterance by the husband by biometric authentication by the voiceprints of the wife and the husband stored in the user data 22. The wife and husband correspond to the identification information when there are a plurality of users U.

指示候補訂正部１３は、訂正指示発話Ｖ５１により否定された「妻の」がＷｈｏジャンルであるため、第１探索条件のＷｈｏジャンルの指示要素である「妻の」を、訂正指示発話Ｖ５１により訂正された「私（夫）」に置き換えて訂正することにより、第２探索条件を決定する。第２探索条件は「夫の職場」となる。 Since the "wife's" denied by the correction instruction utterance V51 is the Who genre, the instruction candidate correction unit 13 corrects the "wife's" which is the instruction element of the Who genre of the first search condition by the correction instruction utterance V51. The second search condition is determined by replacing it with "I (husband)" and correcting it. The second search condition is "husband's workplace".

指示候補訂正部１３は、「え、間違えましたか？」という音声Ｗ５１をスピーカー３２から出力し（誤り確認報知）、続いて「今日は、和夫さんの職場ですよね」という音声Ｗ５２をスピーカー３２から出力する（訂正確認報知）。和夫は夫の名前である。そして、所定処理実行部１４は、第２探索条件に従って第２探索処理を実行し、指示候補訂正部１３は、第２探索処理の実行状況である「和夫さんの職場を探せました」という音声Ｗ５３を、スピーカー３２から出力する（第２候補報知）。 The instruction candidate correction unit 13 outputs the voice W51, "Oh, did I make a mistake?", from the speaker 32 (error confirmation notification), and then outputs the voice W52, "Today it is Kazuo's workplace, right?", from the speaker 32 (correction confirmation notification). Kazuo is the husband's name. The predetermined processing execution unit 14 then executes the second search process according to the second search condition, and the instruction candidate correction unit 13 outputs from the speaker 32 the voice W53, "I found Kazuo's workplace," which is the execution status of the second search process (second candidate notification).

［７．Ｗｈｅｒｅジャンルの指示要素の訂正］ [7. Correction of an instruction element of the Where genre]
図１０を参照して、第１探索条件におけるＷｈｅｒｅジャンルの指示要素を訂正する第６対話例の実施形態について説明する。図１０の例では、利用者Ｕについて、行動習慣推定部１６により、上記（５）の「利用者ＵはＧＧ球団のファンであり、年に一度程度、地元の清原球場で行われるＧＧ球団と他球団との試合を、毎回観戦している。」という行動習慣が推定されている。 With reference to FIG. 10, an embodiment of a sixth dialogue example, in which an instruction element of the Where genre in the first search condition is corrected, will be described. In the example of FIG. 10, the behavioral habit estimation unit 16 has estimated, for the user U, the behavioral habit (5) above: "The user U is a fan of the GG team and, every time, watches the games between the GG team and other teams that are held about once a year at the local Kiyohara baseball stadium."

図１０では、先ず、発話認識部１１により、「ＧＧ球団の試合を観に行きたい」という利用者Ｕの発話Ｖ６０（指示発話）が認識されている。指示候補決定部１２は、上記（５）の行動習慣から、「利用者Ｕが、今日、清原球場で開催される、１８：００開場、１８：３０開始の、ＧＧ球団とＪＪ球団との試合を見に行き、利用者ＵはＧＧ球団のファンであるから、ホーム球団であるＧＧ球団のベンチ側の一塁側スタンドで観戦する」と推定する。そして、指示候補決定部１２は、第１探索条件として、「清原球場の１塁側近くの駐車場」を決定する。 In FIG. 10, first, the utterance recognition unit 11 recognizes the utterance V60 (instruction utterance) of the user U who wants to go to see the game of the GG team. Based on the behavioral habit of (5) above, the instruction candidate determination unit 12 said, "User U will be held at Kiyohara Stadium today, opening at 18:00 and starting at 18:30, a match between the GG team and the JJ team. Since the user U is a fan of the GG team, he will watch the game on the first base side stand on the bench side of the GG team, which is the home team. " Then, the instruction candidate determination unit 12 determines "a parking lot near the first base side of Kiyohara Baseball Stadium" as the first search condition.

所定処理実行部１４は、第１探索条件に従った第１探索処理を実行して、現在地から清原球場の１塁側近くの駐車場までの経路を探索する。指示候補訂正部１３は、第１探索処理の実行状況である「清原球場の１塁側近くの駐車場に、１７時３０分に到着します」という音声Ｗ６０を、スピーカー３２から出力する（第１候補報知）。図１０では、音声Ｗ６０の出力に応じて、利用者Ｕによる「いやいや、ＨＨ焼き肉店で、ＩＩテレビで観る」という発話Ｖ６１（訂正指示発話）が、発話認識部１１により認識されるている。 The predetermined processing execution unit 14 executes the first search processing according to the first search condition to search for a route from the current location to the parking lot near the first base side of Kiyohara Baseball Stadium. The instruction candidate correction unit 13 outputs a voice W60 from the speaker 32, which is the execution status of the first search process, "Arrive at the parking lot near the 1st base side of Kiyohara Baseball Stadium at 17:30" (No. 1). 1 candidate notification). In FIG. 10, the utterance V61 (correction instruction utterance) by the user U, "No, I watch it on II TV at the HH yakiniku restaurant," is recognized by the utterance recognition unit 11 in response to the output of the voice W60.

指示候補訂正部１３は、訂正指示発話Ｖ６１により訂正された「ＨＨ焼き肉店」がＷｈｅｒｅジャンルであるため、第１探索条件におけるＷｈｅｒｅジャンルの指示要素である第１指示要素「清原球場の１塁側近くの駐車場」を、訂正指示発話Ｖ６１により訂正が指示された「ＨＨ焼き肉店」に置き換えることにより、第１探索条件を訂正した第２探索条件を決定する。第２探索条件は、「ＨＨ焼肉店」となる。 In the instruction candidate correction unit 13, since the "HH yakiniku restaurant" corrected by the correction instruction utterance V61 is the Where genre, the first instruction element "the first base side of the Kiyohara stadium" which is the instruction element of the Where genre in the first search condition. By replacing "a nearby parking lot" with "HH yakiniku restaurant" whose correction was instructed by the correction instruction utterance V61, the second search condition in which the first search condition was corrected is determined. The second search condition is "HH yakiniku restaurant".

指示候補訂正部１３は、「え、間違えましたか？」という音声Ｗ６１をスピーカー３２から出力し（誤り確認報知）、続いて「ＨＨ焼肉店ですよね」という音声Ｗ６２をスピーカー３２から出力する（訂正確認報知）。そして、所定処理実行部１４は、第２探索条件に従って第２探索処理を実行し、指示候補訂正部１３は、第２探索処理の実行状況である「ＨＨ焼き肉店に、１７時５０分に到着します」という音声Ｗ６３を、スピーカー３２から出力する（第２候補報知）。 The instruction candidate correction unit 13 outputs the voice W61, "Oh, did I make a mistake?", from the speaker 32 (error confirmation notification), and then outputs the voice W62, "The HH yakiniku restaurant, right?", from the speaker 32 (correction confirmation notification). The predetermined processing execution unit 14 then executes the second search process according to the second search condition, and the instruction candidate correction unit 13 outputs from the speaker 32 the voice W63, "You will arrive at the HH yakiniku restaurant at 17:50," which is the execution status of the second search process (second candidate notification).

［８．他の実施形態］ [8. Other embodiments]
上記実施形態では、音声操作システム２をナビゲーション装置１の機能の一部として構成したが、音声操作システム２を家電製品等の他の種類の装置の一部として構成してもよく、或いは専用装置として構成してもよい。 In the above embodiment, the voice operation system 2 is configured as part of the functions of the navigation device 1, but the voice operation system 2 may be configured as part of another type of device, such as a home appliance, or may be configured as a dedicated device.

上記実施形態では、利用者Ｕによる指示発話として、ナビゲーション装置１に対する目的地の探索条件を指示する発話を例示したが、他の内容に関する指示発話であってもよい。例えば、車両に備えられた空調装置の運転条件、家電製品の操作、セキュリティ設備の作動等を音声操作により指示する場合に、本発明を適用して指示内容の修正操作を容易にすることができる。 In the above embodiment, an utterance specifying destination search conditions to the navigation device 1 was given as an example of an instruction utterance by the user U, but the instruction utterance may concern other content. For example, when the operating conditions of an air conditioner provided in a vehicle, the operation of home appliances, the activation of security equipment, or the like are instructed by voice operation, the present invention can be applied to make it easier to correct the instruction content.

また、上記実施形態では、音声操作システム２を、ナビゲーション装置１の機能の一部として構成したが、音声操作システム２を、例えば、ラジオ受信機の音声操作部として構成してもよい。この場合、音声操作システム２は、利用者Ｕによる「ラジオをつけて」のみの発話に対して、受信するラジオ局（放送局名やチャンネル名等により、ＦＭ、ＡＭ、衛星等の放送周波数が特定される）を、利用者の行動習慣に基づいて、発話がなされた時間帯に利用者Ｕがよく聴くラジオ放送局に決定するようにしてもよい。また、この場合に、音声操作システム２は、利用者の行動習慣に基づいて、平日と休日で異なるラジオ局を決定するようにしてもよい。 Further, in the above embodiment, the voice operation system 2 is configured as part of the functions of the navigation device 1, but the voice operation system 2 may be configured, for example, as the voice operation unit of a radio receiver. In this case, in response to an utterance by the user U consisting only of "Turn on the radio," the voice operation system 2 may determine the radio station to be received (the FM, AM, satellite, or other broadcast frequency is identified by the station name, channel name, or the like) to be the radio station that the user U often listens to in the time period in which the utterance was made, based on the behavioral habits of the user. Also in this case, the voice operation system 2 may determine different radio stations for weekdays and holidays based on the behavioral habits of the user.
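One possible way to realize the habit-based station selection described above is sketched below. The listening log format, the function name `pick_station`, the hour-level matching of the time period, and the treatment of Saturdays and Sundays as holidays are assumptions made for this example; the patent does not specify how the habit is derived.

```python
from collections import Counter
from datetime import datetime


def pick_station(listening_log, now, holidays=frozenset()):
    """Return the station most often listened to at the same hour of day and
    on the same kind of day (weekday or holiday) as `now`, or None."""

    def day_type(moment):
        # Assumption: weekends and any listed dates count as holidays.
        if moment.weekday() >= 5 or moment.date() in holidays:
            return "holiday"
        return "weekday"

    counts = Counter(
        station
        for timestamp, station in listening_log
        if timestamp.hour == now.hour and day_type(timestamp) == day_type(now)
    )
    return counts.most_common(1)[0][0] if counts else None


# Hypothetical listening history as (timestamp, station) pairs.
log = [
    (datetime(2019, 5, 6, 8, 10), "FM-A"),   # Monday morning
    (datetime(2019, 5, 7, 8, 30), "FM-A"),   # Tuesday morning
    (datetime(2019, 5, 7, 8, 45), "AM-B"),   # Tuesday morning
    (datetime(2019, 5, 11, 8, 0), "FM-C"),   # Saturday morning
]
weekday_choice = pick_station(log, datetime(2019, 5, 8, 8, 5))
holiday_choice = pick_station(log, datetime(2019, 5, 12, 8, 20))
```

If the selected station is not what the user wanted, the same correction dialogue as in the navigation examples could then replace only the station element.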

上記実施形態では、指示候補訂正部１３は、第１候補報知を行ってから第２候補報知を行うまでの間に、誤り確認報知と訂正確認報知を行ったが、誤り確認報知と訂正確認報知とのうちのいずれか一方のみを行ってもよく、両報知を省略してもよい。 In the above embodiment, the instruction candidate correction unit 13 performs the error confirmation notification and the correction confirmation notification between the first candidate notification and the second candidate notification. However, only one of the error confirmation notification and the correction confirmation notification may be performed, or both notifications may be omitted.

また、音声操作システム２の構成を、操作支援サーバー１１０に備えてもよい。この場合、操作支援サーバー１１０は、ナビゲーション装置１から送信される利用者Ｕの発話データを受信して発話内容を認識し、指示発話と利用者の行動習慣とに基づいて第１指示候補を決定する。また、操作支援サーバー１１０は、訂正指示発話による指示に応じて第１指示指示を訂正することにより、第２指示候補を決定する。そして、操作支援サーバー１１０は、第１候補指示及び第２候補指示の情報を、ナビゲーション装置１に送信する構成となる。 The configuration of the voice operation system 2 may also be provided in the operation support server 110. In this case, the operation support server 110 receives the utterance data of the user U transmitted from the navigation device 1, recognizes the utterance content, and determines the first instruction candidate based on the instruction utterance and the behavioral habits of the user. The operation support server 110 also determines the second instruction candidate by correcting the first instruction candidate in accordance with the instruction given by the correction instruction utterance. The operation support server 110 is then configured to transmit the information on the first instruction candidate and the second instruction candidate to the navigation device 1.

上記実施形態では、ナビゲーション装置１に備えられたマイク３１により利用者Ｕの発話を入力し、ナビゲーション装置１に備えられたスピーカー３２から、音声操作システム２による応答音声を出力した。他の構成として、利用者端末９０に備えられたマイク（図示しない）により利用者Ｕの発話を入力して、発話データを利用者端末９０からナビゲーション装置１に送信するようにしてもよい。また、ナビゲーション装置１から利用者端末９０に、応答音声データを送信して、利用者端末９０のスピーカー（図示しない）から、応答音声を出力するようにしてもよい。 In the above embodiment, the utterance of the user U is input by the microphone 31 provided in the navigation device 1, and the response voice by the voice operation system 2 is output from the speaker 32 provided in the navigation device 1. As another configuration, the utterance of the user U may be input by a microphone (not shown) provided in the user terminal 90, and the utterance data may be transmitted from the user terminal 90 to the navigation device 1. Further, the response voice data may be transmitted from the navigation device 1 to the user terminal 90, and the response voice may be output from the speaker (not shown) of the user terminal 90.

上記実施形態では、行動習慣推定部１６を備えて、指示候補決定部１２は、行動習慣推定部１６により推定された利用者Ｕの行動習慣に基づいて、第１探索条件を決定したが、行動習慣推定部１６を省略した構成としてもよい。 In the above embodiment, the behavior habit estimation unit 16 is provided, and the instruction candidate determination unit 12 determines the first search condition based on the behavior habit of the user U estimated by the behavior habit estimation unit 16, but the behavior The habit estimation unit 16 may be omitted.

上記実施形態では、指示候補決定部１２をＡＩエンジンを用いて構成したが、ＡＩエンジンを用いない構成としてもよい。 In the above embodiment, the instruction candidate determination unit 12 is configured by using the AI engine, but it may be configured not by using the AI engine.

Note that FIG. 1 is a schematic diagram showing the functional configuration of the voice operation system 2 divided according to the main processing contents in order to facilitate understanding of the present invention; the voice operation system 2 may instead be divided in other ways. The processing of each component may be executed by one hardware unit or by a plurality of hardware units. Likewise, the processing of each component may be executed by one program or by a plurality of programs.

1 ... Navigation device, 2 ... Voice operation system, 10 ... CPU, 11 ... Utterance recognition unit, 12 ... Instruction candidate determination unit, 13 ... Instruction candidate correction unit, 14 ... Predetermined processing execution unit, 15 ... Action history storage unit, 16 ... Behavior habit estimation unit, 20 ... Memory, 21 ... Control program, 22 ... User data, 23 ... Map data, 30 ... Communication unit, 31 ... Microphone, 32 ... Speaker, 33 ... Touch panel, 34 ... Switch, 35 ... GPS unit, 90 ... User terminal, 100 ... Communication network, 110 ... Operation support server, U ... User.

Claims

A voice operation system comprising:
an utterance recognition unit that recognizes a user's utterance;
an instruction candidate determination unit that determines a first instruction candidate by recognizing or estimating the content of the user's instruction based on an instruction utterance of the user recognized by the utterance recognition unit; and
an instruction candidate correction unit that performs a first candidate notification that outputs by voice the first instruction candidate or the execution status of a first predetermined process corresponding to the first instruction candidate, that, when a correction instruction utterance instructing correction of the instruction content of the first instruction candidate is recognized by the utterance recognition unit in response to the first candidate notification, determines a second instruction candidate in which the first instruction candidate is corrected according to the instruction given by the correction instruction utterance, and that performs a second candidate notification that outputs by voice the content of the second instruction candidate or the execution status of a second predetermined process corresponding to the second instruction candidate.

The voice operation system according to claim 1, wherein, when the first instruction candidate includes a first instruction element of a predetermined genre and the correction instruction utterance includes a second instruction element of the predetermined genre, the instruction candidate correction unit determines the second instruction candidate by correcting the first instruction element based on the second instruction element.

The voice operation system according to claim 2, wherein the voice operation system is used to specify search conditions for a destination in a navigation device,
the instruction candidate determination unit determines a first search condition for the destination as the first instruction candidate,
the instruction candidate correction unit determines, as the second instruction candidate, a second search condition in which the first search condition is corrected according to the instruction given by the correction instruction utterance, and the predetermined genre is any one of the location of the destination, the departure date and time for the destination, the evaluation rank of the facility serving as the destination, the type of the facility serving as the destination, and, when there are a plurality of users, identification information of the user.

The voice operation system according to any one of claims 1 to 3, wherein, between the time the correction instruction utterance is recognized by the utterance recognition unit and the time the second candidate notification is performed, the instruction candidate correction unit performs an error confirmation notification that notifies by voice that the first instruction candidate differed from the instruction content intended by the user.

The voice operation system according to any one of claims 1 to 4, wherein, between the time the correction instruction utterance is recognized by the utterance recognition unit and the time the second candidate notification is performed, the instruction candidate correction unit performs a correction confirmation notification that notifies by voice the content of the correction made to the first instruction candidate.

The voice operation system according to any one of claims 1 to 5, further comprising a behavior habit estimation unit that estimates the behavioral habits of the user,
wherein, when the instruction content intended by the user cannot be identified from the instruction utterance, the instruction candidate determination unit determines the first instruction candidate based on an instruction element included in the instruction utterance and the behavioral habits of the user estimated by the behavior habit estimation unit.

A voice operation control method executed by a single computer or a plurality of computers having an utterance recognition unit that recognizes a user's utterance, the method comprising:
an utterance recognition step of recognizing the user's utterance with the utterance recognition unit;
a first instruction candidate determination step of determining a first instruction candidate by recognizing or estimating the content of the user's instruction based on an instruction utterance of the user recognized in the utterance recognition step;
a first candidate notification step of outputting by voice the first instruction candidate or the execution status of a first predetermined process corresponding to the first instruction candidate;
a second instruction candidate determination step of determining, when a correction instruction utterance instructing correction of the instruction content of the first instruction candidate is recognized by the utterance recognition unit in response to the first candidate notification step, a second instruction candidate in which the first instruction candidate is corrected according to the instruction given by the correction instruction utterance; and
a second candidate notification step of outputting by voice the content of the second instruction candidate or the execution status of a second predetermined process corresponding to the second instruction candidate.

A voice operation control program that is installed on a single computer or a plurality of computers and causes the computer or computers to execute:
an utterance recognition process of recognizing a user's utterance;
a first instruction candidate determination process of determining a first instruction candidate by recognizing or estimating the content of the user's instruction based on an instruction utterance of the user recognized in the utterance recognition process;
a first candidate notification process of outputting by voice the first instruction candidate or the execution status of a first predetermined process corresponding to the first instruction candidate;
a second instruction candidate determination process of determining, when a correction instruction utterance instructing correction of the instruction content of the first instruction candidate is recognized by the utterance recognition process in response to the first candidate notification process, a second instruction candidate in which the first instruction candidate is corrected according to the instruction given by the correction instruction utterance; and
a second candidate notification process of outputting by voice the content of the second instruction candidate or the execution status of a second predetermined process corresponding to the second instruction candidate.