JP2019074498A

JP2019074498A - Drive supporting device

Info

Publication number: JP2019074498A
Application number: JP2017202838A
Authority: JP
Inventors: Susumu Osuga; Hiroyuki Morisaki; Kazuhisa Nagaishi; Norihide Kitaoka; Tetsutsugu Tamura
Original assignee: Aisin Seiki Co Ltd
Current assignee: Aisin Corp
Priority date: 2017-10-19
Filing date: 2017-10-19
Publication date: 2019-05-16

Abstract

To provide a driving support device that enables smoother dialogue with an occupant. SOLUTION: The driving support device according to an embodiment includes: a voice output unit that outputs voice to an occupant; an acquisition unit that acquires an image of the occupant contained in an image captured by an imaging device that images the vehicle interior; and a determination unit that determines, based on the action or state of the occupant contained in the acquisition result of the acquisition unit, whether to interrupt the voice output. The voice output unit interrupts the voice output when the determination unit determines that the output is to be interrupted. SELECTED DRAWING: Figure 4

Description

An embodiment of the present invention relates to a driving support device.

Conventionally, driving support devices are known that hold a dialogue with an occupant (user) by outputting voice guidance and performing voice recognition on the occupant's response. Also known, for systems that interact with a user, is a technique that allows the user to barge in (interrupt) by voice while the system is still outputting voice guidance.

JP 2004-163541 A

In the prior art, however, it has been desired that the dialogue between the occupant and the driving support device proceed more smoothly.

As an example, a driving support device according to an embodiment of the present invention includes: a voice output unit that outputs voice to an occupant; an acquisition unit that acquires an image of the occupant contained in an image captured by an imaging device that images the vehicle interior; and a determination unit that determines whether to interrupt the voice output, based on the action or state of the occupant contained in the acquisition result of the acquisition unit. The voice output unit interrupts the voice output when the determination unit determines that the voice output is to be interrupted. Because the driving support device of the embodiment interrupts the voice output based on the occupant's action or state, it can, for example, interact with the occupant more smoothly.

In the above driving support device, as an example, the predetermined action is any one of: an action of speaking, an action expressing approval, an action expressing denial, an action expressing an intention to hold, or an action expressing that the output voice could not be heard. By interrupting the voice in light of the occupant's intention, the driving support device according to the embodiment can, for example, suppress output of voice that the occupant does not need. The driving support device according to the embodiment can thereby interact with the occupant more smoothly.

As an example, the above driving support device further includes a dialogue control unit that, when the occupant is determined to be performing an action expressing approval, executes processing corresponding to approval of the output voice. The driving support device according to the embodiment can therefore, for example, confirm the occupant's intention and then promptly start subsequent processing in line with that intention. As a result, it can, for example, shorten the time the dialogue requires and interact with the occupant more smoothly.

As an example, the above driving support device further includes a dialogue control unit that, when the occupant is determined to be performing an action expressing denial, executes processing corresponding to denial of the output voice. The driving support device according to the embodiment can therefore, for example, confirm the occupant's intention and then promptly start subsequent processing in line with that intention. As a result, it can, for example, shorten the time the dialogue requires and interact with the occupant more smoothly.

As an example, the above driving support device further includes a voice recognition unit that recognizes the voice the occupant utters in response to the voice output by the voice output unit. When the occupant is determined to be performing an action expressing an intention to hold, the voice recognition unit extends the recognition time for the occupant's utterance. The driving support device according to the embodiment can therefore wait until the occupant makes a decision and conduct the dialogue at the pace at which the occupant answers.

In the above driving support device, as an example, when the occupant is determined to be performing an action expressing an intention to hold, the voice output unit outputs voice different from the voice it was outputting when that determination was made. By presenting options that differ from the content of the voice originally output, the driving support device according to the embodiment can support the occupant's decision making.

In the above driving support device, as an example, when the occupant is determined to be performing an action expressing that the voice could not be heard, the voice output unit outputs again the voice it was outputting when that determination was made. By outputting again the voice the occupant could not hear, the driving support device according to the embodiment can conduct the dialogue with the occupant smoothly.

FIG. 1 is an exemplary perspective view showing a state in which part of the passenger compartment of a vehicle according to the embodiment is seen through.
FIG. 2 is a diagram showing an example of the arrangement of the imaging device according to the embodiment.
FIG. 3 is a diagram showing an example of the hardware configuration of the driving support system according to the embodiment.
FIG. 4 is a block diagram showing an example of the functions of the ECU according to the embodiment.
FIG. 5 is a flowchart showing an example of the procedure of the dialogue control processing according to the embodiment.
FIG. 6 is a diagram showing an example of voice guidance by the ECU and dialogue with the occupant according to the embodiment.
FIG. 7 is a diagram showing an example of an action expressing the occupant's approval according to the embodiment.
FIG. 8 is a diagram showing an example of an action expressing the occupant's denial according to the embodiment.
FIG. 9 is a diagram showing an example of the occupant's speech action according to the embodiment.
FIG. 10 is a diagram showing an example of an action expressing that the output voice could not be heard according to the embodiment.
FIG. 11 is a diagram showing an example of processing corresponding to a hold according to Modification 4.

In the present embodiment, the driving support device interrupts voice output in response to the occupant's action, which allows the occupant and the driving support device to interact more smoothly. The following description uses an example in which the driving support device of the present embodiment is mounted in a vehicle.

FIG. 1 is an exemplary perspective view showing a state in which part of a passenger compartment 2a of a vehicle 1 according to the present embodiment is seen through. The vehicle 1 may be, for example, an internal combustion engine vehicle, an electric vehicle, a fuel cell vehicle, a hybrid vehicle, or the like, or a vehicle with another drive source. The vehicle 1 can be equipped with various transmissions and with the various devices, such as systems and components, needed to drive an internal combustion engine or an electric motor.

As illustrated in FIG. 1, the vehicle 1 is, for example, a four-wheeled vehicle with two left and right front wheels 3F and two left and right rear wheels 3R. All four wheels 3 may be configured to be steerable. The type, number, layout, and so on of the devices involved in driving the wheels 3 of the vehicle 1 can be set in various ways.

As illustrated in FIG. 1, a vehicle body 2 forms the passenger compartment 2a in which an occupant (not shown) rides. In the passenger compartment 2a, a steering unit 4, an acceleration operation unit 5, a braking operation unit 6, a shift operation unit 7, and the like are provided facing the seat 40a (driver's seat) of the driver, who is an occupant.

The steering unit 4 is, for example, a steering wheel protruding from a dashboard 12. The acceleration operation unit 5 is, for example, an accelerator pedal located at the driver's feet. The braking operation unit 6 is, for example, a brake pedal located at the driver's feet. The shift operation unit 7 is, for example, a shift lever protruding from the center console. The steering unit 4, the acceleration operation unit 5, the braking operation unit 6, and the shift operation unit 7 are not limited to these examples.

A monitor device 11 is also provided in the passenger compartment 2a. The monitor device 11 has a display device (shown in FIG. 3) and a voice output device (shown in FIG. 3). The voice output device is, for example, a speaker. The display device is, for example, an LCD (liquid crystal display) or an OELD (organic electroluminescent display). The display device is covered with a transparent operation input unit (shown in FIG. 3), such as a touch panel. A voice output device (not shown) may also be provided at another position in the passenger compartment 2a, separate from the monitor device 11. The monitor device 11 may double as, for example, a navigation system or an audio system.

A voice input unit 24 is provided on the ceiling of the vehicle body 2. The voice input unit 24 is, for example, a microphone, and can receive the voice of an occupant in the passenger compartment 2a. The installation position of the voice input unit 24 shown in FIG. 1 is an example and is not limiting.

An imaging device 15 that images the interior of the vehicle 1 is installed on a steering column (shown in FIG. 2) connecting the steering unit 4 and the dashboard 12. The imaging device 15 is, for example, a CCD (Charge Coupled Device) camera.

FIG. 2 is a diagram showing an example of the arrangement of the imaging device 15 according to the present embodiment. In the present embodiment, the imaging device 15 is installed on the steering column 41. The viewing angle and orientation of the imaging device 15 are adjusted so that the face of an occupant 42 seated in the seat 40a is at the center of the field of view. The imaging device 15 images the interior of the vehicle 1 and sequentially outputs the captured images, that is, the image data obtained by imaging, to the ECU described later.

The installation position of the imaging device 15 shown in FIG. 2 is an example and is not limiting. For example, the imaging device 15 may be a wide-angle camera provided on top of the dashboard 12, on the monitor device 11, or elsewhere.

FIG. 3 is a diagram showing an example of the hardware configuration of a driving support system 10 according to the present embodiment. As shown in FIG. 3, in the driving support system 10 mounted in the vehicle 1, an ECU 14, the monitor device 11, and the like, as well as a brake system 18, a steering angle sensor 19 (angle sensor), an accelerator sensor 20, a shift sensor 21, a wheel speed sensor 22, a steering system 13, the voice input unit 24, and the like, are electrically connected via an in-vehicle network 23 serving as a telecommunication line. The in-vehicle network 23 is configured as, for example, a CAN (controller area network). The voice input unit 24 may instead be connected directly to the ECU 14 without going through the in-vehicle network 23.

The ECU 14 can execute various kinds of arithmetic processing and control each component of the driving support system 10. More specifically, the ECU 14 controls the steering system 13, the brake system 18, and the like by sending control signals through the in-vehicle network 23. The ECU 14 also controls the display device 8a and the voice output device 8b included in the monitor device 11. Via the in-vehicle network 23, the ECU 14 receives the detection results of a torque sensor 13b, a brake sensor 18b, the steering angle sensor 19, the accelerator sensor 20, the shift sensor 21, the wheel speed sensor 22, the voice input unit 24, and the like, as well as instruction signals (control signals, operation signals, input signals, data) from the operation input unit 8c and the like. The ECU 14 also acquires captured images from the imaging device 15. The ECU 14 is an example of the driving support device in the present embodiment.

The ECU 14 has, for example, a CPU 14a (central processing unit), a ROM 14b (read only memory), a RAM 14c (random access memory), a display control unit 14d, and an SSD 14f (solid state drive, flash memory).

The CPU 14a reads a program installed and stored in a nonvolatile storage device such as the ROM 14b and executes arithmetic processing according to that program. Of the arithmetic processing in the ECU 14, the display control unit 14d mainly handles image processing that uses the image data obtained by the imaging device 15, composition of the image data displayed on the display device 8a, and the like.

The steering system 13 steers at least two of the wheels 3. The steering system 13 has an actuator 13a and the torque sensor 13b. The brake system 18 has an actuator 18a and the brake sensor 18b. The brake system 18 applies braking force to the wheels 3, and thus to the vehicle 1, via the actuator 18a.

The configurations, arrangement, electrical connections, and so on of the sensors and actuators described above are examples and can be set (changed) in various ways.

FIG. 4 is a block diagram showing an example of the functions of the ECU 14 according to the present embodiment. As shown in FIG. 4, the ECU 14 includes a storage unit 140, an acquisition unit 141, a determination unit 142, a voice recognition unit 143, a voice output unit 144, and a dialogue control unit 145.

The acquisition unit 141, the determination unit 142, the voice recognition unit 143, the voice output unit 144, and the dialogue control unit 145 are each realized by the CPU 14a executing a program stored in the ROM 14b. These components may instead be realized by hardware circuits.

The storage unit 140 stores, among other things, the text of the voice guidance output by the voice output unit 144 described later. The storage unit 140 is implemented by, for example, a storage device such as the SSD 14f.

The acquisition unit 141 acquires the image of the occupant 42 contained in an image captured by the imaging device 15. More specifically, the acquisition unit 141 acquires a captured image from the imaging device 15 and extracts (acquires) the image of the occupant 42 contained in that captured image by image processing. In the present embodiment, the image of the occupant 42 is referred to as the acquisition result of the acquisition unit 141. A known technique can be used to obtain the image of the occupant 42 from the captured image.

The determination unit 142 determines whether to interrupt the voice output by the voice output unit 144, described later, based on the action or state of the occupant 42 contained in the acquisition result of the acquisition unit 141.

More specifically, the determination unit 142 determines the action or state of the occupant 42 from the image of the occupant 42 contained in the captured image acquired by the acquisition unit 141. For example, the determination unit 142 determines whether the occupant 42 is performing any of the following: an action of speaking, an action expressing approval, an action expressing denial, an action expressing an intention to hold, or an action expressing that the output voice could not be heard. Hereinafter, the action of speaking is called the "speech action", the action expressing approval the "approval action", the action expressing denial the "denial action", the action expressing an intention to hold the "hold action", and the action expressing that the output voice could not be heard the "action expressing inability to hear". These five actions are collectively called the predetermined actions.

The speech action is, for example, opening the mouth. The vehicle interior generally contains noise such as sound from the radio, but because the determination unit 142 determines from the captured image whether the occupant 42 is performing the speech action, it is less likely to mistake such noise for an utterance by the occupant 42. The determination unit 142 of the present embodiment can therefore determine with high accuracy whether the occupant 42 is speaking. Moreover, because the determination unit 142 determines that the speech action has occurred at the moment the occupant 42 moves the mouth to speak, it can recognize the utterance of the occupant 42 at an earlier stage.

The approval action is, for example, nodding the head up and down, and indicates that the occupant 42 has approved the content of the voice output by the voice output unit 144 described later.

The denial action is, for example, shaking the head from side to side, and indicates that the occupant 42 has rejected the content of the voice output by the voice output unit 144 described later.

The hold action is, for example, tilting the head. Facial movements such as grimacing or knitting the eyebrows may also be treated as hold actions. The hold action indicates that the occupant 42 has not yet decided whether to approve or reject the voice output by the voice output unit 144 described later.

The action expressing inability to hear is, for example, opening the eyes wide, and indicates that the occupant 42 could not hear the voice output by the voice output unit 144 described later.

When the determination unit 142 determines from the image of the occupant 42 that the occupant 42 is performing any of the speech action, the approval action, the denial action, the hold action, or the action expressing inability to hear, it determines that the voice output is to be interrupted. The specific content of these actions is not limited to the examples above. The determination unit 142 may also judge the intention of the occupant 42 based not only on actions but also on states such as the posture of the occupant 42, and determine on that basis whether to interrupt the voice output.
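The interruption rule described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the gesture labels (`"nod"`, `"head_tilt"`, and so on) and the existence of an upstream classifier that produces them are assumptions made for the example, as are the names `Action`, `classify`, and `should_interrupt`.

```python
from enum import Enum, auto
from typing import Optional


class Action(Enum):
    """The five predetermined actions named in the description."""
    SPEECH = auto()        # e.g. opening the mouth
    APPROVAL = auto()      # e.g. nodding
    DENIAL = auto()        # e.g. shaking the head
    HOLD = auto()          # e.g. tilting the head, grimacing
    CANNOT_HEAR = auto()   # e.g. opening the eyes wide


# Hypothetical gesture labels, assumed to come from an upstream image classifier.
GESTURE_TO_ACTION = {
    "mouth_open": Action.SPEECH,
    "nod": Action.APPROVAL,
    "head_shake": Action.DENIAL,
    "head_tilt": Action.HOLD,
    "frown": Action.HOLD,
    "eyes_wide": Action.CANNOT_HEAR,
}


def classify(gesture: Optional[str]) -> Optional[Action]:
    """Map a gesture label to a predetermined action, or None if it is not one."""
    if gesture is None:
        return None
    return GESTURE_TO_ACTION.get(gesture)


def should_interrupt(gesture: Optional[str]) -> bool:
    """Interrupt the voice output whenever any predetermined action is detected."""
    return classify(gesture) is not None
```

Keeping the gesture table separate from the decision reflects the statement that the concrete gestures are only examples: the table can change without touching `should_interrupt`.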

The determination unit 142 may also use a deep learning technique such as an RNN (Recurrent Neural Network) to determine, from the input image of the occupant 42, whether to interrupt the voice output. The technique by which the determination unit 142 makes this determination is not limited to an RNN; other deep learning techniques, or techniques other than deep learning, may be adopted.
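To make the RNN option concrete, the following is a hedged sketch of the forward pass of a simple (Elman-style) recurrent network that turns a sequence of per-frame feature vectors into an interruption probability. The description does not specify the network architecture, the input features, or the weights, so everything here is assumed for illustration: the function name, the idea that each frame has already been reduced to a small feature vector, and the weight values used in the usage lines.

```python
import math
from typing import List, Sequence


def rnn_interrupt_probability(
    frames: Sequence[Sequence[float]],
    w_in: List[List[float]],    # hidden_size x input_size (assumed, untrained)
    w_rec: List[List[float]],   # hidden_size x hidden_size
    w_out: List[float],         # hidden_size
) -> float:
    """Run a simple recurrent network over the frames and return P(interrupt)."""
    hidden = [0.0] * len(w_rec)
    for x in frames:
        # New hidden state depends on the current frame and the previous state.
        hidden = [
            math.tanh(
                sum(w_in[i][j] * x[j] for j in range(len(x)))
                + sum(w_rec[i][k] * hidden[k] for k in range(len(hidden)))
            )
            for i in range(len(hidden))
        ]
    logit = sum(w * h for w, h in zip(w_out, hidden))
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid


# Usage with placeholder weights: one input feature, one hidden unit.
p = rnn_interrupt_probability([[1.0], [1.0]], [[1.0]], [[0.5]], [2.0])
interrupt = p > 0.5
```

In practice the weights would be learned from labeled video of occupants; the threshold of 0.5 is likewise only a placeholder.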

In the present embodiment, the determination unit 142 determines the action or state of the occupant 42 based on the image of the occupant 42 acquired by the acquisition unit 141, but a configuration in which the acquisition unit 141 itself detects the action or state of the occupant 42 may be adopted instead. In that case, the action or state of the occupant 42 detected from the image of the occupant 42 is referred to as the acquisition result of the acquisition unit 141.

The voice recognition unit 143 performs voice recognition processing on the voice input to the voice input unit 24 and identifies the content of the command. In the present embodiment, the voice recognition unit 143 recognizes the voice that the occupant 42 utters in response to the voice output by the voice output unit 144 described later. More specifically, the voice recognition unit 143 identifies, from the voice of the occupant 42, words indicating approval or denial. The voice recognition unit 143 also identifies, from the voice of the occupant 42, proper nouns indicating the destination of the vehicle 1, verbs indicating processing to be executed by the vehicle 1, and the like.

In the present embodiment, the voice recognition unit 143 recognizes the voice of the occupant 42 during a voice recognition time that is provided after the voice output unit 144 outputs voice, such as a question, to the occupant 42. The timing at which the voice recognition unit 143 performs voice recognition is not limited to this; voice recognition processing may be performed at all times. Alternatively, the voice recognition unit 143 may perform voice recognition only when the occupant 42 has performed an operation to start voice recognition using the operation input unit 8c or the like. A known technique can be used by the voice recognition unit 143 to recognize the voice of the occupant 42. When the determination unit 142 determines that the occupant 42 is performing the hold action, the voice recognition unit 143 extends the voice recognition time. While the voice recognition time lasts, the interruption of voice output by the voice output unit 144 continues. Extending the voice recognition time is one example of processing corresponding to a hold in the present embodiment; the processing performed when the determination unit 142 determines that the occupant 42 is performing the hold action is not limited to this.
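The extension of the recognition time on a hold action can be sketched as a window whose deadline moves back. The description gives no durations, so the default length of 5 seconds, the 3-second extension, and the class and method names below are assumptions chosen only to make the example concrete.

```python
from dataclasses import dataclass


@dataclass
class RecognitionWindow:
    """Time window, in seconds, during which occupant speech is recognized."""
    opened_at: float
    duration: float = 5.0    # assumed default length of the recognition time
    extension: float = 3.0   # assumed amount added for each hold action

    def deadline(self) -> float:
        return self.opened_at + self.duration

    def is_open(self, now: float) -> bool:
        return self.opened_at <= now < self.deadline()

    def on_hold_action(self, now: float) -> None:
        """Extend the window if a hold action is seen while it is still open."""
        if self.is_open(now):
            self.duration += self.extension


# Usage: the window opens when the guidance question ends at t = 0 s.
window = RecognitionWindow(opened_at=0.0)
window.on_hold_action(now=4.0)   # occupant tilts the head at t = 4 s
```

Because voice output stays interrupted while the window is open, `is_open` could also serve as the condition for keeping guidance paused.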

The voice output unit 144 outputs voice to the occupant 42. As an example, the voice output unit 144 controls the voice output device 8b to output voice guidance for operating the navigation system. The voice output unit 144 interrupts the voice output when the determination unit 142 determines that the voice output is to be interrupted. In addition, when the determination unit 142 determines that the occupant 42 is performing the action expressing inability to hear, the voice output unit 144 outputs again the voice it was outputting when that determination was made.
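The interrupt-and-repeat behavior of the voice output unit can be sketched as below. This is an illustrative stand-in only: actual speech synthesis is replaced by a log of events, and the names `VoiceOutput`, `interrupt`, `replay`, and `on_determination` are hypothetical.

```python
from typing import List, Optional


class VoiceOutput:
    """Toy stand-in for the voice output unit; 'speaking' is recorded in a log."""

    def __init__(self) -> None:
        self.current: Optional[str] = None           # guidance being output, if any
        self.last_interrupted: Optional[str] = None  # guidance cut off most recently
        self.log: List[str] = []

    def speak(self, text: str) -> None:
        self.current = text
        self.log.append(f"speak: {text}")

    def interrupt(self) -> None:
        """Stop the current output and remember it so that it can be replayed."""
        if self.current is not None:
            self.last_interrupted = self.current
            self.log.append("interrupt")
            self.current = None

    def replay(self) -> None:
        """Output again the guidance that was playing when it was interrupted."""
        if self.last_interrupted is not None:
            self.speak(self.last_interrupted)


def on_determination(out: VoiceOutput, interrupt: bool, cannot_hear: bool) -> None:
    """Apply the determination result: interrupt, then repeat if not heard."""
    if interrupt:
        out.interrupt()
        if cannot_hear:
            out.replay()


# Usage: the occupant opens the eyes wide while guidance is playing.
out = VoiceOutput()
out.speak("Set the destination to home?")
on_determination(out, interrupt=True, cannot_hear=True)
```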

The dialogue control unit 145 controls the dialogue between the occupant 42 and the ECU 14. The dialogue control unit 145 also executes subsequent processing according to the reaction of the occupant 42 to the voice output from the voice output unit 144. For example, when the determination unit 142 determines that the occupant 42 is making an approval motion, the dialogue control unit 145 executes the process corresponding to approval of the output voice. When the determination unit 142 determines that the occupant 42 is making a denial motion, the dialogue control unit 145 executes the process corresponding to denial of the output voice. The process corresponding to approval and the process corresponding to denial differ depending on the content of the voice output by the voice output unit 144.

When the voice recognition unit 143 identifies an intention of approval from the voice of the occupant 42, the dialogue control unit 145 executes the process corresponding to approval. Likewise, when the voice recognition unit 143 identifies an intention of denial from the voice of the occupant 42, the dialogue control unit 145 executes the process corresponding to denial.

The process corresponding to approval is a process that is predetermined to be executed when the occupant 42 approves the output voice. The process corresponding to denial is a process that is predetermined to be executed when the occupant 42 denies the output voice. Specific examples of the process corresponding to approval and the process corresponding to denial are described later.

Next, the dialogue control process in the ECU 14 of the present embodiment configured as described above will be described with reference to FIGS. 5 to 10. FIG. 5 is a flowchart showing an example of the procedure of the dialogue control process according to the present embodiment.

The voice output unit 144 controls the voice output device 8b to start outputting voice to the occupant 42 (S1). FIG. 6 is a diagram showing an example of the voice guidance by the ECU 14 according to the present embodiment and the dialogue with the occupant 42. In the process of S1 shown in FIG. 5, it is assumed that output of the voice t2 in the voice guidance shown in FIG. 6 has started.

Returning to the flowchart of FIG. 5, the acquisition unit 141 acquires a captured image from the imaging device 15 (S2). The acquisition unit 141 then acquires an image of the occupant 42 from the captured image. The determination unit 142 determines the motion of the occupant 42 from the image of the occupant 42 acquired by the acquisition unit 141.

When the determination unit 142 determines that the occupant 42 has made an approval motion while the voice t2 is being output (S3 "Yes"), it determines that the voice output is to be interrupted. In this case, the voice output unit 144 interrupts the output of the voice t2 (S4). The dialogue control unit 145 then executes the process corresponding to approval of the voice t2 (S5).

FIG. 7 is a diagram showing an example of an approval motion of the occupant 42 according to the present embodiment. As shown in FIG. 7, when the determination unit 142 determines that the occupant 42 has made an approval motion, such as nodding, partway through the voice t2, the voice output unit 144 interrupts the output of the voice t2. The dialogue control unit 145 also executes the process corresponding to approval of the voice t2.

In the example shown in FIG. 7, the voice t2 asks the occupant to approve or deny a destination, so the process corresponding to approval of the voice t2 is the process executed when the occupant 42 approves the destination contained in the voice t2. As the process corresponding to approval of the voice t2, the dialogue control unit 145 causes the voice output unit 144 to output a voice t3 that responds to the approval by the occupant 42. The dialogue control unit 145 also sets the destination indicated by the voice t2 as the destination of the navigation system. The process corresponding to approval of the voice t2 is not limited to this and differs depending on the content of the output voice t2.

In general, if the voice t2 is output to the end even though the occupant 42 has already approved its content, the occupant 42 must wait for the voice t2 to finish, which can make it difficult to carry on the dialogue smoothly. In the present embodiment, the voice output unit 144 interrupts the output of the voice t2 when the occupant 42 makes an approval motion partway through the voice t2, and the dialogue control unit 145 moves on to the subsequent processing, so that the dialogue with the occupant 42 can proceed smoothly.

In general, people in conversation with one another keep the dialogue moving smoothly by conveying intentions such as approval not only by voice but also by motion. Because the determination unit 142 of the present embodiment can determine motions that express an intention of the occupant 42, such as approval, the voice output unit 144 can interrupt the output of the voice t2 at a more natural timing. In addition, by interrupting the output of the voice t2 when the occupant 42 makes an approval motion, the voice output unit 144 can shorten the time required for the dialogue.

Returning to the flowchart of FIG. 5, when the determination unit 142 determines that the occupant 42 has made a denial motion while the voice t2 is being output (S3 "No", S6 "Yes"), it determines that the voice output is to be interrupted. In this case, the voice output unit 144 interrupts the output of the voice t2 (S7). The dialogue control unit 145 then executes the process corresponding to denial of the voice t2 (S8).

FIG. 8 is a diagram showing an example of a denial motion of the occupant 42 according to the present embodiment. As shown in FIG. 8, when the determination unit 142 determines that the occupant 42 has made a denial motion, such as shaking the head from side to side, partway through the voice t2, the voice output unit 144 interrupts the output of the voice t2. The dialogue control unit 145 also executes the process corresponding to denial of the voice t2.

As an example of the process corresponding to denial of the voice t2, the dialogue control unit 145 causes the voice output unit 144 to output a voice t4 that responds to the denial by the occupant 42. For example, as shown in FIG. 8, if the occupant 42 makes a denial motion while the voice output unit 144 is announcing that the destination is A Ranch, the dialogue control unit 145 causes the voice output unit 144 to output a voice asking for the destination. The process corresponding to denial of the voice t2 is not limited to this and differs depending on the content of the output voice t2.
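Because the follow-up process is predetermined per output voice, one way to picture the dialogue control unit 145 is as a lookup from a pair (output voice, response) to a handler. The sketch below is only an illustration under that assumption; the class `DialogueController`, its methods, and the string labels are hypothetical names, while t2, t3, t4 and "A Ranch" follow the examples of FIGS. 7 and 8.

```python
from typing import Callable, Dict, List, Optional, Tuple


class DialogueController:
    """Maps (prompt id, response) to the follow-up predetermined for that prompt."""

    def __init__(self) -> None:
        self.spoken: List[str] = []             # voices handed to the output unit
        self.destination: Optional[str] = None  # destination set in the navigation system
        self._handlers: Dict[Tuple[str, str], Callable[[], None]] = {}

    def register(self, prompt_id: str, response: str, handler: Callable[[], None]) -> None:
        self._handlers[(prompt_id, response)] = handler

    def respond(self, prompt_id: str, response: str) -> bool:
        """Run the registered follow-up; return False if none is registered."""
        handler = self._handlers.get((prompt_id, response))
        if handler is None:
            return False
        handler()
        return True


def build_example() -> DialogueController:
    ctrl = DialogueController()

    def approve_t2() -> None:
        ctrl.spoken.append("t3")      # respond to the approval
        ctrl.destination = "A Ranch"  # set the destination named in t2

    def deny_t2() -> None:
        ctrl.spoken.append("t4")      # ask for the destination again

    ctrl.register("t2", "approval", approve_t2)
    ctrl.register("t2", "denial", deny_t2)
    return ctrl
```

Keeping the handlers in a table makes the dependence on the content of the output voice explicit: a voice with no registered handler simply has no follow-up beyond the interruption.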

When the occupant 42 is determined to have made a denial motion, the voice output unit 144 interrupts the output of the voice t2 and the dialogue control unit 145 moves on to the subsequent processing, so that the ECU 14 can carry on the dialogue smoothly without keeping the occupant 42 waiting. In addition, by interrupting the output of the voice t2 when the occupant 42 makes a denial motion, the voice output unit 144 can shorten the time required for the dialogue.

Returning to the flowchart of FIG. 5, when the determination unit 142 determines that the occupant 42 has made a speech motion while the voice t2 is being output (S6 "No", S9 "Yes"), it determines that the voice output is to be interrupted. In this case, the voice output unit 144 interrupts the output of the voice t2 (S10).

FIG. 9 is a diagram showing an example of a speech motion of the occupant 42 according to the present embodiment. As shown in FIG. 9, when the determination unit 142 determines that the occupant 42 has made a speech motion, such as opening the mouth, partway through the voice t2, the voice output unit 144 interrupts the output of the voice t2.

Because the voice output unit 144 interrupts the output of the voice t2 when the occupant 42 makes a speech motion, the speaker in the dialogue changes from the ECU 14 to the occupant 42 at a natural timing. The occupant 42 can therefore speak smoothly without being hindered by the voice t2. Furthermore, since the voice uttered by the occupant 42 does not overlap with the voice t2 output by the voice output unit 144, the voice recognition unit 143 can recognize the voice uttered by the occupant 42 with higher accuracy.

In the present embodiment, when the occupant 42 is determined to have made a speech motion, the output of the voice t2 is interrupted and the process ends; however, after the output of the voice t2 is interrupted, the voice recognition unit 143 may enter a state of waiting for voice recognition. Alternatively, while the output of the voice t2 is interrupted, the acquisition unit 141 may repeatedly acquire captured images and the determination may be repeated, and the voice output unit 144 may resume the output of the voice t2 when it is determined that the occupant 42 is not making a speech motion.

Returning to the flowchart of FIG. 5, when the determination unit 142 determines that the occupant 42 has made a hold motion while the voice t2 is being output (S9 "No", S11 "Yes"), it determines that the voice output is to be interrupted. In this case, the voice output unit 144 interrupts the output of the voice t2 (S12). The process corresponding to a hold is then started (S13). In the present embodiment, the process corresponding to a hold is an extension of the voice recognition time. During the voice recognition time, the voice recognition unit 143 is in a recognition waiting state in which it waits for the voice of the occupant 42 to be input to the voice input unit 24. When the voice of the occupant 42 is input, the voice recognition unit 143 recognizes the voice and identifies its content.

When the determination unit 142 determines that the occupant 42 has made a motion indicating that the occupant could not hear while the voice t2 is being output (S11 "No", S14 "Yes"), it determines that the voice output is to be interrupted. In this case, the voice output unit 144 interrupts the output of the voice t2 (S15). The process then returns to S1, and the voice output unit 144 outputs again the voice t2 that was being output when the occupant 42 was determined to have made that motion.

FIG. 10 is a diagram showing an example of a motion indicating that the occupant could not hear, according to the present embodiment. As shown in FIG. 10, when the determination unit 142 determines that the occupant 42 has made a motion indicating that the occupant could not hear, such as opening the eyes wide, partway through the voice t2, the voice output unit 144 outputs the voice t2 again from the beginning.

Returning to the flowchart of FIG. 5, when the determination unit 142 determines that the occupant 42 is making none of the approval motion, denial motion, speech motion, hold motion, and motion indicating that the occupant could not hear (S14 "No"), and the voice output unit 144 has not finished outputting the voice t2 (S16 "No"), the processes of S2 to S16 are repeated.

When the determination unit 142 determines that the occupant 42 is making none of the approval motion, denial motion, speech motion, hold motion, and motion indicating that the occupant could not hear (S14 "No"), and the output of the voice t2 has finished (S16 "Yes"), the process of the flowchart ends.
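The branch order of FIG. 5 (S1 to S16) can be traced with a short loop. The following is a sketch under simplifying assumptions, not the processing of the ECU 14 itself: each captured frame is assumed to have already been classified into one motion, playback progress is counted in frames, and the names `Motion` and `run_prompt` are hypothetical.

```python
from enum import Enum, auto
from typing import Iterable, List


class Motion(Enum):
    NONE = auto()
    APPROVAL = auto()
    DENIAL = auto()
    SPEECH = auto()
    HOLD = auto()
    CANNOT_HEAR = auto()


def run_prompt(frames: Iterable[Motion], prompt_frames: int) -> List[str]:
    """Trace the steps of FIG. 5 for one output voice.

    frames: motion determined for each captured image (S2).
    prompt_frames: number of frames the voice needs to play to the end.
    """
    log = ["S1:start"]
    played = 0
    for motion in frames:                  # S2: one captured image per pass
        if motion is Motion.APPROVAL:      # S3 "Yes"
            return log + ["S4:interrupt", "S5:approval"]
        if motion is Motion.DENIAL:        # S6 "Yes"
            return log + ["S7:interrupt", "S8:denial"]
        if motion is Motion.SPEECH:        # S9 "Yes"
            return log + ["S10:interrupt"]
        if motion is Motion.HOLD:          # S11 "Yes"
            return log + ["S12:interrupt", "S13:hold"]
        if motion is Motion.CANNOT_HEAR:   # S14 "Yes"
            log += ["S15:interrupt", "S1:start"]
            played = 0                     # replay from the beginning
            continue
        played += 1
        if played >= prompt_frames:        # S16 "Yes"
            return log + ["S16:finished"]
    return log + ["frames exhausted"]      # input ended before the voice did
```

For instance, a nod on the third frame of a five-frame voice ends the trace at S5, whereas a "could not hear" motion restarts the voice at S1 before it eventually reaches S16.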

When the determination unit 142 determines that the occupant 42 is making none of the approval motion, denial motion, speech motion, hold motion, and motion indicating that the occupant could not hear, and the output of the voice t2 has finished, the dialogue control unit 145 performs subsequent processing, as shown in FIG. 6, according to the spoken response of the occupant 42 recognized by the voice recognition unit 143 or according to a response by motion. For example, when the occupant 42 responds with approval to the voice t2 shown in FIG. 6, the dialogue control unit 145 executes the process corresponding to approval of the voice t2. When the occupant 42 responds with denial to the voice t2, the dialogue control unit 145 executes the process corresponding to denial of the voice t2.

When the voice guidance continues and the voice output unit 144 outputs the next voice, the process of the flowchart of FIG. 5 is executed again.

The contents of the voices t1 to t4 output by the voice output unit 144 and of the voices r1 to r3 uttered by the occupant 42, shown in FIGS. 6 to 10, are examples and are not limiting.

The process corresponding to approval and the process corresponding to denial are likewise not limited to those shown in FIGS. 6 to 10. When the occupant 42 is determined to have made an approval motion or a denial motion while a voice that does not ask the occupant 42 for approval or denial is being output, such as the voice t1 shown in FIG. 6, the voice output unit 144 may simply interrupt the voice output.

As described above, the ECU 14 of the present embodiment interrupts the voice output based on the motion or state of the occupant 42 acquired from the image of the occupant 42 included in the captured image, and can therefore carry on a smoother dialogue with the occupant 42.

The ECU 14 of the present embodiment interrupts the voice output when it determines that the occupant 42 has made any of the speech motion, approval motion, denial motion, hold motion, and motion indicating that the occupant could not hear. The ECU 14 of the present embodiment can therefore suppress the output of voice that is unnecessary for the occupant 42.

For example, by interrupting the voice output when it determines that the occupant 42 has made a speech motion, the ECU 14 avoids hindering the utterance of the occupant 42. The ECU 14 of the present embodiment thus allows the occupant 42 to speak smoothly. By interrupting the voice output when it determines that the occupant 42 has made an approval motion or a denial motion, the ECU 14 suppresses the output of voice that is unnecessary for the occupant 42, reduces the time required for the dialogue, and allows the dialogue to proceed smoothly. By interrupting the voice output when it determines that the occupant 42 has made a hold motion, the ECU 14 prevents the voice output from continuing while the occupant 42 has not yet decided whether to approve or deny. By interrupting the voice output when it determines that the occupant 42 has made a motion indicating that the occupant could not hear, the ECU 14 prevents the voice output from continuing while the occupant 42 is unable to hear it.

Furthermore, when the ECU 14 of the present embodiment determines that the occupant 42 has made an approval motion, it interrupts the voice output and executes the process corresponding to approval of the output voice, so that subsequent processing in line with the occupant 42's intention to approve can be started promptly without keeping the occupant 42 waiting. The ECU 14 according to the present embodiment can therefore carry on a smoother dialogue with the occupant 42.

Furthermore, when the ECU 14 of the present embodiment determines that the occupant 42 has made a denial motion, it interrupts the voice output and executes the process corresponding to denial of the output voice, so that subsequent processing in line with the occupant 42's intention to deny can be started promptly without keeping the occupant 42 waiting. The ECU 14 according to the present embodiment can therefore carry on a smoother dialogue with the occupant 42.

Furthermore, the ECU 14 of the present embodiment extends the recognition time for the voice of the occupant 42 when it determines that the occupant 42 has made a hold motion. The ECU 14 of the present embodiment can therefore wait until the occupant 42 reaches a decision and can carry on the dialogue at the pace at which the occupant 42 answers.

Furthermore, the ECU 14 of the present embodiment outputs again the voice it was outputting when it determines that the occupant 42 has made a motion indicating that the occupant could not hear. By outputting again the voice that the occupant 42 could not hear, the ECU 14 of the present embodiment can carry on the dialogue with the occupant 42 smoothly. The ECU 14 of the present embodiment also spares the occupant 42 the trouble of performing an operation to replay the voice or of restarting the dialogue with the ECU 14 from the beginning.

In the present embodiment, the voice output unit 144 and the dialogue control unit 145 are separate functional units, but the functions of the voice output unit 144 and the dialogue control unit 145 may be provided by a single functional unit. Likewise, the functions of the determination unit 142 and the dialogue control unit 145 may be provided by a single functional unit.

(Modification 1)
In the embodiment described above, when the occupant 42 nods, the motion is determined to be an approval motion and the voice output is interrupted. In general, however, people sometimes nod repeatedly as a backchannel cue, with the intention of encouraging the speaker to continue. In this modification, therefore, the voice output is not interrupted when the occupant 42 gives such a backchannel nod.

The determination unit 142 of this modification determines, from the image of the occupant 42 included in the captured image acquired by the acquisition unit 141, whether the occupant 42 has given a backchannel nod. More specifically, the determination unit 142 performs image processing on captured images spanning a plurality of frames to determine the continuous motion of the occupant 42 over a certain period of time. When the occupant 42 nods once, the determination unit 142 determines that the motion is an approval motion. When the occupant 42 nods several times in succession, the determination unit 142 determines that the motion is a backchannel nod. The method of distinguishing an approval motion from a backchannel nod is not limited to this. The image processing for detecting the continuous motion of the occupant 42 may instead be performed by the acquisition unit 141. The determination unit 142 may also determine that the occupant 42 has given a backchannel nod by processing the input image of the occupant 42 with a deep learning method such as an RNN.
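The rule of a single nod meaning approval and successive nods meaning a backchannel cue can be sketched as below. The sketch assumes that an upstream image-processing step has already produced the time of each detected nod; the function name `classify_nods` and the 1.5-second gap that defines "in succession" are hypothetical choices for illustration, not values given in this description.

```python
from typing import List


def classify_nods(nod_times: List[float], window: float = 1.5) -> str:
    """Classify detected nods (timestamps in seconds).

    Returns "none", "approval", or "backchannel".
    """
    if not nod_times:
        return "none"
    times = sorted(nod_times)
    # Two consecutive nods no more than `window` apart count as nodding in succession.
    for earlier, later in zip(times, times[1:]):
        if later - earlier <= window:
            return "backchannel"
    return "approval"
```

Under this rule the voice output would be interrupted only for the "approval" result and would continue for "backchannel".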

When the determination unit 142 of this modification determines that the occupant 42 has given a backchannel nod, it determines that the voice output is not to be interrupted. When the determination unit 142 determines that the voice output is not to be interrupted, the voice output unit 144 continues the voice output.

The dialogue control unit 145 of this modification also stores, in the storage unit 140, the voice output by the voice output unit 144 in association with the timing at which the occupant 42 was determined to have given a backchannel nod during the output of that voice. For example, the voice output unit 144 stores, in the storage unit 140, identification information that identifies the output voice, the output start time, and the time at which the occupant 42 was determined to have given a backchannel nod, in association with one another. The content stored in the storage unit 140 is an example and is not limiting.
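The association of an output voice with the timing of backchannel nods can be pictured as one record per voice. The sketch below is an illustration only; `PromptRecord`, `BackchannelStore`, and the in-memory dictionary standing in for the storage unit 140 are assumptions, and the three stored fields mirror the example given above (identification information, output start time, nod times).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PromptRecord:
    prompt_id: str        # identification information of the output voice
    output_start: float   # output start time in seconds
    backchannel_times: List[float] = field(default_factory=list)


class BackchannelStore:
    """In-memory stand-in for the storage unit 140."""

    def __init__(self) -> None:
        self._records: Dict[str, PromptRecord] = {}

    def start_prompt(self, prompt_id: str, output_start: float) -> None:
        self._records[prompt_id] = PromptRecord(prompt_id, output_start)

    def add_backchannel(self, prompt_id: str, time: float) -> None:
        self._records[prompt_id].backchannel_times.append(time)

    def offsets(self, prompt_id: str) -> List[float]:
        # How far into the voice each backchannel nod occurred.
        record = self._records[prompt_id]
        return [t - record.output_start for t in record.backchannel_times]


store = BackchannelStore()
store.start_prompt("t2", output_start=100.0)
store.add_backchannel("t2", 101.5)
store.add_backchannel("t2", 103.0)
```

The offsets make it possible to look up which part of the voice had been output by the time of each nod.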

At the time the occupant 42 is determined to have given a backchannel nod, the occupant 42 is likely to agree with the content of the voice output so far. Storing the output voice in association with the timing of the backchannel nod therefore makes it possible to support, with a reliable record, that the consent of the occupant 42 had been obtained for the content of the voice output up to the time of that determination.

(Modification 2)
The embodiment described above takes as an example the ECU 14 having a function of recognizing the voice of the occupant 42, but the ECU 14 need not have a voice recognition function. For example, because the determination unit 142 determines that the occupant 42 has made a motion such as approval or denial in response to the voice output by the voice output unit 144, the ECU 14 can carry on a dialogue with the occupant 42 even without a voice recognition function.

(Modification 3)
In the embodiment described above, the processing performed after the voice output is interrupted (for example, the process corresponding to approval and the process corresponding to denial) is predetermined according to the content of the voice that was being output; however, the processing performed after the interruption is not limited to this.

For example, the dialogue control unit 145 may determine the processing to be performed after the voice output is interrupted by using a deep learning method such as an RNN. More specifically, the dialogue control unit 145 may take, as input data for deep learning, time-series data indicating the motion or state of the occupant 42 included in the captured images and time-series data of the content of the voice output, and may determine the subsequent processing after the interruption from these inputs. The captured images themselves may also be used as input data for deep learning. The deep learning process may instead be executed by the determination unit 142.

(Modification 4)
In the embodiment described above, the process corresponding to a hold is an extension of the voice recognition time, but it is not limited to this. For example, when the occupant 42 is determined to be making a hold motion, the voice output unit 144 of this modification outputs a voice different from the voice that was being output at the time of that determination. More specifically, this different voice is a voice that makes another proposal to the occupant 42.

FIG. 11 is a diagram showing an example of the process corresponding to a hold according to this modification. As shown in FIG. 11, in the process of entering a destination by voice input, the voice output unit 144 outputs the voice t2 indicating "A Ranch" as the first candidate identified by the voice recognition unit 143 through voice recognition of the voice uttered by the occupant 42. If the occupant 42 is then determined to have made a hold motion, the voice output unit 144 outputs a voice t5 proposing "B Ranch" as the second candidate.

As a method of determining the second candidate, for example, the first-candidate and second-candidate words may be identified from the voice recognition result of the voice recognition unit 143 in descending order of the likelihood that the occupant 42 uttered them. Alternatively, the dialogue control unit 145 or the determination unit 142 may search map information (not shown) for place names, facility names, and the like that are close to the first-candidate destination (A Ranch) and belong to the same word category (ranch, etc.).
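Both ways of choosing the second candidate can be sketched briefly. The code below is a minimal illustration under stated assumptions: recognition hypotheses are assumed to arrive as (word, score) pairs, map entries are assumed to carry a category and planar coordinates in kilometres, and the names `Place`, `rank_hypotheses`, and `nearest_same_category` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Place:
    name: str
    category: str
    x_km: float
    y_km: float


def rank_hypotheses(hypotheses: List[Tuple[str, float]]) -> List[str]:
    """Order recognized words from most to least likely (first, second, ... candidate)."""
    return [word for word, _ in sorted(hypotheses, key=lambda h: h[1], reverse=True)]


def nearest_same_category(first: Place, places: List[Place]) -> Optional[Place]:
    """Pick the closest other place that shares the first candidate's category."""
    candidates = [p for p in places if p.category == first.category and p.name != first.name]
    if not candidates:
        return None
    return min(candidates, key=lambda p: math.hypot(p.x_km - first.x_km, p.y_km - first.y_km))
```

The first function corresponds to ranking by recognition likelihood, the second to the map search by distance and category; either result could be used for the voice t5.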

As described above, the ECU 14 of this modification outputs a voice different from the voice that was being output when it was determined that the occupant 42 was performing a hold motion. Therefore, when the occupant 42 neither approves nor rejects the content of the initially output voice, the ECU 14 can help the occupant 42 make a decision by presenting another option.
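As a rough illustration of this behavior, the sketch below advances to the next candidate when a hold motion is detected. The `Motion` labels, the `respond` function, the wording of the utterances, and the fallback when no further candidate exists are all assumptions made for the example; the source describes only that a different voice making another proposal is output on a hold motion.

```python
from enum import Enum, auto
from typing import Sequence, Tuple


class Motion(Enum):
    """Hypothetical labels for the occupant motions determined from the image."""
    APPROVE = auto()
    NEGATIVE = auto()
    HOLD = auto()


def respond(candidates: Sequence[str], current: int,
            motion: Motion) -> Tuple[int, str, bool]:
    """Return (candidate index, next utterance, confirmed flag).

    On HOLD, the next candidate is proposed instead of repeating the
    current one. On APPROVE, the current candidate is confirmed. In all
    other cases this sketch simply asks again (an assumed fallback).
    """
    if motion is Motion.APPROVE:
        return current, f"Setting the destination to {candidates[current]}.", True
    if motion is Motion.HOLD and current + 1 < len(candidates):
        nxt = current + 1
        return nxt, f"How about {candidates[nxt]}?", False
    return current, "Please say the destination again.", False
```

For the example of FIG. 11, calling `respond(["A Ranch", "B Ranch"], 0, Motion.HOLD)` moves from the first candidate to a proposal of "B Ranch".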

Although embodiments of the present invention have been illustrated above, the above embodiment and modifications are merely examples and are not intended to limit the scope of the invention. The above embodiment and modifications can be implemented in various other forms, and various omissions, replacements, combinations, and changes can be made without departing from the gist of the invention. The configurations and shapes of the embodiments and modifications may also be partially interchanged.

Reference Signs List: 1: vehicle, 8b: voice output device, 10: driving support system, 14: ECU, 15: imaging device, 24: voice input unit, 42: occupant, 140: storage unit, 141: acquisition unit, 142: determination unit, 143: voice recognition unit, 144: voice output unit, 145: dialogue control unit.

Claims

A driving support device comprising:
a voice output unit that outputs a voice to an occupant;
an acquisition unit that acquires an image of the occupant included in an image captured by an imaging device that images the interior of a vehicle; and
a determination unit that determines whether or not to interrupt output of the voice, based on a motion or state of the occupant included in the acquisition result of the acquisition unit,
wherein the voice output unit interrupts output of the voice when the determination unit determines that output of the voice is to be interrupted.

The driving support device according to claim 1,
wherein the motion is any of a motion of speaking, a motion expressing an intention of approval, a motion expressing an intention of negation, a motion expressing an intention to hold, or a motion expressing that the output voice could not be heard.

The driving support device according to claim 2,
further comprising a dialogue control unit that executes a process corresponding to approval of the output voice when it is determined that the occupant is performing a motion expressing the intention of approval.

The driving support device according to claim 2 or 3,
further comprising a dialogue control unit that executes a process corresponding to negation of the output voice when it is determined that the occupant is performing a motion expressing the intention of negation.

The driving support device according to any one of claims 2 to 4,
further comprising a voice recognition unit that recognizes a voice uttered by the occupant in response to the voice output by the voice output unit,
wherein the voice recognition unit extends the recognition time for the voice uttered by the occupant when it is determined that the occupant is performing a motion expressing the intention to hold.

The driving support device according to any one of claims 2 to 4,
wherein, when it is determined that the occupant is performing a motion expressing the intention to hold, the voice output unit outputs a voice different from the voice that was being output when that determination was made.

The driving support device according to any one of claims 2 to 6,
wherein, when it is determined that the occupant is performing a motion expressing that the voice could not be heard, the voice output unit outputs again the voice that was being output when that determination was made.