JP2023094516A

JP2023094516A - Voice command reception device, voice command reception method and program

Info

Publication number: JP2023094516A
Application number: JP2022120231A
Authority: JP
Inventors: 領平須永; Ryohei Sunaga
Original assignee: JVCKenwood Corp
Current assignee: JVCKenwood Corp
Priority date: 2021-12-23
Filing date: 2022-07-28
Publication date: 2023-07-05

Abstract

To appropriately perform an operation by voice command.

SOLUTION: A voice command reception device includes: a voice command reception unit that accepts a voice command; a detection unit that detects the direction of the face of a person uttering the voice command; and an execution control unit that, when the voice command reception unit accepts a voice command, executes a function corresponding to the accepted voice command. If the detection unit determines that the person's face is facing the microphone that acquires the utterance of the voice command, the voice command reception unit accepts the voice command when its recognition rate is equal to or higher than a first threshold value. If the detection unit determines that the person's face is not facing the microphone, the voice command reception unit accepts the voice command when its recognition rate is equal to or higher than a second threshold value that is lower than the first threshold value.

SELECTED DRAWING: Figure 5

Description

The present invention relates to a voice command reception device, a voice command reception method, and a program.

Devices operated by voice commands are diversifying. For example, some vehicle recording devices, so-called drive recorders, perform event recording in response to voice commands in addition to impact detection by an acceleration sensor (e.g., Non-Patent Document 1). Event recording by voice command allows an event to be recorded safely, without operating a touch panel or the like while driving, for example when recording an accident to which the driver is not a party. Patent Literature 1 discloses a drive recorder that records an event when a voice instruction is given in response to acceleration-based event detection.

Patent Literature 1: JP 2020-154904 A

Non-Patent Document 1: DRV-MR760 [retrieved December 20, 2021], Internet (URL: https://www.kenwood.com/jp/car/drive-recorders/products/drv-mr760/)

The voice command that instructs the drive recorder to record an event is set in advance; for example, a command such as "Rokuga Kaishi" ("start recording") is accepted. A voice command is required to consist of a certain number of syllables in order to prevent false detection caused by other speech. For example, "Rokuga Kaishi" consists of six syllables. Therefore, to have the voice command recognized accurately, the speaker often speaks while facing the microphone that picks up the utterance, for example by facing the drive recorder. A typical drive recorder is installed toward the front of the vehicle as seen from the passenger who is speaking, so a voice command uttered while facing the direction of travel, that is, the front of the vehicle, is recognized properly.

However, when a voice command is uttered in a situation in which it cannot be recognized properly, the recognition rate of the voice command drops, and the instruction given by the voice command may not be accepted. In such a case, a voice command that instructs an urgent or time-critical operation, such as a voice command for event recording in a drive recorder, has to be repeated, and the operation is delayed. A situation in which a voice command is not recognized properly can arise, for example, when the person uttering the voice command is not facing the microphone that acquires the utterance.

An object of the present invention is to provide a voice command reception device, a voice command reception method, and a program that allow operations by voice command to be performed appropriately.

A voice command reception device of the present invention comprises: a voice command reception unit that accepts a voice command; a detection unit that detects the orientation of the face of a person who utters the voice command; and an execution control unit that, when the voice command reception unit accepts a voice command, executes a function corresponding to the accepted voice command. When the detection unit determines that the person's face is facing the microphone that acquires the utterance of the voice command, the voice command reception unit accepts the voice command if the recognition rate of the acquired voice command is equal to or higher than a first threshold. When the detection unit determines that the person's face is facing a direction other than that of the microphone, the voice command reception unit accepts the voice command if the recognition rate of the acquired voice command is equal to or higher than a second threshold that is lower than the first threshold.

In a voice command reception method of the present invention, a voice command reception device executes: a step of detecting the orientation of the face of a person who utters a voice command; a step of accepting the voice command if the recognition rate of the acquired voice command is equal to or higher than a first threshold when the person's face is determined to be facing the microphone that acquires the utterance of the voice command, and accepting the voice command if the recognition rate of the acquired voice command is equal to or higher than a second threshold lower than the first threshold when the person's face is determined to be facing a direction other than that of the microphone; and a step of executing, when the voice command is accepted, a function corresponding to the accepted voice command.
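
The acceptance rule in the method above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the numeric values of `FIRST_THRESHOLD` and `SECOND_THRESHOLD` and the function names are hypothetical, since the description only requires that the second threshold be lower than the first.

```python
# Hypothetical threshold values; the description only requires
# SECOND_THRESHOLD < FIRST_THRESHOLD.
FIRST_THRESHOLD = 0.8   # applied when the speaker faces the microphone
SECOND_THRESHOLD = 0.6  # applied when the speaker faces another direction


def acceptance_threshold(facing_microphone: bool) -> float:
    """Return the recognition-rate threshold for the detected face direction."""
    return FIRST_THRESHOLD if facing_microphone else SECOND_THRESHOLD


def accept_voice_command(recognition_rate: float, facing_microphone: bool) -> bool:
    """Accept the command when its recognition rate reaches the applicable threshold."""
    return recognition_rate >= acceptance_threshold(facing_microphone)
```

With these assumed values, an utterance recognized at a rate of 0.7 would be rejected when the speaker faces the microphone but accepted when the speaker is looking elsewhere, which is the relaxation the method describes.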

A program of the present invention causes a computer to execute: a step of detecting the orientation of the face of a person who utters a voice command; a step of accepting the voice command if the recognition rate of the acquired voice command is equal to or higher than a first threshold when the person's face is determined to be facing the microphone that acquires the utterance of the voice command, and accepting the voice command if the recognition rate of the acquired voice command is equal to or higher than a second threshold lower than the first threshold when the person's face is determined to be facing a direction other than that of the microphone; and a step of executing, when the voice command is accepted, a function corresponding to the accepted voice command.

According to the present invention, operations by voice command can be performed appropriately.

FIG. 1 is a block diagram showing a configuration example of a recording device according to the first embodiment.
FIG. 2 is a flowchart showing the flow of processing by a control unit according to the first embodiment.
FIG. 3 is a block diagram showing a configuration example of a recording device according to the second embodiment.
FIG. 4 is a flowchart showing the flow of processing by a control unit according to the second embodiment.
FIG. 5 is a block diagram showing a configuration example of a voice command reception device according to the third embodiment.
FIG. 6 is a flowchart showing the flow of processing by the voice command reception device according to the third embodiment.

Embodiments of the present invention will be described in detail below with reference to the accompanying drawings. The present invention is not limited by these embodiments. In the following embodiments, identical parts are denoted by identical reference numerals, and redundant explanations are omitted. The voice command reception device according to the present invention is intended to cover various devices that are operated by voice commands, and the following embodiments do not limit the devices to which it can be applied.

[First embodiment]
In the first embodiment, a recording device used in a vehicle will be described as an example of a voice command reception device.

(Recording device)
A configuration example of the recording device according to the first embodiment will be described with reference to FIG. 1. FIG. 1 is a block diagram showing a configuration example of the recording device according to the first embodiment.

The recording device 1 is a so-called drive recorder that records video and the like based on events that occur to a vehicle. The recording device 1 may be a device mounted on the vehicle, or a portable device that can be used in the vehicle. The recording device 1 may be realized by including the functions or configuration of a device pre-installed in the vehicle, a navigation device, or the like. The recording device 1 executes processing that changes the recognition rate required to accept a voice command according to whether or not a passenger of the vehicle, including the driver, is facing the traveling direction of the vehicle.

As shown in FIG. 1, the recording device 1 includes a first camera 10, a second camera 12, a recording unit 14, a display unit 16, a microphone 18, an acceleration sensor 20, an operation unit 22, a GNSS (Global Navigation Satellite System) receiving unit 24, and a control unit (recording control device) 26. The recording device 1 may be a device that integrally includes the first camera 10, the second camera 12, and the microphone 18, or a device in which the first camera 10, the second camera 12, and the microphone 18 are provided as separate units.

The first camera 10 is a camera that photographs the surroundings of the vehicle. The first camera 10 is, for example, a camera dedicated to the recording device 1, or a plurality of cameras that respectively photograph the front, rear, and other directions of the vehicle. In the first embodiment, the first camera 10 consists of, for example, a plurality of cameras facing the front and rear of the vehicle, and photographs the surroundings centered on the front and rear of the vehicle. The first camera 10 may instead be a single camera capable of full-dome or half-dome imaging. The first camera 10 outputs the captured first video data to the video data acquisition unit 30 of the control unit 26. The first video data is a moving image composed of, for example, 30 frames per second.

The second camera 12 is a camera that photographs the interior of the vehicle cabin. The second camera 12 is arranged at a position from which at least the face of a passenger of the vehicle can be photographed. The passengers of the vehicle may be the driver alone, or may include other passengers in addition to the driver. The second camera 12 is arranged, for example, on the instrument panel of the vehicle, or inside or around the rearview mirror. The shooting range and shooting direction of the second camera 12 are fixed or substantially fixed. The second camera 12 is, for example, a visible light camera or a near-infrared camera, or may be a combination of the two. The second camera 12 outputs the captured second video data to the video data acquisition unit 30 of the control unit 26. The second video data is a moving image composed of, for example, 30 frames per second. When the first video data and the second video data need not be distinguished, they are simply referred to as video data.

The first camera 10 and the second camera 12 may be configured as a single camera capable of, for example, full-dome or half-dome imaging. In this case, within the full-dome or half-dome video data, the whole of the video data, the range that captures the surroundings of the vehicle, or the range that captures the area ahead of the vehicle is treated as the first video data. Likewise, the range in which the faces of passengers seated in the vehicle can be captured is treated as the second video data. The entire full-dome or half-dome video data may also be treated as both the first video data and the second video data.

The recording unit 14 is used, for example, for temporary storage of data in the recording device 1. The recording unit 14 is, for example, a semiconductor memory element such as a RAM (Random Access Memory) or a flash memory, or a recording medium such as a memory card. Alternatively, it may be an external recording unit connected wirelessly via a communication device (not shown). The recording unit 14 records loop-recorded video data or event data based on a control signal output from the recording control unit 36 of the control unit 26.

The display unit 16 is, for example, a display device dedicated to the recording device 1, or a display device shared with another system such as a navigation system. The display unit 16 may be formed integrally with the first camera 10. The display unit 16 is a display such as a liquid crystal display (LCD) or an organic EL (Organic Electro-Luminescence) display. In the first embodiment, the display unit 16 is arranged in front of the driver of the vehicle, for example on the dashboard, instrument panel, or center console. The display unit 16 displays video based on a video signal output from the display control unit 40 of the control unit 26. The display unit 16 displays the video being captured by the first camera 10 or the video recorded in the recording unit 14.

The microphone 18 picks up sound inside the vehicle cabin. In the first embodiment, the microphone 18 is arranged at a position where it can pick up speech uttered by passengers of the vehicle, including the driver. The microphone 18 is arranged, for example, on the dashboard, instrument panel, or center console. The microphone 18 picks up speech relating to voice commands addressed to the recording device 1 and outputs that speech to the voice command reception unit 44. The microphone 18 may also output the collected sound to the video data acquisition unit 30, so that the recording control unit 36 records loop-recorded video data or event data that includes audio.

The acceleration sensor 20 is a sensor that detects acceleration applied to the vehicle. The acceleration sensor 20 outputs its detection result to the event detection unit 46 of the control unit 26. The acceleration sensor 20 is, for example, a sensor that detects acceleration along three axes: the longitudinal, lateral, and vertical directions of the vehicle.

The operation unit 22 can accept various operations on the recording device 1. For example, the operation unit 22 can accept an operation to manually save captured video data in the recording unit 14 as event data, an operation to play back loop-recorded video data or event data recorded in the recording unit 14, an operation to erase event data recorded in the recording unit 14, and an operation to end loop recording. The operation unit 22 outputs operation information to the operation control unit 48 of the control unit 26.

The GNSS receiving unit 24 includes, for example, a GNSS receiver that receives GNSS signals from GNSS satellites. The GNSS receiving unit 24 outputs the received GNSS signals to the position information acquisition unit 50 of the control unit 26.

The control unit 26 controls each unit of the recording device 1. The control unit 26 has, for example, an information processing device such as a CPU (Central Processing Unit) or an MPU (Micro Processing Unit), and a storage device such as a RAM (Random Access Memory) or a ROM (Read Only Memory). The control unit 26 executes a program that controls the operation of the recording device 1 according to the present invention. The control unit 26 may be realized by an integrated circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array), or by a combination of hardware and software.

The control unit 26 includes a video data acquisition unit 30, a buffer memory 32, a video data processing unit 34, a recording control unit 36, a playback control unit 38, a display control unit 40, a detection unit 42, a voice command reception unit 44, an event detection unit 46, an operation control unit 48, and a position information acquisition unit 50, as functional blocks realized by the configuration of the control unit 26 or by execution of a program.

The video data acquisition unit 30 acquires first video data capturing the surroundings of the vehicle and second video data capturing the interior of the vehicle cabin. Specifically, the video data acquisition unit 30 acquires the first video data captured by the first camera 10 and the second video data captured by the second camera 12, and outputs them to the buffer memory 32. The first video data and the second video data acquired by the video data acquisition unit 30 are not limited to video-only data and may be video data that includes both video and audio. The video data acquisition unit 30 may acquire full-dome or half-dome video data as the first video data and the second video data.

The buffer memory 32 is an internal memory of the recording device 1 that temporarily stores a fixed duration of the video data acquired by the video data acquisition unit 30, updating its contents as new data arrives.

The video data processing unit 34 converts the video data temporarily stored in the buffer memory 32 into an arbitrary file format, such as MP4, encoded with an arbitrary codec such as H.264 or MPEG-4 (Moving Picture Experts Group). The video data processing unit 34 generates, from the video data temporarily stored in the buffer memory 32, video data files each covering a fixed duration. As a specific example, the video data processing unit 34 generates files of 60 seconds of video data each, in recording order, from the video data temporarily stored in the buffer memory 32. The video data processing unit 34 outputs the generated video data to the recording control unit 36 and to the display control unit 40. The 60-second duration of each video data file is only an example, and the duration is not limited to this value.
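
The file-splitting step can be sketched as follows. This is a minimal illustration rather than the device's actual implementation: the function name `split_into_files` and the representation of video as a list of frames are assumptions, while the 30 frames per second and the 60-second file length are the example values given in the description. Encoding and container handling are omitted.

```python
def split_into_files(frames, fps=30, file_seconds=60):
    """Group buffered frames, in recording order, into fixed-duration chunks.

    Each chunk stands in for one video file; the last chunk may be shorter
    if fewer frames than one full file remain.
    """
    frames_per_file = fps * file_seconds  # 30 fps * 60 s = 1800 frames
    return [
        frames[start:start + frames_per_file]
        for start in range(0, len(frames), frames_per_file)
    ]
```

Under these example values each full file would hold 1800 frames.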

The recording control unit 36 controls the recording unit 14 to record the video data that the video data processing unit 34 has converted into files. During a period in which loop recording is executed, such as while the accessory power of the vehicle is ON, the recording control unit 36 records the video data files in the recording unit 14 as overwritable video data. During that period, the recording control unit 36 continues to record the video data generated by the video data processing unit 34 in the recording unit 14, and when the recording unit 14 becomes full, it overwrites the oldest video data with new video data.
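
Loop recording of this kind, where the oldest file is overwritten once storage is full, can be sketched with a bounded queue. The class name `LoopRecorder`, the capacity measured in number of files, and the use of file names as stand-ins for stored files are all assumptions made for illustration.

```python
from collections import deque


class LoopRecorder:
    """Keeps at most `capacity_files` overwritable files, oldest first."""

    def __init__(self, capacity_files: int):
        if capacity_files <= 0:
            raise ValueError("capacity_files must be positive")
        self.capacity_files = capacity_files
        self.files = deque()

    def record(self, filename: str):
        """Store a new file; return the name of the overwritten file, if any."""
        overwritten = None
        if len(self.files) >= self.capacity_files:
            # Storage is full: the oldest file gives way to the new one.
            overwritten = self.files.popleft()
        self.files.append(filename)
        return overwritten
```

Event data, which the description says is saved as overwrite-prohibited data, would be kept outside this queue so that loop recording never evicts it.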

When the voice command reception unit 44 accepts an event recording instruction given by voice command, the recording control unit 36 saves, as event data, the first video data that includes the point in time at which the instruction was accepted. The recording control unit 36 saves the event data in the recording unit 14 as overwrite-prohibited data. For example, the recording control unit 36 copies from the buffer memory 32 the first video data for a predetermined period of about 10 seconds before and after the point in time at which the voice command reception unit 44 accepted the event detection by voice command, and saves it as event data.

When the event detection unit 46 detects the occurrence of an event based on the output value of the acceleration sensor 20, the recording control unit 36 saves, as event data, the first video data that includes the point in time at which the event was detected. The recording control unit 36 saves the event data in the recording unit 14 as overwrite-prohibited data. For example, the recording control unit 36 copies from the buffer memory 32 the first video data for a predetermined period of about 10 seconds before and after the point in time at which the event detection unit 46 detected the event, and saves it as event data.
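
Copying the period around an event out of the buffer can be sketched as below. The function name `extract_event_clip` and the representation of buffered video as `(timestamp_seconds, frame)` pairs are assumptions; the 10-second margin is the example value from the description. The sketch also assumes that the frames following the event have already been buffered when the copy is made.

```python
def extract_event_clip(buffered_frames, event_time, margin_s=10.0):
    """Return the buffered (timestamp, frame) pairs within the event window.

    The window spans `margin_s` seconds before and after `event_time`,
    inclusive at both ends.
    """
    start = event_time - margin_s
    end = event_time + margin_s
    return [(t, frame) for (t, frame) in buffered_frames if start <= t <= end]
```

The same helper would serve both triggers described above, whether the event time comes from an accepted voice command or from the acceleration sensor.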

Based on the playback control signal output from the operation control unit 48, the playback control unit 38 plays back the loop-recorded video data or event data recorded in the recording unit 14, and controls the display control unit 40 to output the played-back video and the like to the display unit 16.

The display control unit 40 controls the display of video data on the display unit 16. The display control unit 40 outputs a video signal that causes the display unit 16 to output video data. More specifically, the display control unit 40 outputs a video signal for displaying the video being captured by the first camera 10, or the video obtained by playing back the loop-recorded video data or event data recorded in the recording unit 14.

The detection unit 42 detects conditions, in the environment in which a voice command is uttered, under which the voice command may not be recognized properly. In this embodiment, the detection unit 42 detects the orientation of the face of the person who utters the voice command. A voice command may fail to be recognized properly when the face of the person uttering it is not directed toward the microphone that acquires the utterance. For this reason, the detection unit 42 recognizes the passengers of the vehicle from the second video data. In the first embodiment, the passengers include the driver and occupants other than the driver; the passenger may also be the driver alone. The detection unit 42 recognizes a person's face in the second video data and detects the orientation of the face. For example, the detection unit 42 detects a face based on the positional relationship of its characteristic parts, and detects the orientation of the face from the positional relationship between the center line of the face and the elements that make up the face. Any known method can be used for this detection, and the method is not limited.

The detection unit 42 detects the orientation of the face of a vehicle passenger from the second video data, and determines whether or not the passenger is facing a direction in which the microphone 18 can properly pick up the passenger's voice. For example, the detection unit 42 detects the orientation of the passenger's face from the second video data and determines whether the passenger is facing the traveling direction of the vehicle or a direction other than the traveling direction. Here, the traveling direction of the vehicle means, for example, a direction within a range of about ±30 degrees, where 0 degrees is the angle at which the passenger faces straight ahead of the vehicle.
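
The ±30-degree criterion can be expressed as a simple angle test. This is a sketch under assumptions: the face orientation is taken to be available as a single horizontal (yaw) angle in degrees, with 0 meaning straight ahead, and the function names are hypothetical. How the angle is estimated from the second video data is left to the known face-orientation methods mentioned in the description.

```python
def normalize_angle(deg: float) -> float:
    """Map any angle in degrees into the range [-180, 180)."""
    return (deg + 180.0) % 360.0 - 180.0


def is_facing_travel_direction(yaw_deg: float, tolerance_deg: float = 30.0) -> bool:
    """True when the face yaw is within the tolerance of straight ahead (0 degrees)."""
    return abs(normalize_angle(yaw_deg)) <= tolerance_deg
```

Normalizing first means that an angle reported as 350 degrees is treated the same as -10 degrees, so the check does not depend on the convention used by the angle estimator.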

本実施形態では、検出部４２は、車両の搭乗者が車両の進行方向を向いているか、進行方向以外を向いているかを判定するものとして説明するが、本開示はこれに限定されない。検出部４２は、例えば、第二映像データから、車両の搭乗者がマイクロフォン１８の方向を向いているか、マイクロフォン１８以外を向いているかを判定してもよい。 In the present embodiment, the detection unit 42 will be described as determining whether the vehicle occupant is facing the traveling direction of the vehicle or facing a direction other than the traveling direction, but the present disclosure is not limited to this. For example, the detection unit 42 may determine from the second video data whether the vehicle occupant is facing the direction of the microphone 18 or facing away from the microphone 18 .

音声コマンド受付部４４は、マイクロフォン１８が集音した音声を認識することで、音声コマンドを受け付ける。音声コマンド受付部４４は、例えば、マイクロフォン１８が集音した音声に対して、音源分離処理および音声認識処理を実行し、イベント記録を開始するための音声コマンドを認識する。イベント記録を開始するための音声コマンドは、例えば、「録画開始（ろくがかいし）」である。音声コマンド受付部４４は、マイクロフォン１８が集音した音声において「Ｒｏ・Ｋｕ・Ｇａ・Ｋａ・Ｉ・Ｓｈｉ」の連続した６音節を認識した場合に、イベント記録処理を開始するための制御信号を記録制御部３６に出力する。または、音声コマンド受付部４４は、マイクロフォン１８が集音した音声において「ＲｏＫｕＧａＫａＩＳｈｉ」の単語を示す音声を認識した場合に、イベント記録処理を開始するための制御信号を記録制御部３６に出力する。音声コマンド受付部４４は、車両の搭乗者が進行方向を向いているか否か、つまり、音声コマンドの発話音声を取得するためのマイクロフォン１８の方向を向いているか否かに応じて、音声コマンドを取得したか否かを判定するための音声の認識率を変更する。 The voice command reception unit 44 accepts voice commands by recognizing the voice collected by the microphone 18. For example, the voice command reception unit 44 performs sound source separation and voice recognition on the voice collected by the microphone 18, and recognizes a voice command for starting event recording. The voice command for starting event recording is, for example, "rokuga kaishi" ("start recording"). When the voice command reception unit 44 recognizes the six consecutive syllables "Ro, Ku, Ga, Ka, I, Shi" in the voice collected by the microphone 18, it outputs a control signal for starting the event recording process to the recording control unit 36. Alternatively, when the voice command reception unit 44 recognizes speech representing the word "RoKuGaKaIShi" in the voice collected by the microphone 18, it outputs a control signal for starting the event recording process to the recording control unit 36. The voice command reception unit 44 changes the recognition rate used to determine whether a voice command has been acquired, depending on whether the vehicle occupant is facing the traveling direction, that is, whether the occupant is facing the microphone 18 that acquires the uttered voice of the voice command.

音声コマンド受付部４４は、車両の搭乗者が車両の進行方向を向いている場合など、音声コマンドが適切に認識される状況においては、「Ｒｏ・Ｋｕ・Ｇａ・Ｋａ・Ｉ・Ｓｈｉ」の連続した６音節のうち、全ての音節が一致した場合に、音声コマンドを取得したと判定する。音声コマンド受付部４４は、例えば、音声コマンドを取得したと判定する認識率の第一閾値として、９０％に設定する。この場合、音声コマンド受付部４４は、「Ｒｏ・Ｋｕ・Ｇａ・Ｋａ・Ｉ・Ｓｈｉ」の６音節のうち、９０％以上認識できた場合には、音声コマンドを取得したと判定する。 In a situation where the voice command can be properly recognized, such as when the vehicle occupant is facing the traveling direction of the vehicle, the voice command reception unit 44 determines that the voice command has been acquired when all of the six consecutive syllables "Ro, Ku, Ga, Ka, I, Shi" match. For example, the voice command reception unit 44 sets 90% as the first threshold of the recognition rate for determining that the voice command has been acquired. In this case, the voice command reception unit 44 determines that the voice command has been acquired when 90% or more of the six syllables "Ro, Ku, Ga, Ka, I, Shi" are recognized. With six syllables, this threshold is met only when all six match, since five of six is about 83%.

音声コマンド受付部４４は、車両の搭乗者が車両の進行方向以外を向いている場合のように、音声コマンドが適切に認識されない状況においては、「Ｒｏ・Ｋｕ・Ｇａ・Ｋａ・Ｉ・Ｓｈｉ」の連続した６音節のうち、５音節以上が一致した場合に、音声コマンドを取得したと判定する。この場合、音声コマンド受付部４４は、音声コマンドを取得したと判定する認識率を第一閾値よりも低い第二閾値に設定する。音声コマンド受付部４４は、例えば、第二閾値を８０％に設定する。この場合、音声コマンド受付部４４は、「Ｒｏ・Ｋｕ・Ｇａ・Ｋａ・Ｉ・Ｓｈｉ」の連続した６音節のうち、８０％以上認識できた場合には、音声コマンドを取得したと判定する。すなわち、車両の搭乗者が車両の進行方向以外を向いている場合のように、音声コマンドが適切に認識されない状況においては、搭乗者の発話が完全に認識できなくとも、音声コマンドが発話されたと判定することで、適切に音声コマンドが認識される。 In a situation where the voice command is not properly recognized, such as when the vehicle occupant is facing a direction other than the traveling direction of the vehicle, the voice command reception unit 44 determines that the voice command has been acquired when five or more of the six consecutive syllables "Ro, Ku, Ga, Ka, I, Shi" match. In this case, the voice command reception unit 44 sets the recognition rate for determining that the voice command has been acquired to a second threshold lower than the first threshold. The voice command reception unit 44 sets the second threshold to 80%, for example. In this case, the voice command reception unit 44 determines that the voice command has been acquired when 80% or more of the six consecutive syllables "Ro, Ku, Ga, Ka, I, Shi" are recognized. In other words, in a situation where the voice command is not properly recognized, such as when the occupant is facing a direction other than the traveling direction, the voice command is appropriately recognized by determining that it was uttered even if the occupant's utterance cannot be recognized completely.
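
The syllable-based judgment can be sketched as follows. This is a minimal illustration, not the recognizer itself: the function names are hypothetical, and comparing syllables position by position is an assumption. The 90% and 80% values are the example thresholds from the description.

```python
COMMAND = ["Ro", "Ku", "Ga", "Ka", "I", "Shi"]
FIRST_THRESHOLD = 0.90   # example value: occupant faces the traveling direction
SECOND_THRESHOLD = 0.80  # example value: occupant faces another direction


def syllable_match_rate(recognized, expected=COMMAND):
    """Fraction of expected syllables matched at the same position (assumed metric)."""
    matches = sum(1 for r, e in zip(recognized, expected) if r == e)
    return matches / len(expected)


def accepts_command(recognized, facing_travel_direction):
    """Apply the first threshold when facing forward, otherwise the second."""
    threshold = FIRST_THRESHOLD if facing_travel_direction else SECOND_THRESHOLD
    return syllable_match_rate(recognized) >= threshold


# One syllable misheard: 5 of 6 match, about 83%.
heard = ["Ro", "Ku", "Ga", "Ka", "I", "?"]
print(accepts_command(heard, facing_travel_direction=True))   # below 90%
print(accepts_command(heard, facing_travel_direction=False))  # at least 80%
```

With five of six syllables matched, the utterance is rejected under the first threshold but accepted under the second, which is the relaxation the paragraph describes.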

また、音声コマンド受付部４４は、車両の搭乗者が車両の進行方向を向いている場合には、「ＲｏＫｕＧａＫａＩＳｈｉ」の単語を示す音声波形の音響モデルと、入力された音声の波形との一致率を、音声コマンドを取得したと判定する認識率の第一閾値として、例えば、９０％に設定する。この場合、音声コマンド受付部４４は、「ＲｏＫｕＧａＫａＩＳｈｉ」の単語を示す音声波形の音響モデルと、入力された音声の波形との一致率が９０％以上である場合には、音声コマンドを取得したと判定する。 In addition, when the vehicle occupant is facing the traveling direction of the vehicle, the voice command reception unit 44 uses the match rate between the acoustic model of the voice waveform representing the word "RoKuGaKaIShi" and the waveform of the input voice as the recognition rate, and sets its first threshold for determining that the voice command has been acquired to, for example, 90%. In this case, the voice command reception unit 44 determines that the voice command has been acquired when the match rate between the acoustic model of the voice waveform representing the word "RoKuGaKaIShi" and the waveform of the input voice is 90% or higher.

また、音声コマンド受付部４４は、車両の搭乗者が車両の進行方向以外を向いている場合には、「ＲｏＫｕＧａＫａＩＳｈｉ」の単語を示す音声波形の音響モデルと、入力された音声の波形との一致率を、音声コマンドを取得したと判定する認識率の第一閾値よりも低い第二閾値として、例えば８０％に設定する。この場合、音声コマンド受付部４４は、「ＲｏＫｕＧａＫａＩＳｈｉ」の単語を示す音声波形の音響モデルと、入力された音声の波形との一致率が８０％以上である場合には、音声コマンドを取得したと判定する。すなわち、車両の搭乗者が車両の進行方向以外を向いている場合には、搭乗者の音声が音声コマンドとして認識されやすくなる。 In addition, when the vehicle occupant is facing a direction other than the traveling direction of the vehicle, the voice command reception unit 44 uses the match rate between the acoustic model of the voice waveform representing the word "RoKuGaKaIShi" and the waveform of the input voice as the recognition rate, and sets a second threshold, lower than the first threshold, for determining that the voice command has been acquired to, for example, 80%. In this case, the voice command reception unit 44 determines that the voice command has been acquired when the match rate between the acoustic model of the voice waveform representing the word "RoKuGaKaIShi" and the waveform of the input voice is 80% or higher. That is, when the occupant is facing a direction other than the traveling direction of the vehicle, the occupant's voice is more readily recognized as a voice command.

イベント検出部４６は、車両に加わる加速度に基づくイベントを検出する。イベント検出部４６は、加速度センサ２０の検出結果に基づいて、イベントを検出する。イベント検出部４６は、加速度情報が、車両の衝突に該当するような予め設定された閾値以上である場合、イベントが発生したことを検出する。 The event detector 46 detects an event based on acceleration applied to the vehicle. The event detection unit 46 detects events based on the detection result of the acceleration sensor 20 . The event detection unit 46 detects that an event has occurred when the acceleration information is greater than or equal to a preset threshold that corresponds to a vehicle collision.
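
A threshold test of this kind might look like the sketch below. The description only says that the acceleration information is compared with a preset threshold corresponding to a collision; the use of the vector magnitude, the unit (g), and the 1.5 g default are assumptions chosen for illustration.

```python
import math


def detect_event(ax: float, ay: float, az: float, threshold_g: float = 1.5) -> bool:
    """Return True when the acceleration magnitude reaches the preset threshold.

    ax, ay, az are hypothetical accelerometer readings in g; threshold_g is a
    placeholder value, not one stated in the description.
    """
    magnitude = math.sqrt(ax * ax + ay * ay + az * az)
    return magnitude >= threshold_g
```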

操作制御部４８は、操作部２２が受け付けた操作の操作情報を取得する。例えば、操作制御部４８は、映像データの手動保存操作を示す保存操作情報、再生操作を示す再生操作情報、または、映像データの消去操作を示す消去操作情報を取得して制御信号を出力する。例えば、操作制御部４８は、ループ記録を終了する操作を示す終了操作情報を取得して制御信号を出力する。 The operation control unit 48 acquires operation information of operations accepted by the operation unit 22 . For example, the operation control unit 48 acquires storage operation information indicating a manual storage operation of video data, playback operation information indicating a playback operation, or erasing operation information indicating a video data erasing operation, and outputs a control signal. For example, the operation control unit 48 acquires end operation information indicating an operation for ending loop recording, and outputs a control signal.

操作制御部４８は、音声コマンド受付部４４が認識して受け付けた音声コマンドによるイベント記録操作を受け付ける。 The operation control unit 48 receives an event recording operation by a voice command recognized and received by the voice command receiving unit 44 .

位置情報取得部５０は、車両の現在位置を示す位置情報を取得する。位置情報取得部５０は、ＧＮＳＳ受信部２４が受信したＧＮＳＳ信号に基づいて、車両の現在位置の位置情報を公知の方法によって算出する。 The position information acquisition unit 50 acquires position information indicating the current position of the vehicle. The position information acquisition unit 50 calculates the position information of the current position of the vehicle by a known method based on the GNSS signals received by the GNSS reception unit 24 .

（制御部の処理） (Processing of control unit)
図２を用いて、第一実施形態に係る制御部の処理の流れを説明する。図２は、第一実施形態に係る制御部の処理の流れを示すフローチャートである。図２に示すフローチャートは、記録装置１が装着されている車両のエンジンなどの動力が始動することで開始される。 The processing flow of the control unit according to the first embodiment will be described with reference to FIG. 2. FIG. 2 is a flowchart showing the processing flow of the control unit according to the first embodiment. The flowchart shown in FIG. 2 starts when the power source, such as the engine, of the vehicle in which the recording device 1 is mounted is started.

処理の開始に伴い、制御部２６は、通常記録および向き検出を開始する（ステップＳ１０）。具体的には、記録制御部３６は、第一カメラ１０および第二カメラ１２が撮影した映像データをバッファメモリ３２に送信し、例えば、６０秒ごとのような所定期間の映像ごとに映像ファイルを生成し、記録部１４に記録させる。検出部４２は、車両の搭乗者の顔の向きの検出を開始する。そして、ステップＳ１２に進む。 When the processing starts, the control unit 26 starts normal recording and orientation detection (step S10). Specifically, the recording control unit 36 transmits the video data captured by the first camera 10 and the second camera 12 to the buffer memory 32, generates a video file for each predetermined period of video, for example every 60 seconds, and records it in the recording unit 14. The detection unit 42 starts detecting the orientation of the vehicle occupant's face. The process then proceeds to step S12.

検出部４２は、車両の搭乗者は車両の進行方向以外を向いているか否かを判定する（ステップＳ１２）。具体的には、検出部４２は、車両の搭乗者が進行方向以外を所定時間以上向いているか否かを判断する。検出部４２は、車両の搭乗者が進行方向以外を所定時間以上向いている場合に、車両の搭乗者が進行方向以外を向いていると判断する。所定時間は、例えば、２秒以上であるが、これに限定されない。搭乗者が車両の進行方向以外を向いていると判定された場合（ステップＳ１２；Ｙｅｓ）、ステップＳ１４に進む。搭乗者が車両の進行方向以外を向いていると判定されない場合（ステップＳ１２；Ｎｏ）、ステップＳ１８に進む。 The detection unit 42 determines whether the vehicle occupant is facing a direction other than the traveling direction of the vehicle (step S12). Specifically, the detection unit 42 determines whether the occupant has been facing a direction other than the traveling direction for a predetermined time or longer. When the occupant has been facing a direction other than the traveling direction for the predetermined time or longer, the detection unit 42 determines that the occupant is facing a direction other than the traveling direction. The predetermined time is, for example, two seconds or more, but is not limited to this. If it is determined that the occupant is facing a direction other than the traveling direction of the vehicle (step S12; Yes), the process proceeds to step S14. If it is not determined that the occupant is facing a direction other than the traveling direction of the vehicle (step S12; No), the process proceeds to step S18.
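
The "predetermined time or longer" condition in step S12 can be sketched as a small timer that resets whenever the occupant faces forward again. The class name, the explicit timestamp argument, and the reset behaviour are assumptions for illustration; the two-second default is the example value from the description.

```python
class FacingAwayTimer:
    """Track how long the occupant has continuously faced away (hypothetical helper)."""

    def __init__(self, hold_s: float = 2.0):
        self.hold_s = hold_s          # example: two seconds
        self._away_since = None       # time at which the current away period began

    def update(self, facing_travel_direction: bool, now_s: float) -> bool:
        """Return True once the occupant has faced away for hold_s seconds or more."""
        if facing_travel_direction:
            # Facing forward again: the away period is over.
            self._away_since = None
            return False
        if self._away_since is None:
            self._away_since = now_s
        return now_s - self._away_since >= self.hold_s
```

Passing the timestamp in explicitly keeps the sketch deterministic; a real device would presumably take it from the video frame clock.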

ステップＳ１２でＹｅｓと判定された場合、音声コマンド受付部４４は、マイクロフォン１８により車両の搭乗者から音声コマンドを取得したか否かを判定する（ステップＳ１４）。音声コマンドを取得したと判定された場合（ステップＳ１４；Ｙｅｓ）、ステップＳ１６に進む。音声コマンドを取得したと判定されない場合（ステップＳ１４；Ｎｏ）、ステップＳ２４に進む。 When it is determined as Yes in step S12, the voice command reception unit 44 determines whether or not a voice command has been received from the passenger of the vehicle through the microphone 18 (step S14). If it is determined that the voice command has been acquired (step S14; Yes), the process proceeds to step S16. If it is not determined that the voice command has been acquired (step S14; No), the process proceeds to step S24.

ステップＳ１４でＹｅｓと判定された場合、音声コマンド受付部４４は、取得した音声コマンドの認識率は第二閾値以上であるか否かを判定する（ステップＳ１６）。音声コマンドの認識率が第二閾値以上であると判定された場合（ステップＳ１６；Ｙｅｓ）、ステップＳ２２に進む。音声コマンドの認識率が第二閾値以上であると判定されない場合（ステップＳ１６；Ｎｏ）、ステップＳ２４に進む。 When determined as Yes in step S14, the voice command reception unit 44 determines whether or not the recognition rate of the acquired voice command is equal to or higher than the second threshold (step S16). If it is determined that the voice command recognition rate is equal to or higher than the second threshold (step S16; Yes), the process proceeds to step S22. If it is not determined that the voice command recognition rate is equal to or higher than the second threshold (step S16; No), the process proceeds to step S24.

ステップＳ１２でＮｏと判定された場合、音声コマンド受付部４４は、マイクロフォン１８により車両の搭乗者から音声コマンドを取得したか否かを判定する（ステップＳ１８）。音声コマンドを取得したと判定された場合（ステップＳ１８；Ｙｅｓ）、ステップＳ２０に進む。音声コマンドを取得したと判定されない場合（ステップＳ１８；Ｎｏ）、ステップＳ２４に進む。 When determined as No in step S12, the voice command reception unit 44 determines whether or not a voice command has been received from the passenger of the vehicle through the microphone 18 (step S18). If it is determined that the voice command has been acquired (step S18; Yes), the process proceeds to step S20. If it is not determined that the voice command has been acquired (step S18; No), the process proceeds to step S24.

ステップＳ１８でＹｅｓと判定された場合、音声コマンド受付部４４は、取得した音声コマンドの認識率は第一閾値以上であるか否かを判定する（ステップＳ２０）。音声コマンドの認識率が第一閾値以上であると判定された場合（ステップＳ２０；Ｙｅｓ）、ステップＳ２２に進む。音声コマンドの認識率が第一閾値以上であると判定されない場合（ステップＳ２０；Ｎｏ）、ステップＳ２４に進む。 If Yes is determined in step S18, the voice command reception unit 44 determines whether the recognition rate of the acquired voice command is equal to or higher than the first threshold (step S20). If it is determined that the recognition rate of the voice command is equal to or higher than the first threshold (step S20; Yes), the process proceeds to step S22. If it is not determined that the recognition rate of the voice command is equal to or higher than the first threshold (step S20; No), the process proceeds to step S24.

ステップＳ１４およびステップＳ１８においては、音声コマンドを取得したか否かの判断に加えて、取得した音声コマンドが、緊急性または即時性の高い音声コマンドであるか否かを判断してもよい。言い換えると、ステップＳ１４およびステップＳ１８においては、緊急性または即時性の高い音声コマンドを取得したか否かを判定する。緊急性または即時性の高い音声コマンドとは、音声コマンドが受け付けられることで、遅延なく動作開始することが要求される機能に対する操作を要求する音声コマンドである。例えば、記録装置１における緊急性または即時性の高い音声コマンドとは、イベント記録を指示する音声コマンドである。 In steps S14 and S18, in addition to determining whether a voice command has been acquired, it may be determined whether the acquired voice command is one with high urgency or immediacy. In other words, in steps S14 and S18, it is determined whether a voice command with high urgency or immediacy has been acquired. A voice command with high urgency or immediacy is a voice command that requests operation of a function that must start without delay once the voice command is accepted. For example, in the recording device 1, a voice command with high urgency or immediacy is a voice command instructing event recording.

ステップＳ１６でＹｅｓまたはステップＳ２０でＹｅｓと判定された場合、記録制御部３６は、イベントデータを記録部１４に保存する（ステップＳ２２）。具体的には、記録制御部３６は、音声コマンド受付部４４が音声コマンドを取得した時点の前後の第一映像データをイベントデータとして記録部１４に保存する。そして、ステップＳ２４に進む。 If Yes in step S16 or Yes in step S20, the recording control unit 36 stores the event data in the recording unit 14 (step S22). Specifically, the recording control unit 36 stores the first video data before and after the voice command receiving unit 44 acquires the voice command in the recording unit 14 as event data. Then, the process proceeds to step S24.

ステップＳ１４からステップＳ２０でＮｏと判定された場合、またはステップＳ２２の後、制御部２６は、処理を終了するか否かを判定する（ステップＳ２４）。具体的には、制御部２６は、操作部２２が電源をオフにする操作や、処理を終了する旨の操作を受け付けた場合、または、記録装置１が装着されている車両のエンジンなどの動力がＯＦＦとなることで、処理を終了すると判定する。処理を終了すると判定された場合（ステップＳ２４；Ｙｅｓ）、図２の処理を終了する。処理を終了すると判定されない場合（ステップＳ２４；Ｎｏ）、ステップＳ１２に進む。 If No is determined in any of steps S14 to S20, or after step S22, the control unit 26 determines whether to end the process (step S24). Specifically, the control unit 26 determines to end the process when the operation unit 22 receives an operation to turn off the power or an operation to end the process, or when the power source, such as the engine, of the vehicle in which the recording device 1 is mounted is turned off. If it is determined to end the process (step S24; Yes), the process of FIG. 2 ends. If it is not determined to end the process (step S24; No), the process returns to step S12.
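
One pass through steps S12 to S22 of FIG. 2 can be condensed into a single decision function. This is a sketch under stated assumptions: the function name and boolean inputs are hypothetical stand-ins for the outputs of the detection unit and the voice command reception unit, and the default thresholds are the example values of 90% and 80%.

```python
def should_save_event(facing_away: bool, command_heard: bool, recognition_rate: float,
                      first_threshold: float = 0.90,
                      second_threshold: float = 0.80) -> bool:
    """Return True when the flow reaches step S22 (save event data)."""
    if facing_away:                                   # S12: Yes
        if not command_heard:                         # S14: No -> S24
            return False
        return recognition_rate >= second_threshold   # S16
    if not command_heard:                             # S18: No -> S24
        return False
    return recognition_rate >= first_threshold        # S20
```

For a recognition rate of 0.85, the function returns True only when the occupant is facing away, since 0.85 clears the second threshold but not the first.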

上述のとおり、第一実施形態は、車両の搭乗者が進行方向を向いている場合と、進行方向以外を向いている場合とで、音声を音声コマンドとして認識するための認識率を変更して、イベントデータの保存を行う。第一実施形態では、進行方向以外を向いている場合には、進行方向を向いている場合と比較して、認識率を低くしてイベントデータの保存処理を実行する。これにより、第一実施形態は、搭乗者が進行方向以外を向いているため、マイクロフォン１８が搭乗者の音声を収音しづらい状況であっても、音声コマンドによるイベントデータの保存を適切に行うことができる。 As described above, the first embodiment saves event data while changing the recognition rate required to recognize a voice as a voice command depending on whether the vehicle occupant is facing the traveling direction or another direction. In the first embodiment, when the occupant is facing a direction other than the traveling direction, the required recognition rate is lower than when the occupant is facing the traveling direction, and the event data storage process is executed on that basis. As a result, the first embodiment can appropriately save event data by voice command even when the microphone 18 has difficulty picking up the occupant's voice because the occupant is facing a direction other than the traveling direction.

［第二実施形態］ [Second embodiment]
第二実施形態について説明する。第二実施形態は、車両の複数の搭乗者が乗車している場合に、複数の搭乗者の向いている方向に基づいて、音声コマンドの認識率を変更する点で、第一実施形態と異なる。 A second embodiment will be described. The second embodiment differs from the first embodiment in that, when multiple occupants are riding in the vehicle, the voice command recognition rate is changed based on the directions in which the occupants are facing.

（記録装置） (Recording device)
図３を用いて、第二実施形態に係る記録装置の構成例を説明する。図３は、第二実施形態に係る記録装置の構成例を示すブロック図である。 A configuration example of the recording device according to the second embodiment will be described with reference to FIG. 3. FIG. 3 is a block diagram showing a configuration example of the recording device according to the second embodiment.

図３に示すように、記録装置１Ａは、制御部２６Ａが割合算出部５２を備える点で、図１に示す記録装置１と異なる。 As shown in FIG. 3, the recording apparatus 1A differs from the recording apparatus 1 shown in FIG. 1 in that the control section 26A includes a ratio calculation section 52.

検出部４２Ａは、車両に複数の搭乗者が乗車している場合に、複数の搭乗者の各々が向いている方向を検出する。検出部４２Ａは、複数の搭乗者の各々が進行方向を向いているか、進行方向以外を向いているかを検出する。検出部４２Ａは、複数の搭乗者のうち所定割合以上の搭乗者が車両の進行方向以外の同一方向を向いている場合に、車両の搭乗者が車両の進行方向以外を向いていると判断する。 When multiple occupants are riding in the vehicle, the detection unit 42A detects the direction in which each occupant is facing. The detection unit 42A detects whether each of the occupants is facing the traveling direction or a direction other than the traveling direction. When a predetermined ratio or more of the occupants are facing the same direction other than the traveling direction of the vehicle, the detection unit 42A determines that the occupants of the vehicle are facing a direction other than the traveling direction.

割合算出部５２は、複数の搭乗者のうち、所定方向を向いている搭乗者の割合を算出する。割合算出部５２は、複数の搭乗者のうち、進行方向以外を向いている搭乗者の割合を算出する。割合算出部５２は、複数の搭乗者のうち、進行方向以外を向いている搭乗者の割合が所定割合以上であるか否かを判定する。所定割合は、例えば、５０％以上である。具体的には、割合算出部５２は、搭乗者が２名の場合には１名以上、搭乗者が３名の場合には２名以上、搭乗者が４名の場合には２名以上が進行方向以外の同一方向を向いていた場合に、所定以上の割合の搭乗者が進行方向以外を向いていると判断する。なお、所定割合は、５０％に限定されず、その他の値であってもよい。 The ratio calculation unit 52 calculates the ratio of occupants facing a predetermined direction among the multiple occupants. The ratio calculation unit 52 calculates the ratio of occupants facing a direction other than the traveling direction. The ratio calculation unit 52 determines whether the ratio of occupants facing a direction other than the traveling direction is equal to or greater than a predetermined ratio. The predetermined ratio is, for example, 50% or more. Specifically, the ratio calculation unit 52 determines that the predetermined ratio or more of the occupants are facing a direction other than the traveling direction when one or more of two occupants, two or more of three occupants, or two or more of four occupants are facing the same direction other than the traveling direction. Note that the predetermined ratio is not limited to 50% and may be another value.
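
The ratio test can be sketched as below. The discrete direction labels ("forward", "left", "right", and so on) and the function name are assumptions introduced so that "the same direction" has a concrete meaning; the 50% default is the example ratio from the description.

```python
from collections import Counter


def majority_facing_away(directions, min_ratio: float = 0.5) -> bool:
    """Return True when at least min_ratio of the occupants face the same
    direction other than the traveling direction.

    directions holds one hypothetical label per occupant; "forward" denotes
    the traveling direction.
    """
    away = Counter(d for d in directions if d != "forward")
    if not directions or not away:
        return False
    # Largest group of occupants sharing one non-forward direction.
    largest_group = away.most_common(1)[0][1]
    return largest_group / len(directions) >= min_ratio


print(majority_facing_away(["left", "forward"]))                      # 1 of 2
print(majority_facing_away(["left", "forward", "forward"]))           # 1 of 3
print(majority_facing_away(["left", "right", "forward", "forward"]))  # no shared direction
```

The head counts this produces agree with the examples in the text: one of two, two of three, and two of four occupants are enough, provided they face the same direction.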

（制御部の処理） (Processing of control unit)
図４を用いて、第二実施形態に係る制御部の処理の流れを説明する。図４は、第二実施形態に係る制御部の処理の流れを示すフローチャートである。 The processing flow of the control unit according to the second embodiment will be described with reference to FIG. 4. FIG. 4 is a flowchart showing the processing flow of the control unit according to the second embodiment.

制御部２６Ａは、通常記録および向き検出を開始する（ステップＳ３０）。通常記録の処理は、図２に示すステップＳ１０と同一なので説明を省略する。検出部４２Ａは、車両に乗車している複数の搭乗者の各々の顔の向きを検出する。そして、ステップＳ３２に進む。 The control unit 26A starts normal recording and orientation detection (step S30). The normal recording process is the same as in step S10 shown in FIG. 2, so its description is omitted. The detection unit 42A detects the face orientation of each of the multiple occupants riding in the vehicle. The process then proceeds to step S32.

検出部４２Ａは、所定割合以上の搭乗者が車両の進行方向以外を向いているか否かを判定する（ステップＳ３２）。具体的には、割合算出部５２は、検出部４２Ａの検出結果に基づいて、所定割合以上の搭乗者が車両の進行方向以外を向いているか否かを判定する。そして、検出部４２Ａは、複数の搭乗者のうち所定割合以上の搭乗者が車両の進行方向以外の同一方向を向いている場合に、車両の搭乗者が車両の進行方向以外を向いていると判断する。所定割合以上の搭乗者が車両の進行方向以外を向いていると判定された場合（ステップＳ３２；Ｙｅｓ）、ステップＳ３４に進む。所定割合以上の搭乗者が車両の進行方向以外を向いていると判定されない場合（ステップＳ３２；Ｎｏ）、ステップＳ３８に進む。 The detection unit 42A determines whether a predetermined ratio or more of the occupants are facing a direction other than the traveling direction of the vehicle (step S32). Specifically, the ratio calculation unit 52 determines, based on the detection result of the detection unit 42A, whether the predetermined ratio or more of the occupants are facing a direction other than the traveling direction. The detection unit 42A then determines that the occupants of the vehicle are facing a direction other than the traveling direction when the predetermined ratio or more of the occupants are facing the same direction other than the traveling direction. If it is determined that the predetermined ratio or more of the occupants are facing a direction other than the traveling direction of the vehicle (step S32; Yes), the process proceeds to step S34. If it is not determined that the predetermined ratio or more of the occupants are facing a direction other than the traveling direction of the vehicle (step S32; No), the process proceeds to step S38.

ステップＳ３４からステップＳ４４の処理は、それぞれ、図２に示すステップＳ１４からステップＳ２４の処理と同一なので、説明を省略する。 The processing from step S34 to step S44 is the same as the processing from step S14 to step S24 shown in FIG. 2, respectively, so description thereof will be omitted.

上述のとおり、第二実施形態は、車両に搭乗している複数の搭乗者のうち、進行方向以外を向いている搭乗者の割合に応じて、音声を音声コマンドとして認識するための認識率を変更して、イベントデータの保存を行う。第二実施形態では、進行方向以外を向いている搭乗者の割合が所定割合以上である場合には、認識率を低くしてイベントデータの保存処理を実行する。これにより、第二実施形態は、複数の搭乗者が進行方向以外を向いているため、マイクロフォン１８が搭乗者の音声を収音しづらい状況であっても、音声コマンドによるイベントデータの保存を適切に行うことができる。 As described above, the second embodiment saves event data while changing the recognition rate required to recognize a voice as a voice command according to the ratio of occupants, among the multiple occupants in the vehicle, who are facing a direction other than the traveling direction. In the second embodiment, when the ratio of occupants facing a direction other than the traveling direction is equal to or greater than the predetermined ratio, the required recognition rate is lowered and the event data storage process is executed on that basis. As a result, the second embodiment can appropriately save event data by voice command even when the microphone 18 has difficulty picking up the occupants' voices because multiple occupants are facing a direction other than the traveling direction.

[第三実施形態] [Third embodiment]
第三実施形態について説明する。第三実施形態は、音声コマンドを用いて操作を行う汎用的な装置であり、例えば、スマートスピーカーやテレビジョン受信器などの家庭用装置、スマートフォン、タブレット端末、ＰＣなどの情報装置、車両において用いられるナビゲーション装置やインフォテインメントシステムなどに適用可能である。 A third embodiment will be described. The third embodiment is a general-purpose device operated by voice commands, and is applicable to, for example, home devices such as smart speakers and television receivers, information devices such as smartphones, tablet terminals, and PCs, and navigation devices and infotainment systems used in vehicles.

図５を用いて、第三実施形態に係る音声コマンド受付装置の構成例について説明する。図５は、第三実施形態に係る音声コマンド受付装置の構成例を示すブロック図である。 A configuration example of the voice command reception device according to the third embodiment will be described with reference to FIG. 5. FIG. 5 is a block diagram showing a configuration example of the voice command reception device according to the third embodiment.

図５に示すように、音声コマンド受付装置１００は、音声コマンド受付部１０２と、検出部１０４と、実行制御部１０６と、を備える。音声コマンド受付装置１００は、例えば、ＣＰＵやＭＰＵなどの情報処理装置と、ＲＡＭ又はＲＯＭなどの記憶装置とを有する。音声コマンド受付装置１００は、本発明に係るプログラムを実行する。音声コマンド受付装置１００は、例えば、ＡＳＩＣやＦＰＧＡ等の集積回路により実現されてもよい。音声コマンド受付装置１００は、ハードウェアと、ソフトウェアとの組み合わせで実現されてもよい。 As shown in FIG. 5, the voice command reception device 100 includes a voice command reception unit 102, a detection unit 104, and an execution control unit 106. The voice command reception device 100 has, for example, an information processing device such as a CPU or MPU, and a storage device such as a RAM or ROM. Voice command receiving device 100 executes a program according to the present invention. The voice command reception device 100 may be realized by an integrated circuit such as ASIC or FPGA, for example. Voice command accepting device 100 may be realized by a combination of hardware and software.

マイクロフォン１１０は、発話者が発話した音声を収音する。マイクロフォン１１０は、収音した音声に関する音声データを音声コマンド受付装置１００に出力する。マイクロフォン１１０は、音声コマンド受付装置１００と一体に構成されていてもよいし、別体に構成されていてもよい。 The microphone 110 picks up the voice uttered by the speaker. The microphone 110 outputs audio data for the collected voice to the voice command reception device 100. The microphone 110 may be configured integrally with the voice command reception device 100 or separately from it.

カメラ１２０は、発話者を撮影する。カメラ１２０は、少なくとも発話者の顔を撮影する。カメラ１２０は、撮影した映像に関する映像データを音声コマンド受付装置１００に出力する。カメラ１２０は、音声コマンド受付装置１００と一体に構成されていてもよいし、別体に構成されていてもよい。 Camera 120 captures the speaker. Camera 120 captures at least the speaker's face. Camera 120 outputs video data relating to the captured video to voice command reception device 100 . The camera 120 may be configured integrally with the voice command receiving device 100, or may be configured separately.

音声コマンド受付部１０２は、音声コマンドを受け付ける。音声コマンド受付部１０２は、例えば、マイクロフォン１１０が収音した音声を認識することで、音声コマンドを受け付ける。 The voice command accepting unit 102 accepts voice commands. The voice command receiving unit 102 receives voice commands by recognizing voices picked up by the microphone 110, for example.

検出部１０４は、音声コマンドを発話する環境における、音声コマンドが適切に認識されない状況となる条件を検出する。本実施形態においては、検出部１０４は、音声コマンドを発話する人物の顔の向きを検出する。検出部１０４は、例えば、カメラ１２０が撮影した映像データに基づいて、音声コマンドを発話する人物の顔の向きを検出する。 The detection unit 104 detects a condition in which a voice command is not properly recognized in an environment where the voice command is uttered. In this embodiment, the detection unit 104 detects the orientation of the face of the person who speaks the voice command. The detection unit 104 detects the orientation of the face of the person who speaks the voice command based on the video data captured by the camera 120, for example.

実行制御部１０６は、音声コマンド受付部１０２が音声コマンドを受け付けた場合に、受け付けた音声コマンドに対する機能を実行させる。 When the voice command receiving unit 102 receives a voice command, the execution control unit 106 executes a function corresponding to the received voice command.

音声コマンド受付部１０２は、検出部１０４が人物の顔の向きが音声コマンドの発話音声を取得するマイクロフォン１１０の方向を向いているか否かに応じて音声コマンドの認識率を変化させて音声コマンドを受け付ける。音声コマンド受付部１０２は、例えば、人物の顔の向きがマイクロフォン１１０の方向を向いていると判定された場合には、第一閾値以上の認識率で音声コマンドを受け付ける。音声コマンド受付部１０２は、例えば、人物の顔の向きがマイクロフォン１１０以外の方向を向いていると判定された場合には、第一閾値よりも低い第二閾値以上で音声コマンドを受け付ける。 The voice command reception unit 102 accepts voice commands while changing the required recognition rate according to whether the detection unit 104 determines that the person's face is directed toward the microphone 110 that acquires the uttered voice of the voice command. For example, when it is determined that the person's face is directed toward the microphone 110, the voice command reception unit 102 accepts the voice command at a recognition rate equal to or higher than the first threshold. When it is determined that the person's face is directed away from the microphone 110, the voice command reception unit 102 accepts the voice command at a recognition rate equal to or higher than a second threshold that is lower than the first threshold.

音声コマンド受付部１０２は、緊急性または即時性の高い音声コマンドに対しては、第二閾値以上の認識率で音声コマンドを受け付ける。第三実施形態において、緊急性または即時性の高い音声コマンドとは、緊急通話、緊急通信、放送コンテンツの記録開始指示、継続リスクの高い機能の停止指示など、機能の実行開始や実行終了に対して、操作時点からの遅延が好ましくない、または遅延によって悪影響やリスクのある機能に対する音声コマンドである。 For voice commands with high urgency or immediacy, the voice command reception unit 102 accepts the voice command at a recognition rate equal to or higher than the second threshold. In the third embodiment, a voice command with high urgency or immediacy is a voice command for a function for which a delay between the operation and the start or end of execution is undesirable, or for which such a delay causes adverse effects or risks. Examples include an emergency call, emergency communication, an instruction to start recording broadcast content, and an instruction to stop a function whose continuation carries a high risk.

（音声コマンド受付装置の処理） (Processing of voice command reception device)
図６を用いて、第三実施形態に係る音声コマンド受付装置の処理の流れを説明する。図６は、第三実施形態に係る音声コマンド受付装置の処理の流れを示すフローチャートである。 The processing flow of the voice command reception device according to the third embodiment will be described with reference to FIG. 6. FIG. 6 is a flowchart showing the processing flow of the voice command reception device according to the third embodiment.

検出部１０４は、向き検出を開始する（ステップＳ５０）。具体的には、検出部１０４は、発話者の顔の向きの検出を開始する。そして、ステップＳ５２に進む。 The detection unit 104 starts orientation detection (step S50). Specifically, the detection unit 104 starts detecting the orientation of the speaker's face. Then, the process proceeds to step S52.

検出部１０４は、発話者はマイクロフォン１１０の方向を向いているか否かを判定する（ステップＳ５２）。具体的には、検出部１０４は、発話者がマイクロフォン１１０の方向を所定時間以上向いているか否かを判断する。検出部１０４は、発話者がマイクロフォン１１０の方向を所定時間以上向いている場合に、発話者がマイクロフォン１１０の方向を向いていると判断する。所定時間は、例えば、２秒以上であるが、これに限定されない。発話者がマイクロフォン１１０の方向を向いていると判定された場合（ステップＳ５２；Ｙｅｓ）、ステップＳ５４に進む。発話者がマイクロフォン１１０の方向を向いていると判定されない場合（ステップＳ５２；Ｎｏ）、ステップＳ５８に進む。 The detection unit 104 determines whether the speaker is facing the microphone 110 (step S52). Specifically, the detection unit 104 determines whether the speaker has been facing the microphone 110 for a predetermined time or longer. When the speaker has been facing the microphone 110 for the predetermined time or longer, the detection unit 104 determines that the speaker is facing the microphone 110. The predetermined time is, for example, two seconds or more, but is not limited to this. If it is determined that the speaker is facing the microphone 110 (step S52; Yes), the process proceeds to step S54. If it is not determined that the speaker is facing the microphone 110 (step S52; No), the process proceeds to step S58.

ステップＳ５２でＹｅｓと判定された場合、音声コマンド受付部１０２は、マイクロフォン１１０により発話者から音声コマンドを取得したか否かを判定する（ステップＳ５４）。音声コマンドを取得したと判定された場合（ステップＳ５４；Ｙｅｓ）、ステップＳ５６に進む。音声コマンドを取得したと判定されない場合（ステップＳ５４；Ｎｏ）、ステップＳ６４に進む。 If determined as Yes in step S52, the voice command reception unit 102 determines whether or not a voice command has been received from the speaker through the microphone 110 (step S54). If it is determined that the voice command has been acquired (step S54; Yes), the process proceeds to step S56. If it is not determined that the voice command has been acquired (step S54; No), the process proceeds to step S64.

If the determination in step S54 is Yes, the voice command reception unit 102 determines whether the recognition rate of the acquired voice command is equal to or higher than the first threshold (step S56). If the recognition rate is determined to be equal to or higher than the first threshold (step S56; Yes), the process proceeds to step S62. Otherwise (step S56; No), the process proceeds to step S64.

If the determination in step S52 is No, the voice command reception unit 102 determines whether a voice command has been acquired from the speaker through the microphone 110 (step S58). If it is determined that a voice command has been acquired (step S58; Yes), the process proceeds to step S60. Otherwise (step S58; No), the process proceeds to step S64.

If the determination in step S58 is Yes, the voice command reception unit 102 determines whether the recognition rate of the acquired voice command is equal to or higher than the second threshold (step S60). If the recognition rate is determined to be equal to or higher than the second threshold (step S60; Yes), the process proceeds to step S62. Otherwise (step S60; No), the process proceeds to step S64.

In steps S54 and S58, in addition to determining whether a voice command has been acquired, it may also be determined whether the acquired voice command is one with high urgency or immediacy.
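The text does not state how the result of this urgency check is used. As a minimal sketch of one possible reading, suggested by claim 2, the lower second threshold could be reserved for urgent commands uttered while the speaker is not facing the microphone. The command names, the threshold values 0.8 and 0.6, and the function `required_threshold` are all hypothetical and do not appear in the embodiment.

```python
# Hypothetical set of commands treated as having high urgency or immediacy.
URGENT_COMMANDS = frozenset({"record event"})

FIRST_THRESHOLD = 0.8   # hypothetical value
SECOND_THRESHOLD = 0.6  # hypothetical value, lower than the first threshold


def required_threshold(command: str, facing_microphone: bool) -> float:
    """Return the recognition rate a command must reach to be accepted.

    Assumed reading: the relaxed second threshold applies only when the
    speaker is not facing the microphone AND the command is urgent.
    """
    if not facing_microphone and command in URGENT_COMMANDS:
        return SECOND_THRESHOLD
    return FIRST_THRESHOLD
```

Under this assumption, a non-urgent command spoken while looking away would still have to meet the first threshold, which limits accidental activation to the commands where a missed recognition is most costly.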

If the determination is Yes in step S56 or in step S60, the execution control unit 106 executes the function corresponding to the voice command (step S62). The process then proceeds to step S64.

If the determination is No in any of steps S54 through S60, or after step S62, the voice command reception device 100 determines whether to end the processing (step S64). Specifically, the voice command reception device 100 determines to end the processing when, for example, it receives an operation to turn off the power or an operation instructing that the processing be ended. If it is determined to end the processing (step S64; Yes), the processing of FIG. 6 ends. Otherwise (step S64; No), the process returns to step S52.
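The loop of steps S52 through S64 can be summarized in a short sketch. It is an illustration under stated assumptions, not the device's implementation: the `Observation` record, the `run_flow` function, and the threshold values 0.8 and 0.6 are hypothetical, and the face-direction result and recognition rate are taken as given inputs.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional

FIRST_THRESHOLD = 0.8   # hypothetical value
SECOND_THRESHOLD = 0.6  # hypothetical value, lower than the first threshold


@dataclass
class Observation:
    facing_microphone: bool        # result of step S52
    command: Optional[str]         # None when no voice command was acquired
    recognition_rate: float = 0.0  # recognizer score for the command
    end_requested: bool = False    # power-off or end operation (step S64)


def run_flow(observations: Iterable[Observation],
             execute: Callable[[str], None]) -> None:
    """Process observations one by one, mirroring one pass of FIG. 6 each."""
    for obs in observations:
        # S52: the face direction selects which threshold applies.
        threshold = FIRST_THRESHOLD if obs.facing_microphone else SECOND_THRESHOLD
        # S54 / S58: was a voice command acquired?
        if obs.command is not None:
            # S56 / S60: compare the recognition rate with that threshold.
            if obs.recognition_rate >= threshold:
                execute(obs.command)  # S62: execute the function
        # S64: end the processing or return to S52.
        if obs.end_requested:
            break
```

For example, a command recognized at a rate of 0.7 would be rejected while the speaker faces the microphone but accepted while the speaker looks elsewhere, because only the second threshold applies in the latter case.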

As described above, the third embodiment changes the recognition rate required for a voice to be recognized as a voice command depending on whether the speaker is facing the microphone or facing somewhere other than the microphone, and executes the function corresponding to the voice command accordingly. When the speaker is facing a direction other than the microphone, the required recognition rate is set lower than when the speaker is facing the microphone. As a result, the third embodiment can appropriately execute the function corresponding to the voice command even when the microphone has difficulty picking up the speaker's voice because the speaker is facing away from it.

Although embodiments of the present invention have been described above, the present invention is not limited by the contents of these embodiments. The components described above include those that a person skilled in the art could easily conceive, those that are substantially the same, and those within the so-called range of equivalents. Furthermore, the components described above can be combined as appropriate, and various omissions, replacements, or modifications of the components can be made without departing from the gist of the embodiments described above.

Reference Signs List
1, 1A recording device
10 first camera
12 second camera
14 recording unit
16 display unit
18, 110 microphone
20 acceleration sensor
22 operation unit
24 GNSS receiving unit
26, 26A control unit (recording control device)
30 video data acquisition unit
32 buffer memory
34 video data processing unit
36 recording control unit
38 playback control unit
40 display control unit
42, 42A, 104 detection unit
44 voice command reception unit
46 event detection unit
48 operation control unit
50 position information acquisition unit
52 ratio calculation unit
100 voice command reception device
102 voice command reception unit
106 execution control unit
120 camera

Claims

1. A voice command reception device comprising:
a voice command reception unit that receives a voice command;
a detection unit that detects the orientation of the face of a person who utters the voice command; and
an execution control unit that executes a function corresponding to the received voice command when the voice command reception unit accepts the voice command,
wherein, when the detection unit determines that the person's face is directed toward a microphone that acquires the uttered voice of the voice command, the voice command reception unit accepts the voice command if the recognition rate of the acquired voice command is equal to or higher than a first threshold, and, when the detection unit determines that the person's face is directed in a direction other than toward the microphone, the voice command reception unit accepts the voice command if the recognition rate of the acquired voice command is equal to or higher than a second threshold lower than the first threshold.

2. The voice command reception device according to claim 1, wherein, for a voice command with high urgency or immediacy, the voice command reception unit accepts the voice command if the recognition rate of the acquired voice command is equal to or higher than the second threshold lower than the first threshold.

3. The voice command reception device according to claim 1, wherein
the voice command reception device is a vehicle recording control device used in a vehicle and further comprises a video data acquisition unit that acquires first video data captured by a first capturing unit that captures the surroundings of the vehicle and second video data captured by a second capturing unit that captures the interior of the vehicle,
the voice command reception unit receives an event recording instruction given by a voice command,
the detection unit detects the orientation of the face of an occupant of the vehicle from the second video data, and
when the voice command reception unit accepts the event recording instruction given by the voice command, the execution control unit saves, as event data, the first video data including the time point at which the event recording instruction was accepted.

4. The voice command reception device according to claim 3, wherein the detection unit determines that the occupant of the vehicle is facing a direction other than toward the microphone when the occupant has been facing a direction other than the traveling direction of the vehicle for a predetermined time or longer.

5. The voice command reception device according to claim 3 or 4, wherein the detection unit detects the face orientations of a plurality of occupants of the vehicle from the second video data and, when a predetermined ratio or more of the occupants are facing the same direction other than the traveling direction of the vehicle, determines that the occupants of the vehicle are facing a direction other than the traveling direction of the vehicle.

6. A voice command reception method executed by a voice command reception device, the method comprising:
a step of detecting the orientation of the face of a person who utters a voice command;
a step of accepting the voice command if the recognition rate of the acquired voice command is equal to or higher than a first threshold when it is determined that the person's face is directed toward a microphone that acquires the uttered voice of the voice command, and accepting the voice command if the recognition rate of the acquired voice command is equal to or higher than a second threshold lower than the first threshold when it is determined that the person's face is directed in a direction other than toward the microphone; and
a step of executing a function corresponding to the accepted voice command when the voice command is accepted.

7. A program that causes a computer to execute:
a step of detecting the orientation of the face of a person who utters a voice command;
a step of accepting the voice command if the recognition rate of the acquired voice command is equal to or higher than a first threshold when it is determined that the person's face is directed toward a microphone that acquires the uttered voice of the voice command, and accepting the voice command if the recognition rate of the acquired voice command is equal to or higher than a second threshold lower than the first threshold when it is determined that the person's face is directed in a direction other than toward the microphone; and
a step of executing a function corresponding to the accepted voice command when the voice command is accepted.