JP2012121387A

JP2012121387A - Vehicle interior monitoring device

Info

Publication number: JP2012121387A
Application number: JP2010272058A
Authority: JP
Inventors: Nobuaki Ueda; 伸晃上田; Jun Fujiwara; 純冨士原; Katsutoshi Okada; 勝利岡田; Minoru Fujioka; 稔藤岡; Hironori Nomori; 寛典野守; Natsumi Nishiyama; 奈津美西山; Kimiaki Shima; 公章四間
Original assignee: Denso Ten Ltd
Current assignee: Denso Ten Ltd
Priority date: 2010-12-06
Filing date: 2010-12-06
Publication date: 2012-06-28
Anticipated expiration: 2030-12-06
Also published as: JP5687889B2

Abstract

PROBLEM TO BE SOLVED: To allow a supervisor to check, actively and accurately, the state of a vehicle passenger who is the monitoring target.

SOLUTION: In a vehicle interior monitoring device, a personal information registration part registers in advance the personal information of each passenger of the vehicle; a voice recognition part recognizes a passenger's voice from the sound in the vehicle interior; a call word identification part identifies and analyzes the supervisor's call in the passenger's voice; a feature acquisition part acquires the feature of the passenger whose name matches the name of the monitoring target obtained by the analysis; a position search part searches for and determines the position of the passenger who matches the acquired feature; and a display image generation part generates and displays a display image of the monitoring target based on the determined position.

Description

The present invention relates to a vehicle interior monitoring device that captures and displays images of the interior of a vehicle.

Conventionally, vehicle interior monitoring devices are known that capture the state of the vehicle interior with a camera placed inside the vehicle and display it on an in-vehicle monitor such as that of a car navigation system. Such a device helps ensure passenger safety by displaying, for example, the state of a passenger seated where the driver cannot easily see while driving.

To further ensure passenger safety, vehicle interior monitoring devices have been proposed in recent years that display the state of the direction in which an event is detected, for example when various sensors detect an abnormality in the rear seat or when a crying child is detected (see, for example, Patent Document 1 or Patent Document 2).

Patent Document 1: JP 2006-347232 A
Patent Document 2: JP 2008-221989 A

However, the conventional technology cannot let the driver check the state of the monitored passenger immediately. Specifically, because it displays the detection direction only after detecting an abnormality such as a crying child, the driver acting as supervisor can check the passenger's state only passively.

Consequently, rescue measures may be delayed when, for example, a child in the rear seat is in serious condition but is not crying. Moreover, even if the direction in which the child cried is displayed, the child may not be shown reliably if the child moves while crying.

A major challenge is therefore how to realize a vehicle interior monitoring device that lets the driver check the state of the monitored passenger actively and accurately. The same problem arises when the driver is the one being monitored.

The present invention has been made to solve the above problems of the conventional technology, and its object is to provide a vehicle interior monitoring device that lets a supervisor check the state of a monitored passenger actively and accurately.

To solve the above problems and achieve the object, the present invention is a vehicle interior monitoring device that captures and displays images of the interior of a vehicle, comprising: personal information registration means for registering personal information including a name and a feature for each passenger of the vehicle; voice recognition means for recognizing a passenger's voice; feature acquisition means for acquiring, from the personal information, the feature of the passenger corresponding to a name, based on a call addressed to a specific passenger that includes the name and is extracted from the voice recognized by the voice recognition means; position search means for searching for the position, in the vehicle interior, of the passenger matching the feature acquired by the feature acquisition means; and display video generation means for generating a display video corresponding to the position found by the position search means.

According to the present invention, personal information including a name and a feature is registered for each passenger of the vehicle; a passenger's voice is recognized; based on a call addressed to a specific passenger that includes the name and is extracted from the recognized voice, the feature of the passenger corresponding to that name is acquired from the personal information; the position in the vehicle interior of the passenger matching the acquired feature is searched for; and a display video corresponding to the position found is generated. This allows the supervisor to check the state of the monitored passenger actively and accurately.

FIG. 1 is a diagram showing an outline of the vehicle interior monitoring method according to the present invention.
FIG. 2 is a block diagram showing the configuration of the vehicle interior monitoring device.
FIG. 3 is a diagram showing example arrangements of the camera and the microphone.
FIG. 4 is a diagram for explaining the operation up to registration of personal information.
FIG. 5 is a diagram showing an example of trigger information settings.
FIG. 6 is a diagram for explaining the voice recognition processing in the voice recognition unit and the call word identification processing in the call word identification unit.
FIG. 7 is a diagram for explaining example calls and the resulting operations of the feature acquisition unit, the position search unit, and the display video generation unit.
FIG. 8 is a diagram for explaining the target search processing in the position search unit and the display video generation processing in the display video generation unit.
FIG. 9 is a flowchart showing the processing procedure executed by the vehicle interior monitoring device.
FIG. 10 is a diagram showing a modification of the registration of personal information.
FIG. 11 is a diagram for explaining example calls and a modification of the resulting operations of the feature acquisition unit, the position search unit, and the display video generation unit.

Preferred embodiments of the vehicle interior monitoring method according to the present invention are described in detail below with reference to the accompanying drawings. An outline of the method is first described with reference to FIG. 1, and then an embodiment of a vehicle interior monitoring device to which the method is applied is described with reference to FIGS. 2 to 11.

The following description takes as an example the case where the supervisor is mainly the driver. It also assumes that the vehicle interior monitoring device has an in-vehicle camera as its imaging unit and an in-vehicle monitor as its display unit. Displaying video based on the in-vehicle camera's captured video on that monitor is referred to as "monitoring".

First, an outline of the vehicle interior monitoring method according to the present invention is described with reference to FIG. 1. FIG. 1 is a diagram showing an outline of the method. Part (A) of FIG. 1 shows an example arrangement of seating positions in the passenger compartment 51 when the vehicle 50 is viewed from above, and part (B) shows an outline of the method.

As shown in (A) of FIG. 1, the vehicle 50 has a passenger compartment 51, which has seating positions 51RR and 51RL in its rear part and seating positions 51FR and 51FL in its front part.

Passenger a is seated at seating position 51RR, passenger b at 51RL, passenger c at 51FR, and passenger d at 51FL. Passenger c is both the driver and the supervisor. The "×" mark in the figure indicates that passenger c cannot see passenger a's facial expression.

With the conventional method, passenger c could not immediately check the state of another passenger even when wanting to. Specifically, a seating position such as 51RR or 51RL was monitored only when an abnormality was detected there. In other words, passenger c could check the state of passenger a or b only passively.

Passenger c can, of course, actively check on passenger a or b by turning around or looking at the rearview mirror. However, because passenger c is also the driver, turning around or gazing at the rearview mirror may not be appropriate while driving.

In particular, immediately checking the state of passenger a, who sits in passenger c's blind spot, was extremely difficult whether passively or actively.

Therefore, as shown in (B) of FIG. 1, the vehicle interior monitoring method according to the present invention searches for and identifies the position of the monitoring target in the captured video based on an active "call" from the supervisor to the monitoring target, and immediately monitors the state of that person.

First, in the method, personal information including a "name" and a "feature" is registered for every passenger a to d in the personal information 15a, which is a database, for example when they board the vehicle 50 (see (B-1) in FIG. 1).

Next, suppose that passenger c calls to passenger a, "Are you okay, Mr. a?", for example while the vehicle 50 is traveling (see (B-2) in FIG. 1). Based on this call, the method acquires the "feature" of "Mr. a" from the personal information 15a.

Although (B-3) in FIG. 1 shows the case where the "feature" of "Mr. a" is a face image of "Mr. a" (see the portion enclosed by closed curve 1 in the figure), the "feature" need not be a face image. This point is described later with reference to FIG. 10.

Based on the acquired "feature", the method then searches the captured video 11a of the in-vehicle camera for "Mr. a", who matches the "feature" (see (B-4) in FIG. 1), and identifies the position of "Mr. a" in the captured video 11a (see (B-5) in FIG. 1).

The method then displays the identified position of "Mr. a" (see the portion enclosed by closed curve 2 in (B-5) of FIG. 1) on the display unit 13 (corresponding to the in-vehicle monitor) so that passenger c can reliably see it without hindering the driving operation (see (B-6) in FIG. 1).

Although (B-6) in FIG. 1 shows an example in which the position of "Mr. a" is displayed enlarged, the display is not limited to an enlarged view. This point is described later with reference to FIG. 8. Details of the series of operations triggered by the "call" are described later with reference to FIGS. 5 to 7.

In this way, the vehicle interior monitoring method according to the present invention searches for and identifies the position of the monitoring target in the captured video based on an active "call" from the supervisor to the monitoring target, and immediately monitors the state of that person.

Therefore, the vehicle interior monitoring method according to the present invention allows the supervisor to check the state of the monitored passenger actively and accurately.
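The flow of steps (B-2) to (B-5) in FIG. 1 can be sketched as follows. This is only a minimal illustration under stated assumptions: the function `monitor_on_call`, the `MonitoringResult` type, the use of a plain string as a stand-in for a face image, and the representation of a captured frame as a mapping from detected features to bounding boxes are all hypothetical and do not come from the patent, which does not specify data formats or a matching algorithm.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Box = Tuple[int, int, int, int]  # hypothetical (x, y, width, height) in the captured frame


@dataclass
class MonitoringResult:
    name: str  # registered name found in the call
    box: Box   # position of the matching passenger in the frame


def monitor_on_call(call: str,
                    personal_info: Dict[str, str],
                    frame: Dict[str, Box]) -> Optional[MonitoringResult]:
    """Call -> feature -> search -> position, as in (B-2) to (B-5)."""
    # (B-2)/(B-3): find a registered name inside the call and fetch its feature.
    name = next((n for n in personal_info if n in call), None)
    if name is None:
        return None
    feature = personal_info[name]
    # (B-4)/(B-5): search the frame for the feature and take its position.
    # A real system would run image matching here; a dict lookup stands in for it.
    box = frame.get(feature)
    if box is None:
        return None
    return MonitoringResult(name, box)


# Example: passenger c calls to passenger a.
personal_info = {"Mr. a": "face_a", "Mr. b": "face_b"}
frame = {"face_a": (400, 120, 80, 80), "face_b": (100, 120, 80, 80)}
result = monitor_on_call("Are you okay, Mr. a?", personal_info, frame)
```

The returned bounding box is what a display step corresponding to (B-6) would crop or enlarge for the in-vehicle monitor.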

The description with reference to FIG. 1 covered the case where the supervisor's "call" triggers acquisition of the monitoring target's feature, followed by the search for and identification of that person's position. To improve responsiveness, however, the approximate position of each passenger may be determined in advance without waiting for a "call" as a trigger. This point is described later with reference to FIG. 11.

An embodiment of a vehicle interior monitoring device to which the method described with reference to FIG. 1 is applied is described in detail below. As in the description of FIG. 1, the embodiment mainly covers the case where passenger c is the driver and supervisor and passenger a is the monitoring target. In the following, the "call" described above is referred to as a "call word".

FIG. 2 is a block diagram showing the configuration of the vehicle interior monitoring device 10 according to this embodiment. FIG. 2 shows only the components needed to explain the features of the vehicle interior monitoring device 10 and omits general components.

As shown in FIG. 2, the vehicle interior monitoring device 10 includes a camera 11, a microphone 12, a display unit 13, a control unit 14, and a storage unit 15.

The control unit 14 further includes a call word input unit 14a, a personal video input unit 14b, a registration information generation unit 14c, a personal information registration unit 14d, a voice recognition unit 14e, a call word identification unit 14f, a feature acquisition unit 14g, a position search unit 14h, and a display video generation unit 14i. The storage unit 15 stores personal information 15a and trigger information 15b.

The camera 11 is a camera device that captures images of the passengers and of the interior of the passenger compartment 51 (see FIG. 1), and corresponds to the in-vehicle camera. The camera device used to image each passenger when registering the personal information 15a in advance (see FIG. 1) need not be the in-vehicle camera. An infrared camera may also be used to handle night driving of the vehicle 50.

The microphone 12 is an input device that takes in various sounds in the passenger compartment 51 as acoustic signals, including passengers' "names" and the "call words" uttered by the supervisor.

When the personal information 15a is registered in advance, the microphone 12 outputs its acoustic signal to the call word input unit 14a and the camera 11 outputs its imaging signal to the personal video input unit 14b. When a monitoring target is monitored, the microphone 12 outputs its acoustic signal to the voice recognition unit 14e and the camera 11 outputs its captured video signal to the position search unit 14h.
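The mode-dependent signal routing just described can be summarized as a small lookup table. The sketch below is illustrative only: the `Mode` enum, the `ROUTES` table, and the `route` function are hypothetical names introduced here, and only the reference signs (11, 12, 14a, 14b, 14e, 14h) come from FIG. 2.

```python
from enum import Enum, auto


class Mode(Enum):
    REGISTRATION = auto()  # registering personal information 15a in advance
    MONITORING = auto()    # monitoring a monitoring target


# Destination unit for each (mode, source device) pair.
ROUTES = {
    (Mode.REGISTRATION, "microphone 12"): "call word input unit 14a",
    (Mode.REGISTRATION, "camera 11"): "personal video input unit 14b",
    (Mode.MONITORING, "microphone 12"): "voice recognition unit 14e",
    (Mode.MONITORING, "camera 11"): "position search unit 14h",
}


def route(mode: Mode, source: str) -> str:
    """Return the unit that receives the signal from `source` in `mode`."""
    return ROUTES[(mode, source)]
```

For example, `route(Mode.MONITORING, "camera 11")` yields the position search unit 14h, while the same camera feeds the personal video input unit 14b during registration.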

Example arrangements of the camera 11 and the microphone 12 in the passenger compartment 51 are now described with reference to FIG. 3. FIG. 3 is a diagram showing example arrangements of the camera 11 and the microphone 12. Part (A) of FIG. 3 shows example arrangements of the camera 11, and part (B) shows example arrangements of the microphone 12.

As shown in (A-a) of FIG. 3, the camera 11 can be placed at the center of the front part of the passenger compartment 51, such as the center of the dashboard or near the rearview mirror. The advantage of this placement is that the entire passenger compartment 51 is in view. The camera is not limited to these positions and may be installed anywhere that offers a view of the entire passenger compartment 51.

To give a view of the entire passenger compartment 51, the camera 11 may instead be placed at the center of the ceiling of the passenger compartment 51, for example near the room lamp, as shown in (A-b) of FIG. 3. In that case, using a fisheye lens or the like is preferable because it makes each passenger's state, including facial expression, easier to check.

As shown in (A-c) of FIG. 3, a plurality of cameras 11 may also be used. Part (A-c) of FIG. 3 shows an example in which a camera 11 is placed in front of each passenger.

In that case, the cameras 11 for the passengers seated in the front part of the passenger compartment 51 are placed, for example, on the dashboard or near the windshield. The cameras 11 for the passengers seated in the rear part can be placed on the back of seating position 51FR or 51FL, for example on the back of the headrests of the driver's seat and the front passenger seat.

When a plurality of cameras 11 are placed as shown in (A-c) of FIG. 3, monitoring a target has the advantage of requiring only the selection of the captured image from the camera 11 that shows that person most clearly.

This point is described later with reference to FIG. 8. In the following, the camera 11 is assumed to be placed as in the example of (A-a) of FIG. 3.

As shown in (B-a) of FIG. 3, the microphone 12 can be placed in front of seating position 51FR, that is, on the dashboard in front of the driver's seat or near the windshield. This placement has the advantage of reliably picking up the "call words" uttered by the driver, who is often the supervisor.

As shown in (B-b) of FIG. 3, the microphone 12 may also be placed at the center of the ceiling of the passenger compartment 51, for example near the room lamp. In that case, using an omnidirectional microphone or the like has the advantage that "call words" can be picked up reliably whichever passenger acts as supervisor.

In the following, the microphone 12 is assumed to be placed as in the example of (B-a) of FIG. 3.

Returning to FIG. 2, the display unit 13 is described. The display unit 13 is a display device, such as an in-vehicle monitor, that displays the monitoring video of the monitoring target generated and output by the display video generation unit 14i.

The control unit 14 is a processing unit that registers each passenger's personal information, acquires from that information the "feature" of the passenger who is "called", searches for and identifies the position of the passenger matching that "feature" in the video captured by the camera 11, generates a display video based on the identified position, and displays it on the display unit 13.

When the personal information 15a is registered, the call word input unit 14a extracts each passenger's "name" from the acoustic signal input from the microphone 12 and outputs the voice signal of the extracted "name" to the registration information generation unit 14c.

When the personal information 15a is registered, the personal video input unit 14b generates image data representing each passenger's "feature" from the imaging signal input from the camera 11 and outputs the generated image data to the registration information generation unit 14c.

The registration information generation unit 14c converts the voice signal of the "name" input from the call word input unit 14a into text and generates registration information, which associates the text "name" with the image data input from the personal video input unit 14b. The registration information generation unit 14c also outputs the generated registration information to the personal information registration unit 14d.

The personal information registration unit 14d is a processing unit that stores the registration information input from the registration information generation unit 14c in the personal information 15a.

The operation up to registration of the personal information 15a is now described with reference to FIG. 4. FIG. 4 is a diagram for explaining that operation. In the figure, the portion enclosed by closed curve 14a corresponds to the operation of the call word input unit 14a, the portion enclosed by closed curve 14b to the operation of the personal video input unit 14b, and the portion enclosed by closed curves 14c and 14d to the operations of the registration information generation unit 14c and the personal information registration unit 14d. The portion enclosed by closed curves 14c and 14d also includes an example of registered personal information 15a.

First, as shown in the portion enclosed by closed curve 14a, the call word input unit 14a receives from the microphone 12 the "name" of the person being registered (here, "Mr. a") as uttered by the registrant (here, passenger c or another passenger), and outputs the voice signal of that "name" to the registration information generation unit 14c.

Meanwhile, as shown in the portion enclosed by closed curve 14b, the personal video input unit 14b generates image data (here, a face image) of the person being registered (here, passenger a) from the input from the camera 11 and outputs the generated image data to the registration information generation unit 14c.

Then, as shown in the portion enclosed by closed curves 14c and 14d, the registration information generation unit 14c and the personal information registration unit 14d convert the voice signal of "Mr. a" input from the call word input unit 14a into text, associate it with the "face image of passenger a" input from the personal video input unit 14b, and register the result in the personal information 15a.

Converting a "name" such as "Mr. a" into text has the advantage that any passenger can act as the supervisor who utters a "call word".

The example of registered personal information 15a shown in the portion enclosed by closed curves 14c and 14d is now described. The personal information 15a includes a "No." item, a "name" item, and a "feature" item.

The "No." item stores a record number that uniquely identifies each record registered per "name". The "name" item stores each passenger's "name" converted into text as described above. The "feature" item stores data indicating the feature (here, a face image) of the passenger corresponding to each "name".

As illustrated, a name stored in the "name" item may be of any kind, such as a full name, a nickname, or an honorific form. For example, FIG. 4 shows names registered in various forms, such as "Mr. a" (a-san), "b-kun", "c", "Mr. d" (d-san), and "d-senpai". The same person may also be registered under different names, as with "Mr. d" and "d-senpai".
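The three-item record layout described above, including one person registered under several names, might be modelled as in the following sketch. The class and method names (`Record`, `PersonalInfo`, `register`, `feature_for`) and the use of raw bytes for the face image are assumptions made for illustration; the patent specifies only the "No.", "name", and "feature" items.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Record:
    no: int         # "No." item: unique record number
    name: str       # "name" item: the name converted into text
    feature: bytes  # "feature" item: e.g. an encoded face image (format assumed)


@dataclass
class PersonalInfo:
    """Stand-in for the personal information 15a."""
    records: List[Record] = field(default_factory=list)

    def register(self, name: str, feature: bytes) -> Record:
        # One record per name; record numbers are assigned sequentially.
        record = Record(no=len(self.records) + 1, name=name, feature=feature)
        self.records.append(record)
        return record

    def feature_for(self, name: str) -> Optional[bytes]:
        # Return the feature registered under `name`, or None if unknown.
        for record in self.records:
            if record.name == name:
                return record.feature
        return None


info = PersonalInfo()
info.register("Mr. a", b"<face image of a>")
info.register("Mr. d", b"<face image of d>")
info.register("d-senpai", b"<face image of d>")  # same person, different name
```

Because each name gets its own record, a call using either "Mr. d" or "d-senpai" resolves to the same feature data.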

図２の説明に戻り、音声認識部１４ｅについて説明する。音声認識部１４ｅは、監視対象者のモニタリングを行うにあたり、マイク１２から入力される音響信号に基づいて「搭乗者」の発声した音声を抽出したうえで、抽出した音声の音声信号を呼びかけワード識別部１４ｆへ出力する処理を行う処理部である。 Returning to the description of FIG. 2, the voice recognition unit 14e will be described. The voice recognition unit 14e is a processing unit that, when the monitoring subject is to be monitored, extracts the voice uttered by a “passenger” from the acoustic signal input from the microphone 12 and outputs the voice signal of the extracted voice to the calling word identification unit 14f.

呼びかけワード識別部１４ｆは、音声認識部１４ｅから入力された音声信号に基づき、監視者が監視対象者へ向けて発声した「呼びかけワード」をテキスト化する処理を行う処理部である。 The calling word identification unit 14f is a processing unit that, based on the voice signal input from the voice recognition unit 14e, converts into text the “calling word” uttered by the supervisor toward the monitoring subject.

また、呼びかけワード識別部１４ｆは、テキスト化した「呼びかけワード」を「呼称」と「呼称」以外のワードとに分解したうえで、かかる「呼称」以外のワードが監視対象者のモニタリングに関する「トリガ」となるワードであるか否かを判別する処理を併せて行う。なお、かかる判別時には、記憶部１５のトリガ情報１５ｂを参照する。 Further, the calling word identification unit 14f splits the text of the “calling word” into the “name” and the words other than the “name”, and also determines whether the words other than the “name” are a “trigger” word for monitoring of the monitoring subject. For this determination, it refers to the trigger information 15b in the storage unit 15.

また、呼びかけワード識別部１４ｆは、かかる判別によって監視対象者のモニタリングの「開始」を判別した場合には、特徴取得部１４ｇへ「呼称」を出力する処理を併せて行う。 In addition, the call word identification unit 14f also performs a process of outputting a “name” to the feature acquisition unit 14g when the monitoring start of the monitoring subject is determined by the determination.

また、呼びかけワード識別部１４ｆは、かかる判別によって監視対象者のモニタリングの「終了」を判別した場合には、かかるモニタリングを終了させる処理を併せて行う。 Further, when this determination indicates the “end” of monitoring of the monitoring subject, the calling word identification unit 14f also performs processing to end the monitoring.

ここで、呼びかけワード識別部１４ｆが参照する記憶部１５のトリガ情報１５ｂについて、図５を用いて説明しておく。図５は、トリガ情報１５ｂの設定例を示す図である。 Here, the trigger information 15b of the storage unit 15 referred to by the calling word identification unit 14f will be described with reference to FIG. FIG. 5 is a diagram illustrating a setting example of the trigger information 15b.

図５に示したように、トリガ情報１５ｂは、監視対象者のモニタリングに関して「トリガ」となるワードをあらかじめ設定した情報である。なお、トリガ情報１５ｂは、「カテゴリ」項目と「ワード」項目とを含んでいる。 As illustrated in FIG. 5, the trigger information 15b is information in which a word to be a “trigger” is set in advance for monitoring of the monitoring target person. The trigger information 15b includes a “category” item and a “word” item.

「カテゴリ」項目は、監視対象者のモニタリングに関する動作を大別するカテゴリ値が格納される項目である。たとえば、図５に示したように、カテゴリ値「開始」を格納することによって、監視対象者のモニタリングを開始するワード（以下、「開始ワード」と記載する）を設定することができる。 The “category” item is an item in which a category value that roughly classifies operations related to monitoring by the monitoring target person is stored. For example, as shown in FIG. 5, by storing the category value “start”, it is possible to set a word (hereinafter referred to as “start word”) for starting monitoring of the monitoring subject.

また、カテゴリ値「終了」を格納することによって、監視対象者のモニタリングを終了するワード（以下、「終了ワード」と記載する）を設定することができる。 Further, by storing the category value “end”, it is possible to set a word (hereinafter referred to as “end word”) for ending the monitoring of the monitoring target person.

「ワード」項目は、「トリガ」となるワードの具体例が格納される項目である。たとえば、図５には、「開始ワード」として、「映像表示」、「映像スタート」、「大丈夫？」および「起きてる？」などが設定された例を示している。 The “word” item stores specific examples of words that serve as a “trigger”. For example, FIG. 5 shows an example in which “video display”, “video start”, “OK?”, “Are you awake?”, and the like are set as “start words”.

また、「終了ワード」として、「表示終了」、「映像ストップ」および「もういいよ」などが設定された例を示している。なお、以下では、かかる設定例でトリガ情報１５ｂが設定されているものとし、「開始ワード」として「大丈夫？」を、「終了ワード」として「もういいよ」を、それぞれ用いるものとして説明を行う。 It also shows an example in which “end display”, “video stop”, “That's enough”, and the like are set as “end words”. The following description assumes that the trigger information 15b is set as in this example, with “OK?” used as the “start word” and “That's enough” used as the “end word”.

つづいて、音声認識部１４ｅにおける音声認識処理および呼びかけワード識別部１４ｆにおける呼びかけワード識別処理について、図６を用いて説明する。図６は、音声認識部１４ｅにおける音声認識処理および呼びかけワード識別部１４ｆにおける呼びかけワード識別処理を説明するための図である。 Next, the speech recognition process in the speech recognition unit 14e and the call word identification process in the call word identification unit 14f will be described with reference to FIG. FIG. 6 is a diagram for explaining speech recognition processing in the speech recognition unit 14e and calling word identification processing in the calling word identification unit 14f.

なお、図中の閉曲線１４ｅに囲まれた部分は音声認識部１４ｅにおける動作に、閉曲線１４ｆに囲まれた部分は、呼びかけワード識別部１４ｆにおける動作に、それぞれ対応している。 In the figure, the portion surrounded by the closed curve 14e corresponds to the operation in the speech recognition unit 14e, and the portion surrounded by the closed curve 14f corresponds to the operation in the calling word identification unit 14f.

まず、図６の（１）に示したように、音声認識部１４ｅは、マイク１２から入力される搭乗者ｃなどの発声した音声の音声信号を、呼びかけワード識別部１４ｆへ出力する。なお、図６の（１）では、「大丈夫？ａさん」という「呼びかけワード」である音声がマイク１２から入力された例を示しており、以下でも、かかる「呼びかけワード」を前提に説明を進める。 First, as shown in (1) of FIG. 6, the voice recognition unit 14e outputs, to the calling word identification unit 14f, the voice signal of a voice uttered by passenger c or another passenger and input from the microphone 12. FIG. 6 (1) shows an example in which the voice “OK? Mr. a”, which is a “calling word”, is input from the microphone 12, and the description below also assumes this “calling word”.

そして、図６の（２）に示したように、呼びかけワード識別部１４ｆは、音声認識部１４ｅから入力された音声信号に基づいて「大丈夫？ａさん」という「呼びかけワード」をテキスト化する。 Then, as shown in (2) of FIG. 6, the calling word identification unit 14f converts the “calling word” “Okay? Mr. a” into text based on the voice signal input from the voice recognition unit 14e.

そして、図６の（３）に示したように、呼びかけワード識別部１４ｆは、テキスト化した「大丈夫？ａさん」を、「呼称」である「ａさん」と「呼称」以外のワードである「大丈夫？」とに分解する。 Then, as shown in (3) of FIG. 6, the calling word identification unit 14f splits the text “OK? Mr. a” into “Mr. a”, which is the “name”, and “OK?”, which is the word other than the “name”.

ここで、呼びかけワード識別部１４ｆは、「呼称」以外のワードである「大丈夫？」が、トリガ情報１５ｂにあらかじめ設定された監視対象者のモニタリングに関する「トリガ」となるワードと合致するか否かを判別する。 Here, the calling word identification unit 14f determines whether “OK?”, the word other than the “name”, matches a “trigger” word for monitoring of the monitoring subject that is set in advance in the trigger information 15b.

なお、上述のように、「大丈夫？」のワードは、トリガ情報１５ｂに「開始ワード」として設定されている前提であるため、ここで、呼びかけワード識別部１４ｆは、「呼称」のワードである「ａさん」を特徴取得部１４ｇへ出力する（図６に示した「特徴取得部へ」参照）。 As described above, the word “OK?” is assumed to be set as a “start word” in the trigger information 15b, so the calling word identification unit 14f here outputs “Mr. a”, the “name” word, to the feature acquisition unit 14g (see “to feature acquisition unit” in FIG. 6).

また、仮に、「大丈夫？」のワードが「終了ワード」として設定されている、あるいは、「大丈夫？」のワード自体が設定されていないならば、車室内監視装置１０は、音声認識部１４ｅの動作へと制御を移す。 If instead the word “OK?” were set as an “end word”, or were not set at all, the vehicle interior monitoring apparatus 10 would transfer control to the operation of the voice recognition unit 14e.
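The split-and-match step described for FIG. 6 can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not say how the “name” is located within the recognized text, so plain substring search and exact matching of the remainder against the trigger words are assumptions, as are the function names and the use of half-width letters in the names.

```python
# Trigger information 15b as in the FIG. 5 setting example.
TRIGGER_INFO = {
    "start": ["映像表示", "映像スタート", "大丈夫？", "起きてる？"],
    "end": ["表示終了", "映像ストップ", "もういいよ"],
}
# Names as registered in the FIG. 4 example.
REGISTERED_NAMES = ["aさん", "bくん", "c", "dさん", "d先輩"]


def split_call_word(text, names):
    """Split text into (name, remainder); name is None if no name is found."""
    # Try longer names first so a short name such as "c" cannot shadow them.
    for name in sorted(names, key=len, reverse=True):
        if name in text:
            return name, text.replace(name, "", 1).strip()
    return None, text.strip()


def classify(remainder, trigger_info):
    """Return "start", "end", or None for a non-trigger remainder."""
    for category, words in trigger_info.items():
        if remainder in words:
            return category
    return None


def identify(text):
    name, remainder = split_call_word(text, REGISTERED_NAMES)
    return name, classify(remainder, TRIGGER_INFO)
```

With these settings, `identify("大丈夫？aさん")` yields the name `"aさん"` with category `"start"`, while an utterance whose remainder is not in the trigger information yields category `None`, so nothing would be passed to the feature acquisition step.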

図２の説明に戻り、特徴取得部１４ｇについて説明する。特徴取得部１４ｇは、呼びかけワード識別部１４ｆから入力された「呼称」と合致する搭乗者の「特徴」を、記憶部１５の個人情報１５ａから取得する処理を行う処理部である。また、特徴取得部１４ｇは、取得した「特徴」を位置検索部１４ｈへ出力する処理を併せて行う。 Returning to the description of FIG. 2, the feature acquisition unit 14g will be described. The feature acquisition unit 14g is a processing unit that performs processing for acquiring, from the personal information 15a of the storage unit 15, the “feature” of the passenger that matches the “name” input from the calling word identification unit 14f. The feature acquisition unit 14g also performs a process of outputting the acquired “feature” to the position search unit 14h.

位置検索部１４ｈは、特徴取得部１４ｇから入力された「特徴」に該当する搭乗者の、カメラ１１の撮像映像における位置を検索および特定する処理を行う処理部である。なお、かかる検索および特定は、カメラ１１の撮像映像と「特徴」をあらわす画像データとの比較などによって行われる。また、位置検索部１４ｈは、特定した位置を表示映像生成部１４ｉへ出力する処理を併せて行う。 The position search unit 14h is a processing unit that performs a process of searching and specifying the position in the captured image of the camera 11 of the passenger corresponding to the “feature” input from the feature acquisition unit 14g. Such search and identification are performed by comparing the captured image of the camera 11 with image data representing “features”. The position search unit 14h also performs a process of outputting the specified position to the display video generation unit 14i.

表示映像生成部１４ｉは、位置検索部１４ｈから入力された監視対象者の位置に基づき、監視者が監視対象者の様子を確実に視認可能な表示映像を生成する処理を行う処理部である。 The display video generation unit 14i is a processing unit that performs a process of generating a display video in which the monitor can surely see the state of the monitoring target person based on the position of the monitoring target person input from the position search unit 14h.

また、表示映像生成部１４ｉは、生成した表示映像を車載モニタなどに対応する表示部１３へ出力する処理を併せて行う。なお、表示映像生成部１４ｉにおける表示映像生成処理の詳細については、図８を用いて後述する。 The display video generation unit 14i also performs a process of outputting the generated display video to the display unit 13 corresponding to the in-vehicle monitor. Details of the display video generation processing in the display video generation unit 14i will be described later with reference to FIG.

ここで、呼びかけ例とそれにともなう特徴取得部１４ｇ、位置検索部１４ｈおよび表示映像生成部１４ｉの動作について、図７を用いて説明する。図７は、呼びかけ例とそれにともなう特徴取得部１４ｇ、位置検索部１４ｈおよび表示映像生成部１４ｉの動作を説明するための図である。なお、図７に示した横軸ｔは、時間の経過をあらわしている。 Here, an example of a call and the operations of the feature acquisition unit 14g, the position search unit 14h, and the display video generation unit 14i associated therewith will be described with reference to FIG. FIG. 7 is a diagram for explaining an example of a call and operations of the feature acquisition unit 14g, the position search unit 14h, and the display video generation unit 14i associated therewith. The horizontal axis t shown in FIG. 7 represents the passage of time.

図７の（１）あるいは（２）に示したように、音声認識部１４ｅから「お腹すいたね。ａさん」あるいは「気分悪い？ａさん」といった音声の音声信号が入力された場合、呼びかけワード識別部１４ｆは、「お腹すいたね。」あるいは「気分悪い？」のワードがトリガ情報１５ｂへ設定されていなければ、「呼称」にあたる「ａさん」を特徴取得部１４ｇへ出力しない。 As shown in (1) or (2) of FIG. 7, when a voice signal such as “I'm hungry. Mr. a” or “Feeling sick? Mr. a” is input from the voice recognition unit 14e, the calling word identification unit 14f does not output “Mr. a”, which corresponds to the “name”, to the feature acquisition unit 14g unless the word “I'm hungry.” or “Feeling sick?” is set in the trigger information 15b.

一方、図７の（３）に示したように、音声認識部１４ｅから「大丈夫？ａさん」という音声の音声信号が入力された場合、呼びかけワード識別部１４ｆは、「大丈夫？」のワードがトリガ情報１５ｂの「開始ワード」と合致すれば、「呼称」にあたる「ａさん」を特徴取得部１４ｇへ出力する。 On the other hand, as shown in (3) of FIG. 7, when the voice signal “OK? Mr. a” is input from the voice recognition unit 14e, the calling word identification unit 14f outputs “Mr. a”, which corresponds to the “name”, to the feature acquisition unit 14g if the word “OK?” matches a “start word” in the trigger information 15b.

したがって、上述の図５の設定例を前提とした場合、図７に示した例では、時間ｔ３に至るまでは、特徴取得部１４ｇ、位置検索部１４ｈおよび表示映像生成部１４ｉの非動作区間となる。すなわち、時間ｔ３に至るまでは、監視対象者のモニタリングは行われない。 Therefore, assuming the setting example of FIG. 5 described above, in the example shown in FIG. 7 the period up to time t3 is a non-operating period of the feature acquisition unit 14g, the position search unit 14h, and the display video generation unit 14i. That is, the monitoring subject is not monitored until time t3.

また、図７に示したように、時間ｔ３以降は、特徴取得部１４ｇ、位置検索部１４ｈおよび表示映像生成部１４ｉの動作区間となる。すなわち、搭乗者ａを監視対象者とするモニタリングが行われることとなる。 Further, as shown in FIG. 7, the period from time t3 onward is an operating period of the feature acquisition unit 14g, the position search unit 14h, and the display video generation unit 14i. That is, monitoring is performed with passenger a as the monitoring subject.

かかる動作区間中は、たとえば、「体調悪い？」といった「呼称」を含まない音声信号（図７の（４）参照）や、既に一度入力された「気分悪い？ａさん」との音声信号（図７の（２）および（５）参照）が入力されても、搭乗者ａのモニタリングは継続して行われる。 During this operating period, monitoring of passenger a continues even if a voice signal that does not include a “name”, such as “Not feeling well?” (see (4) of FIG. 7), or the voice signal “Feeling sick? Mr. a”, which has already been input once (see (2) and (5) of FIG. 7), is input.

そして、図７の（６）に示したように、トリガ情報１５ｂの「終了ワード」に合致する「もういいよ」との音声信号が入力されたならば、再び特徴取得部１４ｇ、位置検索部１４ｈおよび表示映像生成部１４ｉの非動作区間となる。したがって、搭乗者ａのモニタリングは、時間ｔ６において終了する。 Then, as shown in (6) of FIG. 7, when the voice signal “That's enough”, which matches an “end word” in the trigger information 15b, is input, the feature acquisition unit 14g, the position search unit 14h, and the display video generation unit 14i again enter a non-operating period. Therefore, the monitoring of passenger a ends at time t6.

つづいて、位置検索部１４ｈにおける対象検索処理および表示映像生成部１４ｉにおける表示映像生成処理について、図８を用いて説明する。図８は、位置検索部１４ｈにおける対象検索処理および表示映像生成部１４ｉにおける表示映像生成処理を説明するための図である。 Next, the target search process in the position search unit 14h and the display video generation process in the display video generation unit 14i will be described with reference to FIG. FIG. 8 is a diagram for explaining target search processing in the position search unit 14h and display video generation processing in the display video generation unit 14i.

なお、図８の（Ａ）には、監視対象者を拡大表示する場合について、図８の（Ｂ）には、複数のカメラ１１であるカメラ＃１〜＃４を配置した場合について、それぞれ示している。 FIG. 8 (A) shows a case in which the monitoring subject is displayed enlarged, and FIG. 8 (B) shows a case in which cameras #1 to #4, which are a plurality of cameras 11, are arranged.

図８の（Ａ−１）に示したように、位置検索部１４ｈは、たとえば、特徴取得部１４ｇから入力された監視対象者の「特徴」をあらわす顔画像に基づき、撮像映像１１ａにおける監視対象者の位置（図中の閉曲線３に囲まれた部分参照）を検索および特定する。 As shown in (A-1) of FIG. 8, the position search unit 14h searches for and identifies the position of the monitoring subject in the captured video 11a (see the portion surrounded by the closed curve 3 in the figure) based on, for example, the face image representing the “feature” of the monitoring subject input from the feature acquisition unit 14g.

そして、図８の（Ａ−２）に示したように、表示映像生成部１４ｉは、たとえば、かかる閉曲線３に囲まれた部分を拡大表示する表示映像を生成したうえで、車載モニタなどの表示部１３へ表示させることができる。 Then, as shown in (A-2) of FIG. 8, the display video generation unit 14i can, for example, generate a display video in which the portion surrounded by the closed curve 3 is enlarged, and display it on the display unit 13 such as a vehicle-mounted monitor.

したがって、撮像映像１１ａにおいては、その様子をよく確認できないほど監視対象者が小さく映っている場合であっても、監視者にかかる様子を確実に確認させることが可能となる。 Therefore, even when the monitoring subject appears so small in the captured video 11a that their state cannot be clearly seen, the supervisor can reliably check that state.
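The enlargement step can be illustrated with a small geometry sketch. It assumes the identified position is available as a pixel rectangle `(x, y, width, height)`; the margin ratio, the frame and display sizes, and the function names are all hypothetical and not taken from the disclosure.

```python
def crop_box(face_box, frame_size, margin=0.25):
    """Expand a detected region by a margin and clamp it to the frame.

    face_box is (x, y, width, height) in pixels; the result is
    (left, top, right, bottom).
    """
    x, y, w, h = face_box
    frame_w, frame_h = frame_size
    pad_x, pad_y = int(w * margin), int(h * margin)
    left = max(0, x - pad_x)
    top = max(0, y - pad_y)
    right = min(frame_w, x + w + pad_x)
    bottom = min(frame_h, y + h + pad_y)
    return left, top, right, bottom


def zoom_factor(box, display_size):
    """Largest uniform scale at which the cropped box still fits the display."""
    left, top, right, bottom = box
    disp_w, disp_h = display_size
    return min(disp_w / (right - left), disp_h / (bottom - top))


box = crop_box((100, 50, 40, 40), (640, 480))
scale = zoom_factor(box, (800, 480))
```

Using a single scale for both axes keeps the aspect ratio of the enlarged region, so the face is not distorted on the display.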

また、図８の（Ｂ−１）に示したように、上述した図３の（Ａ−ｃ）の配置例などにそって複数のカメラ＃１〜＃４を配置した場合、位置検索部１４ｈは、たとえば、監視対象者がもっとも視認しやすい映像を撮像しているカメラ１１を特定してもよい。なお、図８の（Ｂ−１）には、位置検索部１４ｈが、カメラ＃１を特定した例を示している。 Further, as shown in (B-1) of FIG. 8, when a plurality of cameras #1 to #4 are arranged in accordance with, for example, the arrangement example of (A-c) of FIG. 3 described above, the position search unit 14h may identify, for example, the camera 11 that is capturing the video in which the monitoring subject is most easily seen. FIG. 8 (B-1) shows an example in which the position search unit 14h has identified camera #1.

かかる場合、図８の（Ｂ−２）に示したように、表示映像生成部１４ｉは、カメラ＃１の撮像映像を選択して、表示部１３へ表示させることとすればよい。この場合、表示映像生成部１４ｉは、表示映像をあらたに生成することなく、カメラ＃１〜＃４の切り替えを行うのみで済むため、監視対象者のモニタリングの即時性を高めることができる。 In such a case, as shown in (B-2) of FIG. 8, the display video generation unit 14i may select the captured video of camera #1 and display it on the display unit 13. In this case, the display video generation unit 14i only needs to switch among cameras #1 to #4 without generating a new display video, which improves the immediacy of monitoring the monitoring subject.
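Camera selection of this kind could be sketched as picking the camera with the best match score. The text does not define how "most easily seen" is measured, so the per-camera similarity score used here, and the function name, are assumptions for illustration only.

```python
def select_camera(match_scores):
    """Return the camera id with the highest score, or None if there is none.

    match_scores maps a camera id to a hypothetical similarity score between
    that camera's video and the monitoring subject's registered feature.
    """
    if not match_scores:
        return None
    return max(match_scores, key=match_scores.get)


best = select_camera({"#1": 0.92, "#2": 0.40, "#3": 0.15, "#4": 0.55})
```

Because only a camera id is returned, the display side can switch sources without composing a new image, which is the point made about immediacy above.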

また、図８には示していないが、表示映像生成部１４ｉは、車両５０の車速に応じて生成する表示映像を異ならせてもよい。たとえば、車速が０である場合には、図８に示したように、顔画像のみを表示映像として生成してもよい。また、車速が０より大きい場合には、ナビゲーション画面と顔画像とを組み合わせた表示映像を生成してもよい。 Although not shown in FIG. 8, the display video generation unit 14 i may change the display video generated according to the vehicle speed of the vehicle 50. For example, when the vehicle speed is 0, only a face image may be generated as a display video as shown in FIG. When the vehicle speed is higher than 0, a display image combining the navigation screen and the face image may be generated.

これにより、監視者が運転者である場合であっても、運転者を支援するナビゲーション情報を表示しつつ、監視対象者の様子をも確認させることができる。すなわち、運転に関する利便性を高めつつ、監視対象者の安全を確保することが可能となる。 Thereby, even if the supervisor is the driver, the state of the monitoring subject can be confirmed while displaying the navigation information for assisting the driver. That is, it is possible to ensure the safety of the person being monitored while improving the convenience of driving.
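The speed-dependent choice of display content can be sketched as a simple rule. The layer names and the function name are hypothetical; only the rule itself (face image alone at speed 0, navigation screen combined with the face image otherwise) comes from the description.

```python
def choose_layout(vehicle_speed):
    """Return the display layers to compose for a given vehicle speed."""
    if vehicle_speed < 0:
        raise ValueError("vehicle speed must not be negative")
    if vehicle_speed == 0:
        return ("face",)             # stopped: face image only
    return ("navigation", "face")    # moving: navigation screen plus face image
```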

図２の説明に戻り、記憶部１５について説明する。記憶部１５は、ハードディスクドライブや不揮発性メモリ、レジスタといった記憶デバイスで構成される記憶部であり、個人情報１５ａと、トリガ情報１５ｂとを記憶する。 Returning to the description of FIG. 2, the storage unit 15 will be described. The storage unit 15 is a storage unit configured by a storage device such as a hard disk drive, a nonvolatile memory, or a register, and stores personal information 15a and trigger information 15b.

なお、個人情報１５ａについては図４を用いて、トリガ情報１５ｂについては図５を用いて、それぞれ既に説明したため、ここでの記載を省略する。 Since the personal information 15a has already been described with reference to FIG. 4 and the trigger information 15b has been described with reference to FIG. 5, description thereof is omitted here.

次に、車室内監視装置１０が実行する処理手順について図９を用いて説明する。図９は、車室内監視装置１０が実行する処理手順を示すフローチャートである。なお、本フローチャートでは、個人情報１５ａの登録（ステップＳ１０１に対応）と監視対象者のモニタリング（ステップＳ１０２からステップＳ１０９に対応）とを一連の動作として示している。 Next, a processing procedure executed by the vehicle interior monitoring apparatus 10 will be described with reference to FIG. FIG. 9 is a flowchart showing a processing procedure executed by the vehicle interior monitoring apparatus 10. In this flowchart, registration of personal information 15a (corresponding to step S101) and monitoring of a monitoring target person (corresponding to steps S102 to S109) are shown as a series of operations.

また、ステップＳ１０２からステップＳ１０９では、モニタリング１回分の「開始」から「終了」までの処理手順を示すこととする。 Further, in steps S102 to S109, a processing procedure from “start” to “end” for one monitoring is shown.

図９に示したように、まず、呼びかけワード入力部１４ａおよび個人映像入力部１４ｂの出力に基づいて登録情報生成部１４ｃが生成した「呼称」および「特徴」を含む登録情報を、個人情報登録部１４ｄが、データベースである個人情報１５ａへ登録する（ステップＳ１０１）。 As shown in FIG. 9, first, registration information including “name” and “feature” generated by the registration information generation unit 14c based on the outputs of the calling word input unit 14a and the personal video input unit 14b is registered as personal information. The unit 14d registers in the personal information 15a that is a database (step S101).

つづいて、車両５０の走行中などに、音声認識部１４ｅが、マイク１２の集音した車室５１内の音響から搭乗者の音声を認識する（ステップＳ１０２）。そして、呼びかけワード識別部１４ｆが、かかる音声から監視者の「呼びかけワード」を識別する（ステップＳ１０３）。 Subsequently, during traveling of the vehicle 50, the voice recognition unit 14e recognizes the voice of the passenger from the sound in the passenger compartment 51 collected by the microphone 12 (step S102). Then, the calling word identifying unit 14f identifies the “calling word” of the supervisor from the voice (step S103).

そして、呼びかけワード識別部１４ｆは、かかる「呼びかけワード」を解析して、「終了ワード」が含まれていないか否かを判定する（ステップＳ１０４）。 Then, the calling word identification unit 14f analyzes the “calling word” and determines whether the “end word” is absent from it (step S104).

ここで、「終了ワード」が含まれていないと判定された場合（ステップＳ１０４，Ｙｅｓ）、呼びかけワード識別部１４ｆは、「開始ワード」が含まれているか否かを判定する（ステップＳ１０５）。一方、ステップＳ１０４の判定条件を満たさなかった場合（ステップＳ１０４，Ｎｏ）、車室内監視装置１０は、モニタリング１回分の処理を終了する。 If it is determined that the “end word” is not included (step S104, Yes), the calling word identification unit 14f determines whether the “start word” is included (step S105). On the other hand, when the determination condition of step S104 is not satisfied (step S104, No), the vehicle interior monitoring apparatus 10 ends the process for one monitoring.

つづいて、「開始ワード」が含まれていると判定された場合（ステップＳ１０５，Ｙｅｓ）、特徴取得部１４ｇが、「呼びかけワード」内の「呼称」に合致する搭乗者の「特徴」を、個人情報１５ａから取得する（ステップＳ１０６）。 Subsequently, when it is determined that the “start word” is included (step S105, Yes), the feature acquisition unit 14g displays the “feature” of the passenger that matches the “name” in the “call word”. Obtained from the personal information 15a (step S106).

なお、ステップＳ１０５の判定条件を満たさなかった場合（ステップＳ１０５，Ｎｏ）、車室内監視装置１０は、ステップＳ１０２からの処理を繰り返す。 When the determination condition in step S105 is not satisfied (step S105, No), the vehicle interior monitoring apparatus 10 repeats the processing from step S102.

そして、位置検索部１４ｈが、カメラ１１の撮像映像に基づき、特徴取得部１４ｇの取得した「特徴」に該当する搭乗者の位置を検索および特定する（ステップＳ１０７）。そして、表示映像生成部１４ｉが、検索および特定された位置に基づいて監視対象者である搭乗者の表示映像を生成したうえで（ステップＳ１０８）、かかる表示映像を表示部１３へ出力させる（ステップＳ１０９）。 Then, the position search unit 14h searches and specifies the position of the passenger corresponding to the “feature” acquired by the feature acquisition unit 14g based on the captured image of the camera 11 (step S107). Then, the display video generation unit 14i generates a display video of the passenger who is the monitoring subject based on the searched and specified position (step S108), and then outputs the display video to the display unit 13 (step S108). S109).

そして、車室内監視装置１０は、「呼びかけワード」に「終了ワード」が含まれるまで、ステップＳ１０２からの処理を繰り返す。 Then, the vehicle interior monitoring apparatus 10 repeats the processing from step S102 until the “call word” includes the “end word”.
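The monitoring part of the flowchart (steps S102 to S109) can be sketched as a loop over recognized utterances. The four callables passed in are hypothetical stand-ins for the calling word identification, feature acquisition, position search, and display steps; skipping an unregistered name is an assumption, since the flowchart does not cover that case.

```python
def run_monitoring(utterances, identify, get_feature, find_position, show):
    """Process utterances until an end word arrives; return True if it did."""
    for text in utterances:                # S102: recognized passenger speech
        name, category = identify(text)    # S103: identify the calling word
        if category == "end":              # S104, No branch: end word present
            return True
        if category != "start":            # S105, No branch: back to S102
            continue
        feature = get_feature(name)        # S106: feature matching the name
        if feature is None:                # assumption: unregistered name
            continue
        position = find_position(feature)  # S107: search for the position
        show(position)                     # S108 and S109: generate and display
    return False
```

A caller would supply its own implementations, for example a dictionary lookup for `get_feature` and a list's `append` method for `show` when testing the control flow.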

上述してきたように、本実施例では、個人情報登録部が、各搭乗者に関する個人情報をあらかじめ登録するように車室内監視装置を構成した。また、音声認識部が、車室内の音響から搭乗者の音声を認識し、呼びかけワード識別部が、搭乗者の音声から監視者の「呼びかけワード」を識別のうえ解析し、特徴取得部が、解析された監視対象者の「呼称」と合致する搭乗者の「特徴」を取得し、位置検索部が、かかる「特徴」に該当する搭乗者の位置を検索および特定し、表示映像生成部が、特定された位置に基づいて監視対象者についての表示映像を生成のうえ表示させるように車室内監視装置を構成した。 As described above, in this embodiment, the vehicle interior monitoring device is configured such that the personal information registration unit registers in advance personal information related to each passenger. In addition, the voice recognition unit recognizes the passenger's voice from the sound in the passenger compartment, the call word identification unit identifies and analyzes the supervisor's “call word” from the passenger's voice, and the feature acquisition unit, The “feature” of the passenger that matches the “name” of the analyzed monitoring subject is acquired, and the position search unit searches and identifies the position of the passenger corresponding to the “feature”, and the display video generation unit The vehicle interior monitoring device is configured to generate and display a display image of the person to be monitored based on the specified position.

したがって、能動的、かつ、精度よく、監視対象である搭乗者の様子を監視者に確認させることができる。 Therefore, it is possible to cause the supervisor to confirm the state of the passenger who is the subject of monitoring actively and accurately.

ところで、上述した実施例では、個人情報１５ａに含まれる「特徴」が、各搭乗者の顔画像である場合について説明した。しかし、車両５０へ家族が揃って搭乗する場合のように、各搭乗者の顔が似ているケースも考えられる。 In the embodiment described above, the “feature” included in the personal information 15a is the face image of each passenger. However, there may be cases in which the passengers' faces resemble one another, such as when a family boards the vehicle 50 together.

かかる場合、上述の「特徴」を顔画像に限定することなく、それ以外の情報を「特徴」として用いることで、監視対象者である搭乗者の位置の検索および特定を精度よく行うことができる。そこで、以下では、かかる変形例について図１０を用いて説明する。 In such a case, using information other than the face image as the “feature”, rather than limiting the “feature” to the face image, allows the position of the passenger who is the monitoring subject to be searched for and identified accurately. Such a modification will be described below with reference to FIG. 10.

図１０は、個人情報１５ａの登録の変形例を示す図である。なお、図１０の（Ａ）には、登録用の撮像画像に基づく変形例を、図１０の（Ｂ）には、撮像画像を用いない変形例を、それぞれ示している。 FIG. 10 is a diagram showing modifications of the registration of the personal information 15a. FIG. 10 (A) shows a modification based on a captured image for registration, and FIG. 10 (B) shows a modification that does not use a captured image.

図１０の（Ａ）に示したように、変形例に係る個人情報１５ａａは、「特徴」項目にさらに上述した顔画像以外のデータを格納する項目を含むことができる。たとえば、図１０の（Ａ）には、「特徴」項目が、「顔」項目と、「配色」項目と、「服のデザイン」項目とをさらに含んでいる例を示している。 As illustrated in FIG. 10A, the personal information 15aa according to the modification may include an item for storing data other than the above-described face image in the “feature” item. For example, FIG. 10A shows an example in which the “feature” item further includes a “face” item, a “color scheme” item, and a “clothing design” item.

ここで、「顔」項目と、「配色」項目と、「服のデザイン」項目とは、登録用の撮像画像に基づく項目である。「顔」項目は、上述した実施例と同様に、搭乗者の顔画像を格納する項目である。 Here, the “face” item, the “color scheme” item, and the “clothing design” item are items based on the captured image for registration. The “face” item is an item for storing a passenger's face image, as in the above-described embodiment.

「配色」項目は、撮像画像に基づいて算出された配色の分布を格納する項目である。なお、図１０の（Ａ）では、説明を分かりやすくするために、大まかな配色の分布を色名で記載した例を示しているが、実際の格納データにはＲＧＢ値などを用いることができる。 The “color scheme” item stores the color distribution calculated from the captured image. For ease of explanation, FIG. 10 (A) shows an example in which a rough color distribution is described by color names, but RGB values or the like can be used for the data actually stored.

「服のデザイン」項目は、撮像画像に基づいて解析された各搭乗者が着ている服のデザインパターンを格納する項目である。たとえば、図１０の（Ａ）に示した例では、「呼称」が「ｃ」である搭乗者ｃの「服のデザイン」項目のみに、無地でないストライプのデザインパターンが格納されている例を示している。 The “clothing design” item stores the design pattern of the clothes worn by each passenger, as analyzed from the captured image. For example, in the example shown in FIG. 10 (A), a striped, non-plain design pattern is stored only in the “clothing design” item of passenger c, whose “name” is “c”.

したがって、かかる例では、「顔」項目や「配色」項目の格納データを用いることなく、「服のデザイン」項目の格納データのみを用いて搭乗者ｃの位置を検索および特定することが可能となる。無論、各項目の格納データを組み合わせることによって、搭乗者の位置を検索および特定してもよい。 Therefore, in this example, the position of passenger c can be searched for and identified using only the data stored in the “clothing design” item, without using the data stored in the “face” item or the “color scheme” item. Of course, the position of a passenger may also be searched for and identified by combining the data stored in the individual items.
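One way to decide which stored item is enough on its own is to look for an item whose value is unique to the target passenger. The sketch below assumes each passenger's features are available as a dictionary; the item keys and the color values are made up for illustration, and only the striped design for passenger c follows the example in the text.

```python
def distinguishing_items(records, target_name):
    """Return the feature items whose value no other passenger shares."""
    target = records[target_name]
    unique = []
    for item, value in target.items():
        others = [
            features.get(item)
            for name, features in records.items()
            if name != target_name
        ]
        if value not in others:
            unique.append(item)
    return unique


# Hypothetical stored data; only c's striped clothing comes from the text.
records = {
    "a": {"colors": "white/blue", "clothes_design": "plain"},
    "b": {"colors": "white/blue", "clothes_design": "plain"},
    "c": {"colors": "white/blue", "clothes_design": "stripe"},
}
```

When the returned list is empty, as for passengers a and b here, no single item separates the target from the others and the items would have to be combined.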

また、図１０の（Ｂ）に示した変形例に係る個人情報１５ａｂのように、登録用の撮像画像を用いることなく、たとえば、ＩＣタグを用いることによって各搭乗者を識別することとしてもよい。 Alternatively, as with the personal information 15ab according to the modification shown in FIG. 10 (B), each passenger may be identified without using a captured image for registration, for example by using an IC tag.

具体的には、車両５０への乗車時などに、各搭乗者へＩＣタグを配布することとしたうえで、個人情報１５ａｂの登録にあたり、各「呼称」と各「ＩＣタグＩＤ」とを関連付けることとすればよい。かかる場合、データ量の大きくなりがちな画像データの処理にかかる処理負荷を軽減することができるので、監視対象者のモニタリングの即時性を向上させることが可能となる。 Specifically, an IC tag may be distributed to each passenger, for example when boarding the vehicle 50, and each “name” may be associated with each “IC tag ID” when the personal information 15ab is registered. In such a case, the processing load for image data, which tends to be large in volume, can be reduced, so the immediacy of monitoring the monitoring subject can be improved.

また、上述した実施例では、監視者の「呼びかけワード」をトリガとする一連の動作中において、監視対象者である搭乗者の特徴の取得や、かかる特徴の取得に基づく位置の検索および特定などを行う場合について説明した。しかし、「呼びかけワード」をトリガとすることなく、あらかじめ各搭乗者の暫定位置を推定しておくこととしてもよい。 In the embodiment described above, the features of the passenger who is the monitoring subject are acquired, and the position is searched for and identified based on those features, during a series of operations triggered by the supervisor's “calling word”. However, the provisional position of each passenger may instead be estimated in advance, without using the “calling word” as a trigger.

そこで、以下では、かかる変形例について図１１を用いて説明する。図１１は、呼びかけ例とそれにともなう特徴取得部１４ｇ、位置検索部１４ｈおよび表示映像生成部１４ｉの動作の変形例を説明するための図である。 Therefore, in the following, such a modification will be described with reference to FIG. FIG. 11 is a diagram for explaining a modification example of the operation of the calling example and the accompanying feature acquisition unit 14g, position search unit 14h, and display video generation unit 14i.

なお、図１１では、特徴取得部１４ｇおよび位置検索部１４ｈの動作区間を「動作区間Ａ」と、表示映像生成部１４ｉの動作区間を「動作区間Ｂ」と、それぞれ示している。また、図１１を用いた説明では、図７と異なる部分について主に説明することとし、重複する部分については説明を省略するか、あるいは、簡単な説明にとどめることとする。 In FIG. 11, the operation interval of the feature acquisition unit 14 g and the position search unit 14 h is indicated as “operation interval A”, and the operation interval of the display video generation unit 14 i is indicated as “operation interval B”. In the description using FIG. 11, portions different from those in FIG. 7 will be mainly described, and description of overlapping portions will be omitted or only a brief description will be given.

図１１に示したように、変形例に係る車室内監視装置においては、「呼びかけワード」をトリガとすることなく、特徴取得部１４ｇや位置検索部１４ｈなどを動作させることができる。 As shown in FIG. 11, in the vehicle interior monitoring apparatus according to the modification, the feature acquisition unit 14g, the position search unit 14h, and the like can be operated without using the “calling word” as a trigger.

具体的には、図１１の（０）に示したように、「乗車」時（図中のｔ０参照）の個人情報の登録後から、特徴取得部１４ｇや位置検索部１４ｈなどをバックグラウンド動作させることができる（図１１の「動作区間Ａ」参照）。 Specifically, as shown in (0) of FIG. 11, the feature acquisition unit 14g and the position search unit 14h are operated in the background after registration of personal information at the time of “ride” (see t0 in the figure). (Refer to “Operation section A” in FIG. 11).

かかるバックグラウンド動作における特徴取得部１４ｇおよび位置検索部１４ｈは、個人情報１５ａに登録された各搭乗者の「特徴」に基づき、たとえば、車両５０の走行開始時などに搭乗者それぞれの位置を暫定位置として推定する。 Based on the “feature” of each passenger registered in the personal information 15a, the feature acquisition unit 14g and the position search unit 14h in the background operation tentatively determine the position of each passenger, for example, when the vehicle 50 starts to travel. Estimated as position.

そして、図示しないが、搭乗者が席を替わるといったイベントが生じた場合などに、あらためて搭乗者それぞれの位置を暫定位置として推定する。 Although not shown, when an event occurs such as a passenger changing seats, the position of each passenger is estimated again as a provisional position.

そして、図１１の（３）に示したように、「大丈夫？ａさん」のようなモニタリング開始を示す「呼びかけワード」を呼びかけワード識別部１４ｆが識別した場合に、表示映像生成部１４ｉは、監視対象者にあたる搭乗者の直近の暫定位置に基づいて表示映像生成処理を開始する。 Then, as shown in (3) of FIG. 11, when the calling word identification unit 14f identifies a “calling word” indicating the start of monitoring, such as “OK? Mr. a”, the display video generation unit 14i starts the display video generation process based on the most recent provisional position of the passenger who is the monitoring subject.

そして、図１１の（６）に示したように、表示映像生成部１４ｉは、「もういいよ」のようなモニタリング終了を示す「呼びかけワード」を呼びかけワード識別部１４ｆが識別するまで、表示映像生成処理を継続する（図１１の「動作区間Ｂ」参照）。 Then, as shown in (6) of FIG. 11, the display video generation unit 14i continues the display video generation process until the calling word identification unit 14f identifies a “calling word” indicating the end of monitoring, such as “That's enough” (see “operation section B” in FIG. 11).

このように、特徴取得部１４ｇや位置検索部１４ｈなどをバックグラウンド動作させることによって、監視対象者の位置の検索および特定にかかる処理負荷を軽減することができるので、監視対象者をモニタリングする即時性を高めることが可能となる。 In this way, operating the feature acquisition unit 14g, the position search unit 14h, and the like in the background reduces the processing load for searching for and identifying the position of the monitoring subject, so the immediacy of monitoring the monitoring subject can be improved.
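The background estimation of provisional positions behaves like a cache that is refreshed ahead of time and only read when a start word arrives. The sketch below is an illustration under assumptions: the class name, the `estimate` callable, and the position values are hypothetical, and no real threading is shown.

```python
class ProvisionalPositions:
    """Hypothetical cache of each passenger's most recent provisional position."""

    def __init__(self, estimate):
        self._estimate = estimate  # callable: feature -> position (assumed)
        self._positions = {}

    def refresh(self, features_by_name):
        """Background step: re-estimate the position of every passenger.

        Intended to be called at the start of travel and again after an
        event such as a seat change.
        """
        self._positions = {
            name: self._estimate(feature)
            for name, feature in features_by_name.items()
        }

    def latest(self, name):
        """Trigger-time step: read the stored position without searching."""
        return self._positions.get(name)
```

At trigger time only `latest` is called, so no image search is needed at that moment; the cost of the search has already been paid in `refresh`.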

なお、図１１に示した例では、「動作区間Ａ」が、「乗車」時からモニタリング終了を示す「もういいよ」の識別時までである場合を示しているが、図中のｔ６以降の「降車」時あるいは「エンジン停止」時までとしてもよい。 In the example shown in FIG. 11, “operation section A” extends from the time of “boarding” to the time at which “That's enough”, indicating the end of monitoring, is identified, but it may instead extend to the time of “getting off” or “engine stop” after t6 in the figure.

また、上述した実施例および変形例では、「呼称」に基づいて監視対象者の位置を特定するに至る場合について説明したが、かかる場合に限られるものではない。たとえば、図示しないが、車室内の音声の指向性に基づき、いずれの位置に着座する搭乗者に対する「呼びかけワード」であるかを解析して、かかる搭乗者を監視対象者とするモニタリングを行ってもよい。 In the embodiments and modifications described above, the position of the monitoring subject is identified based on the “name”, but the present invention is not limited to this. For example, although not shown, the directivity of the voice in the vehicle interior may be used to analyze which seating position the “calling word” is addressed to, and the passenger seated there may then be monitored as the monitoring subject.

As described above, the vehicle interior monitoring device according to the present invention is useful when the monitoring person needs to check, actively and accurately, the state of a passenger who is being monitored. It is particularly suited to vehicle interior monitoring devices for multi-passenger cars and similar vehicles, in which many areas are blind spots for the driver.

DESCRIPTION OF SYMBOLS
10 Vehicle interior monitoring device
11 Camera
11a Captured video
12 Microphone
13 Display unit
14 Control unit
14a Call word input unit
14b Personal video input unit
14c Registration information generation unit
14d Personal information registration unit
14e Voice recognition unit
14f Call word identification unit
14g Feature acquisition unit
14h Position search unit
14i Display video generation unit
15 Storage unit
15a Personal information
15aa, 15ab Personal information
15b Trigger information
50 Vehicle
51 Vehicle interior

Claims

A vehicle interior monitoring device that images and displays a vehicle interior, comprising:
personal information registration means for registering personal information including a name and features for each passenger of the vehicle;
voice recognition means for recognizing the voice of a passenger;
feature acquisition means for acquiring, from the personal information, the features of the passenger corresponding to a name, based on a call to a specific passenger that includes the name and is extracted from the voice recognized by the voice recognition means;
position search means for searching the vehicle interior for the position of the passenger corresponding to the features acquired by the feature acquisition means; and
display video generation means for generating a display video corresponding to the position found by the position search means.

The vehicle interior monitoring device according to claim 1, wherein the personal information registration means
converts the name included in the voice into text and registers the text in association with the features of the passenger corresponding to the name.

The vehicle interior monitoring device according to claim 1, further comprising word identification means for extracting the call from the voice,
wherein the word identification means
converts the extracted call into text, separates the text into the name and a word other than the name, and, when the word indicates the start of display, causes the feature acquisition means to acquire the features of the passenger corresponding to the name.

The vehicle interior monitoring device according to claim 1, 2, or 3, wherein the position search means,
if the personal information has already been registered by the personal information registration means, searches for the position of each passenger based on the features included in the personal information even before the call is extracted.

The vehicle interior monitoring device according to claim 1, wherein the display video generation means
generates the display video by enlarging the captured video at the position found by the position search means.

The vehicle interior monitoring device according to claim 1, wherein the display video generation means
generates the display video by selecting the captured video at the position found by the position search means.