JP7269694B2

JP7269694B2 - LEARNING DATA GENERATION METHOD/PROGRAM, LEARNING MODEL AND EVENT OCCURRENCE ESTIMATING DEVICE FOR EVENT OCCURRENCE ESTIMATION

Info

Publication number: JP7269694B2
Application number: JP2019148395A
Authority: JP
Inventors: 和之田坂; 勝菅野
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2019-08-13
Filing date: 2019-08-13
Publication date: 2023-05-09
Anticipated expiration: 2039-08-13
Also published as: JP2021034739A

Description

The present invention relates to a technique for generating learning data used to generate a learning model for image recognition.

Technologies that analyze image data captured by cameras and identify the photographed objects are currently under active development, both for purposes such as surveillance and marketing and as the "visual system" of self-driving cars, autonomous robots, and the like.

In a self-driving car, for example, accurately recognizing the situation around the vehicle from captured image data is very important for safe and reliable driving.

As a technique for recognizing such surroundings, Patent Document 1, for example, discloses a vehicle surroundings monitoring technique intended to avoid dangerous situations while reducing the burden on the driver. The obstacle monitoring device according to this technique displays an image captured by an on-board camera to the driver when a vehicle speed condition is satisfied for the vehicle carrying the camera. Furthermore, when an obstacle around the vehicle is recognized in the image captured by the on-board camera, the device notifies the driver by displaying an indication of the obstacle's presence together with the image, even at speeds outside the vehicle speed condition.

Patent Document 2 discloses an accident image acquisition system for appropriately acquiring accident images captured from vehicles. In this system, a vehicle transmits to a center any captured image that it recognizes as an accident image involving an accident vehicle other than itself. When the center determines, through image processing of the received image, that an accident vehicle and a counterpart object close to it are present in the image, it judges whether the captured image is an accident image based on the degree of proximity between the accident vehicle and the counterpart object.

Patent Document 3 discloses an in-vehicle device that transfers surrounding situation information, which represents the situation around the vehicle, to an external server device when a predetermined transfer condition is satisfied. The surrounding situation information includes at least one of traffic accident occurrence information, indicating that a traffic accident has occurred near the vehicle, and parking lot information about parking lots near the vehicle.

Patent Document 1: JP 2012-069154 A
Patent Document 2: JP 2015-230579 A
Patent Document 3: JP 2015-210775 A

The prior art described in Patent Documents 1 to 3 above can indeed recognize, from image data, the presence of an accident site, an accident vehicle, an obstacle, and the like as the surrounding situation, and can present the recognition result as information useful for driving.

In automated driving, however, it is very important to obtain information on whether the situation around the vehicle is one in which an accident is likely to occur, and not only on the presence of obstacles. For example, when traveling on a narrow road with heavy pedestrian traffic, it is extremely important that the automated driving agent recognize from the captured image data that "an accident is likely to occur" and drive more safely. Such recognition is, of course, equally important for drivers of non-automated vehicles.

However, with the prior art described in Patent Documents 1 to 3, it is difficult to judge such accident likelihood from captured image data of the surroundings. In particular, it is largely impossible to estimate the degree of accident risk by taking into account the various surrounding conditions beyond the mere presence of accident vehicles, obstacles, and the like.

Accordingly, an object of the present invention is to enable processing that estimates the likelihood of occurrence of a predetermined event based on image data relating to the situation of the surroundings or of a predetermined environment.

According to the present invention, there is provided a learning data generation method, executed on a computer, for generating learning data used to generate a learning model capable of estimating the likelihood of occurrence of a predetermined event based on input image data relating to the situation of the surroundings or of a predetermined environment, the method comprising:
a step of performing processing that assigns a class relating to the situation to each pixel or unit area of image data relating to the situation of the surroundings or of a predetermined environment, and generating region-divided image data, which is image data in which each pixel region formed as a class region is given a label corresponding to the class relating to the situation;
a step of determining, based on the pixel regions and the labels of the region-divided image data, situation information, which is information that may be related to the occurrence of the event; and
a step of generating learning data including a data set in which a value indicating that the event is likely to occur is associated, as correct-answer data, with image data for which situation information satisfying a predetermined condition has been determined.
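The three steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the segmentation step is replaced by a ready-made label map (a grid of class names), and the function names, the two situation items, and the thresholds `max_road` and `min_person` are all hypothetical.

```python
from collections import Counter

# A label map is a 2D grid of class names. It stands in for the output of
# a real segmentation model (step 1), which is not implemented here.


def determine_situation_info(label_map):
    """Step 2: derive situation information from labelled pixel regions."""
    counts = Counter(label for row in label_map for label in row)
    total = sum(counts.values())
    if total == 0:
        return {"road_fraction": 0.0, "person_fraction": 0.0}
    return {
        "road_fraction": counts["road"] / total,
        "person_fraction": counts["person"] / total,
    }


def satisfies_condition(info, max_road=0.3, min_person=0.1):
    """Assumed condition: little road area and a noticeable share of people."""
    return info["road_fraction"] <= max_road and info["person_fraction"] >= min_person


def generate_learning_data(images):
    """Step 3: pair each qualifying image with the correct-answer value 1."""
    dataset = []
    for image_id, label_map in images:
        if satisfies_condition(determine_situation_info(label_map)):
            dataset.append((image_id, 1))
    return dataset
```

Images whose situation information does not meet the condition are simply left out of the data set in this sketch; how non-qualifying images are handled is not stated in the passage above.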

In one embodiment of the learning data generation method according to the present invention, it is also preferable that, in the step of determining the situation information, an occurrence factor score indicating the degree to which an item is a factor in the occurrence of the event is determined for each preset item of situation information, and the occurrence factor scores determined for the items are used as the situation information; and
that, in the step of generating the learning data, learning data is generated that includes a data set in which a value indicating that the event is likely to occur is associated, as correct-answer data, with image data whose occurrence factor scores satisfy a predetermined condition.
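One way to read the score-based selection is sketched below. The text does not state what the "predetermined condition" is, so the rule used here (any single score at or above `item_threshold`, or the sum at or above `total_threshold`) is purely an assumption, as are the item names and function names.

```python
def meets_condition(scores, item_threshold=0.8, total_threshold=2.0):
    """Assumed condition on per-item occurrence factor scores in [0, 1]."""
    if not scores:
        return False
    values = list(scores.values())
    return max(values) >= item_threshold or sum(values) >= total_threshold


def build_dataset(scored_images):
    """Attach the correct-answer value 1 to every image meeting the condition."""
    return [
        (image_id, 1)
        for image_id, scores in scored_images
        if meets_condition(scores)
    ]
```

For example, an image scored `{"road_width": 0.9, "people": 0.1}` would be selected by the single-item rule, while one scored `{"road_width": 0.2, "people": 0.1}` would not be selected by either rule.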

In the learning data generation method according to the present invention, it is also preferable that the image data is generated by image capture on a moving body traveling in a traffic area, that the predetermined event is an accident or near-miss event involving the moving body, and
that the situation information includes at least one of: information on an item relating to the width of the traffic area; information on an item relating to the presence of people and/or moving bodies in the traffic area; information on an item relating to the presence of an intersection where a plurality of traffic areas cross; information on an item relating to the presence of a visibility obstruction; information on an item relating to the number of people and/or moving bodies; and information on an item relating to a change in position, of a predetermined amount or more, of surrounding moving bodies and/or people.

In another embodiment of the learning data generation method according to the present invention, it is also preferable that the image data is generated by image capture on a moving body at or near the time when the moving body performs sudden deceleration or braking, or sudden steering, sharp enough to satisfy a predetermined condition, and that the predetermined event is an accident or near-miss event involving the moving body.

In the above embodiment, it is also preferable that the time at or near which the moving body performs sudden deceleration or braking, or sudden steering, sharp enough to satisfy the predetermined condition is determined based on at least one of: speed or braking information obtained from a CAN (Controller Area Network) mounted on the moving body; the magnitude of optical flow vectors in the image data; and motion vectors determined during encoding and compression of an image data group that includes the image data.
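As an illustration of the speed-based option, the sketch below flags the times at which deceleration computed from successive speed samples reaches a threshold. It assumes the speed has already been read from the CAN bus into `(time in seconds, speed in km/h)` pairs; the function name and the 3.0 m/s² threshold are hypothetical, and the optical-flow and motion-vector options are not covered.

```python
def sudden_deceleration_times(samples, threshold_ms2=3.0):
    """Return the sample times at which deceleration reaches the threshold.

    samples: list of (time_s, speed_kmh) pairs in chronological order.
    """
    events = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicated or out-of-order samples
        decel = (v0 - v1) / 3.6 / dt  # km/h converted to m/s, then per second
        if decel >= threshold_ms2:
            events.append(t1)
    return events
```

Frames captured at or near the returned times would then be the candidates for learning data.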

It is also preferable that the region-divided image data in the learning data generation method according to the present invention is generated by applying semantic segmentation processing to the image data.

According to the present invention, there is also provided a learning model generated by a machine learning algorithm using learning data generated by the learning data generation method described above, the learning model causing a computer to function as:
means for receiving, as input, image data relating to the situation of the surroundings or of a predetermined environment and outputting feature information, which is information relating to features of the image data; and
means for receiving the feature information as input and outputting information relating to the likelihood of occurrence of the event.
Further, according to the present invention, there is provided an event occurrence estimation device having event occurrence estimation means that uses this learning model to estimate the likelihood of occurrence of a predetermined event for input image data relating to the situation of the surroundings or of a predetermined environment, and outputs the result of the estimation.
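The two-stage structure of the learning model (image to feature information, then feature information to likelihood) can be pictured with the toy sketch below. It is only a stand-in: the passage does not specify the model architecture, and the hand-written features, the logistic output, and the names `extract_features` and `estimate_likelihood` are assumptions made for illustration. In practice both stages would be learned from the generated learning data.

```python
import math


def extract_features(image):
    """Stage 1: map an image (rows of grayscale values 0..255) to features."""
    pixels = [p / 255.0 for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    variance = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    return [mean, variance]


def estimate_likelihood(features, weights, bias):
    """Stage 2: map features to a likelihood value between 0 and 1."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))
```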

In one embodiment of the event occurrence estimation device according to the present invention, it is also preferable that the event occurrence estimation device further has:
region-divided image generation means for performing processing that assigns a class relating to the situation of the surroundings or of the predetermined environment to each pixel or unit area of the input image data, and generating region-divided image data, which is image data in which each pixel region formed as a class region is given a label corresponding to that class;
situation information determination means for determining, based on the pixel regions and the labels of the region-divided image data, situation information, which is information that may be related to the occurrence of the event; and
occurrence factor information generation means for generating and outputting, based on the situation information, occurrence factor information, which is information that supports the output estimation result and relates to the factors causing the event to occur.

According to the present invention, there is further provided a learning data generation program that causes a computer to function so as to generate learning data used to generate a learning model capable of estimating the likelihood of occurrence of a predetermined event based on input image data relating to the situation of the surroundings or of a predetermined environment, the program causing the computer to function as:
region-divided image generation means for performing processing that assigns a class relating to the situation to each pixel or unit area of image data relating to the situation of the surroundings or of a predetermined environment, and generating region-divided image data, which is image data in which each pixel region formed as a class region is given a label corresponding to the class relating to the situation;
situation information determination means for determining, based on the pixel regions and the labels of the region-divided image data, situation information, which is information that may be related to the occurrence of the event; and
learning data generation means for generating learning data including a data set in which a value indicating that the event is likely to occur is associated, as correct-answer data, with image data for which situation information satisfying a predetermined condition has been determined.

According to the learning data generation method and program, the learning model, and the event occurrence estimation device of the present invention, processing that estimates the likelihood of occurrence of a predetermined event based on image data relating to the situation of the surroundings or of a predetermined environment can be carried out.

FIG. 1 is a schematic diagram and functional block diagram for explaining an embodiment of an event occurrence estimation system that includes a learning model generation device and an event occurrence estimation device according to the present invention.
FIG. 2 is a schematic diagram for explaining an example of the region-divided image generation processing and the surrounding situation information generation processing according to the present invention.
FIG. 3 is a schematic diagram for explaining an example of the region-divided image generation processing and the surrounding situation information generation processing according to the present invention.
FIG. 4 is a schematic diagram for explaining an example of the presentation information generation processing according to the present invention.
FIG. 5 is a sequence diagram outlining an embodiment of the learning data and learning model generation method and of the event occurrence estimation method according to the present invention.

Embodiments of the present invention are described in detail below with reference to the drawings.

[Event occurrence estimation system]
FIG. 1 is a schematic diagram and functional block diagram for explaining an embodiment of an event occurrence estimation system that includes a learning model generation device and an event occurrence estimation device according to the present invention.

The event occurrence estimation system of this embodiment shown in FIG. 1 has:
(a) one or more terminals 20, each of which is an event occurrence estimation device (movable, in this embodiment); and
(b) a cloud server 1, which is a learning model generation device capable of generating a learning model based on image data acquired from the terminals 20 (or from an image database).
The cloud server 1 generates "learning data" using image data selected in consideration of the "surrounding situation information" described in detail below, generates a "learning model" from this learning data, and supplies the "learning model" to the terminals 20.

This "learning model" is a model capable of estimating the likelihood of occurrence of a predetermined event (for example, a traffic accident or near-miss) based on input image data relating to the situation of the surroundings or of a predetermined environment (for example, image data captured by the camera 203 showing the situation around the automobile 2). A terminal 20 supplied with such a "learning model" uses it to carry out event occurrence estimation processing, which estimates the likelihood of occurrence of the predetermined event.

In this embodiment, the terminal 20 is a drive recorder with a communication function, installed in the automobile 2 at a position from which, for example, the area ahead of the vehicle can be photographed (for example, on top of the dashboard). Each terminal 20 can connect wirelessly to the cloud server 1 via, for example, a mobile phone network or the Internet, and can transmit image data (video data, groups of image frames) for learning model generation to the cloud server 1.

Furthermore, in order to generate the "learning data" used to generate the "learning model" described above, the cloud server 1 is characterized by having:
(A) a region-divided image generation unit 112 that performs processing to assign a value relating to the situation to each pixel or unit area of image data relating to the situation of the surroundings or of a predetermined environment, and generates "region-divided image data", which is image data in which each pixel region (image partial region) thus formed is given a label relating to the situation;
(B) a surrounding situation information determination unit 113 that determines, based on the pixel regions and the labels of the "region-divided image data", "surrounding situation information" (situation information), which is information that may be related to the occurrence of the predetermined event; and
(C) a learning data generation unit 114 that generates "learning data" including a data set in which a value indicating that the predetermined event is likely to occur is associated, as correct-answer data, with image data for which "surrounding situation information" satisfying a predetermined condition has been determined.

In this way, the cloud server 1 generates "region-divided image data" and determines "surrounding situation information". As described in detail later, this "surrounding situation information" can be, for example, information on road width or on the number of people near the road, and it is important information that may be related to the occurrence of the predetermined event (for example, a traffic accident or near-miss). By relying on such "surrounding situation information", the cloud server 1 can determine, select, or extract image data suitable for the "learning data", and can consequently generate "learning data" suitable for generating a "learning model" that makes it possible to estimate the likelihood of occurrence of the predetermined event.

The "region-divided image data" generated in (A) above can be generated by applying semantic segmentation processing to image data relating to the situation of the surroundings or of a predetermined environment.

Semantic segmentation is a known process that classifies each pixel in a frame image into a class, such as "person", "automobile", "road", or "sky", and then determines class regions in the frame image, such as a "person" region, an "automobile" region, a "road" region, or a "sky" region. This makes it possible to attach meaning to what appears in the image, and as a result various kinds of "surrounding situation information" can be determined from the information on the pixel regions (image partial regions) to which meaning has been attached.
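The two stages of this process (per-pixel classification, then grouping into class regions) can be illustrated with the following sketch. It assumes a segmentation model has already produced a score for each class at each pixel; the class list, the function names, and the nested-list data layout are hypothetical.

```python
CLASSES = ["person", "automobile", "road", "sky"]


def to_label_map(scores):
    """Assign to each pixel the class with the highest score.

    scores[y][x] is a list holding one score per entry of CLASSES.
    """
    return [
        [CLASSES[max(range(len(px)), key=lambda i: px[i])] for px in row]
        for row in scores
    ]


def class_regions(label_map):
    """Group pixel coordinates (y, x) by class label to form class regions."""
    regions = {}
    for y, row in enumerate(label_map):
        for x, label in enumerate(row):
            regions.setdefault(label, []).append((y, x))
    return regions
```

A class that wins no pixel has no entry in the result, which corresponds to a label that is not assigned anywhere in the image.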

For example, one could generate the "learning data" using only image data from the very scene where the predetermined event (for example, a traffic accident or near-miss) occurred, but in that case a learning model that estimates the likelihood of occurrence of the predetermined event might not be generated effectively. In contrast, the cloud server 1 is not limited to image data of the scene of occurrence; it uses image data determined, selected, or extracted in consideration of the "surrounding situation information" to which meaning has been attached, and can therefore generate a "learning model" that estimates the likelihood of occurrence of the predetermined event more effectively.

The terminal 20 is, of course, not limited to an in-vehicle device (drive recorder) installed in the automobile 2; it may be installed in (or carried aboard) a moving body such as a bicycle, a railway vehicle, a robot, or a drone. It may also be a wearable terminal such as an HMD (Head Mounted Display) or a glasses-type terminal. It may even be an information processing device that is not movable (non-mobile), such as a personal computer (PC).

As another example, a terminal (event occurrence estimation device) according to the present invention can be mounted on a monitoring robot in a factory so that the occurrence of various troubles (events) that may arise in the factory is predicted from the robot's camera video.

Furthermore, in an embodiment entirely different from that of FIG. 1, the cloud server 1 may also have the event occurrence estimation function, with the terminal 20 uploading image data captured by the camera 203 to the cloud server 1 and obtaining the event occurrence estimation result from the cloud server 1.

The functional configuration of the devices of this embodiment is described in detail below. For ease of explanation, the predetermined event whose likelihood of occurrence is to be estimated is assumed hereafter, unless otherwise noted, to be a traffic accident or near-miss.

[Functional configuration of the learning model generation device]
According to the functional block diagram in FIG. 1, the cloud server 1 has a communication interface 101 and a processor/memory. The processor/memory stores an embodiment of a learning model generation program, which includes the learning data generation program according to the present invention, and also has computer functions; by executing this learning model generation program, it carries out the learning model generation processing.

It follows that, instead of the cloud server 1, the learning model generation device according to the present invention may be, for example, a non-cloud server device, a personal computer (PC), a notebook or tablet computer, or a smartphone on which the above learning model generation program according to the present invention is installed.

For example, the above learning model generation program according to the present invention may be installed on the terminal 20 so that the terminal 20 itself serves as the learning model generation device according to the present invention. An embodiment in which the learning model generation device according to the present invention is installed in the automobile 2 together with the terminal 20 is also possible.

The processor/memory has an image acquisition unit 111, a region-divided image generation unit 112, a surrounding situation information determination unit 113, a learning data generation unit 114, and a learning model generation unit 115. These functional units can be regarded as functions of the learning model generation program, including the learning data generation program according to the present invention, stored in the processor/memory. The processing flow shown in FIG. 1 by the arrows connecting the functional units of the cloud server 1 can also be understood as an embodiment of the learning model generation method according to the present invention.

In the functional block diagram of FIG. 1, the image acquisition unit 111 is image data management means that collects and stores the image data to be included in the "learning data" used to generate the "learning model", and outputs that image data as needed for learning data generation. The image acquisition unit 111 can, for example, acquire a large amount of image data from the terminals 20 via the communication interface 101. It may also acquire a large amount of image data from an external image database.

In this embodiment, as described in detail later, the image data that the image acquisition unit 111 acquires from each terminal 20 preferably includes image data generated by image capture on the automobile 2 carrying that terminal 20 at or near the time when the automobile 2 performed sudden deceleration or braking, or sudden steering (a sudden turn of the steering wheel), sharp enough to satisfy a predetermined condition. It is also preferable that image data relating to such sudden deceleration, sudden braking, or sudden steering is given a "sudden deceleration" tag, a "sudden braking" tag, or a "sudden steering" tag.
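The tagging of frames captured at or near such a time can be sketched as follows. This is an illustration under assumptions: the event times are taken as already detected, "near" is interpreted as within `window_s` seconds, and the function name and data layout are hypothetical.

```python
def tag_frames(frame_times, events, window_s=1.0):
    """Attach event tags to frames captured within window_s of an event.

    frame_times: capture times in seconds.
    events: list of (event_time_s, tag) pairs, e.g. (12.4, "sudden braking").
    Returns a dict mapping each tagged frame time to a sorted list of tags.
    """
    tagged = {}
    for t in frame_times:
        tags = sorted(
            {tag for event_time, tag in events if abs(t - event_time) <= window_s}
        )
        if tags:
            tagged[t] = tags
    return tagged
```

Frames that receive no tag are omitted from the result, so the returned dictionary directly lists the candidate images for learning data generation.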

Image data tagged in this way is well suited to generating a "learning model" that estimates the likelihood of the predetermined event, namely a traffic accident or near-miss involving the automobile 2.

In this embodiment, the region-divided image generation unit 112 applies the semantic segmentation processing described above to the image data output from the image acquisition unit 111 to generate region-divided image data. The specific content of the region-divided image data is described in detail later with reference to FIGS. 2 and 3.

The surrounding situation information determination unit 113 determines surrounding situation information, which is information that may be related to the occurrence of the predetermined event (a traffic accident or near-miss), based on the pixel regions (image partial regions) in the generated region-divided image data and on the class labels (assigned or not assigned to those regions). In this embodiment, it is particularly important to determine the surrounding situation information for image data given a "sudden deceleration", "sudden braking", or "sudden steering" tag (more precisely, for the region-divided image data generated from that image data). The specific content of the surrounding situation information is described in detail below with reference to FIGS. 2 and 3.

FIGS. 2 and 3 are schematic diagrams for explaining examples of the segmented image generation processing and the surrounding situation information generation processing according to the present invention.

In the example shown in FIG. 2, the segmented image generation unit 112 applies Semantic Segmentation processing to RGB image data containing the scene in the traveling direction of the automobile 2 to generate a segmented image. As shown in FIG. 2, the segmented image generated here contains a pixel region Ra assigned the class label "road", but contains no pixel region assigned the class label "automobile" or "human" and no pixel region assigned the class label "visual obstruction".

Next, the surrounding situation information determination unit 113 determines the surrounding situation information based on the pixel region information and the class label information (labels assigned or not assigned) in the generated segmented image. Specifically, in this embodiment, the items of the surrounding situation information,
(a) road width, (b) visual obstruction, (c) number of people, (d) number of automobiles, ...
are set in advance; for each item, a "risk score (occurrence factor score)" indicating the degree to which the item contributes to the occurrence of a traffic accident or traffic near-miss event is determined, and the risk scores determined for the items are used as the surrounding situation information.

Incidentally, when the image data is generated by capture from a moving body traveling in a traffic area and the predetermined event is an accident or near-miss event involving that moving body, the surrounding situation information can include at least one of: (a) information on an item relating to the width of the moving body and/or the width of the traffic area; (b) information on an item relating to the presence of a visual obstruction; (c) information on an item relating to the presence of people and/or moving bodies (including animals) in the traffic area; (d) information on an item relating to the presence of an intersection where multiple traffic areas cross; (e) information on an item relating to the number of people and/or moving bodies; (f) information on an item relating to a change in position, of a predetermined amount or more, of surrounding moving bodies and/or people (for example, behavior such as sudden braking or sudden cutting-in by a vehicle ahead); and (g) information on an item relating to luminance in the original image data.

In the example shown in FIG. 2, the calculated "road width", "visual obstruction", and "number of people" risk scores are all relatively small. The "road width" risk score can be determined as follows. A coordinate conversion formula between the image coordinate system and the real-space coordinate system (with the camera position as the origin) is determined in advance. This formula is used to calculate the real-space width (horizontal distance) of the pixel region Ra, which is taken as the road width. The score is then looked up in a table that associates narrower road widths with larger scores. It is also preferable that the scores in this table be set taking into account the width of the automobile 2 (its ratio to the road width).
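The table lookup described above can be sketched as follows. This is only an illustration under assumptions: the road width is taken to be already converted to metres, and the band boundaries, scores, and the name `road_width_score` are hypothetical, not values from this specification.

```python
from bisect import bisect_right

# Hypothetical table (not from the specification): upper bounds of each road
# width band in metres, and the risk score for each band. Narrower roads map
# to larger scores.
WIDTH_BOUNDS_M = [3.0, 4.5, 6.0, 9.0]
WIDTH_SCORES = [1.0, 0.8, 0.5, 0.2, 0.0]  # one more entry than bounds


def road_width_score(road_width_m: float) -> float:
    """Look up the 'road width' risk score for a road width in metres."""
    if road_width_m <= 0:
        raise ValueError("road width must be positive")
    # bisect_right returns the index of the band the width falls into.
    return WIDTH_SCORES[bisect_right(WIDTH_BOUNDS_M, road_width_m)]
```

A variant that accounts for the width of the automobile 2 could index the table by the ratio of vehicle width to road width instead of by the road width itself.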

The "visual obstruction" risk score can be determined by defining a "field-of-view region" in the segmented image, taking the pixel region Ra into account, then determining the proportion of this field-of-view region occupied by pixel regions assigned the class label "visual obstruction", and looking up the score in a table that associates larger proportions with larger scores. The score may further be determined based on the position that a "visual obstruction" pixel region occupies within the pixel region Ra. For example, a larger score may be given the closer that pixel region is to the center of the pixel region Ra.
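As a rough sketch of the proportion-based scoring just described, the segmented image can be represented as a grid of class labels and the field-of-view region as a boolean mask of the same shape. The mask representation, the table values, and the function names are assumptions made for illustration only.

```python
def obstruction_ratio(labels, view_mask, obstruction_label="visual obstruction"):
    """Fraction of field-of-view pixels labelled as a visual obstruction."""
    in_view = 0
    blocked = 0
    for label_row, mask_row in zip(labels, view_mask):
        for label, inside in zip(label_row, mask_row):
            if inside:
                in_view += 1
                if label == obstruction_label:
                    blocked += 1
    return blocked / in_view if in_view else 0.0


# Hypothetical table (not from the specification): (minimum ratio, score),
# ordered from the largest ratio down. A larger ratio gives a larger score.
RATIO_TABLE = [(0.5, 1.0), (0.25, 0.6), (0.1, 0.3), (0.0, 0.0)]


def obstruction_score(ratio):
    """Look up the 'visual obstruction' risk score for an occupancy ratio."""
    for min_ratio, score in RATIO_TABLE:
        if ratio >= min_ratio:
            return score
    return 0.0
```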

Likewise, the "number of people" risk score can be determined by defining a "human-presence attention region" in the segmented image, taking the pixel region Ra into account, then determining the number of pixel regions that are assigned the class label "human" and whose lower end (the feet) lies within this attention region, and looking up the score in a table that associates larger numbers with larger scores. In this case, the closer the lower end (the feet) is to the center of the pixel region Ra within the human-presence attention region, the larger the weight that may be given when counting; that is, such a region may be counted as more than 1.
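The weighted count described above might look like the following sketch. The rectangular attention region, the linear weighting by horizontal distance from the road center, and the name `weighted_person_count` are all assumptions; the specification fixes neither the shape of the region nor the weighting function.

```python
def weighted_person_count(foot_points, area, road_center_x, max_bonus=1.0):
    """Count 'human' regions whose foot point lies in the attention region.

    foot_points: (x, y) pixel positions of the lower end of each 'human' region.
    area: (x_min, y_min, x_max, y_max), a rectangle standing in for the
          human-presence attention region (an assumption).
    Each person counts as at least 1, plus up to max_bonus the closer the
    foot point is (horizontally) to road_center_x.
    """
    x_min, y_min, x_max, y_max = area
    half_span = max(road_center_x - x_min, x_max - road_center_x)
    total = 0.0
    for x, y in foot_points:
        if x_min <= x <= x_max and y_min <= y <= y_max:
            if half_span > 0:
                closeness = 1.0 - abs(x - road_center_x) / half_span
            else:
                closeness = 1.0
            total += 1.0 + max_bonus * closeness
    return total
```

The resulting count would then be mapped to a risk score through a table in which larger counts correspond to larger scores.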

Next, in the example shown in FIG. 3, as in the example of FIG. 2, the segmented image generation unit 112 applies Semantic Segmentation processing to RGB image data containing the scene in the traveling direction of the automobile 2 to generate a segmented image. As shown in FIG. 3, the segmented image generated here contains a pixel region Ra assigned the class label "road", a pixel region Rb assigned the class label "columnar body (visual obstruction)", a pixel region Rc assigned the class label "wall-like body (visual obstruction)", and a pixel region Rd assigned the class label "human", but contains no pixel region assigned the class label "automobile".

Next, as in the example of FIG. 2, the surrounding situation information determination unit 113 determines the surrounding situation information based on the pixel region information and the class label information (labels assigned or not assigned) in the generated segmented image. Specifically, in the example of FIG. 3, the calculated "road width" and "visual obstruction" risk scores are relatively large, and the "number of people" risk score is relatively moderate.

Returning to the functional block diagram of FIG. 1, the learning data generation unit 114
(a) generates a data set in which a value (label) indicating that a traffic accident (or traffic near-miss event) is likely to occur is associated, as correct-answer data, with the original image data for which surrounding situation information (a group of risk scores) satisfying a predetermined condition was determined,
(b) generates a data set in which a value (label) indicating that a traffic accident (or traffic near-miss event) cannot be said to be likely to occur is associated, as correct-answer data, with the original image data for which surrounding situation information (a group of risk scores) not satisfying that predetermined condition was determined,
and generates "learning data" containing these data sets.

Here, it is also preferable to use, as the (original) image data, only image data carrying a "sudden deceleration", "sudden braking", or "sudden steering" tag. In this case, for tagged image data that may be related to a traffic accident or traffic near-miss event, the learning data generation unit 114 can additionally take the surrounding situation information into account and associate a more accurate correct-answer label. In other words, it can generate more appropriate "learning data" that does not depend solely on the presence or absence of a "sudden deceleration", "sudden braking", or "sudden steering" tag. Of course, image data without any of these tags can also be used as learning data once an appropriate label has been associated with it.

The predetermined condition in (a) above may be, for example, that the value of the weighted summation of the individual risk scores in the group of risk scores serving as the surrounding situation information, that is, the "total risk score", is equal to or greater than a predetermined threshold. In this case, for example, the "learning data" can consist of data sets in which 1 (dangerous) is associated with image data whose "total risk score" is equal to or greater than the predetermined threshold, and data sets in which 0 (safe) is associated with image data whose "total risk score" is below the predetermined threshold. For example, the image data shown in FIG. 2 is associated with 0 (safe), while the image data shown in FIG. 3 is associated with 1 (dangerous), to generate the "learning data".
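The thresholding of the total risk score can be illustrated with a short sketch. The item names, weights, and threshold used in the usage note are hypothetical values chosen for illustration; the specification does not give concrete numbers.

```python
def total_risk_score(scores, weights):
    """Weighted sum of the per-item risk scores (the 'total risk score')."""
    return sum(weights[item] * value for item, value in scores.items())


def binary_label(scores, weights, threshold):
    """Return 1 (dangerous) if the total risk score reaches the threshold,
    otherwise 0 (safe)."""
    return 1 if total_risk_score(scores, weights) >= threshold else 0
```

For instance, with equal weights of 1.0 and a threshold of 1.5, a scene with scores such as 0.1, 0.0, and 0.1 (all small, as in FIG. 2) would be labelled 0, while a scene with scores such as 0.9, 0.8, and 0.5 (as in FIG. 3) would be labelled 1.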

As a further variation, it is also preferable to define first to N-th ranges in advance for the "total risk score" described above (for example, three ranges: first (small), second (medium), and third (large)), and to generate, as the "learning data", data sets in which p is associated as the correct-answer label with image data whose "total risk score" falls in the p-th range (1 ≤ p ≤ N); for example, 1 (safe) is associated with image data whose total risk score falls in the first (small) range. Alternatively, the "total risk score" may be converted to a degree of danger (0.0: safe to 1.0: dangerous), and data sets in which this degree of danger is associated with the image data as the correct-answer label may be generated as the "learning data".
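Both variations can be sketched briefly. The cut points between ranges and the linear rescaling to a degree of danger are assumptions; the specification does not state how the ranges are chosen or how the conversion is performed.

```python
from bisect import bisect_right


def range_label(total, boundaries):
    """Return p (1..N) for the range the total risk score falls in.

    boundaries: the N-1 ascending cut points between ranges (hypothetical).
    """
    return bisect_right(boundaries, total) + 1


def danger_degree(total, max_total):
    """Linearly rescale the total risk score to 0.0 (safe)..1.0 (dangerous),
    clamping values outside the range. Linear scaling is an assumption."""
    if max_total <= 0:
        raise ValueError("max_total must be positive")
    return min(1.0, max(0.0, total / max_total))
```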

In any case, for image data that was generated, for example, at the time of sudden braking, sudden deceleration, or sudden steering, and whose surrounding situation information (group of risk scores) indicates a situation in which a traffic accident (or traffic near-miss event) is likely to occur, the learning data generation unit 114 can generate a data set in which a value (label) indicating that a traffic accident (or traffic near-miss event) is likely to occur is associated as correct-answer data, and thereby generate the "learning data".

The learning model generation unit 115 uses the "learning data" generated by the learning data generation unit 114 to generate a "learning model" based on a predetermined machine learning algorithm. In this embodiment, this "learning model" is a model that can output, from input image data, the likelihood (degree of danger) of a traffic accident (or traffic near-miss event).

Various machine learning algorithms are applicable here, such as the deep neural network (DNN), which is widely used for image recognition, the support vector machine (SVM), and the random forest. In any case, a variety of algorithms can be adopted as long as they constitute a classifier that takes image data as input and outputs a classification result.

Specifically, in one embodiment, the learning model generation unit 115 may configure a classifier that includes
(a) a convolutional layer section (Convolutional Layers), serving as a first NN, that takes image data as input and outputs feature information on the features of that data, and
(b) a fully-connected layer section (Fully-Connected Layers), serving as a second NN, that takes the feature information output from the convolutional layer section as input and outputs information on the class (information on the label),
and may generate the "learning model" by training this classifier with the "learning data".

The convolutional layer section in (a) above executes convolution processing that slides a kernel (a weighting matrix filter) over the image data to generate a feature map. This convolution processing extracts basic features such as edges and gradients while lowering the image resolution in stages, and yields information on local correlation patterns. For example, the well-known AlexNet, which uses multiple convolutional layers, can be used as this convolutional layer section.

In this AlexNet, each convolutional layer is paired with a pooling layer, and convolution processing and pooling processing are repeated. Pooling processing here means summarizing the feature map output from the convolutional layer (the response of the convolution filter within a fixed region) by its maximum value, average value, or the like, thereby securing local translation invariance while reducing the number of parameters to be tuned.
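To make the two operations concrete, the following is a minimal pure-Python sketch of a single sliding-kernel convolution followed by max pooling. It is not AlexNet and not a trainable network; it assumes no padding, a stride of 1 for the convolution, and non-overlapping pooling windows, and the function names are chosen for illustration.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and return the
    feature map. As is usual in CNNs, the kernel is not flipped."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Summarize each non-overlapping size x size window by its maximum.
    Trailing rows or columns that do not fill a window are dropped."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + dr][c * size + dc]
                for dr in range(size) for dc in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]
```

Because each pooled value depends only on the maximum within its window, shifting a feature by a pixel inside the window leaves the output unchanged, which is the local translation invariance mentioned above.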

In another embodiment, the learning model generation unit 115 can configure a classifier in which support vector machines (SVMs), one provided for each class (label) to be discriminated, are connected to the output side of a convolutional neural network (CNN) containing convolutional layers, and can generate the "learning model" by training this classifier with the learning data.

In any case, in this embodiment the learning model generation unit 115 can transmit the generated "learning model" via the communication interface 101 to, for example, the terminal 20, which is an event occurrence estimation device.

[Functional Configuration of the Event Occurrence Estimation Device]
As also shown in the functional block diagram of FIG. 1, the terminal 20 has a communication interface 201, a communication interface 202, a camera 203, a display (DP) 204, and a processor/memory. The processor/memory stores one embodiment of the event occurrence estimation program according to the present invention and, having computer functionality, carries out event occurrence estimation processing by executing this program.

For this reason, in place of the present terminal 20, which is a drive recorder, the event occurrence estimation device according to the present invention may also be another in-vehicle information processing device on which the event occurrence estimation program according to the present invention is installed, or a smartphone, a notebook or tablet computer, a personal computer (PC), or the like that has a camera or is connected to one. A terminal, for example a smartphone, that is communicatively connected to a drive recorder via Wi-Fi (registered trademark), Bluetooth (registered trademark), or the like may also serve as the event occurrence estimation device.

The processor/memory further has a video generation unit 211, an image selection unit 212, an accident occurrence estimation unit 213, a presentation information generation unit 214, a segmented image generation unit 215, a surrounding situation information determination unit 216, and an occurrence factor information generation unit 217. These functional components can be regarded as functions of the event occurrence estimation program stored in the processor/memory. The processing flow shown in FIG. 1 by the arrows connecting the functional components of the terminal 20 can also be understood as one embodiment of the event occurrence estimation method according to the present invention.

Also in the functional block diagram of FIG. 1, the video generation unit 211 generates video data (a group of image data) based on the captured data output from the camera 203. In this embodiment the terminal 20 is a drive recorder, and as its default setting the video generation unit 211 normally acquires from the camera 203 captured data of the situation outside the vehicle at all times, at least while the automobile 2 is traveling, and generates video data (a group of image data).

In this embodiment, the image selection unit 212 selects, from the image data generated by the video generation unit 211, the image data at or near the time (for example, within a predetermined time range centered on that time) when a sudden deceleration or braking operation, or a sudden steering operation, abrupt enough to satisfy a predetermined condition was performed. It assigns the applicable "sudden deceleration", "sudden braking", or "sudden steering" tag, and transmits the tagged image data to the cloud server 1 via the communication interface 201, which is connected to a communication network such as a mobile phone network or the Internet.

Here, the terminal 20 may be connected via the communication interface 202 to the CAN (Controller Area Network) of the automobile 2, either by wire or over a short-range wireless network such as Wi-Fi (registered trademark) or Bluetooth (registered trademark), and the image selection unit 212 may acquire speed information, braking information, and steering information from this CAN via the communication interface 202.

In this case, the image selection unit 212 can calculate the acceleration of the automobile from the acquired speed information and, if that acceleration satisfies a predetermined condition (for example, that it is negative and its absolute value exceeds a predetermined threshold for at least a predetermined time), assign a "sudden deceleration" tag to the corresponding image data. It is also preferable to calculate, from the acquired braking information, a value indicating the strength of the braking operation of the automobile and, if that value satisfies a predetermined condition (for example, that it exceeds a predetermined threshold for at least a predetermined time), assign a "sudden braking" tag to the corresponding image data. Likewise, it is preferable to calculate, from the acquired steering information, a value indicating the strength of the steering operation of the automobile and, if that value satisfies a predetermined condition (for example, that it exceeds a predetermined threshold for at least a predetermined time), assign a "sudden steering" tag to the corresponding image data.
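The "sudden deceleration" condition given as an example above can be sketched as follows. The sketch assumes speed samples in metres per second taken at a fixed interval; the function name, parameters, and units are illustrative and not taken from the specification.

```python
def has_sudden_deceleration(speeds_mps, dt_s, decel_threshold, min_duration_s):
    """True if the acceleration stays negative with an absolute value above
    decel_threshold (m/s^2) for at least min_duration_s seconds in a row.

    speeds_mps: speed samples (m/s) taken every dt_s seconds (assumed).
    """
    consecutive = 0
    for prev, cur in zip(speeds_mps, speeds_mps[1:]):
        accel = (cur - prev) / dt_s
        if accel < 0 and abs(accel) > decel_threshold:
            consecutive += 1
            if consecutive * dt_s >= min_duration_s:
                return True
        else:
            consecutive = 0  # the condition must hold continuously
    return False
```

The same pattern (a value exceeding a threshold for a minimum duration) would apply to the braking and steering strength values, with their own thresholds.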

As a variation, it is also preferable that the image selection unit 212 perform optical flow analysis on the time-series image data acquired from the video generation unit 211 and, when an optical flow vector magnitude large enough to satisfy a predetermined condition is calculated, assign a "sudden change in situation" tag to the corresponding image data and transmit it to the cloud server 1. It is also possible to perform image difference analysis on the time-series image data and assign the "sudden change in situation" tag when an image difference of a predetermined amount or more is detected within a predetermined time.
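The image difference variant can be sketched with a simple frame-to-frame comparison. The sketch assumes grayscale frames of equal size given as nested lists, uses the mean absolute pixel difference as the "image difference amount", and treats the interval between consecutive frames as the predetermined time; all three choices are assumptions, as are the function names.

```python
def mean_abs_difference(frame_a, frame_b):
    """Mean absolute per-pixel difference between two equally sized
    grayscale frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count if count else 0.0


def has_sudden_change(frames, diff_threshold):
    """True if any pair of consecutive frames differs by at least
    diff_threshold, which would trigger the 'sudden change in situation' tag."""
    return any(
        mean_abs_difference(a, b) >= diff_threshold
        for a, b in zip(frames, frames[1:])
    )
```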

When image data is transmitted from the terminal 20 to the cloud server 1, the image data is normally subjected to encoding processing such as MPEG (Moving Picture Experts Group) encoding, and compressed image data is generated and transmitted. Here, it is also preferable that, when a motion vector determined during the encoding processing (as an encoding parameter) has a magnitude that satisfies a predetermined condition, a "sudden change in situation" tag be assigned to the corresponding compressed image data before it is transmitted to the cloud server 1.

As a variation, the cloud server 1 may decompress (decode) the compressed image data received from the terminal 20, extract the motion vectors, and assign a "sudden change in situation" tag to the decompressed image data when a motion vector has a magnitude that satisfies a predetermined condition. Furthermore, when the corresponding CAN information is constantly transmitted from the terminal 20 to the cloud server 1 together with the compressed image data, or when compressed image data with CAN information attached is transmitted, the cloud server 1 may assign a "sudden deceleration", "sudden braking", or "sudden steering" tag, or a "sudden change in situation" tag, to the received and decompressed image data based on the acquired CAN information.

In any case, the cloud server 1 handles image data carrying the "sudden change in situation" tag described above in the same way as image data carrying a "sudden deceleration", "sudden braking", or "sudden steering" tag.

Of course, the image selection unit 212 may also transmit to the cloud server 1, for use in generating learning data, image data to which no "sudden deceleration", "sudden braking", "sudden steering", or "sudden change in situation" tag has been assigned. Such image data can, for example, be added to the learning data after being associated with a value (correct-answer label) indicating that a traffic accident (or traffic near-miss event) is unlikely to occur.

Also in the functional block diagram of FIG. 1, the accident occurrence estimation unit 213, serving as the event occurrence estimation means, has a classifier that carries out traffic accident occurrence estimation processing using the "learning model" acquired from the cloud server 1 via the communication interface 201. It receives input image data generated by the video generation unit 211, or selected (and given a predetermined tag) by the image selection unit 212, and estimates the likelihood of a traffic accident (or traffic near-miss event) for that input image data, which reflects the surrounding situation.

Depending on how the correct-answer class labels were set in the "learning data" used to generate the "learning model", the estimation result may be one of two values, for example one of the class labels "danger" and "safe". Alternatively, it may be one of multiple values, for example one of the class labels "danger", "caution", and "drive safely".

The segmented image generation unit 215 applies, to the input image data that is the target of event occurrence estimation (the image data input to the accident occurrence estimation unit 213), processing that assigns to each pixel or unit region a value relating to the situation of the surroundings or of a predetermined environment, and generates segmented image data, that is, image data in which each resulting pixel region (image partial region) is given a label relating to that situation. The processing for generating the segmented image data can be the same as that of the segmented image generation unit 112 of the cloud server 1 described above; specifically, it may be Semantic Segmentation processing.

The surrounding situation information determination unit 216 determines surrounding situation information, that is, information that may be related to the occurrence of a traffic accident or traffic near-miss event, based on the pixel regions and labels of the segmented image data generated by the segmented image generation unit 215. Here too, the processing for determining the surrounding situation information can be the same as that of the surrounding situation information determination unit 113 of the cloud server 1 described above.

Based on the surrounding situation information determined by the surrounding situation information determination unit 216, the occurrence factor information generation unit 217 generates and outputs "occurrence factor information", which is information that supports the estimation result output from the accident occurrence estimation unit 213 and relates to the factors that cause a traffic accident or traffic near-miss event. This "occurrence factor information" may be, for example, "narrow road" when the "road width" risk score (surrounding situation information) is equal to or greater than a predetermined threshold, or "many pedestrians" when the "number of people" risk score (surrounding situation information) is equal to or greater than a predetermined threshold.

For the input image data input to the accident occurrence estimation unit 213, the presentation information generation unit 214 generates "presentation information" containing (a) the output estimation result and (b) the generated occurrence factor information, and causes the display 204 to show this "presentation information" together with the input image data. Here, it is also preferable, for example, to generate and display "presentation information" containing the "occurrence factor information" only when the estimation result in (a) corresponds to "danger" (or to a degree of danger at or above a predetermined level).

FIG. 4 is a schematic diagram for explaining one example of the presentation information generation processing according to the present invention.

According to FIG. 4, for image data of the traveling direction of the automobile 2 captured and generated by the camera 203,
(a) the accident occurrence estimation unit 213 outputs the estimation result "danger", and
(b) the surrounding situation information determination unit 216 determines a "number of people" risk score at or above a predetermined threshold and a "visual obstacle" risk score at or above a predetermined threshold, in response to which the occurrence factor information generation unit 217 generates and outputs the occurrence factor information "many pedestrians" and "poor visibility".
The presentation information generation unit 214 then generates the presentation information "Danger! Many pedestrians! Poor visibility!" from the outputs of (a) and (b) and displays it on the display 204 as a speech balloon within the image.
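
As a rough illustration of how the outputs of (a) and (b) might be combined, the following Python sketch builds the balloon text from an estimation result and a list of factor messages. The function name `build_presentation_info` and the string label `"danger"` are assumptions; the sketch also adopts the optional behaviour mentioned earlier, in which presentation information is produced only for a "danger" result.

```python
def build_presentation_info(estimate, factors):
    """Build the text shown in the balloon, or None when nothing is presented.

    estimate: estimation result label (the value "danger" is an assumed encoding).
    factors:  list of occurrence factor messages, e.g. ["many pedestrians"].
    """
    if estimate != "danger":
        # Assumed policy: present information only for "danger" results.
        return None
    parts = ["Danger"] + [message.capitalize() for message in factors]
    return " ".join(part + "!" for part in parts)


if __name__ == "__main__":
    print(build_presentation_info("danger", ["many pedestrians", "poor visibility"]))
```

With the inputs of the FIG. 4 example, this yields the same text as the balloon described above.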

In this embodiment, the series of processes including the event occurrence estimation process is carried out immediately after the input image data is generated, and the image data and presentation information are displayed on the display 204 substantially in real time.

In this way, presenting the reason for the danger (matters that may cause an accident) together with an estimation result, particularly one corresponding to "danger", draws the user's attention more precisely. The user, in turn, can immediately and reliably understand what to watch for in order to avoid an accident, and can drive more appropriately and safely.

Also, for example, a self-driving car can more accurately recognize the degree of danger (the likelihood of a traffic accident) in the unknown road situation currently in front of it, even when that scene is not one registered as a so-called near-miss point, and can carry out driving operations better suited to that degree of danger.

[Learning data/learning model generation method, event occurrence estimation method]
FIG. 5 is a sequence diagram outlining an embodiment of the learning data/learning model generation method and the event occurrence estimation method according to the present invention. In this embodiment, each of the terminals 20A to 20C constantly captures the situation in the traveling direction of the automobile 2 with the camera 203 and generates video (image data) (step S101). Although there are three terminals 20 in this embodiment, the present invention is of course not limited to this number.

(S102) The terminal 20C acquires, from the CAN, sudden braking information of the automobile 2 in which it is mounted.
(S103) The terminal 20C selects the image data generated at the time the sudden braking information was acquired.
(S104) The terminal 20C transmits the selected image data from the time of sudden braking (image data to which a sudden braking tag has been attached) to the cloud server 1.
Processing corresponding to steps S102 to S104 is also performed on the other terminals 20A and 20B, and is repeated on every terminal, so that the cloud server 1 acquires a sufficient amount of image data to which the sudden braking tag has been attached.
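
Steps S102 to S104 can be sketched as a nearest-timestamp lookup. This is a minimal illustration that assumes the sudden braking information arrives as a list of timestamps and that frames carry sorted capture timestamps; the function name `select_braking_frames`, the `max_gap` tolerance, and the tag string are all hypothetical.

```python
from bisect import bisect_left


def select_braking_frames(frame_times, brake_times, max_gap=0.5):
    """Pick, for each braking timestamp, the nearest frame within max_gap seconds.

    frame_times: sorted capture timestamps (seconds) of the generated frames.
    brake_times: timestamps at which sudden braking information was acquired.
    Returns a list of dicts holding the frame index and an assumed tag string.
    """
    selected = []
    for t in brake_times:
        i = bisect_left(frame_times, t)
        # The nearest frame is either just before or just after the insertion point.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(frame_times)]
        if not candidates:
            continue
        best = min(candidates, key=lambda j: abs(frame_times[j] - t))
        if abs(frame_times[best] - t) <= max_gap:
            selected.append({"frame_index": best, "tag": "sudden_braking"})
    return selected


if __name__ == "__main__":
    frames = [0.0, 0.5, 1.0, 1.5, 2.0]
    print(select_braking_frames(frames, [1.1, 5.0]))
```

The tolerance reflects the idea that a frame "at or near" the braking time is acceptable; a braking event with no frame close enough is simply skipped.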

(S111) The cloud server 1 performs semantic segmentation processing on the acquired image data to generate a segmented image.
(S112) The cloud server 1 determines, from the generated segmented image, a group of risk scores as surrounding situation information.
(S113) The cloud server 1 generates "learning data" including a data set in which the correct label "danger" is associated with image data whose risk score group satisfies a predetermined condition.
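
Steps S112 and S113 can be illustrated with a small Python sketch that takes a segmented image as a grid of class labels. The scoring formulas, the class names `"person"` and `"road"`, the threshold, and the "at least `min_items` scores at or above the threshold" condition are illustrative assumptions; this passage does not fix how the scores or the predetermined condition are defined.

```python
from collections import Counter


def class_fractions(label_map):
    """Fraction of pixels per class in a segmented image given as rows of labels."""
    counts = Counter(label for row in label_map for label in row)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def risk_scores(label_map):
    """Hypothetical risk scores in [0, 1] derived from class pixel fractions."""
    frac = class_fractions(label_map)
    return {
        # More person pixels -> higher "number of people" score (assumed scaling).
        "num_people": min(1.0, frac.get("person", 0.0) * 4),
        # Fewer road pixels -> higher "road width" score (assumed scaling).
        "road_width": 1.0 - min(1.0, frac.get("road", 0.0) * 2),
    }


def make_learning_sample(image_id, label_map, threshold=0.5, min_items=1):
    """Return a labeled sample if enough scores reach the threshold, else None."""
    scores = risk_scores(label_map)
    hits = sum(1 for s in scores.values() if s >= threshold)
    if hits >= min_items:
        return {"image": image_id, "label": "danger", "scores": scores}
    return None


if __name__ == "__main__":
    seg = [["person", "person", "road", "building"],
           ["road", "building", "building", "building"]]
    print(make_learning_sample("frame_0001", seg))
```

Images whose scores do not meet the condition return `None` and are left out of the learning data, which corresponds to the selection of suitable image data described in the text.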

(S114) Once the "learning data" has reached a predetermined amount, the cloud server 1 generates a "learning model" using that "learning data".
(S115) The cloud server 1 supplies the generated "learning model" to each of the terminals 20A to 20C.
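
The gating in S114 and the distribution in S115 might look like the following sketch. The training routine is passed in as a placeholder `train_fn`, and terminals are represented as plain dictionaries; both are assumptions for illustration, since the passage does not specify the learning algorithm or the transfer mechanism.

```python
def maybe_train_and_distribute(samples, required_size, train_fn, terminals):
    """Train once enough learning data exists, then supply the model to each terminal.

    samples:       accumulated learning data (list of labeled samples).
    required_size: the predetermined amount of learning data.
    train_fn:      placeholder callable that turns samples into a model.
    terminals:     list of dicts standing in for terminals 20A to 20C.
    """
    if len(samples) < required_size:
        return None  # Not enough learning data yet.
    model = train_fn(samples)
    for terminal in terminals:
        terminal["model"] = model
    return model


if __name__ == "__main__":
    terminals = [{"id": "20A"}, {"id": "20B"}, {"id": "20C"}]
    data = [{"image": f"frame_{i}", "label": "danger"} for i in range(3)]
    # Stub trainer that only records how many samples it saw.
    model = maybe_train_and_distribute(data, 3, lambda s: {"trained_on": len(s)}, terminals)
    print(model, terminals)
```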

(S121) Each of the terminals 20A to 20C uses the acquired "learning model" to estimate, for each piece of generated image data, the degree of risk of a traffic accident (the likelihood of a traffic accident occurring).
(S122) Each of the terminals 20A to 20C displays presentation information relating to the estimated degree of risk on the display 204 together with the image data.

As described in detail above, according to the present invention, when "learning data" including image data is generated, "segmented image data" is generated from the image data, and "surrounding situation information", which is important information that may be related to the occurrence of a predetermined event, is determined. Based on this "surrounding situation information", image data suitable for the "learning data" can then be determined, selected, or extracted. This in turn makes it possible to generate "learning data" suitable for generating a "learning model" that enables estimation of the likelihood of occurrence of the predetermined event.

Also, for example, one could consider generating the "learning data" from image data of the very site where the predetermined event occurred. In that case, however, a "learning model" that estimates the likelihood of occurrence of the predetermined event may not be generated effectively. In contrast, the present invention is not limited to image data of the occurrence site, but uses image data that has been determined, selected, or extracted in consideration of the semantically interpreted "surrounding situation information". A "learning model" that estimates the likelihood of occurrence of the predetermined event can therefore be generated more effectively.

Incidentally, a "learning model" generated using the present invention is expected to contribute greatly to strengthening the ability of autonomous mobile bodies, such as self-driving cars, autonomous mobile robots, and autonomous mobile drones, all of which are expected to become more widespread, to recognize the "meaning" of their surroundings. It can also be very useful when surveillance camera systems or surveillance robots are used to predict and respond to the occurrence of various events in a given environment or area.

Those skilled in the art can readily make various changes, modifications, and omissions to the embodiments of the present invention described above within the scope of the technical idea and viewpoint of the present invention. The above description is merely an example and is not intended to be limiting in any way. The present invention is limited only by the claims and their equivalents.

1 Cloud server (learning model generation device)
101 Communication interface
111 Image acquisition unit
112 Segmented image generation unit
113 Surrounding situation information determination unit
114 Learning data generation unit
115 Learning model generation unit
2 Automobile
20, 20A, 20B, 20C Terminal (event occurrence estimation device)
201, 202 Communication interface
203 Camera
204 Display (DP)
211 Video generation unit
212 Image selection unit
213 Accident occurrence estimation unit
214 Presentation information generation unit
215 Segmented image generation unit
216 Surrounding situation information determination unit
217 Occurrence factor information generation unit

Claims

1. A learning data generation method, performed by a computer, for generating learning data used to generate a learning model capable of estimating the likelihood of occurrence of a predetermined event based on input image data relating to the situation of the surroundings or of a predetermined environment, the method comprising:
a step of performing, on image data relating to the situation of the surroundings or of a predetermined environment, processing that assigns a class relating to the situation to each pixel or unit area, and generating segmented image data, which is image data in which each pixel region formed as a class region is given a label corresponding to the class relating to the situation;
a step of determining situation information, which is information that may be related to the occurrence of the event, based on the pixel regions and the labels of the segmented image data; and
a step of generating learning data including a data set in which a value indicating that the event is likely to occur is associated, as correct data, with the image data for which situation information satisfying a predetermined condition has been determined.

2. The learning data generation method according to claim 1, wherein:
in the step of determining the situation information, an occurrence factor score indicating the degree to which the event occurs is determined for each preset item relating to the situation information, and the occurrence factor scores determined for the items are used as the situation information; and
in the step of generating the learning data, learning data is generated that includes a data set in which a value indicating that the event is likely to occur is associated, as correct data, with image data whose occurrence factor scores satisfy a predetermined condition.

3. The learning data generation method according to claim 1, wherein:
the image data is generated by photographing from a moving object moving in a traffic area, and the predetermined event is an accident or attempted accident event involving the moving object; and
the situation information includes at least one of: information on an item relating to the width of the traffic area; information on an item relating to the presence of people and/or moving objects in the traffic area; information on an item relating to the presence of an intersection where multiple traffic areas cross; information on an item relating to the presence of visual obstacles; information on an item relating to the number of people and/or moving objects; and information on an item relating to a change, beyond a predetermined level, in the positions of surrounding moving objects and/or people.

4. The learning data generation method according to any one of claims 1 to 3, wherein the image data is generated by photographing from a moving object at, or near, the time when the moving object performs a sudden deceleration or braking operation or a sudden steering operation that satisfies a predetermined condition, and the predetermined event is an accident or attempted accident event involving the moving object.

5. The learning data generation method according to claim 4, wherein the time when the moving object performs the sudden deceleration or braking operation or the sudden steering operation that satisfies the predetermined condition, or the time near it, is a time determined based on at least one of: speed or braking information obtained from a CAN (Controller Area Network) mounted on the moving object; the vector amount of the optical flow in the image data; and the motion vectors determined during encoding and compression processing of an image data group that includes the image data.

6. The learning data generation method according to any one of claims 1 to 5, wherein the segmented image data is generated by subjecting the image data to semantic segmentation processing.

7. A learning model generated based on a machine learning algorithm using learning data generated by the learning data generation method according to any one of claims 1 to 6, the learning model causing a computer to function as:
means for receiving image data relating to the situation of the surroundings or of a predetermined environment and outputting feature information, which is information relating to features of the image data; and
means for receiving the feature information and outputting information relating to the likelihood of occurrence of the event.

8. An event occurrence estimation device comprising event occurrence estimation means for estimating, using the learning model according to claim 7, the likelihood of occurrence of a predetermined event for input image data relating to the situation of the surroundings or of a predetermined environment, and outputting the result of the estimation.

9. The event occurrence estimation device according to claim 8, further comprising:
segmented image generation means for performing, on the input image data, processing that assigns a class relating to the situation of the surroundings or of the predetermined environment to each pixel or unit area, and generating segmented image data, which is image data in which each pixel region formed as a class region is given a label corresponding to the class relating to that situation;
situation information determination means for determining situation information, which is information that may be related to the occurrence of the event, based on the pixel regions and the labels of the segmented image data; and
occurrence factor information generation means for generating and outputting, based on the situation information, occurrence factor information, which is information that supports the output estimation result and relates to factors causing the event to occur.

10. A learning data generation program for causing a computer to generate learning data used to generate a learning model capable of estimating the likelihood of occurrence of a predetermined event based on input image data relating to the situation of the surroundings or of a predetermined environment, the program causing the computer to function as:
segmented image generation means for performing, on image data relating to the situation of the surroundings or of a predetermined environment, processing that assigns a class relating to the situation to each pixel or unit area, and generating segmented image data, which is image data in which each pixel region formed as a class region is given a label corresponding to the class relating to the situation;
situation information determination means for determining situation information, which is information that may be related to the occurrence of the event, based on the pixel regions and the labels of the segmented image data; and
learning data generation means for generating learning data including a data set in which a value indicating that the event is likely to occur is associated, as correct data, with the image data for which situation information satisfying a predetermined condition has been determined.