JP2011014038A

JP2011014038A - Online risk recognition system

Info

Publication number: JP2011014038A
Application number: JP2009159063A
Authority: JP
Inventors: Motoya Ogawa; 原也小川
Original assignee: Fuji Heavy Industries Ltd
Current assignee: Subaru Corp
Priority date: 2009-07-03
Filing date: 2009-07-03
Publication date: 2011-01-20
Anticipated expiration: 2029-07-03
Also published as: JP5572339B2

Abstract

PROBLEM TO BE SOLVED: To provide an online risk recognition system that improves robustness to diverse external environments when recognizing risk by autonomously learning from experience in an actual environment.

SOLUTION: Model groups, each obtained by dividing an SOM on a one-dimensional loop into a plurality of models according to the traveling state, are provided in parallel for each category such as day and night. A recognition model setting part 5a selects a model group, and the model to be used is then selected from that group on the basis of information such as vehicle speed or steering angle, which avoids placing units in regions where the input data are sparse. Only the image feature quantities extracted by a feature quantity extraction part 4 are input to the SOM model for recognition and learning by a recognition learning part 5b, which improves robustness to input data with differing characteristics. Furthermore, a risk recognition part 7 obtains the correlation between the state and a teacher signal, and learns and recognizes the risk of the state. Robustness to diverse external environments in risk recognition is thereby improved.

Description

The present invention relates to an online risk recognition system that adaptively learns and recognizes risks contained in the external environment of a moving body such as an automobile.

In recent years, as preventive safety technology for moving bodies such as automobiles, techniques have been developed in which an on-board camera images the external environment, the captured images are processed to recognize information on the degree of danger (risk) contained in that environment, and the driver is warned or driving is assisted.

Such a technique for recognizing danger information is disclosed in, for example, Patent Document 1. The technique of Patent Document 1 sets a risk parameter for each type and attribute of object in the environment around the vehicle and calculates the degree of danger from these risk parameters.

In conventional techniques such as that disclosed in Patent Document 1, factors that lead to danger, such as pedestrians, oncoming vehicles, obstacles, and white lines, are set, and risk is recognized on the basis of them. In an actual system, this is realized by building the risk factors and recognition assumed by the developer into the system in advance.

However, actual environments such as the driving environment of an automobile are diverse, with changes in weather and the presence of pedestrians, vehicles, roadside structures, and so on, and the people who drive are also diverse. A single preset recognition model of the conventional kind therefore has limits: unless the factors leading to danger are recognized with high accuracy, the overall risk cannot be recognized, and dangerous scenes other than those assumed in advance cannot be recognized at all.

For this reason, the applicant has proposed, in Patent Document 2, an online learning system in which the system autonomously learns from experience in an actual environment and can recognize the degree of danger in diverse external environments.

Patent Document 1: JP 2003-81039 A
Patent Document 2: JP 2008-238831 A

Because the system in the technique of Patent Document 2 autonomously learns from experience in the actual environment, recognition suited to the user's driving tendencies is possible. However, with that technique the efficiency of learning and recognition may decrease in the face of differences in location, time of day, and the like in the real environment, and there was room for improvement in robustness.

The present invention has been made in view of the above circumstances, and its object is to provide an online risk recognition system that can improve robustness to diverse external environments when it autonomously learns from experience in an actual environment and recognizes the degree of danger.

To achieve the above object, an online risk recognition system according to the present invention is an online risk learning system that detects the external environment of a moving body and recognizes, through learning, the risk contained in that environment, comprising: a feature quantity extraction unit that processes detection information on the external environment and extracts multidimensional feature quantities contained in it; a recognition model setting unit that sets, in a hierarchical arrangement, recognition models for clustering the feature quantities; a state recognition unit that uses the recognition model set by the recognition model setting unit to recognize the multidimensional feature quantities as a one-dimensional state; and a risk recognition unit that adaptively learns the degree of danger of the state on the basis of the correlation between the state recognized by the state recognition unit and teacher information created by extracting risk information relating to the degree of danger, and recognizes the degree of danger contained in the external environment.

According to the present invention, robustness to diverse external environments can be improved when autonomously learning from experience in an actual environment and recognizing the degree of danger.

FIG. 1: Basic configuration diagram of the risk recognition system
FIG. 2: Explanatory diagram showing the image regions for feature extraction
FIG. 3: Conceptual diagram of learning with a one-dimensional SOM
FIG. 4: Explanatory diagram showing the inefficient part of the SOM
FIG. 5: Explanatory diagram showing hierarchization by dividing the SOM model
FIG. 6: Explanatory diagram showing parallelization of SOM models
FIG. 7: Explanatory diagram showing the unit arrangement after the SOM model change
FIG. 8: Conceptual diagram of state recognition
FIG. 9: Explanatory diagram of prior learning and online learning
FIG. 10: Explanatory diagram showing the distribution of the self-organizing map after learning
FIG. 11: Explanatory diagram showing the relationship between risk level and risk probability
FIG. 12: Explanatory diagram of risk propagation
FIG. 13: Explanatory diagram of information propagation
FIG. 14: Explanatory diagram showing the expansion of risk information
FIG. 15: Explanatory diagram showing an output example of recognition results
FIG. 16: Explanatory diagram showing learning results for risk probability

Embodiments of the present invention will be described below with reference to the drawings.
The online risk learning system of the present invention is mounted on a moving body such as an automobile and adaptively recognizes, from the detection results for the external environment, information on the degree of danger (risk) contained in that environment. It grows online so that it can adaptively recognize risk even in environments that were not envisioned in advance.

Sensing of the external environment can use information from various sensor devices provided as the input system, for example a camera that images the outside world in monocular or stereo vision, or a radar device using laser or millimeter waves. In other words, the system does not fundamentally depend on a particular sensor device for detecting the external environment; in a broad sense, it is a learning system that learns the correlation between the external environment information obtained from the sensor devices and risk information.

In this embodiment, the online risk learning system is applied to a vehicle such as an automobile, and an example is described in which risk information is extracted using image information obtained by imaging the outside world with an on-board camera together with vehicle information such as the driver's driving operations and the driving state of the vehicle. That is, the online risk learning system of this embodiment recognizes risk by associating the information obtained from images directly with risk, learns that association from the environments encountered in actual driving, and performs risk recognition adaptively.

Specifically, the driver serves as the teacher for training the recognizer: risk information is extracted from the driver's driving operations, and the relationship between that risk information and the image information obtained from the camera is learned using artificial intelligence techniques. For example, when the driver performs an operation to avoid a pedestrian, the system judges the situation to be dangerous and teaches the recognizer that the image obtained at that moment is dangerous.

As a result, when a similar situation (image) enters the system on a later occasion, the system outputs that it is dangerous and can warn the driver. The system also treats risk probabilistically. This makes it possible to handle cases where the risk differs even in similar situations, as well as inherently probabilistic risks that cannot be judged from the available image information alone.

The online risk learning system of this embodiment is described below with reference to FIG. 1. The online risk learning system 1 of this embodiment consists of a single computer system or of a plurality of computer systems connected via a network or the like. An on-board camera with an image sensor such as a CCD or CMOS sensor is used as the sensor that detects the external environment of the vehicle.

Specifically, the online risk learning system 1 has, as its basic configuration, an image input unit 2, a vehicle data input unit 3, a feature quantity extraction unit 4, a state recognition unit 5, a risk information extraction unit 6, a risk recognition unit 7, and a risk information output unit 8. As detailed later, the system converts feature quantities extracted mainly from images captured by the on-board camera into a quantity called a state, thereby clustering them, and recognizes risk from a risk distribution obtained by learning the probability density distribution of risk for each state.

In this embodiment, a self-organizing map (SOM), which is a kind of layered neural network, is used: each unit of the SOM is recognized as a state, and the recognition process itself is learned. For this purpose, the state recognition unit 5 includes a recognition model setting unit 5a that sets an SOM model as the recognition model, and a recognition learning unit 5b that learns the recognition using the SOM model.

The general function of each unit is as follows. In FIG. 1, the on-board camera serving as the sensor that detects the external environment is not shown.

The image input unit 2 receives captured images from the on-board camera, performs video processing such as noise removal, gain adjustment, and gamma correction, and converts the processed analog image into a digital image with a predetermined number of gradations. Images processed by the image input unit 2 are temporarily stored and accumulated in memory and then sent to the feature quantity extraction unit 4.

The vehicle data input unit 3 receives vehicle data from the in-vehicle network or the like at a predetermined cycle. The vehicle data include vehicle speed, steering wheel angle, accelerator opening, brake pressure, and so on, and are sampled at a predetermined rate (for example, 30 Hz).

If needed for the subsequent recognition processing, other information carried on the in-vehicle network is also input, for example vehicle motion information such as longitudinal and lateral acceleration and yaw rate, system operation information such as headlights and turn signals, and the outside air temperature.

The feature quantity extraction unit 4 receives image data from the image input unit 2 and extracts feature quantities of the image. That is, it extracts feature quantities such as edge information, motion information, and color information from the image and holds them as an N-dimensional vector. As indicated by the broken line in FIG. 1, this N-dimensional vector also includes vehicle information other than image feature quantities, for example vehicle speed and changes in yaw angle.

The image data handled in this embodiment are images captured by a monocular color camera, but images from an infrared camera or range images from a stereo camera may also be used. As noted above, information from laser, millimeter-wave, or similar sensors can also be used, in which case the image feature quantities are more generally to be called external environment feature quantities.

The state recognition unit 5 uses the SOM model to convert the N-dimensional feature vector into what is ultimately a one-dimensional quantity called a state. A state can be pictured as a division of the input images into scenes according to the place being driven through, the weather, the driving state, and so on. In practice, the system cannot be told explicitly during online learning which scene it is currently in, so the input data are instead clustered into M classes, where M is the number of states.

The risk information extraction unit 6 extracts risk information from the input vehicle information (the driver's operation information) and creates teacher information (the magnitude and type of risk). In this system, the extraction of risk information is not itself learned; risk information is recognized using preset rules.

The risk recognition unit 7 obtains the correlation between the state obtained by the state recognition unit 5 and the teacher information created by the risk information extraction unit 6, and learns and recognizes the risk of the state.

The risk information output unit 8 outputs the recognized risk on a monitor, by voice, or by similar means. Besides outputting the recognized risk itself, it may output the difference from the risk information obtained from the operation data. For example, even when the risk recognition unit 7 judges the risk to be high, the output may indicate that the risk is not high if the operation data show that the driver is paying sufficient attention.

In the online risk learning system 1 described above, recognition and output of the current risk run in parallel with learning. That is, risk is recognized for each input image using what has been learned up to that point, while learning updates the risk recognition unit 7 with the input teacher information; at the next time step, risk is recognized with the updated risk recognition unit 7. Repeating this process performs risk recognition and learning simultaneously.

Learning is possible in each of the four parts (a) to (d) listed below. However, learning (a) and (b) requires large-scale computation and takes time, so this system does not learn them during operation; they are either learned offline in advance or implemented with algorithms prepared in advance.
(a) Extraction of risk information from driver operation data (risk information extraction unit 6)
(b) Determination and selection of image feature quantities from image data (feature quantity extraction unit 4)
(c) Conversion from image feature quantities to a state (state recognition unit 5)
(d) Recognition of the correlation between state and risk (risk recognition unit 7)

Next, each process in the online risk learning system 1 is described in detail, in the following order: extracting feature quantities from the on-board camera image, recognizing a state from the extracted feature quantities, extracting risk from vehicle operation information (creating teacher information), and recognizing risk from the state and the given teacher information.

[Image feature quantity extraction]
The feature quantity extraction unit 4 extracts the data used for the subsequent risk recognition. In general, data uncorrelated with risk harm recognition. Adding feature quantities indiscriminately is therefore not a good idea; conversely, leaving out a necessary feature quantity also degrades accuracy.

This raises the problem of feature selection, that is, deciding which feature quantities to use. As noted above, obtaining the selection through learning requires a learning stage above the risk recognition described below, which is unfavorable for online learning in terms of computation and memory.

This embodiment therefore describes an example in which the feature quantity extraction is treated as fixed. If it is to be learned, the combination of feature quantities can be optimized using the recognition rate of the system as the evaluation criterion, with an existing optimization method such as an exhaustive search over combinations or a heuristic search such as a genetic algorithm (GA).

In this embodiment, the feature quantity extraction unit 4 extracts feature quantities of preset types. The processing is divided into three elements, namely preprocessing, feature calculation, and region setting, and the feature quantities set for each element are extracted. Specifically, as described below, there are 6 types of preprocessing, 10 types of feature calculation, and 4 types of region, and their combinations yield data of 240 (6 × 10 × 4) dimensions in total.

<Preprocessing>
Six types of filter processing are applied to the input image: Sobel, vertical Sobel, horizontal Sobel, inter-frame difference, luminance, and saturation. This yields six-dimensional feature data.

<Feature quantities>
Ten types of calculation are applied to the pixel values of the filtered image: mean, variance, maximum, minimum, horizontal centroid, vertical centroid, contrast, uniformity, entropy, and fractal dimension. This yields ten-dimensional feature data.

<Regions>
As shown in FIG. 2, a region A0 is set within the image, and feature data are extracted for four regions: the whole of the set region A0, the left region A1 within A0, the right region A2, and the central region A3. This yields four-dimensional feature data.
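The way the three elements combine into 240 dimensions can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation described in the patent: the names `FILTERS`, `STATISTICS`, `REGIONS`, `feature_index`, and `basic_stats` are hypothetical, and only four of the ten statistics are actually computed.

```python
from itertools import product

# The three elements named in the text (identifier spellings are assumptions).
FILTERS = ["sobel", "sobel_vertical", "sobel_horizontal",
           "frame_diff", "luminance", "saturation"]            # 6 preprocessing types
STATISTICS = ["mean", "variance", "max", "min",
              "centroid_x", "centroid_y", "contrast",
              "uniformity", "entropy", "fractal_dimension"]     # 10 calculation types
REGIONS = ["A0", "A1", "A2", "A3"]                              # 4 image regions


def feature_index():
    """Map each (filter, statistic, region) combination to a vector position."""
    return {combo: i
            for i, combo in enumerate(product(FILTERS, STATISTICS, REGIONS))}


def basic_stats(pixels):
    """Compute four of the ten statistics for a flat list of pixel values."""
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    return {"mean": mean, "variance": variance,
            "max": max(pixels), "min": min(pixels)}
```

Each entry of the index corresponds to one dimension of the feature vector, so a reduced feature set can be expressed simply as a subset of these keys.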

The 240-dimensional feature set may be reduced to fewer dimensions according to the computing performance of the online system. For example, using vehicle data as well as images, a six-dimensional feature vector may be extracted consisting of the mean and variance of the Sobel output over the whole image, the mean and variance of the inter-frame difference over the whole image, the vehicle speed, and the steering wheel angle.

In the feature extraction described above, each feature quantity is normalized. Because normalizing over the theoretical range is inefficient, the distribution of each feature quantity is evaluated in advance, maximum and minimum values are set from the result, and the values are normalized to the range 0 to 1. The maximum and minimum may also be changed dynamically: for example, when a value above the maximum or below the minimum is input, the maximum or minimum is changed so as to widen the range. Conversely, if no data near the minimum or maximum have arrived for some time, the range is narrowed.
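The dynamic range adjustment can be sketched as follows. This is a hedged Python sketch rather than the patent's procedure: the class name `AdaptiveRange` and the parameters `margin` (what counts as "near" a bound), `patience` (how many samples count as "a while"), and `shrink` (how far a bound moves inward) are all assumptions, since the text gives no concrete values.

```python
class AdaptiveRange:
    """Min-max normalizer whose bounds widen on overflow and narrow when idle."""

    def __init__(self, lo, hi, margin=0.05, patience=100, shrink=0.01):
        if not lo < hi:
            raise ValueError("lo must be smaller than hi")
        self.lo, self.hi = float(lo), float(hi)
        self.margin = margin      # normalized distance regarded as "near" a bound
        self.patience = patience  # samples without near-bound data before narrowing
        self.shrink = shrink      # fraction of the span removed when narrowing
        self.since_lo = 0         # samples since a value was near the minimum
        self.since_hi = 0         # samples since a value was near the maximum

    def normalize(self, x):
        # Widen the range when the input falls outside the current bounds.
        if x > self.hi:
            self.hi = float(x)
        if x < self.lo:
            self.lo = float(x)
        span = self.hi - self.lo
        y = (x - self.lo) / span

        # Track how long it has been since data arrived near each bound.
        self.since_lo = 0 if y <= self.margin else self.since_lo + 1
        self.since_hi = 0 if y >= 1.0 - self.margin else self.since_hi + 1

        # Narrow a bound that has not been approached for `patience` samples.
        if self.since_lo >= self.patience:
            self.lo += self.shrink * span
            self.since_lo = 0
        if self.since_hi >= self.patience:
            self.hi -= self.shrink * span
            self.since_hi = 0
        return y
```

One normalizer instance per feature dimension would be used, initialized with the bounds obtained from the prior evaluation of that feature's distribution.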

Although basic feature quantities are used here, temporal variations of the feature quantities can also be calculated, for example by computing motion information from past frames, and used as feature quantities. Furthermore, to improve the accuracy of risk recognition as a whole, high-accuracy image processing can be included in the feature extraction; for example, pedestrian recognition results, road white-line recognition results, and obstacle recognition results may be incorporated into the extracted data. In this sense, the system can also be regarded as one that integrates individual external-environment recognition results to recognize risk.

[State recognition]
The state recognition unit 5 compresses the N-dimensional feature data into a one-dimensional quantity called a state. That is, the state is recognized by a classifier that takes the feature data as input and outputs a state (although the classifier's output can also be treated probabilistically rather than committing to a single state). This processing adapts the internal structure of the classifier to the real environment using the input data and teacher data. The teacher here does not directly indicate which state the input data belong to; rather, it serves to make the recognition of risk from the output state as efficient and accurate as possible.

Specifically, the state recognition unit 5 first sets an SOM model in the recognition model setting unit 5a, and the recognition learning unit 5b then performs recognition and learning using that model. In the SOM algorithm, units arranged in M dimensions (usually two) each hold a vector value (usually called the weights of the connections to the input), and the winner unit for an input is determined on the basis of vector distance. The reference vectors of the winner unit and its neighboring units are then updated so as to approach the input vector. By repeating this, the map as a whole learns, without a teacher, to represent the distribution of the input data optimally. FIG. 3 illustrates learning with a one-dimensional SOM.
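The winner selection and neighborhood update described above can be sketched for a one-dimensional SOM as follows. This is a minimal illustration, not the patent's implementation: the class name `RingSOM`, the Gaussian neighborhood function, the learning rate `lr`, the width `sigma`, and the random initialization are all assumptions, and the units are placed on a closed loop so that the arrangement has no ends.

```python
import math
import random


class RingSOM:
    """One-dimensional self-organizing map whose units form a closed loop."""

    def __init__(self, n_units, dim, seed=0):
        rng = random.Random(seed)
        # One reference vector per unit, initialized at random in [0, 1).
        self.weights = [[rng.random() for _ in range(dim)]
                        for _ in range(n_units)]

    def winner(self, x):
        """Return the index of the unit closest to x (the recognized state)."""
        def sq_dist(i):
            return sum((w - v) ** 2 for w, v in zip(self.weights[i], x))
        return min(range(len(self.weights)), key=sq_dist)

    def update(self, x, lr=0.1, sigma=1.0):
        """Move the winner and its neighbors toward x; return the winner index."""
        win = self.winner(x)
        n = len(self.weights)
        for i, w in enumerate(self.weights):
            d = min(abs(i - win), n - abs(i - win))    # distance along the loop
            h = math.exp(-(d * d) / (2.0 * sigma * sigma))
            for k in range(len(w)):
                w[k] += lr * h * (x[k] - w[k])
        return win
```

Calling `update` on each incoming feature vector both recognizes the current state (the returned unit index) and adapts the map, which mirrors the parallel recognition and learning described for the system.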

In this case, the SOM unit arrangement (and the probability density distribution of each state) is updated successively during driving, enabling recognition suited to each user's driving environment and driving tendencies. In adapting to the real environment, however, robustness may be degraded in the following situations.

(A) The risk output may take similar values even in places where the risk of the scene is thought to differ greatly, such as a main road and a back alley. Robustness to differences in location may therefore be degraded.

(B) The recognition output may be strongly affected by the difference between day and night. Robustness to differences in time of day may therefore be degraded.

The degradation in robustness to differences in location in (A) arises because the learning frequency is uneven among the SOM units of the recognition model, so that learning and recognition are not efficient. In particular, when the SOM units are arranged on a loop so that the one-dimensional arrangement has no ends, and vehicle information such as vehicle speed and steering angle is input to this one-dimensional loop SOM together with image information, the SOM is trying to represent different input distribution regions, such as driving and being stopped, on a single one-dimensional loop. As a result, as shown in FIG. 4, SOM units are also placed in regions where the input data are sparse, which is inefficient for approximating the external environment.

In this embodiment, therefore, as shown in FIG. 5, the conventional SOM model on a single one-dimensional loop is divided into models on a plurality of loops, and processing is made hierarchical using the divided models. That is, as the first stage, the model to be used is selected from the models on the plurality of loops, and data from the same distribution region are input to the selected model, which avoids placing SOM units in regions where the input data are sparse.
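The two-stage processing can be sketched as follows. The abstract names vehicle speed and steering angle as the inputs used to choose a model; everything else in this Python sketch is an assumption made for illustration, including the three model keys, the thresholds `stop_speed` and `turn_angle`, and the function names `select_model_key` and `recognize_state`.

```python
def select_model_key(speed_kmh, steering_deg, stop_speed=1.0, turn_angle=30.0):
    """First stage: choose which loop model to use from the vehicle data.

    The split into "stopped", "turning", and "straight" and both threshold
    values are hypothetical; the source does not specify them.
    """
    if speed_kmh < stop_speed:
        return "stopped"
    if abs(steering_deg) >= turn_angle:
        return "turning"
    return "straight"


def recognize_state(models, speed_kmh, steering_deg, image_features):
    """Second stage: feed only the image features to the selected model.

    `models` maps each key to a callable that returns a state index,
    for example the winner search of a one-dimensional SOM.
    """
    key = select_model_key(speed_kmh, steering_deg)
    return key, models[key](image_features)
```

Because the vehicle data are consumed by the selection step, each model sees inputs from a single distribution region and does not need to spend units bridging the gap between, say, stopped and moving scenes.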

The loss of robustness to time of day described in (B) arises because, when data of different character such as daytime and nighttime data are learned together, the probability density distribution itself is averaged out. Learning on only one of the two kinds of data cannot cope with the other environment. Therefore, at a level above the model hierarchy that addresses (A), a plurality of model groups, for example a daytime model group and a nighttime model group as shown in FIG. 6, are prepared in parallel, and the model group is switched according to the situation. This improves robustness to input data of different character.

The following description is divided into the process of setting the SOM model, the process of recognizing the state using the set SOM model, and the learning process that learns the recognition process itself.

<Setting of SOM model>
The recognition model setting unit 5a first selects, from the SOM model groups parallelized according to the nature of the input data, the model group to be used over a relatively long time span. For example, when a daytime SOM model group and a nighttime SOM model group are held, the model group is switched according to day or night as determined from the on/off state of the small light switch. Hysteresis is applied to the switching as a countermeasure against brief events such as passing lights (headlight flashing).
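The text does not say what form the hysteresis takes. As a minimal sketch, assuming it is a frame-count hold (the group changes only after the switch state has disagreed with the current group for a number of consecutive frames), the selection could look like the following. The class name, `hold_frames`, and the group labels are hypothetical.

```python
class DayNightSelector:
    """Select the 'day' or 'night' model group with frame-count hysteresis."""

    def __init__(self, hold_frames=90, initial="day"):
        # hold_frames is a hypothetical setting: the number of consecutive
        # frames the light switch must disagree with the current group
        # before the group is switched (90 frames is about 3 s at 30 Hz).
        self.hold_frames = hold_frames
        self.group = initial
        self._count = 0

    def update(self, small_light_on):
        """Feed one frame of the small light switch state; return the group."""
        wanted = "night" if small_light_on else "day"
        if wanted == self.group:
            self._count = 0              # no pending change
        else:
            self._count += 1             # the candidate change persists
            if self._count >= self.hold_frames:
                self.group = wanted
                self._count = 0
        return self.group
```

With this shape, a brief flash of the lights resets the counter as soon as the switch returns to its previous state, so the model group is left unchanged.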

Besides daytime and nighttime models, the parallel model groups may include, for example, rainy-weather and clear-weather models, or ordinary-road and highway models. Switching between these model groups can be expected to improve robustness to environments whose tendencies differ greatly and which vary on a long time scale of at least several minutes.

The information used to switch between the parallel models may be, for example, the following (1) to (6).
(1) Wiper operation information, to detect rainy weather
(2) Sunlight sensor information, to detect clear weather
(3) Vehicle speed and gear information, to detect highway driving
(4) GPS information, to determine the vehicle's position, the time, and the season
(5) Camera control information such as shutter speed, to determine brightness conditions
(6) Image information, for each of the above determinations

Next, the recognition model setting unit 5a selects, from the selected model group, the SOM model to be used over a relatively short span, according to the driving state determined from the vehicle information. Specifically, each model group is divided in advance into four models, for stopped, right turn, left turn, and normal driving, and as the first stage the model to be used is selected from these four based on information such as vehicle speed and steering angle.
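The first-stage selection among the four models can be sketched as a simple threshold rule. The thresholds, the sign convention (positive steering angle meaning a right turn), and the function name below are assumptions for illustration; the text gives only the four categories and the inputs used.

```python
def select_driving_model(speed_kmh, steering_deg,
                         stop_speed_kmh=1.0, turn_angle_deg=90.0):
    """Return which of the four per-group SOM models to use.

    The thresholds and the sign convention (positive steering angle =
    right turn) are assumptions for illustration only.
    """
    if speed_kmh < stop_speed_kmh:
        return "stopped"
    if steering_deg >= turn_angle_deg:
        return "right_turn"
    if steering_deg <= -turn_angle_deg:
        return "left_turn"
    return "normal"
```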

The graph in FIG. 7 shows the unit arrangement that results from changing (dividing) the SOM model; the horizontal axis is the unit number and the vertical axis is the appearance frequency of each unit. In FIG. 7, unit numbers 0-4 correspond to right turns, 5-9 to left turns, 10-14 to the stopped state, and 15-99 to normal driving; for convenience, the four models are displayed on a single axis.

FIG. 7 shows that before the model change (division) there were units here and there that were hardly used, whereas after the change this is improved and the unit arrangement is efficient. The units near unit number 10 after the change correspond to the stopped state, which is a special situation in which units appear many times.

<Recognition process>
As the second stage for the SOM model selected according to the driving state, the recognition learning unit 5b inputs data of a single kind, namely image feature values only, to the selected SOM model, and performs state recognition and learns the recognition process. The recognition is performed as prototype-based classification of the input data. Let s denote the state number; each state has a representative value, denoted prot_s(i). The state representative value prot_s(i) is an N-dimensional vector, with i = 0, 1, ..., N-1.

Let the input data (feature vector) be In(i). Which state the input vector belongs to is recognized from its distance L(s) to each state representative value prot_s(i), given by equation (1).
$$L(s) = \Bigl(\sum_i \bigl(\mathrm{prot}_s(i) - \mathrm{In}(i)\bigr)^2\Bigr)^{1/2} \tag{1}$$

The state (state number) K to which the input data belong is the state that minimizes the distance L(s), as in equation (2); that is, the input is recognized as the state whose representative value is closest to the input vector.
$$K = \arg\min_s L(s) \tag{2}$$

FIG. 8 shows the case where three of the N dimensions are considered. The input data are closer to state S1 than to state S6 and are therefore recognized as state S1. This is the basic state recognition, and it assigns the input data to exactly one state.

In FIG. 8, the distances to state S1 and state S6 do not differ much, but because the distance to S1 is slightly smaller, the input data are recognized as state S1. In other words, recognition may become unstable in regions where the distances to S1 and S6 are nearly equal.

Therefore, the method is extended to treat the state as probabilistic, which removes this instability. Let P(s) be the probability that the input data are in state s; it is obtained from the distance L(s) by equations (3) and (4). Here σ is a parameter; the smaller it is, the more nearly deterministic the state assignment becomes.
$$P(s) = \frac{\exp(-L(s)/\sigma)}{z} \tag{3}$$
$$z = \sum_s \exp(-L(s)/\sigma) \tag{4}$$

When the state is determined probabilistically in this way, on a scale that depends on the distance to the input data, subsequent calculations must be carried out over all states. To reduce the amount of computation, probabilities at or below a certain value may be set to 0 and excluded from the calculation.

If P(s) is defined to be 1 only when s = K and 0 otherwise, the result is the same as when the state is determined deterministically.
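The recognition step in equations (1) to (4) can be sketched in plain Python as follows. The function names, the default σ, and the `cutoff` argument (the "certain value" below which probabilities are dropped) are illustrative assumptions. Subtracting the minimum distance before exponentiating is a numerical-stability choice added here; it does not change P(s).

```python
import math

def distances(prototypes, x):
    """Equation (1): Euclidean distance from input x to each prototype."""
    return [math.sqrt(sum((p_i - x_i) ** 2 for p_i, x_i in zip(p, x)))
            for p in prototypes]

def recognize(prototypes, x):
    """Equation (2): index K of the nearest prototype."""
    d = distances(prototypes, x)
    return min(range(len(d)), key=d.__getitem__)

def state_probabilities(prototypes, x, sigma=1.0, cutoff=0.0):
    """Equations (3) and (4), with optional pruning of small probabilities."""
    d = distances(prototypes, x)
    d_min = min(d)
    # Subtracting d_min leaves P(s) unchanged and avoids underflow.
    w = [math.exp(-(d_s - d_min) / sigma) for d_s in d]
    z = sum(w)
    p = [w_s / z for w_s in w]
    # Probabilities at or below the cutoff are treated as 0 (not renormalized).
    return [p_s if p_s > cutoff else 0.0 for p_s in p]
```

As σ is reduced, the probability mass concentrates on the nearest state, which matches the remark that a small σ makes the assignment nearly deterministic.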

<Learning process>
Next, in the learning process of the state recognition unit 5, the representative value of each state is learned (updated) from the input data and the teacher information, based on the SOM. In this system SOM learning proceeds as follows, with the units (states) assumed to be connected in one dimension. Taking the state number of the winner unit to be the state number K above, the representative vector prot_s is updated (learned) according to equation (5).
$$\mathrm{prot}_s(i) \leftarrow \mathrm{prot}_s(i) + \alpha\,\bigl(\mathrm{In}(i) - \mathrm{prot}_s(i)\bigr) \tag{5}$$

Here α in equation (5) is the learning rate coefficient indicating the weight of the update, and is given by equation (6).
$$\alpha = a \cdot b(t) \cdot c\bigl(D(s,K),\, t\bigr) \cdot e(t) \tag{6}$$
where
a: learning coefficient
b: time decay coefficient
c: region decay coefficient
D(s,K): distance along the chain between the unit being updated and the winner unit
e: teacher information coefficient

The parameters a, b, and c in equation (6) are also used in an ordinary SOM. The time decay coefficient b is a function of the elapsed learning time t (usually the update count) and generally decays as t increases. The distance D(s,K) is not a distance in feature space but a distance along the chain; in FIG. 3, for example, the unit next to the winner unit is at distance 1 and the one beyond it is at distance 2.

The region decay coefficient c is a function of the distance D(s,K): the larger D(s,K), the smaller its value, and it is set so that units at or beyond a certain distance D(s,K) are not updated. The coefficient c is also a function of time t and becomes smaller as t increases. In addition, this system introduces the teacher information coefficient e(t), which carries the teacher information and is described later.
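A minimal sketch of one update step following equations (5) and (6) is given below, for units arranged on a closed one-dimensional loop. The text does not give the shape of c; a Gaussian of the chain distance, cut to zero beyond a radius, is assumed here. The function names and default values are hypothetical, and b and the radius are passed in as constants rather than decayed inside the function.

```python
import math

def loop_distance(s, k, n_units):
    """D(s, K): number of links between units s and k on a closed loop."""
    d = abs(s - k)
    return min(d, n_units - d)

def som_update(prototypes, x, winner, a=0.1, b=1.0, radius=2, e=1.0):
    """Apply equation (5) in place with alpha from equation (6).

    c(D) is taken here as a Gaussian of the loop distance, cut to zero
    beyond `radius`; this shape is an assumption, not given in the text.
    """
    n_units = len(prototypes)
    for s, proto in enumerate(prototypes):
        d = loop_distance(s, winner, n_units)
        if d > radius:
            continue                          # c = 0: unit is not updated
        c = math.exp(-(d * d) / (2.0 * max(radius, 1) ** 2))
        alpha = a * b * c * e                 # equation (6)
        for i in range(len(proto)):
            proto[i] += alpha * (x[i] - proto[i])   # equation (5)
    return prototypes
```

Because the distance is measured along the loop, the last unit counts as a neighbor of unit 0 and is pulled toward the input by the same amount as unit 1.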

Thus, in the SOM learning algorithm, a wide range of units is updated toward the input data early in learning; as learning progresses, both the number of units updated and the update amount decrease, and finally the learning rate coefficient α (the update weight) becomes 0 and learning ends. In the initial state, the units are usually placed at random near the center of the vector space.

Because this system learns online, the above algorithm is modified slightly: learning is divided into a pre-learning phase and an online learning phase, and the learning parameters are changed between the phases. That is, no learning end time is set, and the time decay is held constant within each of the pre-learning and online learning phases. Likewise, the update range is not decayed over time, but it differs between pre-learning and online learning.

This is because online learning in this system has no end time. Pre-learning is introduced because, with a single fixed value, a large update amount causes the SOM distribution to concentrate in a narrow range near the mean of the input data, whereas a small update amount leaves the SOM scattered too widely in feature space. In either case the distribution of the input data is not represented well.

Therefore, as shown in FIG. 9, in pre-learning the time decay coefficient b and the region decay coefficient c are set to large values so that the SOM is first drawn toward the center of the input data distribution; b and c are then made small so that an appropriate distribution can be represented. Pre-learning here is assumed to be offline learning performed before the vehicle is used for actual driving in the market.

FIG. 10 shows an example of the SOM distribution after learning. The actual feature space has 240 dimensions, but FIG. 10 shows only three of them; each point in the graph is an input datum. In the actual figure each point is colored, with the color indicating the magnitude of risk. The black dots are the representative vectors of the states, and the black lines connecting them are the SOM connections.

The above describes a learning method that represents the distribution of the input data well, but what is actually required is a representation of the input data distribution that is well suited to recognizing risk. SOM is by nature an unsupervised learning method (incoming data are treated equally during learning). In this system, to learn efficiently for risk recognition, risk information is supplied to the learning through the teacher information coefficient e(t) described above.

As described in detail later, the recognized risk is output in the form of the risk probability of the recognized state, which expresses how probable each risk level is in that state. As a concrete learning method, when the input data at time t have teacher information in the form of a risk level R obtained from driver information, the teacher information coefficient e(t) is made large if the probability of risk level R is high in the risk probability of the recognized state, and small if it is low. When no teacher information is obtained, e(t) is made small.

As learning proceeds, the recognized state therefore comes to hold the risk at that time with high probability, which means that the accuracy of risk recognition improves. The specific setting of the teacher information coefficient e(t) is described in the risk recognition process below.

For learning when the state is obtained probabilistically, the winner unit is expressed by weights according to the probabilities, and each update is scaled by the corresponding weight. However, this increases the amount of computation, so in this system the winner unit during learning is fixed to the state closest to the input data, and states whose probability is at or below a certain value are not updated as winners.

[Risk information extraction process]
As described above, in this embodiment the risk information extraction unit 6 does not learn when extracting risk from vehicle operation information; it extracts risk information from the driver's operation information using preset rules. In this rule-based extraction, risk information is handled as one-dimensional data with levels.

Specifically, the risk level takes 11 values from 0 to 10 (integers), with larger values indicating higher risk. The risk information here is not intended to identify risk at every fixed interval, for example for every frame at 30 Hz. This is because actual driver operations are not made only in response to risk, and operations prompted by risk are thought to account for only a small fraction of total driving, for example less than 10%.

That is, the primary goal of recognizing risk information from driver operation data is to detect risk only when it is large enough to affect the driver's operating behavior. Consequently, an output of risk 0 means not only that there is no risk but also that there may be no teacher information.

The rules for risk recognition are set freely so as to match reality as closely as possible. The rules (1) to (5) below are applied in parallel, and the largest risk value among the conditions is taken as the teacher risk.

(1) Was the brake applied suddenly?
The risk level is set according to the frame-to-frame difference in brake pressure. For example, a difference of 1×10² kPa or more indicates risk, with risk 5 at 1×10² kPa and risk 10 at 1×10³ kPa; between risk 5 and risk 10 the level is set linearly according to the difference.

(2) Was the brake pressed hard?
At or above a predetermined vehicle speed, the risk level is set according to the brake pressure. For example, at a vehicle speed of 10 km/h or more, the risk is 10 when the brake pressure is 20×10² kPa or more, 6 when it is 10×10² kPa or more, and 2 when it is 5×10² kPa or more.

(3) Was the steering wheel turned suddenly?
With the turn signal off, if the absolute value of the frame-to-frame difference in steering wheel angle is at or above a set value (for example 10 deg), the risk is 5.

(4) Was the accelerator released suddenly?
At or above a predetermined vehicle speed, the risk level is set according to the frame-to-frame difference in accelerator opening. For example, at a vehicle speed of 5 km/h or more, if the difference in accelerator opening is -1% or less, the risk is 4.

(5) Is the accelerator being pressed?
The risk level is set according to the accelerator opening during acceleration. Whether the vehicle is accelerating is judged from the derivative of the vehicle speed; if the derivative is 0 or more (accelerating) and the accelerator opening is 1% or less, the risk is 2.
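Rules (1) to (5), with the example thresholds given above, can be sketched as a single function that evaluates every rule and returns the maximum level. The `Frame` fields and function name are hypothetical, the units are those of the examples in the text, and rounding the linear interpolation in rule (1) to an integer level is an assumption, since the text only says the level is set linearly.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    # Hypothetical per-frame signals; the names are illustrative only.
    speed_kmh: float = 0.0
    speed_diff: float = 0.0        # frame-to-frame change in vehicle speed
    brake_kpa: float = 0.0         # brake pressure
    brake_diff_kpa: float = 0.0    # frame-to-frame change in brake pressure
    steer_diff_deg: float = 0.0    # frame-to-frame change in steering angle
    turn_signal_on: bool = False
    accel_pct: float = 0.0         # accelerator opening
    accel_diff_pct: float = 0.0    # frame-to-frame change in opening

def teacher_risk(f):
    """Evaluate rules (1)-(5) in parallel and return the largest level."""
    levels = [0]
    # (1) Sudden braking: 5 at 100 kPa, rising linearly to 10 at 1000 kPa.
    if f.brake_diff_kpa >= 100.0:
        d = min(f.brake_diff_kpa, 1000.0)
        levels.append(round(5 + 5 * (d - 100.0) / 900.0))  # rounding assumed
    # (2) Hard braking at 10 km/h or more.
    if f.speed_kmh >= 10.0:
        if f.brake_kpa >= 2000.0:
            levels.append(10)
        elif f.brake_kpa >= 1000.0:
            levels.append(6)
        elif f.brake_kpa >= 500.0:
            levels.append(2)
    # (3) Sudden steering with the turn signal off.
    if not f.turn_signal_on and abs(f.steer_diff_deg) >= 10.0:
        levels.append(5)
    # (4) Accelerator released suddenly at 5 km/h or more.
    if f.speed_kmh >= 5.0 and f.accel_diff_pct <= -1.0:
        levels.append(4)
    # (5) Accelerating with the accelerator almost closed (as stated literally).
    if f.speed_diff >= 0.0 and f.accel_pct <= 1.0:
        levels.append(2)
    return max(levels)
```

A return value of 0 covers both "no risk" and "no teacher information", as noted in the text.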

The above rules can of course be added to or removed, and can be tuned to match reality more closely. It is also possible to acquire risk recognition from driver data by learning, for example with an algorithm that generates such rules automatically or by introducing fuzzy elements into the rules.

[Risk recognition process]
The risk recognition unit 7 outputs a risk according to the state obtained by the state recognition unit 5. As described above, each state has its own risk probability distribution; the risk probability distribution in state s is written p(R|s). The risk here corresponds to the risk in the risk information extraction unit 6 and is divided into 11 levels, so the relationship between the risk level R and the risk probability (distribution) p(R|s) is, for example, as shown in FIG. 11.

The risk output is basically this risk probability p(R|s), but a probability distribution is awkward to use directly when the output drives, for example, an alarm or a display. The risk output is therefore the expected value E given by equation (7).
$$E = \sum_R R \cdot p(R \mid s) \tag{7}$$

When the state is treated probabilistically, the expected value E is given by equation (8).
$$E = \sum_s \sum_R P(s) \cdot R \cdot p(R \mid s) \tag{8}$$
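Equations (7) and (8) amount to a weighted sum over the 11 risk levels. A short sketch, assuming each state's distribution p(R|s) is stored as a list indexed by risk level 0 to 10 (the function and argument names are illustrative):

```python
def expected_risk(p_risk_given_state):
    """Equation (7): E = sum over R of R * p(R|s) for one state."""
    return sum(r * p for r, p in enumerate(p_risk_given_state))

def expected_risk_probabilistic(state_probs, risk_tables):
    """Equation (8): weight each state's expectation by P(s)."""
    return sum(p_s * expected_risk(risk_tables[s])
               for s, p_s in enumerate(state_probs) if p_s > 0.0)
```

Skipping states with P(s) = 0 corresponds to the pruning of small probabilities mentioned in the recognition process.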

<Risk probability learning process>
The risk probability is learned every frame and updated sequentially. It is basically computed from the frequency distribution of risk levels experienced in the past. However, because this system learns online, it is difficult to retain data from the indefinitely distant past, and it is undesirable to give distant past experience the same importance as the present. The risk probability is therefore updated as follows.

Let p_t(R|s_t) be the risk probability of state s_t at time t. The risk probability is updated according to equation (9).
$$p_{t+1}(R \mid s_t) = p_t(R \mid s_t) + \beta \tag{9}$$

The risk probability p_{t+1}(R|s_t) is then normalized according to equation (10).
$$p_{t+1}(R \mid s_t) \leftarrow \frac{p_{t+1}(R \mid s_t)}{\sum_{R'} p_{t+1}(R' \mid s_t)} \tag{10}$$

Only the state at that time is updated. When the state is treated probabilistically, β is replaced by P(s_t)·β for each state. Here β is a constant; the larger its value, the more weight is given to current information.
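The update in equations (9) and (10) can be sketched as follows. The text does not spell out which entry of the distribution receives β; this sketch assumes it is added to the entry for the risk level observed at time t, which is one reading of equation (9). The function name and the default β are hypothetical.

```python
def update_risk_probability(p, observed_level, beta=0.05, state_weight=1.0):
    """Return the updated distribution p(R|s_t) as a new list.

    Assumption: beta (scaled by P(s_t) when states are probabilistic) is
    added to the observed level, equation (9); the result is then
    renormalized to sum to 1, equation (10).
    """
    q = list(p)
    q[observed_level] += state_weight * beta
    total = sum(q)
    return [v / total for v in q]
```

Because every update renormalizes, older observations are discounted geometrically, which is consistent with the remark that a larger β gives more weight to current information.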

As noted in the description of the risk information extraction unit 6, the teacher risk is not necessarily obtained for every frame. When the risk level is high, risk information can often be obtained from driver data, but when the risk level is low, teacher information is particularly unlikely to be obtained.

This system addresses the problem by propagating teacher risk information along the time axis. This rests on the causal relationship that, when teacher risk information is obtained at some time, the preceding times were also dangerous, though not necessarily to the same degree. The teacher risk information is propagated using this causal relationship.

To propagate information into the past, all past state transitions over the propagation span would need to be stored, but when real-time learning is assumed, storage capacity and computation become bottlenecks. This system therefore updates the risk probability by propagation that takes into account the TD (temporal difference) error used in reinforcement learning.

Reinforcement learning learns not from explicit instructions about which action to take in each state but from rewards for the actions taken, choosing at each moment the action that maximizes the total reward expected in the future. The difference between the actual reward and the predicted reward at time t is called the TD error, and learning proceeds so as to drive it to 0. The risk information of this system corresponds to the reward in reinforcement learning. As shown in FIG. 12, considering the state transitions in a given scene, the states S2, S7, ... that lead to state S1 should also carry risk, so the risk information is propagated to them.

Propagation need only go from the current state to the previous frame (that is, both computation and storage need only handle the relationship with the previous frame). A single experience does not propagate risk information far into the past, but repeating similar experiences propagates it gradually, so the causal relationship can be learned. As shown in FIG. 13, risk information is propagated not only at the same risk level from state s_t at time t to state s_{t-1} at time t-1, but also between different risk levels. Risk level 0, however, is not propagated, because it covers the case of no risk information as well as the case of no risk.

The risk probability p(r|s_{t-1}) is updated by propagation according to equation (11).
$$\begin{aligned} p(r \mid s_{t-1}) = {} & p(r \mid s_{t-1}) + \eta \cdot \bigl(RI(r) + \gamma \cdot p(r \mid s_t) - p(r \mid s_{t-1})\bigr) \\ & + h \cdot \eta \cdot \bigl(\gamma \cdot p(r-1 \mid s_t) - p(r-1 \mid s_{t-1})\bigr) \\ & + h \cdot \eta \cdot \bigl(\gamma \cdot p(r+1 \mid s_t) - p(r+1 \mid s_{t-1})\bigr) \end{aligned} \tag{11}$$
where
h: parameter for the magnitude of propagation in the risk-level direction
γ: parameter for the magnitude of propagation along the time series
η: parameter for the size of the update in one learning step

Here RI(r) denotes the risk information obtained at time t, expressed in terms of the risk level r. As described above, the risk information handled by the risk information extraction unit 6 is obtained for exactly one of the 11 risk levels from 0 to 10. That is, if the risk information obtained at time t is risk level Q, it is expressed by equations (12) and (13).
$$RI(r) = 1 \quad (r = Q) \tag{12}$$
$$RI(r) = 0 \quad (r \neq Q) \tag{13}$$

In this risk learning, however, the risk information RI(r) is extended, as shown in FIG. 14, on the assumption that risk actually exists at neighboring risk levels as well. Specifically, using a parameter g (g < 1), the adjacent risk level is scaled by g and the level next to that by g·g; this makes more effective use of the limited teacher data. The parameter h for the magnitude of propagation in the risk-level direction is normally set to the same value as g.
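One application of equation (11), together with the extended risk information RI(r), can be sketched as below. Several details are assumptions not fixed by the text: the extension is limited to two levels on each side of Q, terms that would refer to levels outside 1 to 10 are dropped, level 0 is left untouched, and no renormalization is applied afterwards. The function names and parameter defaults are hypothetical.

```python
N_LEVELS = 11  # risk levels 0..10

def extended_risk_info(q_level, g=0.5):
    """RI(r) of equations (12)-(13), extended to neighbours as in FIG. 14."""
    ri = [0.0] * N_LEVELS
    if not q_level:                  # level 0 or None: no teacher information
        return ri
    for offset, value in ((0, 1.0), (1, g), (2, g * g)):
        for r in {q_level - offset, q_level + offset}:
            if 1 <= r < N_LEVELS:    # level 0 is left out (assumption)
                ri[r] = value
    return ri

def propagate(p_prev, p_cur, q_level, eta=0.1, gamma=0.9, h=0.5, g=0.5):
    """One application of equation (11); returns the new p(r|s_{t-1})."""
    ri = extended_risk_info(q_level, g)

    def td(r):
        # gamma * p(r|s_t) - p(r|s_{t-1}); zero outside levels 1..10
        if 1 <= r < N_LEVELS:
            return gamma * p_cur[r] - p_prev[r]
        return 0.0

    new = list(p_prev)
    for r in range(1, N_LEVELS):     # risk level 0 is not propagated
        new[r] += eta * (ri[r] + td(r)) + h * eta * (td(r - 1) + td(r + 1))
    return new
```

Only the distributions of the current and previous states are needed, which reflects the point that storage and computation are limited to the relationship with the previous frame.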

After the risk probability is updated, the teacher information coefficient e(t) used in the learning process of the state recognition unit 5 is set according to equations (14) and (15), where R is the risk information obtained from the risk information extraction unit 6 at time t.
When R ≠ 0:
$$e(t) = 10 \cdot R \cdot p(R \mid s_t) \tag{14}$$
When R = 0:
$$e(t) = \mathrm{const} \tag{15}$$

R = 0 corresponds to the case where no teacher information was received; the teacher information coefficient e(t) is then the constant const, that is, a fixed gain. Its value is determined by the probability of obtaining a teacher risk and is set based on the ratio between the number of learning data with teacher information and the number without. In this system, as a rule of thumb, const is set so that the number of learning data with teacher information equals the number without, giving const = 0.01.
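Equations (14) and (15) translate directly into a small function. The sketch below assumes p(R|s_t) is stored as a list indexed by risk level; the function and argument names are illustrative, and the default const is the value 0.01 stated above.

```python
def teacher_coefficient(risk_level, p_risk_given_state, const=0.01):
    """Equations (14)-(15): e(t) from the teacher risk R and p(R|s_t)."""
    if risk_level == 0:
        return const                     # no teacher information
    return 10.0 * risk_level * p_risk_given_state[risk_level]
```

The returned value is the factor e(t) in the learning rate α of equation (6), so a frame without teacher information still contributes a small fixed update.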

When teacher information is input, learning is stronger the higher the probability and the higher the risk level. Conversely, a low recognized probability for an event that actually occurred indicates that the state recognition is likely wrong, so learning is weakened. The representative vector of that state is therefore trained toward data that was recognized as the same state and had a high risk probability. As such learning continues, the low-probability input data becomes more likely to be recognized as a different state, and the probably incorrect state becomes less likely to be recognized. In this way, state recognition and risk recognition are optimized as a whole.

FIG. 15 shows an example of the recognition results output by the above processing. FIGS. 15(a) to 15(d) are output images of the system in which the recognition result is superimposed on the image obtained from the in-vehicle camera; the magnitude of the recognized risk is shown by the bar graphs B1 to B4 at the bottom of each screen. The recognized risk shown by the bar graphs B1 to B4 is the expected value of the risk probability described above, and the number displayed above each bar graph is the recognized state number.
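As an illustration of the quantity plotted in the bar graphs, the following Python sketch computes an expected value from a risk probability distribution for one recognized state. The exact definition is given earlier in the specification and is not reproduced here, so the plain probability-weighted sum below, as well as the function and argument names, should be read as assumptions.

```python
def expected_risk(risk_levels, probabilities):
    """Probability-weighted mean risk for one recognized state.

    risk_levels   -- risk values R (assumed numeric)
    probabilities -- p(R | s) for each risk value, in the same order
    """
    if len(risk_levels) != len(probabilities):
        raise ValueError("risk_levels and probabilities must have equal length")
    return sum(r * p for r, p in zip(risk_levels, probabilities))


if __name__ == "__main__":
    # Hypothetical distribution over three risk levels for one state.
    print(expected_risk([0, 1, 2], [0.5, 0.25, 0.25]))
```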

The two images in FIGS. 15(a) and 15(b) are scenes in which no pedestrians or oncoming vehicles are nearby and the risk is considered low. The two images in FIGS. 15(c) and 15(d) are, respectively, a scene on a narrow road with one lane in each direction where an oncoming vehicle makes the available road width even narrower, and a left-turn scene at an intersection; both are considered to carry a higher risk than the scenes in FIGS. 15(a) and 15(b).

The risk is described here as "considered" low (or high) because there is no absolute value specifying what risk value each image has. The recognition results show that the system recognizes the scenes in FIGS. 15(c) and 15(d) as having a higher risk than those in FIGS. 15(a) and 15(b). FIG. 16 shows the learning result of the risk probability p_t(R | s_t) at this time.

DESCRIPTION OF SYMBOLS
1 Online risk recognition system
4 Feature quantity extraction unit
5 State recognition unit
5a Recognition model setting unit
5b Recognition learning unit
6 Risk information extraction unit
7 Risk recognition unit

Claims

1. An online risk learning system that detects the external environment of a moving body and learns to recognize the risk contained in the external environment, comprising:
a feature quantity extraction unit that processes detection information on the external environment and extracts multidimensional feature quantities contained in the external environment;
a recognition model setting unit that sets, in hierarchical form, a recognition model for clustering the feature quantities;
a state recognition unit that recognizes the multidimensional feature quantities as a one-dimensional state using the recognition model set by the recognition model setting unit; and
a risk recognition unit that adaptively learns the risk of the state based on the correlation between the state recognized by the state recognition unit and teacher information created by extracting risk information related to the risk, and recognizes the degree of risk contained in the external environment.

2. The online risk learning system according to claim 1, wherein the recognition model is created by a self-organizing map, and the self-organizing map is hierarchized by being divided into a plurality of one-dimensional models according to the driving state of the moving body.

3. The online risk learning system according to claim 2, wherein one one-dimensional model is selected from the plurality of one-dimensional models according to the driving state of the moving body, and only the feature quantities extracted from image information obtained by imaging the external environment are input to the selected one-dimensional model.

4. The online risk learning system according to claim 2, wherein the plurality of one-dimensional models are parallelized according to the nature of input data.

5. The online risk learning system according to claim 4, wherein the plurality of parallel one-dimensional models are switched with hysteresis.