JP2011008613A

JP2011008613A - Online risk learning system

Info

Publication number: JP2011008613A
Application number: JP2009152715A
Authority: JP
Inventors: Taichi Kishida; 太一岸田; Motoya Ogawa; 原也小川
Original assignee: Fuji Heavy Industries Ltd
Current assignee: Subaru Corp
Priority date: 2009-06-26
Filing date: 2009-06-26
Publication date: 2011-01-13
Anticipated expiration: 2029-06-26
Also published as: JP5547913B2

Abstract

PROBLEM TO BE SOLVED: To prevent online risk recognition performance from degrading due to biased learning.
SOLUTION: A fusion unit 5 fuses the function of a base unit 3, which holds prior knowledge, with the function of a learning unit 4, whose recognition performance changes as learning progresses in the usage environment, and the fusion unit 5 performs the final risk recognition. Even when biased learning occurs in the learning unit 4, the recognition performance of the base unit 3, which preserves the initial performance at product shipment, is reflected in the system. This prevents risk recognition performance from degrading, so the system can specialize to the usage environment while maintaining its basic performance.

Description

The present invention relates to an online risk learning system that adaptively learns and recognizes risks contained in the external environment of a moving body such as an automobile.

In recent years, preventive safety technologies for moving bodies such as automobiles have been developed in which an on-board camera images the external environment, the captured image is processed to recognize information on the degree of danger (risk) contained in that environment, and the driver is warned or given driving assistance.

Such a technique for recognizing danger information is disclosed in, for example, Patent Document 1. The technique of Patent Document 1 sets a risk parameter for each type and attribute of object in the environment around the vehicle and calculates the degree of danger from these parameters.

Conventional techniques such as the one disclosed in Patent Document 1 define factors that lead to danger, such as pedestrians, oncoming vehicles, obstacles, and white lines, and recognize risk based on them. In an actual system, this is realized by building the risk factors and recognition logic assumed by the developer into the system in advance.

However, real environments such as the driving environment of an automobile are diverse, with changing weather and the presence of pedestrians, vehicles, roadside structures, and so on, and drivers are diverse as well. A single preset recognition model of the conventional kind therefore has limits: unless the factors that lead to danger are recognized with high accuracy, the overall risk cannot be recognized, and dangerous scenes other than those assumed in advance cannot be recognized at all.

For this reason, the present applicant proposed, in Patent Document 2, an online learning system in which the system autonomously learns from experience in the actual environment and can thereby recognize the degree of danger across diverse external environments.

Patent Document 1: JP 2003-81039 A
Patent Document 2: JP 2008-238831 A

Because the system of Patent Document 2 learns autonomously from experience in the actual environment, it can be specialized to the user's usage environment. On the other hand, if the user drives predominantly in one particular environment, biased learning occurs, and the pre-learning results set at product shipment may be lost (forgotten). Consequently, sufficient recognition performance may not be obtained when the system encounters an environment different from the user's usual one.

The present invention has been made in view of the above circumstances, and its object is to provide an online risk learning system that can prevent online risk recognition performance from degrading due to biased learning.

To achieve the above object, the online risk learning system according to the present invention detects the external environment of a moving body and recognizes, through learning, the risk contained in that environment. The system comprises a base unit that holds the pre-learning results for the risk and a learning unit that holds the online learning results for the risk. The risk level from the base unit and the risk level from the learning unit are fused at a predetermined fusion rate and output as the single online risk level.

According to the present invention, a decrease in online risk recognition performance can be prevented even when biased learning occurs, so the system can specialize to the user's usage environment while maintaining its basic performance.

FIG. 1: Block diagram of the online risk learning system according to the first embodiment of the present invention (FIGS. 2 to 13 relate to the same embodiment).
FIG. 2: Explanatory diagram showing the image regions used for feature quantity extraction.
FIG. 3: Conceptual diagram of state recognition.
FIG. 4: Conceptual diagram of learning with a one-dimensional self-organizing map.
FIG. 5: Explanatory diagram showing the distribution of the self-organizing map after learning.
FIG. 6: Explanatory diagram showing the relationship between risk level and risk probability.
FIG. 7: Explanatory diagram of risk propagation.
FIG. 8: Explanatory diagram of information propagation.
FIG. 9: Explanatory diagram showing the expansion of risk information.
FIG. 10: Flowchart of the fusion rate and fusion parameter calculation processing.
FIG. 11: Histogram showing the number of learning iterations of each neuron.
FIG. 12: Explanatory diagram of the fusion rate table.
FIG. 13: Explanatory diagram showing an example of recognition result output.
FIG. 14: Block diagram of the online risk learning system according to the second embodiment of the present invention.

Embodiments of the present invention will be described below with reference to the drawings.
The online risk learning system of the present invention is mounted on a moving body such as an automobile and adaptively recognizes information on the degree of danger (risk) contained in the external environment from the detection results for that environment. It grows online so that it can recognize risk adaptively even in environments that were not assumed in advance.

The external environment can be sensed with any of the various sensor devices provided as the input system, for example a camera that images the outside world in monocular or stereo vision, or a radar device using laser or millimeter waves. In other words, the system does not fundamentally depend on the particular sensor device that detects the external environment; in a broad sense, it is a learning system that learns the correlation between the external environment information obtained from the sensor devices and risk information.

In this embodiment, the online risk learning system is applied to a vehicle such as an automobile, and risk information is extracted using image information from an in-vehicle camera that images the outside world together with vehicle information such as the driver's driving operations and the vehicle's driving state. That is, the system recognizes risk by associating the information obtained from the image directly with risk, learns that association from the environments encountered in actual driving, and thereby performs risk recognition adaptively.

Specifically, the driver serves as the teacher for training the recognizer. Risk information is extracted from the driver's driving operations, and the relationship between this operation-based risk information and the image information obtained from the camera is learned using artificial intelligence techniques. For example, when the driver performs an operation to avoid a pedestrian, the system judges the situation to be dangerous and teaches the recognizer that the image obtained at that moment is dangerous.

As a result, when a similar situation (image) is input to the system on a later occasion, the system outputs that it is dangerous and can warn the driver. The system also treats risk probabilistically. This makes it possible to handle cases where the risk differs between similar situations, as well as inherently probabilistic risks that cannot be judged from the available image information alone.

The online risk learning system of this embodiment is described below with reference to FIG. 1. The online risk learning system 1 consists of a single computer system or of multiple computer systems connected via a network or the like. A camera 2 having an image sensor such as a CCD or CMOS is used as the sensor that detects the external environment of the vehicle 100.

The online risk learning system 1 has three main components: a base unit 3, which holds a risk recognition function obtained by offline pre-learning; a learning unit 4, which holds a risk recognition function learned online; and a fusion unit 5, which fuses the risk recognition functions of the base unit 3 and the learning unit 4 and performs risk recognition with the fused function. The base unit 3 and the learning unit 4 are provided with risk level calculation units 33 and 43, respectively, which calculate a risk level as a function of each unit.

For data computation serving the units 3, 4, and 5, the online risk learning system 1 further includes: a teacher creation unit 6, which creates teacher information for risk learning from the operation amounts of the vehicle 100 and sends it to the learning unit 4; an image feature extraction unit 7, which extracts feature quantities from the images captured by the in-vehicle camera 2; a fusion unit calculation unit 8, which converts the image feature quantities into a state quantity, calculates risk learning information from that state quantity based on the recognition result of the fusion unit 5, and sends it to the base unit 3 and the learning unit 4; a learning amount calculation unit 9, which calculates the learning amount for risk learning from the image feature quantities and sends it to the learning unit 4; and a fusion calculation unit 10, which performs the fusion calculation for fusing the base unit 3 and the learning unit 4.

The base unit 3 and the learning unit 4 have risk recognition functions built on basically the same framework. Details are given later, but in outline the system converts the feature quantities, extracted mainly from the external image captured by the in-vehicle camera 2, into a quantity called a state and thereby clusters them. It then recognizes risk using a risk distribution table in which the probability density distribution of risk for each state has been learned with the vehicle operation information as the teacher. In this embodiment, a self-organizing map (SOM), a kind of hierarchical neural network, is used, and each neuron of the SOM is treated as a state.

That is, in the initial state at product shipment, the base unit 3 and the learning unit 4 each hold an SOM and a risk distribution table formed by pre-learning. They are set either to have the same recognition performance, assuming a typical user, or to have intentionally different recognition performance, for example to match an individual's driving tendencies. In the following, the SOM and risk distribution table held by the base unit 3 are referred to as the base SOM 31 and the base risk distribution table 32, and those held by the learning unit 4 as the learning SOM 41 and the learning risk distribution table 42, as appropriate.

During online operation in the field, only the learning unit 4 learns as system operating time accumulates with vehicle travel, and the learning SOM 41 and the learning risk distribution table 42 are updated adaptively to the user's usage environment. The base unit 3, in contrast, has been pre-trained to ensure a given recognition performance across the various situations generally expected, and during online operation in the field it simply retains the knowledge obtained by pre-learning (the base SOM 31 and the base risk distribution table 32).

In this case, if the user drives predominantly in a particular environment, for example mostly in urban areas or mostly in suburban areas with little traffic, biased learning may occur in the learning unit 4. When biased learning occurs, sufficient recognition performance may not be obtained when the vehicle encounters a driving environment different from the user's usual one.

Therefore, in this system the fusion unit 5 fuses the function of the base unit 3, which holds the prior knowledge, with the function of the learning unit 4, whose recognition performance changes as learning progresses in the user's usage environment, and the fusion unit 5 performs the final risk recognition. Specifically, the fusion unit 5 holds a fusion SOM 51, obtained by fusing the base SOM 31 and the learning SOM 41, and calculates a single risk level from a fusion risk level table 52, which holds the risk level corresponding to each neuron of the fusion SOM 51.

As a result, even if biased learning occurs in the learning unit 4, the recognition performance of the base unit 3, which preserves the initial performance at product shipment, is reflected in the system and prevents risk recognition performance from degrading. The system can thus specialize to the user's usage environment while maintaining its basic performance.
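The idea of fusing the two units' outputs at a fusion rate can be illustrated with a minimal sketch. It assumes the simplest possible reading, a convex combination of two scalar risk levels. The function name `fuse_risk_levels`, the convention that a rate of 0 means "base unit only", and the linear form itself are illustrative assumptions, not the calculation defined in this document, which operates on the fusion SOM 51 and the fusion risk level table 52.

```python
def fuse_risk_levels(base_level: float, learned_level: float, fusion_rate: float) -> float:
    """Blend two risk levels with a convex combination (an assumed form).

    fusion_rate = 0.0 returns the base unit's level unchanged,
    fusion_rate = 1.0 returns the learning unit's level unchanged.
    """
    if not 0.0 <= fusion_rate <= 1.0:
        raise ValueError("fusion_rate must lie in [0, 1]")
    return (1.0 - fusion_rate) * base_level + fusion_rate * learned_level
```

With this form, a learning unit that has drifted through biased learning can never pull the output further from the base unit's level than the fusion rate allows, which is the property the paragraph above relies on.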

The details of each process in the online risk learning system 1 are described below under the following headings:
(A) teacher information creation processing, which generates teacher information from vehicle operation information;
(B) image feature quantity extraction processing, which extracts feature quantities from images;
(C) fusion unit calculation processing, which converts the image feature quantities into a state quantity and determines the winner neuron of the SOM;
(D) SOM learning amount calculation processing, which calculates the SOM learning amount from the image feature quantities;
(E) risk level calculation processing, which calculates the risk levels of the base unit 3 and the learning unit 4;
(F) fusion risk recognition processing, which fuses the base unit 3 and the learning unit 4 to recognize risk.

(A) Teacher information creation processing
The teacher creation unit 6 creates the teacher information for the learning risk distribution table of the learning unit 4 based on the vehicle operation information. In this embodiment, no learning is performed when extracting risk from the vehicle operation information; instead, the teacher creation unit 6 extracts risk information from the driver's operation information using preset rules. In this rule-based extraction, risk information is handled as one-dimensional data with levels.

Specifically, the risk level takes, for example, 11 integer values from 0 to 10, with larger values representing higher risk. The risk information here is not, however, meant to recognize risk at every fixed interval, such as at every frame of a 30 Hz stream. This is because the driver's actual operations are not driven by risk alone; operations made in response to risk are thought to account for only a small part of total driving, for example less than 10%.

That is, the primary goal of recognizing risk information from driver operation data is to detect risk only when it is large enough to affect the driver's operating behavior. For this reason, an output of risk 0 represents not only the absence of risk but also the absence of teacher information.

The rules for risk recognition are set freely so as to match reality as closely as possible. The rules (1) to (5) below are applied in parallel, and the largest risk value among them is taken as the teacher risk.

(1) Was the brake applied suddenly?
The risk level is set according to the frame-to-frame difference in brake pressure. For example, a difference of 1×10² kPa or more is treated as risky: 1×10² kPa gives risk 5, 1×10³ kPa gives risk 10, and between these the risk is set linearly according to the difference.

(2) Was the brake pressed hard?
At or above a predetermined vehicle speed, the risk level is set according to the brake pressure. For example, at 10 km/h or more, the risk is 10 for a brake pressure of 20×10² kPa or more, 6 for 10×10² kPa or more, and 2 for 5×10² kPa or more.

(3) Was the steering wheel turned sharply?
If the turn signal is off and the absolute value of the frame-to-frame difference in steering angle is at least a set value (for example, 10 deg), the risk is 5.

(4) Was the accelerator released suddenly?
At or above a predetermined vehicle speed, the risk level is set according to the frame-to-frame difference in accelerator opening. For example, at 5 km/h or more, a difference in accelerator opening of -1% or less gives risk 4.

(5) Is the accelerator pressed?
The risk level is set according to the accelerator opening during acceleration. Whether the vehicle is accelerating is judged from the derivative of the vehicle speed: if the derivative is 0 or more (accelerating) and the accelerator opening is 1% or less, the risk is 2.

Naturally, these rules can be added to or removed and adjusted to match reality more closely. It is also possible to acquire risk recognition from driver data through learning, for example with an algorithm that generates such rules automatically or by introducing fuzzy elements into them.
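Rules (1) to (5) and the "take the maximum" combination can be sketched as follows. The thresholds are the example values given above; the `Frame` record and its field names, the clamping of rule (1) above 1×10³ kPa, the use of frame-to-frame speed difference as the speed derivative, and the rounding to an integer level are assumptions made only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """One sample of vehicle operation data (field names are hypothetical)."""
    brake_kpa: float    # brake pressure [kPa]
    speed_kmh: float    # vehicle speed [km/h]
    steer_deg: float    # steering wheel angle [deg]
    accel_pct: float    # accelerator opening [%]
    turn_signal: bool   # True while the turn signal is on


def teacher_risk(prev: Frame, cur: Frame) -> int:
    """Return the teacher risk (0 to 10) as the maximum over rules (1)-(5)."""
    risks = [0.0]

    # (1) Sudden braking: frame-to-frame brake pressure difference.
    d_brake = cur.brake_kpa - prev.brake_kpa
    if d_brake >= 1e2:
        # 1e2 kPa -> 5, 1e3 kPa -> 10, linear in between.
        # Clamping at 10 above 1e3 kPa is an assumption.
        risks.append(min(10.0, 5.0 + 5.0 * (d_brake - 1e2) / (1e3 - 1e2)))

    # (2) Hard braking at 10 km/h or more.
    if cur.speed_kmh >= 10:
        if cur.brake_kpa >= 20e2:
            risks.append(10)
        elif cur.brake_kpa >= 10e2:
            risks.append(6)
        elif cur.brake_kpa >= 5e2:
            risks.append(2)

    # (3) Sharp steering while the turn signal is off.
    if not cur.turn_signal and abs(cur.steer_deg - prev.steer_deg) >= 10:
        risks.append(5)

    # (4) Sudden accelerator release at 5 km/h or more.
    if cur.speed_kmh >= 5 and cur.accel_pct - prev.accel_pct <= -1:
        risks.append(4)

    # (5) Speed derivative >= 0 with accelerator opening <= 1 %.
    if cur.speed_kmh - prev.speed_kmh >= 0 and cur.accel_pct <= 1:
        risks.append(2)

    # Rounding to an integer level is an assumption.
    return int(round(max(risks)))
```

Because the rules are evaluated independently and combined with `max`, adding or removing a rule only means adding or removing one `if` block, which matches the remark that the rule set is meant to be adjusted.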

(B) Feature quantity extraction processing
The image feature extraction unit 7 receives the captured image from the in-vehicle camera 2, converts it into a digital image of a predetermined gradation through video processing such as noise removal, gain adjustment, and γ correction, and extracts feature quantities from that image. That is, feature quantities such as edge information, motion information, and color information are extracted from the image and held as an N-dimensional vector.

The N-dimensional vector may also include vehicle information other than image features, for example the vehicle speed or changes in yaw angle. The image data handled in this embodiment are images captured by a monocular color camera, but images from an infrared camera or range images from a stereo camera may also be used. As noted earlier, information from laser or millimeter-wave sensors can also be used, in which case the image feature quantity is more generally called an external environment feature quantity.

This feature extraction provides the data for the subsequent risk recognition, and in general, data uncorrelated with risk adversely affects recognition. Adding feature quantities indiscriminately is therefore not advisable; conversely, omitting a necessary feature quantity also degrades accuracy.

This raises the problem of feature selection, that is, which feature quantities should be used. If the selection itself is to be learned, a level of learning above the risk recognition described below is required, which is disadvantageous for online learning in terms of computation and memory.

Accordingly, this embodiment describes an example in which feature extraction is fixed. If it is to be learned, the combination of feature quantities can be optimized using the system's recognition rate as the evaluation criterion, with existing optimization methods such as exhaustive search over combinations or heuristic search methods such as a genetic algorithm (GA).

In this embodiment, the image feature extraction unit 7 extracts feature quantities of preset types. The processing is divided into three elements (preprocessing, feature calculation, and region setting), and the quantities defined for each element are extracted. Specifically, as shown below, 6 kinds of preprocessing, 10 kinds of feature calculation, and 4 kinds of region are combined to extract data of 240 (6 × 10 × 4) dimensions in total.

<Preprocessing>
Six kinds of filtering are applied to the input image (Sobel, vertical Sobel, horizontal Sobel, inter-frame difference, luminance, and saturation) to extract six-dimensional feature data.

<Feature quantity>
Ten kinds of calculation are applied to the pixel values of each filtered image (mean, variance, maximum, minimum, horizontal centroid, vertical centroid, contrast, uniformity, entropy, and fractal dimension) to extract ten-dimensional feature data.

<Region>
As shown in FIG. 2, a region A0 is set within the image, and four-dimensional feature data are extracted for four kinds of region: the whole of the set region A0, and the left region A1, the right region A2, and the central region A3 within A0.
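How the three elements combine into 240 dimensions can be made concrete with a small sketch that only enumerates the combinations; it does not compute any image statistics. The identifier strings, the `filter/statistic/region` naming, and the ordering of the dimensions are assumptions chosen for illustration, since the document does not specify how the 240 values are laid out in the vector.

```python
from itertools import product

# The 6 preprocessing filters, 10 statistics and 4 regions named in the text.
FILTERS = ("sobel", "sobel_vertical", "sobel_horizontal",
           "frame_diff", "luminance", "saturation")
STATS = ("mean", "variance", "max", "min", "centroid_x", "centroid_y",
         "contrast", "uniformity", "entropy", "fractal_dim")
REGIONS = ("A0", "A1", "A2", "A3")

# One name per dimension; regions vary fastest, filters slowest (assumed order).
FEATURE_NAMES = [f"{f}/{s}/{r}" for f, s, r in product(FILTERS, STATS, REGIONS)]


def feature_index(filt: str, stat: str, region: str) -> int:
    """Position of one (filter, statistic, region) combination in the vector."""
    return ((FILTERS.index(filt) * len(STATS) + STATS.index(stat))
            * len(REGIONS) + REGIONS.index(region))
```

A fixed layout like this lets a reduced feature set be expressed simply as a list of indices into the full vector.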

The 240-dimensional feature set described above may be reduced according to the computing performance of the online system. For example, vehicle data may be used in addition to the image to extract a six-dimensional feature vector: the mean and variance of the Sobel output over the whole screen, the mean and variance of the inter-frame difference over the whole screen, the vehicle speed, and the steering angle.

In the feature extraction described above, each feature quantity is normalized. Because normalizing over the theoretical range is inefficient, the distribution of each feature quantity is evaluated in advance, maximum and minimum values are set from that evaluation, and the feature is normalized to a value from 0 to 1. The maximum and minimum may also be changed dynamically: for example, when a value above the maximum or below the minimum is input, the corresponding bound is moved to widen the range; conversely, when no data near the minimum or maximum has arrived for some time, the range is narrowed.
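The dynamic adjustment of the normalization range can be sketched as below. The text only states that the range widens on out-of-range input and narrows when no data arrives near a bound for a while; the class name `AdaptiveRange` and the parameters `near`, `patience`, and `shrink`, together with their default values, are hypothetical choices made to turn that description into something runnable.

```python
class AdaptiveRange:
    """Min-max normaliser whose bounds follow the observed data (a sketch)."""

    def __init__(self, lo, hi, near=0.05, patience=100, shrink=0.01):
        self.lo, self.hi = lo, hi   # initial bounds from the prior evaluation
        self.near = near            # fraction of the span counted as "near a bound"
        self.patience = patience    # samples without near-bound data before narrowing
        self.shrink = shrink        # fraction of the span removed per narrowing step
        self.idle_lo = 0
        self.idle_hi = 0

    def normalize(self, x):
        # Widen the range when the input falls outside it.
        self.lo = min(self.lo, x)
        self.hi = max(self.hi, x)
        span = self.hi - self.lo
        # Count how long each bound has gone without nearby data.
        self.idle_lo = 0 if x <= self.lo + self.near * span else self.idle_lo + 1
        self.idle_hi = 0 if x >= self.hi - self.near * span else self.idle_hi + 1
        y = (x - self.lo) / span if span > 0 else 0.0
        # Narrow a bound that has not been approached for `patience` samples.
        if self.idle_lo >= self.patience:
            self.lo += self.shrink * span
            self.idle_lo = 0
        if self.idle_hi >= self.patience:
            self.hi -= self.shrink * span
            self.idle_hi = 0
        return y
```

One instance per feature dimension would be used, each seeded with the minimum and maximum found in the prior evaluation.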

Although basic feature quantities are used here, temporal variation of the features can also be computed, for example by calculating motion information from past frames, and used as feature quantities. Furthermore, to improve overall risk recognition accuracy, high-accuracy image processing can be incorporated into this feature extraction; for example, pedestrian recognition results, road white-line recognition results, and obstacle recognition results may be included in the extracted data. In this sense, the system can also be regarded as one that integrates individual external-environment recognition results to recognize risk.

(C) Fusion unit calculation processing
The fusion unit calculation unit 8 converts the obtained N-dimensional feature vector into a one-dimensional quantity called a state. Conceptually, a state corresponds to dividing the input images into scenes according to where the vehicle is traveling, the weather, the driving condition, and so on. In practice, the system cannot be told explicitly which scene it is in during online learning, so the input data are clustered into M classes, where M is the number of states. State recognition is thus performed by a classifier that outputs a state from the input image feature data (although the classifier's output can also be handled probabilistically, without committing to a single state).

Learning in this process adapts the internal structure of the classifier to the real environment using input data and teacher data. The teacher signal here does not directly indicate which state the input data belongs to; rather, it serves to make the risk recognized from the output state as efficient and accurate as possible.

As a classifier, the recognition process performs prototype-based classification of the input data. Let s denote the state number; each state has a representative value, denoted prot_s(i). The state representative value prot_s(i) is an N-dimensional vector, with i = 0, 1, …, N−1.

Let In(i) denote the input data (feature vector). The distance L(s) between the input vector and each state representative value prot_s(i) is computed by equation (1), and this distance determines which state the input belongs to.
$$L(s) = \Bigl( \sum_i \bigl( \mathrm{prot}_s(i) - \mathrm{In}(i) \bigr)^2 \Bigr)^{1/2} \qquad (1)$$

The state (state number) K to which the input data belongs is the state that minimizes the distance L(s), as shown in equation (2); the input is recognized as belonging to the state whose representative value is nearest to the input vector.
$$K = \arg\min_s L(s) \qquad (2)$$
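The prototype-based recognition of equations (1) and (2) can be sketched as follows. This is a minimal illustration rather than the system's implementation: the function names and the toy prototype values are hypothetical, and plain Python lists stand in for the N-dimensional feature vectors.

```python
import math

def distance(prototype, x):
    """Euclidean distance L(s) of equation (1) between one prototype and the input."""
    return math.sqrt(sum((p - v) ** 2 for p, v in zip(prototype, x)))

def recognize_state(prototypes, x):
    """Return (K, distances), where K indexes the nearest prototype (equation (2))."""
    distances = [distance(p, x) for p in prototypes]
    k = min(range(len(distances)), key=distances.__getitem__)
    return k, distances

# Hypothetical toy data: three states in a 3-dimensional feature space.
prototypes = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [4.0, 0.0, 0.0]]
k, dists = recognize_state(prototypes, [0.9, 1.2, 0.8])
```

Here the input lies closest to the second prototype, so `k` is its index.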

FIG. 3 shows the case where attention is paid to three of the N dimensions. Because the input data is closer to state S1 than to state S6, it is recognized as being in state S1. This is the basic state recognition, and it commits the input data to a single state.

In FIG. 3, the distances to states S1 and S6 do not differ much, yet the input data is recognized as state S1 because it is slightly closer to S1. In regions where the distances to S1 and S6 are nearly equal, recognition may therefore become unstable.

This instability can be resolved by extending the method to treat the state probabilistically. Let P(s) be the probability that the input data is in state s; it is obtained from the distance L(s) by equations (3) and (4). Here σ is a parameter; the smaller it is, the more deterministic the state assignment becomes.
$$P(s) = \exp\bigl(-L(s)/\sigma\bigr) / z \qquad (3)$$
$$z = \sum_s \exp\bigl(-L(s)/\sigma\bigr) \qquad (4)$$

When states are determined probabilistically in this way, on a scale based on their distance from the input data, all subsequent calculations must be carried out over every state. To reduce the computational load, probabilities at or below a fixed threshold may be set to 0 and excluded from the calculation.

Note that if P(s) is defined as 1 only when s = K and 0 otherwise, the result is the same as committing to a single state.
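A minimal sketch of the probabilistic state assignment of equations (3) and (4), including the optional pruning of small probabilities, is given below. The function name, the `threshold` argument, and the sample distances are assumptions made for illustration; the text does not say whether the remaining probabilities are renormalized after pruning, so this sketch leaves them as they are.

```python
import math

def state_probabilities(distances, sigma, threshold=0.0):
    """P(s) = exp(-L(s)/sigma) / z, with z the sum over all states (eqs. (3), (4)).

    Probabilities at or below `threshold` are set to 0 so later steps can skip them.
    """
    weights = [math.exp(-d / sigma) for d in distances]
    z = sum(weights)
    probs = [w / z for w in weights]
    return [p if p > threshold else 0.0 for p in probs]

distances = [1.0, 1.1, 5.0]                     # hypothetical L(s) values
soft = state_probabilities(distances, sigma=1.0)
sharp = state_probabilities(distances, sigma=0.1)   # smaller sigma: more deterministic
pruned = state_probabilities(distances, sigma=1.0, threshold=0.05)
```

With the smaller σ the nearest state takes a larger share of the probability, which matches the remark that reducing σ makes the assignment more deterministic.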

The representative value of each state is updated on the basis of SOM learning, which updates the learning SOM 41 of the learning unit 4. In an SOM, neurons arranged in an M-dimensional (usually two-dimensional) grid each hold a vector value (usually called the weights of the connections to the input); for each input, a winner neuron is determined on the basis of vector distance and output to the base unit 3 and the learning unit 4. In the present embodiment, the winner neuron is determined with reference to the fusion SOM 51 of the fusion unit 5.

(D) SOM learning amount calculation processing
For the learning SOM 41 held by the learning unit 4, the learning amount calculation unit 9 calculates update amounts so that the reference vectors of the neuron with the same number as the winner neuron of the fusion SOM 51, and of its neighboring neurons, move closer to the input vector. By repeating this calculation, the learning SOM 41 learns without supervision to represent the distribution of the input data optimally. FIG. 4 illustrates learning with a one-dimensional SOM.

In this system, SOM learning proceeds as follows, where the neurons (states) are assumed to be connected in one dimension. With K the state number of the winner neuron, the representative vector prot_s is updated (learned) according to equation (5).
$$\mathrm{prot}_s(i) \to \mathrm{prot}_s(i) + \alpha \bigl( \mathrm{In}(i) - \mathrm{prot}_s(i) \bigr) \qquad (5)$$

Here α in equation (5) is a learning-rate coefficient giving the update weight, and is expressed by equation (6).
$$\alpha = a \cdot b(t) \cdot c\bigl(D(s,K), t\bigr) \cdot e(t) \qquad (6)$$
where
a: learning coefficient
b: time decay coefficient
c: region decay coefficient
D(s,K): distance, along the SOM connections, between the neuron being updated and the winner neuron
e: teacher information coefficient

The parameters a, b, and c in equation (6) are also used in ordinary SOMs. The time decay coefficient b is a function of the elapsed learning time t (usually the update count) and generally decays as t increases. The distance D(s,K) is not a distance in feature space; in FIG. 4, for example, the neuron adjacent to the winner neuron is at distance 1 and the one beyond it at distance 2.

The region decay coefficient c, in turn, is a function of the distance D(s,K): its value decreases as D(s,K) increases, and it is set so that neurons beyond a certain distance D(s,K) are not updated. The coefficient c is also a function of time t and decreases as t increases. In addition, this system introduces a teacher information coefficient e(t) representing teacher information, which is described later.

Thus, in the SOM learning algorithm, a wide range of neurons is updated toward the input data early in learning; as learning progresses, both the number of neurons updated and the update amount decrease, until finally the learning-rate coefficient α (the update weight) reaches 0 and learning ends. In the initial state, the neurons are usually placed at random near the center of the vector space.
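The update of equations (5) and (6) on a one-dimensional SOM can be sketched as below. The text fixes only the qualitative behavior of b(t) and c(D, t), so the exponential time decay, the linearly shrinking neighborhood, and the default values of `a`, `tau`, and `radius0` are assumptions chosen for illustration, as are the function names.

```python
import math

def learning_rate(s, k, t, e_t, a=0.5, tau=1000.0, radius0=3.0):
    """alpha = a * b(t) * c(D(s,K), t) * e(t), equation (6), with assumed b and c."""
    b = math.exp(-t / tau)                    # time decay b(t), shrinking with t
    d = abs(s - k)                            # D(s,K): steps along the 1-D chain
    radius = radius0 * math.exp(-t / tau)     # neighborhood radius, shrinking with t
    c = max(0.0, 1.0 - d / (1.0 + radius))    # zero beyond the neighborhood
    return a * b * c * e_t

def som_update(prototypes, x, k, t, e_t):
    """Apply equation (5) in place to every neuron s, given winner index k."""
    for s, prot in enumerate(prototypes):
        alpha = learning_rate(s, k, t, e_t)
        for i in range(len(prot)):
            prot[i] += alpha * (x[i] - prot[i])
    return prototypes

# Hypothetical chain of six neurons in a 2-dimensional feature space.
som = [[float(j), float(j)] for j in range(6)]
som_update(som, x=[1.0, 3.0], k=1, t=0, e_t=1.0)
```

In this example the winner moves halfway toward the input, its immediate neighbors move by a smaller fraction, and the neuron four steps away is left unchanged.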

FIG. 5 shows an example of the SOM distribution after learning. The actual feature space has 240 dimensions, but FIG. 5 shows only three of them, and each point in the graph represents an input datum. In the actual figure each point is colored, with the color indicating the magnitude of the risk. The black points are the representative vectors of the states, and the black lines connecting them are the SOM connections.

The discussion so far has concerned a learning method that optimally represents the distribution of the input data, but what is actually required is a representation of that distribution that is optimal for recognizing risk. The SOM is inherently an unsupervised learning method (it treats incoming data equally as it learns). In this system, however, to learn efficiently for risk recognition, learning is supplied with risk information through the teacher information coefficient e(t) mentioned above.

As described in detail later, risk is recognized in the form of the risk probability of the recognized state, which expresses how probable each risk is in that state. As a concrete learning method, when the input data at time t carries teacher information in the form of a risk level R obtained from driver information, the teacher information coefficient e(t) is made large if the recognized state assigns a high probability to risk level R, and small if it assigns a low probability. When no teacher information is obtained, e(t) is made small.

As a result, as learning proceeds, the recognized state comes to hold the risk at that time with high probability, which means that the risk recognition accuracy is improving. The specific setting of the teacher information coefficient e(t) is explained in the risk recognition processing below.

For learning when the state is obtained probabilistically, the winner neuron is represented by weights according to the probabilities, and each update is scaled by its weight. Because this increases the amount of computation, however, this system performs learning by fixing the winner neuron to the state closest to the input data; states whose probability is at or below a fixed threshold are not updated as winners.

(E) Risk level calculation processing
In the base unit 3 and the learning unit 4, the risk level calculation units 33 and 43 calculate the risk level corresponding to the state by referring to the respective risk distribution tables 32 and 42. As noted above, each state has its own risk probability distribution; the risk probability distribution in state s is written p(R|s). The risk here corresponds to the risk in the teacher creation unit 6 and is divided into 11 levels, so the risk level R and the risk probability (distribution) p(R|s) are related as shown, for example, in FIG. 6.

The risk output is basically this risk probability p(R|s). However, a probability distribution is awkward to use directly for purposes such as alarms or displays, so the risk level is calculated as the expected value E given by equation (7).
$$E = \sum_R R \cdot p(R|s) \qquad (7)$$

When the state is treated probabilistically, the expected value E is given by equation (8).
$$E = \sum_s \sum_R P(s) \cdot R \cdot p(R|s) \qquad (8)$$
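The expected risk level of equations (7) and (8) can be computed as in the following sketch. The representation of p(R|s) as a list indexed by risk level 0 to 10, the function names, and the two sample distributions are assumptions made for illustration.

```python
def expected_risk(p_r_given_s):
    """E = sum over R of R * p(R|s) for a single state (equation (7))."""
    return sum(r * p for r, p in enumerate(p_r_given_s))

def expected_risk_soft(state_probs, risk_table):
    """E = sum over s and R of P(s) * R * p(R|s) (equation (8))."""
    return sum(ps * expected_risk(row) for ps, row in zip(state_probs, risk_table))

# Hypothetical risk distributions over the 11 levels for two states.
low_risk_state = [0.6, 0.3, 0.1] + [0.0] * 8
high_risk_state = [0.0] * 8 + [0.2, 0.5, 0.3]

e_low = expected_risk(low_risk_state)
e_high = expected_risk(high_risk_state)
e_mixed = expected_risk_soft([0.75, 0.25], [low_risk_state, high_risk_state])
```

Setting the state probabilities to 1 for one state and 0 for the others reduces equation (8) to equation (7), consistent with the remark about committing to a single state.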

The risk probability is learned in the learning unit 4 and updated successively. Basically, the risk probability is calculated from the frequency distribution of risk levels experienced in the past. Because this system learns online, however, it is difficult to retain data from the indefinitely distant past, and it is also undesirable to give experiences from the distant past the same importance as current ones. The risk probability is therefore updated by the following method.

Let p_t(R|s_t) be the risk probability of state s_t at time t. The risk probability is updated according to equation (9).
$$p_{t+1}(R|s_t) = p_t(R|s_t) + \beta \qquad (9)$$

The risk probability p_{t+1}(R|s_t) is then normalized according to equation (10).
$$p_{t+1}(R|s_t) \leftarrow p_{t+1}(R|s_t) \Big/ \sum_R p_{t+1}(R|s_t) \qquad (10)$$

Note that only the state at that time is updated. When the state is treated probabilistically, β is replaced by p(s_t)·β in each state, that is, β weighted by the probability of that state. Here β is a constant; the larger its value, the more weight is given to current information.

As mentioned in the description of the teacher creation unit 6, the teacher risk is not necessarily obtained for every frame. When the risk level is high, risk information can often be obtained from the driver data, but when the risk level is low, teacher information is particularly unlikely to be obtained.

This system addresses the problem by propagating teacher risk information along the time axis. This rests on the causal relationship that, when teacher risk information is obtained at a certain time, the preceding time was also dangerous, even if not to the same degree; the teacher risk information is propagated using this relationship.

Propagating information into the past would ordinarily require storing all past state transitions over the propagation span, but with real-time learning as a premise, storage capacity and computation become bottlenecks. This system therefore updates the risk probability by propagation based on the TD (temporal difference) error used in reinforcement learning.

Reinforcement learning is a learning method that learns not from explicit instructions about which action to take in each state but from rewards for the actions taken, selecting at each moment the action that maximizes the total reward expected in the future. The difference between the actual reward at time t and the predicted reward is called the TD error, and learning proceeds so as to drive it to 0. The risk information of this system corresponds to the reward in reinforcement learning. As shown in FIG. 7, considering the state transitions in a given scene, the states S2, S7, … leading up to state S1 should also carry risk, so risk information is propagated to them.

In this case, propagation need only go from the current state to the immediately preceding frame (that is, both computation and storage involve only the relationship with the previous frame). A single experience does not propagate risk information far enough into the past, but as similar experiences are repeated, the risk information propagates gradually and the causal relationship is learned. Moreover, as shown in FIG. 8, risk information is propagated not only from state S_t at time t to state S_{t−1} at time t−1 within the same risk level, but also between states at different risk levels. However, risk level 0 is not propagated, because it covers the case where no risk information is available as well as the case where there is no risk.

The risk probability p(r|s_{t−1}) is updated by propagation according to equation (11).
$$\begin{aligned} p(r|s_{t-1}) ={}& p(r|s_{t-1}) + \eta \cdot \bigl( RI(r) + \gamma \cdot p(r|s_t) - p(r|s_{t-1}) \bigr) \\ &+ h \cdot \eta \cdot \bigl( \gamma \cdot p(r-1|s_t) - p(r-1|s_{t-1}) \bigr) \\ &+ h \cdot \eta \cdot \bigl( \gamma \cdot p(r+1|s_t) - p(r+1|s_{t-1}) \bigr) \end{aligned} \qquad (11)$$
where
h: parameter for the magnitude of propagation in the risk-level direction
γ: parameter for the magnitude of propagation along the time series
η: parameter for the size of the update in a single learning step

Here the risk information obtained at time t is written RI(r) in terms of the risk level r. As noted above, the risk information handled by the teacher creation unit 6 is obtained for a single risk level among the 11 levels from 0 to 10. That is, if the risk information obtained at time t is risk level Q, RI(r) is given by equations (12) and (13).
$$RI(r) = 1 \quad (r = Q) \qquad (12)$$
$$RI(r) = 0 \quad (r \neq Q) \qquad (13)$$

In this risk learning, however, the risk information RI(r) is extended, as shown in FIG. 9, on the assumption that risk also exists near the obtained risk level. Specifically, using a parameter g (with g smaller than 1), the adjacent risk levels are given g times the value and the levels next to those g·g times the value; this makes more effective use of limited teacher data. The parameter h, which sets the magnitude of propagation in the risk-level direction, is normally given the same value as the g used for extending the risk information.
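The propagation step of equation (11), together with the extended risk information RI(r), can be sketched as follows. Several details are not specified in the text and are assumptions here: the extension is cut off two levels from Q, neighbor terms that would fall outside levels 1 to 10 are skipped, level 0 is neither updated nor used as a neighbor, and no normalization is applied afterwards. Function names and parameter values are likewise hypothetical.

```python
def risk_information(q, g, levels=11, reach=2):
    """RI(r): 1 at the observed level Q, g and g*g at the neighboring levels."""
    ri = [0.0] * levels
    if q is None or q == 0:          # no teacher information: RI is all zeros
        return ri
    for r in range(1, levels):
        d = abs(r - q)
        if d <= reach:
            ri[r] = g ** d
    return ri

def propagate(p_prev, p_curr, ri, eta, gamma, h):
    """Return the updated p(.|s_{t-1}) from p(.|s_t), following equation (11)."""
    levels = len(p_prev)
    new = list(p_prev)
    for r in range(1, levels):       # level 0 is not propagated
        delta = eta * (ri[r] + gamma * p_curr[r] - p_prev[r])
        for nb in (r - 1, r + 1):    # neighboring risk levels
            if 1 <= nb < levels:
                delta += h * eta * (gamma * p_curr[nb] - p_prev[nb])
        new[r] = p_prev[r] + delta
    return new

# Hypothetical case: the previous state knew no risk, the current state is level 8.
p_prev = [1.0] + [0.0] * 10
p_curr = [0.0] * 8 + [1.0, 0.0, 0.0]
ri = risk_information(q=8, g=0.5)
p_new = propagate(p_prev, p_curr, ri, eta=0.1, gamma=0.9, h=0.5)
```

Only the distributions of the current and previous frames are needed, which reflects the point that computation and storage involve just the preceding frame.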

After the risk probability is updated, the teacher information coefficient e(t) used in the learning processing of the fusion unit calculation unit 8 is set according to equations (14) and (15), where R is the risk information obtained from the teacher creation unit 6 at time t.
When R ≠ 0:
$$e(t) = 10 \cdot R \cdot p(R|s_t) \qquad (14)$$
When R = 0:
$$e(t) = \mathrm{const} \qquad (15)$$

R = 0 corresponds to the case where no teacher information was received; in that case the teacher information coefficient e(t) is the constant const, that is, a fixed gain. This value is determined by the probability of obtaining a teacher risk and is set on the basis of the ratio between the number of supervised and unsupervised learning samples. In this system, as a rule of thumb, const is set so that the number of supervised learning samples equals the number of unsupervised learning samples, and const = 0.01 is used.

When teacher information is received, learning is stronger the higher that probability and the higher the risk level. Consequently, when the probability assigned to an event that actually occurred is low, the state recognition is likely to be wrong, and learning is weakened. The representative vector of that state is instead trained toward data that were recognized as the same state and had a high risk probability. As such learning continues, the input data in question becomes more likely to be recognized as another state and less likely to be recognized as the state that was probably wrong. In this way, state recognition and risk recognition are optimized as a whole.
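Equations (14) and (15) amount to a small case distinction, sketched below. The function name and the sample distribution are hypothetical; the default `const=0.01` is the value stated in the text.

```python
def teacher_coefficient(r, p_row, const=0.01):
    """e(t) = 10 * R * p(R|s_t) when R != 0 (eq. (14)), else a fixed gain (eq. (15))."""
    if r == 0:
        return const
    return 10 * r * p_row[r]

# Hypothetical risk distribution of the recognized state over levels 0 to 10.
p_row = [0.0] * 7 + [0.25, 0.5, 0.25, 0.0]

e_confident = teacher_coefficient(8, p_row)   # observed level matches the state well
e_unlikely = teacher_coefficient(3, p_row)    # observed level the state gives no weight
e_no_teacher = teacher_coefficient(0, p_row)  # no teacher information
```

An observed level that the recognized state considers improbable yields a small coefficient, so the corresponding prototype update in equation (5) is weakened.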

(F) Fusion risk recognition processing
The risk level from the base unit 3 (hereinafter, the "base risk level") and the risk level from the learning unit 4 (hereinafter, the "learning risk level") are fused in the fusion calculation unit 10 at a predetermined ratio (fusion rate) αf and sent to the fusion unit 5. Specifically, the risk level output to the fusion unit 5 is calculated by, for example, taking a weighted average of the base risk level and the learning risk level with the fusion rate αf.

Each time the learning SOM 41 of the learning unit 4 is updated, the fusion calculation unit 10 also fuses the base SOM 31 of the base unit 3 with the learning SOM 41 of the learning unit 4 and updates the fusion SOM 51 of the fusion unit 5. The two SOMs are fused by combining the corresponding winner neurons of the base SOM 31 and the learning SOM 41 at the fusion rate αf and updating each neuron in accordance with the fused winner neuron.

The fusion rate αf is controlled according to the number of times the winner neuron of the learning SOM 41 has been trained, the similarity between the parameters of the base SOM 31 and those of the learning SOM 41, or a combination of the learning count and the similarity. The following example, based on the flowchart of the fusion rate and fusion parameter calculation processing shown in FIG. 10, calculates the fusion rate from the learning count and then calculates the SOM fusion parameters from that fusion rate.

In this processing, first, in step S1, the winner neuron number of the learning SOM 41 is input, and the histogram of learning counts for the n neurons is updated as shown in FIG. 11. Next, in step S2, the updated histogram is used to calculate the fusion rate αf_i (i = 1, 2, …, n; 0 ≤ αf_i ≤ 1) for each neuron according to the learning count of the winner neuron.

For example, as shown in FIG. 12, a table relating the learning count to the fusion rate αf is prepared in advance, and αf is calculated by referring to it. In the table of FIG. 12, so that online learning operates stably by raising the fusion rate gradually, αf increases linearly until the learning count reaches a predetermined count NL, and thereafter is held at a constant value αfL (for example, αfL = 0.5).

After the fusion rate αf is calculated in step S2, the process proceeds to step S3, where the parameter C_i of the fusion SOM 51 is calculated. As shown in equation (16), C_i is calculated by weighting the neuron parameter B_i of the base SOM 31 and the neuron parameter L_i of the learning SOM 41 with the fusion rate αf_i.
$$C_i = (1 - \alpha f_i) \cdot B_i + \alpha f_i \cdot L_i \qquad (16)$$
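Steps S2 and S3 can be sketched as follows. The linear ramp is assumed to start from 0 at a learning count of 0, which the text does not state; the per-neuron learning counts stand in for the histogram of FIG. 11; and `fuse_risk_level` assumes that the weighted average of the risk levels uses the same weighting as equation (16). All function names and sample values are hypothetical.

```python
def fusion_rate(count, n_l, alpha_max=0.5):
    """Ramp linearly up to alpha_max at count n_l, then stay constant (cf. FIG. 12)."""
    return alpha_max * min(count, n_l) / n_l

def fuse_som(base, learned, counts, n_l, alpha_max=0.5):
    """C_i = (1 - af_i) * B_i + af_i * L_i for each neuron i (equation (16))."""
    fused = []
    for b_vec, l_vec, count in zip(base, learned, counts):
        af = fusion_rate(count, n_l, alpha_max)
        fused.append([(1 - af) * b + af * l for b, l in zip(b_vec, l_vec)])
    return fused

def fuse_risk_level(base_level, learned_level, af):
    """Weighted average of the base and learning risk levels (assumed weighting)."""
    return (1 - af) * base_level + af * learned_level

# Hypothetical two-neuron SOMs: neuron 0 has never been trained, neuron 1 often.
base_som = [[0.0, 0.0], [2.0, 2.0]]
learned_som = [[4.0, 4.0], [6.0, 6.0]]
fused_som = fuse_som(base_som, learned_som, counts=[0, 200], n_l=100)
risk = fuse_risk_level(2.0, 6.0, fusion_rate(50, 100))
```

An untrained neuron keeps the base SOM's parameters, so biased online learning in rarely visited states cannot displace the prior knowledge held by the base unit.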

In the fusion unit 5, the base SOM 31, which preserves the pre-learning result, and the learning SOM 41, which reflects the online learning result, are thus fused. A single risk level is determined from the relationship between this fusion SOM 51 and the fused risk level and is output to, for example, a display device of the vehicle 1.

FIG. 13 shows an example of the risk recognition output produced by the above processing. FIGS. 13(a) to 13(d) are system output images in which the recognition result is overlaid on images from the in-vehicle camera; the magnitude of the recognized risk is shown by the bar graphs B1 to B4 at the bottom of each screen. The recognized risk shown by bar graphs B1 to B4 is the expected value of the risk probability described above, and the number displayed above each bar is the recognized state number.

The two images in FIGS. 13(a) and 13(b) are scenes with no pedestrians or oncoming vehicles nearby, where the risk appears to be low. The two images in FIGS. 13(c) and 13(d) are, respectively, a scene on a narrow road with one lane in each direction where an oncoming vehicle makes the available width narrower still, and a left turn at an intersection; both appear to carry higher risk than the scenes in FIGS. 13(a) and 13(b).

The risk is described as "appearing to be" low (or high) because no absolute risk value exists for any given image. The recognition results show that the system recognizes the scenes of FIGS. 13(c) and 13(d) as riskier than those of FIGS. 13(a) and 13(b).

As described above, in the present embodiment the pre-learning result held in the base unit 3 and the online learning result held in the learning unit 4 are fused, and the fused result is held in the fusion unit 5 and used for the final risk recognition. This avoids forgetting prior knowledge and prevents the degradation of recognition performance caused by biased learning. Online learning operates stably, which improves the reliability and performance of risk recognition processing, and basic performance is maintained while the system specializes to the user's environment.

Moreover, learning for risk recognition is applied to the neuron in the learning unit 4 that has the same number as the SOM neuron determined in the fusion unit 5, and the winner neuron is not determined separately in parallel in the learning unit 4, which reduces the computational cost. In addition, no discrepancy arises in the winner neuron numbers, so degradation of recognition performance due to biased learning can be avoided.

Next, a second embodiment of the present invention is described. The second embodiment omits the fusion unit 5 of the first embodiment and operates a base unit 3' and a learning unit 4' in parallel.

That is, as shown in FIG. 14, the base unit 3' of the second embodiment includes the same base SOM 31, base risk distribution table 32, and risk level calculation unit 33 as in the first embodiment, and additionally includes a base risk level table 34. Likewise, the learning unit 4' includes the same learning SOM 41, learning risk distribution table 42, and risk level calculation unit 43 as in the first embodiment, and additionally includes a learning risk level table 44.

In the second embodiment, the fusion unit calculation unit 8 of the first embodiment is split into a base unit calculation unit 11 and a learning unit calculation unit 12, dedicated to data calculation for the base unit 3' and the learning unit 4' respectively. A fusion calculation unit 13 is also provided to fuse the base risk level output from the base unit 3' with the learning risk level output from the learning unit 4'. The teacher creation unit 6, the image feature extraction unit 7, and the learning amount calculation unit 9 are the same as in the first embodiment.

In the second embodiment, the base unit 3' and the learning unit 4' operate in parallel and output a base risk level and a learning risk level, respectively. Each unit operates as in the first embodiment, except that the learning process uses the winner neuron determined in the learning unit 4'. As in the first embodiment, the base risk level and the learning risk level output from the units 3' and 4' are fused at the fusion rate αf and output as a single risk level.
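The final fusion step can be sketched in Python as below. This passage states only that the two risk levels are fused at the fusion rate αf; the sketch assumes a convex combination in which αf weights the learning unit and (1 - αf) weights the base unit. The function name `fuse_risk_levels`, the range check, and the sample values are hypothetical.

```python
def fuse_risk_levels(base_risk, learning_risk, alpha_f):
    """Blend two risk levels into one.

    Assumption: alpha_f in [0, 1] is the weight of the learning unit,
    so alpha_f = 0 reproduces the base unit and alpha_f = 1 the learning unit.
    """
    if not 0.0 <= alpha_f <= 1.0:
        raise ValueError("alpha_f must lie in [0, 1]")
    return (1.0 - alpha_f) * base_risk + alpha_f * learning_risk


# Hypothetical outputs of the two units for the same input frame.
base_risk = 2.0       # from the base unit 3'
learning_risk = 4.0   # from the learning unit 4'

fused = fuse_risk_levels(base_risk, learning_risk, alpha_f=0.25)
```

Under this assumption, a small αf keeps the output close to the shipped behavior of the base unit even if the learning unit has drifted, which matches the stated aim of limiting the effect of biased learning.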

As in the first embodiment, the second embodiment also avoids forgetting of prior knowledge and prevents degradation of recognition performance due to biased learning. Furthermore, because the base unit 3' and the learning unit 4' perform their calculations in parallel in the second embodiment, the calculation tends to be more redundant than in the first embodiment, but the system configuration can be simplified.

1 Online risk learning system
3 Base unit
4 Learning unit
5 Fusion unit
8 Fusion unit calculation unit
10 Fusion calculation unit
αf Fusion rate

Claims

An online risk learning system that detects the external environment of a moving object and learns to recognize risks contained in the external environment, comprising:
a base unit that holds a prior learning result for the risk, and a learning unit that holds an online learning result for the risk,
wherein the risk level of the base unit and the risk level of the learning unit are fused at a predetermined fusion rate and output as a single risk level.

The online risk learning system according to claim 1, wherein the fusion rate is controlled according to the number of learning iterations of the learning unit.

The online risk learning system according to claim 1, wherein the fusion rate is controlled according to a similarity between a learning parameter of the base unit and a learning parameter of the learning unit.

The online risk learning system according to claim 1, wherein the fusion rate is controlled by combining the number of learning iterations of the learning unit with the similarity between the learning parameter of the base unit and the learning parameter of the learning unit.

The online risk learning system according to any one of the above, further comprising a fusion unit that fuses and holds the prior learning result of the base unit and the online learning result of the learning unit, wherein the single risk level is output from the fusion unit.

The base unit and the learning unit are operated in parallel, and the risk levels output from the respective units are fused at the fusion rate and output as the single risk level. The online risk learning system described in 1.