JP6726075B2

JP6726075B2 - Image processing method, image processing apparatus and program

Info

Publication number: JP6726075B2
Application number: JP2016196795A
Authority: JP
Inventors: 和紀小塚; 谷川　徹; 齊藤　雅彦
Original assignee: Panasonic Intellectual Property Corp of America
Current assignee: Panasonic Intellectual Property Corp of America
Priority date: 2016-03-11
Filing date: 2016-10-04
Publication date: 2020-07-22
Anticipated expiration: 2036-10-04
Also published as: JP2017162437A

Description

The present invention relates to an image processing method, an image processing apparatus, and a program.

In recent years, general object recognition based on machine learning with neural networks has attracted attention for its high performance.

However, to obtain high recognition performance from neural-network-based general object recognition, the learning process must use a large number of images annotated with correct-answer information such as the name and type of each object to be recognized.

It is also known that, in machine learning, accuracy improves when large-scale data (big data) is provided as learning data.

One way to collect big data is to outsource the work to third parties, for example through crowdsourcing. Crowdsourcing is a mechanism for requesting simple jobs (tasks) from an unspecified large number of people (workers) over the Internet at low cost. When big data is collected through crowdsourcing, the tasks for the individual data items that make up the big data can be distributed among many workers, so the big data can be collected efficiently (at relatively low cost and in a short time).

For example, Patent Document 1 discloses a technique for realizing crowdsourcing with high work accuracy using as few people as possible.

Japanese Unexamined Patent Application Publication No. 2013-197785 (JP 2013-197785 A)

However, even with the technique disclosed in Patent Document 1, when the annotation work requires a high degree of recognition, the results tend to differ from one crowdsourcing worker to another. An example of annotation work that requires a high degree of recognition is adding an annotation indicating a danger region, that is, a region likely to become dangerous because a person may cross in front of a traveling vehicle. Consequently, when the annotation work requires a high degree of recognition, the quality of the learning data obtained through crowdsourcing varies. When machine learning is performed on big data made up of learning data of varying quality, the accuracy of learning does not improve.

The present disclosure has been made in view of the above circumstances, and its object is to provide an image processing method, an image processing apparatus, and a program that can suppress variation in the quality of learning data.

To achieve the above object, an image processing method according to one aspect of the present invention includes: an acquisition step of acquiring a plurality of time-series consecutive images captured by an in-vehicle camera mounted on a vehicle, to which a first annotation indicating two or more first regions, at least one of which is a person region, has been added, the plurality of images including at least one image in which the two or more first regions are present in the travel route of the vehicle and the distance between the first regions is less than or equal to a threshold; a determination step of determining the position of each of the two or more first regions in the plurality of images acquired in the acquisition step while going back in time sequentially from the image at the last time in the time series; a decision step of identifying, among the plurality of images, a first image at a first time, which is the first time at which the determination step determines that none of the two or more first regions is positioned in the travel route, and deciding a region between the two or more first regions in the identified first image as a second region; and an adding step of adding a second annotation indicating the second region decided in the decision step.

These general or specific aspects may be implemented as a system, a method, an integrated circuit, a computer program, or a computer-readable recording medium such as a CD-ROM, or as any combination of a system, a method, an integrated circuit, a computer program, and a recording medium.

According to the present invention, an image processing method and the like that can suppress variation in the quality of learning data can be realized.

FIG. 1 is a diagram showing an example of the functional configuration of the image processing apparatus according to Embodiment 1.
FIG. 2 is a diagram showing an example of a plurality of images acquired by the annotation unit according to Embodiment 1.
FIG. 3 is an explanatory diagram of image processing performed by the annotation unit according to Embodiment 1 on the plurality of images shown in FIG. 2.
FIG. 4 is an explanatory diagram of an example of image processing performed by the annotation unit according to Embodiment 1 on the plurality of images shown in FIG. 2.
FIG. 5 is a diagram showing an example of the detailed functional configuration of the narrowing unit shown in FIG. 1.
FIG. 6 is an explanatory diagram of the first narrowing method of the narrowing unit according to Embodiment 1.
FIG. 7 is an explanatory diagram of the second narrowing method of the narrowing unit according to Embodiment 1.
FIG. 8 is a flowchart showing the operation of the narrowing unit of the image processing apparatus according to Embodiment 1.
FIG. 9 is a flowchart showing the operation of the annotation unit of the image processing apparatus according to Embodiment 1.
FIG. 10 is an explanatory diagram of the effect of Embodiment 1.
FIG. 11A is a diagram showing an example of the second region decided by the annotation unit according to Embodiment 1.
FIG. 11B is a diagram showing an example of the second region decided by the annotation unit according to Modification 1.
FIG. 12 is a diagram showing an example of a plurality of images acquired by the annotation unit according to Modification 2.
FIG. 13 is a diagram showing an example of the second region decided by the annotation unit according to Modification 2.
FIG. 14 is a diagram showing an example of a plurality of images acquired by the annotation unit according to Modification 3.
FIG. 15 is a diagram showing an example of the second region decided by the annotation unit according to Modification 3.
FIG. 16 is a diagram showing an example of the second annotation added by the annotation unit in the first example of Modification 4.
FIG. 17 is a diagram showing an example of the second annotation added by the annotation unit in the second example of Modification 4.
FIG. 18 is a diagram showing an example of the detailed configuration of the determination unit according to Embodiment 2.
FIG. 19 is a flowchart showing the operation of the determination unit of the image processing apparatus according to Embodiment 2.
FIG. 20 is a diagram showing an example of a plurality of images acquired by the acquisition unit according to Embodiment 2.
FIG. 21 is an explanatory diagram of image processing performed by the determination unit according to Embodiment 2 on the plurality of images shown in FIG. 20.
FIG. 22 is a diagram showing an example of the functional configuration of the image processing apparatus according to Embodiment 3.

An image processing method according to one aspect of the present invention includes: an acquisition step of acquiring a plurality of time-series consecutive images captured by an in-vehicle camera mounted on a vehicle, to which a first annotation indicating two or more first regions, at least one of which is a person region, has been added, the plurality of images including at least one image in which the two or more first regions are present in the travel route of the vehicle and the distance between the first regions is less than or equal to a threshold; a determination step of determining the position of each of the two or more first regions in the plurality of images acquired in the acquisition step while going back in time sequentially from the image at the last time in the time series; a decision step of identifying, among the plurality of images, a first image at a first time, which is the first time at which the determination step determines that none of the two or more first regions is positioned in the travel route, and deciding a region between the two or more first regions in the identified first image as a second region; and an adding step of adding a second annotation indicating the second region decided in the decision step.

In this way, a second annotation indicating a second region, which would require a high degree of recognition if left to a crowdsourcing worker, can be added mechanically to a plurality of images captured by an in-vehicle camera. This suppresses variation in the quality of learning data that includes the plurality of images.

Here, for example, in the decision step, a second image in which the two or more first regions are present in the travel route of the vehicle and the distance between the first regions is less than or equal to the threshold may further be identified, and the region between the two or more first regions in each of the time-series consecutive images from the identified first image to the second image may be decided as the second region.

This allows the second annotation indicating the second region to be added mechanically to one or more images.

Further, for example, the image processing method may further include: a first narrowing step of selecting first narrowed images from among all time-series consecutive images that were captured by the in-vehicle camera mounted on the vehicle and are associated with information indicating the brake strength or acceleration of the vehicle, the first narrowed images being the images from a time point a fixed period before the time point at which the brake strength or acceleration of the vehicle exceeds a threshold up to that time point; and a second narrowing step of selecting the plurality of images from among the first narrowed images selected in the first narrowing step.

In this way, the images captured by the in-vehicle camera are first narrowed down to time-series images that may be given the second annotation indicating the second region and that include images to which the first annotation indicating the first regions has been added, and the second annotation indicating the second region can then be added mechanically.
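The first narrowing step can be illustrated with a minimal Python sketch. The `Frame` class, its field names, and the handling of several trigger points are assumptions made for illustration; the text only states that frames from a fixed period before the threshold is exceeded up to that point are selected.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    t: float      # capture time in seconds (hypothetical field)
    brake: float  # brake strength or acceleration linked to the frame


def first_narrowing(frames, threshold, lookback_s):
    """Keep frames from `lookback_s` before each trigger up to the trigger itself."""
    # A trigger is any frame whose brake value exceeds the threshold.
    triggers = [f.t for f in frames if f.brake > threshold]
    return [f for f in frames if any(t - lookback_s <= f.t <= t for t in triggers)]


# Ten frames at 1 s intervals with one hard-braking event at t = 6 s.
frames = [Frame(t=float(i), brake=0.9 if i == 6 else 0.1) for i in range(10)]
selected = first_narrowing(frames, threshold=0.5, lookback_s=3.0)
print([f.t for f in selected])  # [3.0, 4.0, 5.0, 6.0]
```

The second narrowing step would then pick, from `selected`, the frames that carry the required first annotations.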

Here, for example, the image processing method may further include, before the first narrowing step, a worker step of having crowdsourcing workers add, to all of the images, the first annotation indicating the first regions present in the images.

Further, for example, the image processing method may further include, before the second narrowing step, a worker step of having crowdsourcing workers add, to the first narrowed images selected in the first narrowing step, the first annotation indicating the first regions present in the first narrowed images.

In either case, crowdsourcing workers can be made to add the first annotation indicating the first regions present in the images.

Here, for example, each of the two or more first regions may be a person region indicating a person.

This allows the second annotation indicating the second region to be added mechanically, with the second region treated as a danger region in which the vehicle may collide with a person as it travels.

Further, for example, the two or more first regions may include a person region indicating a person and an automobile region indicating a parked or stopped automobile.

This allows the second annotation indicating the second region to be added mechanically, with the second region treated as a danger region in which the vehicle may have a collision as it travels.

Further, for example, the second region may be a danger region in which, when the objects indicated by the two or more first regions approach each other, they cross in front of the vehicle and may collide with the vehicle, and the image processing method may further include a risk adding step of including, in the second annotation added in the adding step, a degree of risk whose value is higher the smaller the area of the second region is.

This allows the second annotation indicating the second region, which is a danger region for the traveling vehicle, to further include a degree of risk.
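The text only requires that the degree of risk grow as the area of the second region shrinks; it gives no formula. The following Python sketch uses one possible monotonic mapping, with the normalization by image area and the function name chosen here as assumptions.

```python
def risk_from_area(region_area, image_area):
    """Return a risk in [0, 1] that increases as the second region gets smaller.

    The linear form 1 - area ratio is an assumed mapping, not taken from the text.
    """
    if image_area <= 0 or region_area < 0:
        raise ValueError("image_area must be positive and region_area non-negative")
    return 1.0 - min(region_area / image_area, 1.0)


# A narrow gap between two people scores higher than a wide one.
narrow = risk_from_area(region_area=100, image_area=1000)
wide = risk_from_area(region_area=500, image_area=1000)
```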

Further, for example, the second region may be a danger region in which, when the objects indicated by the two or more first regions approach each other, they cross in front of the vehicle and may collide with the vehicle, and the image processing method may further include a risk adding step of including, in the second annotation added in the adding step, degrees of risk that differ between a one-side region and an other-side region that make up the second region, the degree of risk being higher for whichever of the one-side region and the other-side region is on the side of the first region that moves more, of the two first regions sandwiching the second region.
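As a rough illustration of the side-dependent risk, the Python sketch below assigns the higher value to the half of the second region that adjoins the first region with the larger movement. The concrete risk values, the function name, and the treatment of equal movement are assumptions; the text specifies only which side receives the higher value.

```python
def side_risks(move_a, move_b, high=1.0, low=0.5):
    """Return (risk of the half next to region A, risk of the half next to region B).

    move_a and move_b are the movement magnitudes of the two first regions
    that sandwich the second region, for example in pixels per frame.
    """
    if move_a > move_b:
        return high, low
    if move_b > move_a:
        return low, high
    return high, high  # equal movement is not covered by the text; assumed here
```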

Further, for example, the determination step may include: a first determination step of determining, in the plurality of images acquired in the acquisition step, the first image encountered to which the first annotation has not been added, while going back in time sequentially from the image at the last time in the time series; and a second determination step of determining by image processing, while going back in time in time-series order from the image at a third time, which is the time of the image determined in the first determination step, whether the first region is present in each of those images at a position obtained by shifting the first region in the image at the time next after the third time in the time series in a direction perpendicular to the moving direction of the vehicle.

Thus, even when the first annotation indicating a first region that should have been added to some of the images is missing, image processing can determine whether those images contain the first region. As a result, further second annotations indicating second regions that require a high degree of recognition can be added, which suppresses variation in the quality of the learning data that includes the plurality of images.

An image processing apparatus according to one aspect of the present invention includes: an acquisition unit that acquires a plurality of time-series consecutive images captured by an in-vehicle camera mounted on a vehicle, to which a first annotation indicating two or more first regions, at least one of which is a person region, has been added, the plurality of images including at least one image in which the two or more first regions are present in the travel route of the vehicle and the distance between the first regions is less than or equal to a threshold; a determination unit that determines the position of each of the two or more first regions in the plurality of images acquired by the acquisition unit while going back in time sequentially from the image at the last time in the time series; a decision unit that identifies, among the plurality of images, a first image at a first time, which is the first time at which the determination unit determines that none of the two or more first regions is positioned in the travel route, and decides a region between the two or more first regions in the identified first image as a second region; and an adding unit that adds a second annotation indicating the second region decided by the decision unit.

These general or specific aspects may be implemented as a system, a method, an integrated circuit, a computer program, or a computer-readable recording medium such as a CD-ROM, or as any combination of a system, a method, an integrated circuit, a computer program, or a recording medium.

An image processing method and the like according to one aspect of the present invention are described below in detail with reference to the drawings. Each embodiment described below shows one specific example of the present invention. The numerical values, shapes, materials, components, arrangement of components, and the like shown in the following embodiments are examples and are not intended to limit the present invention. Among the components in the following embodiments, those not recited in the independent claims, which represent the broadest concept, are described as optional components. The contents of all the embodiments may also be combined.

(Embodiment 1)
[Configuration of Image Processing Apparatus 10]
FIG. 1 is a diagram showing an example of the functional configuration of the image processing apparatus 10 according to Embodiment 1.

The image processing apparatus 10 performs image processing that mechanically adds, to the annotated data stored in the storage unit 20, further annotations that would require a high degree of recognition if left to a worker, and outputs the result to the storage unit 30 as learning data. In this embodiment, the annotated data is a plurality of images captured by an in-vehicle camera, to which crowdsourcing workers have added annotations (first annotations) indicating persons (person regions) explicitly present in the images. Annotating a person who is explicitly present in an image does not demand a high degree of recognition from the worker, so individual differences among workers are unlikely to appear and the quality does not vary.

In this embodiment, as shown in FIG. 1, the image processing apparatus 10 includes an annotation unit 11, a narrowing unit 12, and a storage unit 13. Each component is described in detail below.

[Annotation Unit 11]
FIG. 2 is a diagram showing an example of a plurality of images acquired by the annotation unit 11 according to Embodiment 1. FIGS. 3 and 4 are explanatory diagrams of an example of image processing performed by the annotation unit 11 according to Embodiment 1 on the plurality of images shown in FIG. 2.

As shown in FIG. 1, the annotation unit 11 includes an acquisition unit 111, a determination unit 112, a decision unit 113, and an adding unit 114.

(Acquisition Unit 111)
The acquisition unit 111 acquires a plurality of time-series consecutive images captured by an in-vehicle camera mounted on a vehicle, to which a first annotation indicating two or more first regions, at least one of which is a person region, has been added, the plurality of images including at least one image in which the two or more first regions are present in the travel route of the vehicle and the distance between the first regions is less than or equal to a threshold. In the following description, each of the two or more first regions is a person region indicating a person.

In this embodiment, the acquisition unit 111 acquires from the storage unit 13 data to which first annotations indicating first regions have been added, such as the time-series consecutive images shown in FIG. 2.

The time-series consecutive images are now described with reference to FIG. 2.

The plurality of images shown in FIG. 2 are some of the images that make up the learning data; for example, they are time-series consecutive images forming part of video captured by an in-vehicle camera mounted on a vehicle. More specifically, the plurality of images shown in FIG. 2 consist of frame 101a, frame 101b, frame 101c, frame 101d, and so on. Each of these images contains (shows) a road 1011, a person 60, and a person 61. In general, in the images that make up video captured by an in-vehicle camera, the movement of a vehicle such as a car is larger (faster) than the movement of the persons 60 and 61, so across the plurality of images the persons 60 and 61 appear to move away (or approach).

Furthermore, first regions (first annotations) have been added to the plurality of images (frames 101a to 101d). Here, the first regions (first annotations) are person regions indicating the explicitly present persons 60 and 61. In, for example, frame 101d and frame 101c (one or more of the plurality of images), the two first regions indicating the persons 60 and 61 are present on the road 1011, which is the travel route of the vehicle, and the distance between the two first regions is less than or equal to the threshold. The threshold may be, for example, no more than the width of one person, or may be zero.
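The distance test between two first regions can be sketched in Python as follows. The `Box` type, the use of the horizontal gap between box edges as the distance, and the function names are assumptions; the text does not define how the distance is measured.

```python
from typing import NamedTuple


class Box(NamedTuple):
    x: float  # left edge in pixels
    y: float  # top edge in pixels
    w: float  # width in pixels
    h: float  # height in pixels


def horizontal_gap(a: Box, b: Box) -> float:
    """Horizontal distance between the facing edges of two boxes (0 if they overlap)."""
    left, right = sorted((a, b), key=lambda r: r.x)
    return max(0.0, right.x - (left.x + left.w))


def is_close(a: Box, b: Box, threshold: float) -> bool:
    """True when the gap is less than or equal to the threshold."""
    return horizontal_gap(a, b) <= threshold
```

With a threshold equal to one person's width, `is_close(Box(0, 0, 10, 20), Box(15, 0, 10, 20), threshold=10)` holds because the gap is 5 pixels; with a threshold of zero, only touching or overlapping boxes qualify.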

(Determination Unit 112)
The determination unit 112 determines the position of each of the two or more first regions in the plurality of images acquired by the acquisition unit 111 while going back in time sequentially from the image at the last time in the time series.

In this embodiment, for the plurality of images shown in FIG. 2, for example, the determination unit 112 determines the position of each of the two first regions in frame 101d, frame 101c, frame 101b, and frame 101a, in that order, based on the first annotations added to each image (frame). For example, because two first annotations have been added to frame 101d, the determination unit 112 determines that frame 101d contains two first regions, and determines the position and size of the boxes indicating the two first regions in frame 101d. The determination unit 112 makes the same determination for frames 101c to 101a in that order; since the processing is as described above, the description is omitted.

(Decision Unit 113)
The decision unit 113 identifies, among the plurality of images, the first image at the first time, which is the first time at which the determination unit 112 determines that none of the two or more first regions is positioned in the travel route. The decision unit 113 then decides the region between the two or more first regions in the identified first image as the second region. The decision unit 113 may further identify a second image in which the two or more first regions are present in the travel route of the vehicle and the distance between the first regions is less than or equal to the threshold. In that case, the decision unit 113 decides, as the second region, the region between the two or more first regions in each of the time-series consecutive images from the identified first image to the second image.

In this embodiment, as shown in FIG. 3, the decision unit 113 identifies frame 101b (the first image) at time t1, the first time at which the determination unit 112 determines, in the plurality of images shown in FIG. 2, that neither the first region indicating the person 60 nor the first region indicating the person 61 is positioned on the road 1011. The decision unit 113 then decides the region between the first region indicating the person 60 and the first region indicating the person 61 in the identified frame 101b as the second region. Here, the second region is a danger region in which the vehicle, as it travels, may collide with an object such as a person indicated by a first region.
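The backward scan and the choice of the region between the two first regions can be sketched in Python as below. Boxes are `(x, y, w, h)` tuples, and the road test (box center inside a fixed horizontal interval), the shape of the returned region, and all coordinates are hypothetical values chosen for illustration rather than the method used in the patent.

```python
def find_first_image(frames, on_route):
    """Scan backward in time; return the index of the first frame met
    in which none of the first regions is on the travel route."""
    for i in range(len(frames) - 1, -1, -1):
        boxes = frames[i]
        if len(boxes) >= 2 and not any(on_route(b) for b in boxes):
            return i
    return None


def region_between(a, b):
    """Box spanning the horizontal gap between two boxes, or None if they overlap."""
    left, right = sorted((a, b), key=lambda r: r[0])
    x0, x1 = left[0] + left[2], right[0]
    if x1 <= x0:
        return None
    top = min(a[1], b[1])
    bottom = max(a[1] + a[3], b[1] + b[3])
    return (x0, top, x1 - x0, bottom - top)


def on_road(box):
    # Assumed road model: the road occupies x in [40, 60] of the image.
    return 40 <= box[0] + box[2] / 2 <= 60


# Frames in chronological order, standing in for frames 101a to 101d.
frames = [
    [(0, 50, 10, 30), (90, 50, 10, 30)],   # 101a: both persons off the road
    [(20, 50, 10, 30), (70, 50, 10, 30)],  # 101b: both persons off the road
    [(38, 50, 10, 30), (52, 50, 10, 30)],  # 101c: both persons on the road
    [(42, 50, 10, 30), (50, 50, 10, 30)],  # 101d: both persons on the road
]
first_idx = find_first_image(frames, on_road)
second_region = region_between(*frames[first_idx])
```

In this example the scan stops at index 1, which corresponds to frame 101b, and the second region is the gap between the two boxes in that frame.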

The determination unit 113 further identifies frame 101d (or frame 101c) as the second image, in which the first region indicating person 60 and the first region indicating person 61 are on the road 1011, the travel route of the vehicle, and the distance between these first regions is equal to or less than the threshold. In this case, the determination unit 113 determines, as the second region, the region between the first region indicating person 60 and the first region indicating person 61 in each of frames 101b to 101d, the plurality of images from frame 101b (the first image) through frame 101d (the second image).

In this way, the determination unit 113 can mechanically determine the second region for one or more images.
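The description leaves the geometry of "the region between the first regions" open. As a minimal sketch, assume each first region is an axis-aligned bounding box in pixel coordinates and take the second region to be the horizontal gap between the two boxes, spanning their combined vertical extent. The `Box` type and the `region_between` function are hypothetical names introduced only for illustration.

```python
from typing import NamedTuple, Optional


class Box(NamedTuple):
    """Axis-aligned box in pixel coordinates (an assumed representation)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def region_between(a: Box, b: Box) -> Optional[Box]:
    """Return the horizontal gap between two boxes, or None if they overlap or touch in x."""
    # Order the boxes so that `left` starts further to the left.
    left, right = (a, b) if a.x_min <= b.x_min else (b, a)
    if left.x_max >= right.x_min:
        return None  # no gap remains between the two first regions
    # The gap runs from the right edge of the left box to the left edge of the
    # right box and covers the vertical extent of both boxes.
    return Box(left.x_max, min(a.y_min, b.y_min), right.x_min, max(a.y_max, b.y_max))
```

Other definitions, such as a polygon on the road surface, would fit the text equally well; the rectangle is chosen here only because it is the simplest shape that sits between two boxes.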

(Adding unit 114)
The adding unit 114 adds a second annotation indicating the second region determined by the determination unit 113.

In the present embodiment, the adding unit 114 adds the second annotation indicating the second region determined by the determination unit 113 to, for example, the images shown in FIG. 4. The adding unit 114 also outputs the plurality of images with the second annotation (the annotated data with the second annotation further added) to the storage unit 30 as learning data.

Note that the annotation unit 11 does not have to output the plurality of images themselves. In that case, the adding unit 114 may instead output information about the plurality of images to be annotated, such as the coordinate values of the first regions indicating persons 60 and 61 and the coordinate values of the second region.

[Configuration of narrowing unit 12]
FIG. 5 is a diagram showing an example of the detailed functional configuration of the narrowing unit 12 shown in FIG. 1. FIG. 6 is an explanatory diagram of the first narrowing method of the narrowing unit 12 in Embodiment 1. FIG. 7 is an explanatory diagram of the second narrowing method of the narrowing unit 12 in Embodiment 1.

As shown in FIG. 5, the narrowing unit 12 includes a first narrowing unit 121 and a second narrowing unit 122.

The narrowing unit 12 narrows the annotated data acquired from the storage unit 20 down to predetermined time-series images and stores them in the storage unit 13. Here, the predetermined time-series images are time-series images that may warrant annotation of a danger region: a region in which persons who approach each other would cross in front of the traveling vehicle and could collide with it, and which a worker could recognize only through advanced recognition.

In the present embodiment, the storage unit 20 is composed of an HDD (hard disk drive), memory, or the like, and stores data annotated by crowdsourcing workers (annotated data).

More specifically, the annotated data consists of all the time-series consecutive images captured by an in-vehicle camera mounted on the vehicle, each linked to information indicating the brake strength or acceleration of the vehicle. In addition, crowdsourcing workers have added to all of these images a first annotation indicating each first region, that is, each person region present in the image.

The first narrowing unit 121 uses brake information or the like to narrow all the images of the annotated data stored in the storage unit 20 down to a plurality of images (first narrowed images) linked to, for example, the first period shown in FIG. 6. More specifically, from all the time-series consecutive images captured by the in-vehicle camera and linked to information indicating the brake strength or acceleration of the vehicle, the first narrowing unit 121 selects the first narrowed images: the plurality of images from the time point at which the brake strength or acceleration exceeds a threshold back to a time point a fixed period earlier.
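The first narrowing can be sketched as a filter over timestamped frames. The sketch below is illustrative only: the `Frame` record, its field names, the use of seconds as the time unit, and the `lookback` parameter (the "fixed period") are all assumptions, since the description does not define a data format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    time: float            # capture time in seconds (assumed unit)
    brake_strength: float  # brake strength or acceleration linked to the frame


def first_narrowing(frames: List[Frame], threshold: float, lookback: float) -> List[Frame]:
    """Keep frames from `lookback` seconds before a braking event up to the event itself.

    A braking event is any frame whose linked brake strength exceeds `threshold`.
    """
    event_times = [f.time for f in frames if f.brake_strength > threshold]
    return [
        f for f in frames
        if any(t - lookback <= f.time <= t for t in event_times)
    ]
```

For instance, with one frame per second, a single braking event at 4 s, and a 2 s lookback, the frames at 2 s, 3 s, and 4 s are kept.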

The second narrowing unit 122 then narrows the first narrowed images selected by the first narrowing unit 121 down to the plurality of images described above. In the present embodiment, the second narrowing unit 122 further narrows the first narrowed images by image processing or the like. More specifically, the second narrowing unit 122 narrows them down to a plurality of time-series consecutive images, captured by the in-vehicle camera, that carry a first annotation indicating two first regions (persons 60 and 61) and that include at least one image in which, as in frame 101d of FIG. 7, for example, the two first regions are on the road 1011 and the distance between them is equal to or less than a threshold.

The second narrowing unit 122 then stores the narrowed-down plurality of images in the storage unit 13.
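The second narrowing can be sketched in the same spirit. This is a sketch under stated assumptions, not the method of the description: each frame is represented only by its list of first-region boxes, the road test is passed in as a caller-supplied predicate `on_road`, and the distance between first regions is measured between box centers. None of these choices is fixed by the text.

```python
import math
from typing import Callable, List, Sequence, Tuple

# (x_min, y_min, x_max, y_max) in pixels; an assumed box layout.
BoxT = Tuple[float, float, float, float]
FrameBoxes = Sequence[BoxT]  # the first regions annotated on one frame


def center(box: BoxT) -> Tuple[float, float]:
    """Return the center point of a box."""
    return ((box[0] + box[2]) / 2, (box[1] + box[3]) / 2)


def is_close_on_road(boxes: FrameBoxes, on_road: Callable[[BoxT], bool], max_dist: float) -> bool:
    """True if the frame has two first regions, both on the road, at most `max_dist` apart."""
    if len(boxes) != 2 or not all(on_road(b) for b in boxes):
        return False
    (ax, ay), (bx, by) = center(boxes[0]), center(boxes[1])
    return math.hypot(ax - bx, ay - by) <= max_dist


def second_narrowing(
    sequences: Sequence[Sequence[FrameBoxes]],
    on_road: Callable[[BoxT], bool],
    max_dist: float,
) -> List[Sequence[FrameBoxes]]:
    """Keep each time-ordered sequence that contains at least one qualifying frame."""
    return [
        seq for seq in sequences
        if any(is_close_on_road(frame, on_road, max_dist) for frame in seq)
    ]
```

A whole sequence is kept when any one of its frames qualifies, which mirrors the wording "a plurality of images that include at least one image" satisfying the condition.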

[Storage unit 13]
The storage unit 13 is composed of an HDD (hard disk drive), memory, or the like. The storage unit 13 stores the plurality of images narrowed down by the narrowing unit 12.

[Operation of image processing apparatus 10]
Next, the operation of the image processing apparatus 10 configured as described above will be described with reference to FIGS. 8 and 9.

FIG. 8 is a flowchart showing the operation of the narrowing unit 12 of the image processing apparatus 10 in Embodiment 1.

In FIG. 8, the narrowing unit 12 of the image processing apparatus 10 first acquires the annotated data from the storage unit 20.

Next, the narrowing unit 12 performs a first narrowing process that narrows the acquired annotated data using brake information or the like (S90). Specifically, as described above, from all the time-series consecutive images captured by the in-vehicle camera mounted on the vehicle and linked to information indicating the brake strength or acceleration of the vehicle, the narrowing unit 12 selects the first narrowed images: the plurality of images from the time point at which the brake strength or acceleration exceeds a threshold back to a time point a fixed period earlier.

Next, the narrowing unit 12 performs a second narrowing process that further narrows the first narrowed images obtained in S90 by image processing or the like (S91). Specifically, as described above, from the first narrowed images the narrowing unit 12 selects a plurality of time-series consecutive images, captured by the in-vehicle camera, that carry a first annotation indicating two first regions each showing a person and that include at least one image in which the two first regions are on the travel route and the distance between them is equal to or less than a threshold. The narrowing unit 12 then stores the plurality of images selected (narrowed down) by the second narrowing process in the storage unit 13.

FIG. 9 is a flowchart showing the operation of the annotation unit 11 of the image processing apparatus 10 in Embodiment 1.

In FIG. 9, the annotation unit 11 of the image processing apparatus 10 first performs an acquisition process that acquires, from the storage unit 13, the plurality of images narrowed down by the second narrowing process (S101).

Next, the annotation unit 11 performs a judgment process that judges the position of each of the two or more first regions in each of the images acquired in S101, starting from the image at the last time in the time series and going back in time frame by frame (S102).

Next, the annotation unit 11 performs a determination process (S103): it identifies the first image, that is, the image at the first time at which S102 first judged that none of the two or more first regions is located on the travel route, and determines the region between the two or more first regions in the identified first image as the second region.
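The backward scan of S102 and the identification of the first image in S103 can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: frames are given in time order as lists of first regions, the region type is left generic, and `on_road` is an assumed caller-supplied predicate that stands in for the travel-route test.

```python
from typing import Callable, Optional, Sequence, TypeVar

Region = TypeVar("Region")  # any representation of a first region


def find_first_image(
    frames: Sequence[Sequence[Region]],
    on_road: Callable[[Region], bool],
) -> Optional[int]:
    """Scan from the last frame backwards in time.

    Return the index of the first frame encountered in which none of the
    first regions is on the travel route, or None if no such frame exists.
    """
    for index in range(len(frames) - 1, -1, -1):
        if not any(on_road(region) for region in frames[index]):
            return index
    return None
```

With four frames in which the two persons step onto the road from the third frame onward, the scan skips the last two frames and returns the index of the second frame, which corresponds to frame 101b in the example of FIG. 3.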

Next, the annotation unit 11 performs an adding process that adds a second annotation indicating the second region determined in S103 (S104). The annotation unit 11 then outputs the plurality of images with the second annotation to the storage unit 30 as learning data.

In this way, the image processing apparatus 10 can perform image processing that mechanically adds, to the annotated data stored in the storage unit 20, a further annotation that would require advanced recognition if left to a worker, and can output the result to the storage unit 30 as learning data.

In the above description, the image processing apparatus 10 performs the first narrowing process (S90) and the second narrowing process (S91) on the annotated data stored in the storage unit 20, but the present embodiment is not limited to this. That is, the description assumed that, before the first narrowing process (S90), crowdsourcing workers had added to all the images a first annotation indicating each first region (each person region present in the image) to generate the annotated data, but this is not the only possibility.

In other words, the image processing apparatus 10 may acquire all the time-series consecutive images captured by the in-vehicle camera without any annotation and perform the first narrowing process (S90) on all the acquired images. In this case, crowdsourcing workers are asked to add, to the plurality of images that have undergone the first narrowing process (the first narrowed images), an annotation indicating each region that shows a moving object present in the image. That is, before the second narrowing process (S91), the image processing apparatus 10 has crowdsourcing workers add, to the first narrowed images selected in the first narrowing process (S90), an annotation indicating each first region, that is, each person region present in those images.

[Effects, etc.]
As described above, the present embodiment makes it possible to realize an image processing method and the like that can suppress variation in the quality of learning data including the plurality of images.

Here, the effects of the image processing method and the like of the present embodiment will be described with reference to FIG. 10. FIG. 10 is an explanatory diagram of the effects of Embodiment 1. The plurality of images shown in (a) of FIG. 10 is an example of time-series consecutive images of the scene ahead (in the traveling direction) captured by an in-vehicle camera mounted on a vehicle. Part (b) of FIG. 10 shows the brake strength or acceleration linked to each of the images shown in (a) of FIG. 10. Elements similar to those in FIG. 2 and other figures are given the same reference numerals, and their detailed description is omitted.

Parts (a) and (b) of FIG. 10 show that, in frame 101c at time tp, the driver of the vehicle carrying the in-vehicle camera begins to see person 60 and person 61 approaching each other and crossing in front of the vehicle, brakes hard so as not to hit them, and the acceleration changes as a result.

Therefore, in order to notify the driver of the region between person 60 and person 61, which becomes a danger region for the traveling vehicle because the two persons would cross in front of the vehicle if they approach each other, a learning process must be performed using images to which that danger region has been added as an annotation (ground-truth information).

However, crowdsourcing workers are prone to differ individually in recognizing that the region between persons is a danger region the vehicle must not pass through because persons approaching each other would cross in front of the vehicle and endanger its travel. The work of annotating such a danger region is therefore prone to individual differences among workers. This is because recognizing, for example from frame 101b at time t1 shown in (a) of FIG. 10, that the region between person 60 and person 61 is a danger region for the traveling vehicle (since the two persons would cross in front of the vehicle if they approach each other) requires advanced recognition, such as experience or comparison with the image at the next time.

In contrast, the annotation work of adding first regions indicating person 60 and person 61 to all the images shown in (a) of FIG. 10 (frames 101a to 101d) is unlikely to show individual differences among crowdsourcing workers. This is because a worker can add the annotations (first regions) indicating person 60 and person 61 exactly as they appear in the image, so no advanced recognition is required.

Accordingly, in the image processing method of the present embodiment, crowdsourcing workers only need to add annotations indicating objects, such as persons, that are visible in the time-series images captured by the in-vehicle camera.

The annotation of the region between persons (the second region), which becomes a danger region for the traveling vehicle because the persons would cross in front of the vehicle if they approach each other, is left to a machine such as the image processing apparatus 10 or a computer that executes the image processing method. Specifically, the machine first narrows the images down to a plurality of time-series consecutive images, captured by an in-vehicle camera mounted on the vehicle, that carry a first annotation indicating two first regions each showing a person and that include at least one image in which the two first regions are on the travel route of the vehicle and the distance between them is equal to or less than a threshold. Then, going back in time through the plurality of images, the machine identifies the first image at the first time at which neither of the two first regions is judged to be on the travel route, determines the region between the first regions in the identified first image as the second region, and adds a second annotation indicating the danger region (second region).

As described above, the image processing method and the like of the present embodiment allow crowdsourcing workers to add the first annotation indicating each first region, that is, each person region present in an image. The method can also mechanically add, to a plurality of images captured by an in-vehicle camera, the second annotation indicating the second region, which would require advanced recognition if left to a crowdsourcing worker. This suppresses variation in the quality of the learning data including the plurality of images.

In the image processing method and the like of the present embodiment described above, the first image at the first time at which neither of the two first regions is judged to be on the travel route is identified, and the region between the first regions in the identified first image is determined as the second region, but the method is not limited to this. A second image may instead be identified in which the two first regions, each showing a person, are on the travel route of the vehicle and the distance between the two first regions is equal to or less than a threshold. In this case, the region between the first regions may be determined as the second region in the images from the second image back to an image a predetermined time earlier (for example, the first image), including the second image itself.

(Modification 1)
FIG. 11A is a diagram showing an example of the second region determined by the annotation unit 11 in Embodiment 1. FIG. 11B is a diagram showing an example of the second region determined by the annotation unit 11 in Modification 1.

In Embodiment 1, as shown in FIG. 11A, the second region was described as a two-dimensional region between two first regions, but it is not limited to this. When distance information is available for each of the persons indicated by the two first regions in the plurality of images acquired by the annotation unit 11, the annotation unit 11 may determine, as shown in FIG. 11B, the space connecting the two persons (person regions) as the second region.

(Modification 2)
In Embodiment 1, two person regions indicated by two first regions were used as the example of objects indicated by the two or more first regions, but the objects are not limited to this. The two or more first regions may indicate three or more person regions. This modification describes a case in which four first regions indicate four person regions.

FIG. 12 is a diagram showing an example of the plurality of images acquired by the annotation unit 11 in Modification 2. FIG. 13 is a diagram showing an example of the second region determined by the annotation unit 11 in Modification 2.

The annotation unit 11 in Modification 2 acquires a plurality of images including frame 103i and frame 103n, as shown in FIG. 12. Each of the images shown in FIG. 12 contains the road 1031 and persons 62, 63, 64, and 65. In addition, four first regions (first annotations) indicating persons 62, 63, 64, and 65 have been added to the images shown in FIG. 12.

Here, the annotation unit 11 in Modification 2 identifies frame 103a (not shown) as the first image at the first time at which none of the four first regions indicating persons 62 to 65 is judged to be on the road 1031 in the plurality of images shown in FIG. 12. The annotation unit 11 in Modification 2 also identifies frame 103n at time t2 as the second image, in which the four first regions indicating persons 62 to 65 are on the road 1031, the travel route of the vehicle, and the distance between these first regions is equal to or less than the threshold.

Then, as shown in FIG. 13, the annotation unit 11 in Modification 2 determines as the second region, for example, the region between the four first regions indicating persons 62 to 65 in frame 103i, one of the plurality of images from frame 103a (the first image) through frame 103n (the second image).

In this way, even when three or more first regions indicate three or more person regions, the image processing method and the like of this modification can likewise mechanically determine the second region, which is a danger region for the traveling vehicle, and mechanically add the second annotation indicating that second region.

(Modification 3)
In Embodiment 1 and Modifications 1 and 2, each first region was described as indicating a person, but it is not limited to this. The object indicated by a first region may be a parked or stopped automobile. This modification is described with reference to FIGS. 14 and 15 for a case in which one of the two first regions is a person region and the other is an automobile region indicating a parked or stopped automobile.

FIG. 14 is a diagram showing an example of the plurality of images acquired by the annotation unit 11 in Modification 3. FIG. 15 is a diagram showing an example of the second region determined by the annotation unit 11 in Modification 3.

The annotation unit 11 in Modification 3 acquires a plurality of images including frames 104a, …, 104i, …, 104n shown in FIG. 14. Each of the images shown in FIG. 14 contains the road 1041, an automobile 66, and a person 67. In addition, two first regions (first annotations) indicating the automobile 66 and the person 67 have been added to the images shown in FIG. 14.

The annotation unit 11 in Modification 3 identifies frame 104a at time t1 as the first image at the first time at which the first region indicating the person 67 is judged not to be on the road 1041 in the plurality of images shown in FIG. 14. The annotation unit 11 in Modification 3 also identifies frame 104n at time t2 as the second image, in which the two first regions indicating the automobile 66 and the person 67 are on the road 1041, the travel route of the vehicle, and the distance between the two first regions is equal to or less than the threshold.

Then, as shown in frame 104i of FIG. 15, the annotation unit 11 in Modification 3 determines as the second region, for example, the region between the two first regions indicating the automobile 66 and the person 67 in each of the plurality of images (frames 104a to 104n) from frame 104a (the first image) through frame 104n (the second image).

In this way, even when one of the objects indicated by the two first regions is a parked or stopped automobile, the image processing method and the like of this modification can, as above, mechanically determine the second region, which is a danger region where a person will cross in front of the traveling vehicle and where the vehicle would collide if it passed between the two objects, and can mechanically add the second annotation indicating that second region.

(Modification 4)
In Embodiment 1 and Modifications 1 to 3 above, the annotation unit 11 determines the second region and adds the second annotation indicating the determined second region, but it is not limited to this. In addition to determining the second region, which is a danger region for the traveling vehicle, the annotation unit 11 may further determine a risk level for the second region. In this case, the annotation unit 11 adds a second annotation that indicates the risk level in addition to the second region. The method of determining the risk level of the second region is described concretely below.

<First example: method of determining the risk level>
FIG. 16 is a diagram showing an example of the second annotation added by the annotation unit 11 in the first example of Modification 4.

Suppose that the annotation unit 11 in the first example of Modification 4 has acquired the plurality of images shown in FIG. 14 and, as shown in FIG. 15, has determined the second region, which is a danger region for the traveling vehicle because the objects indicated by the first regions would cross in front of the vehicle if they approach each other and could collide with it. The operation by which the annotation unit 11 determines the second region was described in Modification 3, so its description is omitted here.

本変形例では、さらに、アノテーション部１１は、決定した第２領域の面積に応じて危険度を決定する。より具体的には、アノテーション部１１は、第２領域の大きさが小さいほど高い値となる危険度を決定する。第２領域の面積が小さいほど、第１領域に示される自動車６６および人物６７の間を車両が走行すると自動車６６および人物６７と衝突する可能性が高いため車両は通ってはいけないからである。なお、第２領域の面積が所定の面積以下である場合には、危険度１．０（危険度１００％）と決定してもよい。 In this modification, the annotation unit 11 further determines the degree of risk according to the area of the determined second region. More specifically, the annotation unit 11 determines a degree of risk that takes a higher value as the second region becomes smaller. This is because the smaller the area of the second region, the more likely the vehicle is to collide with the automobile 66 or the person 67 if it travels between them, so the vehicle must not pass through. If the area of the second region is equal to or smaller than a predetermined area, the degree of risk may be set to 1.0 (100%).
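
The area-based rule above can be illustrated with a minimal sketch. The text only states that the degree of risk rises as the second region shrinks and may be fixed at 1.0 at or below a predetermined area; the linear mapping, the function name `risk_from_area`, and the parameters `min_area` and `max_area` are assumptions introduced here for illustration, not part of the described method.

```python
def risk_from_area(area: float, min_area: float, max_area: float) -> float:
    """Return a degree of risk in [0.0, 1.0] that rises as the second region shrinks.

    Hypothetical mapping: linear between min_area and max_area.
    The source only requires "smaller area -> higher risk" and allows
    a risk of 1.0 at or below a predetermined area (min_area here).
    """
    if min_area <= 0 or max_area <= min_area:
        raise ValueError("expected 0 < min_area < max_area")
    if area <= min_area:
        # At or below the predetermined area: risk 1.0 (100%).
        return 1.0
    if area >= max_area:
        # Assumption: a very wide gap is treated as risk 0.0.
        return 0.0
    return (max_area - area) / (max_area - min_area)
```

Any monotonically decreasing function of the area would satisfy the description equally well; the linear form is chosen only because it is the simplest to read.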

そして、アノテーション部１１は、決定された第２領域と当該第２領域の危険度とを示す第２アノテーションを付与する。より具体的には、変形例４の第１例では、アノテーション部１１は、第２領域を示す第２アノテーションに、さらに、第２領域の面積が小さいほど高い値を示す危険度を含める。例えば図１６に示す例では、アノテーション部１１は、時刻ｔ１におけるフレーム１０４ａの第２領域には、危険度０．７を示す第２アノテーションを付与し、時刻ｔｉにおけるフレーム１０４ｉの第２領域には、危険度１．０を示す第２アノテーションを付与している。なお、フレーム１０４ｉの第２領域を車両が通ると、確実に人物６７と衝突することから、危険度１．０を示す第２アノテーションを付与している。 Then, the annotation unit 11 adds a second annotation indicating the determined second region and the degree of risk of that second region. More specifically, in the first example of Modification 4, the annotation unit 11 further includes, in the second annotation indicating the second region, a degree of risk that takes a higher value as the area of the second region becomes smaller. For example, in the example shown in FIG. 16, the annotation unit 11 adds a second annotation indicating a degree of risk of 0.7 to the second region of the frame 104a at time t1, and a second annotation indicating a degree of risk of 1.0 to the second region of the frame 104i at time ti. The degree of risk of 1.0 is given to the frame 104i because a vehicle passing through its second region would certainly collide with the person 67.

＜第２例：危険度の決定方法＞ <Second example: risk determination method>
図１７は、変形例４の第２例におけるアノテーション部１１が付与する第２アノテーションの一例を示す図である。 FIG. 17 is a diagram illustrating an example of the second annotation added by the annotation unit 11 in the second example of Modification 4.

変形例４の第２例におけるアノテーション部１１も、図１４に示す複数の画像を取得し、図１５に示すように、第１領域が示す物体同士が接近すると車両の前方を横切ることになり、当該車両と衝突する可能性があるので、車両が走行する上での危険領域である第２領域を決定したとする。なお、アノテーション部１１が第２領域を決定する動作については、変形例３で説明したのでここでの説明も省略する。 Assume that the annotation unit 11 in the second example of Modification 4 also acquires the plurality of images shown in FIG. 14 and, as shown in FIG. 15, determines the second region, which is a dangerous region for the traveling vehicle because the objects indicated by the first regions would cross in front of the vehicle as they approach each other and could collide with the vehicle. The operation by which the annotation unit 11 determines the second region was described in Modification 3, so its description is omitted here as well.

本変形例では、さらに、アノテーション部１１は、決定した第２領域の危険度を当該第２領域内で重み付けて決定する。より具体的には、アノテーション部１１は、第２領域内を２つに分け、より大きく移動する人物等を示す一方の第１領域側の領域を、他方の第１領域側の領域よりも高い値となるように重み付けた危険度を決定する。車両は、走行経路を通る走行する上で、大きく移動する人物に衝突する可能性が高いと言えるからである。 In this modification, the annotation unit 11 further determines the degree of risk of the determined second region by weighting it within the second region. More specifically, the annotation unit 11 divides the second region into two and determines a weighted degree of risk such that the part on the side of the first region indicating the person or other object that moves more takes a higher value than the part on the side of the other first region. This is because a vehicle traveling along its travel route can be said to be more likely to collide with a person who moves a large distance.

そして、アノテーション部１１は、決定された第２領域と当該第２領域の危険度とを示す第２アノテーションを付与する。より具体的には、変形例４の第２例では、アノテーション部１１は、第２領域を示す第２アノテーションに、さらに、第２領域を構成する一方側領域および他方側領域に異なる危険度であって、第２領域を挟む２つの第１領域のうち移動の大きさが大きい第１領域がある側の一方側領域または他方側領域の方が高い値となる危険度を含める。例えば図１７に示す例では、アノテーション部１１は、時刻ｔ１におけるフレーム１０４ａの第２領域のうちの人物６７近傍の領域には、危険度１．０を示す第２アノテーションを付与し、当該第２領域のうちの自動車６６近傍の領域には、危険度０．７を示す第２アノテーションを付与している。 Then, the annotation unit 11 adds a second annotation indicating the determined second region and the degree of risk of that second region. More specifically, in the second example of Modification 4, the annotation unit 11 further includes, in the second annotation indicating the second region, degrees of risk that differ between the one-side region and the other-side region that make up the second region, such that whichever of the two lies on the side of the first region with the larger movement, out of the two first regions sandwiching the second region, takes the higher value. For example, in the example shown in FIG. 17, the annotation unit 11 adds a second annotation indicating a degree of risk of 1.0 to the part of the second region of the frame 104a at time t1 that is near the person 67, and a second annotation indicating a degree of risk of 0.7 to the part of that second region that is near the automobile 66.

ここで、アノテーション部１１は、時刻ｔｉにおけるフレーム１０４ｉの第２領域全体に対して危険度１．０を示す第２アノテーションを付与している。これは上記の人物６７近傍の領域の面積が第２領域の面積以下になったためである。なお、第２領域の面積が所定の面積以下である場合には、上記の重み付けた危険度を付与せず、均一の危険度を付与しているとしてもよい。 Here, the annotation unit 11 adds a second annotation indicating a degree of risk of 1.0 to the entire second region of the frame 104i at time ti. This is because the area of the region near the person 67 described above has become equal to or smaller than the area of the second region. When the area of the second region is equal to or smaller than a predetermined area, a uniform degree of risk may be given instead of the weighted degrees of risk described above.
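
The weighting in this second example can be sketched as follows, under stated assumptions. The box layout `(x_min, y_min, x_max, y_max)`, the split at the horizontal midpoint, the function name `weighted_risks`, and the default values 1.0 and 0.7 (taken from the FIG. 17 example) are illustrative choices; the text does not specify how the second region is divided or how movement is measured.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max), hypothetical layout


def weighted_risks(
    second_region: Box,
    left_motion: float,
    right_motion: float,
    high: float = 1.0,
    low: float = 0.7,
    min_area: float = 0.0,
) -> List[Tuple[Box, float]]:
    """Split the second region in two and give the higher risk to the half
    adjacent to the first region that moves more.

    left_motion / right_motion are the movement magnitudes of the first
    regions bordering the left and right edges of the second region.
    """
    x0, y0, x1, y1 = second_region
    area = max(0.0, x1 - x0) * max(0.0, y1 - y0)
    if area <= min_area:
        # Small region: one uniform risk for the whole region
        # (assumption: the uniform value is `high`).
        return [(second_region, high)]
    mid = (x0 + x1) / 2  # assumption: split at the horizontal midpoint
    left_half = (x0, y0, mid, y1)
    right_half = (mid, y0, x1, y1)
    if left_motion >= right_motion:
        return [(left_half, high), (right_half, low)]
    return [(left_half, low), (right_half, high)]
```

For instance, with a person approaching from the left and a parked automobile on the right, `weighted_risks(region, left_motion=5.0, right_motion=0.0)` assigns the higher value to the left half.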

以上のように、本変形例の画像処理方法等によれば、車両が走行する上で危険な危険領域となる第２領域を示す第２アノテーションに、さらに車両が走行する上での当該第２領域の危険度を含めることができる。 As described above, according to the image processing method and the like of this modification, the second annotation indicating the second region, which is a dangerous region for the traveling vehicle, can further include the degree of risk that the second region poses to the traveling vehicle.

（実施の形態２） (Embodiment 2)
実施の形態１では、車載カメラにより撮影された時系列画像において画像内に見える人物などの物体を示すアノテーションをクラウドソーシングのワーカに行わせるとして説明した。しかし、ワーカの作業品質は一定ではないので、車載カメラにより撮影された時系列画像のうち一部の画像において、人物などの物体が画像内に見えていても当該物体があることを示す第１領域を示すアノテーションが付されていない場合も考えられる。 Embodiment 1 was described on the assumption that crowdsourcing workers attach annotations indicating objects, such as persons, that are visible in the time-series images captured by the in-vehicle camera. However, because the work quality of workers is not uniform, there may be cases where, in some of the time-series images captured by the in-vehicle camera, an object such as a person is visible in the image but no annotation indicating a first region, which marks the presence of that object, has been attached.

以下、この場合について実施の形態２として実施の形態１と異なるところを中心に説明する。 Hereinafter, this case will be described as a second embodiment focusing on the points different from the first embodiment.

［画像処理装置１０Ａの構成］ [Configuration of Image Processing Device 10A]
実施の形態２に係る画像処理装置１０Ａは、実施の形態１に係る画像処理装置１０と比較して、アノテーション部１１Ａの判定部１１２Ａの構成が異なる。それ以外の構成は、実施の形態１に係る画像処理装置１０と同様のため説明は省略する。 The image processing apparatus 10A according to the second embodiment is different from the image processing apparatus 10 according to the first embodiment in the configuration of the determination unit 112A of the annotation unit 11A. The rest of the configuration is the same as that of the image processing apparatus 10 according to the first embodiment, and therefore its description is omitted.

［判定部１１２Ａ］ [Determination unit 112A]
図１８は、実施の形態２における判定部１１２Ａの詳細構成の一例を示す図である。 FIG. 18 is a diagram illustrating an example of a detailed configuration of the determination unit 112A according to the second embodiment.

本実施の形態では、判定部１１２Ａは、取得部１１１が取得した複数の画像において、時系列上の最後の時刻の画像から時刻を順に遡りながら、第１アノテーションが付与されていない最初の画像を判定する。判定部１１２Ａは、判定した最初の画像の第３時刻の時系列上の次の時刻における画像中の第１領域を、第３時刻の画像から時刻を時系列順に遡りながら、前記車両の移動方向と垂直方向にずらした当該画像それぞれの中の位置に第１領域が存在するか否かを画像処理により判定する。 In the present embodiment, the determination unit 112A goes back in time sequentially from the image at the last time in the time series among the plurality of images acquired by the acquisition unit 111 and determines the first image to which no first annotation has been added. Taking the first region in the image at the time immediately following, in the time series, the third time of that first image, the determination unit 112A then goes back in time from the image at the third time and determines by image processing whether the first region exists at a position, in each of those images, shifted in the direction perpendicular to the moving direction of the vehicle.

［画像処理装置１０Ａの動作］ [Operation of Image Processing Device 10A]
次に、以上のように構成された画像処理装置１０Ａの動作について、図１９〜図２１を用いて説明する。 Next, the operation of the image processing apparatus 10A configured as described above will be described with reference to FIGS. 19 to 21.

図１９は、実施の形態２における画像処理装置１０Ａの判定部１１２Ａの動作を示すフローチャートである。図２０は、実施の形態２における取得部１１１が取得する複数の画像の一例を示す図である。図２１は、図２０に示す複数の画像に対して実施の形態２における判定部１１２Ａが行う画像処理の説明図である。なお、図２〜図４と同様の要素には同一の符号を付しており、詳細な説明は省略する。 FIG. 19 is a flowchart showing the operation of the determination unit 112A of the image processing apparatus 10A according to the second embodiment. FIG. 20 is a diagram showing an example of a plurality of images acquired by the acquisition unit 111 according to the second embodiment. FIG. 21 is an explanatory diagram of image processing performed by the determination unit 112A in the second embodiment on the plurality of images shown in FIG. The same elements as those in FIGS. 2 to 4 are designated by the same reference numerals, and detailed description thereof will be omitted.

まず、画像処理装置１０Ａの取得部１１１は、記憶部２０から、アノテーション付与データである複数の画像を取得する。本実施の形態では、取得部１１１が取得する複数の画像の一部の画像において、人物６０または人物６１が画像内に見えていても人物６０または人物６１があることを示す第１領域（第１アノテーション）が付されていない。図２０に示す例では、一部の画像（フレーム１０１ａ、フレーム１０１ｂ）において人物６０または人物６１が画像（フレーム）内に見えていても第１領域が付されていない。 First, the acquisition unit 111 of the image processing apparatus 10A acquires a plurality of images, which are annotation-added data, from the storage unit 20. In the present embodiment, in some of the plurality of images acquired by the acquisition unit 111, the person 60 or the person 61 is visible in the image but no first region (first annotation) indicating the presence of the person 60 or the person 61 has been attached. In the example shown in FIG. 20, in some images (frame 101a and frame 101b), the person 60 or the person 61 is visible in the image (frame) but no first region has been attached.

次に、判定部１１２Ａは、取得部１１１が取得した複数の画像において、時系列上の最後の時刻の画像から時刻を順に遡りながら、第１アノテーションが付与されていない最初の画像を判定する第１判定処理を行う（Ｓ１０２１）。例えば、判定部１１２Ａは、図２０に示す複数の画像（フレーム１０１ａ〜フレーム１０１ｄ）において、時系列上の最後の時刻の画像であるフレーム１０１ｄから時刻を順に遡りながら、第１アノテーションすなわち第１領域が付与されていない最初の画像であるフレーム１０１ｂを判定する。 Next, the determination unit 112A performs a first determination process of going back in time sequentially from the image at the last time in the time series among the plurality of images acquired by the acquisition unit 111 and determining the first image to which no first annotation has been added (S1021). For example, in the plurality of images (frames 101a to 101d) shown in FIG. 20, the determination unit 112A goes back in time sequentially from the frame 101d, which is the image at the last time in the time series, and determines the frame 101b, which is the first image to which no first annotation, that is, no first region, has been added.
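
The first determination process (S1021) amounts to a backward scan over the time-ordered frames. A minimal sketch follows; the frame representation (a dict with a `first_regions` list) and the function name are assumptions made for illustration only.

```python
from typing import Optional, Sequence


def first_unannotated_from_end(frames: Sequence[dict]) -> Optional[int]:
    """Walk back from the last frame and return the index of the first frame
    that carries no first annotation, or None if every frame is annotated.

    Each frame is assumed to be a dict whose "first_regions" entry lists the
    annotated first regions (an empty or missing list means "no annotation").
    """
    for index in range(len(frames) - 1, -1, -1):
        if not frames[index].get("first_regions"):
            return index
    return None


# Frames 101a to 101d as in the FIG. 20 example: 101a and 101b lack annotations.
frames = [
    {"name": "101a", "first_regions": []},
    {"name": "101b", "first_regions": []},
    {"name": "101c", "first_regions": [(40, 20, 10, 30)]},
    {"name": "101d", "first_regions": [(45, 20, 10, 30)]},
]
index = first_unannotated_from_end(frames)  # index of frame 101b
```

The frame following the returned index (101c in this example) is the one whose first region is then used as the starting point for the second determination process.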

次に、判定部１１２Ａは、判定した最初の画像の第３時刻の時系列上の次の時刻における画像中の第１領域を、第３時刻の画像から時刻を時系列順に遡りながら、車両の移動方向と垂直方向にずらした当該画像それぞれの中の位置に第１領域が存在するかを画像処理により判定する第２判定処理を行う（Ｓ１０２２）。例えば、図２１に示すように、判定部１１２Ａは、フレーム１０１ｂの時刻ｔ３（第３時刻）の時系列上の次の時刻ｔ４におけるフレーム１０１ｃ中の第１領域を、時刻ｔ３のフレーム１０１ｂから時刻を時系列順に遡りながら、車両の移動方向と垂直方向にずらした当該画像（フレーム１０１ｂ〜フレーム１０１ａ）それぞれの中の位置に第１領域が存在するかを画像処理により判定する。図２１に示す例では、判定部１１２Ａは、フレーム１０１ａ〜フレーム１０１ｂにおいて画像処理により第１領域が存在すると判定している。 Next, the determination unit 112A performs a second determination process (S1022): taking the first region in the image at the time immediately following, in the time series, the third time of the determined first image, it goes back in time from the image at the third time and determines by image processing whether the first region exists at a position, in each of those images, shifted in the direction perpendicular to the moving direction of the vehicle. For example, as shown in FIG. 21, the determination unit 112A takes the first region in the frame 101c at time t4, which immediately follows time t3 (the third time) of the frame 101b in the time series, goes back in time from the frame 101b at time t3, and determines by image processing whether the first region exists at a position, in each of those images (frames 101b to 101a), shifted in the direction perpendicular to the moving direction of the vehicle. In the example shown in FIG. 21, the determination unit 112A determines by image processing that the first region exists in the frames 101a and 101b.
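
The text does not say which image processing is used for the second determination process (S1022). One possible sketch, assuming simple template matching on grayscale pixels, is shown below. The horizontal image axis is assumed to correspond to the direction perpendicular to the vehicle's movement, and all names (`crop`, `find_shifted_region`, `track_backwards`, `max_shift`, `max_error`) are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Image = List[List[int]]          # grayscale pixels, image[row][col]
Box = Tuple[int, int, int, int]  # (left, top, width, height), assumed inside the image


def crop(image: Image, box: Box) -> Image:
    left, top, width, height = box
    return [row[left:left + width] for row in image[top:top + height]]


def mean_sq_diff(a: Image, b: Image) -> float:
    """Mean squared pixel difference between two equally sized patches."""
    total, count = 0, 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += (pa - pb) ** 2
            count += 1
    if count == 0:
        raise ValueError("empty patch")
    return total / count


def find_shifted_region(template: Image, image: Image, box: Box,
                        max_shift: int, max_error: float) -> Optional[Box]:
    """Slide the box horizontally and return the best match, if good enough."""
    left, top, width, height = box
    best: Optional[Tuple[float, Box]] = None
    for shift in range(-max_shift, max_shift + 1):
        new_left = left + shift
        if new_left < 0 or new_left + width > len(image[0]):
            continue  # shifted box would leave the image
        candidate = (new_left, top, width, height)
        error = mean_sq_diff(template, crop(image, candidate))
        if error <= max_error and (best is None or error < best[0]):
            best = (error, candidate)
    return None if best is None else best[1]


def track_backwards(frames: List[Image], start_index: int, box: Box,
                    max_shift: int = 3, max_error: float = 10.0) -> Dict[int, Box]:
    """Starting from the annotated box in frames[start_index], go back in time
    and record where a matching first region is found in each earlier frame."""
    template = crop(frames[start_index], box)
    found: Dict[int, Box] = {}
    for index in range(start_index - 1, -1, -1):
        match = find_shifted_region(template, frames[index], box, max_shift, max_error)
        if match is None:
            break  # region no longer found; stop rewinding
        found[index] = match
        box, template = match, crop(frames[index], match)
    return found
```

A practical system would more likely use an established tracker or detector; the sliding-window comparison here is only meant to make the "shift and check" idea concrete.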

このようにして、判定部１１２Ａは、取得部１１１が取得した複数の画像のうち、第１アノテーションの無い画像に対して、さらに、画像処理により人物等を示す第１領域の有無を判定する。 In this way, the determination unit 112A further determines the presence/absence of the first region indicating a person or the like by image processing with respect to the image without the first annotation among the plurality of images acquired by the acquisition unit 111.

［効果等］ [Effects, etc.]
以上のように、本実施の形態によれば、車載カメラにより撮影された複数の画像の一部の画像において、高度な認識を必要としない第１領域を示す第１アノテーションが付されていない場合でも、複数の画像（映像）を巻き戻しながら第１領域を追跡することで、当該第１領域の有無を画像認識で機械的に判定することができる。つまり、一部の画像に付されているべき第１領域を示す第１アノテーションが付されていない場合でも、画像処理により当該一部の画像に第１領域が有るか否か判定することができる。これにより、車載カメラにより撮影された複数の画像に対して、高度な認識を必要とする第２領域を示す第２アノテーションを機械的に付すことができるので、当該複数の画像を含む学習用データの品質のばらつきを抑制することができる画像処理方法等を実現できる。 As described above, according to the present embodiment, even when some of the plurality of images captured by the in-vehicle camera lack the first annotation indicating the first region, which does not require advanced recognition, the presence or absence of the first region can be determined mechanically by image recognition by tracking the first region while rewinding the plurality of images (video). That is, even when the first annotation indicating the first region that should have been attached to some images is missing, whether the first region exists in those images can be determined by image processing. As a result, the second annotation indicating the second region, which requires advanced recognition, can be attached mechanically to the plurality of images captured by the in-vehicle camera, so that an image processing method and the like capable of suppressing variation in the quality of learning data including the plurality of images can be realized.

（実施の形態３） (Embodiment 3)
実施の形態１では、車載カメラにより撮影された時系列画像において画像内に見える人物などの物体を示すアノテーションをクラウドソーシングのワーカに行わせるとして説明したが、これに限らない。当該時系列画像に対して人物等を示す第１領域およびその第１領域を示す第１アノテーションをワーカではなく、画像処理装置が付すとしてもよい。 Embodiment 1 was described on the assumption that crowdsourcing workers attach annotations indicating objects, such as persons, that are visible in the time-series images captured by the in-vehicle camera, but the present invention is not limited to this. The first region indicating a person or the like and the first annotation indicating that first region may be attached to the time-series images by the image processing apparatus instead of by workers.

以下、この場合について実施の形態３として実施の形態１と異なるところを中心に説明する。 Hereinafter, in this case, the third embodiment will be described focusing on the differences from the first embodiment.

［画像処理装置１０Ｂの構成］ [Configuration of Image Processing Device 10B]
図２２は、実施の形態３における画像処理装置１０Ｂの機能構成の一例を示す図である。なお、図１等と同様の要素には同一の符号を付しており、詳細な説明は省略する。 FIG. 22 is a diagram showing an example of the functional configuration of the image processing apparatus 10B according to the third embodiment. The same elements as those in FIG. 1 and the like are designated by the same reference numerals, and detailed description thereof will be omitted.

図２２に示す画像処理装置１０Ｂは、実施の形態１に係る画像処理装置１０と比較して、アノテーション付与部１４Ｂおよび記憶部２０Ｂとが追加されている点で構成が異なる。それ以外の構成は、実施の形態１に係る画像処理装置１０と同様のため説明は省略する。 The image processing apparatus 10B shown in FIG. 22 is different in configuration from the image processing apparatus 10 according to the first embodiment in that an annotation adding unit 14B and a storage unit 20B are added. The rest of the configuration is the same as that of the image processing apparatus 10 according to the first embodiment, and therefore its description is omitted.

記憶部４０は、ＨＤＤ（ＨａｒｄＤｉｓｋＤｒｉｖｅ）やメモリ等で構成されている。記憶部４０は、車載カメラにより撮影された映像データ（時系列画像）を記憶している。 The storage unit 40 includes an HDD (Hard Disk Drive), a memory, and the like. The storage unit 40 stores video data (time-series images) captured by the in-vehicle camera.

アノテーション付与部１４Ｂは、記憶部４０に記憶されている車載カメラにより撮影された映像データ（時系列画像）を取得する。アノテーション付与部１４Ｂは、取得した映像データ（時系列画像）に対して、画像処理を行うことにより画像内に見えている人物などの物体を示す第１領域およびその第１領域を示すアノテーションを付す。アノテーション付与部１４Ｂは、第１アノテーションを付した映像データ（時系列画像）をアノテーション付与データとして記憶部２０Ｂに出力する。 The annotation adding unit 14B acquires the video data (time-series images) captured by the in-vehicle camera and stored in the storage unit 40. By performing image processing on the acquired video data (time-series images), the annotation adding unit 14B attaches a first region indicating an object, such as a person, visible in an image, together with an annotation indicating that first region. The annotation adding unit 14B outputs the video data (time-series images) with the first annotation attached to the storage unit 20B as annotation-added data.

記憶部２０Ｂは、ＨＤＤ（ＨａｒｄＤｉｓｋＤｒｉｖｅ）やメモリ等で構成されている。記憶部２０Ｂは、アノテーション付与部１４Ｂにより第１アノテーションが付されたデータ（アノテーション付与データ）を記憶する。 The storage unit 20B includes an HDD (Hard Disk Drive), a memory, and the like. The storage unit 20B stores the data (annotation-added data) to which the first annotation has been attached by the annotation adding unit 14B.

［効果等］ [Effects, etc.]
以上のように、本実施の形態によれば、車載カメラにより撮影された映像データ（時系列画像）において、高度な認識を必要としない人物等を示す第１領域およびその第１領域を示すアノテーションを、クラウドソーシングのワーカではなく、機械的に（画像処理装置１０Ｂが）付すことができる。そして、車載カメラにより撮影された複数の画像に対して、さらに、高度な認識を必要とする第２領域を示す第２アノテーションを機械的に付すことができる。 As described above, according to the present embodiment, the first region indicating a person or the like, which does not require advanced recognition, and the annotation indicating that first region can be attached to the video data (time-series images) captured by the in-vehicle camera mechanically (by the image processing apparatus 10B) rather than by crowdsourcing workers. In addition, the second annotation indicating the second region, which requires advanced recognition, can be attached mechanically to the plurality of images captured by the in-vehicle camera.

このようにして、本実施の形態の画像処理方法等によれば、当該複数の画像を含む学習用データの品質のばらつきを抑制することができる画像処理方法等を実現できる。 In this way, according to the image processing method and the like of the present embodiment, it is possible to realize an image processing method and the like that can suppress variation in quality of learning data including the plurality of images.

以上、本発明の一つまたは複数の態様に係る画像処理方法等について、実施の形態に基づいて説明したが、本発明は、この実施の形態に限定されるものではない。本発明の趣旨を逸脱しない限り、当業者が思いつく各種変形を本実施の形態に施したものや、異なる実施の形態における構成要素を組み合わせて構築される形態も、本発明の一つまたは複数の態様の範囲内に含まれてもよい。例えば、以下のような場合も本発明に含まれる。 Although the image processing method and the like according to one or more aspects of the present invention have been described above based on the embodiment, the present invention is not limited to this embodiment. As long as it does not depart from the gist of the present invention, various modifications made by those skilled in the art may be applied to the present embodiment, or a configuration constructed by combining components in different embodiments may be one or more of the present invention. It may be included in the range of the aspect. For example, the following cases are also included in the present invention.

（１）上記の各装置は、具体的には、マイクロプロセッサ、ＲＯＭ、ＲＡＭ、ハードディスクユニット、ディスプレイユニット、キーボード、マウスなどから構成されるコンピュータシステムである。前記ＲＡＭまたはハードディスクユニットには、コンピュータプログラムが記憶されている。前記マイクロプロセッサが、前記コンピュータプログラムにしたがって動作することにより、各装置は、その機能を達成する。ここでコンピュータプログラムは、所定の機能を達成するために、コンピュータに対する指令を示す命令コードが複数個組み合わされて構成されたものである。 (1) Each of the above devices is specifically a computer system including a microprocessor, a ROM, a RAM, a hard disk unit, a display unit, a keyboard, a mouse, and the like. A computer program is stored in the RAM or the hard disk unit. Each device achieves its function by the microprocessor operating according to the computer program. Here, the computer program is configured by combining a plurality of instruction codes indicating instructions to the computer in order to achieve a predetermined function.

（２）上記の各装置を構成する構成要素の一部または全部は、１個のシステムＬＳＩ（ＬａｒｇｅＳｃａｌｅＩｎｔｅｇｒａｔｉｏｎ：大規模集積回路）から構成されているとしてもよい。システムＬＳＩは、複数の構成部を１個のチップ上に集積して製造された超多機能ＬＳＩであり、具体的には、マイクロプロセッサ、ＲＯＭ、ＲＡＭなどを含んで構成されるコンピュータシステムである。前記ＲＡＭには、コンピュータプログラムが記憶されている。前記マイクロプロセッサが、前記コンピュータプログラムにしたがって動作することにより、システムＬＳＩは、その機能を達成する。 (2) Some or all of the constituent elements of each of the above devices may be configured by one system LSI (Large Scale Integration). The system LSI is a super-multifunctional LSI manufactured by integrating a plurality of constituent parts on one chip, and specifically, is a computer system including a microprocessor, ROM, RAM and the like. A computer program is stored in the RAM. The system LSI achieves its function by the microprocessor operating according to the computer program.

（３）上記の各装置を構成する構成要素の一部または全部は、各装置に脱着可能なＩＣカードまたは単体のモジュールから構成されているとしてもよい。前記ＩＣカードまたは前記モジュールは、マイクロプロセッサ、ＲＯＭ、ＲＡＭなどから構成されるコンピュータシステムである。前記ＩＣカードまたは前記モジュールは、上記の超多機能ＬＳＩを含むとしてもよい。マイクロプロセッサが、コンピュータプログラムにしたがって動作することにより、前記ＩＣカードまたは前記モジュールは、その機能を達成する。このＩＣカードまたはこのモジュールは、耐タンパ性を有するとしてもよい。 (3) Some or all of the constituent elements of each of the above devices may be configured with an IC card that can be attached to and detached from each device or a single module. The IC card or the module is a computer system including a microprocessor, ROM, RAM and the like. The IC card or the module may include the above super-multifunctional LSI. The IC card or the module achieves its function by the microprocessor operating according to the computer program. This IC card or this module may be tamper resistant.

（４）本開示は、上記に示す方法であるとしてもよい。また、これらの方法をコンピュータにより実現するコンピュータプログラムであるとしてもよいし、前記コンピュータプログラムからなるデジタル信号であるとしてもよい。 (4) The present disclosure may be the methods described above. Further, it may be a computer program that realizes these methods by a computer, or may be a digital signal including the computer program.

（５）また、本開示は、前記コンピュータプログラムまたは前記デジタル信号をコンピュータで読み取り可能な記録媒体、例えば、フレキシブルディスク、ハードディスク、ＣＤ−ＲＯＭ、ＭＯ、ＤＶＤ、ＤＶＤ−ＲＯＭ、ＤＶＤ−ＲＡＭ、ＢＤ（Ｂｌｕ−ｒａｙ（登録商標）Ｄｉｓｃ）、半導体メモリなどに記録したものとしてもよい。また、これらの記録媒体に記録されている前記デジタル信号であるとしてもよい。 (5) Further, according to the present disclosure, a computer-readable recording medium for reading the computer program or the digital signal, for example, a flexible disk, a hard disk, a CD-ROM, a MO, a DVD, a DVD-ROM, a DVD-RAM, a BD ( It may be recorded on a Blu-ray (registered trademark) Disc), a semiconductor memory, or the like. Further, the digital signal recorded on these recording media may be used.

（６）また、本開示は、前記コンピュータプログラムまたは前記デジタル信号を、電気通信回線、無線または有線通信回線、インターネットを代表とするネットワーク、データ放送等を経由して伝送するものとしてもよい。 (6) Further, the present disclosure may transmit the computer program or the digital signal via an electric communication line, a wireless or wired communication line, a network represented by the Internet, a data broadcast, or the like.

（７）また、本開示は、マイクロプロセッサとメモリを備えたコンピュータシステムであって、前記メモリは、上記コンピュータプログラムを記憶しており、前記マイクロプロセッサは、前記コンピュータプログラムにしたがって動作するとしてもよい。 (7) Further, the present disclosure may be a computer system including a microprocessor and a memory, wherein the memory stores the computer program, and the microprocessor operates according to the computer program.

（８）また、前記プログラムまたは前記デジタル信号を前記記録媒体に記録して移送することにより、または前記プログラムまたは前記デジタル信号を、前記ネットワーク等を経由して移送することにより、独立した他のコンピュータシステムにより実施するとしてもよい。 (8) Another computer independent by recording the program or the digital signal in the recording medium and transferring the program or by transferring the program or the digital signal via the network or the like. It may be implemented by a system.

本発明は、画像処理方法、画像処理装置およびそのプログラムに利用できる。特に、車両が走行する上で人物同士が接近すると車両の前方を横切ることになり、車両と衝突する可能性のある危険領域を機械学習させる際に用いられる学習用データを、品質にばらつきがなく作成するための画像処理方法、画像処理装置およびそのプログラムに利用可能である。 The present invention can be used for an image processing method, an image processing apparatus, and a program therefor. In particular, it can be used for an image processing method, an image processing apparatus, and a program therefor for creating, without variation in quality, learning data used in machine learning of dangerous regions where persons who approach each other would cross in front of a traveling vehicle and could collide with it.

１０、１０Ａ、１０Ｂ画像処理装置 10, 10A, 10B Image processing device
１１、１１Ａアノテーション部 11, 11A Annotation unit
１２絞り込み部 12 Narrowing unit
１３、２０、２０Ｂ、３０、４０記憶部 13, 20, 20B, 30, 40 Storage unit
１４Ｂアノテーション付与部 14B Annotation adding unit
６０、６１、６２、６３、６４、６５、６７人物 60, 61, 62, 63, 64, 65, 67 Person
６６自動車 66 Automobile
１０１ａ、１０１ｂ、１０１ｃ、１０１ｄ、１０２、１０２Ａ、１０３ｉ、１０３ｎ、１０４ａ、１０４ｉ、１０４ｎフレーム 101a, 101b, 101c, 101d, 102, 102A, 103i, 103n, 104a, 104i, 104n Frame
１１１取得部 111 Acquisition unit
１１２、１１２Ａ判定部 112, 112A Determination unit
１１３決定部 113 Determining unit
１１４付与部 114 Assigning unit
１２１第１絞り込み部 121 First narrowing unit
１２２第２絞り込み部 122 Second narrowing unit
１０１１、１０２１、１０３１、１０４１道路 1011, 1021, 1031, 1041 Road

Claims

An acquisition step of acquiring a plurality of images that are captured consecutively in time series by an in-vehicle camera mounted on a vehicle and to which a first annotation indicating two or more first regions, at least one of which is a person region, is attached, the plurality of images including at least one image in which the two or more first regions exist in the travel route of the vehicle and the distance between the first regions is equal to or less than a threshold value;
A determination step of determining, in the plurality of images acquired in the acquisition step, the position of each of the two or more first regions while going back in time sequentially from the image at the last time in the time series;
A determining step of identifying, among the plurality of images, a first image at a first time at which the position of each of the two or more first regions is determined for the first time in the determination step not to be in the travel route, and determining a region between the two or more first regions in the identified first image as a second region;
An assigning step of assigning a second annotation indicating the second region determined in the determining step,
Image processing method.

In the determining step,
A second image in which the two or more first regions are present in the travel route of the vehicle and the distance between the first regions is equal to or less than the threshold value is specified.
An area between the two or more first areas in a plurality of time-series consecutive images included in the specified first image to the second image is determined as the second area,
The image processing method according to claim 1.

The image processing method further comprises
Of all the images taken in chronological order taken by the vehicle-mounted camera mounted on the vehicle and associated with the information indicating the brake strength or acceleration of the vehicle, the brake strength of the vehicle Or a first narrowing step of selecting a first narrowed image which is a plurality of images from a time point when the acceleration is larger than a threshold value to a time point before a certain period,
A second narrowing step of selecting the plurality of images from the first narrowed images selected in the first narrowing step.
The image processing method according to claim 1.

The image processing method further comprises
Before the first narrowing step, a worker step of causing a worker of crowdsourcing to attach a first annotation indicating the first region existing in the image to all the images,
The image processing method according to claim 3.

The image processing method further comprises
Before the second narrowing step, a crowdsourcing worker first indicates the first region existing in the first narrowed image with respect to the first narrowed image selected in the first narrowing step. Including a worker step to add annotations,
The image processing method according to claim 4.

Each of the two or more first areas is a person area indicating a person,
The image processing method according to claim 1.

The two or more first areas include a person area indicating a person and an automobile area indicating a parked vehicle.
The image processing method according to claim 1.

The second area is a dangerous area that may cross the front of the vehicle when objects indicated by the two or more first areas approach each other, and may collide with the vehicle.
The image processing method further comprises
The second annotation added in the assigning step further includes a risk assigning step that includes a risk that becomes higher as the area of the second region becomes smaller.
The image processing method according to claim 1.

The second area is a dangerous area that may cross the front of the vehicle when objects indicated by the two or more first areas approach each other, and may collide with the vehicle.
The image processing method further comprises
A risk assigning step of further including, in the second annotation added in the assigning step, degrees of risk that differ between a one-side region and an other-side region forming the second region, the degree of risk being higher for whichever of the one-side region and the other-side region lies on the side of the first region with the larger amount of movement out of the two first regions sandwiching the second region,
The image processing method according to claim 1.

The determination step is
In a plurality of images acquired in the acquisition step, a first determination step of determining the first image to which the first annotation is not added while sequentially tracing the time from the image at the last time on the time series,
A second determination step of determining, by image processing, whether the first region exists at a position, in each of the images, shifted in a direction perpendicular to the moving direction of the vehicle from the first region in the image at the time immediately following, in the time series, a third time of the first image determined in the first determination step, while going back in time from the image at the third time,
The image processing method according to claim 1.

An acquisition unit that acquires a plurality of images that are captured consecutively in time series by an in-vehicle camera mounted on a vehicle and to which a first annotation indicating two or more first regions, at least one of which is a person region, is attached, the plurality of images including at least one image in which the two or more first regions exist in the travel route of the vehicle and the distance between the first regions is equal to or less than a threshold value;
A determination unit that determines, in the plurality of images acquired by the acquisition unit, the position of each of the two or more first regions while going back in time sequentially from the image at the last time in the time series;
A determining unit that identifies, among the plurality of images, a first image at a first time at which the position of each of the two or more first regions is determined for the first time by the determination unit not to be in the travel route, and determines a region between the two or more first regions in the identified first image as a second region;
An assigning unit that assigns a second annotation indicating the second region determined by the determining unit,
Image processing device.

An acquisition step of acquiring a plurality of images that are captured consecutively in time series by an in-vehicle camera mounted on a vehicle and to which a first annotation indicating two or more first regions, at least one of which is a person region, is attached, the plurality of images including at least one image in which the two or more first regions exist in the travel route of the vehicle and the distance between the first regions is equal to or less than a threshold value;
A determination step of determining, in the plurality of images acquired in the acquisition step, the position of each of the two or more first regions while going back in time sequentially from the image at the last time in the time series;
A determining step of identifying, among the plurality of images, a first image at a first time at which the position of each of the two or more first regions is determined for the first time in the determination step not to be in the travel route, and determining a region between the two or more first regions in the identified first image as a second region;
An assigning step of assigning a second annotation indicating the second region determined in the determining step,
A program that causes a computer to execute.