JP2020201661A - Data structure for learning and image data for learning generation device

Info

Publication number: JP2020201661A
Application number: JP2019107298A
Authority: JP
Inventors: Junichi Takeda
Original assignee: Pioneer Electronic Corp; Increment P Corp
Current assignee: Pioneer Corp; Geotechnologies Inc
Priority date: 2019-06-07
Filing date: 2019-06-07
Publication date: 2020-12-17
Anticipated expiration: 2039-06-07
Also published as: JP7295710B2

Abstract

To facilitate the preparation of learning data for recognizing traffic signs. SOLUTION: A learning image data generation device 10 comprises an acquisition unit 110 and a generation unit 120. The acquisition unit 110 acquires sign processing data from a captured image including a first traffic sign. The sign processing data is data affecting the appearance of the first traffic sign, in other words, a parameter for changing the appearance of a traffic sign. The generation unit 120 generates learning image data for learning a second traffic sign different from the first traffic sign by using an image of the second traffic sign and the sign processing data. The generated learning image data is stored in a learning data storage unit 160 together with attribute data of the image as a data structure. SELECTED DRAWING: Figure 1

Description

The present invention relates to a learning data structure and a learning image data generation device.

When assisting the driver of a vehicle, and when driving a vehicle autonomously, it is preferable that traffic signs can be recognized with high accuracy by image processing.

For example, Patent Document 1 describes the following technique. First, a device mounted on a vehicle recognizes a road sign by image analysis and transmits the recognition result to an information center. The information center then distributes information including this recognition result to other vehicles.

Patent Document 2 describes the following technique. First, a brightness-corrected image is generated by applying a brightness correction to an image taken by an in-vehicle camera. A color-corrected image is also generated by applying a color correction to the same image. Next, object detection is performed using the brightness-corrected image, and object detection is also performed using the color-corrected image. Signs and markings are then recognized using the results of the two object detection processes.

Meanwhile, machine learning has been actively developed in recent years. The accuracy of machine learning is greatly affected by the quality and quantity of the training data. In this regard, Patent Document 3 describes generating a plurality of learning images from a silhouette image by replacing the feature value of the background with another value.

Patent Document 1: Japanese Unexamined Patent Publication No. 2004-171159
Patent Document 2: Japanese Unexamined Patent Publication No. 2018-72893
Patent Document 3: Japanese Unexamined Patent Publication No. 2008-59110

As mentioned above, the accuracy of machine learning is greatly affected by the quality and quantity of the learning data. Traffic signs are installed in a wide variety of places, and their appearance changes greatly depending on the environment at the time. Therefore, in order to recognize traffic signs accurately using the results of machine learning, images taken under various environments must be prepared as learning data.

One example of the problem to be solved by the present invention is to make it easier to prepare learning data for recognizing traffic signs.

The invention according to claim 1 is a learning data structure comprising:
learning image data for learning a second traffic sign, the learning image data being generated using sign processing data that is acquired from a captured image including a first traffic sign and that affects the appearance of the first traffic sign, and an image of the second traffic sign, which is different from the first traffic sign; and
attribute data indicating that the learning image data includes the second traffic sign.

The invention according to claim 4 is a learning image data generation device comprising:
acquisition means for acquiring, from a captured image including a first traffic sign, sign processing data that affects the appearance of the first traffic sign; and
generation means for generating learning image data for learning a second traffic sign different from the first traffic sign, using an image of the second traffic sign and the sign processing data.

The invention according to claim 11 is a learning image data generation device comprising:
acquisition means for acquiring sign processing data, using a first feature space generated with a GAN (Generative Adversarial Network) from a plurality of captured images including a first traffic sign, as the difference between a designated point feature value, which is the feature value of the image corresponding to a point designated in the first feature space, and a standard feature value, which is the feature value in the first feature space of a predetermined standard image showing the first traffic sign; and
generation means for generating learning image data of a second traffic sign different from the first traffic sign, using a sign feature value of an image of the second traffic sign and the sign processing data.

The invention according to claim 14 is a learning image data generation device comprising:
acquisition means for acquiring, from a captured image including a first traffic sign installed on a road whose road type is an expressway, sign processing data that affects the appearance of the first traffic sign; and
generation means for generating learning image data for learning a second traffic sign, which is different from the first traffic sign and is installed on a road whose road type is an expressway, using an image of the second traffic sign, the sign processing data, and a landscape image of an expressway.

The invention according to claim 15 is a learning image data generation device comprising:
acquisition means for acquiring, from a captured image including a first traffic sign, sign processing data that affects the appearance of the first traffic sign; and
generation means for generating learning image data for learning a second traffic sign different from the first traffic sign, using an image of the second traffic sign, the sign processing data, and a landscape image obtained by photographing a landscape,
wherein the captured image and the landscape image were captured in the same time period of the day.

FIG. 1 is a diagram showing the functional configuration of the learning image data generation device according to the first embodiment.
FIG. 2 is a diagram schematically explaining the processing of the generation unit.
FIG. 3 is a diagram explaining an example of the processing performed by the acquisition unit and the generation unit.
FIG. 4 is a diagram showing an example of a captured image stored in the captured image storage unit.
FIG. 5 is a diagram showing an example of a captured image stored in the captured image storage unit.
FIG. 6 is a diagram showing a first example of the data structure of the data stored in the captured image storage unit.
FIG. 7 is a diagram showing a second example of the data structure of the data stored in the captured image storage unit.
FIG. 8 is a diagram showing an example of the data structure of the data stored in the sign processing data storage unit.
FIG. 9 is a diagram showing a hardware configuration example of the learning image data generation device.
FIG. 10 is a diagram showing an example of the sign processing data generation process performed by the acquisition unit.
FIG. 11 is a diagram showing a first example of the learning image data generation process performed by the generation unit.
FIG. 12 is a diagram showing a second example of the learning image data generation process performed by the generation unit.
FIG. 13 is a diagram showing a third example of the learning image data generation process performed by the generation unit.
FIG. 14 is a diagram showing a fourth example of the learning image data generation process performed by the generation unit.
FIG. 15 is a diagram showing the functional configuration of the learning image data generation device according to the second embodiment.
FIG. 16 is a diagram showing a first example of the processing performed by the generation unit according to the second embodiment.
FIG. 17 is a diagram showing a second example of the processing performed by the generation unit according to the second embodiment.
FIG. 18 is a diagram showing the functional configuration of the learning image data generation device according to the third embodiment.

Hereinafter, embodiments of the present invention will be described with reference to the drawings. In all drawings, similar components are denoted by the same reference numerals, and their description is omitted as appropriate.

[First Embodiment]
FIG. 1 is a diagram showing the functional configuration of the learning image data generation device 10 according to the present embodiment. The learning image data generation device 10 includes an acquisition unit 110 and a generation unit 120. The acquisition unit 110 acquires sign processing data from a captured image including a first traffic sign. The sign processing data is data that affects the appearance of the first traffic sign, in other words, a parameter for changing the appearance of a traffic sign. The generation unit 120 generates learning image data for learning a second traffic sign different from the first traffic sign, using an image of the second traffic sign and the sign processing data. The learning image data is processed by machine learning. The learning image data generation device 10 is described in detail below.

In the present embodiment, the acquisition unit 110 acquires captured images from the captured image storage unit 130. The captured image storage unit 130 stores a plurality of captured images. A captured image is an image of a traffic sign that is actually installed, and is generated by, for example, an in-vehicle camera. However, a captured image may be generated by a camera other than an in-vehicle camera, for example, a fixed-point camera or a hand-held camera.

When generating the sign processing data, the acquisition unit 110 uses a standard image showing the first traffic sign. This standard image is, for example, an image of the first traffic sign that contains no background or shading (hereinafter referred to as a template image). However, the standard image may instead be generated by cropping the first traffic sign out of a captured image. In the present embodiment, the acquisition unit 110 acquires the standard image from the sign standard image storage unit 140. The acquisition unit 110 then acquires the sign processing data as the difference between this standard image and the first traffic sign in the captured image. The sign processing data is, for example, a set of coordinates in a feature space, as will be described later with reference to FIG. 6.

The acquisition unit 110 generates one item of sign processing data per captured image. Since the captured image storage unit 130 stores a plurality of captured images, the acquisition unit 110 can generate a plurality of items of sign processing data.

In the example shown in FIG. 1, the acquisition unit 110 stores the sign processing data in the sign processing data storage unit 150. The generation unit 120 then reads the sign processing data from the sign processing data storage unit 150. The sign processing data storage unit 150 is a non-volatile storage unit. However, the sign processing data storage unit 150 may be a volatile storage unit such as a memory. In that case, the sign processing data storage unit 150 temporarily stores the sign processing data generated by the acquisition unit 110.

The acquisition unit 110 generates the sign processing data using, for example, machine learning. One example of this machine learning is a GAN (Generative Adversarial Network), but other methods such as a VAE (Variational Auto Encoder) or a CVAE (Conditional Variational Auto Encoder) may be used.

As described above, the generation unit 120 generates learning image data for learning the second traffic sign, using an image of the second traffic sign and the sign processing data. In the example shown in FIG. 1, the generation unit 120 reads the standard image of the second traffic sign from the sign standard image storage unit 140 and uses it. The generation unit 120 then stores the generated learning image data in the learning data storage unit 160 together with the attribute data of that image. That is, the learning image data and the attribute data together constitute a data structure. The attribute data indicates, at least, that the learning image data includes the second traffic sign.
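As a non-limiting illustration, the data structure formed by the learning image data and its attribute data could be sketched in Python as follows. The class name `LearningRecord`, its field names, the list standing in for the learning data storage unit 160, and the toy pixel values are all assumptions made for this sketch and do not appear in the embodiment.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LearningRecord:
    """One entry of the learning data structure: an image plus its attribute data."""
    image: List[List[int]]   # generated learning image (toy grayscale pixels)
    sign_id: str             # attribute data: the second traffic sign the image contains
    extra_attributes: Dict[str, str] = field(default_factory=dict)  # optional, e.g. road type


def store(storage: List[LearningRecord], record: LearningRecord) -> None:
    """Append a record to a list standing in for the learning data storage unit 160."""
    storage.append(record)


learning_data_storage: List[LearningRecord] = []
store(
    learning_data_storage,
    LearningRecord(
        image=[[0, 255], [255, 0]],
        sign_id="no_parking",  # hypothetical identifier of the second traffic sign
        extra_attributes={"road_type": "general road"},
    ),
)
```

Keeping the sign identifier as a required field reflects the statement that the attribute data indicates, at least, which second traffic sign the image contains; any further attributes are optional in this sketch.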

As described above, the generation unit 120 uses the sign processing data when generating the learning image data, and the sign processing data is generated using machine learning such as a GAN. The learning image data is therefore not image data generated using only general image composition, but image data generated using machine learning.

The sign standard image storage unit 140 may store the standard image of each traffic sign in association with information identifying the type of road on which that traffic sign is installed (for example, general road, expressway, or toll road). In this case, when a road type is designated, a second traffic sign corresponding to that road type can be selected, and learning image data for the selected second traffic sign can be generated. That is, for each road type, learning image data can be generated that includes traffic signs likely to be installed on that type of road.

In particular, as described with reference to FIG. 15 in another embodiment below, when road types are also associated with the background images of signs, a background image matching the road type can be selected, so that better learning image data can be generated.

In addition, certain traffic signs tend to appear on particular road types or in areas with particular regional attributes; for example, warning signs are common in mountainous areas, and no-parking signs are common in densely populated areas. For this reason, road types and regional attributes may be linked to traffic signs. When a second traffic sign is selected, the road type or regional attribute linked to that second traffic sign is selected, and a background image linked to that road type or regional attribute may then be selected. In this way, training image data that is close to reality can be generated.

Alternatively, the road type and regional attributes need not be considered when selecting the background image. In that case, training image data is also generated for exceptional combinations, so the variety of the training image data increases. Thus, with the learning image data generation device 10, training image data can be generated to suit the intended use.
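The two ways of selecting a background image described above could be sketched, purely as an illustration, as follows. The lookup tables, the sign identifiers, the file names, and the function name `choose_background` are hypothetical, and a single "context" key stands in for both road type and regional attribute.

```python
import random
from typing import Dict, List

# Hypothetical links from a traffic sign to road types or regional attributes.
SIGN_TO_CONTEXTS: Dict[str, List[str]] = {
    "falling_rocks_warning": ["mountainous area"],
    "no_parking": ["city"],
}

# Hypothetical links from a road type or regional attribute to background images.
CONTEXT_TO_BACKGROUNDS: Dict[str, List[str]] = {
    "mountainous area": ["mountain_01.png", "mountain_02.png"],
    "city": ["city_01.png", "city_02.png"],
}


def choose_background(sign_id: str, use_attributes: bool, rng: random.Random) -> str:
    """Pick a background image, optionally restricted to contexts linked to the sign."""
    if use_attributes:
        contexts = SIGN_TO_CONTEXTS[sign_id]
    else:
        # Ignore the links so that exceptional combinations are also produced.
        contexts = list(CONTEXT_TO_BACKGROUNDS)
    candidates = [bg for c in contexts for bg in CONTEXT_TO_BACKGROUNDS[c]]
    return rng.choice(candidates)


rng = random.Random(0)
realistic = choose_background("no_parking", use_attributes=True, rng=rng)
varied = choose_background("no_parking", use_attributes=False, rng=rng)
```

With `use_attributes=True` the result is always one of the city backgrounds in this toy table; with `use_attributes=False` any background may be returned, which corresponds to increasing the variety of the training image data.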

As described later, an example of a regional attribute is at least one of a regional characteristic (such as city, suburb, tunnel, or mountainous area) and population density. The regional attribute can be determined from, for example, the road type or general geographic information.

Through the function of the generation unit 120, the learning data storage unit 160 can store a plurality of items of learning image data for the second traffic sign. These items of learning image data are used as training data for machine learning. The result of this machine learning (for example, a classifier) is used to determine whether an image captured by an in-vehicle camera includes the second traffic sign. The learning image data generation device 10 may additionally perform this determination process.

Instead of the learning image data, the learning data storage unit 160 may store coordinates in the feature space described later as the learning data.

In the example shown in FIG. 1, the captured image storage unit 130, the sign standard image storage unit 140, the sign processing data storage unit 150, and the learning data storage unit 160 are all part of the learning image data generation device 10. However, at least one of these storage units may be storage external to the learning image data generation device 10. In that case, the learning image data generation device 10 and the storage are connected to each other via a communication line. At least part of this communication line may be wireless.

FIG. 2 is a diagram schematically explaining the processing of the generation unit 120. The captured image is an image of a first traffic sign that is actually installed. The image is therefore subject to various environmental effects, such as shading and blown-out highlights. The generation unit 120 generates the difference between the captured image and the standard image of the first traffic sign as data for reproducing these effects in another image (that is, as the sign processing data).

The learning image data is then generated by processing the image of the second traffic sign with this sign processing data. The resulting learning image data is an image in which the second traffic sign is subject to the same effects as those in the captured image.

FIG. 3 is a diagram explaining an example of the processing performed by the acquisition unit 110 and the generation unit 120. Most algorithms for machine learning on images extract multiple types of feature values from an image and learn the coordinates of the image in a multidimensional feature space whose coordinate axes are those feature values.

The acquisition unit 110 generates the feature values described above by processing the images. It then generates, as the sign processing data, the difference in this feature space between the coordinates of the captured image (an example of the designated point feature value) and the coordinates of the standard image (an example of the standard feature value), that is, the difference in each feature value. In this case, the sign processing data is expressed, for example, as the vector to the coordinates of the captured image with the coordinates of the standard image as the origin.

The generation unit 120 then moves the coordinates of the second sign image (an example of the sign feature value) in the feature space for the second sign image by the above difference (for example, by adding the vector), and generates the learning image data using the coordinates (feature values) after this movement.
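The vector arithmetic described above can be illustrated with a minimal Python sketch. The feature names and numeric values are hypothetical, and the step that turns the shifted coordinates back into an image (for example, a GAN generator) is outside this sketch.

```python
from typing import Dict

Features = Dict[str, float]  # hypothetical feature names mapped to coordinate values


def sign_processing_data(captured: Features, standard: Features) -> Features:
    """Difference vector from the standard image's coordinates to the captured image's."""
    return {name: captured[name] - standard[name] for name in standard}


def apply_processing(second_sign: Features, delta: Features) -> Features:
    """Move the second sign's coordinates by the difference vector."""
    return {name: second_sign[name] + delta.get(name, 0.0) for name in second_sign}


# First traffic sign: standard (template) image and one captured image.
standard_first = {"yaw": 0.0, "brightness": 1.0, "dirt": 0.0}
captured_first = {"yaw": 15.0, "brightness": 0.5, "dirt": 0.25}

# Second traffic sign: coordinates of its standard image.
standard_second = {"yaw": 0.0, "brightness": 1.0, "dirt": 0.0}

delta = sign_processing_data(captured_first, standard_first)
shifted_second = apply_processing(standard_second, delta)
# shifted_second == {"yaw": 15.0, "brightness": 0.5, "dirt": 0.25}
```

Because one difference vector is obtained per captured image, repeating `apply_processing` with each stored vector would yield as many shifted coordinates for the second sign as there are captured images of the first sign.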

The feature values used in the feature space are, for example, at least two of the following: the orientation of the sign (at least one of yaw, pitch, and roll), brightness, degree of dirt, horizontal flip (H-Flip), and vertical flip (V-Flip). It is preferable to use all of them. When the captured image includes a background, the feature values used in the feature space can also include at least one of the likelihood of a regional attribute (at least one of the distinction between city, suburb, tunnel, mountainous area, and the like, and population density), the amount of greenery, the number of buildings, and the number of vehicles. Furthermore, when a captured image is associated with weather information from the time it was captured, this weather information may be included in the feature values used in the feature space. The weather information used here is, for example, at least one of the amount of sunshine, the presence or absence of rain and the amount of rainfall, the presence or absence of fog, and the presence or absence of snow and the amount of snowfall.

FIGS. 4 and 5 each show an example of the captured images stored in the captured image storage unit 130. As shown in these figures, the captured image storage unit 130 holds a plurality of captured images that include the same traffic sign. In the example shown in FIG. 4, the captured image storage unit 130 stores images cropped to the traffic sign as the captured images. Alternatively, as shown in FIG. 5, the captured image storage unit 130 may store images that include the background as the captured images.

FIG. 6 is a diagram showing a first example of the data structure of the data stored in the captured image storage unit 130. As described above, the captured image storage unit 130 stores a plurality of images that include the same traffic sign (the first traffic sign). In the example shown in this figure, the captured image storage unit 130 stores each captured image in association with the attributes of that image. The image attributes are, for example, at least one of: information identifying the area where the image was taken (for example, latitude and longitude information or address information); information identifying the type of road on which the first traffic sign was installed (for example, general road, expressway, or toll road); information identifying the shape of the sign; and the date and time of capture (or only the time of day). In this way, the acquisition unit 110 can generate sign processing data for each attribute by selecting and processing only the captured images that have the designated attribute.

The captured image storage unit 130 may store a plurality of captured images for each of a plurality of traffic signs, or may store a plurality of captured images for a single traffic sign. In the former case, as shown in FIG. 6, the captured image storage unit 130 stores information identifying the traffic sign (hereinafter referred to as sign identification information) in association with each captured image.

FIG. 7 is a diagram showing a second example of the data structure of the data stored in the captured image storage unit 130. In the example shown in this figure, the captured image storage unit 130 stores the captured images without grouping them by sign type. Instead, it stores each captured image in association with the attribute information of that captured image and with the sign identification information of the traffic sign that the captured image contains.
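A flat record layout of this kind, together with the attribute-based selection performed by the acquisition unit 110, might look like the following sketch. The record fields, identifiers, and file names are assumptions chosen for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CapturedImageRecord:
    """A captured image linked to sign identification information and attributes."""
    image_path: str    # hypothetical file name of the captured image
    sign_id: str       # sign identification information
    road_type: str     # e.g. "general road", "expressway", "toll road"
    region: str        # information identifying where the image was taken
    time_of_day: str   # e.g. "day", "night"


def select_records(records: List[CapturedImageRecord],
                   sign_id: Optional[str] = None,
                   road_type: Optional[str] = None,
                   time_of_day: Optional[str] = None) -> List[CapturedImageRecord]:
    """Return only the records that match every attribute that was specified."""
    return [
        r for r in records
        if (sign_id is None or r.sign_id == sign_id)
        and (road_type is None or r.road_type == road_type)
        and (time_of_day is None or r.time_of_day == time_of_day)
    ]


records = [
    CapturedImageRecord("img_001.png", "speed_limit_50", "general road", "Tokyo", "day"),
    CapturedImageRecord("img_002.png", "speed_limit_50", "expressway", "Saitama", "night"),
    CapturedImageRecord("img_003.png", "no_parking", "general road", "Tokyo", "night"),
]
expressway_images = select_records(records, sign_id="speed_limit_50", road_type="expressway")
```

Leaving a filter argument as `None` means that attribute is not used for selection, so the same function covers both attribute-specific and attribute-independent processing.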

FIG. 8 is a diagram showing an example of the data structure of the data stored in the sign processing data storage unit 150. In the example shown in this figure, the sign processing data storage unit 150 stores a plurality of items of sign processing data (shown as a data group in the figure) for each attribute. Specific examples of this attribute are the same as the attributes associated with the captured images, and include at least one of a regional attribute indicating the area where the traffic sign is located, a road type attribute indicating the type of road on which it is installed, and a shape attribute indicating the shape of the traffic sign. In the example shown in this figure, the sign processing data storage unit 150 stores the sign processing data by road type. However, the sign processing data storage unit 150 may store the sign processing data by region or by sign shape.

The acquisition unit 110 may also generate the sign processing data without regard to attributes. In that case, the acquisition unit 110 stores the plurality of items of sign processing data without linking them to any attribute.
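One possible in-memory layout for the sign processing data storage unit 150, covering both the per-attribute data groups and data not linked to any attribute, is sketched below. The use of `None` as the key for unlinked data and the two-element vectors are assumptions of this sketch.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Vector = List[float]  # one item of sign processing data as a feature-space difference


def group_by_attribute(
    items: List[Tuple[Optional[str], Vector]]
) -> Dict[Optional[str], List[Vector]]:
    """Collect sign processing data into one data group per attribute value."""
    groups: Dict[Optional[str], List[Vector]] = defaultdict(list)
    for attribute, vector in items:
        groups[attribute].append(vector)
    return dict(groups)


items = [
    ("expressway", [10.0, -0.5]),
    ("general road", [-5.0, 0.25]),
    ("expressway", [2.0, -0.125]),
    (None, [0.0, 0.5]),  # generated without regard to any attribute
]
data_groups = group_by_attribute(items)
```

Here the road type is used as the grouping key, as in the example of FIG. 8; a region or sign shape could be used as the key in the same way.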

FIG. 9 is a diagram showing a hardware configuration example of the learning image data generation device 10. The learning image data generation device 10 includes a bus 1010, a processor 1020, a memory 1030, a storage device 1040, an input/output interface 1050, and a network interface 1060.

The bus 1010 is a data transmission path through which the processor 1020, the memory 1030, the storage device 1040, the input/output interface 1050, and the network interface 1060 exchange data with one another. However, the method of connecting the processor 1020 and the other components to one another is not limited to a bus connection.

The processor 1020 is a processor implemented by a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), or the like.

The memory 1030 is a main storage device implemented by a RAM (Random Access Memory) or the like.

The storage device 1040 is an auxiliary storage device implemented by an HDD (Hard Disk Drive), an SSD (Solid State Drive), a memory card, a ROM (Read Only Memory), or the like. The storage device 1040 stores program modules that implement the functions of the learning image data generation device 10 (for example, the acquisition unit 110 and the generation unit 120). The processor 1020 loads each of these program modules into the memory 1030 and executes it, thereby realizing the function corresponding to that program module.

The input/output interface 1050 is an interface for connecting the learning image data generation device 10 to various input/output devices.

The network interface 1060 is an interface for connecting the learning image data generation device 10 to a network. This network is, for example, a LAN (Local Area Network) or a WAN (Wide Area Network). The network interface 1060 may connect to the network by either a wireless or a wired connection.

FIG. 10 is a diagram showing an example of the process by which the acquisition unit 110 generates sign processing data. The acquisition unit 110 acquires an attribute designation (step S102). This designation is input by the user of the learning image data generation device 10 via an input device such as a keyboard, a mouse, or a voice recognition device. The attribute designated here is as described with reference to FIG. 6 and is, for example, at least one of an area attribute indicating the area where the traffic sign exists, a road type attribute indicating the type of road on which the sign is installed, and a shape attribute indicating the shape of the traffic sign.

The acquisition unit 110 then reads a plurality of captured images corresponding to the designated attribute from the captured image storage unit 130. The acquisition unit 110 also identifies the sign identification information associated with each captured image it has read, and reads the standard image corresponding to that sign identification information from the sign standard image storage unit 140 (step S104). Next, the acquisition unit 110 generates sign processing data using the captured image and the standard image it has read. The acquisition unit 110 performs this generation process for each of the plurality of captured images. The acquisition unit 110 then stores the generated pieces of sign processing data in the sign processing data storage unit 150 in association with the attribute acquired in step S102 (step S106).
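Steps S102 to S106 can be sketched as follows. The sketch assumes that a piece of sign processing data is the per-pixel difference between a captured sign image and its standard image, which is one reading of the "difference" recited in the claims. The function names, the record keys (`attribute`, `sign_id`, `image`), and the use of small grayscale images held as nested lists are hypothetical choices made for illustration.

```python
def make_sign_processing_data(captured, standard):
    """Per-pixel difference (captured - standard) of two equal-size grayscale images."""
    return [[c - s for c, s in zip(c_row, s_row)]
            for c_row, s_row in zip(captured, standard)]


def acquire(attribute, captured_store, standard_store):
    """Sketch of steps S102 to S106 for one designated attribute (hypothetical helper)."""
    results = []
    for record in captured_store:
        # Read only captured images matching the designated attribute.
        if record["attribute"] != attribute:
            continue
        # S104: look up the standard image via the sign identification information.
        standard = standard_store[record["sign_id"]]
        results.append(make_sign_processing_data(record["image"], standard))
    # S106: return the data keyed by the attribute it is associated with.
    return {attribute: results}


# Illustrative data only: two captured images of the same sign type.
captured_store = [
    {"attribute": "expressway", "sign_id": "stop", "image": [[90, 200], [210, 95]]},
    {"attribute": "general road", "sign_id": "stop", "image": [[100, 220], [220, 100]]},
]
standard_store = {"stop": [[100, 220], [220, 100]]}

acquired = acquire("expressway", captured_store, standard_store)
```

Only the expressway image contributes here, so `acquired["expressway"]` holds a single difference map that captures how that photograph deviates from the clean standard image.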

FIG. 11 is a diagram showing a first example of the process by which the generation unit 120 generates learning image data. The generation unit 120 acquires a designation of the second traffic sign (step S202). This designation is input by the user of the learning image data generation device 10 via an input device such as a keyboard, a mouse, or a voice recognition device.

Next, the generation unit 120 reads the standard image of the designated second traffic sign from the sign standard image storage unit 140. The generation unit 120 also reads a plurality of pieces of sign processing data from the sign processing data storage unit 150 (step S204). The generation unit 120 then generates learning image data by processing the standard image it has read using the sign processing data. The generation unit 120 performs this generation process for each of the plurality of pieces of sign processing data. As a result, the generation unit 120 generates a plurality of pieces of learning image data from a single standard image.

Next, the generation unit 120 stores the generated pieces of learning image data in the learning data storage unit 160 together with the attribute information of that learning image data (step S206). The attribute information of the learning image data indicates at least that the learning image data includes the second traffic sign.
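The flow of steps S202 to S206 can be sketched as below. As an assumption for illustration, the sign processing data is treated as a per-pixel offset that is added to the standard image and clamped to the 0 to 255 range; the actual processing may differ. The names `apply_sign_processing` and `generate_learning_data` and the record layout are hypothetical.

```python
def apply_sign_processing(standard, processing):
    """Add a per-pixel offset to a grayscale standard image, clamped to 0..255."""
    return [[max(0, min(255, s + d)) for s, d in zip(s_row, d_row)]
            for s_row, d_row in zip(standard, processing)]


def generate_learning_data(sign_id, standard, processing_list):
    """One learning record per piece of sign processing data (sketch of S204 to S206)."""
    return [
        {
            "image": apply_sign_processing(standard, processing),
            # Attribute information: at least which sign the image contains.
            "attributes": {"sign_id": sign_id},
        }
        for processing in processing_list
    ]


# Illustrative data only: one standard image, two pieces of sign processing data.
standard = [[100, 250]]
processing_list = [[[-10, 20]], [[5, -300]]]
records = generate_learning_data("second_sign", standard, processing_list)
```

One standard image yields as many learning images as there are pieces of sign processing data, which is the multiplication effect the paragraph above describes.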

FIG. 12 is a diagram showing a second example of the process by which the generation unit 120 generates learning image data. In the example shown in this figure, the generation unit 120 acquires a designation of the second traffic sign and also acquires an attribute for the learning image data to be generated (step S212). This attribute is, for example, at least one of an area attribute indicating the area where the traffic sign exists and a road type attribute indicating the type of road on which the sign is installed. These are acquired in the same way as in step S202 of FIG. 11.

The generation unit 120 then reads the standard image of the second traffic sign from the sign standard image storage unit 140, and reads the sign processing data corresponding to the acquired attribute from the sign processing data storage unit 150 (step S214). The generation unit 120 then generates learning image data using the standard image of the second traffic sign and the sign processing data it has read, and stores the generated learning image data in the learning data storage unit 160 in association with attribute information (step S216). The process of step S216 is the same as that of step S206 of FIG. 11. However, in addition to indicating that the learning image data includes the second traffic sign, the attribute information associated with the learning image data also includes the attribute acquired in step S212.

Note that if an attribute is associated with the standard image of the second traffic sign in the sign standard image storage unit 140, the generation unit 120 may, instead of acquiring the attribute of the learning image data in step S212, read the attribute associated with the standard image of the second traffic sign from the sign standard image storage unit 140 in step S214.
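The attribute-conditioned variant (steps S212 to S216) can be sketched as follows. It assumes the sign processing data is stored as a mapping from attribute to a list of per-pixel offsets and that processing means adding the offset with clamping; both are illustrative assumptions, as are the function name and the record keys.

```python
def generate_with_attribute(sign_id, attribute, standard_images, processing_store):
    """Sketch of S212 to S216: use only sign processing data matching the attribute."""
    standard = standard_images[sign_id]              # S214: standard image
    selected = processing_store.get(attribute, [])   # S214: matching processing data
    records = []
    for delta in selected:                           # S216: generate and tag
        image = [[max(0, min(255, s + d)) for s, d in zip(s_row, d_row)]
                 for s_row, d_row in zip(standard, delta)]
        # The attribute information holds both the sign and the designated attribute.
        records.append({"image": image, "sign_id": sign_id, "attribute": attribute})
    return records


# Illustrative data only.
standard_images = {"no_entry": [[50, 60]]}
processing_store = {
    "expressway": [[[1, 2]], [[-5, -5]]],
    "general road": [[[9, 9]]],
}
expressway_records = generate_with_attribute(
    "no_entry", "expressway", standard_images, processing_store
)
```

Because the general-road data is never read, every generated image reflects appearance changes observed on expressways only.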

FIG. 13 is a diagram showing a third example of the process by which the generation unit 120 generates learning image data. In the example shown in this figure, no sign processing data is stored in advance in the sign processing data storage unit 150.

In this figure, the process shown in step S222 is the same as the process shown in step S212 of FIG. 12, and the process shown in step S228 is the same as the process shown in step S216 of FIG. 12. However, steps S224 and S226 are performed in place of the process shown in step S214 of FIG. 12.

Specifically, the generation unit 120 reads a plurality of captured images corresponding to the attribute acquired in step S222 from the captured image storage unit 130 (step S224). Next, the generation unit 120 generates a plurality of pieces of sign processing data using the captured images read in step S224 (step S226). Then, in step S228, the generation unit 120 uses the sign processing data generated in step S226.

FIG. 14 is a diagram showing a fourth example of the process by which the generation unit 120 generates learning image data. First, the generation unit 120 reads sign processing data from the sign processing data storage unit 150. At this time, the generation unit 120 also acquires, from the sign processing data storage unit 150, the attribute of the sign processing data it has read (step S232). As described above, this attribute is at least one of an area attribute indicating the area where the traffic sign exists, a road type attribute indicating the type of road on which the sign is installed, and a shape attribute indicating the shape of the traffic sign.

Here, the generation unit 120 may first acquire an attribute designation via an input device. In this case, in step S232, the generation unit 120 reads a plurality of pieces of sign processing data having the designated attribute from the sign processing data storage unit 150.

Next, the generation unit 120 reads, from the sign standard image storage unit 140, the standard image of a second traffic sign having the same attribute as the attribute acquired in step S232 (step S234). The generation unit 120 then processes the standard image of the second traffic sign read in step S234 using the plurality of pieces of sign processing data read in step S232, thereby generating a plurality of pieces of learning image data, and stores them in the learning data storage unit 160 (step S236). If a plurality of second traffic signs were read in step S234, the generation unit 120 performs the process shown in step S236 for each of them.

As described above, according to the present embodiment, the generation unit 120 generates learning image data including the second traffic sign by processing the standard image of the second traffic sign using sign processing data acquired from a captured image including the first traffic sign. The user of the learning image data generation device 10 can therefore easily prepare learning image data. In particular, with the learning image data generation device 10, a large amount of learning image data can be easily prepared even for traffic signs that are actually installed in only a few places.

[Second Embodiment]
FIG. 15 is a diagram showing the functional configuration of the learning image data generation device 10 according to the second embodiment. The learning image data generation device 10 according to the present embodiment has the same configuration as the learning image data generation device 10 according to the first embodiment, except for the following points.

First, the learning image data generation device 10 has a landscape image storage unit 170. The landscape image storage unit 170 stores landscape images, that is, images that serve as the background of a sign. A landscape image is an image obtained by photographing a landscape. For example, the landscape image storage unit 170 stores a plurality of landscape images for each landscape attribute. The landscape attributes used here are the same as the attributes described with reference to FIG. 6, and are, for example, regional characteristics such as urban areas, suburbs, tunnels, and mountainous areas, or road types such as general roads, expressways, and toll roads.

The generation unit 120 then generates learning image data by compositing the second sign image, after it has been processed using the sign processing data, with a landscape image stored in the landscape image storage unit 170. This learning image data shows, for example, the second sign image fitted into the landscape image.
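Fitting the processed sign image into a landscape image can be sketched as a simple rectangular paste. This is a minimal illustration that assumes grayscale images held as nested lists and an opaque, rectangular sign; the function name `paste_sign` and the explicit `top`/`left` position are hypothetical, and the text does not specify how the position or blending is chosen.

```python
def paste_sign(landscape, sign, top, left):
    """Return a copy of `landscape` with `sign` pasted at row `top`, column `left`."""
    fits = (
        top >= 0
        and left >= 0
        and top + len(sign) <= len(landscape)
        and left + len(sign[0]) <= len(landscape[0])
    )
    if not fits:
        raise ValueError("sign does not fit inside the landscape at that position")
    # Copy each row so the stored landscape image is left unchanged.
    result = [row[:] for row in landscape]
    for r, sign_row in enumerate(sign):
        result[top + r][left:left + len(sign_row)] = sign_row
    return result


# Illustrative data only: a 3x4 landscape and a 2x2 processed sign.
landscape = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
sign = [[9, 9], [9, 9]]
composited = paste_sign(landscape, sign, top=1, left=1)
```

Returning a new image rather than editing in place lets the same landscape image be reused for many different signs.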

When at least one of the attribute of the sign processing data and the attribute of the second sign image is specified (for example, in the examples described with reference to FIGS. 12 to 14 of the first embodiment), the generation unit 120 reads a landscape image having the same attribute from the landscape image storage unit 170 and fits the processed second sign image into the landscape image it has read, thereby generating learning image data. If a plurality of landscape images are read, the generation unit 120 generates learning image data using each of them. In this case, if the generation unit 120 uses l pieces of sign processing data, m second sign images, and n landscape images, l × m × n pieces of learning image data are generated.
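The l × m × n count follows from pairing every piece of sign processing data with every second sign image and every landscape image. A minimal sketch using the standard library's Cartesian product is shown below; the function name and the placeholder strings standing in for the actual data are hypothetical.

```python
from itertools import product


def enumerate_combinations(processing_data, sign_images, landscapes):
    """List every (processing data, sign image, landscape image) combination."""
    return list(product(processing_data, sign_images, landscapes))


# Placeholder identifiers only: l = 3, m = 2, n = 4.
combos = enumerate_combinations(
    ["p1", "p2", "p3"],
    ["s1", "s2"],
    ["g1", "g2", "g3", "g4"],
)
```

Each tuple in `combos` corresponds to one piece of learning image data, so three pieces of sign processing data, two sign images, and four landscape images give 3 × 2 × 4 = 24 results.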

Note that the attributes used here preferably include at least one of the area (that is, the shooting location) and the shooting time period.

FIG. 16 is a diagram showing a first example of the process performed by the generation unit 120 according to the present embodiment. The process shown in this figure is similar to the process described with reference to FIG. 12 in the first embodiment.

First, the generation unit 120 acquires a designation of the second traffic sign and also acquires an attribute for the learning image data to be generated (step S242). These are acquired as described for step S212 of FIG. 12.

The generation unit 120 then reads the standard image of the second traffic sign from the sign standard image storage unit 140, and reads the sign processing data corresponding to the acquired attribute from the sign processing data storage unit 150 (step S244). The generation unit 120 also reads a background image (landscape image) corresponding to the acquired attribute from the landscape image storage unit 170 (step S246). The generation unit 120 then generates learning image data using the standard image of the second traffic sign, the sign processing data, and the background image it has read, and stores the generated learning image data in the learning data storage unit 160 in association with attribute information (step S248). For example, the generation unit 120 processes the standard image of the second traffic sign in the same way as in step S216 of FIG. 12 and fits the processed second traffic sign into the landscape image, thereby generating the learning image data.

FIG. 17 is a diagram showing a second example of the process performed by the generation unit 120 according to the present embodiment. The process shown in this figure is similar to the process described with reference to FIG. 14 in the first embodiment. First, the generation unit 120 reads sign processing data from the sign processing data storage unit 150. At this time, the generation unit 120 also acquires, from the sign processing data storage unit 150, the attribute of the sign processing data it has read (step S252). An example of this attribute is at least one of an area attribute indicating the area where the traffic sign exists, a road type attribute indicating the type of road on which the sign is installed, and a shape attribute indicating the shape of the traffic sign.

Here, the generation unit 120 may first acquire an attribute designation via an input device. In this case, in step S252, the generation unit 120 reads a plurality of pieces of sign processing data having the designated attribute from the sign processing data storage unit 150.

Next, the generation unit 120 reads, from the sign standard image storage unit 140, the standard image of a second traffic sign having the same attribute as the attribute acquired in step S252 (step S254). The generation unit 120 also reads a background image having the same attribute from the landscape image storage unit 170 (step S256). The generation unit 120 then generates learning image data using the standard image of the second traffic sign, the sign processing data, and the background image it has read, and stores the generated learning image data in the learning data storage unit 160 in association with attribute information (step S258). The generation process performed here is the same as the process described for step S248 of FIG. 16. If a plurality of second traffic signs were read in step S254, the generation unit 120 performs the process shown in step S258 for each of them.

In the present embodiment as well, as in the first embodiment, the user of the learning image data generation device 10 can easily prepare learning image data. In addition, the user of the learning image data generation device 10 can easily prepare learning image data having a desired background.

[Third Embodiment]
FIG. 18 is a diagram showing the functional configuration of the learning image data generation device 10 according to the third embodiment. The learning image data generation device 10 according to the present embodiment has the same configuration as the learning image data generation device 10 according to the second embodiment, except that it has an update unit 132.

The update unit 132 communicates with an in-vehicle camera, or with an in-vehicle device connected to an in-vehicle camera, and acquires an image captured by the in-vehicle camera (for example, a frame image of a video) that includes a traffic sign. The update unit 132 then stores this image, or an image of the traffic sign cropped from this image, in the captured image storage unit 130 as a captured image. At this time, the update unit 132 also acquires the attribute information of the captured image and stores it in the captured image storage unit 130.
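The update step can be sketched as cropping a sign region from a frame and appending it, with its attribute information, to a store. The sketch assumes the bounding box of the sign is already known (how the sign is located in the frame is outside what the text describes), and it models the captured image storage unit 130 as a plain list. The names `crop`, `update_captured_images`, and the record keys are hypothetical.

```python
def crop(frame, top, left, height, width):
    """Cut a height x width region out of a frame held as a nested list."""
    return [row[left:left + width] for row in frame[top:top + height]]


def update_captured_images(store, frame, attributes, box=None):
    """Append the frame, or the sign region cropped from it, with its attributes."""
    image = crop(frame, *box) if box is not None else [row[:] for row in frame]
    store.append({"image": image, "attributes": dict(attributes)})
    return store


# Illustrative data only: a 3x4 frame with a 2x2 sign region at (top=0, left=2).
frame = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
captured_image_store = []
update_captured_images(
    captured_image_store,
    frame,
    {"road_type": "expressway", "sign_id": "stop"},
    box=(0, 2, 2, 2),
)
```

Passing `box=None` stores the whole frame instead, matching the option of storing the image without cropping.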

The update unit 132 may also store an image of a traffic sign cropped from an image including the traffic sign in the sign standard image storage unit 140 as a standard image, and may store an image that does not include a traffic sign in the landscape image storage unit 170 as a landscape image. In either case, the update unit 132 acquires the necessary attribute information together with the image and stores it in the sign standard image storage unit 140 or the landscape image storage unit 170.

Note that the update unit 132 described in the present embodiment may also be provided in the learning image data generation device 10 according to the first embodiment.

In the present embodiment as well, as in the second embodiment, learning image data having a desired background can be easily prepared. In addition, although the learning image data is generated using sign processing data, in the present embodiment the number of captured images on which the sign processing data is based can be easily increased. The number of pieces of learning image data can therefore also be easily increased.

Although embodiments and examples have been described above with reference to the drawings, these are illustrations of the present invention, and various configurations other than those described above may also be adopted.

10 Learning image data generation device
110 Acquisition unit
120 Generation unit
130 Captured image storage unit
132 Update unit
140 Sign standard image storage unit
150 Sign processing data storage unit
160 Learning data storage unit
170 Landscape image storage unit

Claims

A learning data structure comprising:
learning image data for learning a second traffic sign, the learning image data being generated using sign processing data that affects the appearance of a first traffic sign and is acquired from a captured image including the first traffic sign, and an image of the second traffic sign, which differs from the first traffic sign; and
attribute data indicating that the learning image data includes the second traffic sign.

The learning data structure according to claim 1,
wherein the sign processing data is obtained from the difference between a predetermined standard image showing the first traffic sign and the first traffic sign in the captured image.

The learning data structure according to claim 1 or 2,
wherein the image of the second traffic sign is a predetermined standard image showing the second traffic sign.

A learning image data generation device comprising:
an acquisition means for acquiring, from a captured image including a first traffic sign, sign processing data that affects the appearance of the first traffic sign; and
a generation means for generating learning image data for learning a second traffic sign, which differs from the first traffic sign, using an image of the second traffic sign and the sign processing data.

The learning image data generation device according to claim 4,
wherein the acquisition means acquires the sign processing data from the difference between a predetermined standard image showing the first traffic sign and the first traffic sign in the captured image.

The learning image data generation device according to claim 4 or 5,
wherein the generation means uses a predetermined standard image showing the second traffic sign as the image of the second traffic sign.

The learning image data generation device according to any one of claims 4 to 6,
wherein the generation means selects an image of a second traffic sign for which at least one of an area attribute indicating the area where the sign exists, a road type attribute indicating the type of road on which the sign is installed, and a shape attribute indicating the shape of the traffic sign is the same as that of the first traffic sign.

The learning image data generation device according to any one of claims 4 to 6,
wherein the generation means selects sign processing data acquired from a captured image including a first traffic sign for which at least one of an area attribute indicating the area where the sign exists, a road type attribute indicating the type of road on which the sign is installed, and a shape attribute indicating the shape of the traffic sign is the same.

The learning image data generation device according to any one of claims 4 to 8,
wherein the generation means generates the learning image data by further using a landscape image obtained by photographing a landscape.

The learning image data generation device according to claim 9,
wherein the generation means generates the learning image data using a landscape image whose shooting time period or shooting location characteristics are the same as those of the captured image.

A learning image data generation device comprising:
an acquisition means for acquiring sign processing data, using a first feature space generated by a GAN (Generative Adversarial Network) from a plurality of captured images including a first traffic sign, from the difference between a designated point feature amount, which is the feature amount of the image corresponding to a point designated in the first feature space, and a standard feature amount, which is the feature amount in the first feature space of a predetermined standard image showing the first traffic sign; and
a generation means for generating learning image data of a second traffic sign, which differs from the first traffic sign, using a sign feature amount of an image of the second traffic sign and the sign processing data.

The learning image data generation device according to claim 11,
wherein the generation means selects an image of a second traffic sign for which at least one of an area attribute indicating the area where the sign exists, a road type attribute indicating the type of road on which the sign is installed, and a shape attribute indicating the shape of the traffic sign is the same as that of the first traffic sign.

The learning image data generation device according to claim 11,
wherein the generation means selects sign processing data acquired from a captured image including a first traffic sign for which at least one of an area attribute indicating the area where the sign exists, a road type attribute indicating the type of road on which the sign is installed, and a shape attribute indicating the shape of the traffic sign is the same.

A learning image data generation device comprising:
an acquisition means for acquiring, from a captured image including a first traffic sign installed on a road whose road type is an expressway, sign processing data that affects the appearance of the first traffic sign; and
a generation means for generating learning image data for learning a second traffic sign, which differs from the first traffic sign and is installed on a road whose road type is an expressway, using an image of the second traffic sign, the sign processing data, and a landscape image of an expressway.

A learning image data generation device comprising:
an acquisition means for acquiring, from a captured image including a first traffic sign, sign processing data that affects the appearance of the first traffic sign; and
a generation means for generating learning image data for learning a second traffic sign, which differs from the first traffic sign, using an image of the second traffic sign, the sign processing data, and a landscape image obtained by photographing a landscape,
wherein the captured image and the landscape image have the same shooting time period.