JP2018072938A

JP2018072938A - Number-of-targets estimation device, number-of-targets estimation method, and program

Info

Publication number: JP2018072938A
Application number: JP2016208818A
Authority: JP
Inventors: 修平彦坂; Shuhei Hikosaka; 友之今泉; Tomoyuki Imaizumi; 藍斗藤田; Aito Fujita; 佳介根本; Keisuke Nemoto
Original assignee: Pasco Corp
Current assignee: Pasco Corp
Priority date: 2016-10-25
Filing date: 2016-10-25
Publication date: 2018-05-10
Anticipated expiration: 2036-10-25
Also published as: JP6798854B2

Abstract

PROBLEM TO BE SOLVED: To automatically and more accurately estimate the number of targets in an analysis object image such as a satellite image.
SOLUTION: In one embodiment of the invention, a classification model that has learned the features of learning object images, together with correct answer values for the presence or absence of targets in those images, is used to determine for each of a plurality of sub-images constituting an analysis object image whether a target exists in that sub-image. Next, for each sub-image determined to contain a target, the number of targets it contains is estimated by a regression model that has learned the features of the learning object images together with correct answer values for the number of targets in those images.
SELECTED DRAWING: Figure 14

Description

The present invention relates to an object number estimation device, an object number estimation method, and a program that calculate the number of objects included in an analysis target image.

In recent years, satellite images captured by earth observation satellites have steadily increased in resolution, and high-resolution images with a spatial resolution of a few tens of centimeters per pixel can now be acquired. Temporal resolution has also improved as earth observation satellites gain more observation opportunities. Satellite images can therefore potentially be used for a wider range of applications than before. Analyzing the day-to-day state of the earth (its surface and so on) and its changes with satellite images is expected to make economic activity on the earth visible.

At present, information is extracted from satellite images manually. Analysis capacity has not kept pace with the growing volume of satellite image information, so useful information cannot be extracted quickly. In addition, manual extraction varies widely in working time, cost, and accuracy.

The following techniques are known for analyzing satellite images. For example, Patent Document 1 discloses a vehicle number density observation apparatus that extracts, in a single operation, the line segment groups corresponding to vehicles from the line segment groups extracted from an entire input image, calculates the density of those line segment groups within a road region, and thereby obtains and visually presents the vehicle number density in that road region.

Non-Patent Document 1 discloses a technique for extracting ground features from satellite images using deep learning. Non-Patent Document 2 discloses a technique for extracting ground features from high spatial resolution satellite images using a convolutional neural network (CNN). Non-Patent Document 3 discloses ImageNet classification using a deep convolutional neural network.

Patent Document 1: JP 2010-128732 A

Non-Patent Document 1: Aito Fujita, Tomoyuki Imaizumi, Shuhei Hikosaka, "Extracting features from satellite images using Deep Learning", The Remote Sensing Society of Japan, 59th Academic Lecture (November 2015)
Non-Patent Document 2: Aito Fujita, Tomoyuki Imaizumi, Shuhei Hikosaka, "Extracting features from high spatial resolution satellite images using CNN", Japanese Society for Artificial Intelligence, 30th National Conference of the Japanese Society for Artificial Intelligence (June 2016)
Non-Patent Document 3: A. Krizhevsky, I. Sutskever, and G. E. Hinton, "Imagenet classification with deep convolutional neural networks", Advances in NIPS, pp. 1097-1105

However, conventional techniques, including those described in Patent Document 1 and Non-Patent Documents 1 to 3, cannot automatically and accurately estimate the number of objects in an analysis target image. In addition, when the image resolution is too low to confirm the objects one by one, automatic and accurate estimation has not been possible.

The present invention has been made in view of the above circumstances, and automatically and accurately estimates the number of objects in an analysis target image such as a satellite image. Even when the image resolution is too low to confirm the objects one by one, the invention automatically and accurately estimates the number of objects as long as the resolution allows an approximate number to be grasped.

An object number estimation device according to one aspect of the present invention includes an object determination unit and a number estimation unit.
The object determination unit uses a classification model that has learned the features of learning target images and correct answer values for the presence or absence of objects in those images to determine, for each of a plurality of small images constituting an analysis target image, whether an object exists in that small image.
The number estimation unit uses a regression model that has learned the features of learning target images and correct answer values for the number of objects in those images to estimate the number of objects included in each small image that the object determination unit has determined to contain an object.

According to at least one aspect of the present invention, the number of objects can be automatically and accurately estimated from an analysis target image.
Problems, configurations, and effects other than those described above will become clear from the following description of embodiments.

FIG. 1 is an explanatory diagram showing an outline of the vehicle number estimation method according to Study (1).
FIG. 2 is an explanatory diagram showing the class settings according to Study (1).
FIG. 3 is an explanatory diagram comparing the three types of class settings according to Study (1).
FIG. 4 is an explanatory diagram showing the vehicle number estimation method according to Study (1).
FIG. 5 is a table showing the number of vehicles per grid for each probability value category according to Study (1).
FIG. 6 is a table showing the number of vehicles per grid for each class according to Study (1).
FIG. 7 is an explanatory diagram showing the number estimation results according to Study (1).
FIG. 8 is an explanatory diagram showing an outline of the vehicle number estimation method according to Study (2).
FIG. 9 is an explanatory diagram showing the correct number of vehicles (visual interpretation result) for the entire image and the number estimation result for the entire image according to Study (2).
FIG. 10 is an explanatory diagram showing the number estimation results of Study (2) and Study (1).
FIG. 11 is an explanatory diagram showing the number estimation result for grids in which no vehicle exists according to Study (2). FIG. 11A is an example of the correct number (visual interpretation result), and FIG. 11B is an example of the number estimation result of Study (2).
FIG. 12 is an explanatory diagram showing the number estimation result for grids with low vehicle density according to Study (2). FIG. 12A is an example of the correct number (visual interpretation result), and FIG. 12B is an example of the number estimation result.
FIG. 13 is an explanatory diagram showing the number estimation result for grids with high vehicle density according to Study (2). FIG. 13A is an example of the correct number (visual interpretation result), and FIG. 13B is an example of the number estimation result.
FIG. 14 is an explanatory diagram showing an outline of the vehicle number estimation method according to an embodiment of the present invention.
FIG. 15 is an explanatory diagram showing a satellite image on which the classification result and the number estimation result obtained by regression according to an embodiment of the present invention are represented.
FIG. 16 is a block diagram showing an example of the internal configuration of the vehicle number estimation device according to an embodiment of the present invention.
FIG. 17 is a flowchart showing an example of processing in the learning and model generation phase according to an embodiment of the present invention.
FIG. 18 is a flowchart showing an example of processing in the analysis phase according to an embodiment of the present invention.
FIG. 19 is an explanatory diagram showing the number estimation result for the parking lot area of one scene according to an embodiment of the present invention. FIG. 19A is an example of the correct number, and FIG. 19B is an example of the number estimation result.
FIG. 20 is an explanatory diagram showing the number estimation result according to an embodiment of the present invention together with the number estimation results of Study (1) and Study (2).
FIG. 21 is a graph showing an example of the relationship between estimated values and measured values according to an embodiment of the present invention.
FIG. 22 is a graph showing an example of the relationship between estimated values and measured values according to Study (2).
FIG. 23 is a graph showing an example of the relationship between estimated values and measured values according to Study (1).
FIG. 24 is a block diagram showing the hardware configuration of a computer included in the vehicle number estimation device.

Hereinafter, examples of embodiments for carrying out the present invention will be described with reference to the accompanying drawings. In the drawings, components having substantially the same function or configuration are denoted by the same reference numerals, and redundant description is omitted.

The inventors have been studying methods for extracting information from high-resolution satellite images using deep learning. Deep learning is one of the techniques in the field of artificial intelligence that aims to reproduce human learning ability on a computer. In recent years, deep learning has been reported to greatly outperform existing methods in natural image classification and object detection. A major feature of deep learning is that features useful for recognition can be learned "automatically" from data, regardless of the nature of the analysis target (for a ground feature, the shape, position, and size of the object). In other words, a computer can learn by itself from a large amount of complex data and generate an analysis model for purpose-specific information analysis without human design.

In Study (1) and Study (2) described below, the inventors examined methods for estimating the number of parked vehicles (hereinafter also simply called "vehicles") in high-resolution satellite images using a convolutional neural network (CNN), which is one of the deep learning techniques.

<1. Study (1)>
First, Study (1) will be described. FIG. 1 is an explanatory diagram showing an outline of the vehicle number estimation method according to Study (1).

In the method of Study (1) shown in FIG. 1, the parking lot area Ap (designated area) in an interpretation image 1 such as a satellite image is divided by a grid into small areas, and each divided image (chip image 2) is input to a trained classification model 3 composed of a convolutional neural network, which classifies it into a label type (class) representing the presence or absence of vehicles or the vehicle occupancy rate. The number of vehicles per grid is determined in advance for each label type (class). For each label type (class) in the classification result 4, the method multiplies the number of chip images 2 by the number of vehicles per grid and totals the products to obtain the number of parked vehicles in the parking lot area Ap. One grid is the unit of division, and in this specification "one grid" and "one divided image" may be used synonymously.

FIG. 2 is an explanatory diagram showing the class settings according to Study (1). FIG. 3 is an explanatory diagram comparing three types of class settings with different label contents according to Study (1).

For example, in the class setting table 5 of FIG. 2, class setting 1 defines two classes: "vehicle present" and "no vehicle". Class setting 2 defines two classes: "vehicle occupancy 50% or more" and "no vehicle". Class setting 3 defines four classes: "vehicle occupancy 50% or more", "vehicle occupancy 25-50%", "vehicle occupancy less than 25%", and "no vehicle". Here, the vehicle occupancy is the ratio of the area of the image objects considered to be vehicles to the area of the chip image 2.

In the method of Study (1), labels representing the presence or absence of vehicles, or their quantity, are learned for the input learning target images. FIG. 3 shows examples of the labels attached to chip images 11 to 13 cut out from the interpretation image 1 by a window W. With reference to FIG. 3, the way chip images containing vehicles are assigned to classes under each class setting is as follows. Chip image 11 is covered entirely by vehicle objects; it is labeled (classified as) "vehicle present" under class setting 1, "vehicle occupancy 50% or more" under class setting 2, and also "vehicle occupancy 50% or more" under class setting 3. Chip image 12 is about half covered by vehicle objects; it is labeled "vehicle present" under class setting 1, "not learned" under class setting 2, and "vehicle occupancy 25-50%" under class setting 3. Chip image 13 is only partly covered by vehicle objects; it is labeled "vehicle present" under class setting 1, "not learned" under class setting 2, and "vehicle occupancy less than 25%" under class setting 3. "Not learned" means that the chip image is not used for learning.
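The labeling rules above can be illustrated with a minimal Python sketch. The function name `label_chip`, the use of a 0.0-1.0 occupancy fraction, and the handling of the exact 25% and 50% boundaries are assumptions made for illustration; the source only gives the class names and the three example chip images.

```python
from typing import Optional


def label_chip(occupancy: float, class_setting: int) -> Optional[str]:
    """Return the label for a chip image, or None if the chip is "not learned".

    occupancy: fraction (0.0-1.0) of the chip area covered by vehicle objects.
    class_setting: 1, 2, or 3, following the class setting table.
    Boundary handling (>= 0.5, >= 0.25) is an assumption for illustration.
    """
    if not 0.0 <= occupancy <= 1.0:
        raise ValueError("occupancy must be between 0.0 and 1.0")
    if class_setting not in (1, 2, 3):
        raise ValueError("class_setting must be 1, 2, or 3")

    if occupancy == 0.0:
        return "no vehicle"  # common to all three class settings
    if class_setting == 1:
        return "vehicle present"
    if class_setting == 2:
        # Chips with some vehicles but less than 50% occupancy are excluded.
        return "vehicle occupancy 50% or more" if occupancy >= 0.5 else None
    # Class setting 3: four classes graded by occupancy.
    if occupancy >= 0.5:
        return "vehicle occupancy 50% or more"
    if occupancy >= 0.25:
        return "vehicle occupancy 25-50%"
    return "vehicle occupancy less than 25%"
```

A chip like chip image 13 (say 10% occupancy, a hypothetical value) would be labeled "vehicle present" under class setting 1, excluded under class setting 2, and labeled "vehicle occupancy less than 25%" under class setting 3.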

Next, the vehicle number estimation method according to Study (1) will be described with reference to FIG. 4. FIG. 4 is an explanatory diagram showing the vehicle number estimation method according to Study (1).

FIG. 4 shows an example in which each grid (chip image) of a learning target image 14 is classified into three labels (classes 14a to 14c) based on class setting 3 (FIG. 3). Class 14a is "vehicle occupancy 50% or more", class 14b is "vehicle occupancy 25-50%", and class 14c is "vehicle occupancy less than 25%". The numbers of grids classified into classes 14a to 14c are 30, 15, and 5, respectively. When visual interpretation gives total vehicle counts of 700, 200, and 30 for the grids of classes 14a, 14b, and 14c, the average number of vehicles per grid is 23.3 for class 14a (700/30), 13.3 for class 14b (200/15), and 6 for class 14c (30/5).

At interpretation time, the number of vehicles P in the parking lot area Ap of the learning target image 14 is estimated by multiplying, for each class (or each class probability value category), the number of vehicles per grid by the number of grids, and summing the products. The number of vehicles P is expressed by Equation (1), where N_1 to N_3 are the numbers of grids (chip images) that fall under each class (or probability value category) (1, ..., n).

\[ P = N_1 \times 23.3 + N_2 \times 13.3 + N_3 \times 6 \tag{1} \]
Here 23.3, 13.3, and 6 are the numbers of vehicles per grid for classes 14a, 14b, and 14c, respectively.

FIG. 5 is a table showing the number of vehicles per grid for each probability value category according to Study (1). The setting table 15 of FIG. 5, which defines the number of vehicles per grid, has the fields "probability value category [%]" and "number of vehicles per grid [vehicles]". The "number of vehicles per grid [vehicles]" field stores the numbers of vehicles for the "vehicle present" class of class setting 1 (FIG. 2) and for the "vehicle occupancy 50% or more" class of class setting 2.

The probability value in "probability value category" is the confidence of the result output by the classification model. For example, suppose that the classification result for a certain grid-divided chip image is "vehicle present". The classification output then takes a form such as "vehicle present: 95%". This value of 95% is the probability value, and it indicates that the classification model 3 judged with 95% confidence that a vehicle is present. In other words, the setting table 15 of FIG. 5 represents the result of evaluating how many vehicles were contained in the grids of each probability value category.

FIG. 6 is a table showing the number of vehicles per grid for each class according to Study (1). The per-grid vehicle number table 16 of FIG. 6 has the fields "class" and "number of vehicles per grid [vehicles]". That is, it stores the number of vehicles per grid for each class of class setting 3 (FIG. 2).

Next, the method of Study (1) is verified. The satellite images analyzed were images of the parking lot of an amusement park taken from 2012 to 2014 by the Pléiades satellites operated by Airbus Defence and Space. The number of learning scenes is 14, and the number of evaluation scenes is 50 (including the learning scenes). Here, the number of learning scenes is the number of images used for learning, and the number of evaluation scenes is the number of images used for evaluation. These satellite images are high-resolution (50 cm/pixel) RGB images created by pan-sharpening (a compositing process) from multispectral and panchromatic images. The image size is, as an example, 1348 × 2398 pixels.

FIG. 7 is an explanatory diagram showing the number estimation results according to Study (1). FIG. 7 shows the "number estimation accuracy over all 50 scenes" of the satellite images for the "two-class classification" of class setting 1 (FIG. 3), the "two-class classification" of class setting 2, and the "four-class classification" of class setting 3. Of the three class settings, the "two-class classification (vehicle occupancy 50% or more, no vehicle)" of class setting 2 gives the best estimation accuracy, with a relative error of 25% (estimation accuracy 75%) with respect to the correct value (ground truth) for the designated area (parking lot area).
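The pairing "relative error 25% (estimation accuracy 75%)" suggests that accuracy is taken as one minus the relative error. The short Python sketch below shows that relationship for a single scene. How the source aggregates the error over the 50 evaluation scenes is not stated, so the per-scene definition and the function names here are assumptions.

```python
def relative_error(estimated: float, ground_truth: float) -> float:
    """Absolute difference from the ground truth, as a fraction of the ground truth."""
    if ground_truth <= 0:
        raise ValueError("ground_truth must be positive")
    return abs(estimated - ground_truth) / ground_truth


def estimation_accuracy(estimated: float, ground_truth: float) -> float:
    """Accuracy taken as 1 - relative error (assumed definition)."""
    return 1.0 - relative_error(estimated, ground_truth)
```

For instance, a hypothetical estimate of 750 vehicles against a ground truth of 1000 gives a relative error of 0.25 and an accuracy of 0.75.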

However, an estimation accuracy of 75% does not reach a practical level. The likely cause is that a uniform number of vehicles is assigned to each classification class (or probability value category), so differences in vehicle count due to differences in vehicle type are not reflected. For example, ordinary cars and buses differ in overall length, so even if a group of ordinary cars and a group of buses occupy regions of the same area, the numbers of vehicles differ. In Study (1), various class settings were examined and the number of vehicles per grid was set for each class or probability value category, but there is a limit to how accurate a per-grid number can be. It was therefore difficult to raise the number estimation accuracy.

<2. Study (2)>
Next, Study (2) will be described. The method of Study (2) does not classify the chip images 2; instead, it estimates the number of vehicles in each chip image directly with a regression model. The satellite images used for the analysis are the same as in Study (1).

FIG. 8 is an explanatory diagram showing an outline of the vehicle number estimation method according to Study (2). In the method of Study (2) shown in FIG. 8, the designated area Ap in the interpretation image 1 is divided by a grid into small areas, and each divided image (chip image 2) is input to a trained regression model 23 composed of a convolutional neural network. The regression model 23 outputs an estimated number of vehicles (number estimation result 24) for each chip image 2, and the per-chip estimates are totaled to estimate the number of parked vehicles in the parking lot area.
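The divide-then-sum flow of Study (2) can be sketched in Python as follows. This is only an outline under stated assumptions: the image is a plain 2D list of pixel values, partial chips at the image edge are dropped, and the `regress` callable stands in for the trained CNN regression model 23, whose actual interface the source does not describe.

```python
from typing import Callable, List, Sequence

Image = Sequence[Sequence[float]]
Chip = List[List[float]]


def split_into_chips(image: Image, chip_size: int) -> List[Chip]:
    """Divide an image into non-overlapping chip_size x chip_size chips.

    Partial chips at the right and bottom edges are dropped (an assumption).
    """
    if chip_size <= 0:
        raise ValueError("chip_size must be positive")
    height = len(image)
    width = len(image[0]) if height else 0
    chips: List[Chip] = []
    for top in range(0, height - chip_size + 1, chip_size):
        for left in range(0, width - chip_size + 1, chip_size):
            chips.append(
                [list(row[left:left + chip_size]) for row in image[top:top + chip_size]]
            )
    return chips


def estimate_by_regression(
    image: Image, chip_size: int, regress: Callable[[Chip], float]
) -> float:
    """Sum the per-chip estimates returned by the regression model stand-in."""
    return sum(regress(chip) for chip in split_into_chips(image, chip_size))
```

Because every chip is passed to the regressor, any small nonzero output on an empty chip is added to the total, which is the weakness the source goes on to describe.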

The regression model 23 is a model that has already learned correct answer data for the number of vehicles (visual interpretation results) for the learning target images, and it estimates the number of parked vehicles for the input data (chip image 2). Because the method of Study (2) directly learns the number of parked vehicles for an image, differences in vehicle type can be reflected in the estimated number of vehicles.

FIG. 9 is an explanatory diagram showing the correct number of vehicles (visual interpretation result) for the entire image and the number estimation result for the entire image according to Study (2). In the main part image 31a of the visual interpretation result 31 and the main part image 32a of the number estimation result 32 produced by the regression model 23, the upper part is a parking area for ordinary cars and the lower right part is a parking area for buses. Comparing the main part image 31a with the main part image 32a shows that the estimates in the main part image 32a broadly reflect the tendency of the vehicle counts for each vehicle type. The estimated value of '-1' in the leftmost grid of the main part image 32a is an artifact of the calculation and is regarded as 0 vehicles.
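Treating a negative regression output as 0 vehicles amounts to clamping the raw value at zero. A minimal Python sketch, with a hypothetical function name, is shown below; whether the estimates are also rounded to integers is not stated in the source, so no rounding is applied here.

```python
def clamp_vehicle_count(raw_estimate: float) -> float:
    """Treat negative regression outputs (such as -1) as 0 vehicles."""
    return max(0.0, raw_estimate)
```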

FIG. 10 is an explanatory diagram showing the number estimation results of Study (2) and Study (1).
As shown in FIG. 10, the relative error of the "regression" of Study (2) is 23% (estimation accuracy 77%), a slight improvement over the 25% relative error of the "two-class classification" of Study (1).

FIG. 11 is an explanatory diagram showing the number estimation result for grids in which no vehicle exists according to Study (2). FIG. 11A is an example of the correct number (visual interpretation result), and FIG. 11B is an example of the number estimation result of Study (2). An estimated value of '-0' in FIG. 11B indicates that the calculated value is less than 0, which is effectively 0 vehicles. As FIG. 11B shows, the estimates for some grids that contain no vehicle in FIG. 11A are not 0. Furthermore, in the method of Study (2), there is a large difference between the estimation error (estimation accuracy) when the density of parked vehicles is high and the estimation error when it is low.

FIG. 12 is an explanatory diagram showing the number estimation result for grids with low vehicle density according to Study (2). FIG. 12A is an example of the correct number (visual interpretation result), and FIG. 12B is an example of the number estimation result. FIG. 12A contains many grids (eight) with zero vehicles. In FIG. 12B, however, some of the grids with no vehicle (five) have an estimated value other than 0. One possible reason is that parking lines (for example, white lines) are not being reliably distinguished from vehicles (sunlight reflected off them).

FIG. 13 is an explanatory diagram showing the number estimation result for grids with high vehicle density according to Study (2). FIG. 13A is an example of the correct number, and FIG. 13B is an example of the number estimation result. Compared with FIG. 12A, vehicles appear across a wider range of grids in FIG. 13A, and the number of vehicles within each grid is also larger. The number estimation result of FIG. 13B broadly reflects the vehicles actually parked, and the estimation error is smaller than in FIG. 12B. As a result, the difference between the estimation error at high vehicle density and that at low vehicle density becomes large. In FIG. 13B as well, there are grids (two) that contain no vehicle but have an estimated value other than 0, and even this count of two becomes a non-negligible number over the entire parking lot area (the entire designated area). It is therefore desirable to address the problem of nonzero estimates for grids in which no vehicle exists.

このように、検討（２）の手法では、画像に対する車両台数を直接学習できるため、検討（１）の手法では難しかった台数推定値に車種の違いを表現（反映）することができる。しかし、検討（２）の手法は、誤って車両の存在しないグリッドに車両が数台存在すると推定してしまう問題がある。さらに検討（２）の手法は、台数推定値に駐車車両の密度を表現（反映）できない、即ち、駐車車両の密度が低いグリッドと駐車車両の密度が高いグリッドとの間で推定誤差の違いが大きい。 As described above, the method of Study (2) learns the number of vehicles in an image directly, so differences in vehicle type can be expressed (reflected) in the estimated number, which was difficult with the method of Study (1). However, the method of Study (2) has the problem that it erroneously estimates that several vehicles exist in a grid where no vehicle exists. Furthermore, the method of Study (2) cannot express (reflect) the density of parked vehicles in the estimated number; that is, the estimation error differs greatly between a grid with a low density of parked vehicles and a grid with a high density of parked vehicles.

そこで、本発明者らは、検討（２）の手法（回帰モデルを使用した台数推定方法）に検討（１）の手法（分類モデル）を組み合わせ、検討（２）の手法の問題点を軽減する手法を発明した。以下、本発明について図面を参照しながら説明する。 The present inventors therefore invented a technique that combines the method of Study (1) (a classification model) with the method of Study (2) (a number estimation method using a regression model) to mitigate the problems of the method of Study (2). The present invention will be described below with reference to the drawings.

＜３．一実施形態＞ <3. One Embodiment>
［車両台数推定方法の概要］ [Outline of vehicle number estimation method]
図１４は、本発明の一実施形態に係る車両台数推定方法の概要を示す説明図である。本実施形態では、目的物として衛星画像の車両を例にとり説明する。車両台数推定方法は、目的物個数推定方法の一実施形態である。 FIG. 14 is an explanatory diagram showing an overview of a vehicle number estimation method according to an embodiment of the present invention. In this embodiment, vehicles in a satellite image are used as an example of the target object. The vehicle number estimation method is one embodiment of the object number estimation method.

まず、衛星画像５１（解析対象画像）中の駐車場領域Ａｐ（指定領域の例）を小領域にグリッド分割してチップ画像５２を作成し、チップ画像５２中に駐車車両が存在するか否かを、例えば畳み込みニューラルネットワーク（ＣＮＮ）からなる分類モデル１３５（図１６参照）を用いて分類する。次に、分類結果５３の駐車車両ありと判断されたチップ画像５２に対して、チップ画像５２中の駐車車両の数を、ＣＮＮを用いた回帰モデル１３６（図１６参照）により推定を行う。最後に、各チップ画像５２の推定台数（台数推定結果５４）を合計し、衛星画像５１中の駐車場領域Ａｐの駐車車両の台数を得る。 First, the parking lot area Ap (an example of a designated area) in the satellite image 51 (analysis target image) is divided by a grid into small areas to create chip images 52, and whether or not a parked vehicle exists in each chip image 52 is classified using a classification model 135 (see FIG. 16) composed of, for example, a convolutional neural network (CNN). Next, for each chip image 52 determined in the classification result 53 to contain a parked vehicle, the number of parked vehicles in that chip image 52 is estimated by a regression model 136 (see FIG. 16) using a CNN. Finally, the estimated numbers for the chip images 52 (number estimation result 54) are summed to obtain the number of parked vehicles in the parking lot area Ap in the satellite image 51.

このように、本実施形態に係る車両台数推定方法は、初めにチップ画像（グリッド）の分類を行い、分類結果を元に「車両あり」のグリッドのみ回帰による台数推定を行う。 As described above, in the vehicle number estimation method according to the present embodiment, the chip images (grids) are classified first, and the number of vehicles is estimated by regression only for the “vehicle present” grid based on the classification result.
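
The two-stage flow described above (classify each chip, run regression only on the "vehicle present" chips, then sum) can be sketched as follows. This is a minimal illustration, not the patent's implementation: `estimate_total`, `has_vehicle`, and `count_vehicles` are hypothetical names, and the two callables merely stand in for the classification model 135 and the regression model 136.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Chip = TypeVar("Chip")


def estimate_total(
    chips: Sequence[Chip],
    has_vehicle: Callable[[Chip], bool],      # stand-in for classification model 135
    count_vehicles: Callable[[Chip], float],  # stand-in for regression model 136
) -> Tuple[List[float], float]:
    """Return the per-chip estimates and their sum for one designated area."""
    per_chip: List[float] = []
    for chip in chips:
        if has_vehicle(chip):
            # Regression is run only on chips classified as "vehicle present".
            per_chip.append(float(count_vehicles(chip)))
        else:
            # Chips classified as "no vehicle" are fixed at 0 without regression.
            per_chip.append(0.0)
    return per_chip, sum(per_chip)
```

Because the regression callable is never invoked for "no vehicle" chips, a spurious nonzero estimate on an empty grid can only arise from a classification error, which is the effect the combination is intended to achieve.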

図１５は、一実施形態に係る分類結果及び回帰による台数推定結果を表現した衛星画像を示す説明図である。分類モデル１３５は、衛星画像５１からグリッド分割された各チップ画像５２を取り込み、各チップ画像５２について「車両あり」又は「車両なし」を分類する。図１５左側の分類結果５３は、検討（１）のクラス設定２（図２）に基づく２クラス分類結果を表しており、白いグリッドは「車両あり」、黒いグリッドは「車両なし」である。図１５右側の台数推定結果５４に示すように、分類結果５３の黒いグリッド（車両なし）に対して台数推定が行われないため、台数推定値は‘０’と表記されている。 FIG. 15 is an explanatory diagram illustrating a satellite image representing the classification result and the number estimation result by regression according to an embodiment. The classification model 135 takes in each chip image 52 divided into grids from the satellite image 51 and classifies “with a vehicle” or “without a vehicle” with respect to each chip image 52. The classification result 53 on the left side of FIG. 15 represents the two-class classification result based on the class setting 2 (FIG. 2) of the examination (1). The white grid is “with vehicle” and the black grid is “without vehicle”. As shown in the number estimation result 54 on the right side of FIG. 15, since the number estimation is not performed for the black grid (no vehicle) of the classification result 53, the number estimation value is written as “0”.

［車両台数推定装置の内部構成］ [Internal configuration of vehicle number estimation device]
図１６は、一実施形態に係る車両台数推定装置の内部構成例を示すブロック図である。図１６に示すように車両台数推定装置１００は、学習用データベース１１０（図中「学習用ＤＢ」と表記）、前処理部１２０、学習部１３０、解析処理部１４０、及び後処理部１５０を備える。 FIG. 16 is a block diagram illustrating an internal configuration example of the vehicle number estimation device according to the embodiment. As illustrated in FIG. 16, the vehicle number estimation device 100 includes a learning database 110 (indicated as “learning DB” in the figure), a preprocessing unit 120, a learning unit 130, an analysis processing unit 140, and a post-processing unit 150.

学習用データベース１１０は、衛星画像等の解析対象画像を学習対象画像として保存するデータベースである。また学習用データベース１１０には、解析対象画像の分割単位であるグリッド内の駐車車両の台数（正解値）が、グリッド位置と対応づけて保存される。学習用データベース１１０は、大容量の不揮発性ストレージ２０７（後述する図２４）に構築される。 The learning database 110 is a database that stores analysis target images such as satellite images as learning target images. Further, the learning database 110 stores the number of parked vehicles (correct value) in the grid, which is a division unit of the analysis target image, in association with the grid position. The learning database 110 is constructed in a large-capacity nonvolatile storage 207 (FIG. 24 described later).

前処理部１２０は、色調補正部１２１、画像分割部１２２、及び学習チップ画像セット生成部１２３を備える。 The preprocessing unit 120 includes a color tone correction unit 121, an image division unit 122, and a learning chip image set generation unit 123.

色調補正部１２１は、学習対象画像及び解析対象画像の色調を補正する処理を行う。衛星画像からの情報抽出には、撮像場所や季節、時間の違いに起因する衛星画像の色の変化に左右されない安定した性能が求められる。即ち、車両台数推定装置１００において、推定精度を上げるため、色合いを正規化した衛星画像での学習及び判読を行うことが重要である。正規化されたデータを利用して学習及び判読することにより、衛星画像のシーン間の色の違い等のノイズを小さくすることができる。それにより、後述するネットワークモデル（識別器）が様々な時期の画像に対して安定した判読性能を持つことが期待できる。 The color tone correction unit 121 performs a process of correcting the color tone of the learning target image and the analysis target image. Information extraction from a satellite image requires stable performance that is not affected by changes in the color of the satellite image due to differences in imaging location, season, and time. That is, in the vehicle number estimation apparatus 100, it is important to perform learning and interpretation with a satellite image with normalized hues in order to increase estimation accuracy. By learning and interpreting using the normalized data, noise such as a color difference between scenes of the satellite image can be reduced. Accordingly, it can be expected that a network model (discriminator) described later has stable interpretation performance for images at various periods.

そこで、色調補正部１２１において、複数の衛星画像（学習対象画像、解析対象画像）の統計量を用いて正規化を行う。この正規化では、補正対象画像の統計量（平均値、標準偏差）が基本色（Ｒ，Ｇ，Ｂ）のバンド（周波数帯域）毎に設定した目標平均値、目標標準偏差と同じになるように変換を行う。後述する学習フェーズと解析フェーズでは、同じ統計量を用いる。具体的な正規化の計算方法の一例を、下記に示す。 Therefore, the color tone correction unit 121 performs normalization using statistics of a plurality of satellite images (learning target images, analysis target images). In this normalization, the image to be corrected is transformed so that its statistics (average value, standard deviation) become equal to the target average value and target standard deviation set for each band (frequency band) of the basic colors (R, G, B). The same statistics are used in the learning phase and the analysis phase described later. An example of a specific normalization calculation method is shown below.

まず衛星画像のＲ，Ｇ，Ｂのバンド毎に全画素の平均値Ａ、標準偏差Ｓを算出する。次に、式（２）により、各バンドについて、画像座標（ｘ，ｙ）の画素における輝度値Ｉ_ｘｙから正規化後の輝度値Ｉ’_ｘｙを求める。Ａ_ａｉｍは目標平均値、Ｓ_ａｉｍは目標標準偏差である。一例として画素の階調が２５６であるとき、各バンド（Ｒ，Ｇ，Ｂ）の目標平均値Ａ_ａｉｍは１２８、目標標準偏差Ｓ_ａｉｍは８０に設定する。 First, the average value A and the standard deviation S of all pixels are calculated for each of the R, G, and B bands of the satellite image. Next, for each band, the normalized luminance value I'_xy is obtained by Equation (2) from the luminance value I_xy of the pixel at image coordinates (x, y). A_aim is the target average value, and S_aim is the target standard deviation. As an example, when the pixels have 256 gradation levels, the target average value A_aim of each band (R, G, B) is set to 128 and the target standard deviation S_aim is set to 80.
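
Equation (2) itself is not reproduced in this text. A linear transform that satisfies the stated requirement (the corrected band has mean A_aim and standard deviation S_aim) is I'_xy = (I_xy - A) / S × S_aim + A_aim, and the sketch below applies it to one band. This form is an assumption inferred from the description, not a quotation of the patent's equation. The function name `normalize_band`, the use of the population standard deviation, and the handling of a flat band (S = 0) are likewise assumptions of this sketch.

```python
from statistics import fmean, pstdev
from typing import List


def normalize_band(
    band: List[List[float]],
    a_aim: float = 128.0,  # target average value A_aim from the text
    s_aim: float = 80.0,   # target standard deviation S_aim from the text
) -> List[List[float]]:
    """Shift and scale one colour band to the target mean and standard deviation."""
    pixels = [value for row in band for value in row]
    a = fmean(pixels)   # average value A over all pixels of the band
    s = pstdev(pixels)  # standard deviation S (population form, assumed)
    if s == 0:
        # A flat band has no contrast to rescale; map it to the target mean.
        return [[a_aim for _ in row] for row in band]
    return [[(value - a) / s * s_aim + a_aim for value in row] for row in band]
```

The result is not clipped to the 0 to 255 range here; whether and how out-of-range values are handled is not stated in the text.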

画像分割部１２２は、色調補正部１２１により色調補正済みの解析対象画像をグリッド分割して複数のチップ画像（小画像）を生成し、各チップ画像を学習チップ画像セット生成部１２３、解析処理部１４０又は後処理部１５０へ順次出力する。画像分割部１２２は、正解値のグリッド位置を基準に、解析対象画像の分割を行う。なお、（ユーザーの指示などにより）小チップ画像を更に分割する場合など正解値の分割が必要であれば、正解値の分割も実施する。 The image dividing unit 122 divides the analysis target image, whose color tone has been corrected by the color tone correction unit 121, by a grid to generate a plurality of chip images (small images), and sequentially outputs each chip image to the learning chip image set generation unit 123, the analysis processing unit 140, or the post-processing unit 150. The image dividing unit 122 divides the analysis target image with reference to the grid positions of the correct values. If the correct values also need to be divided, for example when a small chip image is divided further (by user instruction or the like), the correct values are divided as well.

学習チップ画像セット生成部１２３は、画像分割部１２２で分割されたチップ画像と、入力された正解値（教師データ）を組み合わせた学習用データセットである学習チップ画像セットを生成し、学習部１３０へ順次出力する。本実施形態では、分類用と回帰用で２種類の学習チップ画像セットが生成される。学習チップ画像セット生成部１２３は、分類用学習チップ画像セット１２５を、学習部１３０の分類モデル生成部１３１へ出力し、回帰用学習チップ画像セット１２６を、学習部１３０の回帰モデル生成部１３２へ出力する。 The learning chip image set generation unit 123 generates learning chip image sets, which are learning data sets combining the chip images divided by the image dividing unit 122 with the input correct values (teacher data), and sequentially outputs them to the learning unit 130. In this embodiment, two types of learning chip image sets are generated, one for classification and one for regression. The learning chip image set generation unit 123 outputs the classification learning chip image set 125 to the classification model generation unit 131 of the learning unit 130, and outputs the regression learning chip image set 126 to the regression model generation unit 132 of the learning unit 130.

図１６では、学習チップ画像セット生成部１２３を設けて学習チップ画像セットを生成する構成としているが、この例に限定されない。基本的に正解値は、地理情報システム（ＧＩＳ（Geographical Information System））データとして広域にわたって整備される。即ち位置情報（グリッド情報）と対応づけられているので、解析対象画像とともに分割することができる。画像分割部１２２は、解析対象画像とともに正解値を分割し、分割後の画像（チップ画像）と分割後の正解値の組を、学習チップ画像セットとしてもよい。なお、仮に正解値が既に分割されていたとしても、画像分割部１２２で解析対象画像を分割する際には正解値と対応づけられた位置情報が必要となる。 In FIG. 16, the learning chip image set generation unit 123 is provided to generate the learning chip image set, but the present invention is not limited to this example. Basically, correct values are maintained over a wide area as GIS (Geographical Information System) data. That is, since it is associated with position information (grid information), it can be divided together with the analysis target image. The image dividing unit 122 may divide the correct value together with the analysis target image, and a set of the divided image (chip image) and the divided correct value may be a learning chip image set. Note that even if the correct answer value has already been divided, position information associated with the correct answer value is required when the image dividing unit 122 divides the analysis target image.

学習部１３０は、入力データに対する演算結果を出力する複数のノードを多層に接続した構成を有し、教師あり学習により、抽象化されたチップ画像の特徴を学習して分類モデル１３５及び回帰モデル１３６を生成する。図１６に示すように、学習部１３０は、分類モデル生成部１３１と、回帰モデル生成部１３２を備える。 The learning unit 130 has a configuration in which a plurality of nodes that output calculation results for input data are connected in multiple layers, and it learns abstracted features of the chip images by supervised learning to generate the classification model 135 and the regression model 136. As illustrated in FIG. 16, the learning unit 130 includes a classification model generation unit 131 and a regression model generation unit 132.

分類モデル生成部１３１は、分類用学習チップ画像セット１２５に含まれる学習対象画像と、その学習対象画像に対する駐車車両の有無の正解値とを学習し、学習内容が反映された分類モデル１３５を生成する。即ち、分類モデル１３５の種々のパラメーターを決定する。 The classification model generation unit 131 learns the learning target images included in the classification learning chip image set 125 together with the correct values for the presence or absence of a parked vehicle in those images, and generates the classification model 135 reflecting what was learned. That is, it determines the various parameters of the classification model 135.

回帰モデル生成部１３２は、回帰用学習チップ画像セット１２６に含まれる学習対象画像と、その学習対象画像に含まれる駐車車両の台数の正解値とを学習し、学習内容を反映した回帰モデル１３６を生成する。即ち、回帰モデル１３６の種々のパラメーターを決定する。 The regression model generation unit 132 learns the learning target images included in the regression learning chip image set 126 together with the correct values for the number of parked vehicles in those images, and generates the regression model 136 reflecting what was learned. That is, it determines the various parameters of the regression model 136.

上述の分類モデル１３５及び回帰モデル１３６は、一例として畳み込みニューラルネットワーク（識別器）により構成される。分類モデル１３５及び回帰モデル１３６のネットワーク構成の主要な構成は、同一である。本実施形態のネットワーク構成は、入力層−Ｃ−Ｐ−Ｃ−Ｐ−Ｃ−Ｃ−Ｃ−ＦＣ−Ｄ−出力層からなる層構成を持つ。ここで、Ｃは、同じ重みフィルタを入力データ（チップ画像）全体に適用して畳み込み処理し、特徴マップ（特徴量）を抽出する畳み込み層である。Ｐは、畳み込み層（Ｃ）から出力された特徴マップを縮小するプーリング層である。ＦＣは、重み付き結合を計算し、活性化関数によりユニットの値を求める全結合層である。そして、Ｄは、過学習を防止するため中間層のユニットの値を一定の割合で０にし、結合を欠落させるドロップアウト層である。 The classification model 135 and the regression model 136 described above are each configured, as an example, by a convolutional neural network (discriminator). The main parts of the network configurations of the classification model 135 and the regression model 136 are the same. The network of the present embodiment has the layer configuration input layer-C-P-C-P-C-C-C-FC-D-output layer. Here, C is a convolution layer that applies the same weight filter over the entire input data (chip image) to perform convolution and extract a feature map (feature amount). P is a pooling layer that reduces the feature map output from the convolution layer (C). FC is a fully connected layer that calculates a weighted sum of its inputs and obtains the unit values through an activation function. D is a dropout layer that, to prevent overfitting, sets the values of a fixed proportion of the intermediate-layer units to 0 and thereby drops their connections.

本実施形態の分類モデル１３５では、誤差関数に交差エントロピーを用い、出力層の活性化関数にはソフトマックス関数を用いている。また回帰モデル１３６では、誤差関数に最小二乗誤差を用い、出力層の活性化関数に線形関数を用いている。この誤差関数と出力層の活性化関数については一例であり、この例に限定されない。 In the classification model 135 of the present embodiment, cross entropy is used for the error function, and softmax function is used for the activation function of the output layer. In the regression model 136, a least square error is used as the error function, and a linear function is used as the output layer activation function. The error function and the activation function of the output layer are examples, and are not limited to this example.
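
The output-layer choices described above can be illustrated with the per-sample functions below: softmax followed by cross entropy for the two-class classifier, and a squared error on a linear output for the regressor. This is a plain-Python sketch for illustration only. The function names are hypothetical, and details such as a 1/2 factor or averaging over a batch are not stated in the text and are omitted here.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw output-layer values into class probabilities."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits: Sequence[float], true_class: int) -> float:
    """Classification error: negative log probability of the correct class."""
    return -math.log(softmax(logits)[true_class])


def squared_error(predicted: float, target: float) -> float:
    """Regression error for a linear (identity) output unit."""
    return (predicted - target) ** 2
```

With two classes ("vehicle present" and "no vehicle"), `logits` would hold two values per chip; the regression output is a single real number compared against the visually counted number of vehicles.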

上記ネットワーク構成は一例であって、この例に限定されるものではなく、他の文献等でよく用いられているものでもよい。また、分類モデル１３５及び回帰モデル１３６は、他の深層学習の手法、あるいは他の機械学習の手法を利用して構築してもよい。 The above network configuration is an example, and is not limited to this example, and may be one often used in other documents. The classification model 135 and the regression model 136 may be constructed using other deep learning methods or other machine learning methods.

解析処理部１４０は、車両判定部１４１と、台数推定部１４２を備える。車両判定部１４１（目的物判定部の一例）は、分類モデル１３５を用いて、各チップ画像に駐車車両が存在するかどうかを判定し、判定結果（車両あり）を台数推定部１４２へ出力する。また、判定結果（車両なし）として各チップ画像の台数推定値（０台）を後処理部１５０へ出力する。 The analysis processing unit 140 includes a vehicle determination unit 141 and a number estimation unit 142. The vehicle determination unit 141 (an example of an object determination unit) uses the classification model 135 to determine whether a parked vehicle exists in each chip image, and outputs the determination result (vehicle present) to the number estimation unit 142. For a determination result of no vehicle, it outputs an estimated number of 0 for that chip image to the post-processing unit 150.

台数推定部１４２（個数推定部の一例）は、回帰モデル１３６を用いて、車両判定部１４１により駐車車両が存在すると判定されたチップ画像に含まれる駐車車両の台数を推定し、その台数推定値を後処理部１５０へ出力する。また、台数推定部１４２は、台数推定値を、チップ画像を識別するための情報（例えば位置情報）とともに学習用データベース１１０へ記憶する。これにより、台数推定値が、今後の学習部１３０における学習に利用される。 The number estimation unit 142 (an example of a count estimation unit) uses the regression model 136 to estimate the number of parked vehicles in each chip image that the vehicle determination unit 141 has determined to contain a parked vehicle, and outputs the estimated number to the post-processing unit 150. The number estimation unit 142 also stores the estimated number in the learning database 110 together with information identifying the chip image (for example, position information). The estimated number can thus be used for future learning in the learning unit 130.

後処理部１５０は、車両判定部１４１で判定された、車両が存在しないと判定された各チップ画像の台数推定値（０台）と、台数推定部１４２で推定された、車両が存在すると判定された各チップ画像に対する台数推定値を集計して出力する処理を行う。この後処理部１５０は、前処理部１２０から解析対象画像及びチップ画像を取得し、ユーザーニーズに合わせて出力するレポートの形態（表示形態や項目等）をカスタマイズする。 The post-processing unit 150 aggregates and outputs the estimated numbers (0 vehicles) of the chip images determined by the vehicle determination unit 141 to contain no vehicle, together with the numbers estimated by the number estimation unit 142 for the chip images determined to contain a vehicle. The post-processing unit 150 also acquires the analysis target image and the chip images from the preprocessing unit 120, and customizes the form of the output report (display form, items, and the like) according to user needs.

［学習・モデル生成フェーズの処理］ [Processing of learning / model generation phase]
次に、車両台数推定装置１００の学習・モデル生成フェーズにおける処理を説明する。図１７は、一実施形態に係る学習・モデル生成フェーズにおける処理例を示すフローチャートである。 Next, processing in the learning / model generation phase of the vehicle number estimation device 100 will be described. FIG. 17 is a flowchart illustrating a processing example in a learning / model generation phase according to an embodiment.

まず、学習対象画像（衛星画像）が学習用データベース１１０から前処理部１２０に取り込まれると、色調補正部１２１は、学習対象画像に対して色調補正処理を行い（Ｓ１）、色調補正済み学習対象画像を画像分割部１２２に出力する。次に、画像分割部１２２は、色調補正済み学習対象画像の駐車場領域Ａｐに対してグリッド分割処理を行い（Ｓ２）、学習チップ画像セット生成部１２３にチップ画像を出力する。例えば１グリッド（１チップ画像）が６０×６０ピクセル（例えば約３０ｍ四方）となるように分割が行われる。 First, when a learning target image (satellite image) is taken from the learning database 110 into the preprocessing unit 120, the color tone correction unit 121 performs color tone correction on the learning target image (S1) and outputs the color-corrected learning target image to the image dividing unit 122. Next, the image dividing unit 122 performs grid division on the parking lot area Ap of the color-corrected learning target image (S2) and outputs the chip images to the learning chip image set generation unit 123. For example, the division is performed so that one grid (one chip image) is 60 × 60 pixels (for example, about 30 m square).
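
The grid division of step S2 can be sketched as below for a single band held as a list of pixel rows. At the 50 cm resolution mentioned later in the text, 60 pixels × 0.5 m gives the 30 m side length quoted above. The function name `split_into_chips` and the choice to discard partial tiles at the right and bottom edges are assumptions of this sketch; the text does not say how edge remainders are handled.

```python
from typing import Dict, List, Tuple

Image = List[List[float]]


def split_into_chips(image: Image, chip: int = 60) -> Dict[Tuple[int, int], Image]:
    """Split a 2-D image into chip x chip tiles keyed by (grid_row, grid_col)."""
    height = len(image)
    width = len(image[0]) if height else 0
    chips: Dict[Tuple[int, int], Image] = {}
    # Step through the image in whole-tile strides; incomplete edge tiles are skipped.
    for top in range(0, height - chip + 1, chip):
        for left in range(0, width - chip + 1, chip):
            tile = [row[left:left + chip] for row in image[top:top + chip]]
            chips[(top // chip, left // chip)] = tile
    return chips
```

Keying each tile by its grid position keeps the link between a chip image and the correct value stored for the same grid position in the learning database.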

次に、学習チップ画像セット生成部１２３は、入力されたチップ画像と組となる分類用の正解値から分類用学習チップ画像セット１２５を生成する。また、学習チップ画像セット生成部１２３は、入力されたチップ画像と組となる回帰用の正解値から回帰用学習チップ画像セット１２６を生成する（Ｓ３）。分類学習では、各チップ画像に対して、人間の目視による駐車車両あり／なし（正解値）のラベル付けが行われる。また、回帰学習では、各チップ画像に対して、人間が目視により駐車車両の台数をカウントし、そのカウント数が正解値として登録される。なお、チップ画像と正解値の組を、画像分割部１２２による画像分割処理時に作成してもよい。 Next, the learning chip image set generation unit 123 generates a classification learning chip image set 125 from the classification correct value paired with the input chip image. In addition, the learning chip image set generation unit 123 generates a regression learning chip image set 126 from the correct values for regression that are paired with the input chip image (S3). In classification learning, each chip image is labeled with / without a parked vehicle (correct value) by human eyes. In regression learning, a person visually counts the number of parked vehicles for each chip image, and the counted number is registered as a correct value. Note that a set of a chip image and a correct answer value may be created at the time of image division processing by the image dividing unit 122.

次に、学習部１３０の分類モデル生成部１３１は、分類用学習チップ画像セット１２５を用いて学習を行い、分類モデル１３５を生成する（Ｓ４）。また、回帰モデル生成部１３２は、回帰用学習チップ画像セット１２６を用いて学習を行い、回帰モデル１３６を生成する（Ｓ５）。 Next, the classification model generation unit 131 of the learning unit 130 performs learning using the classification learning chip image set 125 to generate the classification model 135 (S4). The regression model generation unit 132 performs learning using the regression learning chip image set 126 to generate the regression model 136 (S5).

このように、分類モデル１３５は、学習対象画像に対して「離散値（ラベル）」を正解データとして学習する。離散値は、「車両あり」及び「車両なし」の２クラスである。また、回帰モデル１３６は、学習対象画像に対して「連続値（車両台数）」を正解データとして学習する。回帰モデル１３６の学習対象画像は、駐車車両が１台以上存在する画像のみである。学習対象画像のパターンを「駐車車両あり」に限定することにより、車両密度に関する特徴を学習しやすくする。 In this way, the classification model 135 learns “discrete values (labels)” as correct data for the learning target image. There are two discrete values, “with vehicle” and “without vehicle”. The regression model 136 learns “continuous value (number of vehicles)” as correct answer data for the learning target image. The learning target image of the regression model 136 is only an image having one or more parked vehicles. By limiting the pattern of the learning target image to “with parked vehicle”, it becomes easier to learn the characteristics related to the vehicle density.
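
The difference between the two training sets can be sketched as follows: every chip receives a two-class label, while only chips with one or more parked vehicles enter the regression set. The function name `build_training_sets` is hypothetical. Deriving the presence label from the visual count is a simplification made for this sketch; in the text the labels and the counts are both assigned by human visual interpretation.

```python
from typing import List, Sequence, Tuple, TypeVar

Chip = TypeVar("Chip")


def build_training_sets(
    chips: Sequence[Chip], counts: Sequence[int]
) -> Tuple[List[Tuple[Chip, int]], List[Tuple[Chip, float]]]:
    """Pair chips with classification labels and with regression targets."""
    if len(chips) != len(counts):
        raise ValueError("chips and counts must have the same length")
    # Classification set: all chips, label 1 = "vehicle present", 0 = "no vehicle".
    classification = [(chip, 1 if n > 0 else 0) for chip, n in zip(chips, counts)]
    # Regression set: only chips containing one or more parked vehicles.
    regression = [(chip, float(n)) for chip, n in zip(chips, counts) if n >= 1]
    return classification, regression
```

Restricting the regression set in this way means the regression model never sees empty chips during training, which matches the stated aim of making vehicle-density features easier to learn.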

［解析フェーズの処理］ [Processing of analysis phase]
次に、車両台数推定装置１００の解析フェーズにおける処理を説明する。図１８は、一実施形態に係る解析フェーズにおける処理例を示すフローチャートである。 Next, processing in the analysis phase of the vehicle number estimation device 100 will be described. FIG. 18 is a flowchart illustrating a processing example in the analysis phase according to an embodiment.

まず、解析対象画像（衛星画像）が前処理部１２０に取り込まれると、色調補正部１２１は、解析対象画像に対して色調補正処理を行い（Ｓ１１）、色調補正済み解析対象画像を画像分割部１２２に出力する。次に、画像分割部１２２は、色調補正済み解析対象画像の指定領域（駐車場領域）に対するグリッド分割処理を行い（Ｓ１２）、順次チップ画像を生成する。 First, when an analysis target image (satellite image) is taken into the preprocessing unit 120, the color tone correction unit 121 performs color tone correction on the analysis target image (S11) and outputs the color-corrected analysis target image to the image dividing unit 122. Next, the image dividing unit 122 performs grid division on the designated area (parking lot area) of the color-corrected analysis target image (S12) and sequentially generates chip images.

次に、解析処理部１４０の車両判定部１４１は、分類モデル１３５を利用して各チップ画像に対して車両の有無を分類し（Ｓ１３）、その判定結果（車両あり）を台数推定部１４２に順次出力する。 Next, the vehicle determination unit 141 of the analysis processing unit 140 uses the classification model 135 to classify each chip image by the presence or absence of a vehicle (S13), and sequentially outputs the determination results (vehicle present) to the number estimation unit 142.

次に、台数推定部１４２は、回帰モデル１３６を用いて、車両ありと判定されたチップ画像に対して台数推定を行い、チップ画像ごとに台数推定値を後処理部１５０へ順次出力する（Ｓ１４）。また、車両判定部１４１は、車両なしと判定したチップ画像に対する台数推定値を０として、後処理部１５０へ出力する（Ｓ１５）。 Next, the number estimation unit 142 uses the regression model 136 to estimate the number of vehicles in each chip image determined to contain a vehicle, and sequentially outputs the estimated number for each chip image to the post-processing unit 150 (S14). The vehicle determination unit 141 sets the estimated number to 0 for each chip image determined to contain no vehicle and outputs it to the post-processing unit 150 (S15).

最後に、後処理部１５０は、ステップＳ１４及びＳ１５の処理が終了後、後処理として、例えば車両ありと判定されたチップ画像に対する台数推定値を集計する。そして、後処理部１５０は、解析対象画像の指定領域の駐車車両についてのレポートを出力する処理を行う（Ｓ１６）。 Finally, after the processing of steps S14 and S15 is completed, the post-processing unit 150 aggregates, as post-processing, for example the estimated numbers for the chip images determined to contain a vehicle. The post-processing unit 150 then outputs a report on the parked vehicles in the designated area of the analysis target image (S16).

［台数推定結果の検証］ [Verification of number estimation results]
以下、本発明の一実施形態に係る台数推定結果について図１９〜図２３を参照しながら検証する。解析対象画像には、検討（１）、検討（２）で使用したものと同じ衛星画像を使用した。学習シーン数及び評価シーン数も同じである。 Hereinafter, the number estimation results according to the embodiment of the present invention will be verified with reference to FIGS. 19 to 23. The same satellite images used in Study (1) and Study (2) were used as the analysis target images. The number of learning scenes and the number of evaluation scenes are also the same.

図１９は、一実施形態に係るある１シーンの駐車場領域における台数推定結果を示す説明図である。図１９Ａは正解台数の例であり、図１９Ｂは台数推定結果の例である。図１９Ａの白塗のグリッドは‘駐車車両あり’、黒塗のグリッドは‘駐車車両なし’と分類されたチップ画像であり、図１９Ｂに示した台数推定結果は、各グリッドがチップ画像に相当する。 FIG. 19 is an explanatory diagram illustrating the number estimation result in the parking lot area of one scene according to an embodiment. FIG. 19A is an example of the correct number of vehicles, and FIG. 19B is an example of the number estimation result. In FIG. 19A, the white grids are chip images classified as “parked vehicle present” and the black grids are chip images classified as “no parked vehicle”. In the number estimation result shown in FIG. 19B, each grid corresponds to one chip image.

図１９Ｂの駐車車両あり／なしの分類結果を見ると、背景色を黒色とした５個のグリッド‘０’のうち４個のグリッドは実際に駐車車両が０台であり、駐車車両の有無が精度よく分類されていることがわかる。また各グリッドに記載した推定台数を図１９Ａの正解データと比較すると、バスや駐車場境界付近といった性質の異なるチップ画像に対して台数の傾向を表現できていることがわかる。 Looking at the presence/absence classification result in FIG. 19B, four of the five grids marked ‘0’ on a black background actually contain zero parked vehicles, which shows that the presence or absence of parked vehicles is classified accurately. Comparing the estimated number written in each grid with the correct data in FIG. 19A also shows that the trend in the number of vehicles is captured for chip images of differing character, such as those containing buses or lying near the parking lot boundary.

図２０は、一実施形態に係る台数推定結果、並びに、検討（１）及び検討（２）の台数推定結果を示す説明図である。本実施形態（２クラス分類＋回帰）に係る正解データに対する相対誤差（全５０シーンを平均した誤差平均）は、１６％（推定精度８４％）であった。これに対し、検討（１）の分類（２クラス分類）のみの場合では同２５％、検討（２）の回帰のみの場合では同２３％であった。 FIG. 20 is an explanatory diagram showing the number estimation result and the number estimation result of examination (1) and examination (2) according to an embodiment. The relative error (average error obtained by averaging all 50 scenes) with respect to the correct answer data according to the present embodiment (2-class classification + regression) was 16% (estimated accuracy 84%). On the other hand, in the case of study (1) classification (2-class classification) alone, it was 25%, and in the case of study (2) only regression, it was 23%.
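
The scene-averaged relative error quoted above can be computed along the following lines. The exact formula is not given in the text, so the definition used here (absolute difference divided by the visually counted value, averaged over scenes) is an assumption, as is the function name `mean_relative_error`.

```python
from typing import Sequence


def mean_relative_error(estimated: Sequence[float], actual: Sequence[float]) -> float:
    """Average of |estimate - actual| / actual over all scenes."""
    if len(estimated) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a <= 0 for a in actual):
        # The ratio is undefined for a scene whose correct count is zero.
        raise ValueError("each actual count must be positive")
    errors = [abs(e - a) / a for e, a in zip(estimated, actual)]
    return sum(errors) / len(errors)
```

Under this definition, the reported 16% relative error corresponds to the 84% estimation accuracy stated in the text as 1 minus the error.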

今回使用した５０ｃｍ分解能の衛星画像では、駐車車両のような小サイズの地物（オブジェクト）は輪郭が潰れて写っているケースが多く、人間が目視で行ったとしても正確に台数を数えることは難しい。したがって、本実施形態における推定精度約８４％の台数推定結果は、妥当な数値であると言える。 In the 50 cm resolution satellite images used here, small features (objects) such as parked vehicles often appear with their outlines blurred, and counting them accurately is difficult even for a human working visually. The number estimation result of the present embodiment, with an estimation accuracy of about 84%, can therefore be regarded as a reasonable value.

図２１は、一実施形態に係る推定値と実測値（目視判読結果）との関係例を示すグラフである。図２１は、各シーンの推定台数と正解データの関係を示し、横軸が正解データである実測値［台］、縦軸が推定値［台］を表す。各プロット点が１対１の回帰直線上にあるほど高い精度で推定できていることを示している。図２１より、推定値は正解データと比較して全体的に低く見積もられているものの、決定係数Ｒ^２は０．９３３と、高い相関を示している。 FIG. 21 is a graph illustrating an example of the relationship between estimated values and measured values (visual interpretation results) according to an embodiment. FIG. 21 shows the relationship between the estimated number and the correct data for each scene, where the horizontal axis represents the measured value [vehicles], which is the correct data, and the vertical axis represents the estimated value [vehicles]. The closer the plotted points lie to the one-to-one line, the more accurate the estimation. FIG. 21 shows that, although the estimated values are lower overall than the correct data, the coefficient of determination R² is 0.933, indicating a high correlation.
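
A coefficient of determination for such a scatter plot can be computed as the squared Pearson correlation between measured and estimated values, as sketched below. Which definition of R² was used for the figures is not stated in the text, so this choice is an assumption, as is the function name `r_squared`.

```python
from typing import Sequence


def r_squared(actual: Sequence[float], estimated: Sequence[float]) -> float:
    """Squared Pearson correlation between measured and estimated counts."""
    n = len(actual)
    if n != len(estimated) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_a = sum(actual) / n
    mean_e = sum(estimated) / n
    cov = sum((a - mean_a) * (e - mean_e) for a, e in zip(actual, estimated))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    if var_a == 0 or var_e == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov * cov / (var_a * var_e)
```

Under this definition a consistent underestimate (for example, every estimate at 80% of the measured value) still yields R² = 1, so a high R² can coexist with estimates that are lower overall than the correct data, as the text describes.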

図２２は、検討（２）に係る推定値と実測値との関係例を示すグラフであり、横軸が正解データである実測値［台］、縦軸が推定値［台］を表す。推定値は正解データと比較して低く見積もられており、決定係数Ｒ^２は０．９１７と、検討（１）の場合よりは高いものの、低い相関を示している。 FIG. 22 is a graph showing an example of the relationship between estimated values and measured values according to Study (2), where the horizontal axis represents the measured value [vehicles], which is the correct data, and the vertical axis represents the estimated value [vehicles]. The estimated values are lower than the correct data, and the coefficient of determination R² is 0.917, which is higher than in Study (1) but still indicates a comparatively low correlation.

図２３は、検討（１）に係る推定値と実測値との関係例を示すグラフであり、横軸が正解データである実測値［台］、縦軸が推定値［台］を表す。推定値は正解データと比較して低く見積もられており、決定係数Ｒ^２は０．８９０と、低い相関である。 FIG. 23 is a graph showing an example of the relationship between estimated values and measured values according to Study (1), where the horizontal axis represents the measured value [vehicles], which is the correct data, and the vertical axis represents the estimated value [vehicles]. The estimated values are lower than the correct data, and the coefficient of determination R² is 0.890, a comparatively low correlation.

上述したように、本実施形態に係る車両台数推定装置１００によれば、衛星画像等の解析対象画像から目的物である駐車車両の台数を自動的に精度よく推定することができる。解析対象画像１枚の駐車場領域（およそ０．３ｋｍ^２）当たりの作業時間は、人間の目視判読では約３時間である。一方、ＣＮＮ（機械学習）を用いた本実施形態の手法では、判読に要する時間は約１分であった。コンピューターの処理能力にもよるが、本実施形態は、目視による判読と比べて大幅な作業時間の短縮（約１８０倍の高速化）が実現可能となる。 As described above, the vehicle number estimation device 100 according to the present embodiment can automatically and accurately estimate the number of parked vehicles, which are the target objects, from an analysis target image such as a satellite image. The working time per parking lot area (approximately 0.3 km²) in one analysis target image is about 3 hours for human visual interpretation. With the method of the present embodiment using a CNN (machine learning), on the other hand, interpretation took about 1 minute. Although it depends on the processing capability of the computer, the present embodiment can substantially shorten the working time compared with visual interpretation (about 180 times faster, that is, about 180 minutes versus about 1 minute).

また、本実施形態では、車両の存在しないグリッドは車両判定部１４１による分類で除かれるとともに、「車両なし」のグリッドは推定台数を０台と処理される。これにより、検討（２）で説明した回帰の問題（車両の存在しないグリッドにも数台車両があると推定される）が軽減されたと考えられる。 Further, in the present embodiment, the grid in which no vehicle is present is excluded by the classification by the vehicle determination unit 141, and the “no vehicle” grid is processed with the estimated number being zero. This is considered to have alleviated the regression problem described in Study (2) (it is estimated that there are several vehicles in a grid where no vehicle exists).

また、学習部１３０（回帰モデル生成部１３２）は、回帰用学習チップ画像セット１２６を用いて「車両あり」のチップ画像のみ台数の学習を行うため、チップ画像の車両密度に関する特徴を学習しやすい。これにより、チップ画像の車両密度に関する特徴に対する学習が強化され、それ故、検討（２）で説明した回帰の他の問題（駐車車両の密度の高いグリッドと低いグリッドとの間で推定精度の差が大きい）が軽減されたと考えられる。 In addition, because the learning unit 130 (regression model generation unit 132) uses the regression learning chip image set 126 to learn the number of vehicles only from “vehicle present” chip images, it can more easily learn the features of a chip image that relate to vehicle density. Learning of these density-related features is thereby strengthened, and it is considered that this mitigated the other regression problem described in Study (2), namely the large difference in estimation accuracy between grids with a high density of parked vehicles and grids with a low density.

［ハードウェア構成例］ [Hardware configuration example]
図２４は、車両台数推定装置１００が備えるコンピューターのハードウェア構成を示すブロック図である。車両台数推定装置１００の機能、使用目的に合わせてコンピューターの各部は取捨選択されてもよい。 FIG. 24 is a block diagram illustrating a hardware configuration of a computer included in the vehicle number estimation device 100. Each part of the computer may be selected according to the function and purpose of use of the vehicle number estimation device 100.

コンピューター２００は、バス２０４にそれぞれ接続されたＣＰＵ（Central Processing Unit）２０１、ＲＯＭ（Read Only Memory）２０２、ＲＡＭ（Random Access Memory）２０３を備える。さらに、コンピューター２００は、表示部２０５、操作部２０６、不揮発性ストレージ２０７、ネットワークインターフェース２０８を備える。 The computer 200 includes a CPU (Central Processing Unit) 201, a ROM (Read Only Memory) 202, and a RAM (Random Access Memory) 203, which are connected to a bus 204, respectively. Further, the computer 200 includes a display unit 205, an operation unit 206, a nonvolatile storage 207, and a network interface 208.

The CPU 201 reads from the ROM 202 the program code of the software that implements each function of this embodiment, and executes it. The computer 200 may include a processing device such as an MPU (Micro-Processing Unit) instead of the CPU 201. Variables, parameters, and the like generated during arithmetic processing are temporarily written to the RAM 203. The operation of the vehicle number estimation device 100 shown in FIGS. 17 and 18 is realized by the CPU 201 reading the program from the ROM 202 and executing it.

Instead of the CPU, an MPU (Micro-Processing Unit), a GPU (Graphics Processing Unit) that executes image processing at high speed, or the like may be used. For example, each function of this embodiment may be implemented using GPGPU (General-Purpose computing on Graphics Processing Units), a technique for applying GPU capabilities to purposes other than image processing.

The display unit 205 is, for example, a liquid crystal display monitor, and displays the results of processing performed by the computer 200 and the like. The operation unit 206 is, for example, a keyboard, a mouse, or a touch panel, through which the user can enter predetermined operations and instructions. For example, the user can operate the operation unit 206 to specify a designated region in the learning target image and the analysis target image.

Examples of the nonvolatile storage 207 include an HDD (Hard Disk Drive), an SSD (Solid State Drive), a flexible disk, an optical disk, a magneto-optical disk, a CD-ROM, a CD-R, a magnetic tape, and a nonvolatile memory card. In addition to an OS (Operating System) and various parameters, the nonvolatile storage 207 stores programs for causing the computer 200 to function. For example, the nonvolatile storage 207 stores the learning target images and their correct answer values, the analysis target images, and the like. Various parameters relating to the network configurations of the classification model 135 and the regression model 136 may also be stored.

The network interface 208 is, for example, a NIC (Network Interface Card), and can transmit and receive various kinds of data between devices via a network such as a LAN.

<4. Others>
The embodiment described above shows an example of estimating the number of parked vehicles as the target objects, but the invention is not limited to this example. Examples of target objects include features appearing in an image, such as ships sailing in a given sea area, vehicles in a traffic jam on a road, pitched camping tents, wildlife in a given area, and crops such as cabbage (harvest yield).

In the embodiment described above, a satellite image is given as an example of the analysis target image. However, the analysis target of the present invention is not limited to satellite images; various images, such as aerial photographs and images taken with an ordinary camera, can be analyzed.

The embodiment described above shows an example in which the operation of the vehicle number estimation device 100 is performed by software, but part of the operation may be performed by hardware. For example, part or all of the preprocessing unit 120 may be implemented in hardware.

Furthermore, the present invention is not limited to the embodiments described above, and it goes without saying that various other applications and modifications are possible without departing from the gist of the present invention as set forth in the claims.

For example, the embodiments described above explain the configurations of the device and the system in detail and in concrete terms in order to make the present invention easy to understand, and the invention is not necessarily limited to configurations that include every element described. Part of the configuration of one embodiment may be replaced with the configuration of another embodiment, and the configuration of another embodiment may be added to the configuration of a given embodiment. For part of the configuration of each embodiment, other configurations may also be added, deleted, or substituted.

DESCRIPTION OF REFERENCE SIGNS: 100 ... vehicle number estimation device, 110 ... learning database, 120 ... preprocessing unit, 121 ... color tone correction unit, 122 ... image dividing unit, 123 ... learning chip image set generation unit, 125 ... classification learning chip image set, 126 ... regression learning chip image set, 130 ... learning unit, 131 ... classification model generation unit, 132 ... regression model generation unit, 135 ... classification model, 136 ... regression model, 140 ... analysis processing unit, 141 ... vehicle determination unit, 142 ... number estimation unit, 150 ... post-processing unit

Claims

A target object number estimation device comprising: a target object determination unit that determines, for each of a plurality of small images constituting an analysis target image, whether a target object is present in the small image, using a classification model that has learned features of a learning target image and correct answer values for the presence or absence of the target object in the learning target image; and
a number estimation unit that estimates the number of target objects contained in each small image determined by the target object determination unit to contain the target object, using a regression model that has learned features of the learning target image and correct answer values for the number of target objects contained in the learning target image.

The target object number estimation device according to claim 1, further comprising a post-processing unit that totals the numbers of target objects, as estimated by the number estimation unit, contained in the small images determined to contain the target object.

The target object number estimation device according to claim 1, further comprising an image dividing unit that divides a designated region of the analysis target image to generate the plurality of small images.

The target object number estimation device according to any one of claims 1 to 3, further comprising a color tone correction unit that corrects the color tone of the analysis target image.

The target object number estimation device according to claim 1, further comprising a learning unit that has a configuration in which a plurality of nodes, each outputting an operation result for input data, are connected in multiple layers, and that generates the classification model and the regression model by learning abstracted features of the small images through supervised learning.

The target object number estimation device according to claim 1, wherein the analysis target image is a satellite image, and the target object is a feature appearing in the satellite image.

A target object number estimation method comprising: determining, for each of a plurality of small images constituting an analysis target image, whether a target object is present in the small image, using a classification model that has learned features of a learning target image and correct answer values for the presence or absence of the target object in the learning target image; and
estimating the number of target objects contained in each small image determined to contain the target object, using a regression model that has learned features of the learning target image and correct answer values for the number of target objects contained in the learning target image.

A program that causes a computer to execute: a procedure of determining, for each of a plurality of small images constituting an analysis target image, whether a target object is present in the small image, using a classification model that has learned features of a learning target image and correct answer values for the presence or absence of the target object in the learning target image; and
a procedure of estimating the number of target objects contained in each small image determined to contain the target object, using a regression model that has learned features of the learning target image and correct answer values for the number of target objects contained in the learning target image.