CN110503020A

CN110503020A - A deep-learning-based method for aerial remote sensing target detection

Info

Publication number: CN110503020A
Application number: CN201910757510.6A
Authority: CN
Inventors: 余阳; 浦剑
Original assignee: East China Normal University
Current assignee: East China Normal University
Priority date: 2019-08-16
Filing date: 2019-08-16
Publication date: 2019-11-26

Abstract

The invention discloses a deep-learning-based method for target detection in aerial remote sensing images. The method includes data augmentation, performed with affine transformation and neighbor interpolation, to improve detection accuracy. To address multi-scale detection, and small objects in particular, the detection network is trained from scratch with batch normalization and an increased learning rate, which adapts it better to the detection task. In addition, the feature extraction network is redesigned to reduce the loss of local information: max pooling is removed, downsampling is performed by convolution instead, and the convolution kernels in the network are replaced so that local information is retained, thereby improving the detector's performance on aerial remote sensing targets.

Description

A deep-learning-based method for aerial remote sensing target detection

Technical field

The present invention relates to the fields of deep learning, image processing, and remote sensing, and in particular to a method for detecting small targets in aerial remote sensing images.

Background art

With the rapid development of remote sensing satellite technology, the quality and quantity of remote sensing images have improved greatly. This large volume of data has become a very important resource, with wide application and significant research value in both civil and military fields. Deep learning has become an important method in aerial remote sensing research and is receiving growing attention from academia and industry at home and abroad, and target detection in the remote sensing field has great application and research value.

Remote sensing images have several particular characteristics. The first is scale diversity: the shooting height ranges from a few hundred meters to nearly ten thousand meters, and objects of the same category also differ greatly in size. Among ships on the water, for example, a large ship can reach several hundred meters in length, while a small boat is often only a few tens of meters long. Second, unlike the horizontal viewing angle of ordinary images, remote sensing images are mostly taken looking down from high altitude, so a detection network that performs well on conventional datasets does not necessarily perform well on aerial remote sensing images. Third, there is the small-target problem: many objects in remote sensing images occupy only a few tens of pixels, or even just a few pixels, and pooling layers shrink small targets further. The shooting angle of remote sensing images also makes the orientation of target objects uncertain. Finally, the background within the field of view of a remote sensing image is highly complex and can therefore interfere strongly with object detection. These problems increase the difficulty of target detection in remote sensing images.

Summary of the invention

The object of the present invention is to provide a deep-learning-based method for aerial remote sensing target detection. The method reduces the number of parameters and speeds up detection, so that time cost and computational redundancy are lower, while detection accuracy is also improved.

The specific technical solution for achieving the object of the invention is:

A deep-learning-based method for aerial remote sensing target detection, comprising the following specific steps:

Step 1: Construct the aerial remote sensing dataset

Aerial remote sensing images are classified and annotated to form a set in which images and annotations correspond one to one; this set constitutes the aerial remote sensing dataset. The dataset is augmented using affine transformation and neighbor interpolation, and the augmented dataset is then divided into a test set and a training set;

Step 2: Construct the deep neural network structure

A deep neural network is designed and constructed according to the characteristics of remote sensing images, and is then trained;

Step 3: Test the constructed deep neural network using the aerial remote sensing dataset.

The data augmentation of the aerial remote sensing dataset using affine transformation and neighbor interpolation described in step 1 of the present invention is specifically:

The affine transformation method is used, including cropping, rotation, and flipping;

The neighbor interpolation method is used, where $(x_i, y_i)$ and $(x_j, y_j)$ are sample pairs from the original dataset, obtained by random sampling from the data; x is an image sample, y is its corresponding label, and $\lambda \in [0, 1]$. The newly generated sample pair $(\tilde{x}, \tilde{y})$ is

$$\tilde{x} = \lambda x_i + (1 - \lambda) x_j, \qquad \tilde{y} = \lambda y_i + (1 - \lambda) y_j.$$

The construction of the deep neural network described in step 2 of the present invention is specifically:

The network structure is based on ResNet. First, the first downsampling operation of ResNet is cancelled, max pooling is removed, and downsampling is performed by convolution; the convolution kernels in ResNet are replaced so that small-object information is retained, improving the detector's performance on small, densely distributed targets. Second, no pre-trained model is used; batch normalization is added to the network so that the data follow a distribution with mean 0 and standard deviation 1. Because batch normalization is added, the learning rate can be increased, which further improves detection. The deep neural network model is then trained.

The test process described in step 3 of the present invention includes:

Using the aerial remote sensing dataset built in step 1 and the deep neural network constructed in step 2, an end-to-end test is carried out:

1) An image to be detected is fed into the classification network to obtain feature maps of different sizes;

2) Six layers of feature maps are extracted, and at every point of these feature maps six bounding boxes of different scales are constructed; detection and classification are then performed on each, producing multiple bounding boxes;

3) Non-maximum suppression is applied to the bounding boxes obtained from the different feature maps to remove overlapping or incorrect boxes, producing the final set of bounding boxes;

4) The results are evaluated. FN: judged to be a negative sample but actually a positive sample; FP: judged to be a positive sample but actually a negative sample; TN: judged to be a negative sample and actually a negative sample; TP: judged to be a positive sample and actually a positive sample. Precision P = TP/(TP+FP) and recall R = TP/(TP+FN) are computed. When P is very high, R drops noticeably, and conversely, when R is very high, P drops noticeably, so the F1 score is used for an overall evaluation: F1 = 2*P*R/(P+R).

Compared with existing methods, the beneficial effects of the present invention are:

1) The neighbor interpolation data augmentation method proposed here greatly improves generalization ability and reduces the cost incurred by erroneous labels.

2) The proposed deep neural network model retains part of the information that would otherwise be lost from the original image, especially small-object information, and improves classification ability by a large margin.

3) The proposed deep neural network model reduces the number of parameters and speeds up detection, so that time cost and computational redundancy are lower, while detection accuracy is also improved.

Brief description of the drawings

Fig. 1 is a flow chart of the invention.

Specific embodiment

The present invention will be further described with reference to the accompanying drawing.

Referring to Fig. 1, the present invention comprises the following steps:

1. Construct the aerial remote sensing dataset

Aerial remote sensing images are classified and annotated, producing a set in which images and annotations correspond one to one. The aerial remote sensing dataset is augmented using affine transformation, superimposed Gaussian noise, and the newly proposed neighbor interpolation method. The dataset is then divided proportionally into a training set and a test set, which are used for training and testing in the following steps.

To address the small number of annotated samples, the existing dataset is augmented using affine transformation and neighbor interpolation.
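The affine augmentations named above (cropping, rotation, flipping) can be illustrated with a minimal sketch. It assumes images are plain nested lists of pixel values and restricts rotation to multiples of 90 degrees; both are simplifications for illustration, not details given in the source. The function names are hypothetical, and in a detection setting the box annotations would have to undergo the same transformation.

```python
import random

def crop(image, top, left, height, width):
    """Return the height x width window whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]

def rotate90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]

def augment(image, crop_size, rng):
    """Produce one randomly cropped, rotated, and possibly flipped copy."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - crop_size)
    left = rng.randint(0, width - crop_size)
    out = crop(image, top, left, crop_size, crop_size)
    for _ in range(rng.randint(0, 3)):  # assumption: rotations in 90-degree steps
        out = rotate90(out)
    if rng.random() < 0.5:
        out = flip_horizontal(out)
    return out

# Toy 4x4 "image" whose pixel values encode their position.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
sample = augment(image, crop_size=2, rng=random.Random(0))
```

Each call to `augment` yields one new sample from an existing image, so the number of augmented samples is controlled by how many times it is called per image.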

Neighbor interpolation is the proposed method of dataset augmentation and is effective; the detailed procedure is as follows:

In supervised learning, the goal is usually to find a function f that describes the relationship between a feature vector X and a target vector Y, which follow the joint distribution P(X, Y). A loss function l(f(x), y) is therefore defined to penalize the difference between the prediction f(x) and the true target y. The average loss over the joint distribution P is then minimized; this is the expected risk R(f):

$$R(f) = \int l(f(x), y)\, dP(x, y)$$

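Although the distribution P is unknown in most cases, it can be approximated from the training set $\{(x_i, y_i)\}_{i=1}^{n}$ by the empirical distribution $P_\delta(x, y)$:

$$P_\delta(x, y) = \frac{1}{n} \sum_{i=1}^{n} \delta(x = x_i, y = y_i)$$

where $\delta(x = x_i, y = y_i)$ is a Dirac mass centered at $(x_i, y_i)$ and n is the number of training samples.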

Using the empirical distribution over the n training samples, the expected risk can be approximated by the empirical risk $R_\delta(f)$:

$$R_\delta(f) = \int l(f(x), y)\, dP_\delta(x, y) = \frac{1}{n} \sum_{i=1}^{n} l(f(x_i), y_i)$$

Learning the function f by minimizing the approximate expected risk in the formula above is the traditional empirical risk minimization principle.
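As a small numerical illustration, the empirical risk is simply the average loss over the training pairs. The sketch below assumes a squared-error loss and a toy one-dimensional dataset; neither comes from the source, and the names are hypothetical.

```python
def squared_loss(prediction, target):
    """Assumed loss l(f(x), y); the source does not fix a particular loss."""
    return (prediction - target) ** 2

def empirical_risk(f, samples, loss):
    """Average loss of f over the training pairs (x_i, y_i)."""
    return sum(loss(f(x), y) for x, y in samples) / len(samples)

# Toy training set generated by y = 2x.
samples = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0)]
risk_good = empirical_risk(lambda x: 2.0 * x, samples, squared_loss)
risk_bad = empirical_risk(lambda x: x, samples, squared_loss)
```

Empirical risk minimization selects the candidate with the lower value, here the function that reproduces the training targets exactly.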

There are many ways to approximate the true distribution, and the naive empirical estimate is only one of them. Under the vicinal risk minimization principle, the distribution P is instead approximated by

$$P_\nu(\tilde{x}, \tilde{y}) = \frac{1}{n} \sum_{i=1}^{n} \nu(\tilde{x}, \tilde{y} \mid x_i, y_i)$$

where $\nu$ is a neighborhood distribution that gives the probability of finding a virtual feature-target pair $(\tilde{x}, \tilde{y})$ in the neighborhood of the training feature-target pair $(x_i, y_i)$.

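A new neighborhood distribution is introduced, namely:

$$\nu(\tilde{x}, \tilde{y} \mid x_i, y_i) = \frac{1}{n} \sum_{j=1}^{n} \mathbb{E}_{\lambda}\left[\delta\left(\tilde{x} = \lambda x_i + (1 - \lambda) x_j,\ \tilde{y} = \lambda y_i + (1 - \lambda) y_j\right)\right]$$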

In this case, the resulting virtual feature-target vector can be expressed as follows:

$$\tilde{x} = \lambda x_i + (1 - \lambda) x_j, \qquad \tilde{y} = \lambda y_i + (1 - \lambda) y_j$$

In the formula above, $(x_i, y_i)$ and $(x_j, y_j)$ are sample pairs from the original dataset, drawn at random from the training data, and $\lambda \in [0, 1]$. The hyperparameter $\lambda$ of neighbor interpolation controls the degree of interpolation between feature-target pairs.
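Neighbor interpolation as described here can be sketched in a few lines. The sketch assumes the interpolation is the convex combination λ·(x_i, y_i) + (1 − λ)·(x_j, y_j), that images are flattened lists of pixel values, that labels are one-hot class vectors, and that λ is drawn uniformly from [0, 1); the source only states that λ lies in [0, 1]. All function names are hypothetical.

```python
import random

def neighbor_interpolate(sample_i, sample_j, lam):
    """Blend two (image, label) pairs with weight lam in [0, 1]."""
    (x_i, y_i), (x_j, y_j) = sample_i, sample_j
    x_new = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y_new = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x_new, y_new

def augment_dataset(dataset, count, rng):
    """Generate `count` virtual pairs from randomly drawn training pairs."""
    virtual = []
    for _ in range(count):
        sample_i, sample_j = rng.sample(dataset, 2)
        lam = rng.random()  # assumption: lam drawn uniformly from [0, 1)
        virtual.append(neighbor_interpolate(sample_i, sample_j, lam))
    return virtual

# Two toy samples: 2-pixel "images" with one-hot labels for two classes.
dataset = [([0.0, 1.0], [1.0, 0.0]), ([1.0, 0.0], [0.0, 1.0])]
x_mix, y_mix = neighbor_interpolate(dataset[0], dataset[1], lam=0.25)
virtual_pairs = augment_dataset(dataset, count=4, rng=random.Random(0))
```

With λ close to 1 the virtual sample stays near the first pair, and with λ close to 0 it stays near the second, which is the sense in which λ controls the degree of interpolation.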

2. Train the neural network target detection model:

A deep neural network model is designed and trained on the aerial remote sensing dataset to obtain the deep learning neural network model.

A network structure based on ResNet is designed. The ResNet structure begins with a convolution kernel and a max pooling operation that both downsample, reducing the original image by a factor of four, so much of the local information needed for detection is lost. To reduce this loss, the stride of the first convolutional layer of ResNet is first changed from 2 to 1, which cancels the first downsampling operation. Max pooling is removed and downsampling is performed by convolution, and the 7x7 convolution in ResNet is replaced with three 3x3 convolutional layers. Local information is thereby retained, improving the detector's performance on small, densely distributed targets.
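The effect of this change on spatial resolution can be checked with the standard convolution output-size formula. The sketch below is an illustration under stated assumptions: the padding values, the choice of stride 1 for all three 3x3 layers, and the 512-pixel input are hypothetical, and the later layers where convolution performs the downsampling are not modeled because the source does not specify them.

```python
def conv_output_size(size, kernel, stride, padding):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

def stack_output_size(size, layers):
    """Apply a list of (kernel, stride, padding) layers in sequence."""
    for kernel, stride, padding in layers:
        size = conv_output_size(size, kernel, stride, padding)
    return size

def receptive_field(layers):
    """Receptive field of one output unit after the stacked layers."""
    field, jump = 1, 1
    for kernel, stride, _ in layers:
        field += (kernel - 1) * jump
        jump *= stride
    return field

# Standard ResNet stem: 7x7 conv with stride 2, then 3x3 max pooling with stride 2.
original_stem = [(7, 2, 3), (3, 2, 1)]
# Assumed modified stem: three 3x3 convs, stride 1, padding 1, no max pooling.
modified_stem = [(3, 1, 1)] * 3

input_size = 512  # hypothetical input resolution
original_size = stack_output_size(input_size, original_stem)
modified_size = stack_output_size(input_size, modified_stem)
```

Under these assumptions the original stem shrinks each side by a factor of four, while the modified stem keeps full resolution. The three stacked 3x3 layers still cover a 7x7 receptive field, the same as the single 7x7 convolution they replace.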

The detection network is trained from scratch with batch normalization, because the requirements of the detection task and the classification task differ. First, the pre-trained classification model is abandoned and the detection model is trained from scratch. Second, batch normalization is added and placed before the activation function ReLU. The mean of one layer's output data $x_1$ to $x_m$ is computed as $\mu_\beta = \frac{1}{m} \sum_{i=1}^{m} x_i$, and the standard deviation of that layer's output data as $\sigma_\beta = \sqrt{\frac{1}{m} \sum_{i=1}^{m} (x_i - \mu_\beta)^2}$, where m is the size of the training batch. The data are then normalized to give the result $\hat{x}_i = \frac{x_i - \mu_\beta}{\sqrt{\sigma_\beta^2 + \varepsilon}}$, where $\varepsilon$ is a very small number close to 0 that prevents the denominator from being zero. In addition, the learning rate is increased. This addresses the problems above and adapts the network better to the detection task.
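The normalization step described above can be sketched for a single feature over one batch. This is a minimal illustration that assumes the standard batch normalization formulas (batch mean, batch variance, and a small ε under the square root); the learnable scale and shift used in common implementations are omitted because the source does not mention them, and the function names are hypothetical.

```python
import math

def batch_normalize(values, eps=1e-5):
    """Normalize one feature over a batch to roughly zero mean and unit variance."""
    m = len(values)
    mean = sum(values) / m
    variance = sum((v - mean) ** 2 for v in values) / m
    # eps keeps the denominator away from zero when the variance vanishes.
    return [(v - mean) / math.sqrt(variance + eps) for v in values]

def relu(values):
    """Activation applied after normalization, as in the described ordering."""
    return [max(0.0, v) for v in values]

batch = [2.0, 4.0, 6.0, 8.0]  # hypothetical layer outputs for a batch of size 4
normalized = batch_normalize(batch)
activated = relu(normalized)
```

Because the normalized values are centered on zero, placing the normalization before ReLU means roughly half of the batch passes through the activation unchanged and the rest is clipped to zero.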

3. Perform target detection using the trained neural network model:

Target detection is performed on aerial remote sensing images using the deep learning neural network model.

An image to be detected is fed into the classification network to obtain feature maps of different sizes.

Six layers of feature maps are extracted, and at every point of these feature maps six bounding boxes of different scales are constructed; detection and classification are then performed on each, producing multiple bounding boxes.
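The construction of six boxes at every feature-map point can be sketched as follows. The feature-map sizes, the per-layer scales, and the six box shapes are hypothetical values chosen for illustration; the source specifies only that six layers are used and that six boxes of different sizes are built at each point.

```python
import math

def default_boxes(map_size, scale, aspect_ratios):
    """Boxes (cx, cy, w, h) in relative [0, 1] coordinates for every map cell."""
    boxes = []
    for row in range(map_size):
        for col in range(map_size):
            cx = (col + 0.5) / map_size  # box centered on the cell
            cy = (row + 0.5) / map_size
            for ratio in aspect_ratios:
                w = scale * math.sqrt(ratio)
                h = scale / math.sqrt(ratio)
                boxes.append((cx, cy, w, h))
    return boxes

# Assumed (map size, box scale) for the six extracted feature layers.
feature_maps = [(38, 0.10), (19, 0.26), (10, 0.42), (5, 0.58), (3, 0.74), (1, 0.90)]
# Assumed six box shapes per point.
aspect_ratios = [1.0, 2.0, 0.5, 3.0, 1.0 / 3.0, 1.5]

all_boxes = [
    box
    for map_size, scale in feature_maps
    for box in default_boxes(map_size, scale, aspect_ratios)
]
```

Finer feature maps contribute many small boxes and coarser maps a few large ones, which is how a multi-layer design of this kind covers objects at very different scales.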

The bounding boxes obtained from the different feature maps are combined, and non-maximum suppression is used to remove overlapping or incorrect bounding boxes, producing the final set of bounding boxes.
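A minimal sketch of greedy non-maximum suppression is given below. It assumes boxes in (x1, y1, x2, y2) corner format, one confidence score per box, and an overlap threshold of 0.5; the source names the technique but gives none of these details, and the example boxes are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Keep the highest-scoring boxes, dropping any that overlap a kept box too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep

# Two heavily overlapping detections and one separate detection.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
```

The second box overlaps the first with an IoU of about 0.68 and is suppressed, while the third box does not overlap anything and is kept.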

The results are evaluated. FN: judged to be a negative sample but actually a positive sample. FP: judged to be a positive sample but actually a negative sample. TN: judged to be a negative sample and actually a negative sample. TP: judged to be a positive sample and actually a positive sample. Precision P = TP/(TP+FP) and recall R = TP/(TP+FN) are computed. When P is very high, R drops noticeably, and conversely, when R is very high, P drops noticeably, so the F1 score is used for an overall evaluation: F1 = 2*P*R/(P+R).
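The three evaluation measures follow directly from the counts of true positives, false positives, and false negatives. The counts used below are hypothetical and serve only to show the arithmetic.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (P, R, F1) from detection counts, guarding against empty denominators."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Hypothetical counts: 80 correct detections, 20 false alarms, 10 missed targets.
p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=10)
```

Because F1 is the harmonic mean of P and R, it is pulled toward the lower of the two, so a detector cannot score well by maximizing one measure at the expense of the other.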

Embodiment

This embodiment uses 1200 image samples from Google Earth. Cropping adds 6000 image samples, rotation and flipping add 7200, and neighbor interpolation adds a further 7200. Together these form a dataset of 21600 image samples, which is divided into a training set and a test set: the training set is 75% of the data (16200 images) and the test set is 25% (5400 images).
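The sample counts and the 75/25 split of this embodiment can be reproduced with a short sketch. The random shuffle and the fixed seed are assumptions; the source states only the proportions.

```python
import random

def split_dataset(samples, train_fraction, seed=0):
    """Shuffle the samples and split them into training and test subsets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # assumption: random assignment
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Sample counts stated in the embodiment.
counts = {
    "original": 1200,
    "cropping": 6000,
    "rotation_and_flipping": 7200,
    "neighbor_interpolation": 7200,
}
total = sum(counts.values())
train, test = split_dataset(range(total), train_fraction=0.75)
```

Here the samples are represented by their indices, so `train` and `test` hold disjoint index lists that can be used to select the corresponding images and annotations.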

The deep neural network is then trained on the training set to obtain the deep neural network model, which is tested on the test set. Multiple groups of precision and recall values are computed, and the resulting F1 score reaches 0.89 ± 0.08.

Claims

1. A deep-learning-based method for aerial remote sensing target detection, characterized in that the method comprises the following specific steps:

Step 1: Construct the aerial remote sensing dataset

Aerial remote sensing images are classified and annotated to form a set in which images and annotations correspond one to one; this set constitutes the aerial remote sensing dataset. The dataset is augmented using affine transformation and neighbor interpolation, and the augmented dataset is then divided into a test set and a training set;

Step 2: Construct the deep neural network structure

2. The deep-learning-based method for aerial remote sensing target detection according to claim 1, characterized in that the data augmentation of the aerial remote sensing dataset using affine transformation and neighbor interpolation described in step 1 is specifically:

The affine transformation method is used, including cropping, rotation, and flipping;

The neighbor interpolation method is used, where $(x_i, y_i)$ and $(x_j, y_j)$ are sample pairs from the original dataset, obtained by random sampling from the data; x is an image sample, y is its corresponding label, and $\lambda \in [0, 1]$. The newly generated sample pair $(\tilde{x}, \tilde{y})$ is

$$\tilde{x} = \lambda x_i + (1 - \lambda) x_j, \qquad \tilde{y} = \lambda y_i + (1 - \lambda) y_j.$$

3. The deep-learning-based method for aerial remote sensing target detection according to claim 1, characterized in that the construction of the deep neural network in step 2 is specifically:

The network structure is based on ResNet. First, the first downsampling operation of ResNet is cancelled, max pooling is removed, and downsampling is performed by convolution; the convolution kernels in ResNet are replaced so that small-object information is retained, improving the detector's performance on small, densely distributed targets. Second, batch normalization is added to the network so that the data follow a distribution with mean 0 and standard deviation 1, and the learning rate is increased. The deep neural network model is then trained.

4. The deep-learning-based method for aerial remote sensing target detection according to claim 1, characterized in that the test process described in step 3 includes:

Using the aerial remote sensing dataset built in step 1 and the deep neural network constructed in step 2, an end-to-end test is carried out:

1) An image to be detected is fed into the classification network to obtain feature maps of different sizes;

2) Six layers of feature maps are extracted, and at every point of these feature maps six bounding boxes of different scales are constructed; detection and classification are then performed on each, producing multiple bounding boxes;

3) Non-maximum suppression is applied to the bounding boxes obtained from the different feature maps to remove overlapping or incorrect boxes, producing the final set of bounding boxes;

4) The results are evaluated. FN: judged to be a negative sample but actually a positive sample; FP: judged to be a positive sample but actually a negative sample; TN: judged to be a negative sample and actually a negative sample; TP: judged to be a positive sample and actually a positive sample. Precision P = TP/(TP+FP) and recall R = TP/(TP+FN) are computed.