CN106909902B - Remote sensing target detection method based on improved hierarchical significant model


Info

Publication number
CN106909902B
CN106909902B
Authority
CN
China
Prior art keywords
target
image
learning
layer
feature
Prior art date
Legal status
Active
Application number
CN201710115840.6A
Other languages
Chinese (zh)
Other versions
CN106909902A (en)
Inventor
赵丹培 (Zhao Danpei)
马媛媛 (Ma Yuanyuan)
姜志国 (Jiang Zhiguo)
谢凤英 (Xie Fengying)
史振威 (Shi Zhenwei)
张浩鹏 (Zhang Haopeng)
Current Assignee
Beihang University
Original Assignee
Beihang University
Priority date
Filing date
Publication date
Application filed by Beihang University filed Critical Beihang University
Priority to CN201710115840.6A
Publication of CN106909902A
Application granted
Publication of CN106909902B


Classifications

    • G06V20/13: Satellite images (G06V20/00 Scenes; scene-specific elements; G06V20/10 Terrestrial scenes)
    • G06F18/217: Validation; performance evaluation; active pattern learning techniques (G06F18/00 Pattern recognition; G06F18/21 Design or setup of recognition systems)
    • G06F18/253: Fusion techniques of extracted features (G06F18/25 Fusion techniques)
    • G06V10/25: Determination of region of interest [ROI] or a volume of interest [VOI] (G06V10/20 Image preprocessing)
    • G06V10/56: Extraction of image or video features relating to colour (G06V10/40 Extraction of image or video features)


Abstract

The invention relates to a remote sensing target detection method based on an improved hierarchical saliency model, comprising the steps of: ① performing superpixel segmentation on an input image; ② extracting the bottom-layer features of each superpixel in the input image to construct a global information set and a background information set; ③ learning the feature similarity between each superpixel and the background information set to extract a potential target feature map; ④ generating target feature maps for the airport and oil tank targets; ⑤ fusing the potential target feature map and the target feature maps to generate a hierarchical saliency map; ⑥ defining an end criterion for the adaptive learning with LDA: if the criterion is met, executing ⑧, otherwise executing ⑦; ⑦ using the feedback mechanism between layers, applying the hierarchical saliency map as an enhancement factor to the input image of the current layer, taking the enhanced image as the input image for the next layer of learning, and executing ① to start the learning of a new layer; ⑧ taking the hierarchical saliency map of step ⑤ as the final saliency map of the current layer to determine the position and class label of the target, completing the remote sensing target detection.

Description

Remote sensing target detection method based on improved hierarchical significant model
Technical Field
The invention belongs to the field of remote sensing image processing and scene understanding, and relates to a method for simultaneously detecting multiple remote sensing targets of interest in low-resolution remote sensing images using an improved hierarchical reinforcement learning model.
Background
With the rapid development of high-resolution remote sensing satellite technology, the detection of typical remote sensing targets has become crucial in both the military and civil fields. In the military field in particular, accurately and quickly obtaining target position information from large-format remote sensing images with complex backgrounds provides technical support for precision guidance. Driven by rapid progress in image processing and pattern recognition theory, deep mining and intelligent processing of the rich information in remote sensing images have become a research hotspot and challenge. Remote sensing images carry rich information and large data volumes; they contain complicated and changeable ground environments and artificial targets of varying shapes, while image quality is degraded by blurring from uneven illumination and cloud occlusion, and targets are deformed and distorted by atmospheric refraction and turbulence. The acquisition process is further affected by the imaging equipment and the weather: when illumination changes or rain, snow, smoke, or dense fog interferes, the sharpness of the acquired image decreases; refraction caused by accumulated water can alter the apparent appearance and texture of a target, and hence its shape and gray-scale characteristics; and when illumination changes, over- or under-exposure of the camera can lose important target information, severely interfering with detection or even causing it to fail.
In addition, targets in remote sensing images vary in shape, color, and structure, which also hampers detection. Apart from bridges, airports, and similar targets that are distributed two-dimensionally in space, most targets are three-dimensional and cast shadows, increasing the difficulty of detection. Occlusion by vegetation or other obstacles leaves target shapes incomplete, and the random acquisition of aerial images makes target positions uncertain. Because this complex external environment makes target detection difficult, choosing a suitable method to detect remote sensing targets in images with complex backgrounds, varied environments, and rich information content has become a key research focus for many scholars.
Current detection and recognition algorithms for typical targets mainly include clustering-based methods, feature-matching-based methods, and classifier-based methods. Clustering-based methods are unsupervised; although they need no training samples, saving the time of manual labeling and sample training, target distortion and deformation in remote sensing images cause false and missed detections, so these methods are severely limited and hard to apply widely to remote sensing target detection. Feature-matching-based methods usually match low-level features, such as texture features and local feature descriptors, against template features to realize detection; they are computationally expensive and adapt poorly. Classifier-based methods mainly use support vector machines (SVM), bag-of-features models, Adaboost classifiers, and neural network classifiers. These methods first require enough training samples to train the classifier, then require selecting suitable and effective features for classification, and demand manual calibration of a large number of sample labels, a significant time cost.
The patent CN103729848A provides a hyperspectral remote sensing image small target detection method based on spectral saliency, the method selects spectral information and spatial information of an image to construct a feature vector, an improved Itti model and an improved evolution planning method are applied to a local saliency map and a global saliency map, and a total visual saliency map is finally generated to serve as a final target detection result. The method can only detect the approximate position of the target area, but cannot obtain the accurate boundary information of the target, and has poor adaptability to the target detection problem under the complex background.
The patent CN102214298A proposes a method for rapidly detecting and identifying an airport target based on a remote sensing image of a selective visual attention mechanism, which utilizes an improved attention selection model (GBVS) to obtain a salient region of the remote sensing image, and then combines an HDR tree according to SIFT characteristics on the region to achieve the purpose of identifying the airport target. Patent CN104156722A proposes an airport target detection method based on high-resolution remote sensing images, which detects parallel straight lines in images as airport runways, and has poor robustness to uncertain factors such as distortion and occlusion in the image shooting process.
Patent 201610247053.2 proposes a saliency detection model based on hierarchical reinforcement learning for detecting airport targets in low-resolution remote sensing images. However, that method handles only the single-target airport case and does not address the detection of other remote sensing targets. On that basis, the present invention provides an improved hierarchical saliency model for remote sensing target detection that can accurately detect airport and oil tank targets in remote sensing images with complex backgrounds.
Disclosure of Invention
The invention solves the problems: aiming at the problem of detecting the remote sensing target in the low-resolution remote sensing image, the improved method for detecting the remote sensing target of the hierarchical significant model is provided, and the problem of detecting the target in the remote sensing image under the condition of large breadth and low resolution can be accurately and efficiently solved.
The technical scheme of the invention is as follows: a remote sensing target detection method based on an improved hierarchical significant model mainly aims at two types of targets, namely an airport target and an oil tank target, and comprises the following steps:
Step 1: performing superpixel segmentation on the input image to be detected using the simple linear iterative clustering (SLIC) algorithm, clustering neighboring pixels with similar colors and representing each cluster by a superpixel, thereby obtaining a superpixel-segmented image;
Step 2: taking the color feature of each superpixel in the segmented image as its bottom-layer feature to construct a global information set, and simultaneously extracting the bottom-layer features of all superpixels located on the boundary of the segmented image to construct a background information set;
Step 3: learning the feature similarity between each superpixel and the background information set using a least-distance-based similarity measure (LDSM) operator; the feature similarity is represented by the LDSM learning coefficient, and the closer the coefficient is to 1, the more similar the superpixel is to the background information set; a potential target feature map is constructed from the learning coefficients, which reflects the feature difference between each superpixel region and the background information set and contains all candidate salient target regions;
Step 4: extracting prior features of the airport target and the oil tank target from the input image to be detected in a top-down manner, and generating a target feature map for each. For the airport target, line detection is performed on the input image with the line segment detector (LSD) to obtain a line detection result; then, using the superpixel-segmented image from step 1, the number of pixels lying on detected lines within each superpixel region is counted to generate a line density map, which serves as the target feature map of the airport target. For the oil tank target, circle detection is performed on the input image with the Hough transform; different weights are assigned to points inside and outside the detected circles by a voting mechanism to generate a circle feature map, which serves as the target feature map of the oil tank target;
Step 5: fusing the potential target feature map of step 3 with the airport and oil tank target feature maps of step 4 (three maps in total) to generate a hierarchical saliency map; the salient regions in the hierarchical saliency map comprise the candidate salient target regions from the potential target feature map obtained in each layer of the learning process, the airport target regions, and the oil tank target regions;
Step 6: defining an adaptive-learning end criterion using the latent Dirichlet allocation (LDA) topic model, and using this criterion to judge the feature similarity between the salient regions in the current hierarchical saliency map of step 5 and the airport and oil tank targets, so as to decide whether the current learning process should end; if the learning process has not ended, executing step 7; if it has ended, executing step 8;
Step 7: adopting a multi-layer learning framework in which a feedback mechanism between adjacent layers feeds the hierarchical saliency map of each layer back to the input image of the current layer, achieving layer-by-layer background suppression: the hierarchical saliency map fused in step 5 acts as an enhancement factor on the input image of the current layer, the superpixel-segmented image of the current layer is enhanced, the enhanced image is taken as the input image of the next layer, and step 1 is executed to start the learning process of a new layer; repeating these steps, the target region is gradually highlighted through multi-layer learning;
Step 8: after learning stops, taking the hierarchical saliency map obtained in step 5 of the current layer as the final saliency map, extracting its salient regions as target regions, and marking each target region and the category label of the target (airport target or oil tank target) in the image to be detected, yielding the final remote sensing target detection result.
In the step 1, the specific steps of performing super-pixel segmentation by using the SLIC algorithm are as follows:
the method comprises the steps of taking color features of a color image and position information of each pixel point as constraints for an input image I to be detected, clustering by adopting a K-means clustering algorithm, extracting color features of an image CIELab (Commission International Del' Eclairage), wherein the CIELab is a color system of the CIE, and coordinates of the color space are respectively L, a and b, so that the CIELab color space is called), representing local pixel points with similar color features by superpixels, thereby completing a superpixel segmentation process of the image, selecting the total number of superpixels segmented in the whole image as K, and enabling the segmented image I to comprise K superpixel regions.
In step 2, the specific steps of constructing the global information set and the background information set are as follows:
(1) Given that the image after superpixel segmentation contains k superpixels in total, the color feature of each superpixel is selected to construct the bottom-layer feature: the image is converted to the CIELab color space, the values of all pixels of each superpixel in the three color channels are computed, and the mean over all pixels of the superpixel in each channel is taken as its bottom-layer feature. The bottom-layer feature p_i of the ith superpixel is expressed as:

p_i = (ll_i, la_i, lb_i)    (1)

where ll_i, la_i, lb_i are the mean values, over all pixels of the ith superpixel, of the three CIELab color channels. The feature set of all superpixels is then P = {p_1, p_2, …, p_k}, defined as the global information set, where p_1, p_2, …, p_k are the bottom-layer features of the 1st, 2nd, …, kth superpixel regions of the segmented image; 1 ≤ i ≤ k, i denotes any superpixel in the global information set, and k is the total number of superpixels;
(2) For the global information set P = {p_1, p_2, …, p_k}, suppose n superpixels lie on the image boundary; these n boundary superpixels are extracted to form the background information set, where n is the total number of background superpixels. The background information set B is expressed as:

B = {b_1, b_2, …, b_n},  0 < n < k    (2)

b_j = (ll_j, la_j, lb_j),  1 ≤ j ≤ n    (3)

where j denotes any superpixel in the background information set, b_j is the bottom-layer feature of a superpixel region in the background information set, and ll_j, la_j, lb_j are the mean values, over all pixels of the jth superpixel, of the three CIELab color channels.
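The construction of the global set and the boundary background set can be sketched in pure Python. The superpixel label map and the per-pixel Lab tuples are assumed inputs, and `build_info_sets` is a hypothetical helper name; the sketch returns mean-Lab features keyed by superpixel id.

```python
def build_info_sets(labels, lab):
    """labels: 2-D list of superpixel ids; lab: 2-D list of (l, a, b) tuples.
    Returns the global set P (mean Lab per superpixel, eq. (1)) and the
    background set B (superpixels touching the image boundary, eq. (2))."""
    sums, counts = {}, {}
    h, w = len(labels), len(labels[0])
    for y in range(h):
        for x in range(w):
            i = labels[y][x]
            s = sums.setdefault(i, [0.0, 0.0, 0.0])
            for c in range(3):
                s[c] += lab[y][x][c]
            counts[i] = counts.get(i, 0) + 1
    P = {i: tuple(v / counts[i] for v in s) for i, s in sums.items()}
    # background superpixels are those whose regions touch the image border
    border = set()
    for x in range(w):
        border.update((labels[0][x], labels[h - 1][x]))
    for y in range(h):
        border.update((labels[y][0], labels[y][w - 1]))
    B = {i: P[i] for i in border}
    return P, B
```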
In the step 3, a potential target feature map is constructed by using the learning coefficient, which specifically includes the following steps:
Define superpixel data pairs (p_i, b_j), where p_i is the bottom-layer feature of a superpixel region in the global information set and b_j is the bottom-layer feature of a superpixel region in the background information set, i.e. p_i ∈ P, b_j ∈ B. The LDSM operator solves, for the bottom-layer feature p_i of each superpixel region in the global information set and the bottom-layer feature b_j of each superpixel region in the background information set, the similarity learning coefficient α_ij between them:

α_ij = argmin_α ‖ p_i − α · b_j ‖²    (4)

Solving equation (4) yields the similarity learning coefficient α_ij. When p_i and b_j are close, α_ij is approximately 1; when p_i and b_j are exactly equal, α_ij equals 1; when p_i and b_j differ greatly, α_ij moves away from 1. Since the dissimilarity of p_i and b_j is expressed by the degree to which α_ij deviates from 1, the similarity learning coefficient α_ij is normalized to β_ij according to equation (5):

β_ij = Norm(|α_ij − 1|)    (5)

where Norm(·) denotes normalizing |α_ij − 1| to the interval [0, 1]. The closer the normalized learning coefficient β_ij is to 0, the closer the bottom-layer feature p_i of the global superpixel region is to the bottom-layer feature b_j of the background superpixel region.
By solving the above optimization problem for all pairs, the set of normalized learning coefficients is obtained:

    ⎡ β_11  β_12  …  β_1n ⎤
    ⎢ β_21  β_22  …  β_2n ⎥    (6)
    ⎢  ⋮     ⋮          ⋮  ⎥
    ⎣ β_k1  β_k2  …  β_kn ⎦

where i indexes any row of the matrix in equation (6); the ith row contains the learning coefficients obtained by learning the ith superpixel of the feature set against all background superpixels. The potential feature β_i is defined as the mean of the elements of each row:

β_i = (1/n) · Σ_{j=1}^{n} β_ij    (7)

where n is the total number of background superpixels and β_ij is any element of the matrix in equation (6). From equation (7) the potential target feature map F = (β_1, β_2, …, β_k) is obtained, where k is the total number of superpixels.
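One way to read the LDSM step in code is sketched below: here α_ij is taken as the least-squares scale fitting p_i to b_j (a closed form consistent with α_ij = 1 when p_i = b_j), |α_ij − 1| is normalized over all pairs by dividing by the maximum deviation, and row means give the potential features β_i. The concrete choice of the minimization and the max-normalization are assumptions on our part, since the patent's original equation images are unavailable.

```python
def ldsm_potential_features(P, B):
    """Potential target features in the spirit of eqs. (4)-(7).

    alpha_ij = argmin_a ||p_i - a*b_j||^2 = <p_i, b_j> / <b_j, b_j>,
    so alpha_ij equals 1 exactly when p_i matches b_j at unit scale.
    |alpha_ij - 1| is normalized over all pairs to [0, 1] (eq. (5))
    and averaged over the background set (eq. (7))."""
    dev = [[abs(sum(p * b for p, b in zip(pi, bj)) /
                sum(b * b for b in bj) - 1.0)
            for bj in B] for pi in P]
    hi = max(max(row) for row in dev) or 1.0  # avoid division by zero
    beta = [[d / hi for d in row] for row in dev]
    return [sum(row) / len(row) for row in beta]  # one beta_i per superpixel
```

A superpixel similar to every background superpixel gets β_i near 0 (background-like); one that differs strongly gets β_i near 1 (candidate target).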
In the step 4, the target feature maps of the airport and oil tank targets are extracted, and the following steps are realized:
(1) extracting airport target characteristic diagram
For the input image I to be detected, the LSD line segment detector is first applied to obtain the line information of the whole image; then the line density feature corresponding to the superpixel segmentation of the current layer is computed, with the number of pixels lying on detected lines within each superpixel region as numerator and the area of the corresponding region as denominator. The line density feature d_i of any superpixel region i is expressed by equation (8):

d_i = Num_L(region(i)) / Num(region(i))    (8)

where Num_L is the number of pixels lying on detected lines, Num is the total number of pixels, and region(i) denotes the ith superpixel region. This yields the target feature map of the airport target, i.e. the line density map D = (d_1, d_2, …, d_k), where d_1, d_2, …, d_k are the line density features of the 1st, 2nd, …, kth superpixel regions and k is the total number of superpixels;
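Equation (8) can be sketched directly: given the superpixel label map and a binary mask marking pixels that lie on detected line segments (assumed to come from an LSD detector upstream), the line density of each region is its line-pixel count divided by its area.

```python
def line_density_map(labels, line_mask):
    """Per-superpixel straight-line density, eq. (8):
    d_i = (# line pixels in region i) / (area of region i)."""
    on_line, area = {}, {}
    for row_l, row_m in zip(labels, line_mask):
        for i, on in zip(row_l, row_m):
            area[i] = area.get(i, 0) + 1
            on_line[i] = on_line.get(i, 0) + (1 if on else 0)
    return {i: on_line[i] / area[i] for i in area}
```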
(2) extracting oil tank target characteristic diagram
The circle feature is a typical regional feature of the oil tank target. Circle detection is performed on the input image to be detected with the Hough transform, and the circle detection result is then converted into a feature map, which serves as the oil tank target feature map.
When circle detection is performed with the Hough transform, votes are cast for the triples (a, c, r) formed by the circle center position (a, c) and the circle radius r, where a is the abscissa and c the ordinate of the center. Local peaks in the voting result give the center coordinates and radii of the detected circles. The voting result is used as the weight of the circle feature, and all points satisfying equation (9) share the same weight as the corresponding (a, c, r):

(x − a)² + (y − c)² ≤ r²    (9)

where (x, y) is the coordinate of any position in the target feature map; the original weights of points not inside any circle are retained, yielding the target feature map C of the oil tank target.
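The voting-to-feature-map conversion can be sketched as follows. Each detected circle is assumed to arrive as (a, c, r, w), with w its Hough vote weight; pixels satisfying eq. (9) receive that weight, and, as an assumption here, pixels outside every circle keep a background weight of 0 (the patent says only that non-interior points retain their original weights).

```python
def circle_feature_map(width, height, circles):
    """Tank target feature map: each detected circle (a, c, r, w) spreads
    its vote weight w onto all pixels with (x-a)^2 + (y-c)^2 <= r^2
    (eq. (9)); overlapping circles keep the larger weight."""
    C = [[0.0] * width for _ in range(height)]
    for a, c, r, w in circles:
        for y in range(height):
            for x in range(width):
                if (x - a) ** 2 + (y - c) ** 2 <= r ** 2:
                    C[y][x] = max(C[y][x], w)
    return C
```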
The step 5 of fusing and generating the level saliency map specifically comprises the following steps:
firstly, carrying out additive fusion on a target characteristic diagram D of an airport target and a target characteristic diagram C of an oil tank target, namely performing union operation to obtain an image subjected to additive fusion; and then fusing the potential target feature graph F and the image after the addition fusion again, wherein the fusion process adopts a mode of multiplication among corresponding pixel points, namely intersection operation is taken, and the fusion mode is expressed as follows:
S=(D+C)·F (10)
wherein, S represents a saliency map generated through fusion in each layer learning process, i.e. a hierarchical saliency map.
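Treating the three maps as per-superpixel (or per-pixel) value sequences, eq. (10) is a one-liner; the sketch below assumes D, C, and F are already aligned to the same superpixel indexing.

```python
def fuse_hierarchical_saliency(D, C, F):
    """Eq. (10), S = (D + C) * F: additive fusion of the airport map D
    and tank map C (union), then element-wise multiplication with the
    potential target feature map F (intersection)."""
    return [(d + c) * f for d, c, f in zip(D, C, F)]
```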
In step 6, defining an adaptive learning termination criterion, and determining whether to terminate the current learning process, specifically including:
(1) First, sample images of the airport target and the oil tank target are obtained from a database, and an LDA model is trained separately with the airport samples and the tank samples; a topic model of the background is trained at the same time, giving topic models of the airport target, the oil tank target, and the background. During LDA training, the features of all training samples are the color features of the CIELab color space. After training, the background topic model p(z|bg), the airport topic model p(z|fg_1), and the oil tank topic model p(z|fg_2) are obtained;
(2) For the hierarchical saliency map of each layer, a threshold of 0.6 is set: when the value of a superpixel region is greater than 0.6, the region is considered salient; otherwise it is not. The topic model p(z|s_i) is computed for the ith salient region, where s_i denotes the ith salient superpixel region; the distances between p(z|s_i) and each of p(z|bg), p(z|fg_1), and p(z|fg_2) are computed, and the category label corresponding to the topic model with the minimum distance (one of airport target, oil tank target, or background) is assigned to the salient region s_i. When the category labels of all salient regions are target labels, learning ends and the position and category label of each salient region are output; otherwise the reinforcement learning process of the next layer continues. The cosine distance between two vectors is used when computing the distance between topic models.
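The nearest-topic-model labeling can be sketched as below. The topic distributions are assumed to be plain vectors (the LDA inference and the 0.6 saliency thresholding are assumed to happen upstream), and `label_salient_region` is a hypothetical helper name.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return 1.0 - dot / (math.sqrt(sum(a * a for a in u)) *
                        math.sqrt(sum(b * b for b in v)))

def label_salient_region(region_topics, models):
    """Assign a salient region the label of the nearest topic model
    (e.g. background / airport / tank) under cosine distance."""
    return min(models, key=lambda name: cosine_distance(region_topics,
                                                        models[name]))
```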
In the step 7, a multi-layer learning framework is adopted, and a feedback mechanism is utilized between adjacent layers to feed back the level saliency map of each layer to the input image of the current layer, so as to realize background suppression layer by layer, specifically comprising:
when the learned hierarchical saliency map S of the current layer does not satisfy the learning termination condition, the hierarchical saliency map S is first stretched by a stretching function, which is as follows:
R=f(S) (11)
wherein f represents a stretching function applied to the hierarchical saliency map S, said stretching function selecting a quadratic function; and defining the stretched matrix R as an enhancement matrix, and applying the enhancement matrix to enhance the input image. The image enhancement formula for the first layer is as follows:
I_2 = I · R_1    (12)

R_1 is the enhancement matrix of the first layer of the learning process; the enhanced image I_2 of the first layer is obtained from the input image I to be detected through the inter-layer feedback. I_2 is taken as the input image of the second-layer reinforcement learning process, and the feature extraction steps are applied to I_2 to obtain the hierarchical saliency map of the second-layer learning process. By analogy, the inter-layer feedback expression is:

I_{t+1} = I_t · R_t    (13)

where I_t and I_{t+1} are the input images of the tth and (t+1)th layers respectively, and R_t is the enhancement matrix of the tth layer. Through the inter-layer feedback process of equation (13), the input image I_{t+1} of the (t+1)th layer preserves the saliency of the target region while the background region is suppressed;
In the tth layer of learning, the input image I_t of the current layer is superpixel-segmented with the SLIC algorithm into k_t superpixel regions; the superpixel segmentation numbers satisfy:

k_1 ≥ k_2 ≥ k_3 ≥ … ≥ k_t ≥ …,  t = 1, 2, 3, …    (14)

That is, a fine-to-coarse segmentation scheme is adopted: the initial fine segmentation accurately preserves the edge features of the image, and the subsequent coarse segmentation suitably reduces the computational load;
The updated image I_{t+1} is taken as the input image of the next layer, and step 1 is executed to start the learning of a new layer.
In step 8, obtaining the target area and the category label of the target in the final saliency map specifically includes:
in the learning process, if the level saliency map of the current layer meets the self-adaptive learning ending criterion, ending the learning process; the number T of layers where the current layer is located is represented as the total number of learning layers and is also the iteration number of the learning process; taking the hierarchical saliency map of the current layer as a final saliency map, namely a final saliency map SfinalExpressed as:
Sfinal=ST(15)
STa hierarchical saliency map S obtained in the T-th layer learning process is shown.
Let S be the final saliency map SfinalIn any significant area, the position of the airport target or the oil tank target is marked in the input image to be detected according to the label of the area, namely the airport target or the oil tank target, determined by the self-adaptive learning ending criterion, so that the task of detecting the remote sensing target is completed.
The invention has the following advantages and beneficial effects:
(1) The invention adopts a least-distance similarity measure operator to measure feature-vector similarity; each superpixel obtains one similarity learning coefficient, which serves as its potential target feature, so the feature difference between each superpixel and the background information set is learned.
(2) The invention adopts a hierarchical enhanced structural framework to adaptively learn the potential target characteristics. During each layer of learning and updating process of the image, the feature representation of the target area is made more prominent, and simultaneously the feature representation of the background area is suppressed, so that the prominent target is gradually highlighted.
(3) The invention adaptively determines the number of learning layers. When the target area in the hierarchical saliency map is sufficiently salient, the model can automatically control the learning process to end, so that the number of learned layers is adaptively determined, and manual intervention is reduced, so that the algorithm has good adaptability to different input images.
(4) The invention adopts a fine-to-coarse superpixel segmentation strategy. During hierarchical learning, fine segmentation is performed first and coarse segmentation later. The initial fine segmentation improves segmentation accuracy and helps obtain accurate boundary information of target objects in the image; the subsequent coarse segmentation improves the speed of the algorithm while preserving accuracy.
(5) When improving the hierarchical reinforcement learning model, a fusion of the top-down target feature maps and the bottom-up potential target feature map is proposed, so that candidate salient regions with relatively distinctive colors and textures can be extracted from large remote sensing images, and the salient regions meeting the detection task can be screened out of the candidates under the drive of the target detection task, giving the constructed saliency detection model both specificity and flexibility.
(6) The invention applies the LDA topic model as the judgment condition of learning ending, and the LDA topic model can extract the topic characteristics of any target, so the method can process the detection problem of multiple targets, and the method can determine the target category while determining the significant area.
The improved remote sensing target detection method of the hierarchical significant model can accurately detect the airport and oil tank targets in the low-resolution remote sensing images under different sizes and illumination conditions, and has better robustness.
Drawings
FIG. 1 is a detailed flowchart of a remote sensing target detection method for an improved hierarchical saliency model according to the present invention;
FIG. 2 is a diagram of airport detection effects of the remote sensing target detection algorithm under various scales and illumination conditions in the present invention; wherein a is a detection effect diagram of a small-size airport target, b is a detection effect diagram of a large-size airport target, and c is a detection effect diagram of an airport target under the condition of insufficient illumination;
fig. 3 is a schematic diagram of the detection result of the remote sensing target detection algorithm for the airport and oil tank mixed target, wherein a is the detection result of the airport target, and b is the detection result of the oil tank target.
Detailed Description
Referring to fig. 1, a remote sensing target detection method based on an improved hierarchical significant model according to the present invention is described in an embodiment by taking an airport and oil tank mixed target as an example, and includes the following specific implementation steps:
Step 1: performing superpixel segmentation on the input image to be detected with the simple linear iterative clustering (SLIC) algorithm: pixels with similar colors in neighboring regions of the image are clustered, and each cluster is represented by a superpixel, yielding a superpixel-segmented image;
For the input image I to be detected, clustering is performed with a K-means algorithm using the color features of the color image and the position of each pixel as constraints; the color features of the CIELab color space of the image are extracted, and local pixels with similar color features are represented by superpixels, completing the superpixel segmentation of the image. The total number of superpixels for the whole image is chosen as k, so the segmented image I' contains k superpixel regions.
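The SLIC step above (K-means over CIELab color plus pixel position) can be sketched as follows. This is a toy numpy illustration, not the patent's implementation; the function name, random seeding, and the compactness weight m are assumptions, and a production pipeline would typically call skimage.segmentation.slic instead:

```python
import numpy as np

def simple_superpixels(lab_image, k=100, n_iter=5, m=10.0):
    """Toy SLIC-style segmentation: K-means over (L, a, b, x, y) features.

    lab_image: H x W x 3 array in CIELab space; m weights spatial distance
    against color distance, as in SLIC's combined distance measure.
    """
    h, w, _ = lab_image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    step = np.sqrt(h * w / k)  # expected superpixel spacing
    feats = np.concatenate(
        [lab_image.reshape(-1, 3).astype(float),
         (m / step) * np.stack([xs.ravel(), ys.ravel()], axis=1)], axis=1)
    rng = np.random.default_rng(0)
    # seed centers from randomly chosen pixels (true SLIC seeds a regular grid)
    centers = feats[rng.choice(h * w, size=k, replace=False)]
    for _ in range(n_iter):
        # assign every pixel to its nearest center in the joint feature space
        labels = np.argmin(((feats[:, None, :] - centers[None]) ** 2).sum(-1),
                           axis=1)
        for c in range(k):
            members = feats[labels == c]
            if len(members):
                centers[c] = members.mean(axis=0)
    return labels.reshape(h, w)
```

Because the spatial coordinates are scaled by m/step, larger m yields more compact, grid-like superpixels, mirroring SLIC's distance trade-off.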
Step 2: taking the color feature of each superpixel in the segmented image as a bottom layer feature, constructing a global information set, and simultaneously extracting the bottom layer features of all superpixels at the boundary position of the segmented image to construct a background information set;
(1) Given that the superpixel-segmented image contains k superpixels in total, the color feature of each superpixel is selected to construct its bottom-layer feature: the image is converted to the CIELab color space, the values of all pixels in each superpixel are obtained in the three color channels, and the mean value over all pixels of the superpixel in each channel is taken as its bottom-layer feature, i.e., the bottom-layer feature p_i of the i-th superpixel is expressed as:
p_i = (ll_i, la_i, lb_i)   (1)
where ll_i, la_i, lb_i denote the mean values of all pixels in the i-th superpixel over the three CIELab color channels. The feature set of all superpixels is then P = {p_1, p_2, …, p_k}, defined as the global information set, where p_1, p_2, …, p_k denote the bottom-layer features of the 1st, 2nd, …, k-th superpixel regions of the segmented image; 1 ≤ i ≤ k, i denotes any superpixel in the global information set, and k denotes the total number of superpixels;
(2) For the global information set P = {p_1, p_2, …, p_k}, suppose n superpixels lie on the image boundary; these n boundary superpixels are extracted to form the background information set, where n denotes the total number of background superpixels. The background information set B is expressed as:
B = {b_1, b_2, …, b_n}, 0 < n < k   (2)
b_j = (ll_j, la_j, lb_j), 1 ≤ j ≤ n   (3)
where j denotes any superpixel in the background information set, b_j denotes the bottom-layer feature of a superpixel region in the background information set, and ll_j, la_j, lb_j denote the mean values of all pixels in the j-th superpixel over the three CIELab color channels.
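Equations (1)-(3) above amount to averaging Lab values per superpixel and collecting the superpixels that touch the image border. A minimal numpy sketch; the function name and return convention are illustrative:

```python
import numpy as np

def global_and_background_sets(lab_image, labels):
    """Eq. (1): mean CIELab value per superpixel -> global set P.
    Eqs. (2)-(3): superpixels touching the image border -> background set B."""
    k = labels.max() + 1
    flat = lab_image.reshape(-1, 3)
    P = np.stack([flat[labels.ravel() == i].mean(axis=0) for i in range(k)])
    # indices of superpixels that own at least one border pixel
    border = np.unique(np.concatenate(
        [labels[0], labels[-1], labels[:, 0], labels[:, -1]]))
    return P, P[border], border
```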
Step 3: learning the feature similarity between each superpixel and the background information set with the minimum-distance similarity measure operator LDSM. The feature similarity is characterized by the LDSM learning coefficient, whose value is positively correlated with the feature similarity between each superpixel and the background information set; a potential target feature map is constructed from the learning coefficients, which reflects the feature difference between each superpixel region and the background information set and contains all candidate salient target regions;
A superpixel data pair (p_i, b_j) is defined, where p_i denotes the bottom-layer feature of a superpixel region in the global information set and b_j denotes the bottom-layer feature of a superpixel region in the background information set, i.e., p_i ∈ P, b_j ∈ B. The LDSM operator is applied to solve the similarity coefficient α_ij between the bottom-layer feature p_i of each superpixel region in the global information set and the bottom-layer feature b_j of each superpixel region in the background information set, as shown in the following formula:
[Equation (4), the LDSM optimization defining the similarity coefficient α_ij between p_i and b_j, appears only as a formula image in the original.]
By solving equation (4), the similarity-measure learning coefficient α_ij is obtained. When p_i and b_j are close, α_ij is approximately 1; when p_i and b_j are exactly equal, α_ij equals 1; when p_i and b_j differ greatly, α_ij is far from 1. Since the similarity of p_i and b_j is expressed by the degree to which α_ij deviates from 1, the similarity learning coefficient α_ij is normalized to β_ij according to equation (5):
β_ij = Norm(|α_ij − 1|)   (5)
where Norm(·) denotes normalizing |α_ij − 1| to the interval [0,1]. The closer the normalized learning coefficient β_ij is to 0, the closer the bottom-layer feature p_i of the global superpixel region is to the bottom-layer feature b_j of the background superpixel region.
By solving the above optimization problem for every pair, a set of normalized learning coefficients is obtained, arranged as the k × n matrix:
[β_11 β_12 … β_1n; β_21 β_22 … β_2n; …; β_k1 β_k2 … β_kn]   (6)
where i indexes any row of the matrix in equation (6); the i-th row contains the learning coefficients obtained by learning the i-th superpixel of the feature set against all background superpixels. Taking the mean of the elements of each row defines the potential feature β_i, computed as:
β_i = (1/n) · Σ_{j=1}^{n} β_ij   (7)
where n is the total number of background superpixels and β_ij denotes any element of the matrix in equation (6). From equation (7), the potential target feature map F = (β_1, β_2, …, β_k) is obtained, where k denotes the total number of superpixels.
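Given the matrix of LDSM coefficients α_ij from equation (4), the normalization and row-averaging of equations (5)-(7) can be sketched as below. The min-max form of Norm(·) is an assumption, since the patent renders the normalization formula only as an image:

```python
import numpy as np

def potential_feature_map(alpha):
    """Eqs. (5)-(7): normalize |alpha - 1| to [0, 1], then average each row.

    alpha: k x n matrix of LDSM coefficients, alpha[i, j] ~ 1 when global
    superpixel i resembles background superpixel j (eq. (4), not shown).
    """
    dev = np.abs(alpha - 1.0)          # deviation from perfect similarity
    span = dev.max() - dev.min()
    beta = (dev - dev.min()) / span if span > 0 else np.zeros_like(dev)
    return beta.mean(axis=1)           # eq. (7): potential feature per superpixel
```

Superpixels whose features match the background set get values near 0, while dissimilar (candidate target) superpixels get values near 1, which is exactly what the potential target feature map F encodes.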
Step 4: extracting top-down prior features of the airport and oil tank targets from the input image to be detected, and generating a target feature map for each. For the airport target, line detection is performed on the input image with the LSD (Line Segment Detector) operator to obtain a line detection result; then, using the superpixel-segmented image from step 1, the number of detected line pixels within each superpixel region is counted, and a line density map is generated as the airport target feature map. For the oil tank target, circle detection is performed on the input image with the Hough transform to obtain a circle detection result; a voting mechanism assigns different weights to points inside and outside the detected circles to generate a circle feature map, which serves as the oil tank target feature map;
(1) Extracting the airport target feature map
For the input image I to be detected, the straight-line information of the whole image is first acquired with the LSD line detection operator; then the line density feature corresponding to the superpixel segmentation result of the current layer is calculated, where the numerator is the number of pixels lying on a straight line within each superpixel region and the denominator is the area of the corresponding region. For any superpixel region i, the corresponding line density feature d_i is expressed by equation (8) as:
d_i = Num_L(region(i)) / Num(region(i))   (8)
where Num_L denotes the number of pixels on a straight line, Num is the total number of pixels, and region(i) denotes the i-th superpixel region. The target feature map based on the airport target, i.e., the line density map D = (d_1, d_2, …, d_k), is thus obtained, where d_1, d_2, …, d_k denote the line density features of the 1st, 2nd, …, k-th superpixel regions and k denotes the total number of superpixels;
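Equation (8) can be sketched as a per-region ratio over a binary line mask; the mask itself would come from a line detector such as OpenCV's LSD and is assumed given here:

```python
import numpy as np

def line_density_map(line_mask, labels):
    """Eq. (8): per-superpixel ratio of on-line pixels to region area."""
    k = labels.max() + 1
    d = np.zeros(k)
    for i in range(k):
        region = labels == i
        d[i] = line_mask[region].sum() / region.sum()  # Num_L / Num
    return d
```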
(2) Extracting the oil tank target feature map
The circle feature is a typical regional feature of the oil tank target. Circle detection is performed on the input image to be detected with the Hough transform, and the circle detection result is converted into a feature map, which serves as the oil tank target feature map;
When performing circle detection with the Hough transform, votes are cast for the triples (a, c, r) formed by the circle center position (a, c) and the circle radius r, where a is the center abscissa and c the center ordinate. Local peaks in the voting result give the center coordinates and radii of the existing circles, and the voting result serves as the weight of the circle feature; all points satisfying equation (9) share the same weight as the corresponding (a, c, r), i.e.,
(x − a)² + (y − c)² ≤ r²   (9)
where (x, y) is the coordinate of any position in the target feature map; points not inside any circle retain their original weights, yielding the target feature map C of the oil tank target.
Step 5: fusing the potential target feature map F from step 3 with the airport target feature map D and the oil tank target feature map C from step 4 to generate the hierarchical saliency map; the salient regions of the hierarchical saliency map comprise the candidate salient target regions of the potential target feature map obtained at each layer during learning, the airport target regions, and the oil tank target regions.
Firstly, carrying out additive fusion on a target characteristic diagram D of an airport target and a target characteristic diagram C of an oil tank target, namely performing union operation to obtain an image subjected to additive fusion; and then fusing the potential target feature graph F and the image after the addition fusion again, wherein the fusion process adopts a mode of multiplication among corresponding pixel points, namely intersection operation is taken, and the fusion mode is expressed as follows:
S = (D + C) · F   (10)
wherein, S represents a saliency map generated through fusion in each layer learning process, i.e. a hierarchical saliency map.
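The fusion of equation (10) is elementwise; a minimal sketch (the final rescaling to [0, 1] is an added assumption, not stated in the text):

```python
import numpy as np

def fuse_maps(D, C, F):
    """Eq. (10): union of the two target maps (addition), intersected with
    the potential target map (elementwise multiplication): S = (D + C) * F."""
    S = (D + C) * F
    return S / S.max() if S.max() > 0 else S  # rescale to [0, 1] (assumed)
```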
Step 6: defining an adaptive learning ending criterion using the LDA topic model, and using the criterion to judge the feature similarity between the salient regions of the hierarchical saliency map from step 5 and the airport and oil tank targets, thereby deciding whether the current learning process should end. If the learning process has not ended, execute step 7; if it has ended, go to step 8;
(1) First, sample images of the airport target and the oil tank target are obtained from a database. LDA models are trained separately with the airport samples and the oil tank samples, and a topic model of the background is trained at the same time, yielding topic models of the airport target, the oil tank target, and the background. During LDA training, the features of all training samples are the color features of the CIELab color space. After training, the background topic model p(z|bg), the airport topic model p(z|fg_1), and the oil tank topic model p(z|fg_2) are obtained;
(2) For the hierarchical saliency map of each layer, a threshold of 0.6 is set: when the value of a superpixel region exceeds 0.6, the region is considered salient; otherwise it is considered non-salient. For the i-th salient region, the topic model p(z|s_i) is computed, where s_i denotes the i-th salient superpixel region. The distances between p(z|s_i) and each of p(z|bg), p(z|fg_1), and p(z|fg_2) are calculated, and the category label (airport target, oil tank target, or background) corresponding to the topic model with the minimum distance is assigned to the salient region s_i. When the category labels of all salient regions are target labels, learning ends and the position and category label of each salient region are output; otherwise, the reinforcement learning of the next layer continues. The cosine distance between two vectors is used when computing the distance between topic models.
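The stopping test above reduces to nearest-topic-model classification under cosine distance. A sketch with illustrative names; the label set, ordering, and vector shapes are assumptions:

```python
import numpy as np

def classify_regions(region_topics, model_topics,
                     names=("background", "airport", "tank")):
    """Label each salient region with the nearest topic model (cosine
    distance); learning stops when no region is labelled background."""
    def cos_dist(u, v):
        return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    assigned = [names[int(np.argmin([cos_dist(t, m) for m in model_topics]))]
                for t in region_topics]
    return assigned, all(lbl != "background" for lbl in assigned)
```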
Step 7: a multi-layer learning framework is adopted, and a feedback mechanism between adjacent layers feeds the hierarchical saliency map obtained at each layer back to the input image of the current layer, achieving layer-by-layer background suppression. That is, the hierarchical saliency map obtained by fusion in step 5 acts as an enhancement factor on the input image of the current layer, enhancing the superpixel-segmented image of the current layer; the new enhanced image serves as the input image for the next layer of learning, step 1 is executed, and a new layer of learning begins. By repeating this, the target region is gradually highlighted through multi-layer learning;
When the learned hierarchical saliency map S of the current layer does not satisfy the learning termination condition, S is first stretched by a stretching function:
R = f(S)   (11)
where f denotes the stretching function applied to the hierarchical saliency map S; a quadratic function is selected as the stretching function. The stretched matrix R is defined as the enhancement matrix and is applied to enhance the input image. The image enhancement formula for the first layer is:
I_2 = I · R_1   (12)
where R_1 is the enhancement matrix of the first layer in the learning process. From the input image I to be detected, the first-layer enhanced image I_2 is obtained through inter-layer feedback; I_2 serves as the input image of the second-layer reinforcement learning process, the feature extraction steps are applied to I_2 to obtain the hierarchical saliency map of the second layer, and so on. The inter-layer feedback expression is:
I_{t+1} = I_t · R_t   (13)
where I_t and I_{t+1} denote the input images of the t-th and (t+1)-th layers respectively, and R_t is the enhancement matrix of the t-th layer. Through the inter-layer feedback of equation (13), the input image I_{t+1} of the (t+1)-th layer preserves the saliency of the target region while the background region is suppressed;
In the t-th layer learning process, the input image I_t of the current layer is segmented into k_t superpixel regions with the SLIC algorithm; the adopted superpixel segmentation numbers satisfy:
k_1 ≥ k_2 ≥ k_3 ≥ … ≥ k_t ≥ …, t = 1, 2, 3, …   (14)
i.e., a fine-to-coarse segmentation scheme is adopted: the initial fine segmentation accurately preserves the edge features of the image, and the subsequent coarse segmentation suitably reduces the amount of computation;
The updated image I_{t+1} serves as the input image of the next layer, a new layer of learning begins, and step 1 is executed.
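The feedback loop of equations (11)-(13) can be sketched as one update step; f(S) = S² is one possible choice, since the patent states only that f is quadratic:

```python
import numpy as np

def feedback_step(image, S):
    """Eqs. (11)-(13): stretch the saliency map and multiply it into the
    current input image, channel by channel."""
    R = S ** 2                    # quadratic stretch f(S) = S^2 (assumed form)
    return image * R[..., None]   # I_{t+1} = I_t . R_t
```

Squaring pushes low-saliency (background) values toward zero faster than high-saliency values, which is what realizes the layer-by-layer background suppression.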
Step 8: after learning stops, the hierarchical saliency map obtained in step 5 at the current layer is taken as the final saliency map, the salient regions in the final saliency map are extracted as target regions, and the target region and the category label of each target (airport target or oil tank target) are marked in the image to be detected, yielding the final remote sensing target detection result.
In the learning process, if the hierarchical saliency map of the current layer meets the adaptive learning ending criterion, the learning process ends; the layer number T of the current layer is the total number of learning layers and also the number of iterations of the learning process. The hierarchical saliency map of the current layer is taken as the final saliency map, i.e., the final saliency map S_final is expressed as:
S_final = S_T   (15)
where S_T denotes the hierarchical saliency map S obtained in the T-th layer learning process.
Let s be any salient region in the final saliency map S_final; according to the region's label (airport target or oil tank target) determined by the adaptive learning ending criterion, the position of the airport target or oil tank target is marked in the input image to be detected, completing the remote sensing target detection task.
FIG. 2 shows the airport detection results of the remote sensing target detection algorithm under various scales and illumination conditions. (a) is the detection result for a small-size airport target; it shows that the proposed method detects the whole airport area well, suppresses interference from the surrounding background, and extracts the salient airport region completely. (b) is the detection result for a large-size airport target; even at large scale the airport target is detected completely. (c) is the detection result for an airport target under insufficient illumination; even then, the target detection model still accurately extracts the position of the airport target. As FIG. 2 shows, the target detection model adapts well to the ground environment, target scale, and illumination changes of remote sensing images, i.e., the algorithm has a wide application range and good robustness.
FIG. 3 shows the detection results of the remote sensing target detection algorithm on the mixed airport and oil tank targets. (a) is the detection result of the airport target: the target region is detected accurately, the edges of the remote sensing target are well preserved, and the whole target region is highlighted. (b) is the detection result of the oil tank target; the background clutter increases slightly because one image usually contains several oil tank targets, so the background is suppressed less completely than in the single-target case. Despite the slightly cluttered background, the most salient regions in the map still accurately hit the targets.

Claims (1)

1. A remote sensing target detection method based on an improved hierarchical significant model, characterized in that the remote sensing targets are of two types, airport targets and oil tank targets, and the method comprises the following steps:
Step 1: performing superpixel segmentation on the input image to be detected with the simple linear iterative clustering (SLIC) algorithm: pixels with similar colors in neighboring regions of the image are clustered, and each cluster is represented by a superpixel, yielding a superpixel-segmented image;
in the step 1, the specific steps of performing super-pixel segmentation by using the SLIC algorithm are as follows:
For the input image I to be detected, clustering is performed with a K-means algorithm using the color features of the color image and the position of each pixel as constraints; the color features of the CIELab color space of the image are extracted, and local pixels with similar color features are represented by superpixels, completing the superpixel segmentation of the image. The total number of superpixels for the whole image is chosen as k, so the segmented image I' contains k superpixel regions;
step 2: taking the color feature of each superpixel in the segmented image as a bottom layer feature, constructing a global information set, and simultaneously extracting the bottom layer features of all superpixels positioned at the boundary position of the segmented image to construct a background information set;
in step 2, the specific steps of constructing the global information set and the background information set are as follows:
(1) Given that the superpixel-segmented image contains k superpixels in total, the color feature of each superpixel is selected to construct its bottom-layer feature: the image is converted to the CIELab color space, the values of all pixels in each superpixel are obtained in the three color channels, and the mean value over all pixels of the superpixel in each channel is taken as its bottom-layer feature, i.e., the bottom-layer feature p_i of the i-th superpixel is expressed as:
p_i = (ll_i, la_i, lb_i)   (1)
where ll_i, la_i, lb_i denote the mean values of all pixels in the i-th superpixel over the three CIELab color channels. The feature set of all superpixels is then P = {p_1, p_2, …, p_k}, defined as the global information set, where p_1, p_2, …, p_k denote the bottom-layer features of the 1st, 2nd, …, k-th superpixel regions of the segmented image; 1 ≤ i ≤ k, i denotes any superpixel in the global information set, and k denotes the total number of superpixels;
(2) For the global information set P = {p_1, p_2, …, p_k}, suppose n superpixels lie on the image boundary; these n boundary superpixels are extracted to form the background information set, where n denotes the total number of background superpixels. The background information set B is expressed as:
B = {b_1, b_2, …, b_n}, 0 < n < k   (2)
b_j = (ll_j, la_j, lb_j), 1 ≤ j ≤ n   (3)
where j denotes any superpixel in the background information set, b_j denotes the bottom-layer feature of a superpixel region in the background information set, and ll_j, la_j, lb_j denote the mean values of all pixels in the j-th superpixel over the three CIELab color channels;
Step 3: learning the feature similarity between each superpixel and the background information set with the minimum-distance similarity measure operator LDSM. The feature similarity is characterized by the LDSM learning coefficient, whose value is positively correlated with the feature similarity between each superpixel and the background information set; a potential target feature map is constructed from the learning coefficients, which reflects the feature difference between each superpixel region and the background information set and contains all candidate salient target regions;
in the step 3, a potential target feature map is constructed by using the learning coefficient, which specifically includes the following steps:
A superpixel data pair (p_i, b_j) is defined, where p_i denotes the bottom-layer feature of a superpixel region in the global information set and b_j denotes the bottom-layer feature of a superpixel region in the background information set, i.e., p_i ∈ P, b_j ∈ B. The LDSM operator is applied to solve the similarity coefficient α_ij between the bottom-layer feature p_i of each superpixel region in the global information set and the bottom-layer feature b_j of each superpixel region in the background information set, as shown in the following formula:
[Equation (4), the LDSM optimization defining the similarity coefficient α_ij between p_i and b_j, appears only as a formula image in the original.]
By solving equation (4), the similarity-measure learning coefficient α_ij is obtained. When p_i and b_j are close, α_ij is approximately 1; when p_i and b_j are exactly equal, α_ij equals 1; when p_i and b_j differ greatly, α_ij is far from 1. Since the similarity of p_i and b_j is expressed by the degree to which α_ij deviates from 1, the similarity learning coefficient α_ij is normalized to β_ij according to equation (5):
β_ij = Norm(|α_ij − 1|)   (5)
where Norm(·) denotes normalizing |α_ij − 1| to the interval [0,1]. The closer the normalized learning coefficient β_ij is to 0, the closer the bottom-layer feature p_i of the global superpixel region is to the bottom-layer feature b_j of the background superpixel region.
Through this solution, a set of normalized learning coefficients β_ij is obtained, arranged as the k × n matrix:
[β_11 β_12 … β_1n; β_21 β_22 … β_2n; …; β_k1 β_k2 … β_kn]   (6)
where i indexes any row of the matrix in equation (6); the i-th row contains the learning coefficients obtained by learning the i-th superpixel of the feature set against all background superpixels. Taking the mean of the elements of each row defines the potential feature β_i, computed as:
β_i = (1/n) · Σ_{j=1}^{n} β_ij   (7)
where n is the total number of background superpixels and β_ij denotes any element of the matrix in equation (6). From equation (7), the potential target feature map F = (β_1, β_2, …, β_k) is obtained, where k denotes the total number of superpixels;
Step 4: extracting top-down prior features of the airport target and the oil tank target from the input image to be detected, and generating a target feature map for each. For the airport target, line detection is performed on the input image with the line detection operator LSD to obtain a line detection result; then, using the superpixel-segmented image from step 1, the number of detected line pixels within each superpixel region is counted, and a line density map is generated as the airport target feature map. For the oil tank target, circle detection is performed on the input image with the Hough transform to obtain a circle detection result; a voting mechanism assigns different weights to points inside and outside the detected circles to generate a circle feature map, which serves as the oil tank target feature map;
in the step 4, the target feature maps of the airport and oil tank targets are extracted, and the following steps are realized:
(1) Extracting the airport target feature map
For the input image I to be detected, the straight-line information of the whole image is first acquired with the LSD line detection operator; then the line density feature corresponding to the superpixel segmentation result of the current layer is calculated, where the numerator is the number of pixels lying on a straight line within each superpixel region and the denominator is the area of the corresponding region. For any superpixel region i, the corresponding line density feature d_i is expressed by equation (8) as:
d_i = Num_L(region(i)) / Num(region(i))   (8)
where Num_L denotes the number of pixels on a straight line, Num is the total number of pixels, and region(i) denotes the i-th superpixel region; the target feature map based on the airport target, i.e., the line density map D = (d_1, d_2, …, d_k), is thus obtained, where d_1, d_2, …, d_k denote the line density features of the 1st, 2nd, …, k-th superpixel regions and k denotes the total number of superpixels;
(2) Extracting the oil tank target feature map
The circle feature is a typical regional feature of the oil tank target. Circle detection is performed on the input image to be detected with the Hough transform, and the circle detection result is converted into a feature map, which serves as the oil tank target feature map;
When performing circle detection with the Hough transform, votes are cast for the triples (a, c, r) formed by the circle center position (a, c) and the circle radius r, where a is the center abscissa and c the center ordinate. Local peaks in the voting result give the center coordinates and radii of the existing circles, and the voting result serves as the weight of the circle feature; all points (x, y) satisfying equation (9) lie in the circle with center (a, c) and radius r, i.e., all points satisfying equation (9) share the same weight as the corresponding (a, c, r):
(x − a)² + (y − c)² ≤ r²   (9)
where (x, y) is the coordinate of any position in the target feature map; points not inside any circle retain their original weights, yielding the target feature map C of the oil tank target;
Step 5: fuse the potential target feature map of step 3 with the airport target feature map and the oil tank target feature map of step 4, three images in all, to generate a hierarchical saliency map; the salient regions in the hierarchical saliency map comprise the candidate salient target regions of the potential target feature map obtained at each layer of the learning process, together with the airport target regions and oil tank target regions;
in the step 5, the fusion generation of the hierarchical saliency map is specifically as follows:
First, the target feature map D of the airport target and the target feature map C of the oil tank target are additively fused, i.e., a union operation is performed, to obtain the additively fused image; then the potential target feature map F is fused again with the additively fused image, this time by multiplication between corresponding pixels, i.e., an intersection operation. The fusion is expressed as:
S = (D + C) · F   (10)
where S represents the saliency map generated by fusion in each layer's learning process, namely the hierarchical saliency map;
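Formula (10) is a one-line element-wise operation; a minimal sketch follows. The final rescaling to [0, 1] is an assumption added here for use with the later 0.6 threshold, since the patent does not state a normalization.

```python
import numpy as np

def fuse_hierarchical_saliency(D, C, F):
    """Formula (10): S = (D + C) * F.

    Additive fusion (union) of the airport map D and oil-tank map C,
    then pixel-wise multiplication (intersection) with the potential
    target feature map F.  Rescaling to [0, 1] is an assumption.
    """
    S = (D + C) * F
    smax = S.max()
    return S / smax if smax > 0 else S

D = np.array([[1.0, 0.0], [0.0, 0.0]])   # toy airport feature map
C = np.array([[0.0, 0.0], [0.0, 1.0]])   # toy oil-tank feature map
F = np.array([[0.5, 0.5], [0.5, 0.5]])   # toy potential target map
S = fuse_hierarchical_saliency(D, C, F)
```

Only pixels supported by both the target-specific maps and the potential target map survive: here the two corners flagged by D and C.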
Step 6: define an adaptive learning termination criterion using the latent topic semantic model LDA; with this criterion, judge the feature similarity between the salient regions in the hierarchical saliency map of step 5 and the airport and oil tank targets, and decide whether the current learning process should terminate. If the learning process has not terminated, execute step 7; if it has terminated, go to step 8;
in step 6, defining an adaptive learning termination criterion, and determining whether to terminate the current learning process, specifically including:
(1) First, obtain sample images of the airport target and the oil tank target from a database, and train an LDA model on the airport samples and on the oil tank samples respectively; a topic model of the background is trained at the same time, yielding topic models for the airport target, the oil tank target, and the background. The features selected for all training samples during LDA training are the color features of the CIELab color space. After LDA training, a background topic model p(z|bg), an airport topic model p(z|fg_1), and an oil tank topic model p(z|fg_2) are obtained, where z is the model variable, bg is the background region, fg_1 is the airport region, and fg_2 is the oil tank region;
(2) Set a threshold of 0.6 for the hierarchical saliency map of each layer: a superpixel region whose value exceeds 0.6 is considered salient, otherwise it is considered not salient. Compute the topic model p(z|s_i) for the i-th salient region, where s_i represents the i-th superpixel salient region; calculate the distances between p(z|s_i) and each of p(z|bg), p(z|fg_1), and p(z|fg_2), and assign to the salient region s_i the type label of the topic model with the minimum distance, the type label being one of airport target, oil tank target, or background. When the type labels of all salient regions are target labels, learning ends and the position and type label of each salient region are output; otherwise the reinforcement learning process continues at the next layer. The cosine distance between the two vectors is used when computing distances between topic models;
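The nearest-topic labeling and the termination test can be sketched as follows. The topic vectors are toy placeholders, not trained LDA distributions, and the label names `bg`/`fg1`/`fg2` mirror the notation above; only the cosine-distance nearest-model rule comes from the patent.

```python
import numpy as np

def cosine_distance(p, q):
    """1 - cosine similarity between two topic distributions."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return 1.0 - p @ q / (np.linalg.norm(p) * np.linalg.norm(q))

def label_salient_regions(region_topics, topic_models):
    """Assign each salient region the label of the nearest topic model.

    region_topics : list of topic vectors p(z|s_i), one per salient region.
    topic_models  : dict label -> topic vector, i.e. the trained
                    p(z|bg), p(z|fg1), p(z|fg2).
    Returns (labels, done): done is True when no region is labeled 'bg',
    i.e. the adaptive termination criterion is met.
    """
    labels = []
    for p_s in region_topics:
        best = min(topic_models,
                   key=lambda lbl: cosine_distance(p_s, topic_models[lbl]))
        labels.append(best)
    return labels, all(lbl != 'bg' for lbl in labels)

models = {'bg':  [0.8, 0.1, 0.1],
          'fg1': [0.1, 0.8, 0.1],   # airport topic model (toy values)
          'fg2': [0.1, 0.1, 0.8]}   # oil-tank topic model (toy values)
labels, done = label_salient_regions(
    [[0.1, 0.7, 0.2], [0.2, 0.1, 0.7]], models)
```

Both toy regions sit closest to a target topic model, so `done` is True and learning would terminate at this layer.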
Step 7: adopt a multilayer learning framework and, using a feedback mechanism between adjacent layers, feed the hierarchical saliency map obtained at each layer back to the input image of the current layer to achieve layer-by-layer background suppression; that is, the hierarchical saliency map obtained by fusion in step 5 acts as an enhancement factor on the current layer's input image, enhancing the superpixel-segmented image of the current layer. The new image obtained after enhancement serves as the input image of the next layer of learning; step 1 is then executed to start a new layer's learning process. Repeating these steps, the target region is gradually highlighted through multilayer learning;
In step 7, adopting the multilayer learning framework and using the feedback mechanism between adjacent layers to feed each layer's hierarchical saliency map back to the current layer's input image, thereby achieving layer-by-layer background suppression, specifically includes:
When the hierarchical saliency map S learned at the current layer does not satisfy the learning termination condition, S is first stretched by a stretching function, as follows:
R = f(S)   (11)
where f represents the stretching function applied to the hierarchical saliency map S; a quadratic function is selected as the stretching function. The stretched matrix R is defined as the enhancement matrix and is applied to enhance the input image; the image enhancement formula for the first layer is:
I_2 = I · R_1   (12)
where R_1 is the enhancement matrix of the first layer of the learning process; through inter-layer feedback, the input image I to be detected yields the enhanced image I_2 of the first layer. I_2 is taken as the input image of the second-layer reinforcement learning process, steps 3 and 4 are used to extract the features of I_2, and image fusion in step 5 yields the hierarchical saliency map of the second-layer learning process, and so on. The inter-layer feedback is expressed as:
I_{t+1} = I_t · R_t   (13)
where I_t and I_{t+1} represent the input images of the t-th and (t+1)-th layers respectively, and R_t is the enhancement matrix of the t-th layer. Through the inter-layer feedback process of formula (13), the input image I_{t+1} of the (t+1)-th layer preserves the saliency of the target region while the background region is suppressed;
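One feedback step, formulas (11)-(13), can be sketched as below. The patent only states that f is quadratic; the concrete choice f(s) = s² is an assumption made here for illustration.

```python
import numpy as np

def feedback_step(I_t, S_t):
    """One inter-layer feedback step.

    The hierarchical saliency map S_t (values in [0, 1]) is stretched
    with a quadratic function -- f(s) = s**2 is an assumed choice --
    giving the enhancement matrix R_t (formula (11)); the next layer's
    input is then I_{t+1} = I_t * R_t (formula (13)).
    """
    R_t = S_t ** 2        # formula (11): R = f(S), f quadratic
    return I_t * R_t      # formula (13): element-wise enhancement

I_t = np.array([[10.0, 10.0], [10.0, 10.0]])   # toy input layer
S_t = np.array([[1.0, 0.5], [0.2, 0.0]])       # toy saliency map
I_next = feedback_step(I_t, S_t)
```

Fully salient pixels keep their value, while weakly salient pixels are suppressed quadratically (0.5 → factor 0.25, 0.2 → factor 0.04), which matches the layer-by-layer background suppression described above.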
In the t-th learning process, the input image I_t of the current layer is superpixel-segmented with the SLIC algorithm into k_t superpixel regions; the numbers of superpixels adopted satisfy:
k_1 ≥ k_2 ≥ k_3 ≥ … ≥ k_t ≥ …,  t = 1, 2, 3, …   (14)
that is, a fine-to-coarse segmentation scheme is adopted: the initial fine segmentation accurately preserves the edge features of the image, while the subsequent coarse segmentation appropriately reduces the amount of computation;
The updated image I_{t+1} serves as the next layer's input image to start a new layer of learning, and step 1 is executed;
Step 8: after learning stops, take the hierarchical saliency map of step 5 at the current layer as the final saliency map, extract the salient regions in the final saliency map as target regions, and mark each target region and its category label, i.e., airport target or oil tank target, in the image to be detected, thereby obtaining the final remote sensing target detection result;
in step 8, obtaining the target area and the category label of the target in the final saliency map specifically includes:
During learning, the learning process ends once the current layer's hierarchical saliency map satisfies the adaptive learning termination criterion; the layer number T of the current layer is the total number of learning layers and also the number of iterations of the learning process. The hierarchical saliency map of the current layer is taken as the final saliency map S_final, expressed as:
S_final = S_T   (15)
where S_T represents the hierarchical saliency map obtained in the layer-T learning process;
For any salient region s in the final saliency map S_final, the position of the airport target or oil tank target is marked in the input image to be detected according to the region's label, i.e., airport target or oil tank target, as determined by the adaptive learning termination criterion, thereby completing the remote sensing target detection task.
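Step 8 can be sketched as below. The 0.6 threshold follows the criterion above; reporting each region as an axis-aligned bounding box is an assumption, since the patent does not specify how the marked position is represented.

```python
import numpy as np

def extract_detections(S_final, sp_labels, region_classes, thr=0.6):
    """Report each salient superpixel region as a labeled detection.

    S_final        : final saliency map (per-pixel values in [0, 1]).
    sp_labels      : superpixel index per pixel.
    region_classes : dict superpixel index -> class label ('fg1'/'fg2')
                     from the adaptive termination criterion.
    Returns a list of (class_label, (r0, c0, r1, c1)) bounding boxes.
    """
    detections = []
    for i, cls in region_classes.items():
        mask = (sp_labels == i) & (S_final > thr)  # salient pixels of region i
        if not mask.any():
            continue
        rows, cols = np.nonzero(mask)
        detections.append((cls, (rows.min(), cols.min(),
                                 rows.max(), cols.max())))
    return detections

sp_labels = np.array([[0, 0, 1, 1]] * 2)
S_final = np.array([[0.9, 0.9, 0.1, 0.1]] * 2)
dets = extract_detections(S_final, sp_labels, {0: 'fg1', 1: 'fg2'})
```

Only region 0 exceeds the saliency threshold, so a single airport-class box covering its pixels is returned.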
CN201710115840.6A 2017-03-01 2017-03-01 Remote sensing target detection method based on improved hierarchical significant model Active CN106909902B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710115840.6A CN106909902B (en) 2017-03-01 2017-03-01 Remote sensing target detection method based on improved hierarchical significant model

Publications (2)

Publication Number Publication Date
CN106909902A CN106909902A (en) 2017-06-30
CN106909902B true CN106909902B (en) 2020-06-05

Family

ID=59209087

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710115840.6A Active CN106909902B (en) 2017-03-01 2017-03-01 Remote sensing target detection method based on improved hierarchical significant model

Country Status (1)

Country Link
CN (1) CN106909902B (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107527023B (en) * 2017-08-07 2021-05-25 西安理工大学 Polarized SAR image classification method based on superpixels and topic models
CN107688782A (en) * 2017-08-23 2018-02-13 中国科学院软件研究所 Oil tank detection and reserve analysis method based on high-resolution optical remote sensing image
CN107633491A (en) * 2017-09-26 2018-01-26 中国科学院长春光学精密机械与物理研究所 A kind of area image Enhancement Method and storage medium based on target detection
CN108629286B (en) * 2018-04-03 2021-09-28 北京航空航天大学 Remote sensing airport target detection method based on subjective perception significance model
CN108596055B (en) * 2018-04-10 2022-02-11 西北工业大学 Airport target detection method of high-resolution remote sensing image under complex background
CN109033998B (en) * 2018-07-04 2022-04-12 北京航空航天大学 Remote sensing image ground object labeling method based on attention mechanism convolutional neural network
CN109543561B (en) * 2018-10-31 2020-09-18 北京航空航天大学 Method and device for detecting salient region of aerial video
CN109726649B (en) * 2018-12-15 2021-08-24 中国科学院深圳先进技术研究院 Remote sensing image cloud detection method and system and electronic equipment
CN110047076B (en) * 2019-03-29 2021-03-23 腾讯科技(深圳)有限公司 Image information processing method and device and storage medium
CN110097569B (en) * 2019-04-04 2020-12-22 北京航空航天大学 Oil tank target detection method based on color Markov chain significance model
CN110458192B (en) * 2019-07-05 2022-06-14 中国地质大学(武汉) Hyperspectral remote sensing image classification method and system based on visual saliency
CN111209918B (en) * 2020-01-06 2022-04-05 河北工业大学 Image saliency target detection method
CN112862765B (en) * 2021-01-26 2022-08-02 重庆师范大学 Soil color image shadow detection method based on semi-supervised dispersion
CN116630820B (en) * 2023-05-11 2024-02-06 北京卫星信息工程研究所 Optical remote sensing data on-satellite parallel processing method and device
CN117808703B (en) * 2024-02-29 2024-05-10 南京航空航天大学 Multi-scale large-scale component assembly gap point cloud filtering method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102831402A (en) * 2012-08-09 2012-12-19 西北工业大学 Sparse coding and visual saliency-based method for detecting airport through infrared remote sensing image
CN105930868A (en) * 2016-04-20 2016-09-07 北京航空航天大学 Low-resolution airport target detection method based on hierarchical reinforcement learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Hierarchical reinforcement learning for saliency detection of low-resolution airports; Danpei Zhao et al.; 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS); 2016-07-15; pp. 1622-1625 *

Also Published As

Publication number Publication date
CN106909902A (en) 2017-06-30

Similar Documents

Publication Publication Date Title
CN106909902B (en) Remote sensing target detection method based on improved hierarchical significant model
CN108573276B (en) Change detection method based on high-resolution remote sensing image
US10607362B2 (en) Remote determination of containers in geographical region
CN104915636B (en) Remote sensing image road recognition methods based on multistage frame significant characteristics
Chen et al. Vehicle detection in high-resolution aerial images via sparse representation and superpixels
CN108776779B (en) Convolutional-circulation-network-based SAR sequence image target identification method
CN109800629B (en) Remote sensing image target detection method based on convolutional neural network
CN108596055B (en) Airport target detection method of high-resolution remote sensing image under complex background
CN103049763B (en) Context-constraint-based target identification method
CN106250895B (en) A kind of remote sensing image region of interest area detecting method
CN105825502B (en) A kind of Weakly supervised method for analyzing image of the dictionary study based on conspicuousness guidance
CN106022232A (en) License plate detection method based on deep learning
CN108629286B (en) Remote sensing airport target detection method based on subjective perception significance model
CN105528794A (en) Moving object detection method based on Gaussian mixture model and superpixel segmentation
CN109543632A (en) A kind of deep layer network pedestrian detection method based on the guidance of shallow-layer Fusion Features
CN111783523B (en) Remote sensing image rotating target detection method
CN109409240A (en) A kind of SegNet remote sensing images semantic segmentation method of combination random walk
Hormese et al. Automated road extraction from high resolution satellite images
CN103927511A (en) Image identification method based on difference feature description
CN108427919B (en) Unsupervised oil tank target detection method based on shape-guided saliency model
CN107545571A (en) A kind of image detecting method and device
US11804025B2 (en) Methods and systems for identifying topographic features
CN105931241A (en) Automatic marking method for natural scene image
CN115170805A (en) Image segmentation method combining super-pixel and multi-scale hierarchical feature recognition
Yuan et al. Efficient cloud detection in remote sensing images using edge-aware segmentation network and easy-to-hard training strategy

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant