CN110472658B - Hierarchical fusion and extraction method for multi-source detection of moving target - Google Patents
- Publication number: CN110472658B (application CN201910602605.0A)
- Authority: CN (China)
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F18/253 — Pattern recognition; fusion techniques of extracted features
- G06T5/40 — Image enhancement or restoration using histogram techniques
- G06T5/70 — Denoising; smoothing
- G06T7/11 — Region-based segmentation
- G06T7/13 — Edge detection
- G06T7/136 — Segmentation; edge detection involving thresholding
- G06T7/155 — Segmentation; edge detection involving morphological operators
- G06T7/33 — Image registration using feature-based methods
- G06T2207/10032 — Satellite or aerial image; remote sensing
- G06T2207/10036 — Multispectral image; hyperspectral image
Abstract
The invention belongs to the technical field of multi-source data hierarchical fusion and extraction based on multi-source sensors, and specifically relates to a hierarchical fusion and extraction method for multi-source detection of a moving target. A visible light image and an infrared image are registered and fused to obtain a first-layer fused image; after the first-layer fused image is registered with a hyperspectral image, the registered image pixels are attenuated according to the ground-feature classification areas to obtain a second-layer fused image; target detection is then performed on the second-layer fused image to obtain the position of the target in the image, the target is perceived to obtain its longitude and latitude in the real environment, and the attitude of the aircraft is adjusted to track the target, realizing continuous detection and perception of the target. By combining multiple image sources, the invention effectively merges their signal characteristics through image fusion, removes redundant repeated data, increases the accuracy of target detection and improves detection efficiency.
Description
Technical Field
The invention belongs to the technical field of multi-source data hierarchical fusion and extraction based on a multi-source sensor, and particularly relates to a hierarchical fusion and extraction method for multi-source detection of a moving target.
Background
With the progress and development of science and technology, the mass budget of spacecraft payloads has increased markedly. A larger payload mass means that more sensing devices can be carried, while the computing and information storage capabilities of the payload have also improved significantly, so that more complex computation can be performed on board. A spacecraft payload can carry various detection devices, such as a visible light sensor, an infrared sensor and a hyperspectral sensor, which respectively produce visible light, infrared and hyperspectral sensing data images.
Target detection is the precondition for tracking a target in an image, and establishing a fast, accurate and effective target detection method is a key problem. Image recognition applies the methods and techniques of pattern recognition to the image domain. Pattern recognition refers to processing and analyzing the various forms of information that characterize an object or phenomenon in order to describe, recognize, classify and interpret it, i.e., identification and classification performed by a computer; applied to images, it realizes a human-like cognition of the perceived scene. The main idea of image recognition is to build a feature library of known objects, collect features from an unknown image, compare the collected features with those in the known feature library, and consider a target found and recognized when the similarity exceeds a certain threshold.
Disclosure of Invention
The invention aims to provide a hierarchical fusion and extraction method for multi-source detection of a moving target.
The purpose of the invention is realized by the following technical scheme: the method comprises the following steps:
step 1: reading an image input by a multi-source image sensor;
step 2: carrying out image registration on the visible light image and the infrared light image, and fusing the registered images to obtain a first-layer fused image;
step 3: carrying out image registration on the first-layer fused image and the hyperspectral image, and attenuating the registered image pixels according to the ground-feature classification areas to obtain a second-layer fused image;
step 4: detecting the target in the second-layer fused image to obtain its position in the image, perceiving the target to obtain its longitude and latitude in the real environment, and adjusting the attitude of the aircraft to track the target, realizing continuous detection and perception of the target.
The present invention may further comprise:
the image registration method in step 2 and step 3 specifically comprises the following steps:
step 2.1: extracting an edge contour of the image to obtain an edge contour image of the original image;
extracting the contour of the image by using a phase congruency algorithm, where the phase congruency function is:

PC(x) = Σ_n A_n · cos(φ_n(x) − φ̄(x)) / Σ_n A_n

where A_n is the amplitude at scale n; φ_n(x) is the phase of the n-th Fourier component at x; φ̄(x) is the weighted mean of the local phase angles of the Fourier components at the point where PC(x) takes its maximum;
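The phase congruency measure above can be sketched at a single point as follows; the function name and the choice of the amplitude-weighted mean phase via the resultant phasor are illustrative assumptions, not details given in the patent text:

```python
import numpy as np

def phase_congruency(amplitudes, phases):
    """Sketch of PC(x) at one point: amplitudes[n] is A_n, phases[n] is phi_n(x).

    phi_bar is taken as the amplitude-weighted mean phase angle, computed
    as the angle of the summed phasor of all components (an assumed choice).
    """
    A = np.asarray(amplitudes, dtype=float)
    phi = np.asarray(phases, dtype=float)
    phi_bar = np.angle(np.sum(A * np.exp(1j * phi)))  # weighted mean phase
    return np.sum(A * np.cos(phi - phi_bar)) / (np.sum(A) + 1e-12)
```

When all component phases agree, PC approaches 1 (maximal phase congruency); when phases cancel, PC approaches 0.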
step 2.2: establishing a characteristic corner point with scale, position and direction information in the edge contour image, wherein the specific method comprises the following steps:
step 2.2.1: constructing a nonlinear scale space, so that the characteristic angular points have scale information;
carrying out Gaussian filtering on the edge contour image to obtain the image grey-level histogram and the contrast factor k; after choosing a group of evolution times, all information layers of the nonlinear filtered image are obtained with an additive operator splitting (AOS) algorithm:

L^{i+1} = (E − (t_{i+1} − t_i) · Σ_l A_l(L^i))^{−1} · L^i

where A_l is the conduction matrix of the image in dimension l; t_i is the evolution time, and only one group of evolution times is used to construct the nonlinear scale space at a time; E is the identity matrix;
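One implicit AOS diffusion step can be sketched in one dimension as below; the Perona-Malik conductivity g = 1 / (1 + (∇L / k)²) is an assumed (common) choice, and the function name is illustrative:

```python
import numpy as np

def aos_step(L, tau, k):
    """One implicit diffusion step on a 1-D signal: solve (E - tau*A) L_new = L.

    A is the conduction matrix built from Perona-Malik conductivities
    g = 1 / (1 + (grad/k)^2); E is the identity matrix.
    """
    L = np.asarray(L, dtype=float)
    n = len(L)
    g = 1.0 / (1.0 + (np.gradient(L) / k) ** 2)  # conduction coefficients
    A = np.zeros((n, n))
    for i in range(n):
        if i > 0:                                # coupling to left neighbour
            w = 0.5 * (g[i] + g[i - 1])
            A[i, i - 1] = w
            A[i, i] -= w
        if i < n - 1:                            # coupling to right neighbour
            w = 0.5 * (g[i] + g[i + 1])
            A[i, i + 1] = w
            A[i, i] -= w
    E = np.eye(n)
    return np.linalg.solve(E - tau * A, L)
```

Each step smooths the signal while conserving its total intensity, which is what makes the implicit AOS scheme stable for large time steps.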
step 2.2.2: detecting characteristic angular points to obtain characteristic angular point position information;
moving a local window point by point in an edge contour image of a nonlinear scale space, and calculating pixel values in the window to judge whether the window is an angular point;
step 2.2.3: calculating the direction information of the feature corner points;
Let the coordinates of a feature corner p(i) in the image be (x(i), y(i)); two points p(i−k) and p(i+k) are selected in its neighborhood, each at distance k from p(i); T is the tangent at p(i), and the main direction of the feature corner p(i) is the angle θ_feature between the tangent T and the positive x axis, computed as:

θ_feature = arctan[(y(i+k) − y(i−k)) / (x(i+k) − x(i−k))]
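The orientation assignment above reduces to a chord-direction estimate; a minimal sketch (function name assumed, using `arctan2` to keep the correct quadrant):

```python
import numpy as np

def corner_orientation(contour, i, k):
    """Main direction of feature corner p(i): angle between the tangent at
    p(i), approximated by the chord through p(i-k) and p(i+k), and the
    positive x axis."""
    x0, y0 = contour[i - k]
    x1, y1 = contour[i + k]
    return np.arctan2(y1 - y0, x1 - x0)
```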
step 2.3: establishing a shape description matrix;
let feature point set P { P } 1 ,p 2 ,...p n ,},p i ∈R 2 Establishing a polar coordinate system in an r multiplied by r neighborhood taking a certain characteristic point p (i) as an origin and taking the p (i) point as a center, equally dividing 360 degrees to obtain 12 sectors, and sequentially taking the sectors as the sectors according to the radiusDrawing five concentric circles to obtain 60 small areas; counting the number of characteristic points in each cell, and calculating p i The shape histogram hi of each feature point is the shape context descriptor of each feature point; the shape histogram hi of each feature point is calculated by:
hi(k)=#{q≠p i :(q-p i )∈bin(k)}
wherein # represents the number of feature points in the statistical kth (k =1, 2.. 60) region;
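The 60-bin log-polar histogram can be sketched as follows; the concrete radius values are left as a parameter since the patent's radius list is not reproduced in the text, and the function name is illustrative:

```python
import numpy as np

def shape_context(points, i, radii, n_sectors=12):
    """Shape histogram h_i for points[i]: n_sectors angular sectors times
    len(radii) concentric rings centred on p_i; each other point falls
    in the first ring whose radius contains it."""
    points = np.asarray(points, dtype=float)
    p = points[i]
    hist = np.zeros(n_sectors * len(radii), dtype=int)
    for j, q in enumerate(points):
        if j == i:
            continue
        d = q - p
        r = np.hypot(d[0], d[1])
        theta = np.arctan2(d[1], d[0]) % (2 * np.pi)
        sector = int(theta / (2 * np.pi / n_sectors))
        for ring, rmax in enumerate(radii):     # innermost containing ring
            if r <= rmax:
                hist[ring * n_sectors + sector] += 1
                break
    return hist
```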
step 2.4: matching the characteristic angular points of the two images to complete image registration;
searching for the nearest-neighbor and second-nearest-neighbor feature points using the Euclidean distance:

d(a, b) = sqrt( Σ_{i=0}^{59} (a_i − b_i)² )

where a_i is the i-th entry of the shape context descriptor R(a_0, a_1, ..., a_59) of any feature point of the reference image, and b_i is the i-th entry of the shape context descriptor I(b_0, b_1, ..., b_59) of any feature point of the image to be registered;
If p is any feature point in one image, let i and j be the nearest-neighbor and second-nearest-neighbor feature points of p in the image to be registered, with Euclidean distances D_ip and D_jp to p. Compute the ratio D_ip / D_jp; when this ratio is below a set threshold, p and i are considered a correctly matched pair of feature points, otherwise the match fails.
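The nearest/second-nearest ratio test can be sketched as follows; the function name and the 0.8 default threshold are illustrative assumptions:

```python
import numpy as np

def ratio_match(desc_p, candidates, thresh=0.8):
    """Return the index of the nearest candidate descriptor if the ratio
    D_ip / D_jp (nearest over second-nearest Euclidean distance) is below
    `thresh`; otherwise the match fails and None is returned."""
    d = np.linalg.norm(np.asarray(candidates, float) - np.asarray(desc_p, float),
                       axis=1)
    order = np.argsort(d)
    i, j = order[0], order[1]
    if d[i] / (d[j] + 1e-12) < thresh:
        return int(i)
    return None
```

A low ratio means the best match is clearly better than the runner-up, which is what makes the pairing reliable.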
The method for fusing the registered visible light image and infrared light image in the step 2 specifically comprises the following steps:
step 3.1: performing region segmentation on the registered infrared image to separate the suspected region from the background region; the suspected region is the high-brightness area of the infrared radiation image;
step 3.2: respectively carrying out dual-tree complex wavelet transformation on the infrared image and the visible light image after registration to obtain low-frequency information and high-frequency information of the image, wherein the basic information of the image corresponds to the low-frequency information of a wavelet transformation result, and the detail information of the image corresponds to the high-frequency information of the wavelet transformation result;
step 3.3: fusing the result of image segmentation and the result of wavelet transformation to respectively obtain a low-frequency fused image and a high-frequency fused image;
step 3.4: and performing dual-tree complex wavelet inverse transformation on the low-frequency fusion image and the high-frequency fusion image to obtain a first fusion image.
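The combination rules of step 3.3 can be sketched once the decomposition is available; this assumes the dual-tree complex wavelet subbands have already been computed (for instance with a DTCWT library). Keeping the infrared low-frequency coefficients in the suspected region follows the segmentation idea above, while the simple average for the background stands in for the variance-weighted rule described later, and the max-magnitude rule for high frequencies is a common choice, not stated explicitly in the text:

```python
import numpy as np

def fuse_subbands(low_ir, low_vis, high_ir, high_vis, suspect_mask):
    """Fuse wavelet subbands of registered IR/visible images (sketch).

    Low frequency: IR coefficients inside the suspected (bright-target)
    mask, average elsewhere. High frequency: coefficient of larger
    magnitude, preserving the sharpest detail from either source.
    """
    low_ir, low_vis = np.asarray(low_ir, float), np.asarray(low_vis, float)
    high_ir, high_vis = np.asarray(high_ir, float), np.asarray(high_vis, float)
    low_f = np.where(suspect_mask, low_ir, 0.5 * (low_ir + low_vis))
    high_f = np.where(np.abs(high_ir) >= np.abs(high_vis), high_ir, high_vis)
    return low_f, high_f
```

The inverse transform of step 3.4 would then be applied to `low_f` and `high_f` to produce the first fused image.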
The method for detecting the target of the second-layer fusion image in the step 4 to obtain the position information of the target in the image specifically comprises the following steps:
step 4.1: filtering the second layer fusion image;
establishing a window matrix that scans the two-dimensional image pixel by pixel, where the value at the centre of the window is replaced by the mean of all point values inside the window:

g(x, y) = (1/M) · Σ_{(s,t)∈S} f(s, t)

where f(x, y) is the second-layer fused image to be processed; g(x, y) is the filtered second-layer fused image; S is the set of neighborhood coordinate points centred on (x, y), and M is the total number of coordinates in the set;
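The neighbourhood-average filter above can be sketched directly; the window is clipped at image borders, an assumed boundary treatment:

```python
import numpy as np

def mean_filter(f, w=3):
    """Replace each pixel with the mean of the w x w window S centred on it
    (M = number of points in the clipped window)."""
    f = np.asarray(f, dtype=float)
    g = np.empty_like(f)
    r = w // 2
    for x in range(f.shape[0]):
        for y in range(f.shape[1]):
            win = f[max(x - r, 0):x + r + 1, max(y - r, 0):y + r + 1]
            g[x, y] = win.mean()
    return g
```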
step 4.2: processing the second-layer fusion image after filtering by using a moving average image threshold method to obtain a binary image;
Let z_{k+1} denote the point encountered at step k + 1 in the scan order; the moving average grey level at the new point is:

m(k + 1) = m(k) + (z_{k+1} − z_{k+1−n}) / n

where n is the number of points used in computing the average grey level, with initial value m(1) = z_1 / n;
The moving average is computed for every point in the image, and the segmentation is performed as:

g(x, y) = 1 if f(x, y) > K · m_xy, else 0

where K is a constant in the range [0, 1] and m_xy is the moving average of the input image at (x, y);
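The moving-average thresholding can be sketched as below; the row-major scan order, the zero value assumed for points before the start of the scan, and the function name are illustrative assumptions:

```python
import numpy as np

def moving_average_threshold(img, n=8, K=0.5):
    """Scan pixels in row order, maintain the running mean m over the last
    n grey values via m(k+1) = m(k) + (z_{k+1} - z_{k+1-n}) / n, and output
    1 where the pixel exceeds K * m, else 0."""
    z = np.asarray(img, dtype=float).ravel()
    out = np.zeros_like(z)
    m = z[0] / n                         # initial value m(1) = z_1 / n
    for k, zk in enumerate(z):
        z_old = z[k - n] if k >= n else 0.0   # point leaving the window
        if k > 0:
            m = m + (zk - z_old) / n
        out[k] = 1.0 if zk > K * m else 0.0
    return out.reshape(np.asarray(img).shape)
```

Because the threshold tracks the local mean, a bright target stands out against a locally estimated background rather than a single global level.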
step 4.3: deleting connected regions whose area is smaller than the target area from the binary image, removing the interference of irrelevant information;
step 4.4: processing the binary image without the interference of the irrelevant information by using image morphology;
step 4.5: establishing a cutting function, and cutting the target from the whole image subjected to image morphological processing to obtain a target image to be checked;
the background part in the image I after the image morphological processing is a black value of 0, and the target part to be checked is a white value of 1; starting from the coordinate of the image (0, 0), finding the first point with the pixel value of 1, starting from the point, finding all the points with the pixel value of 1 connected with the point, and naming all the points as a set T 1 In the set T 1 Find the maximum x of the abscissa in the coordinates of the point in (1) 1max And minimum value x 1min In the set T 1 Find the maximum value y of the ordinate in the coordinates of the point in (1) 1max And the minimum value y 1min Then cut the obtained target image to be checkedBy parity of reasoning, all the objects to be checked are found, and all the images of the objects to be checked are obtained
Step 4.6: finding out the main symmetry axis of the target image to be checked by using a principal component analysis method, and obtaining the included angle theta between the main symmetry axis of the target image to be checked and the x axis To be checked ;
The coordinates of each point in the image information of the target to be checked are two-dimensional, and the points are formed into n To be checked Row 2 column matrix X To be checked Wherein n is To be tested Calculating X for the number of points in the target image information to be checked To be checked Covariance matrix C of To be checked And continuously calculating the covariance matrix C To be checked Characteristic vector V of To be checked =(x v ,y v ) The included angle theta between the main symmetry axis of the target image to be inspected and the x axis To be tested Comprises the following steps:
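The PCA axis angle can be sketched as follows; the function name is illustrative:

```python
import numpy as np

def principal_axis_angle(points):
    """Angle between the main symmetry axis of a 2-D point set and the
    x axis: the angle of the eigenvector (x_v, y_v) of the 2x2 covariance
    matrix belonging to the largest eigenvalue."""
    X = np.asarray(points, dtype=float)     # n x 2 coordinate matrix
    C = np.cov(X, rowvar=False)             # 2x2 covariance matrix
    vals, vecs = np.linalg.eigh(C)
    xv, yv = vecs[:, np.argmax(vals)]       # leading eigenvector
    return np.arctan2(yv, xv)
```

Note the eigenvector sign is arbitrary, so the angle is defined modulo π; rotating the image by this angle aligns the main axis with the x axis either way.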
step 4.7: performing image direction normalization: rotating the target image to be checked by the angle θ_check and removing the newly generated black borders;
step 4.8: carrying out image size normalization processing, and changing the image size of the target image to be checked after the direction normalization processing into the size of a template;
step 4.9: and matching the target image to be checked after the direction normalization and the size normalization with the images in the template library one by one, setting a similarity threshold T, and identifying the image as a target when the similarity degree exceeds the threshold.
The method for obtaining the low-frequency fusion image in the step 3.3 comprises the following steps: according to the position information of the infrared image divided into the suspected area and the background area, the visible light image is divided according to the same position information; for the suspected area of the low-frequency part of the infrared image and the visible light image, the following rule is adopted:
where C_F^l is the fused low-frequency coefficient of the l-th layer, C_ir^l is the low-frequency coefficient of the l-th-layer infrared image, and C_vis^l is the low-frequency coefficient of the l-th-layer visible light image;
For the background area of the low-frequency parts of the infrared and visible light images, a region-variance method is adopted: the larger the region variance, the larger the grey-value change of the pixels in the region, the higher the contrast, and the more information the region carries. Pixels in regions with large variance are given a larger weight in the image fusion, according to the rule:

where ω_ir is the infrared image weight and ω_vis is the visible light image weight. The infrared image weight ω_ir and the visible light image weight ω_vis are computed as:

ω_ir = 1 − ω_vis

where σ_vis and σ_ir are the region variances of the visible light and infrared images respectively, and r is the correlation coefficient of the region. The region variances σ_vis and σ_ir are computed as:

σ_vis = (1/(M·N)) · Σ_i Σ_j (I_vis(i,j) − Ī_vis)²,  σ_ir = (1/(M·N)) · Σ_i Σ_j (I_ir(i,j) − Ī_ir)²
the calculation method of the correlation coefficient r comprises the following steps:
in which the size of the image is M x N,represents the average gray-scale value of the visible light image,representing the mean gray value of the infrared image, I ir (I, j) represents an infrared image, I vis (i, j) represents a visible light image.
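The region variances and the correlation coefficient can be computed as below; the exact formula combining them into ω_vis is not reproduced in the text, so only these building blocks are shown (function name assumed):

```python
import numpy as np

def region_stats(I_vis, I_ir):
    """Region variance of each M x N image about its mean grey value, and
    the Pearson-style correlation coefficient r between the two regions,
    as used to weight the visible/IR low-frequency background fusion
    (omega_ir = 1 - omega_vis per the rule above)."""
    v = np.asarray(I_vis, float)
    w = np.asarray(I_ir, float)
    sigma_vis = np.mean((v - v.mean()) ** 2)
    sigma_ir = np.mean((w - w.mean()) ** 2)
    num = np.sum((v - v.mean()) * (w - w.mean()))
    den = np.sqrt(np.sum((v - v.mean()) ** 2) * np.sum((w - w.mean()) ** 2))
    r = num / den if den else 0.0
    return sigma_vis, sigma_ir, r
```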
The invention has the following beneficial effects:
The invention applies the visible light, infrared and hyperspectral images simultaneously, combining the high resolution of the visible light image, the high target contrast of the infrared image, and the ability of the hyperspectral image to distinguish artificial from natural objects; target detection judgment is therefore accurate, and the influence of the Earth's atmospheric activity on target detection is effectively reduced. By combining multiple image sources through image fusion, the signal characteristics of the sources are effectively merged and redundant repeated data is removed, which effectively increases the accuracy of target detection and improves detection efficiency. The invention can also track the target and predict position information such as heading, speed, longitude and latitude.
Drawings
FIG. 1 is a schematic overall flow diagram of the present invention.
Fig. 2 is a schematic diagram of the registration and fusion process of the infrared image and the visible light image.
FIG. 3 is a schematic diagram of a fusion process of a first layer fused image and a hyperspectral image according to the present invention.
FIG. 4 is a schematic diagram of the object detection and sensing process of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
The invention belongs to the technical field of multi-source data hierarchical fusion and extraction based on multi-source sensors. A spacecraft payload often carries various detection devices, such as a visible light sensor, an infrared sensor and a hyperspectral sensor, which respectively produce visible light, infrared and hyperspectral sensing data images; the invention relates in particular to techniques for fusing such multi-source images, recognizing images and perceiving targets.
In the invention, image data comes from multiple sensors, so a target at the same position often appears in several images; however, the images from each sensor emphasize different information, and similar background information across different images causes data redundancy. Therefore, before image processing technologies such as image segmentation, target detection and target perception are executed, the images are fused: a large amount of information is integrated and screened so that redundant information is removed and the effective data in the different source images is retained. The image fusion technology comprises image registration and image data fusion.
The invention aims to utilize a multi-source image sensor while accounting for image distortion caused by aircraft attitude maneuvers, spatial attitude disturbance, gimbal vibration and similar factors, and for target occlusion in the image caused by atmospheric motion, sea-surface conditions and the like. By fully exploiting the different sensitivities of different sensors to different signal characteristics, multi-layer image registration and image fusion make the image meet the requirements of clarity and target saliency and, to a certain extent, reveal occluded targets. A target detection algorithm is then designed on the fused image to achieve multi-range, multi-source target detection and, combined with the attitude and orbit information and attitude maneuvers of the aircraft, the functions of target discovery, tracking and position prediction within a certain time and space.
The purpose realization mode of the invention is as follows:
step 1: reading an image input by a multi-source image sensor;
step 2: carrying out image registration on the visible light image and the infrared light image, and fusing the registered images to obtain a first layer of fused image;
and step 3: carrying out image registration on the first layer of fused image and the hyperspectral image, and weakening the registered image pixels according to the ground feature classification area to obtain a second layer of fused image;
and 4, step 4: and detecting the target of the second layer fused image to obtain the position information of the target in the image, sensing the target to obtain the longitude and latitude of the target in the real environment, adjusting the attitude of the aircraft to track the target, and realizing continuous detection and sensing of the target.
1) First-layer fusion: establish an infrared/visible-light registration and fusion algorithm. Visible light offers high resolution but is strongly affected by the atmosphere, while the infrared image has high target contrast but low environmental resolution; the fused image therefore has a prominent target while preserving some detail of the surroundings.
2) And (3) second layer fusion: and establishing a registration fusion algorithm of the hyperspectral image and the first-layer fusion image to achieve the purposes of highlighting the target and weakening the influence of irrelevant information.
3) And carrying out image recognition on the second layer fused image. And establishing a target detection algorithm with certain practical application capability and feasibility.
4) Target perception: on the basis of the target found in step 3), roughly calibrate the longitude and latitude of the target on the Earth by combining the attitude and orbit information of the aircraft, the Earth's rotation, the sub-satellite point position vector, the optical-axis direction vector and other information; on that basis, predict the target track and continue perception and discovery.
Through the steps, the capability of independently and autonomously discovering, identifying and tracking the target by the effective load can be achieved.
The invention has the following advantages. First, the image source is an aircraft, giving global, all-weather, wide-range and highly timely coverage. Second, the invention applies the visible light, infrared and hyperspectral images simultaneously, combining the high resolution of visible light, the high target contrast of infrared, and the ability of the hyperspectral image to distinguish artificial from natural objects, which substantially improves the target detection rate and accuracy. Third, the invention fully exploits the advantages of the platform and can, to a certain degree, track the target and predict its position (heading, speed, longitude and latitude).
First, because of aircraft platform vibration and focus bias, image registration must be addressed before fusing the infrared and visible light images. Taking the target contours in the infrared and visible images as the core, a data matching and fusion method based on shape descriptors is selected. Image registration adopts a fast approximate nearest-neighbor algorithm, which has been widely applied to images under such conditions and offers good stability and universality. On this basis, to further strengthen the robustness of the algorithm, a random sample consensus (RANSAC) algorithm is used to screen the matched feature points, remove erroneous points and keep an optimal set.
The image region segmentation method is based on image saliency, and the image fusion algorithm builds on the segmentation result. In the image processing, a saliency-enhancement method is used to strengthen the target region of the infrared image, weaken background information, and suppress noise to a certain extent. Combined with a dual-tree complex wavelet algorithm, the high-frequency and low-frequency components of the image can be separated effectively; taking the imaging characteristics and image quality of the infrared and optical images into account, different fusion strategies are adopted for the high and low frequencies respectively. This completes the first-layer image fusion.
Second, hyperspectral images have irreplaceable advantages in analysing ground objects. The invention adopts a spectral-similarity classification method to improve the accuracy of ground-object detection and performs a further layer of image fusion after classifying ground objects as artificial or non-artificial, thereby weakening interfering background information such as non-artificial objects and providing a solid basis and effective support for final target discrimination. This completes the second-layer image fusion.
Finally, based on the fused image, in which redundant information has been effectively weakened and the target highlighted, an image recognition algorithm based on template matching continuously marks the target's position and track, predicts its motion to a certain extent (longitude and latitude, course, speed, and so on), and guides the attitude change of the aircraft, ensuring autonomous continuous recognition and tracking of the target.
The heterogeneous sensing data used in the present invention come from a visible-light sensor and an infrared sensor, respectively. The image characteristics obtained by these two sensors are shown in Table 1 below. The payload is mounted on the aircraft platform, so image acquisition must account for atmospheric motion and for environmental reflection, refraction, and scattering; for the attitude adjustment, disturbance vibration, and stability of the platform; and for differences in image position, orientation, scale, and shape caused by photoelectric platform assembly and debugging, platform performance, and other factors. The aim is therefore to minimise the influence of external factors on the images while maximising the advantages of the two sensors' data.
TABLE 1 characteristics of image acquisition by visible light sensor and infrared sensor
The invention includes an image registration process. The image contour is first extracted using a phase congruency (Phase Congruency) algorithm. The phase congruency function is defined as:
where A_n is the amplitude at scale n, φ_n(x) is the phase of the n-th Fourier component at x, and φ̄(x) is the weighted mean of the local phase angles of the Fourier components at the point where PC(x) attains its maximum. This yields an edge contour image of the original image, which is used in subsequent processing.
The invention includes a method of establishing feature points. The nonlinear scale space is constructed as follows: the input image is Gaussian filtered, its grey-level histogram is computed, and a contrast factor k is obtained. After converting a set of computation times, all levels of the nonlinear-filtered image are obtained with an additive operator splitting algorithm:
where A_l is the conduction matrix of the image I at scale l, t_i is the computation time (only one set of computation times is used to construct the nonlinear scale space at a time), and E is the identity matrix.
The corner detection method moves a local window point by point across the image and evaluates the pixel values inside the window to decide whether the centre is a corner. Let E(u, v) be the grey-level change produced when the local window C is translated by (u, v), i.e.:
where I(x, y) is the grey value of the image at (x, y) and w(x, y) is a Gaussian weighting function.
To obtain as large a value of E(u, v) as possible, the equation above is expanded:
and then rewritten in matrix form:
where I_x and I_y are the gradient components of the image grey level in the x and y directions.
The local autocorrelation function E (u, v) can be approximated as an elliptical function:
E(u, v) ≈ Au² + 2Cuv + Bv²
The points of equal correlation around a point form an ellipse, every point of which has the same correlation with the centre; the eigenvalues λ₁, λ₂ of the second-order matrix M correspond to the major and minor axes of the ellipse and represent the rates of grey-level change. When both eigenvalues are large and of comparable magnitude, the point is a corner; when one is large and the other small, it is an edge; when both are small, the region is flat. To make corner detection scale-invariant, the corner detection algorithm is embedded in the nonlinear scale space introduced above, so that the feature points carry both scale and position information; the corner response function is:
where σ_{i,S} is a scale factor and the remaining terms are the second-order derivatives and mixed partial derivative of the grey level in the x and y directions. Points satisfying the corner response function are taken as corners.
Appropriate direction information is then added to the feature corners. Given the coordinates (x(i), y(i)) of a feature corner p(i) in the image, two points p(i−k) and p(i+k) are selected in its neighbourhood at distance k from p(i); T is the tangent at p(i), and the main direction of the feature corner p(i) is the angle θ between the tangent T and the positive x-axis, computed as:
the main direction is determined for the characteristic corner points on the basis of the above, which has the advantage of making it rotation-invariant. The method can be well used for solving the problem of matching characteristic angular points between the infrared image and the visible light image, and is referred to as characteristic points hereinafter.
The invention includes a shape descriptor generation algorithm. Let the feature point set be P = {p_1, p_2, …, p_n}, p_i ∈ R². In an r × r neighbourhood centred on a feature point p_i, a polar coordinate system is established with p_i as the origin; the 360° around it is divided equally into 12 sectors, and five concentric circles are drawn at successive radii, yielding 60 small regions. The number of feature points falling in each cell is counted to form the shape histogram h_i of point p_i, defined as follows:
h_i(k) = #{q ≠ p_i : (q − p_i) ∈ bin(k)}
where # denotes the number of feature points counted in the k-th region (k = 1, 2, …, 60). The shape histogram of each feature point is its shape context descriptor.
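The 60-bin construction above (12 angular sectors × 5 radial rings) can be sketched as follows; the uniform radial bin edges and the neighbourhood radius r_max are assumptions, since the patent's exact radius schedule survives only as a formula image:

```python
import numpy as np

def shape_context(points, i, r_max=1.0, n_ang=12, n_rad=5):
    """60-bin shape-context histogram of feature point i.
    Assumption: radial rings are uniformly spaced up to r_max."""
    p = points[i]
    d = points - p                                  # offsets q - p_i
    mask = np.ones(len(points), bool)
    mask[i] = False                                 # exclude q == p_i
    dist = np.linalg.norm(d, axis=1)
    mask &= dist <= r_max                           # keep the local neighbourhood
    ang = np.mod(np.arctan2(d[:, 1], d[:, 0]), 2 * np.pi)
    a_bin = np.minimum((ang / (2 * np.pi) * n_ang).astype(int), n_ang - 1)
    r_bin = np.minimum((dist / r_max * n_rad).astype(int), n_rad - 1)
    hist = np.zeros(n_ang * n_rad, int)
    np.add.at(hist, (r_bin * n_ang + a_bin)[mask], 1)   # count points per cell
    return hist
```

Each descriptor is then a length-60 integer histogram, one per feature point.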
The invention comprises a matching method for the feature point sets. For each feature point, the nearest-neighbour and next-nearest-neighbour feature points are searched with a fast approximate nearest-neighbour algorithm using the Euclidean distance, defined as:
where a_i is the i-th element of the shape context descriptor R(a_0, a_1, …, a_59) of an arbitrary feature point of the reference image, and b_i is the i-th element of the shape context descriptor I(b_0, b_1, …, b_59) of an arbitrary feature point of the image to be registered. Concretely, let p be any feature point in the infrared image, and let i and j be its nearest-neighbour and next-nearest-neighbour candidates in the image to be registered, with Euclidean distances D_ip and D_jp to p respectively. A threshold is set on the ratio of these distances: when the ratio is below a certain value, p and i are considered a correctly paired set of feature points; otherwise the match fails.
To further strengthen the robustness of the algorithm, a random sample consensus (RANSAC) algorithm is selected to screen the matched feature points, remove erroneous points, and retain an optimal set. The algorithm substitutes the position parameters of all best-matching feature point pairs into an image-space projective transformation model and obtains the projective transformation through a direct linear transformation algorithm; the registration parameters of the image are the affine transformation between the infrared image and the visible-light image. At this point the registration between the infrared and visible-light images is complete.
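The nearest/next-nearest ratio screening that precedes the RANSAC step can be sketched as below; the brute-force search stands in for the fast approximate nearest-neighbour algorithm, and the 0.7 ratio threshold is an assumed value:

```python
import numpy as np

def ratio_match(desc_a, desc_b, thresh=0.7):
    """Keep a pairing only when the nearest neighbour is clearly closer
    than the next-nearest one (D_ip / D_jp below the threshold)."""
    matches = []
    for ia, a in enumerate(desc_a):
        d = np.linalg.norm(desc_b - a, axis=1)      # Euclidean distances to all candidates
        i, j = np.argsort(d)[:2]                    # nearest and next-nearest
        if d[i] < thresh * d[j]:                    # unambiguous pair -> accept
            matches.append((ia, int(i)))
    return matches
```

The accepted pairs would then be passed to RANSAC for outlier removal.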
The present invention includes an image fusion process. In outline: the infrared image is first region-segmented to separate its highlight region from its background region, and the visible-light image is mapped correspondingly according to this segmentation. Dual-tree complex wavelet transforms are then applied to the infrared and visible-light images separately, producing low-frequency and high-frequency information for each; the basic content of the image corresponds to the low-frequency part of the wavelet result and the detail to the high-frequency part. The segmentation result and the wavelet result are then combined. When the low-frequency information is processed, different fusion strategies are adopted for the highlight and background regions according to the information they carry and the task requirements. When the high-frequency information is processed, since it mainly reflects the detail features of the image, each region is assigned a weight according to the richness of its detail, and the fusion strategy is designed from these weights. The specific steps are:
step 3.1: region-segment the registered infrared image, separating its suspected region from its background region; the suspected region is the highlight region where the infrared radiation is strong;
step 3.2: apply dual-tree complex wavelet transforms to the registered infrared and visible-light images separately to obtain the low-frequency and high-frequency information of each image; the basic content of the image corresponds to the low-frequency part of the wavelet result and the detail to the high-frequency part;
step 3.3: fuse the image segmentation result with the wavelet transform result to obtain a low-frequency fused image and a high-frequency fused image respectively.
To select the highlight region of the infrared image, saliency-enhancement processing is first applied to the registered infrared image, strengthening the hot-target information and blurring the background so as to raise the contrast of the whole infrared image. The saliency-enhancement algorithm is based mainly on the image histogram. The saliency of pixel I_c in image I is defined as:
where Dis(I_c, I_i) = ||I_c − I_i|| is the colour distance, representing the difference in colour between I_c and I_i. The formula above can be rewritten as:
where a_c is the value of pixel I_c, n is the total number of grey levels in the image, and f_j is the probability with which a_j occurs in the image. Computing this over the image yields the saliency map I_sal.
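The histogram-based saliency definition can be sketched for a grey-level image as follows; using grey-level differences in place of colour distances is an assumption:

```python
import numpy as np

def histogram_saliency(img):
    """Histogram-contrast saliency: Sal(a_c) = sum_j f_j * |a_c - a_j|,
    evaluated once per grey level and broadcast back to the image."""
    levels = np.arange(256)
    f = np.bincount(img.ravel(), minlength=256) / img.size   # f_j
    # |a_c - a_j| weighted by the frequency of each grey level a_j
    sal_per_level = np.abs(levels[:, None] - levels[None, :]) @ f
    return sal_per_level[img]                                # saliency map I_sal
```

Rare grey levels far from the bulk of the histogram receive high saliency, which is what lets a hot target stand out from the background.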
The present invention encompasses a region segmentation algorithm. Each pixel in the image is represented by a mixture of K Gaussian mixture model components, k ∈ {1, 2, …, K}. Each pixel corresponds either to the target Gaussian mixture model or to the background Gaussian mixture model. The Gibbs energy function of the region segmentation algorithm is therefore:
E(α,k,θ,z)=U(α,k,θ,z)+V(α,z)
where z is the pixel value and α ∈ {0, 1}: a pixel with α = 0 belongs to the background and a pixel with α = 1 belongs to the target. U is the region term and V the boundary term. The region term U is computed with:
θ = {π(α, k), μ(α, k), Σ(α, k)}, α = 0, 1, k = 1, …, K
The region term distinguishes pixels of the target region from pixels of the background region; once the parameter θ has been determined by learning, the Gibbs region energy term is determined.
The boundary term V is computed as:
where γ is an empirical constant obtained from training, C is the set of adjacent pixel pairs, and the indicator [α_n ≠ α_m] takes only the values 1 or 0: [α_n ≠ α_m] = 1 when α_n ≠ α_m, and [α_n ≠ α_m] = 0 when α_n = α_m.
β = (2⟨||z_m − z_n||²⟩)⁻¹, where ⟨·⟩ denotes the mathematical expectation over the sample. β reflects the contrast of the image and lets the boundary term be determined under both high and low contrast. The image is segmented with a max-flow/min-cut algorithm, the Gaussian mixture model parameters are optimised after each cut, and the iteration is repeated until the energy function is minimised, at which point the image segmentation is complete. As applied in this invention, the saliency map I_sal described above is used as the initialisation of the segmentation algorithm to mark the highlight and background regions of the image, and iterative segmentation of the marked regions yields the segmentation result.
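The contrast-adaptive constant β and the resulting boundary weights can be sketched as below for horizontal neighbour pairs; the value of γ and the restriction to one neighbour direction are assumptions:

```python
import numpy as np

def beta_and_boundary(img, gamma=50.0):
    """beta = (2 <||z_m - z_n||^2>)^(-1) over horizontal neighbour pairs, and
    the Gibbs boundary weights gamma * exp(-beta * ||z_m - z_n||^2), which
    are paid only across pairs where alpha_n != alpha_m."""
    diff2 = (img[:, 1:].astype(float) - img[:, :-1]) ** 2   # ||z_m - z_n||^2
    beta = 1.0 / (2.0 * diff2.mean())                       # contrast-adaptive constant
    V = gamma * np.exp(-beta * diff2)                       # per-pair boundary penalty
    return beta, V
```

High-contrast images give a small β, so only very large intensity jumps escape the smoothness penalty; low-contrast images give a large β with the opposite effect.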
The invention includes an image fusion rule based on the dual-tree complex wavelet transform (DTCWT); the dual-tree complex wavelet function is defined as:
ψ(x) = ψ_h(x) + jψ_g(x)
where ψ_h(x) and ψ_g(x) are both real wavelets. After the two-dimensional DTCWT, the image decomposes into two low-frequency wavelet coefficients and high-frequency coefficients in six directions (±15°, ±45°, ±75°).
For the low-frequency part, the infrared image is first divided with the region segmentation method above into a suspected region and an excluded region, and the position information is recorded; the visible-light image is divided according to the same position information, and if the resolutions differ, the position information is normalised by a scaling coefficient before use.
According to the position information dividing the infrared image into suspected and background regions, the visible-light image is divided identically. For the suspected region of the low-frequency part of the infrared and visible-light images, the following rule is adopted:
where the three quantities are, respectively, the low-frequency coefficient of the fused image at layer l, the low-frequency coefficient of the infrared image at layer l, and the low-frequency coefficient of the visible-light image at layer l;
For the background (excluded) region of the low-frequency part of the infrared and visible-light images, a region-variance method is adopted: the larger the region variance, the larger the change of grey value across the pixels of the region, the higher the region's contrast, and the more information the region can be considered to carry. Accordingly, pixels in regions with large variance are given larger weight in the fusion, with the rule:
where ω_ir is the infrared image weight and ω_vis the visible-light image weight, computed as:
ω_ir = 1 − ω_vis;
where σ_vis and σ_ir are the region variances of the visible-light and infrared images and r is the correlation coefficient of the region. The region variances σ_vis and σ_ir are computed as:
The correlation coefficient r is computed as:
where the image size is M × N, Ī_vis is the mean grey value of the visible-light image, Ī_ir is the mean grey value of the infrared image, I_ir(i, j) is the infrared image, and I_vis(i, j) is the visible-light image.
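The region-variance weighting for the background region can be sketched as follows; since the patent's exact weight formula survives only as an image, a simple variance-ratio normalisation stands in, keeping only the stated constraint ω_ir = 1 − ω_vis:

```python
import numpy as np

def variance_weights(block_vis, block_ir):
    """Background-region low-frequency fusion by region variance: the
    higher-variance (higher-contrast) source gets the larger weight.
    Assumption: w_vis = s_vis / (s_vis + s_ir); the patent ties the
    weights to the region variances and correlation coefficient r."""
    s_vis, s_ir = block_vis.var(), block_ir.var()
    w_vis = s_vis / (s_vis + s_ir)      # assumes the two blocks are not both constant
    w_ir = 1.0 - w_vis                  # omega_ir = 1 - omega_vis
    fused = w_vis * block_vis + w_ir * block_ir
    return fused, (w_vis, w_ir)
```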
For the high-frequency part, the image is divided into n regions by the image segmentation method above, denoted a = {a_1, a_2, …, a_n}. Each region is assigned its own region weight, and the invention gives the following weighted high-frequency fusion rule.
Here a value C_{l,θ} > 1 is set to amplify the high-frequency coefficients so as to highlight the contrast of detail in the image. Since noise in the image would be amplified as well, a binary matrix M_{l,θ} is added: M_{l,θ} = 1 where the coefficient exceeds the threshold, and isolated points with value 1 are removed, so that only high-frequency coefficient pixels connected into patches are amplified while noise is removed. A shrinkage function is applied to reduce the effect of noise on the high-frequency information. In the actual fusion, concave-convex variation along edges may distort the fusion result, so a unit vector is computed from the high-frequency coefficients obtained by the DTCWT of the infrared and visible-light images and used to refine the original high-frequency coefficients; the high-frequency coefficient of the fused image is then rewritten as:
where the fusion rule f takes the weights in the infrared and visible-light images as reference to compute the mean of the high-frequency coefficients over each image region r_i; that mean is then the corresponding high-frequency coefficient of the fused image. S_{l,θ} is the highlight-region segmentation result after a dilation operation followed by 2-D mean filtering, which makes the detail of each small region of the fused image more salient. The region weight is:
where H_{l,θ}(x, y) is the high-frequency coefficient, l is the layer number, θ is the direction sub-band, and |r_i^θ| is the area of region r_i^θ.
The method comprises extracting ground-object features at the hyperspectral image-source level and fusing them into the main image, as shown in Figure 3.
First, hyperspectral remote sensing is generally understood to mean remote sensing with a spectral resolution on the order of 10⁻²λ. A hyperspectral image has many bands, a narrow spectral range per band, and a continuous spectrum; a single pixel contains dozens or even hundreds of bands, each spanning less than 10 nm. The remote sensing information is therefore analysed along the spectral dimension: the reflection spectra of different ground objects are characterised and calibrated, an information base is established, and the spectral data of a target are matched and identified against this base, so that labels are attached to image targets and ground objects are recognised.
Ground objects are distinguished from the frequency-domain perspective: the complete spectrum corresponding to each pixel of the hyperspectral image is treated as a sequential signal, and the images of the test area are classified with a spectral-similarity classification method (the FSSM method). Since hyperspectral data are discrete, the discrete Fourier transform (DFT) can be used for analysis. The DFT compresses the signal, suppresses noise and the Hughes phenomenon to a large extent, and yields the signal spectrum; it effectively extracts the frequency content of the main peaks and troughs at different wavelength positions of the spectral curves of different objects while preserving the useful information on the curves.
First, a one-dimensional discrete Fourier transform converts the spectral signal into the frequency domain to obtain its spectrum. Treating the spectral sequence of each pixel in the HSI image as a one-dimensional discrete signal f(n), the DFT is defined as:
F_phase = arctan(I(k)/R(k));
where |F(k)|, P(k), and F_phase are the amplitude spectrum, energy spectrum, and phase spectrum of the pixel's spectral sequence; R(k) and I(k) are the real and imaginary parts of F(k); k is the index of the DFT; N is the discrete sampling length; n is the discrete sample index, i.e. the hyperspectral band number; and f(n) is the reflectance value of the pixel in each band, i.e. the ground spectral reflectance.
Then the difference between the target spectrum and a reference spectrum is computed with the Laplace distance, and the spectral similarity is measured for classification, with the formula:
where F_tar(i) and F_ref(i) are the spectra of the target and reference spectral curves and N_s is the number of low-order harmonics participating in the calculation. The reference spectrum may be a laboratory spectrum, a field-measured spectrum, or a pixel spectrum extracted from an image. When a field-measured spectrum is used as the reference, the remote sensing image must be atmospherically corrected to remove the influence of the atmosphere on the spectrum.
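The spectral-similarity measurement can be sketched as below; the Euclidean form over the first N_s harmonic magnitudes is an assumption standing in for the patent's distance, whose exact formula is not reproduced in the text:

```python
import numpy as np

def spectral_distance(f_tar, f_ref, n_s=8):
    """Compare two pixel spectra in the frequency domain: DFT each
    reflectance sequence f(n), keep the first n_s low-order harmonic
    magnitudes |F(k)|, and measure their distance."""
    F_tar = np.abs(np.fft.fft(f_tar))[:n_s]
    F_ref = np.abs(np.fft.fft(f_ref))[:n_s]
    return float(np.sqrt(np.sum((F_tar - F_ref) ** 2)))
```

A pixel would be assigned the label of the reference spectrum with the smallest distance; truncating to low-order harmonics is what gives the method its noise suppression.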
Now consider the regions segmented during the first-layer fusion. Simplicity and efficiency are required, i.e. the algorithm's demands on hardware computing and storage resources should be as low as possible. Artificial targets differ in material from the water, rock, plants, and other objects of nature, and from the target's point of view the natural objects are background. It therefore suffices to weaken, by a weighting method, the natural-background part identified in the hyperspectral image within the first-layer image, so as to minimise the influence of non-target information. The image registration method is the same as in the first-layer fusion.
The invention includes a moving object detection and tracking algorithm, as shown in FIG. 4.
An object detection function:
First, the final image is filtered to remove noise. Mean filtering scans the two-dimensional image pixel by pixel with a window matrix and replaces the value at the centre of the matrix with the average of the values of the points in the window, which can be expressed as
where f(x, y) is the second-layer fused image to be processed; g(x, y) is the second-layer fused image after filtering; S is the set of neighbourhood coordinate points centred on (x, y); and M is the total number of coordinates in the set;
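The mean filter above can be sketched as follows; the 3 × 3 window and edge padding are assumed choices:

```python
import numpy as np

def mean_filter(img, k=3):
    """g(x, y) = (1/M) * sum of f over the k x k neighbourhood S,
    implemented by summing shifted views of an edge-padded image."""
    pad = k // 2
    padded = np.pad(img.astype(float), pad, mode='edge')
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)                # M = k*k points per window
```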
Then a moving-average image thresholding method turns the grey-level image into a binary image. The basic idea is to compute a moving average along the scan lines of the image, proceeding line by line in a zigzag pattern so as to reduce illumination bias. Let z_{k+1} denote the point encountered at step k + 1 of the scan order. The moving average grey level at the new point is given by:
where n denotes the number of points used in computing the moving average, with initial value m(1) = z_1/n;
The moving average is computed at every point of the image, and segmentation is then performed with:
where K is a constant in the range [0, 1] and m_xy is the moving average of the input image at (x, y);
Typically n is taken as five times the target width and K = 0.5. This threshold-selection method effectively avoids the influence of uneven illumination on the binarisation and helps extract the target image.
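The zigzag moving-average binarisation can be sketched as below; implementing the running mean with a length-n sliding window is an assumption about the recurrence, whose formula survives only as an image:

```python
import numpy as np

def moving_average_threshold(img, n=5, K=0.5):
    """Scan rows boustrophedon (zigzag), keep a running mean m over the
    last n pixels, and set a pixel to 1 where z > K * m."""
    h, w = img.shape
    out = np.zeros((h, w), np.uint8)
    window = []
    for y in range(h):
        xs = range(w) if y % 2 == 0 else range(w - 1, -1, -1)  # reverse odd rows
        for x in xs:
            window.append(float(img[y, x]))
            if len(window) > n:
                window.pop(0)                       # drop the oldest sample
            m = sum(window) / len(window)           # moving average m_xy
            out[y, x] = 1 if img[y, x] > K * m else 0
    return out
```

Because the average tracks the recent scan path, a dark pixel following a bright run is suppressed even if it would pass a global threshold.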
Considering the hardware's computing power and the allowable single-pass processing time, the image can be scaled appropriately to increase computation speed. Only oversized images are reduced, so magnification distortion is not involved; simplicity is the only concern, and the nearest-neighbour interpolation algorithm is selected in this invention.
Small regions unrelated to the target are then deleted from the binary image, i.e. regions whose area is clearly smaller than the target's, removing the interference of irrelevant information.
The image is processed with image morphological functions so that its most essential shape features remain after processing. First, two basic morphological operations are recalled. The erosion operation is defined as:
where A ⊖ S denotes the erosion of A by S. Concretely, the structuring element S is moved within the image plane of A; if S is completely contained in A when the origin of S is translated to point z, then the set of all such points z is the erosion of A by S. Erosion can ablate the boundaries of objects and break thin connections in the image target.
The dilation operation is defined as:
where A ⊕ S denotes the dilation of A by S: the structuring element S is moved within the whole image plane of A, and if the reflection of S about its own origin, translated to point z, overlaps A in at least one pixel, then the set of all such points z is the dilation of A by S. Dilation can enlarge object boundaries and connect broken gaps.
The opening operation performs erosion and then dilation on image A with structuring element S, written:
Opening can eliminate small objects, separate objects at thin connections, and smooth the boundaries of large objects without significantly changing their area.
The closing operation performs dilation and then erosion on image A with structuring element S, written:
Closing can fill small holes inside an object, connect neighbouring objects, and smooth boundaries without significantly changing the object's area and shape.
A suitable morphological algorithm can be selected according to the actual situation, finally yielding the suspected targets.
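The erosion, dilation, and opening operations above can be sketched in binary form as follows; the 3 × 3 square structuring element is an assumed default:

```python
import numpy as np

def _shifted_views(A, S):
    """All views of padded A shifted by the active cells of S."""
    pad = S.shape[0] // 2
    P = np.pad(A.astype(bool), pad, constant_values=False)
    h, w = A.shape
    return [P[dy:dy + h, dx:dx + w]
            for dy in range(S.shape[0]) for dx in range(S.shape[1]) if S[dy, dx]]

def erode(A, S=np.ones((3, 3), bool)):
    """A erode S: keep z only where S, translated to z, fits inside A."""
    return np.logical_and.reduce(_shifted_views(A, S))

def dilate(A, S=np.ones((3, 3), bool)):
    """A dilate S: mark z wherever S, translated to z, overlaps A."""
    return np.logical_or.reduce(_shifted_views(A, S))

def opening(A, S=np.ones((3, 3), bool)):
    return dilate(erode(A, S), S)   # erosion then dilation removes small objects
```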
A cropping function is then established to cut each target out of the full image, yielding the target images to be checked. After the preceding processing, the background of image I has the black value 0 and the targets to be checked the white value 1. Starting from coordinate (0, 0), the first point with pixel value 1 is found; from that point, all connected points with pixel value 1 are collected into a set T_1. Within T_1, the maximum and minimum abscissae x_1max and x_1min and the maximum and minimum ordinates y_1max and y_1min are found, and the target image to be checked is cut out as the region x_1min < x < x_1max, y_1min < y < y_1max. Proceeding in the same way, all targets to be checked are found, giving the images of all targets to be checked.
Since the artificial targets to be identified usually have symmetry, a principal component analysis method is used to find the main symmetry axis of each image to be checked and the angle θ between that axis and the x-axis. Principal component analysis finds the direction of maximum spread of N-dimensional data. The coordinates of the points in a target image to be checked are two-dimensional; they are assembled into an n × 2 matrix X, where n is the number of points in the target image. The covariance matrix C of X is computed, then the eigenvector V = (x_v, y_v) of C, and the angle θ between the main symmetry axis of the target image and the x-axis is:
The image orientation is then normalised: the image is rotated by θ and the newly produced black borders are removed, i.e. the image is cropped again.
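The principal-axis computation can be sketched as below; reducing the angle modulo 180° is an assumption, since an axis has no preferred sign:

```python
import numpy as np

def principal_axis_angle(mask):
    """Angle theta (degrees) between the target's principal symmetry axis
    and the x-axis, from the dominant eigenvector of the n x 2
    point-coordinate covariance matrix."""
    ys, xs = np.nonzero(mask)
    X = np.column_stack([xs, ys]).astype(float)     # n x 2 coordinate matrix
    C = np.cov(X, rowvar=False)                     # 2 x 2 covariance matrix
    vals, vecs = np.linalg.eigh(C)
    xv, yv = vecs[:, np.argmax(vals)]               # eigenvector V = (x_v, y_v)
    return np.degrees(np.arctan2(yv, xv)) % 180.0   # axis direction mod 180
```

The image to be checked would then be rotated by this angle before size normalisation.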
Image size normalisation then rescales the image to the template size. A template library is established, with template size specified as M × N.
Establishing an objective decision function:
The processed images to be checked are matched one by one against the images in the template library, with a similarity threshold T. When the similarity exceeds this threshold, the image is identified as a target. The matching steps are:
1) Set a similarity threshold T.
2) Compute H as follows for an image to be checked A and a template image B: judge whether the pixel value A(x, y) of the image to be checked equals the pixel value B(x, y) of the template image; if equal, H = H + 1; otherwise move on to the next point, where x ∈ (0, M) and y ∈ (0, N). This yields the final value of H.
3) If H > T, judge that a target is found; if H < T, switch to the next template and repeat step 2). If all templates are exhausted without success, judge that no target is found.
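The three matching steps above can be sketched as follows (a hedged illustration assuming binary NumPy images of identical size; the function name and sample arrays are ours):

```python
import numpy as np

def match_templates(A, templates, T):
    """Count pixel-wise equalities H between image A and each template B;
    report a match as soon as H exceeds the similarity threshold T.
    Returns the index of the matching template, or -1 if none matches."""
    for idx, B in enumerate(templates):
        H = int(np.sum(A == B))   # pixels where A(x, y) == B(x, y)
        if H > T:
            return idx            # target found
    return -1                     # all templates exhausted, no target

A  = np.array([[1, 0], [1, 1]])
B1 = np.array([[0, 1], [0, 0]])   # 0 matching pixels
B2 = np.array([[1, 0], [1, 0]])   # 3 matching pixels
hit = match_templates(A, [B1, B2], T=2)
```

In practice T would be chosen relative to the template size M × N, e.g. as a fraction of the total pixel count.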
Target perception:
The target position, i.e. its coordinates in the image, is obtained by applying the target detection function.
(1) Determine the longitude and latitude of the target position from the position of the payload.
(2) Determine the attitude-adjustment angle from the position coordinates of the target in the image, keeping the target in the central area of the image.
These two steps together achieve target perception.
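The attitude-adjustment step can be sketched with a simple pinhole-camera model (a hedged illustration: the patent does not specify the camera model, and the focal length in pixels used below is an assumed parameter):

```python
import math

def attitude_adjust(cx_target, cy_target, width, height, focal_len_px):
    """Pan/tilt angles (radians) needed to bring the detected target to
    the image centre, using a pinhole model. `focal_len_px` (focal
    length expressed in pixels) is an assumed camera parameter."""
    dx = cx_target - width / 2.0    # horizontal offset from image centre
    dy = cy_target - height / 2.0   # vertical offset from image centre
    pan  = math.atan2(dx, focal_len_px)
    tilt = math.atan2(dy, focal_len_px)
    return pan, tilt

# Target already at the centre of a 1280x720 image: no adjustment needed.
pan, tilt = attitude_adjust(640, 360, 1280, 720, focal_len_px=1000.0)
```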
The above description is only a preferred embodiment of the present invention and is not intended to limit it; those skilled in the art may make various modifications and changes. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within its protection scope.
Claims (4)
1. A hierarchical fusion and extraction method for multi-source detection of moving targets is characterized by comprising the following steps: the method comprises the following steps:
step 1: reading an image input by a multi-source image sensor;
step 2: carrying out image registration on the visible light image and the infrared light image, and fusing the registered images to obtain a first-layer fused image;
and step 3: carrying out image registration on the first layer of fused image and the hyperspectral image, and weakening the registered image pixels according to the ground feature classification area to obtain a second layer of fused image;
and 4, step 4: carrying out target detection on the second layer fused image to obtain position information of a target in the image, sensing the target to obtain the longitude and latitude of the target in a real environment, adjusting the attitude of the aircraft to track the target, and realizing continuous detection and sensing of the target;
the image registration method in step 2 and step 3 specifically comprises the following steps:
step 2.1: extracting an edge contour of the image to obtain an edge contour image of the original image;
extracting the contour of the image by using a phase consistency algorithm, wherein a phase consistency function is as follows:
wherein A_n is the amplitude at scale n; phi_n(x) is the phase value of the n-th Fourier component at x; and the remaining term represents the weighted mean of the local phase angles of the Fourier components at the point x where PC(x) attains its maximum;
step 2.2: establishing a characteristic corner point with scale, position and direction information in the edge contour image, wherein the specific method comprises the following steps:
step 2.2.1: constructing a nonlinear scale space, so that the characteristic angular points have scale information;
Gaussian filtering is applied to the edge contour image to obtain the image gray-level histogram and the contrast factor; a set of evolution times is then converted, and an additive operator splitting (AOS) algorithm is used to obtain all information layers of the nonlinear filtered image:
wherein A_l is the conduction matrix of the image I in dimension l; t_i is the evolution (computation) time, and only one set of computation times is used at a time to construct the nonlinear scale space; E is the identity matrix;
step 2.2.2: detecting characteristic angular points to obtain characteristic angular point position information;
moving a local window point by point in an edge contour image of a nonlinear scale space, and calculating pixel values in the window to judge whether the window is an angular point;
step 2.2.3: calculating direction information of characteristic angular points
The coordinates of a feature corner p(i) in the image are (x(i), y(i)). Two points p(i−k) and p(i+k), each at distance k from p(i) along the contour, are selected in its neighborhood; T is the tangent at the point p(i). The principal direction of the feature corner p(i) is the angle theta_feature between the tangent T and the positive x-axis, computed as:
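The tangent direction can be approximated by the chord between the two neighborhood points, which yields the same angle in the limit (a minimal sketch; the contour and indices below are illustrative):

```python
import math

def corner_direction(contour, i, k):
    """Principal direction of contour point p(i): the angle between the
    tangent at p(i) (approximated by the chord p(i-k) -> p(i+k)) and
    the positive x-axis, in radians."""
    x0, y0 = contour[i - k]
    x1, y1 = contour[i + k]
    return math.atan2(y1 - y0, x1 - x0)

# Contour points along a 45-degree line: the tangent direction is pi/4.
contour = [(t, t) for t in range(7)]
theta = corner_direction(contour, i=3, k=2)
```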
step 2.3: establishing a shape description matrix;
Let the feature point set be P = {p_1, p_2, ..., p_n}, p_i ∈ R². In an r × r neighborhood centered on a feature point p(i), a polar coordinate system with p(i) as origin is established; 360° is divided equally into 12 sectors, and five concentric circles of successively increasing radius are drawn, yielding 60 small regions. The number of feature points in each cell is counted, and the shape histogram h_i of each feature point p_i is computed as its shape context descriptor. The shape histogram h_i of each feature point is calculated as:
h_i(α) = #{q ≠ p_i : (q − p_i) ∈ bin(α)}
wherein # denotes counting the feature points in region α, and α = 1, 2, ..., 60;
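A sketch of the 60-bin (12 sectors × 5 rings) shape context histogram (assuming NumPy; the ring radii and sample points are illustrative, and the exact radius progression used by the patent is not specified here):

```python
import numpy as np

def shape_context(points, center, radii):
    """60-bin shape context descriptor: 12 angular sectors x 5 radial
    rings around `center`; counts the other feature points per cell."""
    hist = np.zeros((5, 12), dtype=int)
    cx, cy = center
    for (x, y) in points:
        if (x, y) == (cx, cy):
            continue                          # skip the centre point itself
        dx, dy = x - cx, y - cy
        r = np.hypot(dx, dy)
        ring = np.searchsorted(radii, r)      # which concentric circle
        if ring >= 5:
            continue                          # outside the largest circle
        sector = int(((np.arctan2(dy, dx) % (2 * np.pi)) / (2 * np.pi)) * 12)
        hist[ring, sector] += 1
    return hist.ravel()                       # 60-element descriptor h_i

pts = [(0, 0), (1, 0), (0, 3), (-6, 0)]
h = shape_context(pts, center=(0, 0), radii=[2, 4, 6, 8, 10])
```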
step 2.4: matching the characteristic angular points of the two images to complete image registration;
searching the feature points of the nearest neighbor and the next nearest neighbor by using the Euclidean distance, wherein the Euclidean distance is as follows:
wherein a_i is the i-th element of the shape context descriptor R(a_0, a_1, ..., a_59) of an arbitrary feature point in the reference image, and b_i is the i-th element of the shape context descriptor I(b_0, b_1, ..., b_59) of an arbitrary feature point in the image to be registered;
If p is any feature point in one image, let i and j be its nearest-neighbor and next-nearest-neighbor candidate feature points in the image to be registered, with Euclidean distances D_ip and D_jp to p respectively. The ratio D_ip / D_jp is computed as a threshold; when this ratio is below a certain value, p and i are considered a correctly matched pair of feature points, otherwise the match fails.
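The nearest/next-nearest ratio test above can be sketched as follows (a hedged illustration assuming NumPy descriptors; the 0.8 threshold below is a conventional choice, not one stated by the patent):

```python
import numpy as np

def ratio_match(desc_p, candidates, ratio_thresh=0.8):
    """Nearest/next-nearest Euclidean-distance ratio test: accept the
    nearest candidate i only if D_ip / D_jp is below the threshold."""
    d = np.linalg.norm(candidates - desc_p, axis=1)
    order = np.argsort(d)
    i, j = order[0], order[1]        # nearest and next-nearest neighbors
    if d[i] / d[j] < ratio_thresh:
        return int(i)                # confidently matched feature point
    return None                      # ambiguous match, reject

p = np.array([0.0, 0.0])
cands = np.array([[0.1, 0.0],       # distance 0.1 (nearest)
                  [1.0, 0.0],       # distance 1.0 (next nearest)
                  [2.0, 0.0]])
m = ratio_match(p, cands)
```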
2. The hierarchical fusion and extraction method for multi-source detection of the moving object according to claim 1, characterized in that: the method for fusing the registered visible light image and infrared light image in the step 2 specifically comprises the following steps:
step 3.1: performing region segmentation on the registered infrared image, separating the suspected region and the background region of the infrared image; the suspected region is the high-brightness region where the infrared radiation image is bright;
step 3.2: respectively carrying out dual-tree complex wavelet transformation on the infrared image and the visible light image after registration to obtain low-frequency information and high-frequency information of the image, wherein the basic information of the image corresponds to the low-frequency information of a wavelet transformation result, and the detail information of the image corresponds to the high-frequency information of the wavelet transformation result;
step 3.3: fusing the image segmentation result and the wavelet transformation result to respectively obtain a low-frequency fused image and a high-frequency fused image;
step 3.4: and performing dual-tree complex wavelet inverse transformation on the low-frequency fusion image and the high-frequency fusion image to obtain a first fusion image.
3. The hierarchical fusion and extraction method for multi-source detection of the moving object according to claim 1 or 2, characterized in that: the method for detecting the target of the second-layer fusion image in the step 4 to obtain the position information of the target in the image specifically comprises the following steps:
step 4.1: filtering the second layer fused image;
establishing a window matrix that scans the two-dimensional image pixel by pixel, the value at the center of the matrix being replaced by the mean of all point values within the window, expressed as g(x, y) = (1/M) Σ_{(s,t)∈S} f(s, t):
wherein f(x, y) is the second-layer fused image to be processed; g(x, y) is the second-layer fused image after filtering; S is the set of neighborhood coordinate points centered on the point (x, y), and M is the total number of coordinates in the set;
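The windowed mean filter can be sketched directly from that expression (a minimal NumPy illustration with edge-replicated borders, a boundary choice the patent does not specify):

```python
import numpy as np

def mean_filter(f, size=3):
    """Replace each pixel by the mean of the size x size window centred
    on it, i.e. g(x, y) = (1/M) * sum of f over the neighborhood S."""
    pad = size // 2
    fp = np.pad(f.astype(float), pad, mode='edge')  # replicate edges
    g = np.zeros_like(f, dtype=float)
    for x in range(f.shape[0]):
        for y in range(f.shape[1]):
            g[x, y] = fp[x:x + size, y:y + size].mean()
    return g

f = np.full((5, 5), 10.0)
f[2, 2] = 100.0                  # a single noisy pixel
g = mean_filter(f)               # noise spread and attenuated by the mean
```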
step 4.2: processing the second-layer fusion image after filtering by using a moving average image threshold method to obtain a binary image;
z_{k+1} denotes the point encountered at step k+1 in the scan order; the moving average gray level at the new point is:
wherein n_gray denotes the number of points used in computing the moving average, with initial value m(1) = z_1 / n_gray;
The moving average is calculated at every point in the image, so segmentation is performed by marking a pixel as foreground when its gray level exceeds K times the local moving average, using the following equation:
wherein K is a constant in the range [0, 1] and m_xy is the moving average of the input image at (x, y);
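A sketch of that moving-average thresholding over a row-order scan (assuming NumPy; the window length n and constant K below are illustrative values):

```python
import numpy as np

def moving_average_threshold(img, n=16, K=0.5):
    """Scan the image in row order, maintain an n-point moving average
    of the gray levels, and mark a pixel as foreground (1) when its
    value exceeds K times the local moving average m_xy."""
    flat = img.ravel().astype(float)
    out = np.zeros_like(flat)
    window = []
    for k, z in enumerate(flat):
        window.append(z)
        if len(window) > n:
            window.pop(0)                     # keep only the last n points
        m = sum(window) / len(window)         # moving average m(k+1)
        out[k] = 1.0 if z > K * m else 0.0
    return out.reshape(img.shape)

img = np.array([[10, 10, 10, 10],
                [10, 200, 10, 10],
                [10, 10, 200, 10],
                [10, 10, 10, 10]])
b = moving_average_threshold(img, n=4, K=1.5)  # only the bright spots survive
```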
step 4.3: deleting images with the area smaller than that of the target from the binary image, and removing interference of irrelevant information;
step 4.4: processing the binary image without the interference of the irrelevant information by using image morphology;
step 4.5: establishing a cutting function, and cutting the target from the full image after the image morphology processing to obtain a target image to be checked;
In the image I after the image morphological processing, the background is black (value 0) and the targets to be checked are white (value 1). Starting from image coordinate (0, 0), the first point with pixel value 1 is found; starting from that point, all points with pixel value 1 connected to it are found and collected into a set T_1. Within T_1, the maximum x_1max and minimum x_1min of the point abscissas, and the maximum y_1max and minimum y_1min of the point ordinates, are found, and the corresponding target image to be checked is cut out as the rectangle they bound. Proceeding in the same way, all targets to be checked are found and all target images to be checked are obtained.
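The cutting function described above can be sketched as a flood fill over the binary image (a hedged illustration using NumPy and a breadth-first search with 8-connectivity; the function name and test image are ours):

```python
import numpy as np
from collections import deque

def cut_targets(binary):
    """Scan from (0, 0), flood-fill each white (value 1) region into a
    set T, and cut out its bounding box [x_min:x_max, y_min:y_max]."""
    seen = np.zeros_like(binary, dtype=bool)
    h, w = binary.shape
    crops = []
    for sx in range(h):
        for sy in range(w):
            if binary[sx, sy] == 1 and not seen[sx, sy]:
                q, T = deque([(sx, sy)]), []
                seen[sx, sy] = True
                while q:                          # flood fill one region
                    x, y = q.popleft()
                    T.append((x, y))
                    for dx in (-1, 0, 1):
                        for dy in (-1, 0, 1):
                            nx, ny = x + dx, y + dy
                            if (0 <= nx < h and 0 <= ny < w
                                    and binary[nx, ny] == 1
                                    and not seen[nx, ny]):
                                seen[nx, ny] = True
                                q.append((nx, ny))
                xs = [p[0] for p in T]
                ys = [p[1] for p in T]
                crops.append(binary[min(xs):max(xs) + 1,
                                    min(ys):max(ys) + 1])
    return crops

img = np.zeros((8, 8), dtype=int)
img[1:3, 1:4] = 1        # first target, 2 x 3 pixels
img[5:7, 5:7] = 1        # second target, 2 x 2 pixels
targets = cut_targets(img)
```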
Step 4.6: finding the principal symmetry axis of the target image to be checked using a principal component analysis method, and obtaining the angle theta_test between the principal symmetry axis of the target image to be checked and the x-axis;
The coordinates of each point in the target image information to be checked are two-dimensional; the points are combined into an n_test-row, 2-column matrix X_test, where n_test is the number of points in the target image information to be checked. The covariance matrix C_test of X_test is calculated, and then its eigenvector V_test = (x_v, y_v); the angle theta_test between the principal symmetry axis of the target image to be checked and the x-axis is then:
step 4.7: performing image orientation normalization: rotating the target image to be checked by the angle theta_test and removing the newly generated black borders;
step 4.8: carrying out image size normalization processing, and changing the size of the target image to be checked after the direction normalization processing into the size of a template;
step 4.9: and matching the target image to be checked after the direction normalization and the size normalization with the images in the template library one by one, setting a similarity threshold T, and identifying the image as a target when the similarity degree exceeds the threshold.
4. The hierarchical fusion and extraction method for multi-source detection of the moving object according to claim 2, characterized in that: the method for obtaining the low-frequency fusion image in the step 3.3 comprises the following steps: according to the position information of the infrared image divided into a suspected area and a background area, dividing the visible light image according to the same position information; for the suspected area of the low-frequency part of the infrared image and the visible light image, the following rule is adopted:
wherein the three coefficients are, respectively, the fused low-frequency coefficient of the l-th layer, the infrared-image low-frequency coefficient of the l-th layer, and the visible-light-image low-frequency coefficient of the l-th layer;
For the background region of the low-frequency parts of the infrared image and the visible light image, a regional variance method is adopted: the larger the regional variance, the larger the gray-value change of the pixels in the region, the higher the contrast of the region, and the more information the region carries. Pixel points with a large regional variance are therefore given a larger weight in the image fusion, according to the rule:
wherein ω_ir is the weight of the infrared image and ω_vis is the weight of the visible light image; the infrared image weight ω_ir and the visible light image weight ω_vis are calculated as:
ω_ir = 1 − ω_vis;
wherein σ_vis and σ_ir are the regional variances of the visible light image and the infrared image respectively, and r is the correlation coefficient of the region; the regional variance σ_vis of the visible light image and the regional variance σ_ir of the infrared image are calculated as:
the calculation method of the correlation coefficient r comprises the following steps:
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910602605.0A CN110472658B (en) | 2019-07-05 | 2019-07-05 | Hierarchical fusion and extraction method for multi-source detection of moving target |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110472658A CN110472658A (en) | 2019-11-19 |
CN110472658B true CN110472658B (en) | 2023-02-14 |
Family
ID=68506839
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910602605.0A Active CN110472658B (en) | 2019-07-05 | 2019-07-05 | Hierarchical fusion and extraction method for multi-source detection of moving target |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110472658B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1932882A (en) * | 2006-10-19 | 2007-03-21 | 上海交通大学 | Infared and visible light sequential image feature level fusing method based on target detection |
CN101546428A (en) * | 2009-05-07 | 2009-09-30 | 西北工业大学 | Image fusion of sequence infrared and visible light based on region segmentation |
CN105321172A (en) * | 2015-08-31 | 2016-02-10 | 哈尔滨工业大学 | SAR, infrared and visible light image fusion method |
CN108198157A (en) * | 2017-12-22 | 2018-06-22 | 湖南源信光电科技股份有限公司 | Heterologous image interfusion method based on well-marked target extracted region and NSST |
CN109558848A (en) * | 2018-11-30 | 2019-04-02 | 湖南华诺星空电子技术有限公司 | A kind of unmanned plane life detection method based on Multi-source Information Fusion |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8462323B2 (en) * | 2007-03-27 | 2013-06-11 | Metrolaser, Inc. | Integrated multi-sensor surveilance and tracking system |
CN106485740B (en) * | 2016-10-12 | 2019-06-25 | 武汉大学 | A kind of multidate SAR image registration method of combination stable point and characteristic point |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||