CN111489330A - Weak and small target detection method based on multi-source information fusion - Google Patents

Weak and small target detection method based on multi-source information fusion

Info

Publication number
CN111489330A
Authority
CN
China
Prior art keywords
image
target
saliency
pixel
channel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010215165.6A
Other languages
Chinese (zh)
Other versions
CN111489330B (en)
Inventor
韩振军
韩许盟
余学辉
宫宇琦
蒋楠
彭潇珂
王岿然
焦建彬
叶齐祥
万方
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Chinese Academy of Sciences
Original Assignee
University of Chinese Academy of Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Chinese Academy of Sciences filed Critical University of Chinese Academy of Sciences
Priority to CN202010215165.6A priority Critical patent/CN111489330B/en
Publication of CN111489330A publication Critical patent/CN111489330A/en
Application granted granted Critical
Publication of CN111489330B publication Critical patent/CN111489330B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/136Segmentation; Edge detection involving thresholding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10032Satellite or aerial image; Remote sensing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging

Abstract

The invention provides a method for detecting weak and small targets, which is realized by the following steps: preprocessing an image used for target detection and enhancing its contrast; performing image saliency analysis on the image to obtain a saliency map that highlights the area where the foreground target is located; and, based on the saliency map of the image, segmenting the salient targets in the image and locating the target positions. When target detection simultaneously uses a plurality of temporally and spatially aligned images collected through multi-source channels, the detection method further comprises a step of fusing the images processed by the above steps. The method is of great significance for monitoring and analyzing remote targets, and effectively extracts the salient targets in an image by means of saliency analysis, image segmentation, image fusion and the like.

Description

Weak and small target detection method based on multi-source information fusion
Technical Field
The invention belongs to the field of computer vision and image processing, relates to the detection of weak and small targets, and particularly relates to a weak and small target detection method based on multi-source information fusion.
Background
In the field of target detection, the detection of weak and small targets has always been a challenging task, yet it is of great significance for remote target monitoring and analysis. Because the distance between the sensor and the target is long, the target occupies only a small scale in the image, features such as shape and contour are not obvious, the information content of the target is weak, and the image is affected by various kinds of noise, so the target is easily submerged in noise; all of this makes fast and accurate detection very difficult.
Weak and small target detection complements existing generic target detection in terms of scale; solving it makes target detection truly general across all scales. In existing generic target detection, the relative and absolute scales of the targets are often large, whereas in practical applications the relative and absolute scales of many scene targets are far smaller, so small-scale target detection is closer to these real tasks, and existing generic detection methods cannot cover such problems well.
With the maturation and commercialization of unmanned aerial vehicle (UAV) technology, UAV-related techniques have received more and more attention, and automatic monitoring based on aerial target detection is one of them. Aerial photography is generally performed with very high-resolution cameras to obtain high-quality images, but when shooting from high altitude the resolution of many targets (such as pedestrians and vehicles) is still low, yielding targets with weak information and small relative and absolute scales; aerial target detection is therefore naturally a form of weak and small target detection. At present, UAV aerial photography has many application scenarios: photographers and amateurs use UAVs to record daily activities or carry out creative projects; solar power plants use them to detect defective solar panels; they are used for early detection of plant disease; and they can even be used for shark detection in the field of public safety, and so on. In addition, in many national defense and military tasks, such as precision defense and precision strike, or border security monitoring, the detection of weak and small targets accompanies the task like a shadow. The targets in these application scenarios likewise have the characteristics of being weak, small, complex and diverse.
At present, network frameworks based on anchor boxes (anchors) are the detectors with the best experimental performance, but their anchor settings are unfriendly to objects with small absolute scale, mainly in terms of the scale and stride of the anchors. For example, the minimum anchor scale in Faster R-CNN is 32×32, so for a small object with an absolute scale of about 15×15 the best attainable Intersection over Union (IoU) is only about 0.22. Because Faster R-CNN selects the anchors used to predict an object according to a maximum-IoU matching principle, small objects obtain a large number of low-quality matches. Moreover, for small objects the IoU drops sharply with even a slight displacement between anchor and object, and the anchor stride is large relative to the object size, so small objects are easily missed during matching. These settings are therefore unfavorable for the detection and regression of weak and small targets.
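For illustration only (this sketch is not part of the original patent text), the IoU figure quoted above can be checked numerically: a 15×15 object fully contained in a 32×32 anchor yields 225/1024 ≈ 0.22.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

anchor = (0, 0, 32, 32)        # smallest Faster R-CNN anchor, 32 x 32
small_obj = (8, 8, 23, 23)     # a 15 x 15 object centred inside the anchor
print(iou(anchor, small_obj))  # 225 / 1024 ~ 0.22, the best case for this object size
```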
Therefore, aiming at the problems in the prior art, the invention provides a method for detecting weak and small targets, in particular a method for detecting weak and small targets based on multi-source information fusion.
Disclosure of Invention
In order to overcome the above problems, the present inventors have conducted intensive studies and proposed a simple and effective method for detecting a small and weak target, which mainly uses an image processing method to autonomously analyze and process a scene image and extract a target region that may exist in a picture for a detection task without specific target limitation. The detection method comprises the following specific implementation processes: firstly, preprocessing an original picture by using an image equalization method to achieve a better image effect, so that subsequent operation processing is facilitated; secondly, extracting a salient region in the scene image by using a visual saliency analysis algorithm based on data driving; thirdly, based on the saliency map of the image, using an adaptive threshold and a GrabCut segmentation algorithm to realize fine segmentation processing on the saliency target in the image and locate the target in the image. When the target detection adopts a plurality of time and space aligned images collected by a multi-source channel, the method further comprises a step of fusing the images processed by the steps, thereby completing the invention.
The invention aims to provide the following technical scheme:
the invention aims to provide a weak and small target detection method, which comprises the following steps:
step 1), preprocessing an image for target detection, and enhancing the contrast of the image;
step 2), carrying out image saliency analysis on the image to obtain a saliency map highlighting the area where the foreground target is located;
step 3), based on the saliency map of the image, carrying out segmentation processing on the salient targets in the image, and positioning to obtain the target positions in the image.
When the target detection simultaneously adopts a plurality of time and space aligned images collected by a multi-source channel, the detection method also comprises the step of fusing the images processed in steps 1) to 3).
According to the weak and small target detection method based on multi-source information fusion, the method has the following beneficial effects:
(1) the method for detecting the weak and small targets does not need a training process of a model, and can detect the targets to be detected in real time.
(2) According to the method for detecting the weak and small targets, a bottom-up, data-driven visual attention mechanism is adopted for image saliency analysis; it is independent of specific tasks, which helps to provide a comprehensive set of suspicious targets and supports human judgment of specified targets;
(3) According to the method for detecting the weak and small targets, adaptive threshold segmentation alone enables fast segmentation and positioning of targets, while adaptive threshold segmentation combined with GrabCut segmentation enables fine segmentation and positioning, so the weak and small targets can be segmented and positioned in different ways as required.
(4) The method for detecting the weak and small targets can fuse the multi-source information, comprehensively utilize the multi-source information and comprehensively and accurately position the targets.
Drawings
FIG. 1 shows a visible light image for object detection in an embodiment;
FIG. 2 shows a graph of the results of an HSV-based color image model equalization of FIG. 1;
FIG. 3 shows a graph of the results of a saliency analysis of a visible light image;
FIG. 4 shows a graph of the results of a saliency analysis of a visible light image temporally and spatially aligned with an infrared image;
FIG. 5 is a graph showing the result of adaptive threshold segmentation of a saliency map of a visible light image using the Otsu method and superposition of the segmentation result with an original image;
FIG. 6 is a result diagram of adaptive threshold segmentation of a saliency map of an infrared image by an Otsu method and superposition of the segmentation result with an original image;
FIG. 7 shows a graph of the results of the GrabCut method segmentation of FIG. 5;
FIG. 8 shows a graph of the results of the GrabCut method segmentation of FIG. 6;
FIG. 9 is a diagram showing the result of fusing the segmentation results of the visible light image and the infrared image using Bayesian decision;
FIG. 10a shows a picture to be detected, and FIG. 10b shows the detection result of FIG. 10 a;
FIG. 11a shows a picture to be detected, and FIG. 11b shows the detection result of FIG. 11 a;
FIG. 12a shows a picture to be detected, and FIG. 12b shows the detection result of FIG. 12 a;
FIG. 13a shows a picture to be detected, and FIG. 13b shows the detection result of FIG. 13 a;
FIG. 14a shows a picture to be detected, and FIG. 14b shows the detection result of FIG. 14 a;
FIG. 15a shows a picture to be detected, and FIG. 15b shows the detection result of FIG. 15 a;
FIG. 16a shows a picture to be detected, and FIG. 16b shows the detection result of FIG. 16 a;
FIG. 17a shows a picture to be detected, and FIG. 17b shows the detection result for FIG. 17 a;
fig. 18a shows a picture to be detected, and fig. 18b shows the detection result for fig. 18 a.
Detailed Description
The invention is explained in further detail below with reference to the drawings. The features and advantages of the present invention will become more apparent from the description.
The invention aims to provide a weak and small target detection method, which is not particularly limited to weak and small targets in an image and comprises the following steps:
step 1), preprocessing an image for target detection, and enhancing the contrast of the image;
step 2), carrying out image saliency analysis on the image to obtain a saliency map highlighting the area where the foreground target is located;
and 3) carrying out segmentation processing on the salient target in the image based on the saliency map of the image, and positioning to obtain the target position in the image.
In the present invention, a weak and small target refers to a target of interest observed from a long distance, such as a human body, a vehicle, and the like.
In the step 1) of the invention, the image for target detection is preprocessed, and the contrast of the image is enhanced.
The original image may be affected by complex and variable environmental factors, so that the information and features required by the image are not significant enough, and the operation directly on the original image often cannot achieve a satisfactory effect, so that the original image needs to be preprocessed first to enhance the contrast of the image. The image enhancement can highlight the interesting features in the image and inhibit the uninteresting features, thereby improving the image effect and enriching the image information. In this step, image enhancement will be achieved using a histogram equalization method.
The traditional histogram equalization method is directed at gray level images, and carries out nonlinear transformation on the gray level of an original image, so that the gray level histogram of the original image is uniformly expanded to the whole gray level range from an original area, thereby expanding the gray level difference between a foreground object and a background area and enhancing the contrast of the image. Most of the scene images shot today are colored. It is therefore necessary to extend the application of image equalization methods to color images.
In the present invention, the histogram equalization method includes a global histogram equalization method and a local histogram equalization method, and preferably, the local histogram equalization method.
The present inventors have found that when a global histogram equalization method is used, the details of small regions in an image are often ignored in the global calculation, so the desired effect is not achieved. A way to solve this problem is to use a local histogram equalization method: the whole image is divided into an m × n grid of sub-windows (usually 8 × 8), and each sub-window is then processed separately.
In the invention, the histogram equalization method is expanded and applied to the color image and comprises an equalization method based on an RGB color image model and an equalization method based on an HSV color image model:
An equalization method based on an RGB color image model comprises the following steps: separating the three channels of the image using the RGB color image model to obtain three single-channel images, independently performing histogram equalization on each single-channel image, and finally merging the processed single-channel images back into a color image. Since R, G and B are processed separately, the processed image may exhibit color distortion, but the contrast of the image is still enhanced after the equalization.
An equalization method based on an HSV color image model comprises the following steps: using the HSV color image model, the histogram equalization process is performed on the V (brightness) channel thereof alone while keeping the H (hue) and S (saturation) channels unchanged. The processing mode does not affect the hue and the saturation of the image, so that the defect of color distortion does not occur.
In the invention, when the image for target detection is an achromatic image (or called a gray image), the histogram equalization method is directly used to realize image contrast enhancement; when the image for target detection is a color image, an equalization method based on an RGB color image model or an equalization method based on an HSV color image model is adopted for image contrast enhancement, and the equalization method based on the HSV color image model is preferably adopted.
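For illustration, a minimal sketch of this preprocessing step is given below, assuming OpenCV; the function name, the CLAHE clip limit, and the default 8 × 8 tile grid are illustrative choices rather than requirements of the patent.

```python
import cv2

def equalize_hsv_v(image_bgr, local=True, grid=(8, 8)):
    """Equalize only the V channel of the HSV representation, leaving H and S untouched."""
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    h, s, v = cv2.split(hsv)
    if local:
        # local variant: contrast-limited equalization over an m x n grid of sub-windows
        clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=grid)
        v = clahe.apply(v)
    else:
        v = cv2.equalizeHist(v)          # global variant
    return cv2.cvtColor(cv2.merge((h, s, v)), cv2.COLOR_HSV2BGR)

# enhanced = equalize_hsv_v(cv2.imread("scene.jpg"))
```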
In step 2), image saliency analysis is performed on the image (in particular, the contrast-enhanced image) to obtain a saliency map highlighting the area where the foreground object is located.
In this step, the inventors determined that salient regions in the scene image were extracted using a data-driven visual attention mechanism analysis algorithm. The visual attention mechanism has two processing modes, namely a bottom-up type and a top-down type. The bottom-up visual attention mechanism is unconscious, data-driven based, and independent of the specific task. Therefore, the attention model has no specific task limitation, and is suitable for a detection method without specific limitation on weak and small targets in an image; the method is free of the constraint of prior knowledge, free of manual control and high in calculation speed. The strategy utilizes the bottom layer information of the image such as brightness, color, texture and the like to calculate the difference between pixel points, thereby judging the saliency area.
Under the guidance of the above idea, the extraction of salient regions in an image is realized using the Image Signature (IS) method. The image signature is a simple image descriptor that spatially approximates the foreground information of an image and is therefore useful for detecting its salient regions.
Consider a scene image x with the following structure:
x = f + b, x, f, b ∈ R^N    formula (2-1)
where f denotes the foreground signal, which is assumed to be sparse in the standard spatial basis; b denotes the background, which is assumed to be sparse in the basis of the discrete cosine transform; and R^N is the value space of the signals. In other words, both f and DCT(b) have only a few non-zero components. In general, given only x and these sparsity assumptions, it is very difficult to separate f and b. For the salient-region extraction problem, however, we only care about the foreground signal f (the set of non-zero pixels in f). We can approximately isolate f by taking the sign of the mixed signal x in the transform domain and then inverse-transforming it back into the spatial domain, i.e. computing the reconstructed image:
x̄ = IDCT(sign(DCT(x)))
where DCT(·) and IDCT(·) are the discrete cosine transform and the inverse discrete cosine transform, respectively. Formally, the Image Signature is defined as follows:
ImageSignature(x) = sign(DCT(x))    formula (2-2)
Further, the reconstructed image x̄ is smoothed to obtain the final saliency map s:
s = g * (x̄ ∘ x̄)    formula (2-3)
where g is a Gaussian kernel, * is the convolution operator, and ∘ is the Hadamard (element-wise) product operator. Simple Gaussian smoothing is necessary because some salient objects in the reconstructed image are point-like, whereas in practice salient objects are not only sparse in space but also located on a continuous area. Gaussian smoothing blurs the saliency map, which helps to obtain a continuous salient target within a certain area.
Specifically, the image saliency analysis comprises the following sub-steps:
step 2.1), based on the RGB color image model, each single-channel image I_i of the input image I is normalized such that:
0 ≤ I_i(x, y) ≤ 1, i = 1, 2, …, N    formula (2-4)
where N is the number of image channels;
step 2.2), the width of the single-channel image is scaled to a set size (such as 512 pixels), and meanwhile, the image height is scaled in an equal proportion for subsequent processing;
step 2.3), after the above preprocessing, the following operation is performed on each single-channel image:
S_i = IDCT(sign(DCT(I_i))), i = 1, 2, …, N    formula (2-5)
where N is the number of image channels, and DCT(·) and IDCT(·) are the discrete cosine transform and inverse discrete cosine transform, respectively. The sign function sign(x) is defined as follows:
sign(x) = 1 if x > 0; 0 if x = 0; −1 if x < 0    formula (2-6)
After this operation, N single-channel saliency maps S_i are obtained;
step 2.4), the single-channel saliency maps are averaged and combined into a grayscale saliency map, namely:
S = (1/N) Σ_{i=1}^{N} S_i, S ∈ R^{h_s × w_s}    formula (2-7)
where h_s and w_s are the height and width of the grayscale saliency map, respectively.
By performing the above steps 2.1) to 2.4), the foreground object in the image can be obtained. Since salient objects are not only spatially sparse, but also localized on a continuous area, a simple gaussian smoothing is necessary for this purpose.
Thus, the image saliency analysis also includes step 2.5), blurring the grayscale saliency map using a Gaussian kernel whose width and height are both:
ksize = int(4 × w_s × η)    formula (2-8)
where η is a blur parameter, and the standard deviation of the Gaussian function along the X and Y directions is σ = w_s × η.
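For illustration, a minimal sketch of steps 2.1) to 2.5) is given below, assuming OpenCV and SciPy; the per-channel squaring follows the Hadamard-product form of formula (2-3), and the default width (512) and blur parameter η are illustrative values, not prescribed by the patent.

```python
import cv2
import numpy as np
from scipy.fft import dct, idct

def dct2(a):
    """2-D type-II DCT applied along both axes."""
    return dct(dct(a, axis=0, norm='ortho'), axis=1, norm='ortho')

def idct2(a):
    """2-D inverse DCT applied along both axes."""
    return idct(idct(a, axis=0, norm='ortho'), axis=1, norm='ortho')

def image_signature_saliency(img_bgr, width=512, eta=0.05):
    """Grayscale saliency map of a color image via the Image Signature method."""
    h, w = img_bgr.shape[:2]
    img = cv2.resize(img_bgr, (width, int(h * width / w)))        # step 2.2: fixed width, proportional height
    img = img.astype(np.float64) / 255.0                          # step 2.1: normalize each channel to [0, 1]
    sal = np.zeros(img.shape[:2], dtype=np.float64)
    for i in range(img.shape[2]):                                 # step 2.3: S_i = IDCT(sign(DCT(I_i)))
        recon = idct2(np.sign(dct2(img[:, :, i])))
        sal += recon * recon                                      # squared reconstruction (Hadamard product, formula (2-3))
    sal /= img.shape[2]                                           # step 2.4: average over the channels
    w_s = sal.shape[1]
    ksize = int(4 * w_s * eta) | 1                                # step 2.5: kernel size from formula (2-8), forced odd
    sal = cv2.GaussianBlur(sal, (ksize, ksize), sigmaX=w_s * eta) # Gaussian blur with sigma = w_s * eta
    return cv2.normalize(sal, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
```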
In step 3), based on the saliency map of the image, the saliency target in the image is segmented, and the target position in the image is obtained by positioning. Namely, the grayscale saliency map obtained in step 2) is used to complete image segmentation, so as to realize extraction of a salient object in an image.
In the invention, the fine segmentation of the salient objects in the image is implemented using adaptive threshold segmentation and GrabCut segmentation. Adaptive threshold segmentation of the saliency map is preferably achieved using the Otsu method (also known as the maximum between-class variance method).
The Otsu method uses the between-class variance to measure the difference between the two classes of pixels produced by a threshold; the segmentation is optimal when the between-class variance is largest. The Otsu method is computed from the gray histogram of the input image and can automatically obtain the optimal segmentation threshold. Its specific steps are as follows:
The gray histogram of the image is computed (only the most common 8-bit images, i.e. 256-level grayscale maps, are considered) and normalized. Let the segmentation threshold be j; it divides the image pixels into two classes: pixels whose gray value lies in the interval [0, j] are marked as class C_0, representing the background region, and pixels whose gray value lies in the interval (j, 255] are marked as class C_1, representing the foreground region. The proportion of C_0 pixels is denoted ω_0 and their average gray value μ_0; similarly, the proportion of C_1 pixels is denoted ω_1 and their average gray value μ_1. The between-class variance of classes C_0 and C_1 can then be expressed as:
g = ω_0 ω_1 (μ_0 − μ_1)²    formula (3-1)
By traversing all gray levels, the segmentation threshold j that maximizes the between-class variance g is found; this is the desired threshold.
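For illustration, a minimal sketch of this threshold search is given below (an equivalent result can be obtained with OpenCV's THRESH_OTSU flag); the helper name is illustrative.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the threshold j that maximizes g = w0 * w1 * (mu0 - mu1)^2 on an 8-bit image."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    p = hist / hist.sum()                              # normalized gray-level histogram
    levels = np.arange(256)
    best_j, best_g = 0, -1.0
    for j in range(255):                               # candidate thresholds
        w0, w1 = p[:j + 1].sum(), p[j + 1:].sum()
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (levels[:j + 1] * p[:j + 1]).sum() / w0  # mean gray value of class C0
        mu1 = (levels[j + 1:] * p[j + 1:]).sum() / w1  # mean gray value of class C1
        g = w0 * w1 * (mu0 - mu1) ** 2                 # between-class variance, formula (3-1)
        if g > best_g:
            best_j, best_g = j, g
    return best_j

# binary_mask = (saliency_map > otsu_threshold(saliency_map)).astype(np.uint8)
```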
Because the adaptive threshold segmentation method can only obtain the approximate region of the salient target, in order to segment targets more precisely, the invention uses the GrabCut image segmentation algorithm, on the basis of the adaptive threshold segmentation, to achieve more accurate segmentation of the salient targets in the image.
The GrabCut algorithm works in the RGB color space and models the foreground object and the background region separately with full-covariance Gaussian mixture models (GMMs), each containing K Gaussian components (in the present invention, K is taken to be 5). Let the vector k = {k_1, …, k_n, …, k_N}, where k_n is the Gaussian component corresponding to the n-th pixel, k_n ∈ {1, …, K}. The energy of the entire image is:
E(α, k, θ, z) = U(α, k, θ, z) + V(α, z)    formula (3-2)
where z denotes the image pixel values, α_n ∈ {0, 1} (0 represents the background region, 1 represents the foreground object), and θ is the set of model parameters.
The region term U is defined as follows:
U(α, k, θ, z) = Σ_n D(α_n, k_n, θ, z_n)    formula (3-3)
where D(α_n, k_n, θ, z_n) = −log p(z_n | α_n, k_n, θ) − log π(α_n, k_n), p(·) is a Gaussian probability distribution and π(·) is the mixture weight coefficient. Then:
D(α_n, k_n, θ, z_n) = −log π(α_n, k_n) + (1/2) log det Σ(α_n, k_n) + (1/2) [z_n − μ(α_n, k_n)]^T Σ(α_n, k_n)^{-1} [z_n − μ(α_n, k_n)]    formula (3-4)
At this point, the parameters of the model are determined as:
θ = {π(α, k), μ(α, k), Σ(α, k); α = 0, 1; k = 1, …, K}    formula (3-5)
namely the weights π, mean vectors μ and covariance matrices Σ of the 2K Gaussian components (the foreground object and the background region each contain K components), all of which are obtained by learning. Once these three kinds of parameters are determined, the color value (R, G, B) of each pixel is fed into the GMM to compute the probability that it belongs to the foreground or the background.
The boundary term V is defined as follows:
V(α, z) = γ Σ_{(m,n)∈C} [α_n ≠ α_m] exp(−β ‖z_m − z_n‖²)    formula (3-6)
where C is the set of pairs of neighboring pixels. In the RGB color space, the distance between two pixels is the Euclidean distance ‖z_m − z_n‖. The parameter β is determined by the image contrast: when the contrast is low, β is larger so as to amplify the differences between pixels; conversely, when the contrast is high, β is smaller so as to reduce the differences.
The optimal value of E is obtained with a max-flow algorithm. The number of iterations is set; after each iteration the GMM parameters are updated, and the process of optimizing the energy function E is repeated, so that a better target segmentation result is finally obtained.
In the initial stage of the GrabCut algorithm, a pixel set for the image background region and an initial pixel set for the foreground target must be provided. The Otsu algorithm provides exactly such a preliminary segmentation result: the foreground target and background region obtained by the Otsu method are fed into the GrabCut algorithm, and the image used for target detection is segmented iteratively to obtain a finer target segmentation result. In this way the Otsu algorithm and the GrabCut algorithm are effectively combined to achieve the final fine segmentation.
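For illustration, a minimal sketch of this combination is given below, assuming OpenCV's grabCut with mask initialization; mapping the Otsu foreground to the "probable foreground" label is an implementation assumption, as is the default of 3 iterations.

```python
import cv2
import numpy as np

def refine_with_grabcut(image_bgr, otsu_mask, iters=3):
    """Refine an Otsu foreground/background split with GrabCut (mask initialization)."""
    # Otsu foreground becomes "probable foreground", everything else definite background.
    mask = np.where(otsu_mask > 0, cv2.GC_PR_FGD, cv2.GC_BGD).astype(np.uint8)
    bgd_model = np.zeros((1, 65), np.float64)   # internal GMM buffers required by OpenCV
    fgd_model = np.zeros((1, 65), np.float64)
    cv2.grabCut(image_bgr, mask, None, bgd_model, fgd_model, iters, cv2.GC_INIT_WITH_MASK)
    # Definite or probable foreground pixels form the refined target mask.
    return ((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD)).astype(np.uint8)
```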
When the target detection simultaneously adopts a plurality of temporally and spatially aligned images collected through multi-source channels, in order to make comprehensive use of these images and locate the target accurately, the method for detecting weak and small targets further comprises the step of fusing the images processed by the above steps.
Image fusion means applying a fusion algorithm to the temporally and spatially aligned images collected through multi-source channels, extracting from each image the information that contributes to the output result, and synthesizing an image that is easier to observe and process. Image fusion can be divided into three levels: pixel-level fusion, feature-level fusion and decision-level fusion. The invention realizes decision-level fusion of the segmentation results of multi-source images such as visible light images and infrared images.
The decision-level fusion is a process of performing comprehensive analysis and decision after independently processing multi-source information to obtain a result, and is high-level fusion based on image understanding. The invention realizes the decision-level fusion of the multi-source image segmentation result by using Bayesian decision.
The fusion process comprises the following substeps:
step 4.1), fusing the images acquired based on the multi-source channels to obtain the conditional probability of the category of each pixel point under the scene image;
setting pixel class label ═ tone0,1Therein of0Indicating that the pixel belongs to the background region,1indicating that the pixel belongs to the foreground object. Is provided with
Figure RE-GDA0002516318760000141
Wherein, P (z)x,y|1) Is shown in1The conditional probability of the pixel point (x, y) under the condition of the class; for the same reason P (z)x,y|0) Is shown in0The conditional probability of the pixel point (x, y) under the condition of the class; c represents the number of fused images, and c is 2; pn(zx,y) Indicating that the pixel (x, y) belongs to in the nth image1αnRepresents the fusion weight of the nth image and satisfies
Figure RE-GDA0002516318760000142
Step 4.2), obtaining, from the saliency map, the prior probability of the class to which each pixel point belongs in the scene image;
The prior probability of the class to which the pixel point (x, y) belongs can be calculated from the saliency map obtained in step 2), specifically:
P_{x,y}(ℓ_1) = S(x, y) / 255, P_{x,y}(ℓ_0) = 1 − P_{x,y}(ℓ_1)
where S(x, y) is the gray value of the pixel point (x, y) in the saliency map and 255 is the maximum gray level.
Step 4.3), from the prior probability and the fused conditional probability of each pixel point, the posterior probability is obtained using the Bayesian formula:
P(ℓ_1 | z_{x,y}) = P(z_{x,y} | ℓ_1) P_{x,y}(ℓ_1) / [P(z_{x,y} | ℓ_1) P_{x,y}(ℓ_1) + P(z_{x,y} | ℓ_0) P_{x,y}(ℓ_0)]
According to the Bayes decision rule, if the posterior probability P(ℓ_1 | z_{x,y}) > P(ℓ_0 | z_{x,y}), the pixel point (x, y) is judged to belong to class ℓ_1; otherwise, the pixel point (x, y) is judged to belong to class ℓ_0. The target after image fusion is thereby located.
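For illustration, a minimal sketch of this decision-level fusion is given below; the function name and the equal fusion weights in the usage line are illustrative assumptions.

```python
import numpy as np

def bayes_fuse(prob_maps, weights, saliency_map):
    """Decision-level fusion of per-source foreground probability maps.

    prob_maps:    list of arrays P_n(z_xy) in [0, 1], one per aligned source image
    weights:      fusion weights alpha_n (must sum to 1)
    saliency_map: 8-bit grayscale saliency map supplying the prior
    Returns a binary foreground mask (1 = foreground, 0 = background).
    """
    cond_fg = sum(w * p for w, p in zip(weights, prob_maps))    # P(z | l1), weighted fusion
    cond_bg = 1.0 - cond_fg                                     # P(z | l0)
    prior_fg = saliency_map.astype(np.float64) / 255.0          # P_xy(l1) = S(x, y) / 255
    prior_bg = 1.0 - prior_fg                                   # P_xy(l0)
    evidence = cond_fg * prior_fg + cond_bg * prior_bg
    post_fg = cond_fg * prior_fg / np.maximum(evidence, 1e-12)  # Bayes posterior for the foreground class
    return (post_fg > 0.5).astype(np.uint8)                     # foreground when P(l1|z) > P(l0|z)

# fused_mask = bayes_fuse([vis_prob, ir_prob], [0.5, 0.5], saliency_gray)
```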
Examples
Example 1
1. Image pre-processing enhancement
The input original image is shown in fig. 1, and is subjected to local histogram equalization based on an HSV color image model, and the result is shown in fig. 2. It can be seen that the contrast of the processed image is enhanced, so that the details in the image are displayed more clearly, and meanwhile, the adverse condition of color distortion does not occur.
2. Significance analysis
The results of the image saliency analysis are shown in fig. 3 and 4. It can be seen that the areas of the original image where the pedestrians and vehicles are located appear brighter in the saliency map, indicating higher saliency, and the foreground objects of interest are most likely to appear in regions of high saliency. Therefore, the result of the saliency analysis can be combined with an image segmentation algorithm to detect the salient targets.
3. Adaptive threshold segmentation
The saliency map is adaptively threshold-segmented using the Otsu method, and the segmentation result (a binary image) is superimposed on the original image; the results are shown in fig. 5 and 6. The pure-black areas in the figures are taken as background and binarized to 0; the remaining areas containing image content are foreground objects and binarized to 1. It can be seen that the regions of higher brightness (higher saliency) in the saliency map are essentially all segmented out. The adaptive threshold segmentation result can thus be used as a preliminary segmentation, which is then combined with the GrabCut algorithm to achieve finer segmentation of the salient targets.
4. Grabcut segmentation
The Otsu algorithm gives a preliminary segmentation result and generates a binary template map in which the label value of the foreground area is 1 and that of the background area is 0. The GrabCut algorithm requires a pixel set for the background area and an initial pixel set for the foreground object, so the pixels with label value 0 in the binary map are used as the background pixel set and the pixels with label value 1 as the initial foreground pixel set. The number of iterations is then set (to 3), and the segmentation model is iteratively optimized to obtain the final target segmentation result.
The results of the GrabCut segmentation are shown in fig. 7 and 8. It can be seen that, after the GrabCut algorithm, the salient targets in the foreground area of the template map are segmented and the outlines of most targets are well delineated, achieving the expected effect.
5. Image decision level fusion
As shown in fig. 9, the segmentation results of the visible light image and the infrared image are fused using Bayesian decision to determine whether each pixel in the image belongs to a foreground target or the background region, thereby obtaining the final target detection result.
Example 2
By using the method disclosed by the invention, the detection speed and accuracy of the weak and small targets in the image are verified.
Wherein, FPS (number of pictures processed per second) is adopted as an evaluation criterion for the detection speed, and the following evaluation criteria are adopted for the detection precision:
accuracy = (TP + TN) / (P + N)
precision = TP / (TP + FP)
recall = TP / (TP + FN)
wherein: true Positives, TP: the number of targets detected as foreign objects and actually also as foreign objects; falsesubjects, FP: detecting the number of targets which are foreign targets and are actually non-foreign targets; true negotives, TN: the number of targets detected as non-foreign targets, which are actually non-foreign targets; false Negatives, FN: the number of targets detected as non-foreign targets, actually foreign targets. N is TN + FN and P is TP + FP. In this embodiment, precision and recall are mainly used as evaluation indexes, the recall reflects a missing detection rate, the larger the recall is, the lower the missing detection is, the precision reflects a false alarm rate, and the larger the precision is, the lower the false alarm rate is.
In this embodiment, the detection speed (FPS) is obtained on a CPU by processing all pictures in the dataset (infrared and visible light processed separately) and measuring the execution time, as shown in the following table.
Model detection speed table
[Table: model detection speed (FPS) results]
In the present embodiment, weak and small target detection is performed for targets with different sizes, and specific detection results are shown in fig. 10-14; wherein, fig. 10a is a picture to be detected, and fig. 10b is a target detection result obtained by detecting fig. 10 a; FIG. 11a is a picture to be detected, and FIG. 11b is a target detection result obtained by detecting FIG. 11 a; fig. 12a is a picture to be detected, and fig. 12b is a target detection result obtained by detecting fig. 12 a; fig. 13a is a picture to be detected, and fig. 13b is a target detection result obtained by detecting fig. 13 a; fig. 14a is a picture to be detected, and fig. 14b is a target detection result obtained by detecting fig. 14 a. The corresponding detection accuracy is shown in the following table:
[Table: detection accuracy for targets of different sizes]
weak and small target detection is carried out on targets with different background complex conditions, and specific detection results are shown in fig. 15-18; wherein, fig. 15a is a picture to be detected, and fig. 15b is a target detection result obtained by detecting fig. 15 a; fig. 16a is a picture to be detected, and fig. 16b is a target detection result obtained by detecting fig. 16 a; fig. 17a is a picture to be detected, and fig. 17b is a target detection result obtained by detecting fig. 17 a; fig. 18a is a picture to be detected, and fig. 18b is a target detection result obtained by detecting fig. 18 a. The corresponding detection accuracy is shown in the following table:
[Table: detection accuracy under different background complexities]
As can be seen from Example 2, the weak and small target detection method based on multi-source information fusion provided by the invention offers high detection efficiency and good detection precision.
The present invention has been described above in connection with preferred embodiments, but these embodiments are merely exemplary and illustrative. On this basis, various substitutions and modifications may be made to the invention, all of which fall within its protection scope.

Claims (10)

1. A weak and small target detection method comprises the following steps:
step 1), preprocessing an image for target detection, and enhancing the contrast of the image;
step 2), carrying out image saliency analysis on the image to obtain a saliency map highlighting the area where the foreground target is located;
and 3) carrying out segmentation processing on the salient target in the image based on the saliency map of the image, and positioning to obtain the target position in the image.
2. The method according to claim 1, wherein in step 1), for an achromatic image, the contrast of the image is enhanced using histogram equalization,
the histogram equalization method comprises a global histogram equalization method and a local histogram equalization method, and preferably comprises a local histogram equalization method.
3. The method according to claim 1, characterized in that in step 1), for color images, an equalization method based on an RGB color image model is used to enhance the contrast of the image, in particular:
and separating three channels of the image by using an RGB color image model to obtain three single-channel images, independently performing histogram equalization processing on each image, and finally merging the processed single-channel images to restore the processed single-channel images into a color image form.
4. The method according to claim 1, characterized in that in step 1), for the color image, an equalization method based on HSV color image model is used to enhance the contrast of the image, specifically:
using the HSV color image model, the histogram equalization process is performed on the V (brightness) channel thereof alone while keeping the H (hue) and S (saturation) channels unchanged.
5. The method according to claim 1, characterized in that in step 2), the image saliency analysis comprises the following sub-steps:
step 2.1), based on the RGB color image model, each single-channel image I_i of the input image I is normalized such that: 0 ≤ I_i(x, y) ≤ 1, i = 1, 2, …, N, where N is the number of image channels;
step 2.2), the following operation is carried out on each single-channel image to obtain N single-channel saliency maps S_i:
S_i = IDCT(sign(DCT(I_i))), i = 1, 2, …, N
where N is the number of image channels, and DCT(·) and IDCT(·) are the discrete cosine transform and inverse discrete cosine transform, respectively;
step 2.3), averaging the single-channel saliency maps and combining them into a grayscale saliency map, namely:
S = (1/N) Σ_{i=1}^{N} S_i, S ∈ R^{h_s × w_s}
where h_s and w_s are the height and width of the grayscale saliency map, respectively.
6. The method according to claim 5, characterized in that before step 2.2), it further comprises the step of resizing the image:
the width of the single channel image is scaled to a set size while the image height is scaled equally.
7. The method of claim 5, wherein the image saliency analysis further comprises step 2.5), blurring the grayscale saliency map using a Gaussian kernel whose width and height are both ksize = int(4 × w_s × η), where η is a blur parameter;
the standard deviation of the Gaussian function in the X and Y directions is: σ = w_s × η.
8. The method according to claim 1, wherein in step 3), the salient object is initially segmented by using adaptive threshold segmentation, and then the salient object in the image is finely segmented by using GrabCut segmentation on the basis of the initial segmentation, and the salient object is located;
optionally, adaptive thresholding of the saliency map is performed using an Otsu method.
9. The method of claim 1, wherein when the target detection employs a plurality of temporally and spatially aligned images acquired by a multi-source channel, the detection method further comprises a step of fusing the images processed in the steps 1) to 3).
10. The method according to claim 9, characterized in that the fusion process comprises the following sub-steps:
step 4.1), fusing the images acquired based on the multi-source channels to obtain the conditional probability of the category of each pixel point under the scene image;
setting the pixel class labels ℓ = {ℓ_0, ℓ_1}, where ℓ_0 indicates that the pixel belongs to the background region and ℓ_1 indicates that the pixel belongs to the foreground target; letting
P(z_{x,y} | ℓ_1) = Σ_{n=1}^{c} α_n P_n(z_{x,y})
P(z_{x,y} | ℓ_0) = 1 − P(z_{x,y} | ℓ_1)
where P(z_{x,y} | ℓ_1) is the conditional probability of the pixel point (x, y) under class ℓ_1; likewise, P(z_{x,y} | ℓ_0) is the conditional probability of the pixel point (x, y) under class ℓ_0; c is the number of fused images; P_n(z_{x,y}) is the probability that pixel (x, y) belongs to ℓ_1 in the n-th image; and α_n is the fusion weight of the n-th image, satisfying
Σ_{n=1}^{c} α_n = 1;
Step 4.2), fusing the saliency map obtained in the step 2) to obtain the prior probability of the category to which each pixel point belongs under the scene image;
the method specifically comprises the following steps:
Figure RE-FDA0002516318750000033
Px,y(0)=1-Px,y(1)
wherein S (x, y) represents the gray value of the pixel point (x, y) in the saliency map; 255 is the maximum value of the gray level;
step 4.3), based on the prior probability and the conditional probability of the pixel points, obtaining the posterior probability of the pixel points by using a Bayes algorithm, and judging whether the pixel points belong to the significance target or not according to the posterior probability;
specifically, the method comprises the following steps:
P(ℓ_1 | z_{x,y}) = P(z_{x,y} | ℓ_1) P_{x,y}(ℓ_1) / [P(z_{x,y} | ℓ_1) P_{x,y}(ℓ_1) + P(z_{x,y} | ℓ_0) P_{x,y}(ℓ_0)]
when the posterior probability P(ℓ_1 | z_{x,y}) > P(ℓ_0 | z_{x,y}), the pixel point (x, y) is judged to belong to class ℓ_1; otherwise, the pixel point (x, y) is judged to belong to class ℓ_0, and the target after image fusion is located.
CN202010215165.6A 2020-03-24 2020-03-24 Weak and small target detection method based on multi-source information fusion Active CN111489330B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010215165.6A CN111489330B (en) 2020-03-24 2020-03-24 Weak and small target detection method based on multi-source information fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010215165.6A CN111489330B (en) 2020-03-24 2020-03-24 Weak and small target detection method based on multi-source information fusion

Publications (2)

Publication Number Publication Date
CN111489330A true CN111489330A (en) 2020-08-04
CN111489330B CN111489330B (en) 2021-06-22

Family

ID=71791600

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010215165.6A Active CN111489330B (en) 2020-03-24 2020-03-24 Weak and small target detection method based on multi-source information fusion

Country Status (1)

Country Link
CN (1) CN111489330B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113077021A (en) * 2021-06-07 2021-07-06 广州天鹏计算机科技有限公司 Machine learning-based electronic medical record multidimensional mining method
CN113965371A (en) * 2021-10-19 2022-01-21 北京天融信网络安全技术有限公司 Task processing method, device, terminal and storage medium in website monitoring process
CN114140493A (en) * 2021-12-03 2022-03-04 湖北微模式科技发展有限公司 Target multi-angle display action continuity detection method
CN115174807A (en) * 2022-06-28 2022-10-11 上海艾为电子技术股份有限公司 Anti-shake detection method and device, terminal equipment and readable storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104463911A (en) * 2014-12-09 2015-03-25 上海新跃仪表厂 Small infrared moving target detection method based on complicated background estimation
CN105184293A (en) * 2015-08-29 2015-12-23 电子科技大学 Automobile logo positioning method based on significance area detection
CN109558848A (en) * 2018-11-30 2019-04-02 湖南华诺星空电子技术有限公司 A kind of unmanned plane life detection method based on Multi-source Information Fusion
CN110782466A (en) * 2018-07-31 2020-02-11 阿里巴巴集团控股有限公司 Picture segmentation method, device and system

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104463911A (en) * 2014-12-09 2015-03-25 上海新跃仪表厂 Small infrared moving target detection method based on complicated background estimation
CN105184293A (en) * 2015-08-29 2015-12-23 电子科技大学 Automobile logo positioning method based on significance area detection
CN110782466A (en) * 2018-07-31 2020-02-11 阿里巴巴集团控股有限公司 Picture segmentation method, device and system
CN109558848A (en) * 2018-11-30 2019-04-02 湖南华诺星空电子技术有限公司 A kind of unmanned plane life detection method based on Multi-source Information Fusion

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
KEITA FUKUDA等: "Automatic Segmentation of Object Region Using Graph Cuts Based on Saliency Maps and AdaBoost", 《THE 13TH IEEE INTERNATIONAL SYMPOSIUM ON CONSUMER ELECTRONICS》 *
王化祥 et al.: "Principles and Applications of Sensors" (《传感器原理及应用》), Tianjin University Press, 30 September 2014 *
肖亮 et al.: "Super-Resolution Enhancement Theory and Algorithms Based on Image Prior Modeling: Variational PDE, Sparse Regularization and Bayesian Methods" (《基于图像先验建模的超分辨增强理论与算法 变分PDE、稀疏正则化与贝叶斯方法》), National Defense Industry Press, 31 July 2017 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113077021A (en) * 2021-06-07 2021-07-06 广州天鹏计算机科技有限公司 Machine learning-based electronic medical record multidimensional mining method
CN113965371A (en) * 2021-10-19 2022-01-21 北京天融信网络安全技术有限公司 Task processing method, device, terminal and storage medium in website monitoring process
CN113965371B (en) * 2021-10-19 2023-08-29 北京天融信网络安全技术有限公司 Task processing method, device, terminal and storage medium in website monitoring process
CN114140493A (en) * 2021-12-03 2022-03-04 湖北微模式科技发展有限公司 Target multi-angle display action continuity detection method
CN115174807A (en) * 2022-06-28 2022-10-11 上海艾为电子技术股份有限公司 Anti-shake detection method and device, terminal equipment and readable storage medium

Also Published As

Publication number Publication date
CN111489330B (en) 2021-06-22

Similar Documents

Publication Publication Date Title
Bahnsen et al. Rain removal in traffic surveillance: Does it matter?
CN111489330B (en) Weak and small target detection method based on multi-source information fusion
CN108062525B (en) Deep learning hand detection method based on hand region prediction
Xu et al. Learning-based shadow recognition and removal from monochromatic natural images
Kim et al. Spatiotemporal saliency detection and its applications in static and dynamic scenes
Zhang et al. Multi-class weather classification on single images
CN111914664A (en) Vehicle multi-target detection and track tracking method based on re-identification
CN107273832B (en) License plate recognition method and system based on integral channel characteristics and convolutional neural network
US20140341421A1 (en) Method for Detecting Persons Using 1D Depths and 2D Texture
CN108416780B (en) Object detection and matching method based on twin-region-of-interest pooling model
CN110929593A (en) Real-time significance pedestrian detection method based on detail distinguishing and distinguishing
Li et al. Saliency based image segmentation
CN109242032B (en) Target detection method based on deep learning
US20210256291A1 (en) Computer-implemented method of detecting foreign object on background object in an image, apparatus for detecting foreign object on background object in an image, and computer-program product
Naufal et al. Preprocessed mask RCNN for parking space detection in smart parking systems
CN111815528A (en) Bad weather image classification enhancement method based on convolution model and feature fusion
Cruz et al. Aerial detection in maritime scenarios using convolutional neural networks
Gupta et al. Early wildfire smoke detection in videos
Chen et al. Visual depth guided image rain streaks removal via sparse coding
CN114550134A (en) Deep learning-based traffic sign detection and identification method
Aung et al. Automatic license plate detection system for myanmar vehicle license plates
Hu et al. Fast face detection based on skin color segmentation using single chrominance Cr
Li et al. Grain depot image dehazing via quadtree decomposition and convolutional neural networks
CN110188693B (en) Improved complex environment vehicle feature extraction and parking discrimination method
Jeong et al. Homogeneity patch search method for voting-based efficient vehicle color classification using front-of-vehicle image

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant