CN107169487B - Salient object detection method based on superpixel segmentation and depth feature positioning


Info

Publication number
CN107169487B
Authority
CN
China
Prior art keywords
information
image
depth
segmentation
area
Prior art date
Legal status
Active
Application number
CN201710255712.1A
Other languages
Chinese (zh)
Other versions
CN107169487A (en)
Inventor
肖嵩
熊晓彤
刘雨晴
李磊
王欣远
杜建超
Current Assignee
Xian University of Electronic Science and Technology
Original Assignee
Xian University of Electronic Science and Technology
Priority date
Filing date
Publication date
Application filed by Xian University of Electronic Science and Technology filed Critical Xian University of Electronic Science and Technology
Priority to CN201710255712.1A
Publication of CN107169487A
Application granted
Publication of CN107169487B
Legal status: Active


Classifications

    • G06V 10/24: Aligning, centring, orientation detection or correction of the image
    • G06T 7/40: Analysis of texture
    • G06V 10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
    • G06V 10/478: Contour-based spectral representations or scale-space representations, e.g. by Fourier analysis, wavelet analysis or curvature scale-space [CSS]
    • G06V 10/751: Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
    • G06T 2207/20021: Dividing image into blocks, subimages or windows
    • G06V 10/473: Contour-based spatial representations, e.g. vector-coding, using gradient analysis
    • G06V 2201/07: Target detection


Abstract

The invention provides a salient object detection method based on superpixel segmentation and depth feature positioning, which solves the poor object segmentation of traditional salient object detection methods. The method uses linearly iterated superpixel segmentation over color similarity to raise the processing unit of the image from individual pixels to regions of collective similarity. It fully considers image features such as color, orientation and depth, and combines them with prior knowledge: human eyes attend more to the center of a scene and ignore the surrounding background, the region containing the salient object has high feature similarity, and the salient object is unique with respect to the global features. From these cues the positioning saliency map and the depth saliency map of the input image are generated, then fused and boundary-processed. The invention yields clearer edges in the detected image, more complete background removal and more complete segmentation of the object's form. The method can be used in fields such as face recognition, vehicle detection, moving-object detection and tracking, military missile detection and hospital pathology detection.

Description

Salient object detection method based on superpixel segmentation and depth feature positioning
Technical Field
The invention belongs to the technical field of image detection, relates mainly to salient object detection methods, and particularly relates to a salient object detection method based on superpixel segmentation and depth feature positioning. The method can be used in various fields such as face recognition, vehicle detection, moving-object detection and tracking, military missile detection and hospital pathology detection.
Background
As data volumes grow continuously and enormously, the amount of data accumulated per unit time rises exponentially, and processing this huge volume into refined data information demands better computing technology and algorithm theory. High-resolution images appear in endless succession and bring people great visual enjoyment, and human understanding of complex images has reached a high level. Traditional image processing either treats pixels in isolation or analyzes the information conveyed by an image only as a complete whole; facing such huge data volumes, it falls far short of the requirements of efficiency and real-time operation. Meanwhile, the desired salient object detection cannot be achieved by considering only simple features related to the human visual attention mechanism, such as color and orientation features; and processing the images to be detected manually makes the work difficult, the pressure high and the load heavy. How to let a computer simulate the human visual mechanism and realize a salience attention mechanism similar to that of humans for processing image information has become a hot topic urgently awaiting a solution.
Some existing salient object detection methods consider only the characteristics of images, looking for the difference between the object region and the background region in order to distinguish the object's position from the background. Some process the saliency map with a Markov chain to find how the central salient region and the surrounding background region influence each other. Other methods convolve the amplitude spectrum with a filter to handle redundant information and finally find the salient regions, and there are further variants such as local contrast and global contrast. Although these methods detect salient objects with a certain effectiveness, their detection results are weak in edge segmentation, background removal and extraction of the object's form, so they have certain limitations. Moreover, most of them process image features in the form of individual pixels, which is far from adequate for the present situation.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a salient object detection method based on superpixel segmentation and depth feature positioning that produces clearer edges, more complete background removal and more complete segmentation of the object's form.
The invention relates to a salient object detection method based on superpixel segmentation and depth feature positioning, characterized by comprising the following steps:
step 1: perform linear iterative clustering segmentation on the input image. Input the target image to be detected and first divide it into K regions; within each region's neighborhood, find the local gradient minimum point as the center point, and set one label number per region. For each pixel, search its neighborhood for the center point at the smallest five-dimensional Euclidean distance and give that center's label to the pixel being processed. Iterate this nearest-center search until the pixels' label values no longer change, completing the superpixel segmentation;
step 2: construct differences of Gaussians to generate the positioning saliency map.
2a: filter the input original image with Gaussian functions to generate 8 hierarchical scale maps of the original image;
2b: combine the 8 constructed scale maps with the original image to form nine scale layers, and extract the red-green and blue-yellow color-difference maps of the nine layers, 18 color-difference maps in total; extract the intensity maps of the nine layers, 9 in total; extract the Gabor-filtered orientation maps of the nine layers, 36 in total; together these form three classes of feature maps;
2c: because the same class of features differs in size across the nine scale layers, interpolate the three classes of feature maps and then take their differences;
2d: because different classes of feature maps have different measurement scales, normalize them and then fuse them into the positioning saliency map;
step 3: generate the depth feature saliency map. First position the superpixel-segmented image according to the positioning saliency map of step 2; then, for each segmented region and its adjacent regions, gather three classes of feature information, namely nearest-neighbor region information, global region information and corner background region information, to generate a depth feature saliency map for detecting the salient object;
step 4: fuse and boundary-process the positioning saliency map and the depth feature saliency map determined through steps 2 and 3 to generate the final salient object map, completing salient object detection with superpixel segmentation and depth feature positioning.
Compared with the prior art, the invention has the following beneficial effects:
1. The invention applies linear iteration over five-dimensional Euclidean color similarity to pre-process the input image with superpixel segmentation, solving the poor object-edge segmentation of traditional salient object detection methods and providing a detection method that is more intelligent, more efficient and more robust.
2. The method fully considers image features such as color, orientation and depth, together with prior knowledge: attention favors the center and ignores the surrounding background, the region containing the object has high feature similarity, and the object is unique with respect to global features. Salient object detection is realized on this basis, making the computer more logical and more intelligent.
3. The method obtains the detection target without being restricted to particular features or environments: images shot in many scenes, such as offices, campus areas and parks, can be processed, and the detection result accords better with human visual salience. Background removal is more complete, and the position and form of the extracted object are more complete.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a diagram of the effect of the super-pixel segmentation in the method of the present invention, wherein FIG. 2(a) is a diagram of the segmentation effect of the corner of an office, and FIG. 2(b) is a diagram of the segmentation effect in a library scene;
fig. 3 is a diagram showing the detection effect and comparing the effect of the present invention with other methods in recent years for ten selected images, wherein fig. 3(a) is a selected original image, fig. 3(b) is a diagram showing the detection effect of the present invention, fig. 3(c) is a diagram showing the effect of the GS method, fig. 3(d) is a diagram showing the effect of the GBMR method, fig. 3(e) is a diagram showing the effect of the RARE method, fig. 3(f) is a diagram showing the effect of the HSD method, fig. 3(g) is a diagram showing the effect of the STD method, and fig. 3(h) is an artificial mark diagram;
FIG. 4 is a graph of the precision and recall of the present invention versus other recent methods for the five hundred images used.
Detailed Description
The invention is described in detail below with reference to the accompanying drawings.
Example 1
Some existing salient object detection methods consider only the characteristics of images, looking for the difference between the object region and the background region in order to distinguish the object's position from the background. Some process the saliency map with a Markov chain to find how the central salient region and the surrounding background region influence each other, and other methods convolve the amplitude spectrum with a filter to handle redundant information and finally find the salient target region. Although these methods detect salient objects with a certain effectiveness, their detection results are weak in edge segmentation, background removal and extraction of the object's form, so they have certain limitations.
Aiming at these defects in the prior art, and through study and innovation, the invention provides a salient object detection method based on superpixel segmentation and depth feature positioning, shown in fig. 1, which comprises the following steps:
step (1) performing linear iterative clustering segmentation on an input image: inputting a target image to be detected, namely an original image, dividing the target image into K regions, searching local gradient minimum value points of the neighborhood of each region as a central point, and setting a label number for the same region. Searching a central point with the minimum five-dimensional Euclidean distance in the neighborhood of the distance pixel points, and endowing a central point label to the pixel point to be processed; and continuously iterating to search a central point with the minimum distance to the pixel points, and giving labels to the pixel points until the label numbers of the pixel points are not changed, so that super-pixel segmentation is completed. In this example, 5 × 5 neighborhoods are used for searching each region neighborhood, and 2S × 2S neighborhoods are used for searching the distance pixel point neighborhood.
Step (2) generating a positioning saliency map by using a Gaussian difference method:
(2a) the input original image is subjected to Gaussian function filtering processing to generate 8 hierarchical scale maps of the original image.
(2b) The 8 constructed scale layers are combined with the original image to form nine scale layers. The red-green color-difference map and the blue-yellow color-difference map of the nine layers are extracted, the two kinds of color-difference maps giving 18 maps in total; the intensity map of the nine layers is extracted, 9 maps in total; the Gabor-filtered orientation maps of the nine layers are extracted in the four directions 0°, 45°, 90° and 135°, 36 orientation maps in total. Together these form the three classes of feature maps: color-difference maps, intensity maps and orientation maps.
(2c) Because the same class of features obtained from the nine scale layers differs in size, the three classes of feature maps must be interpolated and then differenced.
(2d) Because the measurement scales of the different classes of feature maps differ, a single amplitude value cannot reflect importance to salience, so the different classes of features are normalized and then fused into the positioning saliency map.
Step (3), generating the depth feature saliency map of the input image: first, the superpixel-segmented image is positioned according to the positioning saliency map of step 2, fully considering that the central position is far more likely to hold the salient object than the image periphery, and that the salient object is concentrated, i.e. it occupies a contiguous area rather than being scattered over all or most of the image. Therefore, for each region segmented in step 1 and its adjacent regions, three classes of feature information are gathered, namely nearest-neighbor region information, global region information and corner background region information, to generate the depth feature saliency map for detecting the salient object.
Step (4), fusing and boundary-processing the positioning saliency map and the depth feature saliency map to generate the final salient object map, completing salient object detection with superpixel segmentation and depth feature positioning, so that the segmentation of the object is more regular and the boundary between the salient object and the ignored background is clearer.
The method fully considers image features such as color, orientation and depth, together with prior knowledge: attention favors the center and ignores the surrounding background, the region containing the object has high feature similarity, and the object is unique with respect to global features. Salient object detection is realized on this basis, making the computer more logical and more intelligent.
Example 2
The salient object detection method based on superpixel segmentation and depth feature positioning is the same as in embodiment 1; the superpixel segmentation of the target image to be detected in step 1 comprises the following steps:
1.1 Assume the target image, i.e. the original image, has N pixels in total and is to be divided into K regions; each divided region then clearly has about N/K pixels, and the spacing between region centers is about

S = sqrt(N / K)

The chosen center point may happen to lie exactly on an edge; to avoid this, the position of the local gradient minimum is searched around the chosen center and the center is moved to that position. A label number is set for each region as its mark.
1.2 For each pixel, compute the Euclidean distance of the five-dimensional feature vector C_i = [l_i, a_i, b_i, x_i, y_i]^T from the pixel to each center point determined in its surrounding neighborhood, then assign the label number of the center with the smallest value to the pixel currently being processed. In the five-dimensional feature vector, l_i, a_i, b_i are the three color component values of the CIELAB space, representing lightness, the position between red and green, and the position between yellow and blue respectively, and x_i, y_i are the coordinates of the pixel in the target image to be detected. The distance is computed with the following three formulas:

d_lab = sqrt( (l_k - l_i)^2 + (a_k - a_i)^2 + (b_k - b_i)^2 )

d_xy = sqrt( (x_k - x_i)^2 + (y_k - y_i)^2 )

D_i = d_lab + (m / S) · d_xy

In the above formulas, d_lab is the Euclidean distance between pixel k and center i in the CIELAB color space; d_xy is the Euclidean distance between pixel k and center i in spatial coordinate position; D_i is the measure used to evaluate whether pixel k and center i belong to one label: the smaller its value, the closer the similarity between pixel k and center i and the more their labels should agree; m is a fixed parameter balancing the relation between the variables; S = sqrt(N / K) is the spacing between the different regions.
The above constitutes one iteration cycle of setting the label number to which each pixel belongs.
1.3 The iteration of step 1.2 is performed continuously, further refining the accuracy of each pixel's label number, until the label number of every pixel in the whole image no longer changes; usually about 10 iterations achieve this.
1.4 The iterative process can leave problems, for example a very small region being split off as one superpixel, or a single isolated pixel forming a superpixel region of its own. To eliminate such cases, over-small independent areas and isolated single pixels are assigned to a nearby label number, which completes the superpixel segmentation of the target image.
The method performs superpixel segmentation by color similarity as preprocessing of the input image, solving the poor object segmentation of traditional salient object detection methods and providing a salient object detection method that is more intelligent, more efficient and more robust.
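For concreteness, the distance of step 1.2 and one label-assignment pass over the 2S × 2S neighborhoods can be sketched as follows. This is a sketch under stated assumptions: the array layout, helper names and loop structure are illustrative, and the full method repeats the pass until labels stabilize, as in steps 1.3 and 1.4.

```python
import numpy as np

def slic_distance(pixel, center, m, S):
    """D_i for the five-dimensional vectors (l, a, b, x, y): CIELAB color
    distance plus spatial distance weighted by m/S; smaller means more similar."""
    l_k, a_k, b_k, x_k, y_k = pixel
    l_i, a_i, b_i, x_i, y_i = center
    d_lab = np.sqrt((l_k - l_i) ** 2 + (a_k - a_i) ** 2 + (b_k - b_i) ** 2)
    d_xy = np.sqrt((x_k - x_i) ** 2 + (y_k - y_i) ** 2)
    return d_lab + (m / S) * d_xy

def assignment_pass(lab, centers, m, S):
    """One pass of step 1.2: every pixel takes the label of the nearest center,
    searched only inside the 2S x 2S neighborhood around each center.
    lab is an H x W x 3 CIELAB image; centers is a list of (l, a, b, x, y)."""
    H, W, _ = lab.shape
    labels = -np.ones((H, W), dtype=int)
    best = np.full((H, W), np.inf)
    for idx, (cl, ca, cb, cx, cy) in enumerate(centers):
        y0, y1 = max(int(cy - S), 0), min(int(cy + S), H)
        x0, x1 = max(int(cx - S), 0), min(int(cx + S), W)
        for y in range(y0, y1):
            for x in range(x0, x1):
                d = slic_distance((*lab[y, x], x, y), (cl, ca, cb, cx, cy), m, S)
                if d < best[y, x]:
                    best[y, x] = d
                    labels[y, x] = idx
    return labels
```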
Example 3
The salient object detection method based on superpixel segmentation and depth feature positioning is the same as in embodiments 1-2. To fully account for attention favoring the center while ignoring the surrounding background, for the feature similarity of the region containing the object, and for prior knowledge such as the object's uniqueness with respect to global features, step 3 gathers three classes of feature information, namely nearest-neighbor region information, global region information and corner background region information, through the following steps:
3.1 Considering that the center is far more likely to be salient than the surrounding background, the salient object must be concentrated in a region of a certain area; for each segmented region, the information within its nearest neighboring regions is collected, i.e. the nearest-neighbor region information.
3.2 Considering the degree to which the processed region influences the whole image, for each divided region the information contained in all regions other than the current one is collected, i.e. the global region information.
3.3 For each segmented region, the information of the four corner areas, which represent the background features, is collected, i.e. the corner background region information.
The gathering is completed from the feature information supplied by these three parts.
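A compact sketch of how the three classes of region information can be gathered from the superpixel labels and a depth map follows; the helper names and the use of per-region depth means are illustrative assumptions, since the text specifies only which regions contribute to each class.

```python
import numpy as np

def region_depth_means(depth, labels):
    """Depth average of every superpixel region (the d_i values used later)."""
    return {r: float(depth[labels == r].mean()) for r in np.unique(labels)}

def neighbour_regions(labels, r):
    """Nearest-neighbor information source: regions sharing a boundary with r."""
    mask = labels == r
    grown = np.zeros_like(mask)
    grown[1:, :] |= mask[:-1, :]; grown[:-1, :] |= mask[1:, :]   # dilate vertically
    grown[:, 1:] |= mask[:, :-1]; grown[:, :-1] |= mask[:, 1:]   # dilate horizontally
    return set(np.unique(labels[grown & ~mask]))

def global_regions(labels, r):
    """Global information source: every region except the current one."""
    return set(np.unique(labels)) - {r}

def corner_regions(labels):
    """Corner background information source: regions covering the four corners."""
    h, w = labels.shape
    return {labels[0, 0], labels[0, w - 1], labels[h - 1, 0], labels[h - 1, w - 1]}
```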
Example 4
The salient object detection method based on superpixel segmentation and depth feature positioning is the same as in embodiments 1-3; generating the depth feature saliency map in step 3 specifically comprises the following:
for one of the regions R which is completely segmented, the depth characteristic saliency of the quantized region R is:
Figure GDA0001360563150000061
wherein s (R) represents the saliency of the region R; pi is the multiplication of a plurality of factors; s (R, psi)C) Representing nearest neighbor area information; s (R, psi)G) Representing global area information; s (R, psi)B) Representing corner background area information.
Consider an image such as a bunch of flowers on a lawn: the focus settles instantly on the flowers and the surrounding background is ignored. This can be understood exactly as follows: the green leaves that form the background appear throughout the whole image with extremely high probability, while the flowers, the highly salient target, appear with relatively low probability. A high probability causes low attention because of its commonness, while a low probability causes high attention because of its uniqueness. This agrees with Shannon information theory, in which low probability is the hallmark of a high amount of information and high probability indicates that the information it carries is low. Accordingly, s(R, ψ_m) is defined as:

s(R, ψ_m) = -log( p(R | ψ_m) )

In the above formula, s(R, ψ_m) represents the nearest-neighbor region information, global region information or corner background region information term extracted under the depth feature; p is a probability value.
For the three classes of region information, nearest-neighbor, global and corner background, the formula above is simplified using the region depth averages:

s(R, ψ_m) = -log( p( d | d̄_ψm ) ),  with  d̄_ψm = (1 / n_m) · Σ_{i=1..n_m} d_i^m

In the above formula, d represents the depth average of the currently processed region block R; d̄_ψm is the depth average of the aforementioned ψ_m, where d_i^m represents the depth average of the i-th block of the m-th class of region information, and n_m has three cases in total: n_C represents the total number of nearest-neighbor region blocks, n_G the total number of global region blocks, and n_B the total number of corner background region blocks.
The probability is realized by estimation with a Gaussian function:

p( d | d̄_ψm ) = exp( -(d - d̄_ψm)² / (2 · σ_m²) )

In the above formula, σ_m represents the influence factor of the depth differences of the different region blocks; d, d̄_ψm and n_m have the meanings explained above.
The depth feature saliency model fully considers the characteristic that human eyes care more about the center and ignore the surrounding background, and, using the feature similarity of the region containing the object together with prior knowledge such as the object's uniqueness with respect to global features, expressed through the influence factors of the depth differences between region blocks, it makes the edges of the detection result clearer, the background removal more complete and the segmentation of the object's form more complete, making the computer more logical and more intelligent.
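Under the reconstruction above, the per-region score is the product of three negative-log-probability terms, each measuring how far the region's depth mean lies from the mean depth of one information class. The sketch below assumes exactly that reading; the Gaussian width sigma is a placeholder parameter, not a value given by the patent.

```python
import numpy as np

def info_term(d_R, class_depth_means, sigma):
    """s(R, psi_m) = -log p(d | d_bar), with p a Gaussian around the class mean."""
    d_bar = float(np.mean(class_depth_means))        # (1/n_m) * sum_i d_i^m
    p = np.exp(-(d_R - d_bar) ** 2 / (2.0 * sigma ** 2))
    return -np.log(p + 1e-12)                        # epsilon guards log(0)

def depth_saliency(d_R, nn_means, global_means, corner_means, sigma=0.1):
    """s(R) = s(R, psi_C) * s(R, psi_G) * s(R, psi_B)."""
    return (info_term(d_R, nn_means, sigma) *
            info_term(d_R, global_means, sigma) *
            info_term(d_R, corner_means, sigma))
```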
The following examples are given in more detail to further illustrate the invention:
example 5
The salient object detection method based on superpixel segmentation and depth feature positioning is the same as in embodiments 1-4; the core steps are as follows:
step (1) performing linear iterative clustering segmentation on an input image: the input image is divided into K areas, local gradient minimum value points of the neighborhood of each area are searched as central points, one label number is set in the same area, and different label numbers are set in different areas. And aiming at each pixel point, searching a central point with the minimum five-dimensional Euclidean distance in the neighborhood of the pixel point, and endowing the central point label to the pixel point to be processed. And setting the label of the current pixel point as the central point with the minimum distance, and continuously iterating to find the central point with the minimum distance to the pixel point and endowing the pixel point with the label according to the process shown in the figure 1. Setting labels of pixel points by comparing the distances between pixel points and a central point by taking a K region as a unit, finishing the setting of the labels of the pixel points in the K region through iterative optimization, traversing the whole image, judging whether the label value of the pixel points changes or not in the iterative optimization process, repeating the iterative operation if the label value changes relative to the last iterative process, or else, if the label value does not change relative to the last iterative process, allocating an independent region with an over small size or an isolated single pixel point to a nearby label number, wherein the label value does not change when the iteration number is 10 times, removing a super-pixel region formed by isolated points, and finishing super-pixel segmentation. The number of regions, not necessarily K regions, that generate a controlled number after segmentation is complete. In this example, the neighborhood of each region is found to be 3 × 3 neighborhood, and the neighborhood of the distance pixel point is found to be 2S × 2S neighborhood.
Step (2) generating a positioning saliency map of the input image by using a Gaussian difference method:
(2a) the input image is subjected to gaussian function filtering processing to generate 1/2 scale maps of the original, 1/4 scale maps of the original, and 1/256 scale maps of the original, which are 8-layer-level scale maps in total.
(2b) The 8 constructed scale levels are combined with the original image, i.e. the original is added to them, forming nine scale levels. The red-green color-difference map RG and the blue-yellow color-difference map BY of the nine levels are extracted, 18 color-difference maps in total. The intensity map I of the nine levels is extracted, 9 maps in total. The Gabor-filtered orientation map O of the nine levels is extracted in the four directions 0°, 45°, 90° and 135°, 36 orientation maps in total.
This procedure extracts feature maps of the nine scale levels from three aspects: color-difference maps, intensity maps and orientation maps.
(2c) Since the same class of features differs in size across the three classes of feature maps obtained, the like features must be interpolated and then differenced, as shown in fig. 1.
(2d) Because the measurement scales of the different classes of feature maps differ, a single amplitude value cannot reflect importance to salience; the different classes of features are normalized and then fused to obtain the positioning saliency map of the input image.
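A rough sketch of steps (2a) to (2d) in the style described, using OpenCV, is given below. The kernel sizes, Gabor parameters and the simple across-scale absolute difference are illustrative assumptions; the text fixes only the 8 extra scales, the three feature classes and the four Gabor directions. The sketch assumes the input image is large enough to be halved eight times.

```python
import numpy as np
import cv2

def normalise(m):
    rng = float(m.max() - m.min())
    return (m - m.min()) / rng if rng > 0 else np.zeros_like(m)

def positioning_saliency(bgr, levels=8):
    b, g, r = [bgr[..., i].astype(np.float32) for i in range(3)]
    feats = {
        "I": (r + g + b) / 3.0,       # intensity map
        "RG": r - g,                  # red-green color-difference map
        "BY": b - (r + g) / 2.0,      # blue-yellow color-difference map
    }
    for deg in (0, 45, 90, 135):      # four Gabor orientation maps
        kern = cv2.getGaborKernel((9, 9), 2.0, np.deg2rad(deg), 5.0, 0.5)
        feats["O%d" % deg] = cv2.filter2D(feats["I"], -1, kern)
    h, w = feats["I"].shape
    fused = np.zeros((h, w), np.float32)
    for base in feats.values():
        img, acc = base, np.zeros((h, w), np.float32)
        for _ in range(levels):       # 8 coarser scales below the original
            img = cv2.pyrDown(img)
            up = cv2.resize(img, (w, h), interpolation=cv2.INTER_LINEAR)
            acc += np.abs(base - up)  # interpolate back up, then difference
        fused += normalise(acc)       # normalise each class before fusing
    return normalise(fused)
```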
Step (3), extracting the depth feature saliency map D of the input image: first, the superpixel-segmented map is positioned according to the positioning saliency map of step 2, fully considering that the central position is far more likely to hold the salient object than the image periphery, and that the salient object is concentrated, i.e. it occupies a contiguous area rather than being scattered over all or most of the image. For each region segmented in step 1 and its adjacent regions, the three classes of feature information, nearest-neighbor region information, global region information and corner background region information, are gathered to generate the depth feature saliency map of the input image for detecting the salient object.
Step (4): to make the segmentation of the object more regular and the boundary between the salient object and the ignored background clearer, the positioning saliency map obtained in step 2 and the depth feature saliency map finally obtained in step 3 are fused and boundary-processed (a sketch follows below) to generate the final salient object map, completing salient object detection with superpixel segmentation and depth feature positioning.
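The text states that the two maps are fused and boundary-processed but does not fix the fusion operator, so the sketch below is one plausible reading: a pixel-wise product of the normalized maps, followed by a per-superpixel average so that the result's boundaries follow the segmentation.

```python
import numpy as np

def fuse_saliency(pos_map, depth_map, labels):
    """Hedged sketch of step (4): multiply the normalized maps, then flatten
    each superpixel to its mean value so region boundaries come out crisp."""
    def norm(m):
        rng = float(m.max() - m.min())
        return (m - m.min()) / rng if rng > 0 else np.zeros_like(m)
    s = norm(pos_map.astype(np.float32)) * norm(depth_map.astype(np.float32))
    out = np.zeros_like(s)
    for r in np.unique(labels):
        mask = labels == r
        out[mask] = s[mask].mean()
    return norm(out)
```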
This target detection method yields clearer edges, more complete background removal and more complete segmentation of the object's form.
The technical effects of the invention are explained in detail below with reference to the drawings and simulation data:
example 6
The salient object detection method based on superpixel segmentation and depth feature positioning is the same as in embodiments 1-5; the superpixel segmentation part of the method is demonstrated and its results analyzed by simulation.
The simulation conditions are as follows: PC with AMD A8-7650K Radeon R7 (10 compute cores: 4C+6G), 3.3 GHz, 4 GB memory; MATLAB R2015b.
The simulation content is superpixel segmentation of an office corner and a library scene using the method of the invention.
Fig. 2 is an effect diagram after super-pixel segmentation in the method of the present invention, wherein fig. 2(a) is a segmentation effect diagram of an office corner, and fig. 2(b) is a segmentation effect diagram in a library scene.
In fig. 2, the images without the grid are the original images selected by the invention; with the grid overlaid, they show the effect of the method of the invention after superpixel segmentation.
An image is composed of individual, independently occurring pixels, but a detectable target is never a single pixel: it occupies a certain area, contains many pixels, and those pixels share common properties rather than being divided independently from one another. In view of these characteristics, the invention segments the image into superpixels, i.e. into regions of homogeneity and regions of dissimilarity. Superpixel region blocks replace the huge number of single independent pixels and reduce the computer's complexity in processing the image.
Fig. 2(a) is a potted-plant scene at an office corner; apart from the potted plant in the corner, the background is simple. As the detection result shows, the areas other than the potted plant have uniform features, so after the invention's superpixel segmentation the resulting regions are regular in both size and shape. When the computer processes the image, like regions have already been grouped, so the image need not be handled pixel by pixel, which reduces processing complexity. The potted plant itself has varied features, and the method finely divides the green leaves and the white pot according to their similar and dissimilar characteristics. Processing then proceeds in units of like and unlike regions, which raises the computer's processing speed. Fig. 2(b) is a library scene: eight exhibits stand against the library's plain background, one of them in the middle of the scene. Although scene (b) is much more complex than (a), a plain wall area still exists in the image, and the effect shows that this wall with its uniform features is segmented regularly in both size and shape, while where the exhibits stand, the method clearly separates like features from unlike ones. Every image to be detected contains, to a greater or lesser degree, homogeneous regions similar to a plain wall; by processing in units of regions, the invention effectively reduces the complexity of image processing.
Compared with traditional superpixel segmentation methods, the clustering segmentation realized by linear iteration produces regions that are regular in both size and shape, and the boundaries between different regions are segmented more clearly.
Example 7
The salient object detection method based on superpixel segmentation and depth feature positioning is the same as in embodiments 1-5; in this embodiment, object detection and result analysis are performed by simulation.
The simulation conditions were the same as in example 6.
Simulation content: for the ten selected images, the effects of five methods, geodesic saliency detection GS, graph-based manifold ranking saliency detection GBMR, rarity-based saliency detection RARE, hierarchical saliency detection HSD and statistical texture distinctiveness saliency detection STD, are compared with the invention on the same images. The selected images include indoor and outdoor scenes such as offices, areas in a campus and parks.
Referring to fig. 3, fig. 3(a) is a selected original image, fig. 3(b) is a detection effect diagram of the present invention, fig. 3(c) is a GS method effect diagram, fig. 3(d) is a GBMR method effect diagram, fig. 3(e) is a RARE method effect diagram, fig. 3(f) is an HSD method effect diagram, fig. 3(g) is an STD method effect diagram, and fig. 3(h) is an artificial labeling diagram.
For the potted-plant image in the office scene, the method of the invention and the GBMR method not only detect the position of the salient object but also display its basic form; the other methods roughly display the object's form, but their background removal is incomplete, and in particular the HSD and STD methods leave a large part of the background behind. For the basketball image in a simple scene, all six methods detect the form of the target well, come close to the manually marked result, and meet the requirement. For the single-target red lantern on a roof, the roof in this scene is also red, so its interference is obvious; all the methods detect the target object, but the GS and STD methods cannot remove the strongly interfering red roof. For the roof red-lantern image with two targets, the method of the invention relies on the positioning saliency map, so its detection of multiple targets has a certain limitation and it detects mainly the left-hand target; the best method for this scene is the HSD method, yet it and the other methods, while detecting the target positions, remove the strongly interfering roof background incompletely and leave residual background areas that are too prominent, whereas the invention is unaffected by the strongly interfering roof and removes the background around its single detected target more cleanly. For several similar ancient characters on a wall, human eyes obviously attend only to the central character region, which is the most central and occupies the largest area of the scene; for this scene the HSD method detects the small similar interfering areas above and to the right as salient targets, which is clearly unreasonable, and the RARE and STD methods do the same, while the GS and GBMR methods give ideal results. For the nameplate image in a park scene, the advantage of the invention over the other methods is obvious, and its result is closest to the manual marking. For the three museum images, the GS and HSD methods show their strengths and detect the form and position of the target well, but like the other methods they retain background areas that should have been removed and that the result image does not need. For the ninth image, a relatively simple scene, all results except that of the RARE method, which is not ideal, are close to the image displayed by the manual marking. Overall, the detection effect of the proposed method across the various scenes is better than that of the other five methods in edge clarity, background removal and segmentation of the object's form.
Example 8
The salient object detection method based on superpixel segmentation and depth feature positioning is the same as in embodiments 1-5; this embodiment analyzes the performance of the detection results by simulation.
The simulation conditions were the same as in example 6.
Simulation content: for the five hundred selected images, the effects of the five methods, geodesic saliency detection GS, graph-based manifold ranking saliency detection GBMR, rarity-based saliency detection RARE, hierarchical saliency detection HSD and statistical texture distinctiveness saliency detection STD, are compared with the invention on the same images. The selected images include indoor and outdoor scenes such as offices, areas in a campus and parks.
Referring to fig. 4, the detection performance of the invention and of the five methods is analyzed. Performance is reflected by the precision (Precision) and recall (Recall) indices, whose components are defined as follows:
TP: the intersection of the target area of the obtained saliency map with the target area of the manually calibrated saliency map;
TN: the intersection of the non-target area of the obtained saliency map with the non-target area of the manually calibrated saliency map;
FP: the intersection of the target area of the obtained saliency map with the non-target area of the manually calibrated saliency map;
FN: the intersection of the non-target area of the obtained saliency map with the target area of the manually calibrated saliency map;
Therefore:

Precision = TP / (TP + FP)

Recall = TP / (TP + FN)
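A minimal sketch of how the two indices are computed at one binarization threshold follows; sweeping the threshold over the range of the saliency map produces the precision-recall curve of fig. 4. The function and argument names are illustrative.

```python
import numpy as np

def precision_recall(sal_map, gt_mask, thresh):
    """sal_map: saliency map scaled to [0, 1]; gt_mask: boolean manually
    calibrated map; returns (Precision, Recall) at one threshold."""
    pred = sal_map >= thresh
    tp = np.logical_and(pred, gt_mask).sum()
    fp = np.logical_and(pred, ~gt_mask).sum()
    fn = np.logical_and(~pred, gt_mask).sum()
    precision = tp / (tp + fp) if (tp + fp) > 0 else 1.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall
```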
precision and Recall indices were calculated for the methods of the invention and the five classes of methods, respectively. It can be seen that, in these comparative significant target detection methods, the present invention exhibits a better effect, where the AUC (area Under cut) value reaches 0.6888, and the less preferred is the GBMR method, where the AUC value is 0.6093. With the increasing recall rate, the overall accuracy tends to decrease. And when the recall rate is 0 to 0.8, the Precision index of the invention is obviously superior to other methods. The Precision index of the invention is not reduced to be below 0.6 until the recall rate is close to 0.8, which fully shows that the invention can realize detection more excellently, is closer to an artificial marking map, background rejection is more complete, and target form segmentation is more complete.
In short, the invention designs and provides a salient object detection method based on superpixel segmentation and depth feature positioning. The linearly iterated superpixel segmentation over five-dimensional Euclidean color similarity raises the processing unit of the image from individual pixels to regions of collective similarity, so the detected object can be separated cleanly from a complex background edge, solving the poor object-edge segmentation of traditional salient object detection methods. Image features such as color, orientation and depth are fully considered and combined with prior knowledge: human eyes attend to the center and ignore the surrounding background, the region containing the salient object has high feature similarity, and the salient object is unique with respect to global features. The algorithm simulates these features to generate the positioning saliency map and the depth saliency map of the input image, and fuses and boundary-processes the two to produce the final saliency map with a human-eye attention mechanism. The computer becomes more logical and more intelligent, with a logical understanding capacity similar to a human's, as well as more efficient and more robust. The detected image edges are clearer, background removal is more complete and the segmentation of the object's form is more complete. The method can be used in fields such as face recognition, vehicle detection, moving-object detection and tracking, military missile detection and hospital pathology detection.

Claims (4)

1. A salient object detection method based on superpixel segmentation and depth feature positioning is characterized by comprising the following steps:
step 1: perform linear iterative clustering segmentation on the input image: input a target image to be detected and first divide it into K regions; find the local gradient minimum point of each region's neighborhood as the center point, and set one label number per region; for each pixel, search its neighborhood for the center point at the smallest five-dimensional Euclidean distance and give that center's label to the pixel being processed; iterate the search for the nearest center until the pixels' label values no longer change, completing the superpixel segmentation;
step 2: construct differences of Gaussians to generate the positioning saliency map;
2a: filter the input original image with Gaussian functions to generate 8 hierarchical scale maps of the original image;
2b: combine the 8 constructed scale maps with the original image to form nine scale layers, and extract the red-green and blue-yellow color-difference maps of the nine layers, 18 color-difference maps in total; extract the intensity maps of the nine layers, 9 in total; extract the Gabor-filtered orientation maps of the nine layers, 36 in total; together these form three classes of feature maps;
2c: because the same class of features differs in size across the nine scale layers, interpolate the three classes of feature maps and then take their differences;
2d: because different classes of feature maps have different measurement scales, normalize them and then fuse them into the positioning saliency map;
step 3: generate the depth feature saliency map: first position the superpixel-segmented map according to the positioning saliency map of step 2, then, for each segmented region and its adjacent regions, gather three classes of feature information, namely nearest-neighbor region information, global region information and corner background region information, to generate the depth feature saliency map for detecting the salient object;
step 4: fuse and boundary-process the positioning saliency map and the depth feature saliency map finally determined through step 2 and step 3 to generate the final salient object map, completing the salient object detection of the superpixel segmentation.
2. The salient object detection method based on superpixel segmentation and depth feature positioning as claimed in claim 1, wherein the superpixel segmentation of the target image to be detected in step 1 comprises the following steps:
1.1 assume the target image has N pixels in total and the desired number of divided regions is K; each divided region then clearly has about N/K pixels, and the spacing between regions is about S = sqrt(N / K); the chosen center point may happen to lie exactly on an edge, and to avoid this the position with the smallest local gradient is searched around the chosen center and the center is moved to that position; a label number is set for each region as its mark;
1.2 respectively compute the Euclidean distance of the five-dimensional feature vector C_i = [l_i, a_i, b_i, x_i, y_i]^T from each pixel to the center points determined in its surrounding neighborhood, then assign the label number of the center with the smallest value to the pixel currently being processed; in the five-dimensional feature vector, l_i, a_i, b_i are the three color component values of the CIELAB space, representing lightness, the position between red and green, and the position between yellow and blue respectively, and x_i, y_i are the coordinates of the pixel in the target image to be detected; the distance is computed with the following three formulas:

d_lab = sqrt( (l_k - l_i)^2 + (a_k - a_i)^2 + (b_k - b_i)^2 )

d_xy = sqrt( (x_k - x_i)^2 + (y_k - y_i)^2 )

D_i = d_lab + (m / S) · d_xy

in the above formulas, d_lab represents the Euclidean distance between pixel k and center i in the CIELAB color space; d_xy represents the Euclidean distance between pixel k and center i in spatial coordinate position; D_i is the measure for evaluating whether pixel k and center i belong to one label: the smaller its value, the closer the similarity between pixel k and center i and the more their labels agree; m is a fixed parameter balancing the relation between the variables; S = sqrt(N / K) is the spacing between the different regions;
The above is an iteration cycle for setting the label number to which the pixel point belongs;
1.3 continue the iterative operation of step 1.2, further refining the accuracy of each pixel's label number, until the label number of every pixel in the whole image no longer changes;
1.4 after the iterative process, assign over-small independent areas and isolated single pixels to nearby label numbers, completing the superpixel segmentation of the target image.
3. The salient object detection method based on superpixel segmentation and depth feature positioning as claimed in claim 1, wherein gathering the three classes of feature information in step 3, namely nearest-neighbor region information, global region information and corner background region information, comprises the following steps:
3.1 for each region after the superpixel segmentation is completed, collect the information within its nearest neighboring regions, i.e. the nearest-neighbor region information;
3.2 for each divided region, collect the information contained in all regions other than the current one, i.e. the global region information;
3.3 for each divided region, collect the information of the four corner areas representing the background features, i.e. the corner background region information.
4. The salient object detection method based on superpixel segmentation and depth feature positioning as claimed in claim 1, wherein generating the depth feature saliency map using the three classes of region information in step 3 specifically comprises:
for a fully segmented region R, the quantized depth feature saliency of region R is:

s(R) = ∏_m s(R, ψ_m) = s(R, ψ_C) · s(R, ψ_G) · s(R, ψ_B)

wherein s(R) represents the saliency of region R; ∏ is the product of the several factors; s(R, ψ_C) represents the nearest-neighbor region information term; s(R, ψ_G) the global region information term; s(R, ψ_B) the corner background region information term;
low probability is the hallmark of a high amount of information, while high probability indicates that the information it carries is low; s(R, ψ_m) is defined as:

s(R, ψ_m) = -log( p(R | ψ_m) )

in the above formula, s(R, ψ_m) represents the nearest-neighbor region information, global region information or corner background region information term extracted under the depth feature; p is a probability value;
for three types of area information, namely nearest neighbor area information, global area information and corner background area information, the above formula is simplified by the area average value respectively:
Figure FDA0002188746150000041
in the above formula, d represents the depth average value of the currently processed region block R;
Figure FDA0002188746150000042
is as mentioned hereinbeforemDepth average of (2), whereinDepth average, n, of ith block region representing information characteristic of mth type regionmThere are three cases in total, nCRepresenting the total number of nearest neighbor area information, nGRepresenting the total number of global area information, nBRepresenting the total number of the corner background area information;
f(·) is estimated with a Gaussian function:

f(d - d_i^m) = exp(-(d - d_i^m)² / (2σ²))

in the above formula, σ represents the influence factor of the depth difference between different region blocks.
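Tying the claim-4 formulas together, here is a minimal sketch of the per-region saliency computation; depth_means (a mapping from region label to its depth average), the info dictionary from the previous sketch, and the sigma parameter are assumed inputs, not the patent's implementation.

```python
import numpy as np

def region_saliency(depth_means, info, r, sigma):
    """Depth-feature saliency s(R) of region r: p(R|psi_m) is the mean
    Gaussian response f(d - d_i^m) over the regions in each information
    set, s(R, psi_m) = -log(p(R|psi_m)), and s(R) is the product of the
    three terms."""
    d = depth_means[r]
    s = 1.0
    for key in ("nearest", "global", "corner"):   # psi_C, psi_G, psi_B
        idx = info[r][key]
        if not idx:                               # empty set: skip this term
            continue
        diffs = np.array([d - depth_means[i] for i in idx])
        p = np.mean(np.exp(-diffs ** 2 / (2.0 * sigma ** 2)))  # Gaussian estimate
        s *= -np.log(max(p, 1e-12))               # self-information, clamped
    return s
```

Under this reading, a higher s(R) marks region R as more likely to belong to the salient object, since its depth average differs from all three reference sets and therefore carries more information.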
CN201710255712.1A 2017-04-19 2017-04-19 Salient object detection method based on superpixel segmentation and depth feature positioning Active CN107169487B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710255712.1A CN107169487B (en) 2017-04-19 2017-04-19 Salient object detection method based on superpixel segmentation and depth feature positioning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710255712.1A CN107169487B (en) 2017-04-19 2017-04-19 Salient object detection method based on superpixel segmentation and depth feature positioning

Publications (2)

Publication Number Publication Date
CN107169487A CN107169487A (en) 2017-09-15
CN107169487B true CN107169487B (en) 2020-02-07

Family

ID=59812347

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710255712.1A Active CN107169487B (en) 2017-04-19 2017-04-19 Salient object detection method based on superpixel segmentation and depth feature positioning

Country Status (1)

Country Link
CN (1) CN107169487B (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107730515B (en) * 2017-10-12 2019-11-22 北京大学深圳研究生院 Increase the panoramic picture conspicuousness detection method with eye movement model based on region
CN109960977B (en) * 2017-12-25 2023-11-17 大连楼兰科技股份有限公司 Saliency preprocessing method based on image layering
CN108154129A (en) * 2017-12-29 2018-06-12 北京华航无线电测量研究所 Method and system are determined based on the target area of vehicle vision system
CN108427931B (en) * 2018-03-21 2019-09-10 合肥工业大学 The detection method of barrier before a kind of mine locomotive based on machine vision
CN110610184B (en) * 2018-06-15 2023-05-12 阿里巴巴集团控股有限公司 Method, device and equipment for detecting salient targets of images
CN109035252B (en) * 2018-06-29 2019-09-24 山东财经大学 A kind of super-pixel method towards medical image segmentation
CN109118493B (en) * 2018-07-11 2021-09-10 南京理工大学 Method for detecting salient region in depth image
CN109472259B (en) * 2018-10-30 2021-03-26 河北工业大学 Image collaborative saliency detection method based on energy optimization
CN109522908B (en) * 2018-11-16 2023-04-14 辽宁工程技术大学 Image significance detection method based on region label fusion
CN109636784B (en) * 2018-12-06 2021-07-27 西安电子科技大学 Image saliency target detection method based on maximum neighborhood and super-pixel segmentation
CN109886267A (en) * 2019-01-29 2019-06-14 杭州电子科技大学 A kind of soft image conspicuousness detection method based on optimal feature selection
CN109977767B (en) * 2019-02-18 2021-02-19 浙江大华技术股份有限公司 Target detection method and device based on superpixel segmentation algorithm and storage device
WO2021000302A1 (en) * 2019-07-03 2021-01-07 深圳大学 Image dehazing method and system based on superpixel segmentation, and storage medium and electronic device
CN110796650A (en) * 2019-10-29 2020-02-14 杭州阜博科技有限公司 Image quality evaluation method and device, electronic equipment and storage medium
CN112990226A (en) * 2019-12-16 2021-06-18 中国科学院沈阳计算技术研究所有限公司 Salient object detection method based on machine learning
CN111259936B (en) * 2020-01-09 2021-06-01 北京科技大学 Image semantic segmentation method and system based on single pixel annotation
CN112149688A (en) * 2020-09-24 2020-12-29 北京汽车研究总院有限公司 Image processing method and device, computer readable storage medium, computer device
CN112700438A (en) * 2021-01-14 2021-04-23 成都铁安科技有限责任公司 Ultrasonic damage judging method and system for inlaid part of train axle
CN113469976A (en) * 2021-07-06 2021-10-01 浙江大华技术股份有限公司 Object detection method and device and electronic equipment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7526115B2 (en) * 2004-02-23 2009-04-28 Siemens Medical Solutions Usa, Inc. System and method for toboggan based object segmentation using divergent gradient field response in images

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103729848A (en) * 2013-12-28 2014-04-16 北京工业大学 Hyperspectral remote sensing image small target detection method based on spectrum saliency
CN105760886A (en) * 2016-02-23 2016-07-13 北京联合大学 Image scene multi-object segmentation method based on target identification and saliency detection
CN106296695A (en) * 2016-08-12 2017-01-04 西安理工大学 Adaptive threshold natural target image based on significance segmentation extraction algorithm

Also Published As

Publication number Publication date
CN107169487A (en) 2017-09-15

Similar Documents

Publication Publication Date Title
CN107169487B (en) Salient object detection method based on superpixel segmentation and depth feature positioning
CN106447676B (en) A kind of image partition method based on fast density clustering algorithm
CN107862698B (en) Light field foreground segmentation method and device based on K mean cluster
Cheng et al. Outdoor scene image segmentation based on background recognition and perceptual organization
CN108537239B (en) Method for detecting image saliency target
CN107452010A A kind of automatic image matting algorithm and device
CN108573221A A kind of vision-based robot target part saliency detection method
CN107784663A (en) Correlation filtering tracking and device based on depth information
CN104134234A (en) Full-automatic three-dimensional scene construction method based on single image
CN103729885A (en) Hand-drawn scene three-dimensional modeling method combining multi-perspective projection with three-dimensional registration
CN110059760A (en) Geometric figure recognition methods based on topological structure and CNN
CN105068918B (en) A kind of page method of testing and device
CN108038435A (en) A kind of feature extraction and method for tracking target based on convolutional neural networks
CN107527054B (en) Automatic foreground extraction method based on multi-view fusion
CN103473551A (en) Station logo recognition method and system based on SIFT operators
CN110096961A (en) A kind of indoor scene semanteme marking method of super-pixel rank
CN104123554A (en) SIFT image characteristic extraction method based on MMTD
CN103399863B Image search method based on the bag of edge-direction-difference features
CN107609564A (en) Submarine target image-recognizing method based on joint segmentation and Fourier descriptor storehouse
Li et al. The research on traffic sign recognition based on deep learning
CN114140485A (en) Method and system for generating cutting track of main root of panax notoginseng
CN111738310B (en) Material classification method, device, electronic equipment and storage medium
CN109427068A Deep learning image segmentation method based on automatic superpixel labeling
CN111951162A (en) Image splicing method based on improved SURF algorithm
CN106203447B (en) Foreground target extraction method based on pixel inheritance

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant