CN105868789B - Target detection method based on an image-region cohesion measure - Google Patents

Target detection method based on an image-region cohesion measure

Info

Publication number
CN105868789B
CN105868789B (application CN201610212736.4A)
Authority
CN
China
Prior art keywords
target
window
pixel
value
image
Prior art date
Legal status
Expired - Fee Related
Application number
CN201610212736.4A
Other languages
Chinese (zh)
Other versions
CN105868789A (en)
Inventor
王菡子
郭冠军
赵万磊
严严
沈春华
Current Assignee
Xiamen University
Original Assignee
Xiamen University
Priority date
Filing date
Publication date
Application filed by Xiamen University
Priority to CN201610212736.4A
Publication of CN105868789A
Application granted
Publication of CN105868789B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/23: Clustering techniques

Abstract

A target detection method based on an image-region cohesion measure, relating to computer vision technology. The method quickly produces a small number of object proposal windows, such that the proposal windows contain the targets as far as possible. Target detection and saliency detection are solved simultaneously: the proposed image-region cohesion measure is also applied to saliency detection, which is itself a basic computer vision task that is widely used in other computer vision tasks.

Description

Target detection method based on an image-region cohesion measure
Technical field
The present invention relates to computer vision technology, and in particular to a target detection method based on an image-region cohesion measure.
Background technique
Vision is one of the most important sources through which humans perceive the world; studies show that roughly 80%-90% of the information humans obtain from the outside world is visual information acquired through the eyes. Human perception of visual information is highly capable: targets can be located and analysed quickly. One of the main tasks of computer vision is to endow computers with target detection and recognition abilities comparable to those of humans. Target detection is an important prerequisite for visual perception and object understanding, and the efficiency and accuracy of target acquisition determine the speed and effectiveness of visual perception. In-depth study of target detection techniques in computer vision, continuously improving the accuracy of detection and recognition, is therefore of great practical significance.
The current trend in academia for this problem is a shift from heuristic methods to machine learning methods, and from hand-crafted features to features adaptively extracted for the task. Models for target detection and recognition have likewise shifted from handling a single specific target class to detecting and recognizing multiple classes simultaneously. The most typical example is the emergence of deep learning models, which address the limitation that conventional detection and recognition models are effective only for a restricted set of tasks. For instance, the frontal face detection framework proposed by Viola and Jones [1] in 2001, based on Haar features, is quite effective for face detection, but performs poorly on profile faces and on pedestrian detection. It was not until 2005, when Dalal et al. [2] proposed the HOG (Histogram of Oriented Gradients) feature together with the strategy of classifying the HOG feature of each sliding window with an SVM, that upright pedestrian detection achieved a qualitative breakthrough. However, hand-crafted features such as HOG still give unsatisfactory results for image classification and recognition and for detecting targets of arbitrary pose, such as pedestrians, animals and plants. Deformable Part Models (DPM) [3] then emerged to address the detection of deformable targets. Although DPM tries to solve detection failures caused by deformation, the deformable parts assumed by the model are in practice sometimes hard to capture, because there is neither a good model nor a good feature for recognizing the parts; its performance on multi-class detection datasets (such as PASCAL VOC and ImageNet) is therefore not very good. The most recent breakthrough is the emergence of deep learning models. On ImageNet, the largest image classification and target detection dataset, convolutional neural networks (CNNs), one family of deep learning models, have improved detection and recognition accuracy to more than double the previous best. In the last two years, almost all of the top-performing classification and detection algorithms on the ImageNet dataset have used convolutional neural networks, differing only in network structure. The current best accuracies on the ImageNet dataset are about 95% for image classification and 55% for target detection.
Although CNN-based methods achieve very high accuracy in target detection and recognition, convolutional neural networks are complex and computationally expensive, so their efficiency in target detection is not high; many current methods rely on GPU acceleration of the detection program. Given a target image, detection with a sliding-window strategy has very high algorithmic complexity and extremely low efficiency even with GPU acceleration. To address the efficiency of CNNs in target detection, current mainstream solutions fall into three classes. The first class is based on graph cuts [4]: the given image is first segmented, and the resulting segments provide candidate target regions; a CNN then extracts features from and classifies these regions to finally obtain the target positions. The drawback of this approach is its dependence on the performance of image segmentation. The second class extracts features from the original image with a CNN and then applies a sliding-window strategy on the feature map to regress target positions and classify targets [5]. When extracting features from the full image with a CNN, this approach loses some feature information useful for classification and regression, so the resulting model cannot reach optimal performance. The third class exploits the classification strength of CNNs to find parts, then builds a deformable model and detects targets with the deformable-model idea [6]. However, because target detection and the deformable-model CNN are executed separately in this class, the detection performance of the overall framework is mediocre, and the efficiency of this model is also not very high.
Bibliography:
[1] P. Viola and M. Jones. Robust real-time object detection. In IEEE ICCV Workshop on Statistical and Computational Theories of Vision, 2001.
[2] N. Dalal and B. Triggs. Histograms of Oriented Gradients for Human Detection. In Proc. IEEE Conf. Computer Vision and Pattern Recognition, 2005.
[3] P. F. Felzenszwalb, R. B. Girshick, D. McAllester, and D. Ramanan. Object Detection with Discriminatively Trained Part-Based Models. IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 32, no. 9, pp. 1627-1645, Sept. 2010.
[4] R. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. In CVPR, 2014.
[5] P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, and Y. LeCun. OverFeat: Integrated recognition, localization and detection using convolutional networks. CoRR, 2013.
[6] R. B. Girshick, F. N. Iandola, T. Darrell, and J. Malik. Deformable Part Models are Convolutional Neural Networks. CoRR, 2014.
Summary of the invention
The purpose of the present invention is to provide a target detection method based on an image-region cohesion measure that can quickly produce a small number of object proposal windows, such that the proposal windows contain the targets as far as possible.
The present invention comprises the following steps:
A. Given a color image of height h and width w, let p_i(x, y) be a pixel, where i is the pixel index and x and y are its abscissa and ordinate. The coordinates of the pixel p_i(x, y) in RGB color space are p_i = <r_i, g_i, b_i>, where r, g, b denote the values of the three RGB color components. Based on a fixed-size sliding window Ω_k containing p_i (where k = x × w + y is the window index, and the window size is typically 3×3 or 5×5), the local normalized vector of p_i is defined as:
H(p_i) = (p_i − μ_k) / (σ_k + τ), (formula one)
where μ_k is the mean of the pixels in window Ω_k, σ_k is the variance of the pixels in Ω_k, τ is a very small constant that prevents division by zero, and the operator / in (formula one) denotes element-wise division. H(p_i) gives a linear representation of each local pixel. Based on (formula one), the inner product of the local normalized vectors of any two pixels in window Ω_k is defined as:
H(p_i)^T · H(p_j) = (p_i − μ_k)^T (Σ_k + τE)^{-1} (p_j − μ_k), (formula two)
where Σ_k is the covariance of the pixels in window Ω_k, E is the identity matrix, and T denotes matrix transposition. (Formula two) expresses the similarity between any two pixels in the window.
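As a sketch, the similarity of (formula two) can be evaluated for one window directly from the window mean and covariance. The snippet below is a minimal illustration under stated assumptions: the function name and the random test patch (a 3×3 window flattened to nine RGB pixels) are ours, not from the patent; τ follows sub-step A2.

```python
import numpy as np

def window_similarity(patch, tau=1e-5):
    """Pairwise inner products of local normalized vectors (formula two)
    for all pixels in one window. `patch` is an (n, 3) array of RGB values."""
    mu = patch.mean(axis=0)                      # window mean, mu_k
    cov = np.cov(patch, rowvar=False)            # window covariance, Sigma_k
    inv = np.linalg.inv(cov + tau * np.eye(3))   # (Sigma_k + tau * E)^-1
    centered = patch - mu
    # S[i, j] = (p_i - mu_k)^T (Sigma_k + tau E)^-1 (p_j - mu_k)
    return centered @ inv @ centered.T

# A toy 3x3 window flattened to 9 RGB pixels (illustrative values)
rng = np.random.default_rng(0)
patch = rng.random((9, 3))
S = window_similarity(patch)
assert S.shape == (9, 9)
assert np.allclose(S, S.T)   # the similarity is symmetric in i and j
```

Because (Σ_k + τE)^{-1} is symmetric positive definite, the resulting matrix S is symmetric, matching the symmetric role of p_i and p_j in (formula two).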
B. For a color image of height h and width w, the similarities between all pairs of pixels can be expressed as a similarity matrix:
A_{ij} = H(p_i)^T · H(p_j) + C if p_i and p_j fall in a common window Ω_k, and 0 otherwise, (formula three)
where C is a constant used to distinguish regions inside a window Ω_k whose similarity is 0 from regions outside the window. The similarity matrix A is an N × N matrix, where N = w*h. By singular value decomposition (SVD), Σ_k + τE = U D U^T, where D is a diagonal matrix, so (formula two) (i.e. the numerator part of (formula three)) can be decomposed into:
H(p_i)^T · H(p_j) = Σ_z ω_z (U_z^T (p_i − μ_k)) (U_z^T (p_j − μ_k)), (formula four)
where ω_z = 1/(λ_z + τ) is an element of the weight vector ω, λ_z denotes the z-th eigenvalue of the covariance matrix Σ_k, and U_z is the corresponding column vector of U. The right-hand side of (formula four) expresses pixels p_i and p_j as an inner product in the coordinate system whose axes are the column vectors of U. In addition, the weight ω makes the inner product robust to illumination changes. For example, when the same color value is affected by illumination, its brightness value changes; directly comparing the raw pixel values would then produce false measurements. However, in a small window with larger brightness variance along a color component z, the corresponding eigenvalue λ_z of the covariance matrix is also larger, so dividing by this eigenvalue balances the influence of illumination on the similarity value.
C. Compute the eigenvectors of the similarity matrix A from step B, a computation formally expressed as:
Av = λv (formula five)
where v denotes an eigenvector and λ the corresponding eigenvalue. In general, each eigenvector v represents one clustering result, and the corresponding eigenvalue gives the cohesion measure value of that clustering. Each principal component represents a potential target, or target part, in the image.
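The eigendecomposition of (formula five) can be sketched on a toy similarity matrix. The 4-pixel matrix below is an invented example, chosen so that two strongly linked pixel pairs form two clusters; it is not data from the patent.

```python
import numpy as np

# Toy symmetric similarity matrix A for a tiny "image" of N = 4 pixels;
# each eigenvector of A is read as one clustering / potential object.
A = np.array([[1.0, 0.9, 0.1, 0.0],
              [0.9, 1.0, 0.1, 0.0],
              [0.1, 0.1, 1.0, 0.8],
              [0.0, 0.0, 0.8, 1.0]])
eigvals, eigvecs = np.linalg.eigh(A)      # solves A v = lambda v (formula five)
order = np.argsort(eigvals)[::-1]         # sort by cohesion measure (eigenvalue)
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
v1 = eigvecs[:, 0]                        # principal component
assert np.allclose(A @ v1, eigvals[0] * v1)
```

`numpy.linalg.eigh` is appropriate here because A is symmetric by construction; the eigenvalues are reordered descending so that the first eigenvector carries the largest cohesion measure value.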
D. Convert each eigenvector from step C into two-dimensional image format and normalize its values into [0, 255]; the process is defined as:
V(x, y) = 255 · (v(k) − min(v)) / (max(v) − min(v)), where x = ⌊k / w⌋ and y = mod(k, w), (formula six)
where v is an eigenvector of the similarity matrix A, k = x × w + y is the linear pixel index from step A, min(v) and max(v) take the minimum and maximum values in the eigenvector v, and mod(k, w) is the remainder of k divided by w. The two-dimensional V(x, y) is called an object map; each eigenvector corresponds to one object map. Each object map therefore also contains a potential target, or target part, of the image;
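A minimal sketch of the eigenvector-to-object-map conversion of (formula six), assuming row-major order so that element k of the vector lands at x = ⌊k/w⌋, y = mod(k, w); the function name, toy vector, and image size are illustrative, not from the patent.

```python
import numpy as np

def to_object_map(v, h, w):
    """Reshape eigenvector v (length h*w) to an h x w object map and
    normalize to [0, 255] (formula six); row-major layout is assumed."""
    V = v.reshape(h, w)                # V[x, y] = v[x*w + y]
    return 255.0 * (V - V.min()) / (V.max() - V.min())

v = np.linspace(-1.0, 1.0, 12)         # a toy eigenvector for a 3x4 image
V = to_object_map(v, 3, 4)
assert V.shape == (3, 4)
assert V.min() == 0.0 and V.max() == 255.0
```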
E. Apply the object maps from step D to saliency detection. The purpose of saliency detection is to segment the potential salient targets in an image; the salient target in an object map can be segmented by thresholding, computed as:
V*(x, y) = 1 if V(x, y) > T_c, and 0 otherwise, (formula seven)
where T_c is a given threshold and V*(x, y) is the thresholded result. Since one object map represents only one salient target, or part of one, in the image, a combination of multiple object maps is also proposed in order to find as many salient targets as possible; the combination is defined in (formula eight), where m_h represents the mean of the outermost pixels of an object map and V_s denotes the composite object map. The threshold computed as in (formula seven) can also be applied to the composite object map V_s obtained by (formula eight) to segment salient targets;
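The thresholding of (formula seven) reduces to a per-pixel comparison; a minimal sketch, where the function name, the threshold value, and the toy map are ours:

```python
import numpy as np

def threshold_map(V, tc):
    """Binarize an object map at threshold T_c (formula seven):
    V*(x, y) = 1 where V(x, y) > T_c, else 0."""
    return (V > tc).astype(np.uint8)

V = np.array([[10.0, 200.0],
              [30.0, 250.0]])
B = threshold_map(V, 128)
assert B.tolist() == [[0, 1], [0, 1]]
```

Sub-step E1 then simply scans T_c over 0..255 and keeps a thresholded result; the scoring rule for picking among candidates is not spelled out in the text.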
F. Apply the object maps from step D to the generation of object proposal windows. The Canny edge operator is applied to each object map to obtain object edges, and the bounding rectangle of each connected edge is a potential object proposal window. The potential proposal windows obtained from the multiple object maps are screened once: for each pair of potential windows whose overlap ratio exceeds 0.9, one of the two is removed. The remaining potential windows are the final object proposal windows.
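The pairwise 0.9 overlap screening of step F can be sketched as follows, interpreting "overlap rate" as intersection-over-union, which is an assumption on our part; the function names and the boxes, given as (x0, y0, x1, y1) tuples, are illustrative.

```python
def overlap_ratio(a, b):
    """Intersection-over-union of two boxes (x0, y0, x1, y1)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def screen_proposals(boxes, thr=0.9):
    """Keep a box only if it overlaps no already-kept box by more than thr."""
    kept = []
    for b in boxes:
        if all(overlap_ratio(b, k) <= thr for k in kept):
            kept.append(b)
    return kept

boxes = [(0, 0, 10, 10), (0, 0, 10, 9.5), (20, 20, 30, 30)]
kept = screen_proposals(boxes)
assert kept == [(0, 0, 10, 10), (20, 20, 30, 30)]
```

The second box overlaps the first with IoU 0.95 > 0.9 and is dropped; the disjoint third box survives, matching the "remove one of each heavily overlapping pair" rule.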
In step A, the similarity between any two pixels in the window is measured as follows:
A1. When some pixels of window Ω_k fall outside the image boundary, those pixels of Ω_k are ignored in the computation;
A2. The small constant τ that prevents division by zero is set to 10^-5.
In step B, the computation of the similarity matrix A includes the following sub-steps:
B1. When computing each element of the similarity matrix A, the constant C is set to 1;
B2. When computing each element of the similarity matrix A, an element that is computed repeatedly overwrites, in the given computation order, the previous value of the same element.
In step E, applying the object maps to saliency detection includes the following sub-steps:
E1. The threshold T_c in (formula seven) is chosen, following common practice, by trying thresholds from 0 to 255 one by one and taking the thresholded result;
E2. The binary mask obtained by thresholding an object map with the threshold from (formula seven) contains some noise; a simple morphological image algorithm is used to remove the noise from the binary mask.
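The "simple morphological image algorithm" of E2 is not specified in the text; below is one possible sketch, a morphological opening (erosion followed by dilation) with a 3×3 cross structuring element, which is our assumption. It removes isolated noise pixels while keeping larger blobs.

```python
import numpy as np

def erode(m):
    """3x3 cross erosion of a binary mask (out-of-image treated as 0)."""
    p = np.pad(m, 1)
    return (p[1:-1, 1:-1] & p[:-2, 1:-1] & p[2:, 1:-1]
            & p[1:-1, :-2] & p[1:-1, 2:])

def dilate(m):
    """3x3 cross dilation of a binary mask."""
    p = np.pad(m, 1)
    return (p[1:-1, 1:-1] | p[:-2, 1:-1] | p[2:, 1:-1]
            | p[1:-1, :-2] | p[1:-1, 2:])

def remove_noise(mask):
    """Morphological opening: erosion then dilation."""
    return dilate(erode(mask))

mask = np.zeros((7, 7), dtype=np.uint8)
mask[0, 0] = 1          # an isolated noise pixel
mask[2:6, 2:6] = 1      # a 4x4 target blob
clean = remove_noise(mask)
assert clean[0, 0] == 0   # speck removed
assert clean[3, 3] == 1   # blob interior kept
```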
The present invention discloses a new target detection method based on an image-region cohesion measure, comprising the steps of: A. based on a small window on an image, defining a similarity measure between two pixels p_i and p_j in the window, i, j = 1, ..., N, where N is the number of image pixels and a natural number. B. Applying the similarity measure defined in step A between every two pixels in the image; the similarities between all pixels of the whole image then constitute an N × N similarity matrix A. The similarity matrix A is very robust to the color degradation caused by changes of light and shade, so clustering the image with A more easily separates out similar target regions. C. Computing the eigenvectors ν of the similarity matrix A; in general, one eigenvector ν corresponds to one target, or target part, in the image. D. Mapping the values in each eigenvector ν of A into 0-255 and converting it into the form of a two-dimensional image, called a target map V (object map); one target map V corresponds to one eigenvector ν of A. E. Applying the target maps V obtained in step D to saliency detection: the salient target in each target map V is segmented by thresholding. F. Applying the target maps V obtained in step D to object proposal generation: edges are detected in each target map V with the Canny edge detection operator, the bounding rectangle of every connected edge serves as a potential object proposal window, and the final proposal windows are obtained after removing windows whose pairwise overlap exceeds 0.9.
The image-region cohesion measure proposed by the present invention is also applied to saliency detection, which is itself a basic computer vision task that is widely used in other computer vision tasks. The image-region cohesion measure of the invention thus solves target detection and saliency detection simultaneously.
Detailed description of the invention
Fig. 1 is the object diagram example diagram of the embodiment of the present invention.
Fig. 2 is that the target of the embodiment of the present invention suggests that window generates result figure.
Fig. 3 shows the PR curves comparing the present invention with several other saliency detection methods on the MSRA10K and THUR15K datasets. The curve labeled with the present method corresponds to the invention;
Method 1 corresponds to the method of J. Kim et al. (J. Kim, D. Han, Y.-W. Tai, and J. Kim. Salient region detection via high-dimensional color transform. In CVPR, 2014.);
Method 2 corresponds to the method of M.-M. Cheng et al. (M.-M. Cheng, N. J. Mitra, X. Huang, P. H. S. Torr, and S.-M. Hu. Global contrast based salient region detection. TPAMI, 37(3):569-582, 2014.);
Method 3 corresponds to the method of S. Goferman et al. (S. Goferman, L. Zelnik-Manor, and A. Tal. Context-aware saliency detection. TPAMI, 34(10):1915-1926, 2012.);
Method 4 corresponds to the method of L. Itti et al. (L. Itti, C. Koch, and E. Niebur. A model of saliency-based visual attention for rapid scene analysis. TPAMI, 20(11):1254-1259, 1998.);
Method 5 corresponds to the method of R. Achanta et al. (R. Achanta, F. Estrada, P. Wils, and S. Süsstrunk. Salient region detection and segmentation. In ICVS, pages 66-75, 2008.);
Method 6 corresponds to the method of Y. Zhai et al. (Y. Zhai and M. Shah. Visual attention detection in video sequences using spatiotemporal cues. In ACM MM, pages 815-824, 2006.);
Method 7 corresponds to the method of X. Hou et al. (X. Hou and L. Zhang. Saliency detection: A spectral residual approach. In CVPR, pages 1-8, 2007.);
Method 8 corresponds to the method of J. Harel et al. (J. Harel, C. Koch, and P. Perona. Graph-based visual saliency. In NIPS, pages 545-552, 2007.);
Method 9 corresponds to the method of N. D. B. Bruce et al. (N. D. B. Bruce and J. K. Tsotsos. Saliency, attention, and visual search: An information theoretic approach. Journal of Vision, 9(3):1915-1926, 2009.);
Method 10 corresponds to the method of L. Itti et al. (L. Itti, C. Koch, and E. Niebur. A model of saliency-based visual attention for rapid scene analysis. TPAMI, 20(11):1254-1259, 1998.).
Specific embodiment
The method of the invention is described in detail below with reference to the accompanying drawings and embodiments. The embodiment is implemented on the premise of the technical solution of the invention, and the implementation and specific operation process are given, but the protection scope of the invention is not limited to the following embodiment.
Referring to Fig. 1, the embodiment of the invention comprises the following steps:
1. A target detection method based on an image-region cohesion measure, characterized by comprising the following steps:
A. Given a color image of height h and width w, let p_i(x, y) be a pixel, where i is the pixel index and x and y are its abscissa and ordinate. The coordinates of the pixel in RGB color space are p_i = <r_i, g_i, b_i>, where r, g, b denote the values of the three RGB color components. Based on a fixed-size sliding window Ω_k containing p_i (where k = x × w + y is the window index, and the window size is typically 3×3 or 5×5), the local normalized vector of p_i is defined as:
H(p_i) = (p_i − μ_k) / (σ_k + τ), (formula one)
where μ_k is the mean of the pixels in window Ω_k, σ_k is the variance of the pixels in Ω_k, τ is a very small constant that prevents division by zero, and the operator / in (formula one) denotes element-wise division. H(p_i) gives a linear representation of each local pixel. Based on (formula one), the inner product of the local normalized vectors of any two pixels in window Ω_k is defined as:
H(p_i)^T · H(p_j) = (p_i − μ_k)^T (Σ_k + τE)^{-1} (p_j − μ_k), (formula two)
where Σ_k is the covariance of the pixels in window Ω_k, E is the identity matrix, and T denotes matrix transposition. (Formula two) expresses the similarity between any two pixels in the window.
B. Based on step A, for a color image of height h and width w, the similarities between all pairs of pixels can be expressed as a similarity matrix:
A_{ij} = H(p_i)^T · H(p_j) + C if p_i and p_j fall in a common window Ω_k, and 0 otherwise, (formula three)
where C is a constant used to distinguish regions inside a window Ω_k whose similarity is 0 from regions outside the window. The similarity matrix A is an N × N matrix, where N = w*h. By singular value decomposition (SVD), Σ_k + τE = U D U^T, where D is a diagonal matrix, so (formula two) (i.e. the numerator part of (formula three)) can be decomposed into:
H(p_i)^T · H(p_j) = Σ_z ω_z (U_z^T (p_i − μ_k)) (U_z^T (p_j − μ_k)), (formula four)
where ω_z = 1/(λ_z + τ) is an element of the weight vector ω, λ_z denotes the z-th eigenvalue of the covariance matrix Σ_k, and U_z is the corresponding column vector of U. The right-hand side of (formula four) expresses pixels p_i and p_j as an inner product in the coordinate system whose axes are the column vectors of U. In addition, the weight ω makes the inner product robust to illumination changes. For example, when the same color value is affected by illumination, its brightness value changes; directly comparing the raw pixel values would then produce false measurements. However, in a small window with larger brightness variance along a color component z, the corresponding eigenvalue λ_z of the covariance matrix is also larger, so dividing by this eigenvalue balances the influence of illumination on the similarity value.
C. Compute the eigenvectors of the similarity matrix A from step B, a computation formally expressed as:
Av = λv (formula five)
where v denotes an eigenvector and λ the corresponding eigenvalue. In general, each eigenvector v represents one clustering result, and the corresponding eigenvalue gives the cohesion measure value of that clustering. Each principal component represents a potential target, or target part, in the image.
D. Convert each eigenvector from step C into two-dimensional image format and normalize its values into [0, 255]; the process is defined as:
V(x, y) = 255 · (v(k) − min(v)) / (max(v) − min(v)), where x = ⌊k / w⌋ and y = mod(k, w), (formula six)
where v is an eigenvector of the similarity matrix A, k = x × w + y is the linear pixel index from step A, min(v) and max(v) take the minimum and maximum values in the eigenvector v, and mod(k, w) is the remainder of k divided by w. The two-dimensional V(x, y) is called an object map; each eigenvector corresponds to one object map. Each object map therefore also contains a potential target, or target part, of the image. Some object maps are shown in Fig. 1, where Fig. 1(a) is the original input image and Fig. 1(b)-(g) are the object maps generated from the eigenvectors of the six largest eigenvalues.
E. Apply the object maps from step D to saliency detection. The purpose of saliency detection is to segment the potential salient targets in an image; the salient target in an object map is segmented by thresholding, computed as:
V*(x, y) = 1 if V(x, y) > T_c, and 0 otherwise, (formula seven)
where T_c is a given threshold and V*(x, y) is the thresholded result. Since one object map represents only one salient target, or part of one, in the image, a combination of multiple object maps is also proposed in order to find as many salient targets as possible; the combination is defined in (formula eight), where m_h represents the mean of the outermost pixels of an object map and V_s denotes the composite object map. The threshold computed as in (formula seven) can also be applied to the composite object map V_s obtained by (formula eight) to segment salient targets. Fig. 3 shows the recall and precision curves comparing the method of the invention, applied to saliency detection, with several other saliency detection methods on the MSRA10K and THUR15K datasets. Fig. 3(a) shows the recall and precision curves obtained on the MSRA10K dataset by the method of the invention using the first two eigenvectors and by several other saliency detection methods. Fig. 3(b) shows the corresponding recall and precision curves on the THUR15K dataset.
F. Apply the object maps from step D to the generation of object proposal windows. The Canny edge operator is applied to each object map to obtain object edges, and the bounding rectangle of each connected edge is a potential object proposal window. The potential proposal windows obtained from the multiple object maps are screened once: for each pair of potential windows whose overlap ratio exceeds 0.9, one of the two is removed. The remaining potential windows are the final object proposal windows.
Table 1

    Feature vector number    6        12       24       48       96
    Recall rate              0.2983   0.4326   0.6548   0.7770   0.9913
    Precision                0.0107   0.0055   0.0046   0.0012   0.0009
The final object proposal windows are shown in Fig. 2. The recall and precision values obtained with different numbers of eigenvectors are given in Table 1.

Claims (4)

1. A target detection method based on an image-region cohesion measure, characterized by comprising the following steps:
A. Given a color image of height h and width w, let p_i(x, y) be a pixel, where i is the pixel index and x and y are its abscissa and ordinate; the coordinates of the pixel p_i(x, y) in RGB color space are p_i = <r_i, g_i, b_i>, where r, g, b denote the values of the three RGB color components; based on a fixed-size sliding window Ω_k containing p_i, where k = x × w + y is the window index and the window size is typically 3 × 3 or 5 × 5, the local normalized vector of p_i is defined as:
H(p_i) = (p_i − μ_k) / (σ_k + τ), (formula one)
where μ_k is the mean of the pixels in window Ω_k, σ_k is the variance of the pixels in Ω_k, τ is a very small constant that prevents division by zero, and the operator / in formula one denotes element-wise division; H(p_i) gives a linear representation of each local pixel; based on formula one, the inner product of the local normalized vectors of any two pixels in window Ω_k is defined as:
H(p_i)^T · H(p_j) = (p_i − μ_k)^T (Σ_k + τE)^{-1} (p_j − μ_k), (formula two)
where Σ_k is the covariance of the pixels in window Ω_k, E is the identity matrix, and T denotes matrix transposition; formula two expresses the similarity between any two pixels in the window;
B. For a color image of height h and width w, the similarities between all pairs of pixels can be expressed as a similarity matrix:
A_{ij} = H(p_i)^T · H(p_j) + C if p_i and p_j fall in a common window Ω_k, and 0 otherwise, (formula three)
where C is a constant used to distinguish regions inside a window Ω_k whose similarity is 0 from regions outside the window; the similarity matrix A is an N × N matrix, where N = w*h; by singular value decomposition, Σ_k + τE = U D U^T, where D is a diagonal matrix, so formula two, i.e. the numerator part of formula three, can be decomposed into:
H(p_i)^T · H(p_j) = Σ_z ω_z (U_z^T (p_i − μ_k)) (U_z^T (p_j − μ_k)), (formula four)
where ω_z = 1/(λ_z + τ) is an element of the weight vector ω, λ_z denotes the z-th eigenvalue of the covariance matrix Σ_k, and U_z is the corresponding column vector of U; the right-hand side of formula four expresses pixels p_i and p_j as an inner product in the coordinate system whose axes are the column vectors of U; in addition, the weight ω makes the inner product robust to illumination changes; for example, when the same color value is affected by illumination, its brightness value changes; directly comparing the raw pixel values would then produce false measurements; however, in a small window with larger brightness variance along a color component z, the corresponding eigenvalue λ_z of the covariance matrix is also larger, so dividing by this eigenvalue balances the influence of illumination on the similarity value;
C. Compute the eigenvectors of the similarity matrix A from step B, a computation formally expressed as:
Aν = λν (formula five)
where ν denotes an eigenvector and λ the corresponding eigenvalue; in general, each eigenvector ν represents one clustering result, and the corresponding eigenvalue gives the cohesion measure value of that clustering; each principal component represents a potential target, or target part, in the image;
D. Convert each eigenvector from step C into two-dimensional image format and normalize its values into [0, 255]; the process is defined as:
V(x, y) = 255 · (ν(k) − min(ν)) / (max(ν) − min(ν)), where x = ⌊k / w⌋ and y = mod(k, w), (formula six)
where ν is an eigenvector of the similarity matrix A, k = x × w + y is the linear pixel index from step A, min(ν) and max(ν) take the minimum and maximum values in the eigenvector ν, and mod(k, w) is the remainder of k divided by w; the two-dimensional V(x, y) is called an object map, and each eigenvector corresponds to one object map; each object map therefore also contains a potential target, or target part, of the image;
E. the object maps from step D are applied to saliency detection; the purpose of saliency detection is to segment the potentially salient targets in an image, and the salient target in an object map can be segmented by thresholding, calculated by the following formula:

V*(x, y) = 255 if V(x, y) > T_c, otherwise 0 (formula seven)

Wherein T_c denotes the given threshold and V*(x, y) the thresholded result; since one object map represents only one salient target, or part of one, in the image, a combination of multiple object maps is also proposed in order to find as many of the salient targets as possible; the combination is defined as follows:
Wherein m_h denotes the mean value of the outermost pixels of the h-th object map and V_s the combined object map; the threshold calculation method of formula seven can also be applied to the combined object map V_s obtained from formula eight to segment the salient targets;
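The thresholding step can be sketched in NumPy (illustrative only; whether the binary template stores 255 or 1 is an assumption):

```python
import numpy as np

def segment_salient(V, Tc):
    # Binary template V*: pixels of the object map V above the threshold
    # Tc are marked as salient target, the rest as background.
    return (V > Tc).astype(np.uint8)
```

The same function applies unchanged to the combined object map V_s.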
F. the object maps from step D are applied to the generation of target proposal windows; the Canny edge operator is applied to each object map to obtain the edges of objects, so the bounding rectangle of each connected edge is a potential target proposal window; the potential target windows obtained from the multiple object maps are screened once: of every pair of windows whose overlap rate exceeds 0.9, one is removed; the finally remaining potential target windows are the resulting target proposal windows.
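The pairwise overlap screening of step F can be sketched in plain Python (illustrative only; intersection-over-union is assumed as the overlap measure, which the claim does not specify):

```python
def iou(a, b):
    # Overlap (intersection over union) of two boxes (x1, y1, x2, y2).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def dedup_windows(boxes, thr=0.9):
    # Keep a window only if its overlap with every already-kept window
    # does not exceed thr; one of each overlapping pair is removed.
    kept = []
    for b in boxes:
        if all(iou(b, k) <= thr for k in kept):
            kept.append(b)
    return kept
```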
2. The target detection method based on image-region cohesion measurement according to claim 1, characterized in that in step A, the similarity between any two pixels in the window is measured as follows:
A1. when some pixels of window Ω_k fall outside the image boundary, those pixels of Ω_k are ignored in the calculation;
A2. the small constant τ used to prevent division by zero is set to 10^-5.
3. The target detection method based on image-region cohesion measurement according to claim 1, characterized in that in step B, the calculation of the similarity matrix A includes the following sub-steps:
B1. when calculating each element of the similarity matrix A, the constant C is set to 1;
B2. when calculating each element of the similarity matrix A, a computation order is fixed so that, for elements computed repeatedly, the later value overwrites the earlier value of the same element.
4. The target detection method based on image-region cohesion measurement according to claim 1, characterized in that in step E, applying the object maps to saliency detection includes the following sub-steps:
E1. the threshold T_c in formula seven is selected empirically by trying each value from 0 to 255 in turn and taking the corresponding thresholded result;
E2. the binary template obtained after thresholding the object map with the threshold from formula seven contains some noise; a simple morphological image algorithm is used to remove the noise from the binary template.
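For illustration (not part of the claims), a 3×3 morphological opening is one such "simple morphological algorithm"; the exact operation is not specified in the claim, so this pure-NumPy opening is an assumption:

```python
import numpy as np

def _erode3x3(mask):
    # A pixel survives erosion only if its full 3x3 neighborhood is set.
    p = np.pad(mask, 1)
    h, w = mask.shape
    out = np.ones_like(mask)
    for dx in (0, 1, 2):
        for dy in (0, 1, 2):
            out = out & p[dx:dx + h, dy:dy + w]
    return out

def _dilate3x3(mask):
    # A pixel is set after dilation if any 3x3 neighbor is set.
    p = np.pad(mask, 1)
    h, w = mask.shape
    out = np.zeros_like(mask)
    for dx in (0, 1, 2):
        for dy in (0, 1, 2):
            out = out | p[dx:dx + h, dy:dy + w]
    return out

def opening(mask):
    # Erosion followed by dilation removes isolated noise pixels while
    # restoring the shape of larger connected regions.
    return _dilate3x3(_erode3x3(mask))
```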
CN201610212736.4A 2016-04-07 2016-04-07 A kind of target detection method estimated based on image-region cohesion Expired - Fee Related CN105868789B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610212736.4A CN105868789B (en) 2016-04-07 2016-04-07 A kind of target detection method estimated based on image-region cohesion


Publications (2)

Publication Number Publication Date
CN105868789A CN105868789A (en) 2016-08-17
CN105868789B true CN105868789B (en) 2019-04-26

Family

ID=56636074


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108734151B (en) * 2018-06-14 2020-04-14 厦门大学 Robust long-range target tracking method based on correlation filtering and depth twin network

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103247049A (en) * 2013-05-15 2013-08-14 桂林电子科技大学 SMT (Surface Mounting Technology) welding spot image segmentation method
CN104424642A (en) * 2013-09-09 2015-03-18 华为软件技术有限公司 Detection method and detection system for video salient regions


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Image segmentation based on intra-region consistency and inter-region difference; Zhou Changxiong et al.; Journal of Central South University; 20050826; Vol. 35, No. 4; pp. 668-671



Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20190426