CN105868789A - Object discovery method based on image area convergence measurement - Google Patents
- Publication number: CN105868789A
- Application number: CN201610212736.4A
- Authority: CN (China)
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
Abstract
The invention discloses an object discovery method based on an image-region cohesion measure, and relates to computer vision technology. The method rapidly generates a small number of object proposal windows while ensuring that the proposal windows contain as many of the objects in the image as possible, and it addresses the problems of object detection and saliency detection at the same time. The image-region cohesion measure proposed by the invention also applies to saliency detection, which is itself a fundamental computer vision task and is widely used in other computer vision tasks.
Description
Technical field
The present invention relates to computer vision technology, and in particular to an object discovery method based on
an image-region cohesion measure.
Background art
Vision is one of the most important sources through which humans perceive the world: studies show that about
80%–90% of the external information humans acquire comes through the eyes. Human visual perception of the
external world is highly efficient, able to locate and analyse targets quickly. One of the principal goals of
computer vision is to give computers object detection and recognition abilities comparable to those of humans.
Object detection is an essential prerequisite for visual perception and object understanding, and the efficiency
and precision of target acquisition determine the speed and quality of visual perception. In-depth research on
object detection in computer vision, and continual improvement of detection and recognition accuracy, therefore
has significant practical importance.
The academic approach to this problem has developed from heuristic methods towards machine learning methods, and
the features used have shifted from hand-crafted features to task-driven adaptively learned features. Models for
object detection and recognition have likewise begun to move from detecting and recognizing a single specific
object class towards performing multi-class detection and recognition simultaneously. The most typical example is
the emergence of deep learning models, which address the limitation of conventional detection and recognition
models that are effective only for narrowly defined tasks. For instance, the frontal face detection framework
proposed by Viola and Jones [1] in 2001, based on Haar features, was quite effective for face detection, but its
performance on profile faces and on pedestrian detection was poor. In 2005, Dalal et al. [2] proposed the HOG
(Histogram of Gradients) feature and the strategy of classifying the HOG feature of each sliding window with an
SVM, which brought a qualitative breakthrough in detecting upright pedestrians. However, hand-crafted features
such as HOG remain unsatisfactory for image classification and for detecting targets of arbitrary pose, such as
pedestrians, animals, and plants. Deformable Part Models [3] (DPM) then emerged to address the detection of
deformable objects. Although DPM attempts to solve missed detections caused by deformation, the deformable parts
its model requires are often difficult to capture in practice, since there is no good model or feature for
identifying the parts; its performance on multi-class detection datasets (such as PASCAL VOC and ImageNet) is
therefore not very good. The most recent breakthrough is the emergence of deep learning models. On ImageNet, the
largest image classification and object detection dataset, the detection and recognition accuracy achieved with
convolutional neural networks (CNNs), one family of deep learning models, exceeds the previous best by more than
a factor of two. In the past two years, almost all the winning algorithms on the ImageNet classification and
detection tasks have used convolutional neural networks, differing only in network architecture. The current best
accuracies on ImageNet image classification and object detection are about 95% and 55% respectively.
Although methods based on convolutional neural networks have raised the state of the art in object detection and
recognition, CNNs are structurally complex and computationally expensive, so their efficiency when applied to
detection is low, and many methods rely on GPUs to accelerate the detection programme. Given a target image,
detection with a sliding-window strategy has enormous algorithmic complexity and is extremely inefficient even
with GPU acceleration. To address the efficiency of CNNs in object detection, current mainstream solutions fall
into three classes. The first class is based on graph cuts [4]: the given image is first segmented, and the
segments yield a set of potential object regions; a CNN then extracts features from these regions and classifies
them, finally giving the object locations. The drawback of this approach is its dependence on the quality of the
image segmentation. The second class extracts features from the original image with a CNN and then applies a
sliding-window strategy on the feature map to regress object locations and classify objects [5]. When the CNN
extracts features from the full image, some feature information useful for classification and regression is lost,
so the resulting model cannot reach optimal performance. The third class uses the classification strength of CNNs
to find object parts and then builds a deformable model, detecting objects with the DPM approach [6]. However,
because the CNN classification and the deformable model are executed separately, the detection performance of the
overall framework is mediocre, and the efficiency of this kind of model is likewise not high.
List of references:
[1]P.Viola and M.Jones.Robust real time object detection.In IEEE ICCV
Workshop on Statistical and Computational Theories of Vision,2001.
[2]N.Dalal and B.Triggs,“Histograms of Oriented Gradients for Human
Detection,”Proc.IEEE Conf.Computer Vision and Pattern Recognition,2005.
[3]P.F.Felzenszwalb,R.B.Girshick,D.McAllester,and D.Ramanan,“Object
Detection with Discriminatively Trained Part Based Models,”IEEE Trans.Pattern
Analysis and Machine Intelligence,vol.32,no.9,pp.1627‐1645,Sept.2010.
[4]R.Girshick,J.Donahue,T.Darrell,and J.Malik.Rich feature
hierarchies for accurate object detection and semantic segmentation.In CVPR,
2014.
[5]P.Sermanet,D.Eigen,X.Zhang,M.Mathieu,R.Fergus,and
Y.LeCun.Overfeat:Integrated recognition,localization and detection using
convolutional networks.CoRR,2013.
[6]Ross B.Girshick,Forrest N.Iandola,Trevor Darrell,Jitendra
Malik.Deformable Part Models are Convolutional Neural Networks.CoRR,2014.
Summary of the invention
It is an object of the present invention to provide an object discovery method based on an image-region cohesion
measure that can rapidly generate a small number of object proposal windows while ensuring that the proposal
windows contain as many of the objects as possible.
The present invention comprises the following steps:
A. Given a colour image of height h and width w, denote each pixel by p_i(x, y), where i is the pixel index and
x and y are its row and column coordinates respectively. In RGB colour space this pixel p_i(x, y) is
p_i = <r_i, g_i, b_i>, where r, g, b are the values of the three RGB colour components. Based on a fixed-size
sliding window Ω_k containing p_i (where k = x × w + y is the window index, and the window size is usually 3×3
or 5×5), the local normalized vector of p_i is defined as:
where μ_k is the mean of the pixels in window Ω_k, σ_k is the variance of the pixels in window Ω_k, τ is a very
small constant preventing division by zero, and the operator / in (formula one) denotes element-wise division.
H(p_i) gives a local linear representation of each pixel.
Based on (formula one), the inner product of the local normalized vectors of any two pixels in window Ω_k is
defined as:
H(p_i)^T · H(p_j) = (p_i − μ_k)^T (Σ_k + τE)^{−1} (p_j − μ_k),  (formula two)
where Σ_k is the covariance of the pixels in window Ω_k, E is the identity matrix, and T denotes matrix
transposition. (Formula two) expresses the similarity between any two pixels in the window.
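The windowed similarity of (formula two) can be sketched as follows. This is an illustrative reconstruction in Python with NumPy, not the patent's implementation; the function name, the flattened window layout, and the use of the sample covariance are our own assumptions.

```python
import numpy as np

def window_similarity(window, tau=1e-5):
    """Whitened inner-product similarity between every pair of pixels in
    one small RGB window, following (formula two):
        S[i, j] = (p_i - mu_k)^T (Sigma_k + tau*E)^{-1} (p_j - mu_k).

    window: (n, 3) array, each row an RGB pixel vector from window k.
    tau: the small constant of sub-step A2, preventing a singular matrix.
    """
    mu = window.mean(axis=0)                      # window mean mu_k
    centered = window - mu                        # p_i - mu_k
    sigma = np.cov(centered, rowvar=False)        # covariance Sigma_k (3x3)
    inv = np.linalg.inv(sigma + tau * np.eye(3))  # (Sigma_k + tau*E)^{-1}
    return centered @ inv @ centered.T            # all pairwise inner products

# a 3x3 window flattened into 9 RGB pixel vectors
rng = np.random.default_rng(0)
win = rng.random((9, 3))
S = window_similarity(win)
assert S.shape == (9, 9)
assert np.allclose(S, S.T)   # the similarity is symmetric, as an inner product must be
```

The whitening by (Σ_k + τE)^{−1} is what gives the measure its illumination robustness discussed in step B.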
B. For a colour image of height h and width w, the similarities between any two pixels can be expressed as a
similarity matrix:
where C is a constant used to distinguish the regions inside window Ω_k whose similarity is 0 from the region
outside window Ω_k. The similarity matrix A is an N × N matrix, where N = w*h. The matrix D is a diagonal
matrix. By singular value decomposition (SVD), (formula two) (i.e. the numerator part of (formula three)) can be
decomposed into:
H(p_i)^T · H(p_j) = ω(U(p_i − μ_k))^T (U(p_j − μ_k)),  (formula four)
where each element of the weight vector ω corresponds to one eigenvalue of the covariance matrix Σ_k (the z-th
element to the z-th eigenvalue). The right-hand side of (formula four) expresses the inner product of pixels p_i
and p_j in the coordinate system whose axes are the column vectors of U. Moreover, the weight ω makes this inner
product robust to illumination changes; for example, when the same colour value is affected by illumination, its
brightness changes. Computing similarity directly from the raw pixel values would then yield false measurements;
but in a small window where the brightness variance of some colour component z is large, the corresponding
eigenvalue of the covariance matrix is also large, so dividing by this eigenvalue balances the effect of
illumination on the similarity value.
C. Compute the eigenvectors of the similarity matrix A from step B; this process is written formally as:
Av = λv  (formula five)
where v denotes an eigenvector and λ an eigenvalue. Each eigenvector v then represents one clustering result,
and the corresponding eigenvalue represents the cohesion measure of that clustering result. Each principal
component represents a potential object, or object part, in the image.
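The reading of eigenvectors as cluster indicators in step C can be illustrated on a toy similarity matrix; the following is a minimal sketch with NumPy, assuming (as in step B) that A is symmetric so an ordinary symmetric eigendecomposition applies. The two-block matrix stands in for a real image's similarity matrix and is our own construction.

```python
import numpy as np

# Toy similarity matrix A: pixels 0-2 form one mutually similar region,
# pixels 3-5 a second, weaker one; cross-region similarity is 0.
A = np.zeros((6, 6))
A[:3, :3] = 1.0
A[3:, 3:] = 0.5

eigvals, eigvecs = np.linalg.eigh(A)   # solves A v = lambda v (formula five)
order = np.argsort(eigvals)[::-1]      # largest cohesion measure first
v = eigvecs[:, order[0]]               # leading eigenvector

# The largest eigenvalue (3.0) is the cohesion of the strongest region,
# and its eigenvector is supported on exactly that region's pixels.
assert np.isclose(eigvals[order[0]], 3.0)
assert np.abs(v[:3]).min() > np.abs(v[3:]).max()
```

Each remaining eigenvector, in decreasing eigenvalue order, picks out the next most cohesive region, which is why step D turns each one into an object map.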
D. Convert each eigenvector from step C into two-dimensional image form, normalizing its values into [0, 255];
this process is defined as follows:
where v_k denotes the k-th eigenvector of the similarity matrix A, min(v) and max(v) take the minimum and
maximum values in the eigenvector v, and mod(k, w) denotes the remainder of k divided by w. The two-dimensional
V(x, y) is called an object map, and each eigenvector corresponds to one object map. Each object map therefore
also contains a potential object, or object part, of the image;
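Step D's conversion can be sketched as follows; a minimal NumPy version under the assumption that the flat pixel index follows the k = x × w + y convention of step A (so reshaping to h rows of w recovers V(x, y)). The helper name is ours, not the patent's.

```python
import numpy as np

def object_map(v, h, w):
    """Normalize an eigenvector of A into [0, 255] and reshape it into
    an h x w object map V(x, y), as in step D. The small epsilon guards
    against a constant eigenvector (max == min)."""
    v = np.asarray(v, dtype=float)
    scaled = 255.0 * (v - v.min()) / (v.max() - v.min() + 1e-12)
    return scaled.reshape(h, w)

# a length-6 eigenvector becomes a 2 x 3 object map
V = object_map(np.array([0.0, 0.5, 1.0, 0.25, 0.75, 0.5]), 2, 3)
assert V.shape == (2, 3)
assert V.min() == 0.0 and abs(V.max() - 255.0) < 1e-6
```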
E. Apply the object maps from step D to saliency detection. The purpose of saliency detection is to segment the
potential salient objects in the image; the salient objects in an object map are segmented by thresholding,
calculated by the following formula:
where T_c is a given threshold and V*(x, y) is the thresholding result. Since one object map represents only one
salient object in the image, or part of one, a combination of multiple object maps is also proposed in order to
find the whole salient objects as far as possible. This process is defined as follows:
where m_h is the mean of the outermost pixels of the object map and V_s is the composite object map. The
threshold calculation method of (formula seven) can also be applied to the composite object map V_s obtained by
(formula eight) to segment the salient objects;
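The thresholding step of (formula seven) can be sketched as follows. Since the formula itself is not reproduced in the source, this is one plausible reading (binarize at T_c); the function name is our own, and sub-step E1's sweep over all thresholds 0–255 is shown alongside.

```python
import numpy as np

def threshold_map(V, Tc):
    """Binarize an object map V at threshold Tc, as one reading of
    (formula seven): pixels above Tc are marked salient (1), others 0."""
    return (V > Tc).astype(np.uint8)

V = np.array([[10.0, 200.0],
              [130.0, 40.0]])
mask = threshold_map(V, 128)
assert mask.tolist() == [[0, 1], [1, 0]]

# sub-step E1: sweep every candidate threshold from 0 to 255 in turn
masks = [threshold_map(V, t) for t in range(256)]
assert len(masks) == 256
```

Sub-step E2 would then clean each binary mask with a simple morphological operation (e.g. opening) to remove noise.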
F. Apply the object maps from step D to object proposal window generation. The Canny edge operator is applied to
obtain the object edges in each object map, and the bounding rectangle of each set of connected edges is a
potential object proposal window. The potential proposal windows obtained from the multiple object maps undergo a
preliminary screening: for each pair of potential windows whose overlap rate exceeds 0.9, one is removed. The
remaining potential windows are the final object proposal windows.
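The screening in step F can be sketched as follows. The patent does not fix the exact overlap measure, so intersection-over-union is our assumption, and both function names are ours; edge extraction and bounding rectangles (the Canny stage) are taken as already done.

```python
def overlap_ratio(a, b):
    """Intersection-over-union of two boxes (x1, y1, x2, y2); IoU is an
    assumed reading of the patent's 'overlap rate'."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def screen_windows(boxes, thresh=0.9):
    """Preliminary screening of step F: of each pair of windows whose
    overlap exceeds thresh, keep only one (the first encountered)."""
    kept = []
    for b in boxes:
        if all(overlap_ratio(b, k) <= thresh for k in kept):
            kept.append(b)
    return kept

boxes = [(0, 0, 10, 10), (0, 0, 10, 10), (20, 20, 30, 30)]
assert screen_windows(boxes) == [(0, 0, 10, 10), (20, 20, 30, 30)]
```

The surviving boxes are the final object proposal windows.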
In step A, the similarity between any two pixels in the window can be measured as follows:
A1. when part of window Ω_k falls outside the image boundary, the corresponding pixels of Ω_k are ignored in the
calculation;
A2. the small constant τ preventing division by zero is set to 10^−5.
In step B, the calculation of the similarity matrix A includes the following sub-steps:
B1. when computing each element of the similarity matrix A, the constant C is set to 1;
B2. when computing each element of the similarity matrix A, elements that are computed more than once are
overwritten, in computation order, by the later value.
In step E, applying the object maps to saliency detection includes the following sub-steps:
E1. the threshold T_c in (formula seven) is chosen following common practice, by taking each threshold from 0 to
255 in turn to obtain the thresholding results;
E2. the binary mask obtained by thresholding an object map with the threshold from (formula seven) contains some
noise, and a simple morphological image algorithm is applied to the binary mask to remove the noise.
The present invention discloses a new object discovery method based on an image-region cohesion measure, comprising
the steps of: A. based on a small window of an image, defining a similarity measure between two pixels p_i and p_j
in the window, i, j = 1, …, N, where N is the number of image pixels and a natural number. B. Applying the
similarity measure defined in step A between every pair of pixels in the image; the similarities between all
pixels of the whole image then form an N × N similarity matrix A. The similarity matrix A is highly robust to
colour degradation caused by light and shadow changes; clustering the image with A therefore makes it easier to
segment out similar object regions. C. Computing the eigenvectors ν of the similarity matrix A; an eigenvector ν
generally corresponds to one object, or object part, in the image. D. Mapping the values in each eigenvector ν of
A into 0–255 and converting it into two-dimensional image form, called an object map V; each object map V
corresponds to one eigenvector ν of A. E. Applying the object maps V obtained in step D to saliency detection:
for each object map V, thresholding is used to segment out its salient object. F. Applying the object maps V
obtained in step D to object proposal (Object Proposal) generation: the Canny edge detection operator is applied
to each object map V, the bounding rectangles of all connected edges serve as potential object proposal windows,
and the final object proposal windows are obtained after removing windows whose overlap rate exceeds 0.9.
The image-region cohesion measure proposed by the present invention also applies to saliency detection, which is
itself regarded as a fundamental computer vision task and is widely used in other computer vision tasks. The
image-region cohesion measure of the present invention is thus used to solve the object detection and saliency
detection problems simultaneously.
Brief description of the drawings
Fig. 1 shows example object maps of an embodiment of the present invention.
Fig. 2 shows object proposal window generation results of an embodiment of the present invention.
Fig. 3 shows the PR (precision–recall) curves comparing the present invention with several other saliency
detection methods on the MSRA10K and THUR15K datasets. The curve labelled with the present invention corresponds
to the method of the present invention;
Method 1 corresponds to the method of J. Kim et al. (J. Kim, D. Han, Y.-W. Tai, and J. Kim. Salient region
detection via high-dimensional color transform. In CVPR, 2014.);
Method 2 corresponds to the method of M.-M. Cheng et al. (M.-M. Cheng, N. J. Mitra, X. Huang, P. H. S. Torr, and
S.-M. Hu. Global contrast based salient region detection. TPAMI, 37(3):569–582, 2014.);
Method 3 corresponds to the method of S. Goferman et al. (S. Goferman, L. Zelnik-Manor, and A. Tal. Context-aware
saliency detection. TPAMI, 34(10):1915–1926, 2012.);
Method 4 corresponds to the method of L. Itti et al. (L. Itti, C. Koch, and E. Niebur. A model of saliency-based
visual attention for rapid scene analysis. TPAMI, 20(11):1254–1259, 1998.);
Method 5 corresponds to the method of R. Achanta et al. (R. Achanta, F. Estrada, P. Wils, and S. Süsstrunk.
Salient region detection and segmentation. In ICVS, pages 66–75, 2008.);
Method 6 corresponds to the method of Y. Zhai et al. (Y. Zhai and M. Shah. Visual attention detection in video
sequences using spatiotemporal cues. In ACM MM, pages 815–824, 2006.);
Method 7 corresponds to the method of X. Hou et al. (X. Hou and L. Zhang. Saliency detection: A spectral residual
approach. In CVPR, pages 1–8, 2007.);
Method 8 corresponds to the method of J. Harel et al. (J. Harel, C. Koch, and P. Perona. Graph-based visual
saliency. In NIPS, pages 545–552, 2007.);
Method 9 corresponds to the method of N. D. B. Bruce et al. (N. D. B. Bruce and J. K. Tsotsos. Saliency,
attention, and visual search: An information theoretic approach. Journal of Vision, 9(3):1915–1926, 2009.);
Method 10 corresponds to the method of L. Itti et al. (L. Itti, C. Koch, and E. Niebur. A model of saliency-based
visual attention for rapid scene analysis. TPAMI, 20(11):1254–1259, 1998.).
Detailed description of the invention
The method of the present invention is elaborated below with reference to the accompanying drawings and an
embodiment. The embodiment is implemented on the premise of the technical scheme of the present invention, and an
embodiment and a specific operation process are given, but the protection scope of the present invention is not
limited to the following embodiment.
Referring to Fig. 1, the embodiment of the present invention comprises the following steps:
1. An object discovery method based on an image-region cohesion measure, characterised by comprising the
following steps:
A. Given a colour image of height h and width w, denote each pixel by p_i(x, y), where i is the pixel index and
x and y are its row and column coordinates respectively. In RGB colour space this pixel is p_i = <r_i, g_i, b_i>,
where r, g, b are the values of the three RGB colour components. Based on a fixed-size sliding window Ω_k
containing p_i (where k = x × w + y is the window index, and the window size is usually 3×3 or 5×5), the local
normalized vector of p_i is defined as:
where μ_k is the mean of the pixels in window Ω_k, σ_k is the variance of the pixels in window Ω_k, τ is a very
small constant preventing division by zero, and the operator / in (formula one) denotes element-wise division.
H(p_i) gives a local linear representation of each pixel.
Based on (formula one), the inner product of the local normalized vectors of any two pixels in window Ω_k is
defined as:
H(p_i)^T · H(p_j) = (p_i − μ_k)^T (Σ_k + τE)^{−1} (p_j − μ_k),  (formula two)
where Σ_k is the covariance of the pixels in window Ω_k, E is the identity matrix, and T denotes matrix
transposition. (Formula two) expresses the similarity between any two pixels in the window.
B. Based on step A, for a colour image of height h and width w, the similarities between any two pixels can be
expressed as a similarity matrix:
where C is a constant used to distinguish the regions inside window Ω_k whose similarity is 0 from the region
outside window Ω_k. The similarity matrix A is an N × N matrix, where N = w*h. The matrix D is a diagonal
matrix. By singular value decomposition (SVD), (formula two) (i.e. the numerator part of (formula three)) can be
decomposed into (formula four).
H(p_i)^T · H(p_j) = ω(U(p_i − μ_k))^T (U(p_j − μ_k)),  (formula four)
where each element of the weight vector ω corresponds to one eigenvalue of the covariance matrix Σ_k (the z-th
element to the z-th eigenvalue). The right-hand side of (formula four) expresses the inner product of pixels p_i
and p_j in the coordinate system whose axes are the column vectors of U. Moreover, the weight ω makes this inner
product robust to illumination changes. For example, when the same colour value is affected by illumination, its
brightness changes; computing similarity directly from the raw pixel values would then yield false measurements.
But in a small window where the brightness variance of some colour component z is large, the corresponding
eigenvalue of the covariance matrix is also large, so dividing by this eigenvalue balances the effect of
illumination on the similarity value.
C. Compute the eigenvectors of the similarity matrix A from step B; this process is written formally as:
Av = λv  (formula five)
where v denotes an eigenvector and λ an eigenvalue. Each eigenvector v then represents one clustering result,
and the corresponding eigenvalue represents the cohesion measure of that clustering result. Each principal
component represents a potential object, or object part, in the image.
D. Convert each eigenvector from step C into two-dimensional image form, normalizing its values into [0, 255];
this process is defined as follows:
where v_k denotes the k-th eigenvector of the similarity matrix A, min(v) and max(v) take the minimum and
maximum values in the eigenvector v, and mod(k, w) denotes the remainder of k divided by w. The two-dimensional
V(x, y) is called an object map, and each eigenvector corresponds to one object map. Each object map therefore
also contains a potential object, or object part, of the image. Some of the object maps are shown in Fig. 1,
where Fig. 1(a) shows the original input image and Figs. 1(b)–(g) show the object maps generated from the
eigenvectors corresponding to the six largest eigenvalues.
E. Apply the object maps from step D to saliency detection. The purpose of saliency detection is to segment the
potential salient objects in the image; the salient objects in an object map are segmented by thresholding,
calculated by the following formula:
where T_c is a given threshold and V*(x, y) is the thresholding result. Since one object map represents only one
salient object in the image, or part of one, a combination of multiple object maps is also proposed in order to
find the whole salient objects as far as possible. This process is defined as follows:
where m_h is the mean of the outermost pixels of the object map and V_s is the composite object map. The
threshold calculation method of (formula seven) can also be applied to the composite object map V_s obtained by
(formula eight) to segment the salient objects. The recall and precision curves comparing the method of the
present invention, applied to saliency detection, with several other saliency detection methods on the MSRA10K
and THUR15K datasets are shown in Fig. 3. Fig. 3(a) shows the recall and precision curves obtained on the
MSRA10K dataset by the method of the present invention using the first two eigenvectors and by several other
saliency detection methods; Fig. 3(b) shows the corresponding recall and precision curves on the THUR15K dataset.
F. Apply the object maps from step D to object proposal window generation. The Canny edge operator is applied to
obtain the object edges in each object map, and the bounding rectangle of each set of connected edges is a
potential object proposal window. The potential proposal windows obtained from the multiple object maps undergo a
preliminary screening: for each pair of potential windows whose overlap rate exceeds 0.9, one is removed. The
remaining potential windows are the final object proposal windows.
Table 1
Number of eigenvectors | 6 | 12 | 24 | 48 | 96 |
Recall | 0.2983 | 0.4326 | 0.6548 | 0.7770 | 0.9913 |
Precision | 0.0107 | 0.0055 | 0.0046 | 0.0012 | 0.0009 |
The final object proposal window results are shown in Fig. 2. The recall and precision values obtained with
different numbers of eigenvectors are given in Table 1.
Claims (4)
1. An object discovery method based on an image-region cohesion measure, characterised by comprising the
following steps:
A. Given a colour image of height h and width w, denote each pixel by p_i(x, y), where i is the pixel index and
x and y are its row and column coordinates respectively; in RGB colour space this pixel p_i(x, y) is
p_i = <r_i, g_i, b_i>, where r, g, b are the values of the three RGB colour components; based on a fixed-size
sliding window Ω_k containing p_i, where k = x × w + y is the window index and the window size is usually 3 × 3
or 5 × 5, the local normalized vector of p_i is defined as:
where μ_k is the mean of the pixels in window Ω_k, σ_k is the variance of the pixels in window Ω_k, τ is a very
small constant preventing division by zero, and the operator / in formula one denotes element-wise division;
H(p_i) gives a local linear representation of each pixel; based on formula one, the inner product of the local
normalized vectors of any two pixels in window Ω_k is defined as:
H(p_i)^T · H(p_j) = (p_i − μ_k)^T (Σ_k + τE)^{−1} (p_j − μ_k),  (formula two)
where Σ_k is the covariance of the pixels in window Ω_k, E is the identity matrix, and T denotes matrix
transposition; formula two expresses the similarity between any two pixels in the window;
B. for a height of h, in the coloured image of a width of w, the similarity between any two pixel can state a similarity as
Matrix:
Wherein C is a constant, is used for distinguishing window ΩkInterior similarity is region and the window Ω of 0kOutside region;Similar square
Battle array A is the matrix of a N × N, wherein N=w*h;Matrix D is a diagonal matrix, each of which elementPass through
Singular value decomposition, formula two, i.e. formula three molecular moiety, can resolve into:
H(p_i)^T · H(p_j) = Σ_z ω_z [U^T(p_i − μ_k)]_z [U^T(p_j − μ_k)]_z, (formula four)
where ω_z = 1/(λ_z + τ) is an element of the weight vector ω, and λ_z is the z-th eigenvalue of the covariance matrix Σ_k. The right-hand side of formula four represents the inner product of pixels p_i and p_j in the coordinate system whose axes are the column vectors of U. In addition, the weights ω make this inner product robust to changes in illumination: when the same color value is affected by illumination, its brightness value changes, so directly computing with raw pixel values would produce false detections. But in a small window where the brightness variance of a certain color component z is large, the corresponding eigenvalue λ_z of the covariance matrix is also large, so dividing by this eigenvalue balances the similarity value against the influence of illumination.
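The equivalence between formula two and the weighted eigen-form of formula four can be checked numerically. A sketch under assumed names (the window data is random and purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
pixels = rng.random((25, 3))          # RGB values of a 5x5 window Omega_k
tau = 1e-5
mu = pixels.mean(axis=0)
sigma = np.cov(pixels, rowvar=False)

# Formula two: (p_i - mu)^T (Sigma + tau*E)^(-1) (p_j - mu)
p_i, p_j = pixels[0], pixels[1]
direct = (p_i - mu) @ np.linalg.inv(sigma + tau * np.eye(3)) @ (p_j - mu)

# Formula four: project onto the eigenvectors U of Sigma and weight each
# coordinate by omega_z = 1 / (lambda_z + tau); components with large
# variance (e.g. brightness) are down-weighted, giving robustness to
# illumination changes.
lam, U = np.linalg.eigh(sigma)        # Sigma = U diag(lam) U^T
omega = 1.0 / (lam + tau)
weighted = np.sum(omega * (U.T @ (p_i - mu)) * (U.T @ (p_j - mu)))
```

Both expressions agree because (Σ_k + τE)^(−1) = U diag(1/(λ_z + τ)) U^T.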
C. Compute the eigenvectors of the similarity matrix A from step B. Formally, this process is written as:
Av = λv, (formula five)
where v denotes an eigenvector and λ the corresponding eigenvalue. Each eigenvector v then represents one clustering result, and its eigenvalue is the cohesion measure of that clustering result. Each principal component represents a potential target, or a target part, in the image.
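The role of formula five can be illustrated on a toy similarity matrix (the values below are invented for the example): when A has a block structure of cohesive regions, the leading eigenvector is supported on exactly one region, which is what the method exploits to isolate potential targets.

```python
import numpy as np

# Toy 6-pixel "image" with two cohesive regions: pixels 0-2 are strongly
# similar to each other (similarity 1.0), pixels 3-5 somewhat less (0.8),
# and there is no similarity across the two regions.
A = np.zeros((6, 6))
A[:3, :3] = 1.0
A[3:, 3:] = 0.8

lam, V = np.linalg.eigh(A)          # solves A v = lambda v (formula five)
v_top = V[:, np.argmax(lam)]        # eigenvector of the largest eigenvalue

# v_top is (up to sign) constant on pixels 0-2 and zero on pixels 3-5,
# i.e. it selects the most cohesive region as one candidate target.
```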
D. Convert each eigenvector from step C into two-dimensional image form, then normalize its values into [0, 255]. This process is defined as:
V(x, y) = 255 × (v_k(i) − min(v_k)) / (max(v_k) − min(v_k)), with x = ⌊i/w⌋ and y = mod(i, w), (formula six)
where v_k denotes the k-th eigenvector of the similarity matrix A, min(v) and max(v) take the minimum and maximum values in eigenvector v, and mod(i, w) is the remainder of i divided by w. The two-dimensional V(x, y) is called an object map, and each eigenvector corresponds to one object map. Each object map therefore also contains a potential target, or a target part, of the image.
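Step D amounts to reshaping the eigenvector with the k = x × w + y indexing from step A and rescaling to [0, 255]. A sketch of formula six under assumed names (the rounding mode and the small epsilon guarding a constant eigenvector are assumptions):

```python
import numpy as np

def to_object_map(v, h, w):
    """Reshape eigenvector v of the N = h*w similarity matrix into an
    h-by-w object map and normalize its values into [0, 255]."""
    V = v.reshape(h, w)               # element index i = x*w + y -> (x, y)
    V = (V - V.min()) / (V.max() - V.min() + 1e-12)
    return np.round(255 * V).astype(np.uint8)
```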
E. Apply the object maps from step D to saliency detection. The purpose of saliency detection is to segment the potential salient targets in the image. The salient target in an object map is segmented by thresholding, which can be calculated by the following formula:
V*(x, y) = 255 if V(x, y) > T_c, and V*(x, y) = 0 otherwise, (formula seven)
where T_c is a given threshold and V*(x, y) is the thresholding result. Since a single object map represents only one salient target, or one part of a salient target, in the image, a combination of multiple object maps is also proposed in order to find as many of the salient targets as possible. This process is defined as formula eight, in which m_h denotes the mean of the outermost pixels of an object map and V_s is the combined object map. The threshold calculation method of formula seven can also be applied to the combined object map V_s obtained by formula eight in order to segment the salient targets.
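The thresholding of formula seven can be sketched as follows; the example image and the use of 255 for foreground are assumptions for illustration:

```python
import numpy as np

def threshold_map(V, t_c):
    """Formula seven as described in step E: object-map values above the
    threshold T_c become salient foreground (255), the rest background (0)."""
    return np.where(V > t_c, 255, 0).astype(np.uint8)

# Claim E1 steps T_c through 0..255 one by one, producing one candidate
# segmentation per threshold value.
V = np.array([[10, 10, 200],
              [10, 200, 200],
              [10, 10, 10]], dtype=np.uint8)
candidates = [threshold_map(V, t_c) for t_c in range(256)]
```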
F. Apply the object maps from step D to the generation of target proposal windows. The Canny edge operator is applied to each object map to obtain the object edges, and the bounding rectangle of every connected edge is then a potential target proposal window. The potential proposal windows obtained from the multiple object maps are given a first screening: for every pair of potential windows whose overlap rate exceeds 0.9, one of the two is removed. The potential windows that finally remain are the resulting target proposal windows.
The target detection method based on image-region cohesion measurement described above, characterized in that, in step A, the similarity between any two pixels in the window is measured as follows:
A1. when some of the pixels of window Ω_k fall outside the image boundary, those pixels of Ω_k are ignored in the calculation;
A2. the small constant τ that prevents division by zero is set to 10^−5.
The target detection method based on image-region cohesion measurement described above, characterized in that, in step B, the calculation of the similarity matrix A includes the following sub-steps:
B1. when calculating each element of the similarity matrix A, the constant C is set to 1;
B2. when calculating each element of the similarity matrix A, a computation order is fixed so that, for elements that are computed repeatedly, the later computation overwrites the earlier value of the same element.
The target detection method based on image-region cohesion measurement described above, characterized in that, in step E, applying the object maps to saliency detection includes the following sub-steps:
E1. the threshold T_c in formula seven is chosen, following common practice, by stepping through the values 0 to 255 one by one and taking the corresponding thresholding results;
E2. the binary template obtained after thresholding an object map with the threshold of formula seven contains some noise, so a simple image-morphology algorithm is applied to the binary template to remove the noise.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610212736.4A CN105868789B (en) | 2016-04-07 | 2016-04-07 | A kind of target detection method estimated based on image-region cohesion |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105868789A true CN105868789A (en) | 2016-08-17 |
CN105868789B CN105868789B (en) | 2019-04-26 |
Family
ID=56636074
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610212736.4A Expired - Fee Related CN105868789B (en) | 2016-04-07 | 2016-04-07 | A kind of target detection method estimated based on image-region cohesion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105868789B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108734151A (* | 2018-06-14 | 2018-11-02 | Xiamen University | Robust long-term target tracking method based on correlation filtering and a deep Siamese network |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103247049A (* | 2013-05-15 | 2013-08-14 | Guilin University of Electronic Technology | SMT (Surface Mounting Technology) solder joint image segmentation method |
CN104424642A (* | 2013-09-09 | 2015-03-18 | Huawei Software Technologies Co., Ltd. | Detection method and detection system for video salient regions |
Non-Patent Citations (1)
Title |
---|
ZHOU Changxiong et al., "Image segmentation based on intra-region consistency and inter-region difference", Journal of Central South University *
Also Published As
Publication number | Publication date |
---|---|
CN105868789B (en) | 2019-04-26 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | | Granted publication date: 20190426 |