CN106446933B - Multi-target detection method based on contextual information - Google Patents
- Publication number
- CN106446933B CN106446933B CN201610785155.XA CN201610785155A CN106446933B CN 106446933 B CN106446933 B CN 106446933B CN 201610785155 A CN201610785155 A CN 201610785155A CN 106446933 B CN106446933 B CN 106446933B
- Authority
- CN
- China
- Prior art keywords
- target
- scene
- indicate
- image
- consistency
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2323—Non-hierarchical techniques based on graph theory, e.g. minimum spanning trees [MST] or graph cuts
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/29—Graphical models, e.g. Bayesian networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/50—Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
- G06V10/507—Summing image-intensity values; Histogram projection analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/35—Categorising the entire scene, e.g. birthday party or wedding scene
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Abstract
The invention discloses a multi-target detection method based on contextual information, comprising off-line training and on-line model matching. Using the Gist feature of an input image, the scene corresponding to the picture is selected according to the distances to the scene cluster centers, and the corresponding selection probability is obtained. By running the single-target base detectors (DPM) of all existing target classes, the corresponding detection windows and detection scores are obtained; combined with the Gist feature, the trained context model yields the target detection results. The method uses global context information to distinguish different scenes and then, according to the correlations among targets in each scene, forms a corresponding target detection model, effectively reducing mutual interference among targets across different scenes and further improving the accuracy of multi-target detection.
Description
Technical field
The present invention relates to a multi-target detection technique based on contextual information that can be applied in real time in multi-target detection systems.
Background technique:
Target detection based on images or video has been a research hotspot in computer vision for decades and will remain one for a considerable time; it is the foundation of visual analysis. The technique is widely applicable to target tracking, object detection and recognition, information security, autonomous driving, image retrieval, robotics, human-computer interaction, medical image analysis, the Internet of Things, and other subjects and engineering application fields.
Current object detection systems mainly realize the recognition and detection of different targets by modeling the appearance of the target itself. Such systems typically use hand-crafted features (e.g., HOG, LBP, SIFT) or deep features learned directly from the image itself to describe target appearance and use that appearance to perform detection. However, real-world detection mostly takes place in unconstrained, open environments that are complex and changeable, with interference such as illumination variation, viewpoint change, and target occlusion. Relying solely on the appearance information of the target itself, the target category cannot be judged accurately when the target provides very little information in the image or video.
Wang Yuehuan, Liu Chang, Chen Junling et al. of Huazhong University of Science and Technology, China, filed a patent for a "Target recognition method based on context constraint" with the State Intellectual Property Office of China on December 7, 2012, which was granted and published on April 17, 2013, publication number CN103049763A.
That publication discloses a target recognition method based on context constraints, used for remote-sensing image scene classification and for target detection and recognition. The method first filters the image and then performs region segmentation, dividing the image into multiple connected components, each of which is labeled. Next, a feature vector is computed for each connected component and input into a previously trained classifier for scene classification, and a category label map is output. On this basis, a local region where the target of interest may be present is delimited on the label map according to need, the local region is preprocessed, and a region of interest is computed within it. Finally, features are extracted and input into a classifier for recognition. That invention provides a fast and effective scene classification method intended to supply effective context constraints for target recognition and to improve recognition efficiency and accuracy. The algorithm flow is illustrated in Fig. 1 below.
The above patented technology still has shortcomings. Although it obtains a scene classification by segmenting and labeling regions, applies a global context constraint on that classification basis, computes regions of interest, extracts the corresponding feature vectors, and recognizes the respective targets with a trained classifier, it only uses the global scene context to obtain the probable regions of targets: it considers the relative position distribution of targets given the scene but ignores the co-occurrence relationships among targets. In addition, when a target itself carries little information, the target cannot be described accurately, and the corresponding detection cannot be obtained by the classifier.
Summary of the invention:
To address problems such as insufficient target self-information, the present invention uses relevant information outside the target in the picture or video to directly or indirectly provide auxiliary information for target detection, thereby improving the accuracy of target detection.
The technical solution adopted to realize the goal of the invention is a multi-target detection method based on contextual information, characterized by comprising off-line training and on-line model matching. Off-line training obtains the subtree models through the following steps:
Step 1: For the training set, label the image target classes with the LabelMe software to obtain training images with target annotations, and train a DPM detector for each target in the images.
Step 2: Compute the Gist features of the pictures in the training set to obtain global context information; then realize scene partitioning using an improved spectral clustering method.
Step 3: Represent the scene by a hidden variable; then, for each scene, obtain the co-occurrence and location distribution information of targets from the annotation results of the training pictures.
Step 4: By computing the mapping distribution of targets in a transformed space between pairs of training pictures, judge whether two targets are consistency targets, forming consistency target pairs.
Step 5: Using the co-occurrence and location distribution information from Step 3 and the consistency target pairs from Step 4, learn the tree structure with a weighted Chow-Liu algorithm, then train the parameters to obtain the subtree models.
On-line model matching:
Step 1: At detection time, first compute the Gist feature of the input image.
Step 2: Then, according to the Gist feature of the input image, assign the image to the corresponding scene subspaces from training and obtain the probability distribution over the corresponding scene subspaces.
Step 3: Then, obtain the detection score and detection window information of each target in the image with the trained DPM detectors of the different targets.
Step 4: Using the scene probability distribution from Step 2 and the detection scores and windows from Step 3, iteratively combine the subtree prior models learned by the off-line training part to seek the maximum a posteriori (MAP) estimate of whether each target is present and each detection correct, thereby correcting the detection results of the various DPM detectors and obtaining the final multi-target detection result.
In Step 2 of off-line training, the 520-dimensional Gist feature of every picture in the training set is obtained. The acquisition steps are: first, filter the image with a bank of Gabor filters of different scales and orientations to obtain a group of filtered images; then divide each filtered image into non-overlapping grids of fixed size and take the mean of each grid cell after division; finally, concatenate the grid means obtained from the image group into a global feature, yielding the final 520-dimensional Gist feature of the image. The expression is as follows:

G_j = cat(I_j ⊗ g_mn)   (1)

where G_j denotes the Gist feature of the j-th image, cat denotes feature concatenation, I_j denotes the grayscale map of the j-th image with an r × l grid division, g_mn denotes the Gabor filter at the n-th orientation of the m-th scale, ⊗ denotes the convolution of the image with the Gabor filter, and n_c denotes the number of convolution filters, of size m × n; the dimension of G_j is r × l × n_c. This scheme uses Gabor filters of 4 scales and 8 orientations.
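As a rough illustration, the Gist extraction steps above (Gabor filter bank, non-overlapping grid, per-cell means, concatenation) can be sketched as follows. The filter parameters and the 4 × 4 grid are assumptions for illustration; with 4 scales × 8 orientations this sketch yields a 512-dimensional descriptor rather than the 520 dimensions stated in the patent, whose exact grid configuration is not specified.

```python
import numpy as np

def gabor_kernel(ksize, sigma, theta, lam):
    """Real part of a Gabor filter: Gaussian envelope times a cosine carrier."""
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return np.exp(-(xr**2 + yr**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / lam)

def gist_descriptor(gray, scales=4, orientations=8, grid=(4, 4)):
    """Concatenate grid-cell means of Gabor filter responses (Gist-style)."""
    h, w = gray.shape
    feats = []
    for m in range(scales):
        for n in range(orientations):
            k = gabor_kernel(9, sigma=2.0 + m, theta=np.pi * n / orientations,
                             lam=4.0 * (m + 1))
            # frequency-domain convolution, same-size response map
            resp = np.abs(np.fft.ifft2(np.fft.fft2(gray) *
                                       np.fft.fft2(k, s=(h, w))))
            r, l = grid
            for i in range(r):
                for j in range(l):
                    cell = resp[i * h // r:(i + 1) * h // r,
                                j * w // l:(j + 1) * w // l]
                    feats.append(cell.mean())
    return np.array(feats)

g = gist_descriptor(np.random.rand(64, 64))
print(g.shape)  # (512,) = 4 x 4 grid x 32 filters
```

The descriptor length is r × l × n_c, matching the dimension formula above.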
Also in Step 2 of off-line training, 6-8 sub-scene classes are obtained using the improved spectral clustering method. The concrete steps are: first, input the Gist feature of every picture in the training set and use a Random Forest method to obtain a similarity matrix representing the similarity between the images in the training set; then, taking this similarity matrix as input, cluster the training pictures with spectral clustering, realizing the scene partitioning of the different training pictures.
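A minimal sketch of this step, assuming a random-trees embedding as the Random Forest similarity (two images are similar when they fall into the same tree leaves) and scikit-learn's spectral clustering with a precomputed affinity; the toy Gist vectors are illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomTreesEmbedding
from sklearn.cluster import SpectralClustering

def scene_partition(gist_feats, n_scenes=6, seed=0):
    """Random-forest leaf co-occurrence as similarity, then spectral clustering."""
    embed = RandomTreesEmbedding(n_estimators=50, random_state=seed)
    leaves = embed.fit_transform(gist_feats)      # sparse one-hot leaf codes
    sim = (leaves @ leaves.T).toarray() / 50.0    # fraction of shared leaves
    labels = SpectralClustering(n_clusters=n_scenes, affinity="precomputed",
                                random_state=seed).fit_predict(sim)
    return labels, sim

# toy data: two well-separated blobs of fake Gist vectors
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, .1, (20, 16)), rng.normal(5, .1, (20, 16))])
labels, sim = scene_partition(X, n_scenes=2)
```

Each sample shares all leaves with itself, so the similarity matrix has a unit diagonal, as a precomputed affinity should.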
In Step 3 of off-line training, consistency target pairs are incorporated into the subtree context model while training the subtree model. The concrete steps are:
(1)
Each component representation for obtaining a consistency target is as follows: (l_x(o_ik), l_y(o_ik)) denotes the center coordinates of the target frame of the k-th instance of class-i targets in picture o; the scale sc(o_ik) is represented by the square root of the frame area of that instance, and the viewpoint p(o_ik) is obtained from the aspect ratio of the target frame. Similarly, (l_x(q_il), l_y(q_il)) denotes the center coordinates of the target frame of the l-th instance of class-i targets in picture q, with scale sc(q_il) and viewpoint p(q_il). The variable h_r = (Δl_x, Δl_y, Δsc, Δp), r ∈ R, denotes the corresponding variation of same-class target variables across the two pictures in the four-dimensional space, where R denotes the set of same-class target correspondences between the two pictures of each candidate consistency target pair; Δl_x, Δl_y denote the mutual variation of target position, Δsc the mutual variation of target scale, and Δp the mutual variation of viewpoint. The mapping distribution computed by formula (2) judges whether the corresponding targets satisfy a consistency distribution; if so, the corresponding targets belong to the same target paradigm, i.e., form a consistency target pair.
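The four-dimensional variation described above can be sketched as follows. The variance-threshold consistency test (`tol`) is an assumption standing in for the mapping-distribution criterion of formula (2), which the text does not fully specify.

```python
import numpy as np

def instance_vec(box):
    """(cx, cy, scale, aspect) from an (x1, y1, x2, y2) target frame."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    return np.array([(x1 + x2) / 2, (y1 + y2) / 2, np.sqrt(w * h), w / h])

def variation(box_o, box_q):
    """4-D variation h_r between matched same-class instances in pictures o, q."""
    return instance_vec(box_q) - instance_vec(box_o)

def is_consistent_pair(matches, tol=10.0):
    """A candidate pair is consistent when all matched instances vary together,
    i.e. the spread of their 4-D variations stays below a tolerance (assumed)."""
    hs = np.array([variation(o, q) for o, q in matches])
    return bool(np.all(hs.std(axis=0) < tol))

# two instances that shift by the same amount between the two pictures
matches = [((10, 10, 30, 50), (20, 12, 40, 52)),
           ((60, 10, 80, 50), (70, 12, 90, 52))]
print(is_consistent_pair(matches))  # True: identical shifts, zero spread
```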
(2)
Greedy clustering is used to generate the final target-group sets under the different subspaces, and a soft-voting scheme avoids the conversion sensitivity of spatial partitioning and the redundancy caused by the same target appearing in multiple groups. If the frequency with which a target occurs in a target group does not exceed 50%, the target is rejected from that group. This ultimately forms the target groups under the different scene subspaces. On the basis of the formed target groups, within the same target group, pairwise combination of targets of different classes forms the consistency target pairs.
(3)
The proposed consistency target pairs, together with the co-occurrence and mutual position relationships between single targets, jointly describe the local context information among targets. The steps are: first, describe the correlation between a consistency target pair and each sub-scene:

θ_it = cf_it × isf_i   (3)

where cf_it denotes the frequency of the i-th consistency target pair in the t-th sub-scene and isf_i denotes the inverse scene frequency index of the i-th consistency target pair, expressed as follows:

isf_i = log(T / T_t) + ξ   (4)

where T denotes the total number of sub-scene types, T_t denotes the number of sub-scene types containing the i-th consistency target pair, and ξ is a small quantity that prevents isf_i from being 0. After all correlation coefficients θ_it are obtained, they are normalized.
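A small sketch of formulas (3) and (4), which resemble tf-idf weighting; the exact form of isf_i as log(T/T_t) + ξ is an assumption reconstructed from the surrounding description, as is the row-wise normalization.

```python
import numpy as np

def pair_scene_correlation(cf, xi=1e-6):
    """cf[i, t]: frequency of consistency pair i in sub-scene t.
    theta_it = cf_it * isf_i with isf_i = log(T / T_t) + xi (assumed form),
    then normalised over sub-scenes."""
    T = cf.shape[1]
    Tt = np.count_nonzero(cf > 0, axis=1)          # scenes containing pair i
    isf = np.log(T / np.maximum(Tt, 1)) + xi
    theta = cf * isf[:, None]
    return theta / np.maximum(theta.sum(axis=1, keepdims=True), 1e-12)

cf = np.array([[5., 0., 0.],    # pair 0 occurs only in scene 0 -> high isf
               [2., 2., 2.]])   # pair 1 occurs everywhere     -> isf ~ xi
theta = pair_scene_correlation(cf)
print(theta[0])  # [1. 0. 0.]: pair 0 is strongly tied to scene 0
```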
(4)
Using the annotation information of the training pictures, under each sub-scene t, a binary tree describing target co-occurrence and a Gaussian tree describing target position relationships are established; together they describe the prior subtree model.
The joint probability of whether all targets occur in the binary tree is expressed as:

p(b|z_t) = p(b_root|z_t) ∏_i p(b_i|b_pa(i), z_t)   (5)

where i denotes a node in the tree, pa(i) denotes the parent node of node i, and b_i ∈ {0, 1} indicates whether target i appears in the image; b ≡ {b_i} denotes all target classes; b_root denotes the root node of the subtree, and z_t is a discrete variable denoting the t-th sub-scene space.
The position L_i of target i depends on the occurrence of the target; the dependence relationships between positions have the same binary tree structure as target occurrence, expressed as follows:

p(L|b) = p(L_root|b_root) ∏_i p(L_i|L_pa(i), b_i, b_pa(i))   (6)

where L_root denotes the position of the root node and L_pa(i) denotes the position of the parent node.
So the joint distribution of the occurrence variables b and the positions L is expressed as:

p(b, L|z_t) = p(b|z_t) p(L|b)   (7)

where p(b|z_t) and p(L|b) are given by formulas (5) and (6).
(5) The detection results of the trained single-target DPM detectors and the Gist global feature are incorporated into the prior model; with the global feature denoted g, the joint distribution is then expressed as:

p(b, L, g, W, s, c|z_t) = p(b, L|z_t) p(g|b) ∏_{i,k} p(c_ik|b_i) p(W_ik|c_ik, L_i) p(s_ik|c_ik)   (9)

where W_ik denotes the position of the detected k-th single-target candidate window of target class i, s_ik denotes the score of the detected k-th single-target candidate window of target class i, and c_ik indicates whether the k-th candidate window of target class i is a correct detection, with value 1 if so and 0 otherwise.
(6) Training the subtree model mainly comprises learning the tree structure and learning the relevant parameters. When the Chow-Liu algorithm is used to learn the tree-structured prior model, the correlation θ_it between consistency target pairs and scenes described in formula (3) changes the mutual information S_i of the parent-child node pair of the target pair:

S_i = S_i × (1 + sigm(θ_it))   (11)

Then the structure learning of the subtree prior model is completed according to the maximum weights.
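The structure-learning step can be sketched as follows: pairwise mutual information between target-occurrence variables, boosted for consistency pairs per formula (11), followed by a maximum-weight spanning tree (Prim's algorithm, standing in for the Chow-Liu maximum-weight selection). The toy data and the boosting interface are illustrative assumptions.

```python
import numpy as np

def mutual_info(x, y):
    """Mutual information of two binary sequences (natural log)."""
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            pab = np.mean((x == a) & (y == b))
            pa, pb = np.mean(x == a), np.mean(y == b)
            if pab > 0:
                mi += pab * np.log(pab / (pa * pb))
    return mi

def chow_liu_tree(B, boosted_pairs=None):
    """Maximum-weight spanning tree over pairwise MI of occurrence matrix B
    (samples x classes); consistency-pair edges get S *= 1 + sigm(theta)."""
    sigm = lambda t: 1.0 / (1.0 + np.exp(-t))
    k = B.shape[1]
    S = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            S[i, j] = S[j, i] = mutual_info(B[:, i], B[:, j])
    for (i, j, theta) in (boosted_pairs or []):
        S[i, j] = S[j, i] = S[i, j] * (1 + sigm(theta))   # formula (11)
    # Prim's algorithm for the maximum spanning tree, rooted at node 0
    in_tree, edges = {0}, []
    while len(in_tree) < k:
        i, j = max(((a, b) for a in in_tree for b in range(k) if b not in in_tree),
                   key=lambda e: S[e])
        edges.append((i, j))
        in_tree.add(j)
    return edges

# toy occurrences: classes 0/1 always co-occur, class 2 is independent
rng = np.random.default_rng(1)
c01 = rng.integers(0, 2, 200)
B = np.stack([c01, c01, rng.integers(0, 2, 200)], axis=1)
print(chow_liu_tree(B))  # the first edge joins the correlated classes 0 and 1
```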
For the learning of the model parameters: first, p(b_i|b_pa(i)) in formula (8) is obtained from the statistics of target co-occurrence together with the consistency target pairs and the variation of their mutual information; p(L_i|L_pa(i), b_i, b_pa(i)) takes its value according to the occurrence of the parent and child nodes, divided into three cases — parent and child co-occur, only the child occurs, and the child does not occur — with a Gaussian distribution fitted for each case. p(g|b_i) is estimated from the Gist global feature of each training image in formula (9), specifically via p(g|b_i) = p(b_i|g) p(g) / p(b_i); for the global feature g, p(b_i|g) is estimated by the method of logistic regression.
To integrate the detection results of a single base detector, the probability p(c_ik|b_i) of a correct detection is considered first; its value is closely related to whether the target occurs, with the following form: when the target does not occur, the correct detection rate is 0; when the target occurs, the probability of a correct detection is the ratio of the number of correct detections to the total number of target labels in the training set.
Next, the location probability p(W_ik|c_ik, L_i) of the detection window is a Gaussian distribution depending on the correct detection c_ik and the position L_i of target class i: when the window is a correct detection, W_ik follows a Gaussian distribution with Λ_i denoting the variance of the predicted target position; when the window is not a correct detection, W_ik does not depend on L_i and can be expressed as a constant.
Finally, the score probability p(s_ik|c_ik) of the base detector depends on the correctness of the detection c_ik and is expressed via p(c_ik|s_ik), which is estimated by the method of logistic regression.
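A minimal sketch of the logistic-regression estimate of p(c_ik|s_ik) from detector scores; the synthetic scores are an assumption for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_score_calibration(scores, correct):
    """Fit p(c_ik = 1 | s_ik) from held-out detector scores and 0/1 labels."""
    lr = LogisticRegression()
    lr.fit(np.asarray(scores).reshape(-1, 1), correct)
    return lambda s: lr.predict_proba(np.asarray(s).reshape(-1, 1))[:, 1]

# synthetic detector scores: correct detections tend to score higher
rng = np.random.default_rng(2)
pos = rng.normal(1.5, 0.5, 300)   # scores of correct detections
neg = rng.normal(-1.5, 0.5, 300)  # scores of false positives
p_correct = fit_score_calibration(np.concatenate([pos, neg]),
                                  np.r_[np.ones(300), np.zeros(300)])
probs = p_correct([-3.0, 0.0, 3.0])
print(probs)  # low, near 0.5, high
```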
For the on-line matching part:
(1) At detection time, the Gist global feature of the input image j is first obtained by the method of formula (1).
(2) Then, according to the Gist feature of the input image, the image is assigned to the corresponding scene subspaces from training and the probability distribution over the corresponding scene subspaces is obtained. The probability of the t-th sub-scene is embodied as:

p_t = d_t^{-1} / Σ_{t'} d_{t'}^{-1}

where d_t^{-1} denotes the inverse of the distance from input picture j to the t-th sub-scene cluster center, and the denominator denotes the sum of the inverse distances to all cluster centers; the normalized probability p_t indicates the probability that picture j belongs to a certain sub-scene.
(3) The initial detection score and detection window information of each target in the image are obtained with the trained DPM detectors of the different targets.
(4) Using the sub-scene probability distribution from steps (2) and (3) together with the detection scores and windows of each target, the subtree prior models obtained by the off-line training part are combined iteratively to seek the MAP estimate of whether each target is present and each detection correct, thereby correcting the detection results of the various DPM detectors and obtaining the final multi-target detection result. It is obtained by iterative optimization of the following formula:

max over b, c, L of Σ_t p_t · p(b, L, g, W, s, c|z_t)
Beneficial effects of the present invention: to address problems such as insufficient target self-information, the present system uses relevant information outside the target in the picture or video, such as the scene information in which the target is located and the correlations between different targets, to directly or indirectly provide auxiliary information for target detection and improve its accuracy. The system uses the Gist global feature expressing global scene context information to realize scene selection; then, for the different scene subspaces, it incorporates the co-occurrence and positional relationships between single targets, proposes the concept of the consistency target pair, and integrates it as important local context information into the target detection model of the corresponding subtree structure. Through the consistency target pairs, the corresponding mutual-information weights are changed during the formation of the subtree target detection model, so the local context information of the consistency target pairs changes the structure of the subtree detection model. The method uses global context information to distinguish different scenes and then, according to the correlations among targets in each scene, forms a corresponding target detection model, effectively reducing mutual interference among targets across different scenes; and by introducing consistency target pairs, it strengthens the mutual constraints between targets and provides more robust local context information, further improving the accuracy of multi-target detection compared with existing systems.
Detailed description of the invention:
Fig. 1 is the flow chart of the prior art;
Fig. 2 is the flow chart of the present invention for vehicle target detection;
Fig. 3 is a schematic diagram of consistency target pair acquisition in the present invention;
Fig. 4 shows partial detection results of the multi-target detection method based on contextual information of the present invention.
Specific embodiment:
Because relevant information outside the detected target in the image or video, such as the scene information in which the target is located and the correlations between different targets and the detected target, can directly or indirectly provide auxiliary information and a richer characterization of the detected target, it can improve the accuracy of target detection. Based on this idea, the present invention proposes a multi-target detection system fusing multiple kinds of contextual information; the system consists of two parts, a scene selection layer and a subtree layer. First, the scene selection layer is obtained from the Gist global feature; then, within the corresponding sub-scene, the co-occurrence and positional relationships of single targets and consistency targets are used to describe the targets, and the subtree layer is obtained through a tree-structured probabilistic graphical model, so that multi-target detection is realized using both global and local contextual information. During training, first, in the scene selection layer, the Gist feature represents the global context information; using this feature, initial scene subsets are obtained with the improved spectral clustering method, and the subtree root node is selected under each subset. Then, under each subset, using the labeled training set images, the proposed consistency target pairs together with the co-occurrence and mutual position relationships between single targets jointly describe the local context information among targets, and with this local information the different subtree models are trained. When performing target detection, first the Gist feature of the input image is computed; in the scene selection layer, this feature selects the scene corresponding to the picture according to the distances to the scene cluster centers and obtains the corresponding selection probabilities. Then, by running the single-target base detectors (DPM) of all existing target classes, the corresponding target detection windows and detection scores are obtained; combined with the Gist feature, the trained context model yields the target detection results. Using the acquired local and global context information, the method reduces or removes erroneous results obtained by the appearance-based object detectors, corrects the single-target detection results, and obtains the final object detection results.
The present embodiment realizes the steps of the context-based multi-target detection system:
Off-line training part of the multi-target detection system: 1) First, for the training set, label the image target classes with the LabelMe software to obtain training images with target annotations. 2) Compute the Gist features of the pictures in the training set to obtain global context information; then realize scene partitioning with the improved spectral clustering method. 3) Represent the scene by a hidden variable; then, for each scene, obtain the co-occurrence and location distribution information of targets from the annotation results of the training pictures. 4) By computing the mapping distribution of targets in a transformed space between pairs of training pictures, judge whether two targets are consistency targets, forming consistency target pairs. 5) Using the co-occurrence and location distribution information from 3) and the consistency target pairs from 4), learn the tree structure with the weighted Chow-Liu algorithm, then train the parameters to obtain the subtree models.
On-line model matching
1) At detection time, first compute the Gist feature of the input image. 2) Then, according to the Gist feature of the input image, assign the image to the corresponding scene subspaces from training and obtain the probability distribution over the corresponding scene subspaces. 3) Then, obtain the detection score and detection window information of each target with the trained DPM detectors of the different targets. 4) Using the scene probability distribution from 2) and the detection scores and windows from 3), iteratively combine the subtree prior models obtained by the off-line training part to seek the MAP estimate of whether each target is present and each detection correct, thereby correcting the detection results of the various DPM detectors and obtaining the final multi-target detection result. (DPM detector: DPM (Deformable Parts Model) is an extremely successful object detection algorithm that won the VOC (Visual Object Classes) detection challenge for many consecutive years. It has become an important component of numerous classification, segmentation, human pose, and behavior classification systems. In 2010, its inventor Pedro Felzenszwalb was awarded a "Lifetime Achievement Award" by VOC. DPM can be regarded as an extension of HOG (Histograms of Oriented Gradients), and its general idea is consistent with HOG: first compute the histogram of oriented gradients, then train a gradient model (template) of the object with an SVM (Support Vector Machine). With such a template, classification can be performed directly; simply understood, it is matching the model against the object.)
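To make the HOG idea behind DPM concrete, a minimal unnormalized histogram-of-oriented-gradients sketch follows; real HOG (and DPM's variant of it) adds block normalization, interpolation, and part filters, all of which are omitted here, and the cell size and bin count are illustrative assumptions.

```python
import numpy as np

def hog_cells(gray, cell=8, bins=9):
    """Minimal histogram of oriented gradients: per-cell orientation histograms
    weighted by gradient magnitude (an unnormalised sketch of the HOG idea)."""
    gy, gx = np.gradient(gray.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180          # unsigned orientation
    h, w = gray.shape
    feats = []
    for i in range(0, h - cell + 1, cell):
        for j in range(0, w - cell + 1, cell):
            a = ang[i:i + cell, j:j + cell].ravel()
            m = mag[i:i + cell, j:j + cell].ravel()
            hist, _ = np.histogram(a, bins=bins, range=(0, 180), weights=m)
            feats.append(hist)
    return np.concatenate(feats)

f = hog_cells(np.tile(np.arange(32.0), (32, 1)))  # pure horizontal gradient
print(f.shape)  # (144,) = 16 cells x 9 bins
```

On this synthetic ramp image, all gradient energy falls into the first orientation bin, which is the behavior a HOG template would match against.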
The implementation process of this scheme is shown in Fig. 2.
For the above process, this scheme is elaborated as follows:
1. Off-line training obtains the subtree models
1) First, label the training set images with the LabelMe software to obtain training images containing target category and position information, and train the DPM detector of each target in the images.
2) Then compute the Gist feature of each sample in the training set to obtain the global context information of the sample images, and realize the partitioning of the different scenes based on the improved spectral clustering method. The detailed steps are:
(2.1)
Obtain the 520-dimensional Gist feature of every picture in the training set. The acquisition process is: first, filter the image with a bank of Gabor filters of different scales and orientations to obtain a group of filtered images; then divide each filtered image into non-overlapping grids of fixed size and take the mean of each grid cell after division; finally, concatenate the grid means obtained from the image group into a global feature, yielding the final 520-dimensional Gist feature of the image. The expression is as follows:

G_j = cat(I_j ⊗ g_mn)   (1)

where G_j denotes the Gist feature of the j-th image, cat denotes feature concatenation, I_j denotes the grayscale map of the j-th image with an r × l grid division, g_mn denotes the Gabor filter at the n-th orientation of the m-th scale, ⊗ denotes the convolution of the image with the Gabor filter, and n_c denotes the number of convolution filters, of size m × n; the dimension of G_j is r × l × n_c. This scheme uses Gabor filters of 4 scales and 8 orientations.
(2.2) For the obtained Gist features of all training pictures, 6-8 sub-scene classes are obtained with the improved spectral clustering method. The detailed process is: first, input the Gist feature of every picture in the training set and use a Random Forest method to obtain a similarity matrix representing the similarity between the images in the training set; then, taking this similarity matrix as input, cluster the training pictures with spectral clustering, realizing the scene partitioning of the different training pictures.
3) In each scene subspace, the corresponding subtree model under that scene is trained with a tree-structured probabilistic graphical model using the image subset obtained under the scene subspace. In training the subtree model, this scheme incorporates consistency targets to describe pairwise target relationships, proposing the subtree context model with consistency target pairs. The detailed process is as follows:
(3.1) First, according to the consistency distribution in spatial position, scale, and viewing angle of two neighboring targets of different classes in two different images under a scene subspace, the consistency target pairs under the scene subspace are obtained. The detailed process of obtaining consistency target pairs is shown in Figure 3. Each component is represented as follows:
wherein (l_x(o_ik), l_y(o_ik)) denotes the center coordinates of the bounding box of the k-th instance of class-i targets in image o; the scale sc(o_ik) is expressed as the square root of the bounding-box area, and the viewing angle p(o_ik) is obtained from the aspect ratio of the bounding box. Similarly, (l_x(q_il), l_y(q_il)) denotes the center coordinates of the bounding box of the l-th instance of class-i targets in image q, with scale sc(q_il) and viewing angle p(q_il). A variable r is used to express the corresponding variation of same-class target variables between the two images over the four-dimensional space, wherein r ∈ R denotes a correspondence and R denotes the set of same-class target correspondences between the two images of each candidate consistency pair; its components denote the mutual variation of target position, the mutual variation of target scale, and the mutual variation of aspect ratio. The mapping distribution computed by formula (2) is used to judge whether the corresponding target pair satisfies the consistency distribution; if so, the target pair belongs to the same target paradigm, i.e., it is a consistency target pair;
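Formula (2)'s mapping distribution is not reproduced in the text; the sketch below stands in for it with simple tolerance thresholds on the four-dimensional variation. `box_stats`, `variation`, and `is_consistent_pair` are hypothetical names, and the thresholds are illustrative assumptions:

```python
import math

def box_stats(box):
    """(x1, y1, x2, y2) -> center, scale = sqrt(area), viewing angle = aspect ratio."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    sc = math.sqrt((x2 - x1) * (y2 - y1))
    p = (x2 - x1) / float(y2 - y1)
    return cx, cy, sc, p

def variation(box_o, box_q):
    """r: change of the same-class target between images o and q in the 4-D space."""
    co, cq = box_stats(box_o), box_stats(box_q)
    return (cq[0] - co[0], cq[1] - co[1], cq[2] / co[2], cq[3] / co[3])

def is_consistent_pair(pair_o, pair_q, pos_tol=20.0, ratio_tol=1.3):
    """A candidate pair (two neighboring targets of different classes) is kept when the
    per-class variations between the two images agree, i.e. both targets moved,
    rescaled, and changed aspect alike."""
    ra = variation(pair_o[0], pair_q[0])
    rb = variation(pair_o[1], pair_q[1])
    pos_ok = abs(ra[0] - rb[0]) < pos_tol and abs(ra[1] - rb[1]) < pos_tol
    sc_ok = max(ra[2], rb[2]) / min(ra[2], rb[2]) < ratio_tol
    p_ok = max(ra[3], rb[3]) / min(ra[3], rb[3]) < ratio_tol
    return pos_ok and sc_ok and p_ok
```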
(3.2) Greedy clustering is used to generate the final target-group sets under the different subspaces. Soft voting is adopted to avoid the sensitivity of the space-transformation conversion and the redundancy of similar target groups. Meanwhile, to reduce the number of target groups, if a target appears in a target group with a frequency not exceeding 50%, that target is removed from the group. Through the above operations, the target groups under the different scene subspaces are finally formed. On the basis of the formed target groups, within the same group, consistency target pairs are formed by pairwise combination of targets of different classes.
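The pruning and pairing step described above can be sketched as follows, assuming "frequency" means the fraction of the group's images in which a target class appears (an interpretation, not stated explicitly in the text; `prune_and_pair` is a hypothetical name):

```python
from itertools import combinations

def prune_and_pair(group_occurrences):
    """group_occurrences: one set of target classes per image of a target group.
    Classes present in no more than 50% of the images are rejected; the survivors
    are then combined pairwise (distinct classes) into consistency candidate pairs."""
    n = len(group_occurrences)
    counts = {}
    for occ in group_occurrences:
        for cls in occ:
            counts[cls] = counts.get(cls, 0) + 1
    kept = sorted(cls for cls, k in counts.items() if k / n > 0.5)
    return list(combinations(kept, 2))
```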
(3.3) The proposed consistency target pairs, together with single-target co-occurrence and mutual position relations, jointly characterize the local context information between targets. First, the correlation between each consistency target pair and each sub-scene is characterized, as shown in the following formula:
θ_it = cf_it × isf_i (3)
wherein cf_it denotes the frequency with which the i-th consistency target pair appears in the t-th sub-scene, and isf_i denotes the inverse scene frequency index of the i-th consistency target pair, expressed as follows:
wherein T denotes the total number of sub-scene types, T_t denotes the number of sub-scene types containing the i-th consistency target pair, and ξ is a very small constant that prevents isf_i from taking the value 0. After all correlation coefficients θ_it are obtained, they are normalized.
(3.4) Using the annotation information of the training set pictures, under each sub-scene t, a binary tree describing target co-occurrence and a Gaussian tree describing target position relations are established; the two jointly characterize the prior subtree model.
The joint probability of whether all targets appear, under the binary tree, is expressed as:
p(b|z_t) = p(b_root|z_t) ∏_i p(b_i|b_pa(i), z_t) (5)
wherein i denotes a node in the tree, pa(i) denotes the parent node of node i, and b_i ∈ {0, 1} indicates whether target i appears in the image. b ≡ {b_i} denotes all target classes; b_root denotes the root node of the subtree, and z_t is a discrete variable denoting the t-th sub-scene space.
The position L_i of target i depends on the target's appearance; the dependency relation between positions has the same binary tree structure as target occurrence, expressed as:
p(L|b) = p(L_root|b_root) ∏_i p(L_i|L_pa(i), b_i, b_pa(i)) (6)
wherein L_root denotes the position of the root node and L_pa(i) denotes the position of the parent node.
Thus the joint distribution of the variables b and positions L is expressed as formula (7), whose per-node factor is expressed as formula (8).
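The tree factorization of formula (5) can be evaluated directly as a product over nodes. A minimal sketch, with all node names and probability tables hypothetical:

```python
def tree_presence_prob(parent, p_root, p_cond, b):
    """Formula (5): p(b|z_t) = p(b_root|z_t) * prod_i p(b_i | b_pa(i), z_t).
    parent: {node: parent or None for the root}; p_root[state] is the root prior;
    p_cond[node][(state, parent_state)] is the conditional table; b: {node: 0/1}."""
    prob = 1.0
    for node, pa in parent.items():
        if pa is None:
            prob *= p_root[b[node]]          # root factor
        else:
            prob *= p_cond[node][(b[node], b[pa])]  # child given parent
    return prob
```

Summing the product over all presence assignments gives 1, confirming the factorization defines a valid distribution; the position tree of formula (6) has the same structure with Gaussian factors in place of the tables.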
(3.5) the testing result of trained simple target detector DPM and Gist global characteristics are dissolved into elder generation
It tests in model, global characteristics are indicated with g, then its Joint Distribution indicates are as follows:
Wherein,It indicates are as follows:
WikIndicate the position of detected k-th of the simple target candidate window using target class i, sikIt indicates to utilize mesh
Mark the score value of detected k-th of the simple target candidate window of class i;cikIndicate target class i k-th of candidate window whether be
Correct detection, otherwise it is 0 that if it is value, which is 1,.
(3.6) Training the subtree model mainly comprises learning the tree structure and learning the relevant parameters. When the prior tree model is learned with the Chow-Liu algorithm, the correlation θ_it between the consistency target pairs and the scene, characterized in (3.3), is used to modify the mutual information S_i between the parent and child nodes of a target pair. It is embodied as:
S_i = S_i × (1 + sigm(θ_it)) (11)
Then, the structure learning of the prior subtree model is completed according to the maximum weights.
For learning the model parameters: first, p(b_i|b_pa(i)) in formula (8) is obtained by counting target co-occurrence together with the consistency target pairs and the mutual-information variation. p(L_i|L_pa(i), b_i, b_pa(i)) takes its value according to whether the parent and child nodes appear, divided into three cases: parent and child co-occur, only the child appears, and the child does not appear; its value is obtained from the corresponding Gaussian distribution. The concrete form is as follows:
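The Chow-Liu structure step with the reweighting of formula (11) can be sketched as a maximum spanning tree over the modified mutual-information weights. A minimal sketch with hypothetical edge weights (Kruskal's algorithm stands in for whatever spanning-tree routine the scheme actually uses):

```python
import math

def sigm(x):
    return 1.0 / (1.0 + math.exp(-x))

def reweight(mi, theta):
    """Formula (11): S' = S * (1 + sigm(theta)) for edges backed by a
    consistency pair; edges without a pair get theta = 0, i.e. a 1.5x factor."""
    return {e: w * (1.0 + sigm(theta.get(e, 0.0))) for e, w in mi.items()}

def max_spanning_tree(n, weights):
    """Kruskal's algorithm on the reweighted mutual information: greedily take
    the heaviest edges that do not close a cycle (union-find for components)."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    tree = []
    for (i, j), _ in sorted(weights.items(), key=lambda kv: -kv[1]):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j))
    return tree
```

In the toy case below, boosting edge (1, 2) with a large θ lifts it above its raw mutual information, so it enters the tree ahead of the otherwise heavier edge (0, 2).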
p(g|b_i) in formula (9) is estimated from the Gist global features of each training image, obtained specifically through the following formula: for the global feature g, p(b_i|g) is estimated by logistic regression.
To integrate the detection results of each single basic detector, first the probability p(c_ik|b_i) of a correct detection is computed; its value is closely related to whether the target appears, in the following form:
when the target does not appear, the correct detection rate is 0; when the target appears, the probability of correct detection is the ratio of the number of correct detections to the total number of target labels annotated in the training set.
Next, the position probability p(W_ik|c_ik, L_i) of the detection window is a Gaussian distribution depending on the correct detection c_ik and the position L_i of target class i, expressed as follows:
wherein, when the window is a correct detection, W_ik follows a Gaussian distribution and Λ_i denotes the variance of the predicted target position; when the window is not a correct detection, W_ik does not depend on L_i and can be expressed as a constant.
Finally, the score probability p(s_ik|c_ik) of the basic detector depends on the correct-detection result c_ik and is expressed as:
wherein p(c_ik|s_ik) is estimated by logistic regression.
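The logistic-regression calibration of p(c_ik|s_ik) amounts to fitting a one-dimensional sigmoid from detector scores to correctness labels. A minimal gradient-descent sketch (the scheme's actual solver is not specified; `fit_logistic_1d` is a hypothetical name):

```python
import math

def fit_logistic_1d(scores, labels, lr=0.5, epochs=2000):
    """Fit p(c = 1 | s) = sigm(a * s + b) by batch gradient descent on the
    logistic log-loss; returns the slope a and intercept b."""
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        ga = gb = 0.0
        for s, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(-(a * s + b)))
            ga += (p - y) * s   # d(loss)/da
            gb += (p - y)       # d(loss)/db
        a -= lr * ga / n
        b -= lr * gb / n
    return a, b
```

On separable toy data the slope comes out positive, so higher detector scores map to higher correctness probability, which is the behavior the model relies on.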
Online matching part:
4) At detection time, the method in 2) is first used to obtain the Gist global features of the input image j.
5) Then, according to the Gist features of the input image, the image is assigned to the corresponding scene subspaces obtained in training, and the probability distribution over the scene subspaces is obtained. The probability of each sub-scene is embodied as follows:
wherein the numerator denotes the inverse of the distance from the input picture j to the t-th sub-scene cluster center, and the denominator denotes the sum of the inverse distances to all cluster centers. The normalized probability expresses the probability that picture j belongs to each sub-scene.
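The inverse-distance normalization in step 5) is a one-liner in practice; a minimal sketch (`subscene_probs` is a hypothetical name):

```python
def subscene_probs(dists):
    """Probability of each sub-scene for an input image: the inverse of the
    distance to that sub-scene's cluster center, normalized by the sum of the
    inverse distances to all cluster centers."""
    inv = [1.0 / d for d in dists]
    tot = sum(inv)
    return [v / tot for v in inv]
```

An image twice as close to one center as to the others receives correspondingly more of the probability mass, and the probabilities sum to 1 by construction.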
6) The initial detection score and detection window information of each target in the image are obtained through the trained DPM detectors of the different targets.
7) Using the sub-scene probability distribution from 5) and the per-target detection scores and window information from 6), in an iterative manner, the subtree prior models obtained by offline training are combined to seek the maximum a posteriori (MAP) estimate of the targets' presence and of whether each detection is correct, thereby correcting the detection results of the per-class DPM detectors to obtain the final multi-target detection result. Specifically, it is obtained by iterative optimization of the following formula.
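The full MAP optimization over the subtree model is not spelled out in the text; the two-variable toy below only illustrates the alternating (iterative) style of inference assumed, with all probability tables hypothetical: fix presence b, update correctness c, then fix c and update b, maximizing p(b) · p(c|b) · p(s|c) for an observed score s.

```python
def coordinate_map(p_b, p_c_given_b, p_s_given_c, iters=5):
    """Toy coordinate ascent for the MAP of target presence b and detection
    correctness c. p_b[state], p_c_given_b[(c, b)], p_s_given_c[c] encode the
    prior, the correctness model, and the likelihood of the observed score."""
    b, c = 1, 1
    for _ in range(iters):
        c = max((0, 1), key=lambda cv: p_c_given_b[(cv, b)] * p_s_given_c[cv])
        b = max((0, 1), key=lambda bv: p_b[bv] * p_c_given_b[(c, bv)])
    return b, c
```

With a confident detector score the iteration settles on "target present, detection correct"; with a very weak score the same prior drives both variables to 0, i.e. the context model vetoes the detection.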
This scheme fuses contextual information and enriches the representation of targets; as shown in Figure 4, the multi-target detection method based on contextual information achieves satisfactory detection results.
Claims (4)
1. A multi-target detection method based on contextual information, characterized by comprising an offline training model and an online matching model,
Offline training to obtain the subtree models:
Step 1: first, for the training set, the image target classes in the training set are annotated using the LabelMe software to obtain the target-annotated training set images; and a DPM detector is trained for each target in the images;
Step 2: the Gist features of the pictures in the training set are computed to obtain global context information; then scene partitioning is realized using an improved spectral clustering method;
Step 3: the scene is represented by a hidden variable; then, under each scene, the co-occurrence and position distribution information of the targets is obtained according to the annotation results of the targets in the training pictures;
Step 4: by computing the mapping distribution of targets in the transformation space between two pictures of the training set, whether two targets form a consistency target pair is judged, forming consistency target pairs;
Step 5: using the co-occurrence and position distribution information obtained in Step 3 and the consistency target pairs obtained in Step 4, the tree structure is learned through the weighted Chow-Liu algorithm, and then the parameters are trained to obtain the subtree models;
Online matching model:
Step 1: at detection time, first, the Gist features of the input image are computed;
Step 2: then, according to the Gist features of the input image, the image is assigned to the corresponding scene subspaces obtained in training, and the probability distribution over the scene subspaces is obtained;
Step 3: then, the detection score and detection window information of each target in the image are obtained through the trained DPM detectors of the different targets;
Step 4: using the scene probability distribution obtained in Step 2 and the per-target detection scores and window information obtained in Step 3, in an iterative manner, the subtree prior models obtained by offline training are combined to seek the maximum a posteriori estimate of the targets' presence and of whether each detection is correct, thereby correcting the detection results of the per-class DPM detectors to obtain the final multi-target detection result.
2. A multi-target detection method based on contextual information, characterized in that the 520-dimensional Gist feature of every picture in the training set is obtained in Step 2 of the offline training for the subtree models, the acquisition process being: first, the image is filtered with a group of Gabor filter banks of different scales and orientations, obtaining a group of filtered images; then each filtered image is divided into non-overlapping grids of fixed size, and the mean value of each grid of the divided image is computed; finally, the grid mean values obtained from the image group are concatenated to form the global feature, yielding the final 520-dimensional Gist feature of the image, with the expression:
wherein the left-hand term denotes the Gist feature of the j-th image, cat denotes feature concatenation, I_j denotes the grayscale map of the j-th image with an r × l grid division, g_mn denotes the Gabor filter of the m-th scale and n-th orientation, the operator denotes the convolution of the image with the Gabor filter, n_c denotes the number of convolution filters, each of size m × n, and the resulting feature has dimension r × l × n_c.
3. The multi-target detection method based on contextual information according to claim 1, characterized in that: in Step 2 of the offline training for the subtree models, 6-8 classes of sub-scenes are obtained using the improved spectral clustering method, with the concrete steps: first, the Gist feature of every image in the training set is input, and a Random Forest method is used to obtain a similarity matrix expressing the similarity between the images in the training set; then, taking this similarity matrix as input, the training set pictures are clustered by spectral clustering, realizing the scene partitioning of the training set pictures.
4. The multi-target detection method based on contextual information according to claim 1, characterized in that: in Step 3 of the offline training for the subtree models, when the scene is represented by a hidden variable and the co-occurrence and position distribution information of the targets is then obtained under each scene according to the annotation results of the targets in the training pictures, a subtree context model of consistency target pairs is incorporated, with the concrete steps:
(1) first, according to the consistency distribution in spatial position, scale, and viewing angle of two neighboring targets of different classes in two different images under a scene subspace, the consistency target pairs under the scene subspace are obtained;
each component for obtaining a consistency target pair is represented as follows:
wherein (l_x(o_ik), l_y(o_ik)) denotes the center coordinates of the bounding box of the k-th instance of class-i targets in image o; the scale sc(o_ik) is expressed as the square root of the bounding-box area, and the viewing angle p(o_ik) is obtained from the aspect ratio of the bounding box; similarly, (l_x(q_il), l_y(q_il)) denotes the center coordinates of the bounding box of the l-th instance of class-i targets in image q, with scale sc(q_il) and viewing angle p(q_il); a variable r is used to express the corresponding variation of same-class target variables between the two images over the four-dimensional space, wherein r ∈ R denotes a correspondence and R denotes the set of same-class target correspondences between the two images of each consistency target pair, with components denoting the mutual variation of target position, the mutual variation of target scale, and the mutual variation of aspect ratio; the mapping distribution computed by formula (2) is used to judge whether the corresponding target pair satisfies the consistency distribution; if so, the target pair belongs to the same target paradigm, i.e., it is a consistency target pair;
(2) greedy clustering is used to generate the final target-group sets under the different subspaces; soft voting is adopted to avoid the sensitivity of the space-transformation conversion and the redundancy of similar target groups; if a target appears in a target group with a frequency not exceeding 50%, that target is removed from the group; the target groups under the different scene subspaces are finally formed; on the basis of the formed target groups, within the same group, consistency target pairs are formed by pairwise combination of targets of different classes;
(3) the proposed consistency target pairs, together with single-target co-occurrence and mutual position relations, jointly characterize the local context information between targets; the steps are: first, the correlation between each consistency target pair and each sub-scene is characterized:
θ_vt = cf_vt × isf_v (3)
wherein cf_vt denotes the frequency with which the v-th consistency target pair appears in the t-th sub-scene, and isf_v denotes the inverse scene frequency index of the v-th consistency target pair, expressed as follows:
wherein T denotes the total number of sub-scene types, T_t denotes the number of sub-scene types containing the v-th consistency target pair, and ξ is a very small constant that prevents isf_v from taking the value 0; after all correlation coefficients θ_vt are obtained, they are normalized;
(4) using the annotation information of the training set pictures, under each sub-scene t, a binary tree describing target co-occurrence and a Gaussian tree describing target position relations are established; the two jointly characterize the prior subtree model;
the joint probability of whether all targets appear, under the binary tree, is expressed as:
p(b|z_t) = p(b_root|z_t) ∏_w p(b_w|b_pa(w), z_t) (5)
wherein w denotes a node in the tree, pa(w) denotes the parent node of node w, and b_w ∈ {0, 1} indicates whether target w appears in the image; b ≡ {b_w} denotes all target classes; b_root denotes the root node of the subtree, and z_t is a discrete variable denoting the t-th sub-scene space;
the position L_w of target w depends on the target's appearance; the dependency relation between positions has the same binary tree structure as target occurrence, expressed as:
p(L|b) = p(L_root|b_root) ∏_w p(L_w|L_pa(w), b_w, b_pa(w)) (6)
wherein L_root denotes the position of the root node and L_pa(w) denotes the position of the parent node;
thus the joint distribution of variable b and position L is expressed as formula (7), whose per-node factor is expressed as formula (8);
(5) the detection results of the trained single-target DPM detectors and the Gist global features are incorporated into the prior model; denoting the global feature by g, the joint distribution is expressed as formula (9), whose detector factor is expressed as formula (10);
W_wk denotes the position of the k-th single-target candidate window detected for target class w, and s_wk denotes the score of that window; c_wk indicates whether the k-th candidate window of target class w is a correct detection: its value is 1 if so, and 0 otherwise;
(6) training the subtree model mainly comprises learning the tree structure and learning the relevant parameters; when the prior tree model is learned with the Chow-Liu algorithm, the correlation θ_wt between the consistency target pairs, characterized in formula (3), and the scene is used to modify the mutual information S_w between the parent and child nodes of a target pair:
S_w = S_w × (1 + sigm(θ_wt)) (11)
then, the structure learning of the prior subtree model is completed according to the maximum weights;
for learning the model parameters: first, p(b_w|b_pa(w)) in formula (8) is obtained by counting target co-occurrence together with the consistency target pairs and the mutual-information variation; p(L_w|L_pa(w), b_w, b_pa(w)) takes its value according to whether the parent and child nodes appear, divided into three cases: parent and child co-occur, only the child appears, and the child does not appear; its value is obtained from the corresponding Gaussian distribution:
p(g|b_w) in formula (9) is estimated from the Gist global features of each training image, obtained specifically through the following formula: for the global feature g, p(b_w|g) is estimated by logistic regression;
to integrate the detection results of each single basic detector, first the probability p(c_wk|b_w) of a correct detection is computed; its value is closely related to whether the target appears, in the following form:
when the target does not appear, the correct detection rate is 0; when the target appears, the probability of correct detection is the ratio of the number of correct detections to the total number of target labels annotated in the training set;
next, the position probability p(W_wk|c_wk, L_w) of the detection window is a Gaussian distribution depending on the correct detection c_wk and the position L_w of target class w, expressed as follows:
wherein, when the window is a correct detection, W_wk follows a Gaussian distribution and Λ_w denotes the variance of the predicted target position; when the window is not a correct detection, W_wk does not depend on L_w and can be expressed as a constant;
finally, the score probability p(s_wk|c_wk) of the basic detector depends on the correct-detection result c_wk and is expressed as:
wherein p(c_wk|s_wk) is estimated by logistic regression.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610785155.XA CN106446933B (en) | 2016-08-31 | 2016-08-31 | Multi-target detection method based on contextual information |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106446933A CN106446933A (en) | 2017-02-22 |
CN106446933B true CN106446933B (en) | 2019-08-02 |
Family
ID=58091496
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610785155.XA Expired - Fee Related CN106446933B (en) | 2016-08-31 | 2016-08-31 | Multi-target detection method based on contextual information |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106446933B (en) |
Families Citing this family (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106951574B (en) * | 2017-05-03 | 2019-06-14 | 牡丹江医学院 | A kind of information processing system and method based on computer network |
CN107832795B (en) * | 2017-11-14 | 2021-07-27 | 深圳码隆科技有限公司 | Article identification method and system and electronic equipment |
CN108062531B (en) * | 2017-12-25 | 2021-10-19 | 南京信息工程大学 | Video target detection method based on cascade regression convolutional neural network |
CN109977738B (en) * | 2017-12-28 | 2023-07-25 | 深圳Tcl新技术有限公司 | Video scene segmentation judging method, intelligent terminal and storage medium |
CN108363992B (en) * | 2018-03-15 | 2021-12-14 | 南京钜力智能制造技术研究院有限公司 | Fire early warning method for monitoring video image smoke based on machine learning |
CN109241819A (en) * | 2018-07-07 | 2019-01-18 | 西安电子科技大学 | Based on quickly multiple dimensioned and joint template matching multiple target pedestrian detection method |
CN110288629B (en) * | 2019-06-24 | 2021-07-06 | 湖北亿咖通科技有限公司 | Target detection automatic labeling method and device based on moving object detection |
CN110334639B (en) * | 2019-06-28 | 2021-08-10 | 北京精英系统科技有限公司 | Device and method for filtering error detection result of image analysis detection algorithm |
CN111079674B (en) * | 2019-12-22 | 2022-04-26 | 东北师范大学 | Target detection method based on global and local information fusion |
CN111080639A (en) * | 2019-12-30 | 2020-04-28 | 四川希氏异构医疗科技有限公司 | Multi-scene digestive tract endoscope image identification method and system based on artificial intelligence |
CN111814885B (en) * | 2020-07-10 | 2021-06-22 | 云从科技集团股份有限公司 | Method, system, device and medium for managing image frames |
CN112052350B (en) * | 2020-08-25 | 2024-03-01 | 腾讯科技(深圳)有限公司 | Picture retrieval method, device, equipment and computer readable storage medium |
CN112148267A (en) * | 2020-09-30 | 2020-12-29 | 深圳壹账通智能科技有限公司 | Artificial intelligence function providing method, device and storage medium |
CN112395974B (en) * | 2020-11-16 | 2021-09-07 | 南京工程学院 | Target confidence correction method based on dependency relationship between objects |
CN113138924B (en) * | 2021-04-23 | 2023-10-31 | 扬州大学 | Thread safety code identification method based on graph learning |
CN112906696B (en) * | 2021-05-06 | 2021-08-13 | 北京惠朗时代科技有限公司 | English image region identification method and device |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103577832A (en) * | 2012-07-30 | 2014-02-12 | 华中科技大学 | People flow statistical method based on spatio-temporal context |
CN104778466A (en) * | 2015-04-16 | 2015-07-15 | 北京航空航天大学 | Detection method combining various context clues for image focus region |
CN104933735A (en) * | 2015-06-30 | 2015-09-23 | 中国电子科技集团公司第二十九研究所 | A real time human face tracking method and a system based on spatio-temporal context learning |
CN105631895A (en) * | 2015-12-18 | 2016-06-01 | 重庆大学 | Temporal-spatial context video target tracking method combining particle filtering |
CN105740891A (en) * | 2016-01-27 | 2016-07-06 | 北京工业大学 | Target detection method based on multilevel characteristic extraction and context model |
Non-Patent Citations (2)
Title |
---|
A Tree-Based Context Model for Object Recognition; M. J. Choi et al.; TPAMI; 20121231; full text |
Multi-moving-target tracking algorithm based on linear fitting; Li Tao et al.; Journal of Southwest China Normal University; 20150531; Vol. 40, No. 5; full text |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| C06 / PB01 | Publication | |
| C10 / SE01 | Entry into substantive examination | |
| GR01 | Patent grant | |
| CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20190802; Termination date: 20210831 |