CN106446933B - Multi-target detection method based on contextual information - Google Patents


Info

Publication number
CN106446933B
CN106446933B · Application CN201610785155.XA
Authority
CN
China
Prior art keywords
target
scene
indicate
image
consistency
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201610785155.XA
Other languages
Chinese (zh)
Other versions
CN106446933A (en)
Inventor
李涛
裴利沈
赵雪专
张栋梁
李冬梅
朱晓珺
曲豪
邹香玲
高大伟
刘永
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
HENAN RADIO & TELEVISION UNIVERSITY
Original Assignee
HENAN RADIO & TELEVISION UNIVERSITY
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by HENAN RADIO & TELEVISION UNIVERSITY filed Critical HENAN RADIO & TELEVISION UNIVERSITY
Priority to CN201610785155.XA priority Critical patent/CN106446933B/en
Publication of CN106446933A publication Critical patent/CN106446933A/en
Application granted granted Critical
Publication of CN106446933B publication Critical patent/CN106446933B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/23 Clustering techniques
    • G06F 18/232 Non-hierarchical techniques
    • G06F 18/2323 Non-hierarchical techniques based on graph theory, e.g. minimum spanning trees [MST] or graph cuts
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F 18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/29 Graphical models, e.g. Bayesian networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/50 Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G06V 10/507 Summing image-intensity values; Histogram projection analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/35 Categorising the entire scene, e.g. birthday party or wedding scene
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/07 Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Discrete Mathematics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a multi-target detection method based on contextual information, comprising offline training and online model matching. Using the Gist feature of the input image, the method selects the scene to which the picture belongs according to its distance from each scene cluster center and obtains the corresponding selection probability. By running the existing single-target base detector (DPM) for every target class, it obtains the candidate detection windows and their detection scores, then combines the Gist feature with the trained context model to obtain the target detections. The method uses global context information to distinguish different scenes and then, according to the correlations among targets within each scene, forms the corresponding scene-specific detection model. This effectively reduces the mutual interference between targets across different scenes and further improves the accuracy of multi-target detection.

Description

Multi-target detection method based on contextual information
Technical field
The present invention relates to a multi-target detection technique based on contextual information that can be applied in real time in multi-target detection systems.
Background technique:
Target detection in images and video has been a research hotspot in computer vision for decades and will remain one for a considerable time to come; it is the foundation of visual analysis. The technology applies broadly to target tracking, object detection and recognition, information security, autonomous driving, image retrieval, robotics, human-computer interaction, medical image analysis, the Internet of Things, and other subjects and engineering application fields.
Current object detection systems mainly realize the recognition and detection of different targets through models of the target's own appearance. They typically describe target appearance either with hand-engineered features (such as HOG, LBP, SIFT) or with deep features learned directly from the images, and then detect targets from that appearance. In real-world detection, however, most environments are unconstrained, open, complex, and changeable, with interference such as illumination variation, viewpoint change, and target occlusion. If only the appearance of the target itself is used, then when the target provides very little information in the image or video, its category cannot be judged accurately from the target alone.
A "target recognition method based on context constraint" was invented by Wang Yuehuan, Liu Chang, Chen Junling et al. of Huazhong University of Science and Technology, China; the patent was filed with the China State Intellectual Property Office on December 7, 2012 and published on April 17, 2013, publication number CN103049763A.
That publication discloses a target recognition method based on context constraints, used for remote-sensing scene classification and target detection and recognition. The method first filters the image and then performs region segmentation, dividing the image into connected components and labeling each one. Next, it computes a feature vector for each connected component and feeds it to a pre-trained classifier for scene classification, outputting a classification label map. On this basis, for each target to be recognized, it delimits the local region where the target may exist on the label map, preprocesses that region, and computes a region of interest within it. Finally, it extracts features from the region of interest and feeds them to a classifier for recognition. That invention provides a fast and effective scene classification method intended to supply effective context constraints for target recognition and to improve recognition efficiency and accuracy. Its algorithm flow is shown in Fig. 1.
The above patented technique still has shortcomings. Although it obtains a scene classification from segmented and labeled regions and then, on top of that classification, applies a global context constraint to compute regions of interest, obtain the corresponding feature vectors, and recognize targets with a trained classifier, such a method uses only global scene context to obtain probable target regions. It considers the distribution of targets relative to the scene but ignores the characterization of co-occurrence among the targets themselves. Moreover, when the target carries little information of its own, it cannot be characterized accurately and the classifier cannot produce the corresponding detection.
Summary of the invention:
To address the problem of insufficient information from the target itself, the present invention draws on related information outside the target in the picture or video, directly or indirectly providing auxiliary information for target detection so as to improve detection accuracy.
The technical solution adopted to realize this goal of the invention is a multi-target detection method based on contextual information, characterized by comprising offline training and online model matching. Offline training obtains the subtree model in the following steps:
Step 1: For the training set, annotate the target classes in the training images with the LabelMe software to obtain training images with target annotations, and train a DPM detector for each target class in the images.
Step 2: Compute the Gist feature of every training picture to obtain global context information; then realize the scene partitioning with an improved spectral clustering method.
Step 3: Represent the scene by a hidden variable; then, within each scene, obtain the co-occurrence and location distribution information of the targets from the annotations of the training pictures.
Step 4: By computing the mapping distribution of targets in a transformed space between pairs of training pictures, judge whether two targets are consistency targets, forming consistency target pairs.
Step 5: Using the co-occurrence and location distribution information from Step 3 and the consistency target pairs from Step 4, learn the tree structure with a weighted Chow-Liu algorithm, then train the parameters to obtain the subtree model.
Online model matching proceeds as follows:
Step 1: At detection time, first compute the Gist feature of the input image.
Step 2: Then, according to the Gist feature of the input image, assign the image to the corresponding scene subspaces from training and obtain the probability distribution over those subspaces.
Step 3: Then obtain each target's detection scores and detection-window information from the trained DPM detectors of the different target classes.
Step 4: Using the scene probability distribution from Step 2 and the detection scores and windows from Step 3, iteratively combine the subtree prior models obtained in offline training to seek the maximum a posteriori estimate of whether each detection is correct, thereby correcting the detections of the per-class DPM detectors and obtaining the final multi-target detection result.
In Step 2 of offline training, the 520-dimensional Gist feature of each training picture is obtained as follows. First, the image is filtered with a bank of Gabor filters of different scales and orientations, producing one filtered image per filter. Then each filtered image is divided into non-overlapping grid cells of fixed size, and the mean of each cell is computed. Finally, the cell means of all filtered images are concatenated into a global feature, the final 520-dimensional Gist feature of the image:

G_j = cat{ mean_grid( I_j ⊗ g_mn ) }   (1)

where G_j denotes the Gist feature of the j-th image, cat denotes feature concatenation, I_j is the grayscale of the j-th image divided into an r × l grid, g_mn is the Gabor filter at scale m and orientation n, ⊗ denotes convolution of the image with a Gabor filter, and n_c is the number of filters of the convolution, of size m × n. The dimension of G_j is r × l × n_c. This scheme uses Gabor filters at 4 scales and 8 orientations.
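As a concrete illustration of the Gist computation above, the following minimal NumPy sketch builds a 4-scale, 8-orientation Gabor bank, filters the image, and concatenates grid-cell means. The wavelengths, filter size, and the 4 × 4 grid are assumptions; with this grid the sketch yields 512 dimensions rather than the patent's stated 520, which presumably reflects a slightly different grid or bank.

```python
import numpy as np

def gabor_bank(size=15, scales=4, orientations=8):
    """Small Gabor bank: 4 scales x 8 orientations, as stated in the patent.
    Wavelength/sigma schedule is an assumption for this sketch."""
    ys, xs = np.mgrid[-(size // 2):size // 2 + 1, -(size // 2):size // 2 + 1]
    bank = []
    for m in range(scales):
        lam = 4.0 * (m + 1)          # wavelength grows with scale (assumed)
        sigma = 0.5 * lam
        for n in range(orientations):
            theta = np.pi * n / orientations
            xr = xs * np.cos(theta) + ys * np.sin(theta)
            g = np.exp(-(xs**2 + ys**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / lam)
            bank.append(g)
    return bank

def gist(img, grid=(4, 4), bank=None):
    """Filter the image with each Gabor filter, average the response over a
    fixed non-overlapping grid, and concatenate the grid means (Eq. (1))."""
    bank = bank if bank is not None else gabor_bank()
    r, l = grid
    h, w = img.shape
    feats = []
    for g in bank:
        # same-size convolution via FFT (zero-padded filter)
        resp = np.abs(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(g, img.shape)))
        for i in range(r):
            for j in range(l):
                cell = resp[i * h // r:(i + 1) * h // r, j * w // l:(j + 1) * w // l]
                feats.append(cell.mean())
    return np.asarray(feats)         # length r * l * n_c
```

With the 4 × 4 grid and 32 filters this produces an r × l × n_c = 512-dimensional vector, matching the dimension formula in the text.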
In Step 2 of offline training, 6-8 classes of sub-scenes are obtained with the improved spectral clustering method as follows. First, the Gist feature of every training picture is taken as input, and a Random Forest is used to produce a similarity matrix expressing the similarity between each pair of training images. Then, with this similarity matrix as input, spectral clustering is applied to the training pictures, realizing the scene partitioning of the different training pictures.
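The scene partitioning step can be sketched as normalized spectral clustering on a precomputed similarity matrix; in the patent this matrix would come from a Random Forest over Gist features, while the sketch below accepts any symmetric non-negative similarity. The Laplacian normalization, farthest-point initialization, and plain Lloyd k-means are implementation assumptions.

```python
import numpy as np

def spectral_cluster(S, k, iters=50):
    """Normalized spectral clustering of images given similarity matrix S
    (assumed symmetric, non-negative, with positive row sums)."""
    d = S.sum(axis=1)
    L = np.eye(len(S)) - S / np.sqrt(np.outer(d, d))   # symmetric normalized Laplacian
    _, vecs = np.linalg.eigh(L)                        # ascending eigenvalues
    U = vecs[:, :k]                                    # k smallest eigenvectors
    U = U / np.linalg.norm(U, axis=1, keepdims=True)   # row-normalize
    # deterministic farthest-point initialization, then plain Lloyd k-means
    C = [U[0]]
    for _ in range(1, k):
        dist = np.min([((U - c) ** 2).sum(-1) for c in C], axis=0)
        C.append(U[np.argmax(dist)])
    C = np.array(C)
    for _ in range(iters):
        lab = np.argmin(((U[:, None] - C[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(lab == j):
                C[j] = U[lab == j].mean(axis=0)
    return lab
```

On a block-structured similarity matrix the labels recover the blocks, which is the behaviour the scene-partitioning step relies on.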
In Step 3 of offline training, consistency target pairs are incorporated into the subtree context model when training the subtree model, as follows:
(1) The components describing a consistency target are obtained as follows. Let (l_x(o_ik), l_y(o_ik)) denote the center coordinates of the target frame of the k-th instance of class i in image o; its scale sc(o_ik) is the square root of the frame area, and its viewpoint p(o_ik) is given by the aspect ratio of the frame. Similarly, (l_x(q_il), l_y(q_il)), sc(q_il), and p(q_il) denote the center coordinates, scale, and viewpoint of the l-th instance of class i in image q. The variable Δ_r = (Δl_x^r, Δl_y^r, Δsc^r, Δp^r) expresses the change of a same-class target between the two images in this four-dimensional space, where r ∈ R indexes a correspondence and R is the set of same-class correspondences between the two images of each candidate consistency pair: Δl^r expresses the mutual change in position, Δsc^r the mutual change in scale, and Δp^r the mutual change in aspect. The mapping distribution of these Δ variables, computed by formula (2), judges whether the corresponding targets satisfy a consistent distribution; if they do, the targets belong to the same target paradigm, i.e., they form a consistency target pair.
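A hedged sketch of the consistency-pair test just described: each target frame is reduced to its (center, scale, aspect) representation, the 4-D change Δ between the two images is computed per target class, and two targets form a consistency pair when their changes agree. The tolerance thresholds and the use of log-ratios for scale and aspect are assumptions, since formula (2) itself is not reproduced in the text.

```python
import numpy as np

def target_vec(box):
    """(l_x, l_y, scale, aspect) from a target frame given as (x, y, w, h)."""
    x, y, w, h = box
    return np.array([x + w / 2, y + h / 2, np.sqrt(w * h), w / h])

def transform_change(box_o, box_q):
    """4-D change of one target between images o and q: translation of the
    centre plus log-ratios of scale and aspect (log-ratio form is assumed)."""
    vo, vq = target_vec(box_o), target_vec(box_q)
    return np.array([vq[0] - vo[0], vq[1] - vo[1],
                     np.log(vq[2] / vo[2]), np.log(vq[3] / vo[3])])

def is_consistency_pair(boxes_a, boxes_b, tol=(20.0, 20.0, 0.2, 0.2)):
    """Targets a and b, each observed in both images, form a consistency
    pair when their 4-D changes agree within the (assumed) tolerance."""
    da = transform_change(*boxes_a)
    db = transform_change(*boxes_b)
    return bool(np.all(np.abs(da - db) <= np.asarray(tol)))
```

Two targets that translate together pass the test; a target whose scale changes independently fails it, which is the intended behaviour of the consistency check.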
(2) Greedy clustering is used to generate the final target groups under the different subspaces, and a soft-voting scheme avoids the sensitivity of the space transformation and the group redundancy produced when similar targets spawn duplicate groups. If a target's frequency of occurrence within a group does not exceed 50%, the target is rejected from that group. This ultimately forms the target groups under each scene subspace. On the basis of the formed groups, within the same group, targets of different classes are combined pairwise to form the consistency target pairs.
(3) The proposed consistency target pairs, together with the co-occurrence and mutual position relations of single targets, jointly characterize the local context information among targets. First, the correlation between a consistency target pair and a sub-scene is expressed as

θ_it = cf_it × isf_i   (3)

where cf_it is the frequency with which the i-th consistency pair occurs in the t-th sub-scene and isf_i is the inverse scene frequency of the i-th consistency pair, expressed as

isf_i = log( T / T_t ) + ξ   (4)

where T is the total number of sub-scene types, T_t is the number of sub-scene types containing the i-th consistency pair, and ξ is a small constant that prevents isf_i from being 0. After all correlation coefficients θ_it are obtained, they are normalized.
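The pair-scene correlation θ_it = cf_it × isf_i and its normalization can be sketched as follows. The exact form of the inverse scene frequency is not reproduced in the text, so log(T / T_t) + ξ is an assumed tf-idf-style reconstruction.

```python
import numpy as np

def pair_scene_correlation(cf, xi=0.01):
    """cf[i, t]: frequency of consistency pair i in sub-scene t.
    Returns theta[i, t] = cf * isf, row-normalized as the patent prescribes.
    isf = log(T / T_t) + xi is a reconstructed tf-idf-style weight."""
    T = cf.shape[1]
    Tt = (cf > 0).sum(axis=1)                     # scenes containing pair i
    isf = np.log(T / np.maximum(Tt, 1)) + xi      # xi keeps isf > 0
    theta = cf * isf[:, None]
    s = theta.sum(axis=1, keepdims=True)
    s[s == 0] = 1.0                               # guard empty rows
    return theta / s
```

A pair concentrated in one sub-scene receives all its weight there, while a pair spread over every sub-scene is weighted uniformly, mirroring the intent of the inverse scene frequency.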
(4) Using the annotation information of the training pictures, for each sub-scene t, a binary tree describing target co-occurrence and a Gaussian tree describing target position relations are established; together they characterize the prior subtree model.
The joint probability of whether all targets occur, over the binary tree, is

p(b | z_t) = p(b_root | z_t) ∏_i p(b_i | b_pa(i), z_t)   (5)

where i indexes a node in the tree, pa(i) denotes the parent of node i, b_i ∈ {0, 1} indicates whether target i appears in the image, b ≡ {b_i} collects all target classes, b_root is the root node of the subtree, and z_t is a discrete variable denoting the t-th sub-scene space.
The position L_i of target i depends on the occurrence of the target; the dependency structure among positions follows the same binary tree as target occurrence:

p(L | b) = p(L_root | b_root) ∏_i p(L_i | L_pa(i), b_i, b_pa(i))   (6)

where L_root is the position of the root node and L_pa(i) the position of the parent node.
The joint distribution of the occurrence variables b and positions L is therefore

p(b, L | z_t) = p(b | z_t) p(L | b)   (7)

which expands as

p(b, L | z_t) = p(b_root | z_t) p(L_root | b_root) ∏_i p(b_i | b_pa(i), z_t) p(L_i | L_pa(i), b_i, b_pa(i))   (8)
(5) The detections of the trained single-target DPM detectors and the Gist global feature are incorporated into the prior model. Denoting the global feature by g, the joint distribution becomes

p(g, b, L, W, s | z_t) = p(g | b) p(b, L | z_t) ∏_i ∏_k p(W_ik | c_ik, L_i) p(c_ik | b_i) p(s_ik | c_ik)   (9)

where W_ik denotes the position of the k-th candidate window detected for target class i, s_ik denotes the score of that candidate window, and c_ik indicates whether the k-th candidate window of target class i is a correct detection, taking value 1 if so and 0 otherwise.
(6) Training the subtree model mainly comprises learning the tree structure and learning the parameters. When the Chow-Liu algorithm is used for prior-model structure learning, the correlation θ_it between a consistency pair and the scene, characterized in formula (3), changes the mutual information S_i of the pair's parent-child nodes:

S_i ← S_i × (1 + sigm(θ_it))   (11)

The structure learning of the subtree prior model is then completed according to the maximum weights.
For the learning of the model parameters: first, p(b_i | b_pa(i)) in formula (8) is obtained by counting target co-occurrence together with the consistency pairs and their mutual-information changes; p(L_i | L_pa(i), b_i, b_pa(i)) takes its value according to the occurrence of the parent and child nodes, considering its Gaussian distribution in three cases: parent and child co-occur, only the child occurs, and the child does not occur.
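A minimal sketch of the weighted Chow-Liu structure learning of step (6): pairwise mutual information is estimated by counting over the binary occurrence matrix, edge weights touched by a consistency pair are rescaled by (1 + sigm(θ)) as in formula (11), and a maximum-weight spanning tree is grown. The Prim-style tree construction and the dense θ matrix are implementation assumptions; parameter learning is omitted.

```python
import numpy as np

def mutual_info(X):
    """Pairwise MI between binary occurrence columns of X (n_images x n_classes),
    estimated by simple counting."""
    n, k = X.shape
    MI = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            mi = 0.0
            for a in (0, 1):
                for b in (0, 1):
                    pij = np.mean((X[:, i] == a) & (X[:, j] == b))
                    pi, pj = np.mean(X[:, i] == a), np.mean(X[:, j] == b)
                    if pij > 0:
                        mi += pij * np.log(pij / (pi * pj))
            MI[i, j] = MI[j, i] = mi
    return MI

def chow_liu_tree(X, theta=None):
    """Max-weight spanning tree over MI edges; when a consistency-pair
    weight matrix theta is given, edges are rescaled S <- S * (1 + sigm(theta))
    per formula (11). Prim-style construction rooted at class 0."""
    S = mutual_info(X)
    if theta is not None:
        S = S * (1.0 + 1.0 / (1.0 + np.exp(-theta)))
    k = S.shape[0]
    in_tree, edges = {0}, []
    while len(in_tree) < k:
        best = max(((i, j) for i in in_tree for j in range(k) if j not in in_tree),
                   key=lambda e: S[e])
        edges.append(best)
        in_tree.add(best[1])
    return edges
```

On data where two classes always co-occur, the learned tree keeps that pair adjacent, which is what the co-occurrence binary tree of formula (5) encodes.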
With the Gist global feature of each training image in formula (9), p(g | b_i) is estimated; for the global feature g, p(b_i | g) is estimated by the method of logistic regression.
To integrate the detection results of the single base detectors: first, the probability p(c_ik | b_i) that a detection is correct is closely tied to whether the target occurs. When the target does not occur, the correct detection rate is 0; when the target occurs, the probability of a correct detection is the ratio of the number of correct detections to the total number of labels of that target in the training set.
Next, the detection-window position probability p(W_ik | c_ik, L_i) is a Gaussian distribution depending on the correct detection c_ik and the position L_i of target class i: when the window is a correct detection, W_ik follows a Gaussian distribution, with Λ_i denoting the variance of the target's predicted position; when the window is not a correct detection, W_ik does not depend on L_i and can be expressed as a constant.
Finally, the acquisition of the score probability p(s_ik | c_ik) of the base detector depends on the correct-detection result c_ik, where p(c_ik | s_ik) is estimated by the method of logistic regression.
The online matching part:
(1) At detection time, the Gist global feature of the input image j is first obtained with the method of formula (1).
(2) Then, according to the Gist feature of the input image, the image is assigned to the corresponding scene subspaces from training and the probability distribution over the corresponding sub-scenes is obtained. The probability of sub-scene t is

p_t(j) = d_jt^{-1} / Σ_t' d_jt'^{-1}

where d_jt^{-1} denotes the inverse distance from input picture j to the t-th sub-scene cluster center and Σ_t' d_jt'^{-1} denotes the sum of the inverse distances to all cluster centers; the normalized probability p_t(j) expresses how strongly the picture belongs to each sub-scene.
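The normalized-inverse-distance scene assignment of step (2) can be sketched directly; the small eps guard against a zero distance is an implementation assumption.

```python
import numpy as np

def scene_probs(gist_j, centers, eps=1e-9):
    """Soft scene assignment: p_t is the inverse distance from the input
    Gist feature to cluster centre t, normalized over all centres."""
    d = np.linalg.norm(np.asarray(centers) - gist_j, axis=1) + eps
    inv = 1.0 / d
    return inv / inv.sum()
```

A picture close to one cluster centre receives most of the probability mass, but every sub-scene keeps a nonzero share, which is what the iterative MAP step in (4) relies on.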
(3) The initial detection scores and detection-window information of each target in the image are obtained with the trained DPM detectors of the different target classes.
(4) Using the sub-scene probability distribution from step (2) and the detection scores and windows from step (3), the subtree prior models obtained in offline training are combined in an iterative manner to seek the maximum a posteriori estimate of whether each detection is correct, thereby correcting the detections of the per-class DPM detectors and obtaining the final multi-target detection result; it is obtained by iterative optimization of the scene-weighted posterior of the subtree models over the correctness variables c and positions L.
Beneficial effects of the present invention: to address the problem of insufficient information from the target itself, the system draws on related information outside the target in the picture or video, such as the scene in which the target sits and the correlations among different targets, directly or indirectly providing auxiliary information to improve target detection accuracy. The system uses the Gist global feature, which expresses global scene context, to realize scene selection; then, for each scene subspace, it merges the co-occurrence and position relations between single targets, proposes the concept of the consistency target pair, and incorporates it as important local context information into the tree-structured detection model of the corresponding subtree. Through the consistency target pairs, the corresponding mutual-information weights are changed during the formation of the subtree detection model, so the local context information of the consistency pairs reshapes the structure of the subtree detection model. The method distinguishes different scenes with global context information and then, according to the correlations among targets under each scene, forms the corresponding detection model, effectively reducing the mutual interference between targets across different scenes; by introducing consistency target pairs it strengthens the mutual constraints between targets and provides more robust local context information. Compared with existing systems, it further improves the accuracy of multi-target detection.
Brief description of the drawings:
Fig. 1 is the flow chart of the prior art;
Fig. 2 is the flow chart of the present invention for vehicle target detection;
Fig. 3 is a schematic diagram of consistency target pair acquisition in the present invention;
Fig. 4 shows partial detection results of the multi-target detection method based on contextual information of the present invention.
Specific embodiment:
Because related information outside the detected target in the image or video, such as the scene in which the target sits and the correlations between different targets and the detected target, can directly or indirectly provide auxiliary information and characterize the detected target more richly, it can improve the accuracy of target detection. Based on this idea, the present invention proposes a multi-target detection system that fuses multiple kinds of contextual information; the system consists of a scene selection layer and a subtree layer. First, the scene selection layer is obtained from the Gist global feature; then, within each corresponding sub-scene, the co-occurrence and position relations of single targets and consistency targets are used to characterize the targets, and a tree-structured probabilistic graphical model yields the subtree layer, so that multi-target detection is realized with both global and local contextual information. During training, first, in the scene selection layer, the Gist feature represents the global context information; using this feature, an improved spectral clustering method obtains the initial scene subsets and selects the subtree root node under each subset. Then, within each subset, using the annotated training-set images, the proposed consistency target pairs together with the co-occurrence and mutual position relations of single targets jointly characterize the local context information among targets, and with this local information the different subtree models are trained. At detection time, first, the Gist feature of the input image is computed; in the scene selection layer, this feature selects the scene to which the picture belongs according to its distance from the scene cluster centers, yielding the corresponding selection probabilities. Then, by running the existing single-target base detector (DPM) for all target classes, the corresponding candidate detection windows and detection scores are obtained, and the trained context model is combined with the Gist feature to obtain the target detections. Using the acquired local and global context information, the method corrects or removes erroneous results from the appearance-based object detectors, completing the modification of the single-target detections and obtaining the final object detection results.
The present embodiment realizes the context-based multi-target detection system in the following steps:
Offline training of the multi-target detection system: 1) For the training set, annotate the target classes in the training images with the LabelMe software to obtain training images with target annotations. 2) Compute the Gist features of the training pictures to obtain the global context information; then realize the scene partitioning with the improved spectral clustering method. 3) Represent the scene by a hidden variable; then, within each scene, obtain the co-occurrence and location distribution information of the targets from the annotations of the training pictures. 4) By computing the mapping distribution of targets in a transformed space between pairs of training pictures, judge whether two targets are consistency targets, forming consistency target pairs. 5) Using the co-occurrence and location distribution information from 3) and the consistency pairs from 4), learn the tree structure with the weighted Chow-Liu algorithm, then train the parameters to obtain the subtree models.
Online model matching
1) At detection time, first compute the Gist feature of the input image. 2) Then, according to the Gist feature of the input image, assign the image to the corresponding scene subspaces from training and obtain the probability distribution over those subspaces. 3) Then obtain each target's detection scores and detection-window information from the trained DPM detectors of the different target classes. 4) Using the scene probability distribution from 2) and the detection scores and windows from 3), iteratively combine the subtree prior models obtained in offline training to seek the maximum a posteriori estimate of whether each detection is correct, thereby correcting the detections of the per-class DPM detectors and obtaining the final multi-target detection result. (DPM, the Deformable Parts Model, is an extremely successful object detection algorithm that won the VOC (Visual Object Classes) detection challenge for many consecutive years and has become an important component of numerous classifiers and of segmentation, human pose, and behavior classification systems; its inventor Pedro Felzenszwalb was awarded a "Lifetime Achievement" prize by VOC in 2010. DPM can be seen as an extension of HOG (Histograms of Oriented Gradients) and follows the same general idea: first compute gradient orientation histograms, then train a gradient model (template) of the object with an SVM (Support Vector Machine). The resulting template can be used directly for classification; simply put, the model is matched against the object.)
The implementation process of this scheme is shown in Fig. 2.
This scheme is elaborated below for the above process:
1. Offline training obtains the subtree models
1) First, annotate the training images with the LabelMe software to obtain training images containing target category and position information, and train a DPM detector for each target class in the images.
2) Then compute the Gist features of the training samples to obtain the global context information of the sample images, and realize the partitioning of the different scenes based on the improved spectral clustering method. The detailed steps are:
(2.1) Obtain the 520-dimensional Gist feature of each training picture. First, filter the image with a bank of Gabor filters of different scales and orientations, producing one filtered image per filter; then divide each filtered image into non-overlapping grid cells of fixed size and compute the mean of each cell; finally, concatenate the cell means into a global feature, the final 520-dimensional Gist feature of the image, according to formula (1), where G_j denotes the Gist feature of the j-th image, cat denotes feature concatenation, I_j is the grayscale of the j-th image divided into an r × l grid, g_mn is the Gabor filter at scale m and orientation n, ⊗ denotes convolution of the image with a Gabor filter, and n_c is the number of filters; the feature dimension is r × l × n_c. This scheme uses Gabor filters at 4 scales and 8 orientations.
(2.2) For the obtained Gist features of the training pictures, obtain 6-8 classes of sub-scenes with the improved spectral clustering method. The detailed process: first, input the Gist feature of every training picture and use a Random Forest to obtain the similarity matrix expressing the similarity between each pair of training images; then, with this similarity matrix as input, apply spectral clustering to the training pictures, realizing the scene partitioning of the different training pictures.
3) Within each scene subspace, use the image subset belonging to that subspace to train a corresponding subtree model with a tree-structured probabilistic graphical model. When training the subtree models, this scheme incorporates consistency target pairs to describe the pairwise relationships between targets, yielding a subtree context model of consistency target pairs. The detailed process is as follows:
(3.1) First, according to the consistency of the distribution in spatial position, scale, and viewpoint of two neighbouring targets of different classes across two different images of a scene subspace, obtain the consistency target pairs of that subspace. The detailed process of obtaining a consistency target pair is shown in Figure 3. Each component is represented as follows:
(l_x(o_ik), l_y(o_ik)) denotes the centre coordinates of the target frame of the k-th instance of the i-th class target in image o; the scale sc(o_ik) is represented by the square root of the target frame area, and the viewpoint p(o_ik) is obtained from the aspect ratio of the target frame. Similarly, (l_x(q_il), l_y(q_il)) denotes the centre coordinates of the target frame of the l-th instance of the i-th class target in image q, with scale sc(q_il) and viewpoint p(q_il). A variation variable denotes the corresponding change, across the two images, of same-class target variables in this four-dimensional space, where r ∈ R indexes a correspondence and R denotes the set of same-class target correspondences between the two images of each consistency target pair; its components denote the mutual variation of target position, of target scale, and of viewpoint, respectively. The mapping distribution computed by formula (2) judges whether the corresponding targets satisfy a consistent distribution; if they do, the corresponding targets belong to the same target paradigm, i.e., they form a consistency target pair.
(3.2) Use greedy clustering to generate the final target-group sets under the different subspaces. Soft voting is adopted to avoid the sensitivity of the space-partitioning transformation and the redundancy it would cause among similar target groups. Meanwhile, to reduce the number of target-group levels, any target whose frequency of occurrence within a group does not exceed 50% is removed from that group. Through the above operations, the target groups under the different scene subspaces are finally formed. On the basis of these groups, within the same target group, consistency target pairs are formed by pairwise combination of different-class targets.
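The pruning-and-pairing rule of step (3.2), dropping targets that appear in no more than 50% of a group's images and then pairing the surviving categories, can be illustrated with hypothetical annotations:

```python
from itertools import combinations
from collections import Counter

# Hypothetical per-image category lists for one scene subspace.
images = [
    ["car", "road", "tree"],
    ["car", "road", "person"],
    ["car", "road", "tree"],
    ["road", "tree"],
]

# Count in how many images each category appears, and drop categories
# present in no more than 50% of them (the patent's frequency rule).
counts = Counter(c for img in images for c in set(img))
kept = {c for c, n in counts.items() if n / len(images) > 0.5}

# Within the pruned group, every pair of *different* categories becomes
# a candidate consistency target pair.
pairs = sorted(combinations(sorted(kept), 2))
print(pairs)  # "person" (1/4 images) is rejected
```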
(3.3) The proposed consistency target pairs jointly characterize, through single-target co-occurrence and mutual position relations, the local context information between targets. First, characterize the correlation between a consistency target pair and a sub-scene, as shown below:
θ_it = cf_it × isf_i  (3)
where cf_it denotes the frequency with which the i-th consistency target pair occurs in the t-th sub-scene, and isf_i denotes the inverse scene frequency index of the i-th consistency target pair, expressed by formula (4), in which T denotes the total number of sub-scene types, T_t denotes the number of sub-scene types containing the i-th consistency target pair, and ξ is a very small constant that prevents isf_i from taking the value 0. After all correlation coefficients θ_it are obtained, they are normalized.
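Formula (3) can be exercised numerically. Since the translation does not reproduce formula (4), the inverse scene frequency below uses an assumed IDF-style form, log(T / T_t) + ξ, and the frequencies cf are toy values.

```python
import numpy as np

# cf[i, t]: frequency of consistency pair i in sub-scene t (toy numbers).
cf = np.array([[0.6, 0.1, 0.0],
               [0.2, 0.5, 0.3]])
T = cf.shape[1]                      # total number of sub-scene types
Tt = (cf > 0).sum(axis=1)            # sub-scenes containing pair i
xi = 1e-3                            # small constant so isf never hits 0

# Assumed IDF-style inverse scene frequency (one plausible reading of
# the unreproduced formula (4)).
isf = np.log(T / Tt) + xi

theta = cf * isf[:, None]                           # formula (3)
theta = theta / theta.sum(axis=1, keepdims=True)    # normalise per pair
print(theta)
```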
(3.4) Using the annotation information of the training pictures, for each sub-scene t, establish a binary tree describing target co-occurrence and a Gaussian tree describing target position relations; together they characterize the prior subtree model.
In the binary tree, the joint probability of whether all targets appear is represented as:
p(b | z_t) = p(b_root | z_t) ∏_i p(b_i | b_pa(i), z_t)  (5)
where i denotes a node in the tree, pa(i) denotes the parent node of node i, and b_i ∈ {0, 1} indicates whether target i appears in the image; all target classes are denoted b ≡ {b_i}; b_root denotes the root node of the subtree, and z_t is a discrete variable denoting the t-th sub-scene space.
The position L_i of target i depends on the appearance of the target, and the dependence relations between positions follow the same binary tree structure as target appearance, expressed as follows:
p(L | b) = p(L_root | b_root) ∏_i p(L_i | L_pa(i), b_i, b_pa(i))  (6)
where L_root denotes the position of the root node and L_pa(i) denotes the position of the parent node.
The joint distribution of the appearance variables b and the positions L is therefore represented as:
p(b, L | z_t) = p(b | z_t) p(L | b)  (7)
which expands node-wise as:
p(b, L | z_t) = ∏_i p(b_i | b_pa(i), z_t) p(L_i | L_pa(i), b_i, b_pa(i))  (8)
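The tree factorization of formula (5) is easy to verify numerically: with a conditional presence table at each node, the product over nodes defines a valid joint distribution. The tree and tables below are arbitrary toy values.

```python
import numpy as np

# A toy 4-node presence tree: node 0 is the root.
parent = [-1, 0, 0, 1]
p_root = np.array([0.3, 0.7])        # p(b_root = 0), p(b_root = 1)
# cond[i][b_parent][b_i]: conditional presence tables per node.
cond = {
    1: np.array([[0.9, 0.1], [0.4, 0.6]]),
    2: np.array([[0.8, 0.2], [0.3, 0.7]]),
    3: np.array([[0.95, 0.05], [0.2, 0.8]]),
}

def tree_prob(b):
    # Formula (5): p(b) = p(b_root) * prod_i p(b_i | b_pa(i)).
    p = p_root[b[0]]
    for i in range(1, len(b)):
        p *= cond[i][b[parent[i]]][b[i]]
    return p

# The probabilities of all 2^4 presence patterns must sum to 1.
total = sum(tree_prob([(k >> j) & 1 for j in range(4)]) for k in range(16))
print(round(total, 6))  # 1.0
```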
(3.5) The detection results of the trained single-target DPM detectors and the Gist global feature are incorporated into the prior model. Denoting the global feature by g, the joint distribution is represented as:
p(g, b, L, W, s | z_t) = p(b, L | z_t) p(g | b) ∏_i ∏_k p(W_ik, s_ik, c_ik | b_i, L_i)  (9)
where the per-window factor expands as:
p(W_ik, s_ik, c_ik | b_i, L_i) = p(c_ik | b_i) p(W_ik | c_ik, L_i) p(s_ik | c_ik)  (10)
W_ik denotes the position of the k-th single-target candidate window detected with target class i, and s_ik denotes the score of that window; c_ik indicates whether the k-th candidate window of target class i is a correct detection, taking value 1 if so and 0 otherwise.
(3.6) Training the subtree model mainly comprises learning the tree structure and learning the associated parameters. When the prior tree model is learned with the Chow-Liu algorithm, the correlation θ_it between the consistency target pair characterized in (3.3) and the scene is used to modify the mutual information S_i between the parent and child nodes of the target pair, embodied as:
S_i = S_i × (1 + sigm(θ_it))  (11)
The structure learning of the subtree prior model is then completed according to the maximum weights.
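Step (3.6) and formula (11), Chow-Liu structure learning with scene-boosted mutual information, might be sketched as below: pairwise mutual information is estimated from toy presence data, scaled by (1 + sigm(θ)), and the maximum-weight spanning tree is recovered as the minimum spanning tree of the negated weights. The θ values are stand-ins for the pair/scene correlations of formula (3).

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

rng = np.random.default_rng(1)
# Toy presence matrix: 500 images x 4 target categories; category 1
# closely tracks category 0, the rest are independent.
b = rng.integers(0, 2, size=(500, 4))
b[:, 1] = b[:, 0] ^ (rng.random(500) < 0.1)

def mutual_info(x, y):
    # Plug-in mutual-information estimate for two binary variables.
    mi = 0.0
    for vx in (0, 1):
        for vy in (0, 1):
            pxy = np.mean((x == vx) & (y == vy))
            px, py = np.mean(x == vx), np.mean(y == vy)
            if pxy > 0:
                mi += pxy * np.log(pxy / (px * py))
    return mi

n = b.shape[1]
theta = rng.random((n, n))           # stand-in pair/scene correlations
S = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        # Formula (11): boost edges whose pair correlates with the scene.
        S[i, j] = mutual_info(b[:, i], b[:, j]) \
                  * (1 + 1 / (1 + np.exp(-theta[i, j])))

# Maximum-weight spanning tree == minimum spanning tree of negated weights.
tree = minimum_spanning_tree(-S).toarray()
edges = sorted(zip(*np.nonzero(tree)))
print(edges)  # the strongly coupled pair (0, 1) ends up in the tree
```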
For the learning of model parameters: first, p(b_i | b_pa(i)) in formula (8) is obtained from the statistics of target co-occurrence, consistency target pairs, and the mutual-information modification. p(L_i | L_pa(i), b_i, b_pa(i)) takes its value according to the appearance of the parent and child nodes, considering three cases (parent and child both appear, only the child appears, and the child does not appear), each assigned its own Gaussian distribution.
p(g | b_i) in formula (9) is estimated from the Gist global feature of each training image; for the global feature g, p(b_i | g) is estimated using logistic regression, from which p(g | b_i) is obtained.
To integrate the detection results of each single basic detector: first, the probability of a correct detection, p(c_ik | b_i), is closely tied to whether the target appears. Its form is as follows: when the target does not appear, the correct-detection probability is 0; when the target appears, the probability of a correct detection is the ratio of the number of correct detections to the total number of labelled targets of that class in the training set.
Next, the location probability of a detection window, p(W_ik | c_ik, L_i), is a Gaussian distribution conditioned on the correct detection c_ik and the position L_i of target class i: when the window is a correct detection, W_ik follows a Gaussian distribution whose variance Λ_i is the variance of the predicted target position; when the window is not a correct detection, W_ik does not depend on L_i and can be expressed as a constant.
Finally, the score probability of the basic detector, p(s_ik | c_ik), conditioned on the correct-detection result c_ik, is obtained from p(c_ik | s_ik), which is estimated using logistic regression.
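Estimating p(c_ik | s_ik) by logistic regression on detector scores, as described above, reduces to a one-dimensional fit; the score/label data below are synthetic.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
# Synthetic DPM score/label pairs: higher scores are more often correct.
scores = rng.normal(0, 1, size=500)
correct = (scores + rng.normal(0, 0.5, size=500) > 0).astype(int)

# p(c_ik | s_ik) as a logistic function of the detector score.
clf = LogisticRegression().fit(scores.reshape(-1, 1), correct)
p_correct = clf.predict_proba([[2.0], [-2.0]])[:, 1]
print(p_correct)  # high for a strong score, low for a weak one
```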
Online matching part:
4) When detecting, first obtain the Gist global feature of the input image j using the method in 2).
5) Then, according to the Gist feature of the input image, assign the image to the corresponding scene subspaces obtained in training and compute the probability distribution over those subspaces. The probability of each sub-scene is embodied as:
p(z_t | j) = d_jt^{-1} / Σ_t' d_jt'^{-1}  (16)
where d_jt^{-1} denotes the inverse of the distance from the input picture j to the clustering centre of the t-th sub-scene, and the denominator is the sum of the inverse distances to all clustering centres. The probability after normalization expresses the degree to which picture j belongs to each sub-scene.
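The normalized-inverse-distance scene posterior of step 5) can be sketched directly; the centres and input feature below are toy 2-D values standing in for 520-D Gist vectors.

```python
import numpy as np

# Sub-scene Gist centroids (toy 2-D stand-ins) and an input image feature.
centers = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
g = np.array([0.5, 0.2])

eps = 1e-8  # guard against a zero distance
inv_dist = 1.0 / (np.linalg.norm(centers - g, axis=1) + eps)
p_scene = inv_dist / inv_dist.sum()   # normalised inverse distances
print(p_scene.argmax())  # 0: the nearest centre gets the largest probability
```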
6) Use the trained DPM detectors of the different targets to obtain each target's initial detection score and detection window information for the image;
7) Using the sub-scene probability distribution from 5) and the detection scores and window information from 6), in an iterative manner, combine the subtree prior models trained offline to seek the MAP estimate of the probability that each target detection is correct, thereby correcting the detection results of the various DPM detectors to obtain the final multi-target detection result. Concretely, this is obtained by iterative optimization of formula (17).
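Step 7) combines the scene posterior with the per-window detector posteriors. The patent's actual update iteratively optimizes formula (17), which is not reproduced in the translation; the snippet below is only an illustrative single-pass correction that reweights each window's posterior by its class's scene-averaged presence prior, a stand-in for, not a reproduction of, the patent's MAP iteration. All quantities are hypothetical.

```python
import numpy as np

# Hypothetical quantities for one image: scene posterior over 3 sub-scenes,
# per-scene presence prior for each class, and per-window p(c | s) values.
p_scene = np.array([0.7, 0.2, 0.1])
p_class_given_scene = np.array([[0.9, 0.2, 0.1],    # class 0 prior per scene
                                [0.1, 0.8, 0.3]])   # class 1 prior per scene
p_correct_given_score = {0: [0.6, 0.4], 1: [0.55]}  # raw detector posteriors

# Scene-averaged presence prior for each class.
prior = p_class_given_scene @ p_scene

# Bayes-style reweighting: boost windows of scene-consistent classes,
# suppress windows of scene-inconsistent ones.
corrected = {}
for cls, ps in p_correct_given_score.items():
    corrected[cls] = [p * prior[cls]
                      / (p * prior[cls] + (1 - p) * (1 - prior[cls]))
                      for p in ps]
print(corrected)  # class 0 windows rise, class 1 windows fall
```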
This scheme fuses contextual information and enriches the target representation; as shown in Figure 4, the multi-target detection method based on contextual information achieves satisfactory detection results.

Claims (4)

1. A multi-target detection method based on contextual information, characterized by comprising an offline training part and an online matching part,
The offline training steps for obtaining the subtree models are:
Step 1: first, for the training set, label the image target classes in the training set using LabelMe software to obtain training-set images with target annotations, and train a DPM detector for each target in the images;
Step 2: compute the Gist features of the pictures in the training set to obtain global context information; then realize scene partitioning using an improved spectral clustering method;
Step 3: represent the scene by a hidden variable; then, under each scene, obtain the co-occurrence and position-distribution information of the targets according to the annotation results of the training pictures;
Step 4: judge whether two targets in two training-set pictures are a consistency target pair by computing the targets' mapping distribution in the transformed space, forming consistency target pairs;
Step 5: using the co-occurrence and position-distribution information obtained in Step 3 and the consistency target pairs obtained in Step 4, learn the tree structure through a weighted Chow-Liu algorithm, then train the parameters to obtain the subtree models;
The online matching steps are:
Step 1: when detecting, first compute the Gist feature of the input image;
Step 2: then, according to the Gist feature of the input image, assign the image to the corresponding scene subspaces obtained in training and compute the probability distribution over those subspaces;
Step 3: then, obtain each target's detection score and detection window information for the image through the trained DPM detectors of the different targets;
Step 4: using the scene probability distribution obtained in Step 2 and the detection scores and window information obtained in Step 3, in an iterative manner, combine the offline-trained subtree prior models to seek the MAP estimate of the probability that each target detection is correct, thereby correcting the detection results of the various DPM detectors to obtain the final multi-target detection result.
2. The multi-target detection method based on contextual information according to claim 1, characterized in that in Step 2 of the offline training for obtaining the subtree models, the 520-dimensional Gist feature of each picture in the training set is obtained; the acquisition steps are: first, filter the image with a bank of Gabor filters at different scales and orientations, obtaining one filtered image per filter; next, partition each filtered image into non-overlapping grid cells of fixed size and compute the mean within each cell of the divided image; finally, concatenate the cell means into a global feature vector, giving the image's final 520-dimensional Gist feature, expressed as follows:
G_j^Gist = cat(I_j ⊗ g_mn)  (1)
where G_j^Gist denotes the Gist feature of the j-th image, cat denotes feature concatenation, I_j denotes the grayscale map of the j-th image with an r × l grid division, g_mn denotes the Gabor filter at the m-th scale and n-th orientation, ⊗ denotes the convolution of the image with a Gabor filter, and n_c denotes the number of filters used in the convolution, of size m × n; the dimension of G_j^Gist is r × l × n_c.
3. The multi-target detection method based on contextual information according to claim 1, characterized in that: in Step 2 of the offline training for obtaining the subtree models, 6-8 sub-scene classes are obtained using the improved spectral clustering method; the concrete steps are: first, the Gist feature of every picture in the training set is taken as input, and a Random Forest method is used to obtain a similarity matrix expressing the pairwise similarity between the training images; then, taking this similarity matrix as input, the training-set pictures are clustered by spectral clustering, realizing the scene partitioning of the training set.
4. The multi-target detection method based on contextual information according to claim 1, characterized in that: in Step 3 of the offline training for obtaining the subtree models, when the scene is represented by a hidden variable and, under each scene, the target co-occurrence and position-distribution information is obtained from the annotation results of the training pictures, a subtree context model of consistency target pairs is incorporated; the concrete steps are:
(1) first, according to the consistency of the distribution in spatial position, scale, and viewpoint of two neighbouring targets of different classes across two different images of a scene subspace, obtain the consistency target pairs of that subspace;
Each component of a consistency target pair is represented as follows:
(l_x(o_ik), l_y(o_ik)) denotes the centre coordinates of the target frame of the k-th instance of the i-th class target in image o; the scale sc(o_ik) is represented by the square root of the target frame area, and the viewpoint p(o_ik) is obtained from the aspect ratio of the target frame; similarly, (l_x(q_il), l_y(q_il)) denotes the centre coordinates of the target frame of the l-th instance of the i-th class target in image q, with scale sc(q_il) and viewpoint p(q_il); a variation variable denotes the corresponding change, across the two images, of same-class target variables in this four-dimensional space, where r ∈ R indexes a correspondence and R denotes the set of same-class target correspondences between the two images of each consistency target pair; its components denote the mutual variation of target position, of target scale, and of viewpoint, respectively; the mapping distribution computed by formula (2) judges whether the corresponding targets satisfy a consistent distribution; if they do, the corresponding targets belong to the same target paradigm, i.e., they form a consistency target pair;
(2) using greedy clustering, generate the final target-group sets under the different subspaces; soft voting is adopted to avoid the sensitivity of the space-partitioning transformation and the redundancy it would cause among similar target groups; if a target's frequency of occurrence within a group does not exceed 50%, the target is removed from that group; the target groups under the different scene subspaces are thereby finally formed; on the basis of these groups, within the same target group, consistency target pairs are formed by pairwise combination of different-class targets;
(3) the proposed consistency target pairs jointly characterize, through single-target co-occurrence and mutual position relations, the local context information between targets; the steps are: first, characterize the correlation between a consistency target pair and a sub-scene:
θ_vt = cf_vt × isf_v  (3)
where cf_vt denotes the frequency with which the v-th consistency target pair occurs in the t-th sub-scene, and isf_v denotes the inverse scene frequency index of the v-th consistency target pair, expressed by formula (4), in which T denotes the total number of sub-scene types, T_t denotes the number of sub-scene types containing the v-th consistency target pair, and ξ is a very small constant preventing isf_v from taking the value 0; after all correlation coefficients θ_vt are obtained, they are normalized;
(4) using the annotation information of the training pictures, for each sub-scene t, establish a binary tree describing target co-occurrence and a Gaussian tree describing target position relations; together they characterize the prior subtree model;
In the binary tree, the joint probability of whether all targets appear is represented as:
p(b | z_t) = p(b_root | z_t) ∏_w p(b_w | b_pa(w), z_t)  (5)
where w denotes a node in the tree, pa(w) denotes the parent node of node w, and b_w ∈ {0, 1} indicates whether target w appears in the image; all target classes are denoted b ≡ {b_w}; b_root denotes the root node of the subtree, and z_t is a discrete variable denoting the t-th sub-scene space;
The position L_w of target w depends on the appearance of the target, and the dependence relations between positions follow the same binary tree structure as target appearance, expressed as follows:
p(L | b) = p(L_root | b_root) ∏_w p(L_w | L_pa(w), b_w, b_pa(w))  (6)
where L_root denotes the position of the root node and L_pa(w) denotes the position of the parent node;
The joint distribution of the appearance variables b and the positions L is therefore represented as:
p(b, L | z_t) = p(b | z_t) p(L | b)  (7)
which expands node-wise as:
p(b, L | z_t) = ∏_w p(b_w | b_pa(w), z_t) p(L_w | L_pa(w), b_w, b_pa(w))  (8)
(5) the detection results of the trained single-target DPM detectors and the Gist global feature are incorporated into the prior model; denoting the global feature by g, the joint distribution is represented as:
p(g, b, L, W, s | z_t) = p(b, L | z_t) p(g | b) ∏_w ∏_k p(W_wk, s_wk, c_wk | b_w, L_w)  (9)
where the per-window factor expands as:
p(W_wk, s_wk, c_wk | b_w, L_w) = p(c_wk | b_w) p(W_wk | c_wk, L_w) p(s_wk | c_wk)  (10)
W_wk denotes the position of the k-th single-target candidate window detected with target class w, s_wk denotes the score of that window, and c_wk indicates whether the k-th candidate window of target class w is a correct detection, taking value 1 if so and 0 otherwise;
(6) training the subtree model mainly comprises learning the tree structure and learning the associated parameters; when the prior tree model is learned with the Chow-Liu algorithm, the correlation θ_wt between the consistency target pair characterized in formula (3) and the scene is used to modify the mutual information S_w between the parent and child nodes of the target pair:
S_w = S_w × (1 + sigm(θ_wt))  (11)
the structure learning of the subtree prior model is then completed according to the maximum weights;
For the learning of model parameters: first, p(b_w | b_pa(w)) in formula (8) is obtained from the statistics of target co-occurrence, consistency target pairs, and the mutual-information modification; p(L_w | L_pa(w), b_w, b_pa(w)) takes its value according to the appearance of the parent and child nodes, considering three cases (parent and child both appear, only the child appears, and the child does not appear), each assigned its own Gaussian distribution;
p(g | b_w) in formula (9) is estimated from the Gist global feature of each training image; for the global feature g, p(b_w | g) is estimated using logistic regression, from which p(g | b_w) is obtained;
To integrate the detection results of each single basic detector: first, the probability of a correct detection, p(c_wk | b_w), is closely tied to whether the target appears; when the target does not appear, the correct-detection probability is 0; when the target appears, the probability of a correct detection is the ratio of the number of correct detections to the total number of labelled targets of that class in the training set;
Next, the location probability of a detection window, p(W_wk | c_wk, L_w), is a Gaussian distribution conditioned on the correct detection c_wk and the position L_w of target class w: when the window is a correct detection, W_wk follows a Gaussian distribution whose variance Λ_w is the variance of the predicted target position; when the window is not a correct detection, W_wk does not depend on L_w and can be expressed as a constant;
Finally, the score probability of the basic detector, p(s_wk | c_wk), conditioned on the correct-detection result c_wk, is obtained from p(c_wk | s_wk), which is estimated using logistic regression.
CN201610785155.XA 2016-08-31 2016-08-31 Multi-target detection method based on contextual information Expired - Fee Related CN106446933B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610785155.XA CN106446933B (en) 2016-08-31 2016-08-31 Multi-target detection method based on contextual information


Publications (2)

Publication Number Publication Date
CN106446933A CN106446933A (en) 2017-02-22
CN106446933B true CN106446933B (en) 2019-08-02

Family

ID=58091496

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610785155.XA Expired - Fee Related CN106446933B (en) 2016-08-31 2016-08-31 Multi-target detection method based on contextual information

Country Status (1)

Country Link
CN (1) CN106446933B (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106951574B (en) * 2017-05-03 2019-06-14 牡丹江医学院 A kind of information processing system and method based on computer network
CN107832795B (en) * 2017-11-14 2021-07-27 深圳码隆科技有限公司 Article identification method and system and electronic equipment
CN108062531B (en) * 2017-12-25 2021-10-19 南京信息工程大学 Video target detection method based on cascade regression convolutional neural network
CN109977738B (en) * 2017-12-28 2023-07-25 深圳Tcl新技术有限公司 Video scene segmentation judging method, intelligent terminal and storage medium
CN108363992B (en) * 2018-03-15 2021-12-14 南京钜力智能制造技术研究院有限公司 Fire early warning method for monitoring video image smoke based on machine learning
CN109241819A (en) * 2018-07-07 2019-01-18 西安电子科技大学 Based on quickly multiple dimensioned and joint template matching multiple target pedestrian detection method
CN110288629B (en) * 2019-06-24 2021-07-06 湖北亿咖通科技有限公司 Target detection automatic labeling method and device based on moving object detection
CN110334639B (en) * 2019-06-28 2021-08-10 北京精英系统科技有限公司 Device and method for filtering error detection result of image analysis detection algorithm
CN111079674B (en) * 2019-12-22 2022-04-26 东北师范大学 Target detection method based on global and local information fusion
CN111080639A (en) * 2019-12-30 2020-04-28 四川希氏异构医疗科技有限公司 Multi-scene digestive tract endoscope image identification method and system based on artificial intelligence
CN111814885B (en) * 2020-07-10 2021-06-22 云从科技集团股份有限公司 Method, system, device and medium for managing image frames
CN112052350B (en) * 2020-08-25 2024-03-01 腾讯科技(深圳)有限公司 Picture retrieval method, device, equipment and computer readable storage medium
CN112148267A (en) * 2020-09-30 2020-12-29 深圳壹账通智能科技有限公司 Artificial intelligence function providing method, device and storage medium
CN112395974B (en) * 2020-11-16 2021-09-07 南京工程学院 Target confidence correction method based on dependency relationship between objects
CN113138924B (en) * 2021-04-23 2023-10-31 扬州大学 Thread safety code identification method based on graph learning
CN112906696B (en) * 2021-05-06 2021-08-13 北京惠朗时代科技有限公司 English image region identification method and device

Citations (5)

Publication number Priority date Publication date Assignee Title
CN103577832A (en) * 2012-07-30 2014-02-12 华中科技大学 People flow statistical method based on spatio-temporal context
CN104778466A (en) * 2015-04-16 2015-07-15 北京航空航天大学 Detection method combining various context clues for image focus region
CN104933735A (en) * 2015-06-30 2015-09-23 中国电子科技集团公司第二十九研究所 A real time human face tracking method and a system based on spatio-temporal context learning
CN105631895A (en) * 2015-12-18 2016-06-01 重庆大学 Temporal-spatial context video target tracking method combining particle filtering
CN105740891A (en) * 2016-01-27 2016-07-06 北京工业大学 Target detection method based on multilevel characteristic extraction and context model


Non-Patent Citations (2)

Title
A Tree-Based Context Model for Object Recognition; M. J. Choi et al.; TPAMI; 2012; full text
Multi-moving-target tracking algorithm based on linear fitting; Li Tao et al.; Journal of Southwest China Normal University; May 2015; Vol. 40, No. 5; full text


Similar Documents

Publication Publication Date Title
CN106446933B (en) Multi-target detection method based on contextual information
Santos et al. Grape detection, segmentation, and tracking using deep neural networks and three-dimensional association
Luo et al. Traffic sign recognition using a multi-task convolutional neural network
CN105809198B (en) SAR image target recognition method based on depth confidence network
Yang et al. Layered object models for image segmentation
Hariharan et al. Semantic contours from inverse detectors
Huang et al. A new building extraction postprocessing framework for high-spatial-resolution remote-sensing imagery
CN109034210A (en) Object detection method based on super Fusion Features Yu multi-Scale Pyramid network
Oliva et al. Scene-centered description from spatial envelope properties
CN108830188A (en) Vehicle checking method based on deep learning
CN104616316B (en) Personage's Activity recognition method based on threshold matrix and Fusion Features vision word
CN105160310A (en) 3D (three-dimensional) convolutional neural network based human body behavior recognition method
CN106408030A (en) SAR image classification method based on middle lamella semantic attribute and convolution neural network
CN108537102A (en) High Resolution SAR image classification method based on sparse features and condition random field
CN109409384A (en) Image-recognizing method, device, medium and equipment based on fine granularity image
CN104268593A (en) Multiple-sparse-representation face recognition method for solving small sample size problem
Shahab et al. How salient is scene text?
Wang et al. Tea picking point detection and location based on Mask-RCNN
Willems et al. Exemplar-based Action Recognition in Video.
CN110599463A (en) Tongue image detection and positioning algorithm based on lightweight cascade neural network
Bukhari et al. Assessing the impact of segmentation on wheat stripe rust disease classification using computer vision and deep learning
Li et al. Dating ancient paintings of Mogao Grottoes using deeply learnt visual codes
Lee et al. Automatic recognition of flower species in the natural environment
Zhao et al. Generalized symmetric pair model for action classification in still images
CN108734200A (en) Human body target visible detection method and device based on BING features

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20190802

Termination date: 20210831
