CN104036296A - Method and device for representing and processing image - Google Patents

Publication number
CN104036296A
Authority
CN
China
Prior art keywords
independent information
sigma
parameter
image
local
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410281723.3A
Other languages
Chinese (zh)
Other versions
CN104036296B (en)
Inventor
乔宇
蔡卓伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Institute of Advanced Technology of CAS
Priority to CN201410281723.3A
Publication of CN104036296A
Application granted
Publication of CN104036296B
Legal status: Active
Anticipated expiration

Landscapes

  • Image Analysis (AREA)

Abstract

The invention belongs to the field of image processing and provides a method and a device for representing and processing an image. The method comprises: extracting at least two types of local features from the image to be represented; jointly building a mixed independent information decomposition model according to the different types of local features extracted from the image to be represented; and encoding the shared information and the independent information of the different types of local features according to the mixed independent information decomposition model, then aggregating the codes of the local features to obtain a multi-perspective super vector representation of the image. The resulting multi-perspective super vector representation can be used for classifier training and image retrieval. Because the multi-perspective super vector representation encodes both the shared information and the independent information of the different types of local features, the encoded image information is complete without being redundant, which can significantly improve the encoding effect of the image.

Description

Method and device for representing and processing an image
Technical field
The invention belongs to the field of image processing, and in particular relates to a method and a device for representing and processing an image.
Background art
When images (including media data such as pictures and videos) are classified or retrieved, local image features that describe the image must be extracted from the image to be classified or retrieved and then encoded, so that the image is represented in a form convenient for classification and retrieval. Pictures can be described by features such as the scale-invariant feature transform (SIFT) and the histogram of oriented gradients (HOG); videos can be described by features such as HOG and the histogram of optical flow (HOF).
Earlier research found that different types of local features describe an image from different aspects (different perspectives) and are complementary to a certain degree. Using complementary local features of different types helps to improve image classification and recognition.
At present, different types of local features of an image are generally processed as follows: local regions are first delimited in the picture, and features such as histograms of oriented gradients are then extracted from these local regions in the original frame. Features are merged by either early fusion or late fusion. In early fusion, all local features are concatenated into one long feature vector, which is encoded and aggregated to form a mid-level representation of the picture. In late fusion, each type of local feature is encoded and aggregated separately to form several mid-level picture representations; these representations are then concatenated, or the classification scores obtained from the single features are fused by weighting. Finally, the mid-level picture representation is fed into a classifier for classification.
However, because these methods merely splice the local features of the image before or after encoding, the resulting image code may be incomplete or overly redundant, and the encoding effect is poor.
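The two fusion strategies described above can be sketched as follows. This is only an illustration: the `encode` function is a hypothetical stand-in (mean and standard-deviation pooling) for whatever encoder is actually used, and the feature dimensions are arbitrary.

```python
import numpy as np

def encode(features: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in encoder: mean and standard-deviation pooling."""
    return np.concatenate([features.mean(axis=0), features.std(axis=0)])

def early_fusion(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    # Concatenate the two descriptors of every region, then encode once.
    return encode(np.hstack([x, y]))

def late_fusion(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    # Encode each descriptor type separately, then concatenate the codes.
    return np.concatenate([encode(x), encode(y)])

rng = np.random.default_rng(0)
x = rng.normal(size=(50, 8))  # 50 local regions, 8-dim descriptor of type 1
y = rng.normal(size=(50, 4))  # the same 50 regions, 4-dim descriptor of type 2
early = early_fusion(x, y)
late = late_fusion(x, y)
```

With this simple pooling encoder, both strategies produce the same numbers in a different order, which illustrates the limitation: neither one models the correlation between the two feature types.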
Summary of the invention
The purpose of the embodiments of the invention is to provide a method for representing and processing an image that covers both the shared information and the independent information of different local features, so as to solve the prior-art problem that image codes are incomplete or overly redundant because images are encoded merely by splicing local features before or after encoding.
The embodiments of the invention are realized as a method for representing and processing an image, the method comprising:
extracting at least two types of local features from the image to be represented;
jointly building a mixed independent information decomposition model according to the different types of local features extracted from the image to be represented, the mixed independent information decomposition model comprising the shared information and the independent information of the different types of local features;
encoding the shared information and the independent information of the different types of local features according to the mixed independent information decomposition model, to obtain the multi-perspective super vector representation of the image.
Another purpose of the embodiments of the invention is to provide a device for representing and processing an image, the device comprising:
an extraction unit, configured to extract at least two types of local features from the image to be represented;
a modeling unit, configured to jointly build a mixed independent information decomposition model according to the different types of local features extracted from the image to be represented, the mixed independent information decomposition model comprising the shared information and the independent information of the different types of local features;
a coding unit, configured to encode the shared information and the independent information of the different types of local features according to the mixed independent information decomposition model, to obtain the multi-perspective super vector representation of the image.
In the embodiments of the invention, at least two types of local features are extracted from the image to be represented, a mixed independent information decomposition model comprising the shared information and the independent information of the different types of local features is jointly built, and the shared information and the independent information are encoded according to this model to obtain the multi-perspective super vector representation of the image. Because the multi-perspective super vector representation obtained by the invention encodes both the shared information and the independent information of the different types of local features, the encoded image information is complete without being redundant, which can significantly improve the encoding effect of the image.
Brief description of the drawings
Fig. 1 is a flowchart of the method for representing and processing an image provided by a specific embodiment of the invention;
Fig. 2 is a flowchart of the parameter training of the mixed independent information decomposition model provided by a specific embodiment of the invention;
Fig. 3 is a flowchart of obtaining the multi-perspective super vector representation of an image provided by a specific embodiment of the invention;
Fig. 4 is a structural diagram of the device for processing an image provided by a specific embodiment of the invention.
Embodiments
To make the purpose, technical solution and advantages of the invention clearer, the invention is further described below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the invention and are not intended to limit it.
The embodiments of the invention are mainly used to represent pictures or videos by multiple local features. Pictures, videos and other media data are collectively called images in the invention. Because the image representation method of the invention fuses multiple local features and includes a super vector formed from the information shared by the different types of local features, it can be called a method for representing and processing an image based on a multi-perspective super vector (covering multiple types of local features). In addition, the multi-perspective super vector obtained by the method can be used to train a classifier so that images can be classified quickly. It can also be used, together with a similarity estimation function of the multi-perspective super vector such as a Mahalanobis function or a kernel function, to retrieve images similar to a query image. Specific embodiments are described below.
Fig. 1 shows the flow of the method for representing and processing an image according to an embodiment of the invention. In step S101, at least two types of local features are extracted from the image to be represented.
The local features of a picture may be the scale-invariant feature transform (SIFT), the histogram of oriented gradients (HOG) and so on; the local features of a video may be the histogram of oriented gradients, the histogram of optical flow and so on.
SIFT features are based on points of interest in the local appearance of an object and are independent of the size and rotation of the image. They also tolerate changes in illumination, noise and small changes of viewpoint well. Because of these properties, they are highly distinctive and relatively easy to extract, and objects can be identified in a very large feature database with few false matches. The detection rate for partially occluded objects described by SIFT features is also quite high; as few as three SIFT features of an object are enough to compute its position and orientation.
The core idea of the HOG descriptor is that the appearance and shape of an object in an image can be described well by the distribution of intensity gradients or edge directions. It is implemented by first dividing the image into small connected regions called cells, then collecting a histogram of the gradient directions or edge orientations of the pixels in each cell, and finally combining these histograms to form the feature descriptor. To improve accuracy, the local histograms can also be contrast-normalized over a larger region of the image (a block): a measure of intensity is first computed over the block, and each cell in the block is then normalized by this value. After this normalization, the descriptor is more stable under changes in illumination and shadowing.
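The cell-and-block procedure described above can be sketched as follows. This is a simplified illustration, not a full HOG implementation: the cell size, number of bins, block size and L2 normalization are assumed values, and bin interpolation is omitted.

```python
import numpy as np

def cell_histograms(image: np.ndarray, cell: int = 8, bins: int = 9) -> np.ndarray:
    """Magnitude-weighted histogram of unsigned gradient orientations per cell."""
    gy, gx = np.gradient(image.astype(float))
    magnitude = np.hypot(gx, gy)
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    bin_idx = np.minimum((angle / (180.0 / bins)).astype(int), bins - 1)
    rows, cols = image.shape[0] // cell, image.shape[1] // cell
    hist = np.zeros((rows, cols, bins))
    for r in range(rows):
        for c in range(cols):
            window = (slice(r * cell, (r + 1) * cell), slice(c * cell, (c + 1) * cell))
            hist[r, c] = np.bincount(bin_idx[window].ravel(),
                                     weights=magnitude[window].ravel(),
                                     minlength=bins)
    return hist

def block_normalize(hist: np.ndarray, block: int = 2, eps: float = 1e-6) -> np.ndarray:
    """L2-normalize overlapping blocks of cells and concatenate them."""
    rows, cols, _ = hist.shape
    out = []
    for r in range(rows - block + 1):
        for c in range(cols - block + 1):
            v = hist[r:r + block, c:c + block].ravel()
            out.append(v / np.sqrt(np.sum(v ** 2) + eps ** 2))
    return np.concatenate(out)

rng = np.random.default_rng(0)
image = rng.random((32, 32))  # synthetic grayscale image
descriptor = block_normalize(cell_histograms(image))
```

For a 32x32 image with these settings there are 4x4 cells and 3x3 overlapping blocks, each contributing 2x2x9 values.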
Compared with other feature description methods, the HOG descriptor has several advantages. First, because HOG operates on local cells of the image, it remains largely invariant to geometric and photometric deformations of the image, since such deformations only appear over larger spatial regions. Second, under coarse spatial sampling, fine orientation sampling and strong local photometric normalization, a pedestrian who stays roughly upright may make small limb movements, and these movements can be ignored without affecting detection. The HOG method is therefore particularly suitable for pedestrian detection in images.
It can thus be seen that different types of local features describe images and videos from different aspects, that is, from different perspectives, and that their strengths and weaknesses are complementary.
In step S102, a mixed independent information decomposition model is jointly built according to the different types of local features extracted from the image to be represented; the mixed independent information decomposition model comprises the shared information and the independent information of the different types of local features.
For each local region in a given picture, two features are extracted, and concatenating them yields a new feature that carries multi-perspective information. To model the concatenated feature further, an independent information decomposition model is used. The two low-level features x and y are assumed to satisfy:
$$x = W_x z + z_x + \xi_x$$
$$y = W_y z + z_y + \xi_y$$
where $z$ represents the information shared by the two low-level features x and y, $W_x$ and $W_y$ are linear transformations, $z_x$ and $z_y$ represent the independent information specific to x and y respectively, and $\xi_x$ and $\xi_y$ are noise (error) terms that each follow a Gaussian distribution. This gives the following probabilistic model:
$$p(x \mid z) = N(W_x z + \mu_x, \Psi_x)$$
$$p(y \mid z) = N(W_y z + \mu_y, \Psi_y)$$
Computing the marginal distributions gives:
$$p(x) = \int p(x \mid z)\, p(z)\, dz = N(\mu_x, W_x W_x^T + \Psi_x)$$
$$p(y) = \int p(y \mid z)\, p(z)\, dz = N(\mu_y, W_y W_y^T + \Psi_y)$$
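The marginal covariance $W_x W_x^T + \Psi_x$ can be checked numerically. The sketch below assumes a standard Gaussian prior on $z$ and uses arbitrary illustrative values for $W_x$, $\mu_x$ and $\Psi_x$; it samples from the model and compares the empirical moments with the formula.

```python
import numpy as np

rng = np.random.default_rng(0)
d_z, d_x, n = 2, 3, 200_000          # illustrative dimensions and sample count

W_x = rng.normal(size=(d_x, d_z))    # linear transformation (arbitrary values)
mu_x = np.array([1.0, -2.0, 0.5])    # mean of the view-specific part
Psi_x = np.diag([0.3, 0.5, 0.2])     # covariance of the view-specific part

z = rng.normal(size=(n, d_z))                                    # z ~ N(0, I)
residual = rng.multivariate_normal(np.zeros(d_x), Psi_x, size=n)
x = z @ W_x.T + mu_x + residual                                  # samples of x

empirical_mean = x.mean(axis=0)
empirical_cov = np.cov(x, rowvar=False)
predicted_cov = W_x @ W_x.T + Psi_x
max_cov_error = np.abs(empirical_cov - predicted_cov).max()
```

The empirical mean approaches `mu_x` and the empirical covariance approaches `predicted_cov` as the number of samples grows.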
A conventional independent information decomposition model can extract the part of the new feature that is shared by the two features, but it can only handle linear correlation. In real data, different features are often nonlinearly correlated, so a mixed independent information decomposition model is needed to model the new feature.
As a mixture model, the mixed independent information decomposition model divides the feature space into several local regions and models each region with an independent information decomposition model. Because the correlation within a local region can be approximated as linear, the overall nonlinear relation can be modeled by a mixture of locally linear models. The mixed independent information decomposition model of the invention can therefore handle nonlinear correlation between features.
In an optional embodiment, the statistical form of the mixed independent information decomposition model can be written as:
$$p(x, y) = \sum_k \omega_k\, p(x, y \mid k) = \sum_k \omega_k \int p(x \mid z_k, k)\, p(y \mid z_k, k)\, p(z_k)\, dz_k,$$
where $k$ is the index of the mixture component, $\omega_k = P(k)$ is the prior probability of the k-th mixture component, $z_k$ is the variable shared by x and y in the k-th mixture component, and the conditional probabilities $p(x \mid z_k, k)$ and $p(y \mid z_k, k)$ follow normal distributions. As before, $z$ represents the information shared by the two low-level features x and y, $W_x$ and $W_y$ are linear transformations, and $z_x$ and $z_y$ represent the independent information specific to x and y respectively.
Assuming that $z_k$ follows a standard Gaussian distribution, it can be derived that
$p(x, y \mid k)$ is a Gaussian distribution $N(\mu_k, \Sigma_k)$ whose mean is $\mu_k = \begin{pmatrix} \mu_x^k \\ \mu_y^k \end{pmatrix}$ and whose covariance is $\Sigma_k = \begin{pmatrix} W_x^k (W_x^k)^T + \Psi_x^k & W_x^k (W_y^k)^T \\ W_y^k (W_x^k)^T & W_y^k (W_y^k)^T + \Psi_y^k \end{pmatrix}$.
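The block structure of $\mu_k$ and $\Sigma_k$ for one mixture component can be assembled as in the following sketch. The function name `joint_gaussian` and the example dimensions are illustrative only.

```python
import numpy as np

def joint_gaussian(W_x, W_y, mu_x, mu_y, Psi_x, Psi_y):
    """Mean and covariance of the stacked pair [x, y] for one mixture component."""
    W = np.vstack([W_x, W_y])          # stacked projection [W_x; W_y]
    mu = np.concatenate([mu_x, mu_y])
    Sigma = W @ W.T                    # off-diagonal blocks are W_x W_y^T and its transpose
    d_x = W_x.shape[0]
    Sigma[:d_x, :d_x] += Psi_x         # add the view-specific covariance of x
    Sigma[d_x:, d_x:] += Psi_y         # add the view-specific covariance of y
    return mu, Sigma

rng = np.random.default_rng(1)
W_x, W_y = rng.normal(size=(3, 2)), rng.normal(size=(4, 2))
mu, Sigma = joint_gaussian(W_x, W_y, np.zeros(3), np.ones(4),
                           0.5 * np.eye(3), 0.2 * np.eye(4))
```

The cross-covariance block depends only on the projections, which is why the shared variable captures exactly the correlation between the two feature types.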
The parameters of the mixed independent information decomposition model must be trained for the concrete scenario, for example for a set of several different local features, to obtain the parameters that correspond to the specific feature data. As shown in Fig. 2, the parameter training process comprises the following steps:
Step S201: initialize the parameters of the mixed independent information decomposition model. The initialized parameters comprise the covariance matrices, the centers of the local Gaussian distributions, the projection matrices and the weight of each local model.
Specifically and optionally, k-means clustering can be applied to all features to obtain the image vocabulary $\{v_k\}$, $k = 1 \ldots K$, which provides the center of each local Gaussian distribution, together with the local covariance matrices corresponding to features x and y. A single independent information decomposition is then performed on each local Gaussian to obtain its parameters, namely the projection matrices. The weights of the local models are set equal, namely $\omega_k = 1/K$.
Step S202: according to the initialized parameters of the mixed independent information decomposition model, calculate the estimates of the hidden variables and the posterior probabilities that correspond to the sample local features.
Specifically and optionally, based on the initialized model parameters, the estimate of the hidden variable $z_{i,k}$ and the posterior probability $\gamma_{i,k}$ corresponding to the sample local features $x_i$ and $y_i$ are calculated. The posterior probability $\gamma_{i,k}$ of sample local feature $i$ is first calculated under each local Gaussian model $k$, followed by the estimate of the hidden variable $z_{i,k}$. Here $\omega_k$ denotes the prior probability of the k-th mixture component, $\upsilon_i = [x_i, y_i]$ denotes the i-th local feature pair, and $\mu_k$ is the center of the k-th mixture component.
The expectation, covariance and correlation matrix of the hidden variable in each local model are then updated. The expectation of the hidden variable is:
$$\hat{z}_{i,k} = E(z_{i,k}) = \left[ (W_x^k)^T, (W_y^k)^T \right] \Sigma_k^{-1} \begin{pmatrix} x_i - \mu_x^k \\ y_i - \mu_y^k \end{pmatrix},$$
the covariance is:
$$\Sigma_{z_{i,k}} = \mathrm{Var}(z_{i,k}) = I - \left[ (W_x^k)^T, (W_y^k)^T \right] \Sigma_k^{-1} \begin{pmatrix} W_x^k \\ W_y^k \end{pmatrix},$$
and the correlation matrix is:
$$\langle z_{i,k} z_{i,k}^T \rangle = E(z_{i,k} z_{i,k}^T) = \Sigma_{z_{i,k}} + \hat{z}_{i,k} \hat{z}_{i,k}^T.$$
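Step S202 can be sketched as follows. The hidden-variable moments follow the three formulas above. The posterior $\gamma_{i,k}$ is computed here with the usual Gaussian mixture responsibility, $\gamma_{i,k} \propto \omega_k N(\upsilon_i; \mu_k, \Sigma_k)$, which is an assumption; the array layout and the names `log_gaussian` and `e_step` are likewise illustrative.

```python
import numpy as np

def log_gaussian(v, mu, Sigma):
    """Log-density of N(mu, Sigma) evaluated at every row of v."""
    diff = v - mu
    solved = np.linalg.solve(Sigma, diff.T).T
    _, logdet = np.linalg.slogdet(Sigma)
    return -0.5 * (np.sum(diff * solved, axis=1) + logdet
                   + v.shape[1] * np.log(2 * np.pi))

def e_step(v, omega, mu, W, Sigma):
    """v: (N, D) stacked pairs [x_i, y_i]; mu: (K, D); W: (K, D, d) stacked
    [W_x; W_y]; Sigma: (K, D, D)."""
    K, _, d = W.shape
    log_p = np.stack([np.log(omega[k]) + log_gaussian(v, mu[k], Sigma[k])
                      for k in range(K)], axis=1)
    log_p -= log_p.max(axis=1, keepdims=True)        # numerical stability
    gamma = np.exp(log_p)
    gamma /= gamma.sum(axis=1, keepdims=True)        # assumed responsibility formula
    # z_hat[i, k] = W_k^T Sigma_k^{-1} (v_i - mu_k)
    z_hat = np.stack([np.linalg.solve(Sigma[k], (v - mu[k]).T).T @ W[k]
                      for k in range(K)], axis=1)
    # Sigma_z[k] = I - W_k^T Sigma_k^{-1} W_k
    Sigma_z = np.stack([np.eye(d) - W[k].T @ np.linalg.solve(Sigma[k], W[k])
                        for k in range(K)])
    # <z z^T> = Sigma_z + z_hat z_hat^T
    second_moment = Sigma_z[None] + z_hat[..., :, None] * z_hat[..., None, :]
    return gamma, z_hat, Sigma_z, second_moment

rng = np.random.default_rng(2)
N, K, D, d = 20, 2, 5, 2
W = rng.normal(size=(K, D, d))
Sigma = np.stack([W[k] @ W[k].T + 0.5 * np.eye(D) for k in range(K)])
mu = rng.normal(size=(K, D))
v = rng.normal(size=(N, D))
gamma, z_hat, Sigma_z, zz = e_step(v, np.full(K, 1.0 / K), mu, W, Sigma)
```

Note that the hidden-variable covariance does not depend on the sample index, so it is stored once per mixture component.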
Step S203: update the parameters according to the calculated hidden variables and posterior probabilities. The parameters comprise the covariance matrices, the centers of the local Gaussian models, the projection matrices and the weight of each local model.
Specifically and optionally, based on the hidden variables $\hat{z}_{i,k}$ and the posterior probabilities $\gamma_{i,k}$, the other parameters of the model are updated, namely the weight of each local Gaussian model, the center of each local Gaussian model, the projection matrices and the covariance matrices, as follows:
The weight $\omega_k$ of each local Gaussian model is calculated;
The centers of the local Gaussian models are calculated according to the formulas
$$\mu_x^k = \frac{\sum_i \hat{\gamma}_{i,k} \left( x_i - W_x^k \hat{z}_{i,k} \right)}{\sum_i \hat{\gamma}_{i,k}}, \qquad \mu_y^k = \frac{\sum_i \hat{\gamma}_{i,k} \left( y_i - W_y^k \hat{z}_{i,k} \right)}{\sum_i \hat{\gamma}_{i,k}};$$
The projection matrices are calculated according to the formulas
$$W_x^k = \left\{ \sum_i \hat{\gamma}_{i,k} \left( x_i - \mu_x^k \right) \hat{z}_{i,k}^T \right\} \left\{ \sum_i \hat{\gamma}_{i,k} \langle z_{i,k} z_{i,k}^T \rangle \right\}^{-1},$$
$$W_y^k = \left\{ \sum_i \hat{\gamma}_{i,k} \left( y_i - \mu_y^k \right) \hat{z}_{i,k}^T \right\} \left\{ \sum_i \hat{\gamma}_{i,k} \langle z_{i,k} z_{i,k}^T \rangle \right\}^{-1};$$
The covariance matrices are calculated according to the formulas
$$\Psi_x^k = \frac{\sum_i \hat{\gamma}_{i,k} \left( x_i - W_x^k \hat{z}_{i,k} - \mu_x^k \right) \left( x_i - W_x^k \hat{z}_{i,k} - \mu_x^k \right)^T}{\sum_i \hat{\gamma}_{i,k}} + W_x^k \Sigma_{z_k} (W_x^k)^T,$$
$$\Psi_y^k = \frac{\sum_i \hat{\gamma}_{i,k} \left( y_i - W_y^k \hat{z}_{i,k} - \mu_y^k \right) \left( y_i - W_y^k \hat{z}_{i,k} - \mu_y^k \right)^T}{\sum_i \hat{\gamma}_{i,k}} + W_y^k \Sigma_{z_k} (W_y^k)^T;$$
where $k$ is the index of the mixture component, $\omega_k = P(k)$ is the prior probability of the k-th mixture component, $z_k$ is the variable shared by x and y in the k-th mixture component, the conditional probabilities follow normal distributions, $z$ represents the information shared by the two low-level features x and y, $W_x$ and $W_y$ are linear transformations, and $z_x$ and $z_y$ represent the independent information specific to x and y respectively.
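The update formulas of step S203 can be sketched for one feature type as follows (the same function applies to y). The center, projection and covariance updates follow the formulas above; the weight update $\omega_k = \frac{1}{N} \sum_i \gamma_{i,k}$ is the usual mixture-model rule and is an assumption here, as are the function name and array layout.

```python
import numpy as np

def m_step_view(x, gamma, z_hat, Sigma_z, W_old):
    """x: (N, D); gamma: (N, K); z_hat: (N, K, d); Sigma_z: (K, d, d);
    W_old: (K, D, d) projection matrices from the previous iteration."""
    N, K = gamma.shape
    omega = gamma.sum(axis=0) / N                     # assumed mixture-weight update
    mu, W, Psi = [], [], []
    for k in range(K):
        g = gamma[:, k:k + 1]                         # (N, 1) posterior weights
        g_sum = g.sum()
        z = z_hat[:, k]                               # (N, d)
        mu_k = (g * (x - z @ W_old[k].T)).sum(axis=0) / g_sum
        zz = (g * z).T @ z + g_sum * Sigma_z[k]       # sum_i gamma_ik <z z^T>
        W_k = ((g * (x - mu_k)).T @ z) @ np.linalg.inv(zz)
        r = x - z @ W_k.T - mu_k                      # residual of the linear model
        Psi_k = (g * r).T @ r / g_sum + W_k @ Sigma_z[k] @ W_k.T
        mu.append(mu_k)
        W.append(W_k)
        Psi.append(Psi_k)
    return omega, np.stack(mu), np.stack(W), np.stack(Psi)

rng = np.random.default_rng(3)
N, K, D, d = 30, 2, 3, 2
x = rng.normal(size=(N, D))
gamma = rng.random((N, K))
gamma /= gamma.sum(axis=1, keepdims=True)
z_hat = rng.normal(size=(N, K, d))
Sigma_z = np.stack([0.1 * np.eye(d)] * K)
omega, mu, W, Psi = m_step_view(x, gamma, z_hat, Sigma_z, rng.normal(size=(K, D, d)))
```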
Step S204: judge whether the parameters have converged, or whether the number of parameter iterations has reached the maximum number of iterations. If the parameters have not converged and the maximum number of iterations has not been reached, return to step S202; otherwise proceed to step S205 and obtain the learned parameters of the mixed independent information decomposition model.
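The stopping logic of step S204 can be sketched as a generic loop. The helper `iterate_until_converged`, the tolerance and the toy update rule are hypothetical; in the actual training, `update` would perform steps S202 and S203 on the model parameters.

```python
def iterate_until_converged(update, params, tol=1e-6, max_iter=100):
    """Repeat `update` until the largest parameter change drops below `tol`
    or `max_iter` iterations have been run."""
    for iteration in range(1, max_iter + 1):
        new_params = update(params)
        change = max(abs(a - b) for a, b in zip(new_params, params))
        params = new_params
        if change < tol:          # converged: stop early
            break
    return params, iteration

# Toy update rule standing in for one EM iteration: Newton's method for sqrt(2).
toy_update = lambda p: [0.5 * (p[0] + 2.0 / p[0])]
result, n_iterations = iterate_until_converged(toy_update, [1.0])
```

The loop continues only while both conditions hold: the parameters are still changing and the iteration budget is not exhausted.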
In step S103, the shared information and the independent information of the different types of local features are encoded according to the mixed independent information decomposition model. Because an image can be regarded as a set of pairs of different types of local features (samples), the codes of the local features can be aggregated after the local features have been encoded, which yields the multi-perspective super vector representation of the image.
In this step the mixed independent information decomposition model is used to build the multi-perspective super vector representation of the image. The different types of local features extracted from the image can be regarded as the set of local feature pairs $\{(x_i, y_i)\}$ of the image. The mixed independent information decomposition model is then used to encode the shared information and the respective independent information of the different types of local features, and the codes of the local features are aggregated to obtain the multi-perspective super vector representation of the image. The specific steps for obtaining the multi-perspective super vector representation of the image are shown in Fig. 3:
In step S301, the estimate of the hidden variable of each sample is determined according to the parameters of the mixed independent information decomposition model, and the estimates are combined by posterior-probability weighting to obtain the estimate of the hidden variable of each local Gaussian model. The estimates of the hidden variables of all local Gaussian models are then concatenated to obtain the super vector of the shared information.
Specifically, the EM algorithm of the mixed independent information decomposition model, as shown in step S202, obtains the estimate of the corresponding hidden variable $\hat{z}_{i,k}$ from each sample. These estimates are then fused by weighting them with the posterior probabilities $\gamma_{i,k}$, which gives the hidden variable $z_k$ of each local Gaussian model.
The super vector representation $Z$ of the shared information is the concatenation of all local hidden variables $z_k$; that is, the hidden variables obtained from the local Gaussians are strung together to form the shared information vector.
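Step S301 can be sketched as follows. The weighting used here, $z_k = \sum_i \gamma_{i,k} \hat{z}_{i,k} / \sum_i \gamma_{i,k}$, is an assumed form of the posterior-weighted fusion; the function name and the array layout are illustrative.

```python
import numpy as np

def shared_supervector(gamma: np.ndarray, z_hat: np.ndarray) -> np.ndarray:
    """gamma: (N, K) posteriors; z_hat: (N, K, d) hidden-variable estimates.
    Returns the concatenation Z = [z_1, ..., z_K] of length K * d."""
    weighted = np.einsum("nk,nkd->kd", gamma, z_hat)   # sum_i gamma_ik * z_hat_ik
    z_k = weighted / gamma.sum(axis=0)[:, None]        # assumed normalization
    return z_k.ravel()

rng = np.random.default_rng(4)
N, K, d = 10, 3, 2
gamma = rng.random((N, K))
gamma /= gamma.sum(axis=1, keepdims=True)
z_hat = rng.normal(size=(N, K, d))
Z = shared_supervector(gamma, z_hat)
```

The length of `Z` depends only on the number of mixture components and the dimension of the shared variable, not on the number of local features in the image.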
In step S302, the gradient vectors of the sample likelihood function of the mixed independent information decomposition model with respect to the parameters of x and of y are obtained; such a gradient vector is also referred to as a Fisher vector. These gradient vectors contain the independent information of the different types of local features x and y.
Specifically, the gradient vectors $G_x$ and $G_y$ of the model with respect to the parameters of x and of y are computed. The gradient vector of the mixed independent information decomposition model with respect to the parameters of x is calculated according to the formulas
$$\frac{\partial E(L)}{\partial \mu_x^k} = 2 \omega_k (\Psi_x^k)^{-1} \left\{ \mu_x^k - \frac{\sum_i \gamma_{i,k} \left( x_i - W_x^k \hat{z}_{i,k} \right)}{\sum_i \gamma_{i,k}} \right\}$$
$$\frac{\partial E(L)}{\partial \Psi_x^k} = \omega_k (\Psi_x^k)^{-1} \left\{ \Psi_x^k - \frac{\sum_i \gamma_{i,k} \left( x_{i,k} - W_x^k \hat{z}_{i,k} \right) \left( x_{i,k} - W_x^k \hat{z}_{i,k} \right)^T}{\sum_i \gamma_{i,k}} \right\} (\Psi_x^k)^{-1}$$
The gradient vector of the mixed independent information decomposition model with respect to the parameters of y is calculated according to the formulas
$$\frac{\partial E(L)}{\partial \mu_y^k} = 2 \omega_k (\Psi_y^k)^{-1} \left\{ \mu_y^k - \frac{\sum_i \gamma_{i,k} \left( y_i - W_y^k \hat{z}_{i,k} \right)}{\sum_i \gamma_{i,k}} \right\}$$
$$\frac{\partial E(L)}{\partial \Psi_y^k} = \omega_k (\Psi_y^k)^{-1} \left\{ \Psi_y^k - \frac{\sum_i \gamma_{i,k} \left( y_{i,k} - W_y^k \hat{z}_{i,k} \right) \left( y_{i,k} - W_y^k \hat{z}_{i,k} \right)^T}{\sum_i \gamma_{i,k}} \right\} (\Psi_y^k)^{-1}$$
where $E(L)$ denotes the log-likelihood function of the mixed independent information decomposition model, $k$ is the index of the mixture component, $\omega_k = P(k)$ is the prior probability of the k-th mixture component, $z_k$ is the variable shared by x and y in the k-th mixture component, and $W_x$ and $W_y$ are linear transformations.
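The gradient formulas of step S302 can be sketched for one feature type as follows. The text does not define $x_{i,k}$; this sketch assumes it is the centered feature $x_i - \mu_x^k$. The function name and array layout are illustrative.

```python
import numpy as np

def view_gradient(x, gamma, z_hat, omega, mu, W, Psi):
    """x: (N, D); gamma: (N, K); z_hat: (N, K, d); omega: (K,); mu: (K, D);
    W: (K, D, d); Psi: (K, D, D). Returns the concatenated gradients per component."""
    parts = []
    for k in range(len(omega)):
        g = gamma[:, k:k + 1]
        g_sum = g.sum()
        Psi_inv = np.linalg.inv(Psi[k])
        pred = z_hat[:, k] @ W[k].T                    # W_k z_hat_ik for every sample
        grad_mu = 2 * omega[k] * Psi_inv @ (mu[k] - (g * (x - pred)).sum(axis=0) / g_sum)
        r = (x - mu[k]) - pred                         # assumption: x_{i,k} = x_i - mu_k
        grad_Psi = omega[k] * Psi_inv @ (Psi[k] - (g * r).T @ r / g_sum) @ Psi_inv
        parts += [grad_mu, grad_Psi.ravel()]
    return np.concatenate(parts)

rng = np.random.default_rng(5)
N, K, D, d = 25, 2, 3, 2
x = rng.normal(size=(N, D))
gamma = rng.random((N, K))
gamma /= gamma.sum(axis=1, keepdims=True)
z_hat = rng.normal(size=(N, K, d))
W = rng.normal(size=(K, D, d))
Psi = np.stack([0.5 * np.eye(D)] * K)
# Centers chosen as the posterior-weighted means of x_i - W_k z_hat_ik.
mu = np.stack([(gamma[:, k:k + 1] * (x - z_hat[:, k] @ W[k].T)).sum(axis=0)
               / gamma[:, k].sum() for k in range(K)])
G_x = view_gradient(x, gamma, z_hat, np.full(K, 1.0 / K), mu, W, Psi)
```

In this example the centers equal the weighted means used in the training update, so the gradient with respect to the centers vanishes; for a new image with different statistics it would generally be nonzero.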
In step S303, the super vector of the shared information and the gradient vectors containing the independent information are concatenated to obtain the multi-perspective super vector representation.
That is, the super vector representation $Z$ of the shared information and the gradient vectors
$$G_x = \left[ \frac{\partial E(L)}{\partial \mu_x^k}, \frac{\partial E(L)}{\partial \Psi_x^k} \right], \qquad G_y = \left[ \frac{\partial E(L)}{\partial \mu_y^k}, \frac{\partial E(L)}{\partial \Psi_y^k} \right]$$
are concatenated to obtain the final multi-perspective super vector representation.
In a further embodiment of the invention, the method for representing and processing an image may also comprise step S104: training a classifier according to the obtained multi-perspective super vectors of images of different classes, so that the classifier can classify an image according to its multi-perspective super vector; or training a detector according to the multi-perspective super vectors of image blocks, so that it can detect whether a particular object is present in an image block.
Alternatively, it may comprise step S105: retrieving images similar to a given image according to the obtained multi-perspective super vector and a similarity estimation function of the multi-perspective super vector. Classification and retrieval of images can thereby be completed more conveniently and effectively.
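The retrieval of step S105 can be sketched as ranking database images by a similarity function of their super vectors. Cosine similarity is used here purely as an example of such a function; the function name and the synthetic data are illustrative.

```python
import numpy as np

def rank_by_cosine(query: np.ndarray, database: np.ndarray):
    """Return database indices sorted from most to least similar to the query."""
    q = query / np.linalg.norm(query)
    db = database / np.linalg.norm(database, axis=1, keepdims=True)
    scores = db @ q                       # cosine similarity with every entry
    order = np.argsort(-scores)
    return order, scores[order]

rng = np.random.default_rng(6)
database = rng.normal(size=(5, 16))       # 5 images, 16-dim super vectors
query = database[3] + 0.01 * rng.normal(size=16)   # slightly perturbed copy of image 3
order, scores = rank_by_cosine(query, database)
```

Other similarity functions, such as a Mahalanobis function or a kernel function, could be substituted for the cosine score without changing the ranking logic.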
In the embodiments of the invention, at least two types of local features are extracted from the image to be represented, a mixed independent information decomposition model comprising the shared information and the independent information of the different types of local features is jointly built, and the shared information and the independent information are encoded according to this model to obtain the multi-perspective super vector representation of the image. Because the multi-perspective super vector representation obtained by the invention encodes both the shared information and the independent information of the different types of local features, the encoded image information is complete without being redundant, which can significantly improve the encoding effect of the image.
Fig. 4 shows a device for representing and processing an image according to another embodiment of the invention, which comprises:
an extraction unit 401, configured to extract at least two types of local features from the image to be represented;
a modeling unit 402, configured to jointly build a mixed independent information decomposition model according to the different types of local features extracted from the image to be represented, the mixed independent information decomposition model comprising the shared information and the independent information of the different types of local features;
a coding unit 403, configured to encode the shared information and the independent information of the different types of local features according to the mixed independent information decomposition model and to aggregate the codes of the local features, to obtain the multi-perspective super vector representation of the image.
Further, the device for representing and processing an image according to the invention may also comprise:
a recognition unit 404, configured to train a classifier according to the obtained multi-perspective super vectors of images of different classes, so that the classifier can determine the class of an image according to its multi-perspective super vector; or to train a detector according to the multi-perspective super vectors of image blocks, so that it can detect whether a particular object is present in an image block;
or a retrieval unit 405, configured to retrieve images similar to a given image according to the obtained multi-perspective super vector and a similarity estimation function of the multi-perspective super vector.
Optionally, the mixed independent information decomposition model is:
$$p(x, y) = \sum_k \omega_k\, p(x, y \mid k) = \sum_k \omega_k \int p(x \mid z_k, k)\, p(y \mid z_k, k)\, p(z_k)\, dz_k,$$
where $k$ is the index of the mixture component, $\omega_k = P(k)$ is the prior probability of the k-th mixture component, $z_k$ is the variable shared by x and y in the k-th mixture component, $z$ represents the information shared by the two low-level features x and y, $W_x$ and $W_y$ are linear transformations, and $z_x$ and $z_y$ represent the independent information specific to x and y respectively;
Described modeling unit comprises:
Initialization subelement, the parameter of mixing independent information decomposition model for initialization, the parameter of initialized mixing independent information decomposition model comprise the weights of covariance matrix, local Gaussian center of distribution, projection matrix and each local matrix;
Computation subunit, for according to the parameter of initialized mixing independent information decomposition model, calculate the estimation corresponding to hidden variable and the posterior probability of sample local feature;
Upgrade subelement, for according to calculated hidden variable and posterior probability, upgrade described parameter, described parameter comprises center, the projection matrix of described covariance matrix, Local Gaussian Model, the weights of each local matrix;
Judgment sub-unit, be used for judging whether described parameter convergence restrains, or whether the number of times that calculates described parameter iteration reaches the maximum times of iteration, if described parameter does not restrain or the number of times of described parameter iteration does not reach the maximum times of iteration, proceed to computing unit, otherwise obtain the parameter after the study of described mixing independent information decomposition model.
Optionally, the coding unit extracts local features of different types from the image to form a multi-modal local feature sample set, uses the trained mixed independent information decomposition model to calculate the shared information vector and the independent information vectors of the sample set, and then forms the multi-perspective super vector representation of the image; it specifically comprises:
A super vector obtaining subunit, configured to determine the estimate of the hidden variable of each sample according to the parameters of the mixed independent information decomposition model, perform weighted integration using the posterior probabilities to obtain the estimate of the hidden variable of each local Gaussian model, and concatenate the estimates of the hidden variables of the local Gaussian models to obtain the super vector of the shared information;
A gradient vector obtaining subunit, configured to obtain the gradient vectors of the mixed independent information decomposition model with respect to the parameters of x and of y, respectively;
A concatenation subunit, configured to concatenate the super vector of the shared information with the gradient vectors to obtain the multi-perspective super vector representation.
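The concatenation performed by the coding unit can be sketched in a few lines of Python. This is only an illustration of the assembly order described above (shared-information super vector first, then the gradient vectors for x and y); the function name `build_super_vector` and the argument layout are assumptions, and the inputs are presumed to have been computed elsewhere.

```python
import numpy as np


def build_super_vector(z_hat_per_component, grad_x, grad_y):
    """Concatenate shared and independent information into one vector.

    z_hat_per_component : sequence of per-component hidden-variable estimates
    grad_x, grad_y      : gradient vectors for the two feature types
    """
    # Shared information: hidden-variable estimates of all components in series.
    shared = np.concatenate([np.ravel(z_k) for z_k in z_hat_per_component])
    # Final representation: shared part followed by the two gradient parts.
    return np.concatenate([shared, np.ravel(grad_x), np.ravel(grad_y)])
```

With K local Gaussian models and a shared variable of dimension q, the shared part has length K·q; the gradient parts add the dimensions of whatever parameters are differentiated.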
The image representation and processing device shown in Fig. 4 of the embodiment of the present invention corresponds to the image representation and processing method shown in Fig. 1, Fig. 2 and Fig. 3, and is not described again here.
The foregoing are merely preferred embodiments of the present invention and are not intended to limit it; any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (10)

1. A method for representing and processing an image, characterized in that the method comprises:
Extracting at least two types of local features from an image to be represented;
Jointly building a mixed independent information decomposition model according to the different types of local features extracted from the image to be represented, the mixed independent information decomposition model comprising the shared information and the independent information of the different types of local features;
Encoding the shared information and the independent information of the different types of local features according to the mixed independent information decomposition model, and summarizing the codes of the local features to obtain the multi-perspective super vector representation of the image.
2. The method according to claim 1, characterized in that the mixed independent information decomposition model is:
$$p(x, y) = \sum_k \omega_k \, p(x, y \mid k) = \sum_k \omega_k \int p(x \mid z_k, k) \, p(y \mid z_k, k) \, p(z_k) \, dz_k,$$
where k is the index of the mixture component, $\omega_k = P(k)$ is the prior probability of the k-th mixture component, $z_k$ is the variable shared by x and y in the k-th mixture component, and the conditional probabilities follow normal distributions; z represents the information shared by the two low-level features x and y, $W_x$ and $W_y$ are linear transformations, and $z_x$ and $z_y$ represent the independent information specific to x and y, respectively;
The step of jointly building the mixed independent information decomposition model according to the different types of local features extracted from the image to be represented comprises:
A. Initializing the parameters of the mixed independent information decomposition model, the initialized parameters comprising the covariance matrices, the centers of the local Gaussian distributions, the projection matrices, and the weight of each local Gaussian model;
B. Calculating, according to the initialized parameters of the mixed independent information decomposition model, the estimates of the hidden variables and the posterior probabilities corresponding to the sample local features;
C. Updating the parameters according to the calculated hidden variables and posterior probabilities, the parameters comprising the covariance matrices, the centers of the local Gaussian models, the projection matrices, and the weight of each local Gaussian model;
D. Judging whether the parameters have converged, or whether the number of parameter iterations has reached the maximum number of iterations; if the parameters have not converged and the number of iterations has not reached the maximum, returning to step B; otherwise, obtaining the learned parameters of the mixed independent information decomposition model.
3. The method according to claim 2, characterized in that, in the step of calculating, according to the initialized parameters of the mixed independent information decomposition model, the estimates of the hidden variables and the posterior probabilities corresponding to the sample local features,
The estimate of the posterior probability $\gamma_{i,k}$ is calculated by a formula in which $\omega_k$ denotes the prior probability of the k-th mixture component, $\upsilon_i = [x_i, y_i]$ denotes the i-th local feature pair, and $\mu_k$ is the center of the k-th mixture component; the estimate of the hidden variable $z_{i,k}$ is calculated as follows:
$$\hat{z}_k = [W_x^{kT}, W_y^{kT}] \, \Sigma_k^{-1} \sum_i \gamma_{i,k} (\upsilon_i - \mu_k);$$
The step of updating the parameters according to the calculated hidden variables and posterior probabilities, the parameters comprising the covariance matrices, the centers of the local Gaussian models, the projection matrices, and the weight of each local Gaussian model, is:
Calculating the weight of each local Gaussian model according to the corresponding weight formula;
Calculating the centers of the local Gaussian models according to the formulas
$$\mu_x^k = \frac{\sum_i \hat{\gamma}_{i,k} (x_i - W_x^k \hat{z}_{i,k})}{\sum_i \hat{\gamma}_{i,k}}, \qquad \mu_y^k = \frac{\sum_i \hat{\gamma}_{i,k} (y_i - W_y^k \hat{z}_{i,k})}{\sum_i \hat{\gamma}_{i,k}};$$
Calculating the projection matrices according to the formulas
$$W_x^k = \Big\{ \sum_i \hat{\gamma}_{i,k} (x_i - \mu_x^k) \hat{z}_{i,k}^T \Big\} \Big\{ \sum_i \hat{\gamma}_{i,k} \langle \hat{z}_{i,k}, \hat{z}_{i,k}^T \rangle \Big\}^{-1},$$
$$W_y^k = \Big\{ \sum_i \hat{\gamma}_{i,k} (y_i - \mu_y^k) \hat{z}_{i,k}^T \Big\} \Big\{ \sum_i \hat{\gamma}_{i,k} \langle \hat{z}_{i,k}, \hat{z}_{i,k}^T \rangle \Big\}^{-1};$$
Calculating the covariance matrices according to the formulas
$$\Psi_x^k = \frac{\sum_i \hat{\gamma}_{i,k} (x_i - W_x^k \hat{z}_{i,k} - \mu_x^k)(x_i - W_x^k \hat{z}_{i,k} - \mu_x^k)^T}{\sum_i \hat{\gamma}_{i,k}} + W_x^k \Sigma_z^k W_x^{kT},$$
$$\Psi_y^k = \frac{\sum_i \hat{\gamma}_{i,k} (y_i - W_y^k \hat{z}_{i,k} - \mu_y^k)(y_i - W_y^k \hat{z}_{i,k} - \mu_y^k)^T}{\sum_i \hat{\gamma}_{i,k}} + W_y^k \Sigma_z^k W_y^{kT};$$
where k is the index of the mixture component, $\omega_k = P(k)$ is the prior probability of the k-th mixture component, $z_k$ is the variable shared by x and y in the k-th mixture component, the conditional probabilities follow normal distributions, z represents the information shared by the two low-level features x and y, $W_x$ and $W_y$ are linear transformations, and $z_x$ and $z_y$ represent the independent information specific to x and y, respectively.
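As an illustration of the update formulas for the center, the projection matrix, and the covariance matrix, the following NumPy sketch applies them to one feature type and one mixture component. It is a sketch under stated assumptions, not the patented implementation: the function name `update_view_parameters` is hypothetical, the bracket term over the hidden-variable estimate is taken here as the plain outer product of the estimate with itself, the center is computed with the previous projection matrix, and the posterior weights and hidden-variable estimates are assumed to be supplied by the preceding calculation step.

```python
import numpy as np


def update_view_parameters(X, Z_hat, gamma, W_old, Sigma_z):
    """Update centre, projection matrix and covariance for one feature type.

    X       : (N, d) local features of one type (x or y)
    Z_hat   : (N, q) hidden-variable estimates for this component
    gamma   : (N,)   posterior weights of the samples for this component
    W_old   : (d, q) projection matrix from the previous iteration
    Sigma_z : (q, q) covariance of the hidden variable (assumed given)
    """
    total = gamma.sum()
    g = gamma[:, None]

    # Centre: posterior-weighted mean of x_i - W z_i.
    mu = (g * (X - Z_hat @ W_old.T)).sum(axis=0) / total

    # Projection matrix: {sum g (x_i - mu) z_i^T} {sum g z_i z_i^T}^{-1}.
    A = (g * (X - mu)).T @ Z_hat          # (d, q)
    B = (g * Z_hat).T @ Z_hat             # (q, q), symmetric
    W = np.linalg.solve(B, A.T).T         # equals A @ inv(B) because B is symmetric

    # Covariance: weighted residual scatter plus W Sigma_z W^T.
    R = X - Z_hat @ W.T - mu
    Psi = (g * R).T @ R / total + W @ Sigma_z @ W.T
    return mu, W, Psi
```

On noise-free data generated as x = W z + mu, these updates return the generating W and mu and a zero residual scatter, which is a convenient sanity check for any implementation of the formulas.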
4. The method according to claim 1, characterized in that the step of encoding the shared information and the independent information of the different types of local features according to the mixed independent information decomposition model, and summarizing the codes of the local features to obtain the multi-perspective super vector representation of the image, specifically comprises:
Estimating, according to the parameters of the mixed independent information decomposition model, the shared hidden variable of each sample, that is, of each local feature pair, performing weighted integration using the posterior probability of each sample to obtain the estimate of the hidden variable of each local Gaussian model, and concatenating the estimates of the hidden variables of the local Gaussian models to obtain the super vector of the shared information;
Obtaining the gradient vectors of the sample likelihood function of the mixed independent information decomposition model with respect to the parameters of x and of y, respectively, the gradient vectors containing the independent information of the different types of local features x and y;
Concatenating the super vector of the shared information with the gradient vectors containing the independent information to obtain the multi-perspective super vector representation.
5. The method according to claim 4, characterized in that, in the step of estimating, according to the parameters of the mixed independent information decomposition model, the shared hidden variable of each sample, that is, of each local feature pair, performing weighted integration using the posterior probability of each sample to obtain the estimate of the hidden variable of each local Gaussian model, and concatenating the estimates of the hidden variables of the local Gaussian models to obtain the super vector of the shared information, the estimate of the hidden variable of each local Gaussian model is calculated according to the corresponding formula;
The hidden variables obtained for the local Gaussian models are concatenated to form the shared information vector;
In the step of obtaining the gradient vectors of the mixed independent information decomposition model with respect to the parameters of x and of y, respectively, the gradient vector of the mixed independent information decomposition model with respect to the parameters of x is calculated according to the formulas
$$\frac{\partial E(L)}{\partial \mu_x^k} = 2 \omega_k (\Psi_x^k)^{-1} \Big\{ \mu_x^k - \frac{\sum_i \gamma_{i,k} (x_i - W_x^k \hat{z}_{i,k})}{\sum_i \gamma_{i,k}} \Big\},$$
$$\frac{\partial E(L)}{\partial \Psi_x^k} = \omega_k (\Psi_x^k)^{-1} \Big\{ \Psi_x^k - \frac{\sum_i \gamma_{i,k} (x_{i,k} - W_x^k \hat{z}_{i,k})(x_{i,k} - W_x^k \hat{z}_{i,k})^T}{\sum_i \gamma_{i,k}} \Big\} (\Psi_x^k)^{-1};$$
The gradient vector of the mixed independent information decomposition model with respect to the parameters of y is calculated according to the formulas
$$\frac{\partial E(L)}{\partial \mu_y^k} = 2 \omega_k (\Psi_y^k)^{-1} \Big\{ \mu_y^k - \frac{\sum_i \gamma_{i,k} (y_i - W_y^k \hat{z}_{i,k})}{\sum_i \gamma_{i,k}} \Big\},$$
$$\frac{\partial E(L)}{\partial \Psi_y^k} = \omega_k (\Psi_y^k)^{-1} \Big\{ \Psi_y^k - \frac{\sum_i \gamma_{i,k} (y_{i,k} - W_y^k \hat{z}_{i,k})(y_{i,k} - W_y^k \hat{z}_{i,k})^T}{\sum_i \gamma_{i,k}} \Big\} (\Psi_y^k)^{-1};$$
where $L$ denotes the log-likelihood function of the mixed independent information decomposition model, k is the index of the mixture component, $\omega_k = P(k)$ is the prior probability of the k-th mixture component, $z_k$ is the variable shared by x and y in the k-th mixture component, and $W_x$ and $W_y$ are linear transformations;
where $G_x = \Big[ \frac{\partial E(L)}{\partial \mu_x^k}, \frac{\partial E(L)}{\partial \Psi_x^k} \Big]$ and $G_y = \Big[ \frac{\partial E(L)}{\partial \mu_y^k}, \frac{\partial E(L)}{\partial \Psi_y^k} \Big]$ represent the independent information about x and y in the image, respectively; concatenating the shared information vector and the independent information vectors yields the multi-perspective super vector representation of the image, $[Z, G_x, G_y]$.
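The gradient formulas for one feature type and one mixture component can be sketched with NumPy as follows. This is an illustrative sketch only: the function name `view_gradients` is hypothetical, and the symbol $x_{i,k}$ in the covariance gradient is assumed here to mean the feature centered by the component center, $x_i - \mu_x^k$, which the claim text does not state explicitly.

```python
import numpy as np


def view_gradients(X, Z_hat, gamma, omega, mu, W, Psi):
    """Gradient of the expected log-likelihood w.r.t. mu and Psi (one view).

    X     : (N, d) local features, Z_hat : (N, q) hidden-variable estimates
    gamma : (N,) posterior weights,  omega : scalar prior of the component
    mu    : (d,) centre, W : (d, q) projection matrix, Psi : (d, d) covariance
    Returns the two gradients flattened and concatenated into one vector.
    """
    total = gamma.sum()
    g = gamma[:, None]
    Psi_inv = np.linalg.inv(Psi)

    # Posterior-weighted mean of x_i - W z_i.
    resid = X - Z_hat @ W.T
    weighted_mean = (g * resid).sum(axis=0) / total
    grad_mu = 2.0 * omega * Psi_inv @ (mu - weighted_mean)

    # Assumption: x_{i,k} is the centred feature x_i - mu.
    centred = resid - mu
    scatter = (g * centred).T @ centred / total
    grad_Psi = omega * Psi_inv @ (Psi - scatter) @ Psi_inv

    return np.concatenate([grad_mu, grad_Psi.ravel()])
```

Both gradients vanish when mu equals the weighted residual mean and Psi equals the weighted residual scatter, so the resulting vector measures how far an image's features deviate from the learned component.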
6. The method according to claim 1, characterized in that the method further comprises:
Training a classifier according to the obtained multi-perspective super vectors of images of different classes, so that the classifier can determine the class of an image from its multi-perspective super vector;
Or training a detector according to the multi-perspective super vectors of image blocks, so that the detector can detect whether a specific object is present in an image block under test.
7. The method according to claim 1, characterized in that the method further comprises:
Retrieving images similar to a given image according to the obtained multi-perspective super vector and a similarity measure function defined on multi-perspective super vectors.
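The claims do not fix a particular similarity measure function. As one possible instance, the sketch below ranks stored super vectors by cosine similarity to a query super vector; the choice of cosine similarity, the function name `retrieve_similar`, and the `top_k` parameter are assumptions made for illustration.

```python
import numpy as np


def retrieve_similar(query, database, top_k=5):
    """Return indices and scores of the stored vectors most similar to the query.

    query    : (D,) super vector of the query image
    database : (M, D) super vectors of the stored images
    """
    database = np.asarray(database, dtype=float)
    query = np.asarray(query, dtype=float)
    eps = 1e-12  # guards against division by zero for all-zero vectors
    db_unit = database / (np.linalg.norm(database, axis=1, keepdims=True) + eps)
    q_unit = query / (np.linalg.norm(query) + eps)
    scores = db_unit @ q_unit                 # cosine similarity per stored image
    order = np.argsort(-scores)[:top_k]       # highest similarity first
    return order, scores[order]
```

Any other similarity or distance defined on the super vectors could be substituted by changing only the line that computes `scores`.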
8. A device for representing and processing an image, characterized in that the device comprises:
An extraction unit, configured to extract at least two types of local features from an image to be represented;
A modeling unit, configured to jointly build a mixed independent information decomposition model according to the different types of local features extracted from the image to be represented, the mixed independent information decomposition model comprising the shared information and the independent information of the different types of local features;
A coding unit, configured to encode the shared information and the independent information of the different types of local features according to the mixed independent information decomposition model, and to summarize the codes of the local features to obtain the multi-perspective super vector representation of the image.
9. The device according to claim 8, characterized in that the mixed independent information decomposition model is:
$$p(x, y) = \sum_k \omega_k \, p(x, y \mid k) = \sum_k \omega_k \int p(x \mid z_k, k) \, p(y \mid z_k, k) \, p(z_k) \, dz_k,$$
where k is the index of the mixture component, $\omega_k = P(k)$ is the prior probability of the k-th mixture component, $z_k$ is the variable shared by x and y in the k-th mixture component, and the conditional probabilities follow normal distributions; z represents the information shared by the two low-level features x and y, $W_x$ and $W_y$ are linear transformations, and $z_x$ and $z_y$ represent the independent information specific to x and y, respectively;
The modeling unit comprises:
An initialization subunit, configured to initialize the parameters of the mixed independent information decomposition model, the initialized parameters comprising the covariance matrices, the centers of the local Gaussian distributions, the projection matrices, and the weight of each local Gaussian model;
A calculation subunit, configured to calculate, according to the initialized parameters of the mixed independent information decomposition model, the estimates of the hidden variables and the posterior probabilities corresponding to the sample local features;
An update subunit, configured to update the parameters according to the calculated hidden variables and posterior probabilities, the parameters comprising the covariance matrices, the centers of the local Gaussian models, the projection matrices, and the weight of each local Gaussian model;
A judgment subunit, configured to judge whether the parameters have converged, or whether the number of parameter iterations has reached the maximum number of iterations; if the parameters have not converged and the number of iterations has not reached the maximum, control returns to the calculation subunit; otherwise, the learned parameters of the mixed independent information decomposition model are obtained.
10. The device according to claim 8, characterized in that the coding unit extracts local features of different types from the image to form a multi-modal local feature sample set, uses the trained mixed independent information decomposition model to calculate the shared information vector and the independent information vectors of the sample set, and then forms the multi-perspective super vector representation of the image; it specifically comprises:
A super vector obtaining subunit, configured to determine the estimate of the hidden variable of each sample according to the parameters of the mixed independent information decomposition model, perform weighted integration using the posterior probabilities to obtain the estimate of the hidden variable of each local Gaussian model, and concatenate the estimates of the hidden variables of the local Gaussian models to obtain the super vector of the shared information;
A gradient vector obtaining subunit, configured to obtain the gradient vectors of the mixed independent information decomposition model with respect to the parameters of x and of y, respectively, the gradient vectors representing the independent information of each type of local feature;
A concatenation subunit, configured to concatenate the super vector of the shared information with the gradient vectors to obtain the multi-perspective super vector representation.
CN201410281723.3A 2014-06-20 2014-06-20 Method and device for representing and processing image Active CN104036296B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410281723.3A CN104036296B (en) 2014-06-20 2014-06-20 A kind of expression of image and processing method and processing device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410281723.3A CN104036296B (en) 2014-06-20 2014-06-20 A kind of expression of image and processing method and processing device

Publications (2)

Publication Number Publication Date
CN104036296A true CN104036296A (en) 2014-09-10
CN104036296B CN104036296B (en) 2017-10-13

Family

ID=51467061

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410281723.3A Active CN104036296B (en) 2014-06-20 2014-06-20 Method and device for representing and processing image

Country Status (1)

Country Link
CN (1) CN104036296B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102663409A (en) * 2012-02-28 2012-09-12 西安电子科技大学 Pedestrian tracking method based on HOG-LBP
US20130294685A1 (en) * 2010-04-01 2013-11-07 Microsoft Corporation Material recognition from an image
CN103562964A (en) * 2011-06-07 2014-02-05 欧姆龙株式会社 Image processing device, information generation device, image processing method, information generation method, control program, and recording medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
FRANCIS R. BACH et al.: "A Probabilistic Interpretation of Canonical Correlation Analysis", Technical Report 688, Department of Statistics, University of California *
XIAOJIANG PENG et al.: "Hybrid Super Vector with Improved Dense Trajectories for Action Recognition", ICCV 2013 Workshop of THUMOS'13 Action Recognition Challenge *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105389588A (en) * 2015-11-04 2016-03-09 上海交通大学 Multi-semantic-codebook-based image feature representation method
CN105389588B (en) * 2015-11-04 2019-02-22 上海交通大学 Based on multi-semantic meaning code book image feature representation method
CN108875463A (en) * 2017-05-16 2018-11-23 富士通株式会社 Multi-angle of view vector processing method and equipment
CN108259914A (en) * 2018-03-20 2018-07-06 西安电子科技大学 Cloud method for encoding images based on object library
CN108259914B (en) * 2018-03-20 2019-10-11 西安电子科技大学 Cloud image encoding method based on object library
CN108647649A (en) * 2018-05-14 2018-10-12 中国科学技术大学 The detection method of abnormal behaviour in a kind of video
CN108647649B (en) * 2018-05-14 2021-10-01 中国科学技术大学 Method for detecting abnormal behaviors in video
CN109284744A (en) * 2018-11-02 2019-01-29 张彦龙 A method of iris image is encoded from eye gray level image likelihood figure and is retrieved
CN110533088A (en) * 2019-08-16 2019-12-03 湖北工业大学 A kind of scene text Language Identification based on differentiated convolutional neural networks
CN113505801A (en) * 2021-09-13 2021-10-15 拓小拓科技(天津)有限公司 Intensity value vector table generating method and image coding method for super-dimensional calculation
CN113505801B (en) * 2021-09-13 2021-11-30 拓小拓科技(天津)有限公司 Image coding method for super-dimensional calculation

Also Published As

Publication number Publication date
CN104036296B (en) 2017-10-13

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant