CN104616013A - Method for acquiring low-dimensional local characteristics descriptor - Google Patents

Info

Publication number
CN104616013A
Authority
CN
China
Prior art keywords
matrix
local feature
dimension
descriptor
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410183573.2A
Other languages
Chinese (zh)
Inventor
段凌宇
林杰
王哲
杨爽
陈杰
黄铁军
高文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University
Priority to CN201410183573.2A
Publication of CN104616013A

Landscapes

  • Image Analysis (AREA)

Abstract

The invention provides a method for acquiring a low-dimensional local feature descriptor. The method comprises: acquiring the local feature descriptors of an image to be processed; forming a descriptor set from the acquired local feature descriptors; and reducing the dimension of each local feature descriptor in the descriptor set according to a dimension reduction matrix to obtain the low-dimensional local feature descriptor corresponding to each local feature descriptor, wherein the dimension reduction matrix is obtained by training on a preset image data set. The method reduces the dimensionality of prior-art local feature descriptors and removes their redundant information.

Description

Method for obtaining low-dimensional local feature descriptor
Technical Field
The embodiment of the invention relates to the field of computers, in particular to a method for acquiring a low-dimensional local feature descriptor.
Background
At present, mobile visual search is applied more and more, and the industry generally adopts local feature descriptors to be aggregated into global feature descriptors to realize image retrieval or classification. For example, the local feature descriptors are aggregated into global feature descriptors such as Fisher vectors.
In the prior art, local feature descriptors are aggregated into a global feature descriptor for image retrieval or classification as follows: local feature descriptors of the image are first extracted, and a Fisher vector is then aggregated directly from them. However, the extracted local feature descriptors are high-dimensional, so aggregating the Fisher vector has high time and space complexity. The resulting Fisher vector is also high-dimensional, so the global feature descriptor occupies a large amount of space, which easily causes transmission delay and lengthens the response time of image retrieval or image classification.
In addition, aggregating Fisher vectors directly from the local feature descriptors reduces the discriminative power of the aggregated Fisher vectors and leaves them without robustness, which further reduces the accuracy of image retrieval.
Disclosure of Invention
In order to solve the defects in the prior art, the invention provides a method for acquiring a low-dimensional local feature descriptor, which is used for reducing the dimensionality of the local feature descriptor in the prior art and removing redundant information of the local feature descriptor in the prior art.
The invention provides a method for acquiring a low-dimensional local feature descriptor, which comprises the following steps:
acquiring a local feature descriptor of an image to be processed;
forming the obtained local feature descriptors into a descriptor set;
reducing the dimension of each local feature descriptor in the descriptor set according to the dimension reduction matrix to obtain a low-dimensional local feature descriptor corresponding to each local feature descriptor; the dimension reduction matrix is obtained by training a preset image data set.
Optionally, performing dimension reduction on each local feature descriptor in the descriptor set according to a dimension reduction matrix to obtain a low-dimensional local feature descriptor corresponding to each local feature descriptor, including:
subtracting a preset mean vector from each local feature descriptor in the descriptor set to obtain a converted local feature descriptor;
forming the converted local feature descriptors into a data matrix;
multiplying the dimensionality reduction matrix and the data matrix to obtain a result matrix;
splitting the result matrix to obtain a low-dimensional local feature descriptor;
the preset mean vector is obtained by training a preset image data set, and the dimensionality of the preset mean vector is the same as that of the local feature descriptor.
Optionally, the forming the converted local feature descriptors into a data matrix includes:
when the dimensionality of each converted local feature descriptor is N, forming the elements on each dimension of each local feature descriptor into the values of a corresponding row in the data matrix to obtain a data matrix with the dimension of M x N;
or,
when the dimensionality of each converted local feature descriptor is N, forming elements on each dimensionality of each local feature descriptor into numerical values on a corresponding column in the data matrix to obtain a data matrix with the dimension of N x M;
wherein M is the number of the converted local feature descriptors, M is a natural number, and N is equal to 128.
Optionally, the dimension reduction matrix is a matrix obtained from the image dataset by adopting a principal component analysis method, and the dimension of the dimension reduction matrix is N × K, or the dimension of the dimension reduction matrix is K × N;
when the dimensionality of the dimensionality reduction matrix is N x K and the dimensionality of the data matrix is M x N, the dimensionality of the result matrix is M x K; or,
when the dimensionality of the dimensionality reduction matrix is K x N and the dimensionality of the data matrix is N x M, the dimensionality of the result matrix is K x M;
where K is equal to 32.
Optionally, the splitting the result matrix to obtain a low-dimensional local feature descriptor includes:
if the dimension of the result matrix is M x K, extracting the numerical value in each row in the result matrix, and taking the extracted numerical value of each row as a low-dimensional local feature descriptor;
or,
if the dimension of the result matrix is K x M, extracting a numerical value in each column in the result matrix, and taking the extracted numerical value in each column as a low-dimensional local feature descriptor;
wherein M is the number of the converted local feature descriptors, M is a natural number, and K is equal to 32.
Optionally, the splitting the result matrix to obtain a low-dimensional local feature descriptor includes:
extracting a numerical value in each row in the result matrix, and taking the extracted numerical value in each row as a low-dimensional local feature descriptor to obtain M low-dimensional local feature descriptors, wherein the dimension of each low-dimensional local feature descriptor is K;
wherein M is the number of the converted local feature descriptors, M is a natural number, and K is equal to 32;
or,
extracting a numerical value in each column in the result matrix, and taking the extracted numerical value in each column as a low-dimensional local feature descriptor to obtain M low-dimensional local feature descriptors, wherein the dimension of each low-dimensional local feature descriptor is K;
wherein M is the number of the converted local feature descriptors, M is a natural number, and K is equal to 32.
Optionally, before the performing dimension reduction on the local feature descriptors in the descriptor set according to the dimension reduction matrix to obtain low-dimensional local feature descriptors, the method further includes:
obtaining a sample matrix of the image dataset;
obtaining a mean vector according to the sample matrix;
centralizing the sample matrix by using the mean vector to obtain a centralized sample matrix;
calculating a covariance matrix of the centered sample matrix;
acquiring an eigenvalue of the covariance matrix and an eigenvector corresponding to the eigenvalue;
sorting the eigenvectors from big to small according to the magnitude of the eigenvalue, and selecting the first K eigenvectors;
forming the first K eigenvectors into the dimensionality reduction matrix;
where K is equal to 32.
Optionally, each local feature descriptor of each image in the image dataset corresponds to a row of values in the sample matrix, each image in the image dataset corresponds to a number of rows of sample values in the sample matrix, and there are N sample values in each row of the sample matrix;
the obtaining of the mean vector according to the sample matrix includes:
averaging all values on each column of the sample matrix, the value of the ith dimension of the mean vector being equal to the average value of the ith column of the sample matrix, where i is 1, …, N;
the centralizing the sample matrix by using the mean vector to obtain a centralized sample matrix includes:
subtracting the value of the ith dimension of the mean vector from the ith numerical value of each row of the sample matrix to obtain a centralized sample matrix, wherein i is 1, …, N;
the dimension of the covariance matrix is N x N;
the dimension of each eigenvector is N;
elements of all dimensions of each of the first K eigenvectors form the values of rows/columns in the dimension reduction matrix;
or,
each local feature descriptor of each image in the image data set corresponds to a column of values in the sample matrix, each image in the image data set corresponds to a plurality of columns of sample values in the sample matrix, and N sample values are provided for each column in the sample matrix;
the obtaining of the mean vector according to the sample matrix includes:
averaging all values in each row of the sample matrix, wherein the value in the ith dimension of the mean vector is equal to the average value in the ith row of the sample matrix, and i is 1, …, N;
the centralizing the sample matrix by using the mean vector to obtain a centralized sample matrix includes:
subtracting the value of the ith dimension of the mean vector from the ith value on each column of the sample matrix to obtain a centralized sample matrix, wherein i is 1, …, N;
the dimension of the covariance matrix is N x N;
the dimension of each eigenvector is N;
elements of all dimensions of each of the first K eigenvectors form the values of columns/rows in the dimension reduction matrix;
where N equals 128 and K equals 32.
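The training steps above (mean vector, centering, covariance matrix, eigen-decomposition, selection of the first K eigenvectors) can be sketched in NumPy as follows. This is a minimal sketch under stated assumptions, not the patent's implementation: the function name `train_pca` is hypothetical, the sample matrix is assumed to hold one descriptor per row (L x N), and the covariance is scaled by 1/(L-1), a choice the text does not specify and that does not change the eigenvectors.

```python
import numpy as np

def train_pca(samples: np.ndarray, k: int = 32):
    """Return (mean_vector, reduction_matrix) from an L x N sample matrix.

    Assumption: each row of `samples` is one local feature descriptor.
    The returned reduction matrix has shape N x k.
    """
    samples = np.asarray(samples, dtype=float)
    # Mean vector: the i-th entry is the average of the i-th column.
    mean = samples.mean(axis=0)
    # Centering: subtract the mean vector from every row.
    centered = samples - mean
    # N x N covariance matrix (1/(L-1) scaling is an assumption).
    cov = centered.T @ centered / (samples.shape[0] - 1)
    # eigh is used because the covariance matrix is symmetric.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Sort eigenvectors by eigenvalue, largest first, and keep the first k.
    order = np.argsort(eigvals)[::-1][:k]
    return mean, eigvecs[:, order]
```

With N = 128 and K = 32, the returned matrix is 128 x 32; its transpose gives the 32 x 128 layout.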
Optionally, the image dataset comprises: planar object images and three-dimensional object images.
According to the technical scheme, the method for obtaining the low-dimensional local feature descriptors comprises the steps of obtaining the local feature descriptors of the image to be processed, forming descriptor sets from all the obtained local feature descriptors, and reducing the dimension of each local feature descriptor by using a dimension reduction matrix to obtain the low-dimensional local feature descriptors of each local feature descriptor, so that the dimension of the local feature descriptors in the prior art can be reduced, and redundant information of the local feature descriptors in the prior art can be removed.
Drawings
Fig. 1 is a schematic flow chart of obtaining a low-dimensional local feature descriptor according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of obtaining a low-dimensional local feature descriptor according to another embodiment of the present invention;
fig. 3 is a schematic diagram of a gradient direction histogram vector according to an embodiment of the invention.
Detailed Description
Fig. 1 illustrates a method for obtaining a low-dimensional local feature descriptor according to an embodiment of the present invention, and as shown in fig. 1, the method for obtaining a low-dimensional local feature descriptor in the embodiment is as follows.
101. Acquire a local feature descriptor of the image to be processed.
For example, the image to be processed may be any image, such as a photograph of a document, a hand-drawn picture, an oil painting, a frame captured from a video, a landmark photograph, or a photograph of an article.
In particular, one or more local feature descriptors of the image to be processed are obtained in an existing manner. For example, the local feature descriptors may be Scale-Invariant Feature Transform (SIFT) descriptors, Speeded-Up Robust Features (SURF) descriptors, or other local feature descriptors.
It should be understood that the SIFT or SURF extraction method may be an existing extraction method, and the embodiment is not described in detail. Generally, the SIFT may be 128-dimensional in dimension and the SURF may be 64-dimensional in dimension.
Optionally, the local feature descriptors for acquiring the image to be processed may be subjected to feature selection and other processing based on the above-mentioned local feature descriptor acquiring manner, and one or more of all local feature descriptors corresponding to one image are selected.
102. Form the obtained local feature descriptors into a descriptor set.
In the present embodiment, all the acquired local feature descriptors form a descriptor set.
103. Reduce the dimension of each local feature descriptor in the descriptor set according to the dimension reduction matrix to obtain a low-dimensional local feature descriptor corresponding to each local feature descriptor.
In this embodiment, the dimension reduction matrix in step 103 may be a matrix obtained by training a preset image data set.
Optionally, before step 103, normalization processing may be further performed on all local feature descriptors in the descriptor set, and then, in step 103, dimension reduction processing may be performed on the normalized local feature descriptors to obtain low-dimensional local feature descriptors corresponding to each local feature descriptor.
The steps of the normalization process are exemplified as follows:
A01. If the local feature descriptors are $h_t$, $t = 0, \ldots, M-1$, each dimension is normalized using L1 normalization, yielding $h'_{t,j} = h_{t,j} / |h_t|$, $j = 0, \ldots, 127$, where $|h_t|$ denotes the sum of the absolute values of the dimensions of the 128-dimensional local feature descriptor vector $h_t$.
A02. Each dimension is further normalized using power normalization with parameter 0.5, yielding $h'_{t,j} \leftarrow \operatorname{sgn}(h'_{t,j})\,|h'_{t,j}|^{0.5}$,
where $|h'_{t,j}|$ denotes the absolute value of $h'_{t,j}$, and
$$\operatorname{sgn}(h'_{t,j}) = \begin{cases} -1, & h'_{t,j} < 0 \\ 0, & h'_{t,j} = 0 \\ 1, & h'_{t,j} > 0 \end{cases}$$
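The two normalization steps A01 and A02 can be illustrated with a short NumPy sketch. The function name `l1_power_normalize` is hypothetical, and returning zeros for an all-zero descriptor is an assumption, since the text does not cover that case.

```python
import numpy as np

def l1_power_normalize(h, power: float = 0.5) -> np.ndarray:
    """Apply L1 normalization (A01) followed by power normalization (A02)."""
    h = np.asarray(h, dtype=float)
    l1 = np.abs(h).sum()          # sum of absolute values of all dimensions
    if l1 == 0.0:
        # Assumption: an all-zero descriptor is returned unchanged.
        return np.zeros_like(h)
    h1 = h / l1                   # A01: L1 normalization
    # A02: sgn(h') * |h'|^power, with power = 0.5 in the text.
    return np.sign(h1) * np.abs(h1) ** power
```

For example, `l1_power_normalize([4.0, 0.0, -4.0, 8.0])` first divides by 16 and then takes the signed square root of each entry.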
it should be noted that the above method may be performed on any device, and the embodiment does not limit whether its execution subject is a client or a server.
The method for obtaining the low-dimensional local feature descriptor in the embodiment can reduce the dimension of the local feature descriptor in the prior art and remove redundant information of the local feature descriptor in the prior art.
Fig. 2 illustrates a method for obtaining a low-dimensional local feature descriptor according to another embodiment of the present invention, and as shown in fig. 2, the method for obtaining a low-dimensional local feature descriptor in this embodiment is as follows.
201. Acquire a local feature descriptor of the image to be processed.
In particular, the manner of obtaining the local feature descriptors of the image to be processed is exemplified as follows:
The first step: The image I to be processed is convolved with a group of Gaussian filters $g(\sigma)$ to obtain Gaussian-blurred images of I at different scales in the Gaussian scale space, where $\sigma$ is the standard deviation of the Gaussian and expresses the scale of each Gaussian-blurred image in the Gaussian scale space. $\sigma$ is taken as an exponential power of 2: the k-th scale is $\sigma_k$, where $\sigma_0 = 1.6$ is the initial scale and K is the number of sampling layers of the scale space, that is, the number of Gaussian filters. The k-th Gaussian-blurred image is then $I_k$, with corresponding scale $\sigma_k$, and $I_k = I * g(\sigma_k)$, $k = 0, \ldots, K$.
The second step: In the Gaussian scale space, each Gaussian-blurred image is convolved with a scale-normalized Laplacian filter to obtain the Laplacian scale-space response, where $f = \begin{pmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{pmatrix}$ is the Laplacian operator.
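As an illustration of the Laplacian operator f in the second step, the sketch below applies the 3 x 3 kernel to an image with plain NumPy slicing. It is only a sketch: the function name `laplacian_response` is hypothetical, border pixels are left at zero as an assumption, and the scale-normalization factor is omitted because its formula is not reproduced in the text.

```python
import numpy as np

# The Laplacian operator f from the text.
LAPLACIAN = np.array([[0.0, 1.0, 0.0],
                      [1.0, -4.0, 1.0],
                      [0.0, 1.0, 0.0]])

def laplacian_response(image) -> np.ndarray:
    """Apply the 3 x 3 Laplacian kernel to the interior pixels of an image."""
    img = np.asarray(image, dtype=float)
    out = np.zeros_like(img)      # assumption: border pixels stay zero
    h, w = img.shape
    # Accumulate kernel weight times the correspondingly shifted image.
    # The kernel is symmetric, so correlation and convolution coincide.
    for dy in range(3):
        for dx in range(3):
            out[1:-1, 1:-1] += LAPLACIAN[dy, dx] * img[dy:h - 2 + dy, dx:w - 2 + dx]
    return out
```

A constant image gives a zero response, which is the expected behavior of a second-derivative filter.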
The third step: Local maximum or minimum points in the Laplacian-of-Gaussian scale space are taken as candidate interest points. Each interest point has three attributes: its position coordinates x, y in the corresponding Gaussian-blurred image and the corresponding scale $\sigma_k$.
The fourth step: For each interest point, a circular area centered at x, y with radius $m\sigma$, where m is 3.96, is taken on the Gaussian-blurred image $I_k$ of the same scale. Then, for the pixels in the circular area, the gradient of each pixel is calculated according to the following formulas, including the gradient magnitude $m_{I_k}$ and the gradient direction $\theta_{I_k}$:
$$m_{I_k}(x, y) = \sqrt{(I_k(x+1, y) - I_k(x-1, y))^2 + (I_k(x, y+1) - I_k(x, y-1))^2}$$
$$\theta_{I_k}(x, y) = \arctan\frac{I_k(x, y+1) - I_k(x, y-1)}{I_k(x+1, y) - I_k(x-1, y)}$$
The gradient direction of each pixel in the circular area is quantized, by nearest distance, to one of 36 directions that equally divide the circle. Each direction is then accumulated with the gradient magnitude as the weight to obtain a 36-dimensional gradient direction histogram.
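The gradient computation and the 36-direction weighted histogram of the fourth step can be sketched as follows. Several choices here are assumptions for illustration: the function name `orientation_histogram` is hypothetical, the array is indexed as `[y, x]`, a rectangular patch stands in for the circular area, and `np.arctan2` is used instead of a plain arctangent so that the direction covers the full circle.

```python
import numpy as np

def orientation_histogram(patch, bins: int = 36) -> np.ndarray:
    """Magnitude-weighted histogram of gradient directions over a patch."""
    img = np.asarray(patch, dtype=float)
    # Central differences on interior pixels (array indexed as [y, x]).
    dx = img[1:-1, 2:] - img[1:-1, :-2]    # I(x+1, y) - I(x-1, y)
    dy = img[2:, 1:-1] - img[:-2, 1:-1]    # I(x, y+1) - I(x, y-1)
    magnitude = np.sqrt(dx ** 2 + dy ** 2)
    angle = np.arctan2(dy, dx) % (2 * np.pi)
    # Quantize each direction to the nearest of `bins` equally spaced directions.
    idx = np.rint(angle / (2 * np.pi / bins)).astype(int) % bins
    hist = np.zeros(bins)
    # Accumulate with the gradient magnitude as the weight.
    np.add.at(hist, idx.ravel(), magnitude.ravel())
    return hist
```

A patch whose intensity increases only along x puts all of its weight in the first direction (angle 0).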
The fifth step: and selecting the direction with the largest accumulation in the histogram as the main direction theta of the interest point. Meanwhile, if the accumulated value of other directions exceeds 80% of the accumulated value of the main direction, the point of interest is copied and expanded to be a new point of interest, and the direction is used as the main direction of the new point of interest.
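The selection rule of the fifth step can be sketched as below: the direction with the largest accumulated value is the main direction, and every other direction whose value exceeds 80% of that maximum yields an additional interest point. The function name `principal_directions` is hypothetical, and returning each direction as the angle at its bin index is an assumption.

```python
import numpy as np

def principal_directions(hist, ratio: float = 0.8) -> list:
    """Return the main direction first, then directions above ratio * maximum."""
    hist = np.asarray(hist, dtype=float)
    step = 2 * np.pi / hist.size           # angular spacing between directions
    main = int(np.argmax(hist))            # direction with the largest accumulation
    # Other directions whose accumulated value exceeds 80% of the main one.
    others = [int(b) for b in np.flatnonzero(hist > ratio * hist[main]) if b != main]
    return [b * step for b in [main] + others]
```

Each returned angle after the first corresponds to one duplicated interest point with that angle as its main direction.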
Optionally, the interest points are ranked by importance according to attributes such as the position x, y, the scale σ, and the direction θ, and the required number M of points is selected for subsequent global feature calculation.
The sixth step: For each detected interest point, a square area centered at x, y with a radius of 3σ, with the coordinate system rotated to align with the principal direction θ, is taken on the Gaussian-blurred image $I_k$ of the same scale. The square area is then uniformly divided into 4 × 4 image blocks. After the gradient is calculated for each pixel in an image block, the gradient direction is quantized to one of 8 directions that equally divide the circle and a gradient direction histogram is calculated; the accumulation uses trilinear interpolation. The 8-dimensional vectors corresponding to the gradient direction histograms of the image blocks are then concatenated in order from left to right and from top to bottom, as shown in fig. 3, to obtain a gradient direction histogram vector of 4 × 4 × 8 = 128 dimensions.
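The 4 × 4 × 8 layout of the sixth step can be illustrated with a simplified sketch. It deliberately departs from the text in one respect: each pixel votes for its nearest direction only (hard assignment) rather than using trilinear interpolation. The function name `grid_histogram_descriptor` is hypothetical, and the inputs are assumed to be per-pixel gradient magnitudes and directions of a patch already rotated to the principal direction.

```python
import numpy as np

def grid_histogram_descriptor(magnitude, angle, cells: int = 4, bins: int = 8) -> np.ndarray:
    """Concatenate per-block direction histograms into a cells*cells*bins vector."""
    mag = np.asarray(magnitude, dtype=float)
    ang = np.asarray(angle, dtype=float) % (2 * np.pi)
    # Nearest of `bins` equally spaced directions (simplification: no interpolation).
    bin_idx = np.rint(ang / (2 * np.pi / bins)).astype(int) % bins
    row_groups = np.array_split(np.arange(mag.shape[0]), cells)
    col_groups = np.array_split(np.arange(mag.shape[1]), cells)
    parts = []
    for rows in row_groups:            # top to bottom
        for cols in col_groups:        # left to right
            block = np.ix_(rows, cols)
            hist = np.zeros(bins)
            np.add.at(hist, bin_idx[block].ravel(), mag[block].ravel())
            parts.append(hist)
    return np.concatenate(parts)
```

For a 16 x 16 patch the result has 4 × 4 × 8 = 128 entries, one 8-bin histogram per image block.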
Finally, the resulting 128-dimensional gradient direction histogram vector is L2-normalized once. A truncation operation is then performed on each dimension: any dimension whose value is greater than 0.2 is set to 0.2. The truncated vector is then L2-normalized once more, which generates the local feature descriptor.
If the gradient direction histogram vector is h and $h_i$ is the i-th dimension of h, $i = 0, \ldots, 127$, the L2 normalization is $h'_i = h_i / \sqrt{\sum_{j=0}^{127} h_j^2}$, where $h'_i$ is the value of the i-th dimension of h after L2 normalization.
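The normalize, truncate, and normalize-again sequence described above can be sketched as follows. The function name `normalize_clip_normalize` is hypothetical, and leaving an all-zero vector unchanged is an assumption.

```python
import numpy as np

def normalize_clip_normalize(h, clip: float = 0.2) -> np.ndarray:
    """L2-normalize, truncate each dimension at `clip`, then L2-normalize again."""
    def l2(v: np.ndarray) -> np.ndarray:
        norm = np.linalg.norm(v)
        # Assumption: an all-zero vector is left unchanged.
        return v / norm if norm > 0 else v

    h = np.asarray(h, dtype=float)
    # Values greater than `clip` are set to `clip` after the first normalization.
    truncated = np.minimum(l2(h), clip)
    return l2(truncated)
```

The truncation limits the influence of a few very large histogram entries before the final normalization.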
Optionally, the local feature descriptors for acquiring the image to be processed may be subjected to feature selection and other processing based on the above-mentioned local feature descriptor acquiring manner, and one or more of all local feature descriptors corresponding to one image are selected.
202. Form the obtained local feature descriptors into a descriptor set.
203. Subtract a preset mean vector from each local feature descriptor in the descriptor set to obtain a converted local feature descriptor.
The preset mean vector is obtained by training a preset image data set, and the dimensionality of the preset mean vector is the same as that of the local feature descriptor.
204. Form the converted local feature descriptors into a data matrix.
For example, when the dimension of each converted local feature descriptor is N, the elements in each dimension of each local feature descriptor are combined into a numerical value in a corresponding row in the data matrix to obtain an M × N-dimensional data matrix;
or,
when the dimensionality of each converted local feature descriptor is N, forming elements on each dimensionality of each local feature descriptor into numerical values on a corresponding column in the data matrix to obtain a data matrix with dimensions of N x M;
M is the number of the converted local feature descriptors in the descriptor set, and N is equal to 128.
For example, if in step 201 the dimension of each local feature descriptor is N = 128 and 300 local feature descriptors are obtained, that is, M = 300, then taking the 128 elements of each converted local feature descriptor as one row of the data matrix yields a 300 x 128 data matrix. Of course, if the 128 elements of each converted local feature descriptor are used as a column of the data matrix, a 128 × 300 data matrix is obtained.
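Steps 203 and 204 can be sketched together in NumPy: the preset mean vector is subtracted from each descriptor, and the converted descriptors are stacked as rows (M x N) or columns (N x M). The function name `build_data_matrix` and the `row_layout` flag are hypothetical names introduced only for this illustration.

```python
import numpy as np

def build_data_matrix(descriptors, mean, row_layout: bool = True) -> np.ndarray:
    """Subtract the mean vector from each descriptor and stack the results.

    row_layout=True  -> M x N matrix (one converted descriptor per row)
    row_layout=False -> N x M matrix (one converted descriptor per column)
    """
    mean = np.asarray(mean, dtype=float)
    converted = [np.asarray(d, dtype=float) - mean for d in descriptors]
    data = np.stack(converted, axis=0)      # M x N
    return data if row_layout else data.T
```

With 300 descriptors of 128 dimensions, the two layouts give a 300 x 128 and a 128 x 300 matrix respectively.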
205. Multiply the dimensionality reduction matrix and the data matrix to obtain a result matrix.
In this embodiment, the dimension reduction matrix may be a matrix obtained from the image dataset by using a principal component analysis method, where the dimension of the dimension reduction matrix is N × K, or the dimension of the dimension reduction matrix is K × N, where K is equal to 32;
as can be seen from the above, the number of rows of the dimension reduction matrix is the same as the dimension of the local feature descriptor, for example, if the local feature descriptor is 128-dimensional, the dimension reduction matrix has 128 rows; the number of columns of the dimension reduction matrix is the same as the dimension of the low-dimensional local feature descriptor, for example, if the low-dimensional local feature descriptor is 32-dimensional, the dimension reduction matrix has 32 columns;
or,
the number of columns of the dimension reduction matrix is the same as the dimension of the local feature descriptor, for example, if the local feature descriptor is 128-dimensional, the dimension reduction matrix has 128 columns; the number of rows of the dimension reduction matrix is the same as the dimension of the low-dimensional local feature descriptor, for example, if the low-dimensional local feature descriptor is 32-dimensional, the dimension reduction matrix has 32 rows.
Therefore, the dimension reduction matrix should be a 128 x 32 or a 32 x 128 matrix.
Note that, when the dimension of the dimensionality reduction matrix in this step is N × K and the dimension of the data matrix is M × N, the dimension of the result matrix is M × K.
Or, in this step, when the dimension of the dimensionality reduction matrix is K × N and the dimension of the data matrix is N × M, the dimension of the result matrix is K × M.
Specifically, the dimension of the data matrix is 300x128, the dimension of the dimensionality reduction matrix is 128x32, the dimension of the obtained result matrix is 300x32, and the calculation process is as follows:
the above calculation process is a matrix multiplication operation in the prior art, and the embodiment is not described in detail.
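The two multiplication layouts described in this step can be checked with a small NumPy sketch. The matrices are filled with random placeholder values purely to show the shapes; they are not data from the patent, and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, K = 300, 128, 32

data_rows = rng.normal(size=(M, N))   # placeholder M x N data matrix
P = rng.normal(size=(N, K))           # placeholder N x K dimension reduction matrix

# Row layout: (M x N) times (N x K) gives an M x K result matrix.
result_rows = data_rows @ P

# Column layout: (K x N) times (N x M) gives a K x M result matrix.
result_cols = P.T @ data_rows.T
```

The two result matrices are transposes of each other, so both layouts carry the same low-dimensional values.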
206. Split the result matrix to obtain the low-dimensional local feature descriptors.
For example, if the dimension of the result matrix is M × K, extracting a value in each row in the result matrix, and using the extracted value of each row as a low-dimensional local feature descriptor;
or if the dimension of the result matrix is K × M, extracting a numerical value in each column in the result matrix, and taking the extracted numerical value in each column as a low-dimensional local feature descriptor.
M and K are as described above.
In a preferred implementation, a numerical value in each row in the result matrix is extracted, the extracted numerical value in each row is used as a low-dimensional local feature descriptor, M low-dimensional local feature descriptors are obtained, and the dimension of each low-dimensional local feature descriptor is K;
or extracting a numerical value in each column in the result matrix, and taking the extracted numerical value in each column as a low-dimensional local feature descriptor to obtain M low-dimensional local feature descriptors, wherein the dimension of each low-dimensional local feature descriptor is K.
That is, each row (or each column) in the result matrix corresponds to one low-dimensional local feature descriptor, and the dimension of the low-dimensional local feature descriptor is K.
For example, if the dimension of the result matrix is 300 × 32, each row of the result matrix corresponds to one reduced local feature descriptor; if the dimension of the result matrix is 32x300, each column of the result matrix corresponds to one reduced local feature descriptor.
Specifically, if a local feature descriptor is reduced to obtain a low-dimensional local feature descriptor, the above steps 201 to 206 can be expressed by the following formulas:
$$x_t = P^T (h'_t - \tilde{h})$$
where $x_t$ is the low-dimensional local feature descriptor, P is the dimensionality reduction matrix, $h'_t$ is the local feature descriptor, and $\tilde{h}$ is the preset mean vector.
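The per-descriptor formula and the batched matrix form can be compared in a short sketch. Both function names are hypothetical, P is assumed to be stored as an N x K matrix, and the descriptor set is assumed to be stored one descriptor per row.

```python
import numpy as np

def reduce_descriptor(h, mean, P) -> np.ndarray:
    """Compute x_t = P^T (h'_t - mean) for one descriptor; P is N x K."""
    return P.T @ (np.asarray(h, dtype=float) - mean)

def reduce_descriptor_set(descriptors, mean, P) -> list:
    """Reduce all descriptors at once and split the M x K result into rows."""
    result = (np.asarray(descriptors, dtype=float) - mean) @ P   # M x K
    # Each row of the result matrix is one low-dimensional descriptor.
    return [row for row in result]
```

Each row returned by `reduce_descriptor_set` should match `reduce_descriptor` applied to the corresponding input descriptor, which is the sense in which steps 203 to 206 implement the formula.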
Because the K is 32, the dimension of the finally obtained low-dimensional local feature descriptor can be 32, the dimension of the local feature descriptor in the prior art can be better reduced, the redundant information of the local feature descriptor in the prior art can be removed, and the influence of noise on the performance of the local feature descriptor is avoided.
Particularly, the process of adopting the low-dimensional local feature descriptors to aggregate the Fisher vectors has lower time and space complexity, and the low-dimensional Fisher vectors can be aggregated, so that the space required by compressing the Fisher vectors is reduced, the delay generated by wireless network transmission is also reduced, and the performance of the aggregated Fisher vectors in image retrieval and matching is greatly improved.
In another alternative implementation, the aforementioned step 103 in fig. 1 may specifically include the following exemplary sub-steps A1031 to A1037, which are not shown in the figure.
A1031. Acquire a sample matrix of the image data set according to the image data set.
For example, each local feature descriptor of each image in the image dataset forms the sample values of one row in the sample matrix. Alternatively, each local feature descriptor of each image in the image dataset forms the sample values of one column in the sample matrix.
In particular, the image data set covers all kinds of images that may occur in practical applications, including planar object images, such as: business cards, CD covers, DVD covers, newspapers, paintings, video frames, etc., as well as three-dimensional object images such as: photographs of landmark buildings and various stereoscopic real objects, and the like. The image data set should contain a full range of image types and the scale of the various types of images is appropriate, for example: the proportion of the planar object image is 80%, and the proportion of the three-dimensional object image is 20%.
A local feature descriptor of an image in the image dataset is obtained in the manner described above for step 201.
Preferably, the dimensionality of the local feature descriptors is 128, and if the number of the obtained local feature descriptors is L and each row of the sample matrix corresponds to one local feature descriptor, an Lx128 sample matrix is obtained; if each column of the sample matrix corresponds to a local feature descriptor, a 128xL sample matrix is obtained.
A1032. Obtain a mean vector according to the sample matrix.
For example, if the sample matrix is a matrix of L × 128, then all values on each column of the sample matrix are averaged, and the value of the ith dimension of the mean vector is equal to the average value of the ith column of the sample matrix, where i is 1, …, N;
or,
if the sample matrix is a 128 × L matrix, all values in each row of the sample matrix are averaged, and the value of the ith dimension of the mean vector is equal to the average value of the ith row of the sample matrix, where i is 1, …, N;
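The averaging in step A1032 can be sketched as follows for both layouts. The random sample matrix is a placeholder, and the variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
sample_rows = rng.random((1000, 128))  # placeholder L x 128 sample matrix
sample_cols = sample_rows.T            # the same data in 128 x L layout

# L x 128 layout: average every column, one value per descriptor dimension.
mean_from_rows = sample_rows.mean(axis=0)

# 128 x L layout: average every row, one value per descriptor dimension.
mean_from_cols = sample_cols.mean(axis=1)

print(mean_from_rows.shape)  # (128,)
```

Both layouts yield the same 128-dimensional mean vector, since they hold the same samples.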
A1033, centralizing the sample matrix by using the mean vector to obtain a centralized sample matrix.
For example, if the sample matrix is an L × 128 matrix, the value of the ith dimension of the mean vector is subtracted from the ith value in each row of the sample matrix to obtain a centralized sample matrix, where i is 1, …, N;
or,
if the sample matrix is a 128 × L matrix, the value of the ith dimension of the mean vector is subtracted from the ith value in each column of the sample matrix to obtain a centralized sample matrix, where i is 1, …, N;
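A minimal sketch of the centralization in step A1033, assuming NumPy broadcasting is used to subtract the mean vector; the data is again a random placeholder.

```python
import numpy as np

rng = np.random.default_rng(0)
sample_rows = rng.random((1000, 128))  # placeholder L x 128 sample matrix
sample_cols = sample_rows.T            # the same data in 128 x L layout
mean_vector = sample_rows.mean(axis=0) # 128-dimensional mean vector

# L x 128 layout: the mean vector is subtracted from every row.
centered_rows = sample_rows - mean_vector

# 128 x L layout: the mean vector is subtracted from every column,
# so it is reshaped to 128 x 1 before broadcasting.
centered_cols = sample_cols - mean_vector[:, None]
```

After centralization, every descriptor dimension has (numerically) zero mean over the samples.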
A1034, calculating a covariance matrix of the centralized sample matrix.
Taking 128-dimensional local feature descriptors as an example, a 128 × 128 covariance matrix is obtained.
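The covariance computation of step A1034 might look like the sketch below. The normalization by L − 1 is an assumption, since the text does not state which factor is used; the sample data is a placeholder.

```python
import numpy as np

rng = np.random.default_rng(0)
sample_matrix = rng.random((1000, 128))  # placeholder L x 128 sample matrix
centered = sample_matrix - sample_matrix.mean(axis=0)

num_samples = centered.shape[0]
# Assumed normalization: divide by (L - 1); the source does not specify the factor.
covariance = centered.T @ centered / (num_samples - 1)

print(covariance.shape)  # (128, 128)
```

With this normalization the result agrees with `np.cov(sample_matrix, rowvar=False)`. A different factor would only rescale the eigenvalues and would not change the eigenvectors or their ordering.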
A1035, acquiring the eigenvalues of the covariance matrix and the eigenvectors corresponding to the eigenvalues.
For example, the eigenvalue of the covariance matrix and the eigenvector corresponding to the eigenvalue may be calculated using an existing eigenvalue decomposition method.
In a specific application, the dimension of each eigenvector is equal to that of the local feature descriptor, namely 128.
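One existing eigenvalue decomposition routine that could serve in step A1035 is NumPy's `numpy.linalg.eigh`, which is intended for symmetric matrices such as a covariance matrix. Its use here is an assumption for illustration; the embodiment does not name a particular routine.

```python
import numpy as np

rng = np.random.default_rng(0)
sample_matrix = rng.random((1000, 128))           # placeholder L x 128 sample matrix
covariance = np.cov(sample_matrix, rowvar=False)  # 128 x 128, symmetric

# eigh returns the eigenvalues in ascending order; eigenvectors[:, j] is the
# 128-dimensional eigenvector that belongs to eigenvalues[j].
eigenvalues, eigenvectors = np.linalg.eigh(covariance)

print(eigenvalues.shape)   # (128,)
print(eigenvectors.shape)  # (128, 128)
```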
A1036, sorting the eigenvectors in descending order of their eigenvalues, and selecting the first K eigenvectors, wherein K is 32.
A1037, forming the dimension reduction matrix from the first K eigenvectors.
For example, the elements of each of the first K eigenvectors constitute the values of one row or one column of the dimension reduction matrix;
that is, the first K eigenvectors constitute the dimension reduction matrix, and the N elements of each eigenvector correspond to one column or one row of the dimension reduction matrix.
Specifically, if the 128 elements of each eigenvector correspond to a column of the dimension reduction matrix, a 128 × 32 dimension reduction matrix is obtained; if the 128 elements of each eigenvector correspond to a row of the dimension reduction matrix, a 32 × 128 dimension reduction matrix is obtained.
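Steps A1036 and A1037 can be sketched as follows, again under the assumption that NumPy is used and with random placeholder training data. The names `reduction_matrix_cols` and `reduction_matrix_rows` are illustrative only.

```python
import numpy as np

K = 32  # number of eigenvectors kept, as stated in step A1036

rng = np.random.default_rng(0)
sample_matrix = rng.random((1000, 128))           # placeholder L x 128 sample matrix
covariance = np.cov(sample_matrix, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(covariance)  # ascending eigenvalues

# Sort indices so that the largest eigenvalue comes first, then keep the first K.
order = np.argsort(eigenvalues)[::-1][:K]
top_eigenvalues = eigenvalues[order]

# Each selected eigenvector becomes one column: a 128 x 32 dimension reduction matrix.
reduction_matrix_cols = eigenvectors[:, order]

# Each selected eigenvector becomes one row: a 32 x 128 dimension reduction matrix.
reduction_matrix_rows = reduction_matrix_cols.T

print(reduction_matrix_cols.shape)  # (128, 32)
print(reduction_matrix_rows.shape)  # (32, 128)
```

The two layouts are transposes of each other, which matches the statement that the eigenvector elements may fill either the columns or the rows of the dimension reduction matrix.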
Optionally, in this embodiment, each row of the sample matrix in step A1031 contains N sample values; the dimension of the covariance matrix in step A1034 is N × N; and the dimension of each eigenvector in step A1035 is N;
wherein N equals 128 and K equals 32.
The image dataset of any of the above embodiments comprises at least a planar object image and a three-dimensional object image. Preferably, if the image data set includes only planar object images and three-dimensional object images, the proportion of planar object images may be 80% and the proportion of three-dimensional object images may be 20%.
In addition, to better explain the process of obtaining the low-dimensional local feature descriptor, this embodiment gives specific numerical values of the mean vector in table one below. The numerical values in table one are the values of the dimensions of the preset mean vector, arranged in order from left to right, and the first numerical value in the first row of table one is the first element of the preset mean vector.
table one:
0.078 0.049 0.035 0.043 0.067 0.055 0.05 0.058 0.116 0.069
0.042 0.045 0.062 0.052 0.054 0.079 0.118 0.077 0.05 0.049
0.06 0.045 0.044 0.072 0.081 0.058 0.047 0.052 0.063 0.041
0.036 0.051 0.096 0.056 0.037 0.052 0.083 0.062 0.05 0.064
0.156 0.084 0.042 0.051 0.075 0.06 0.053 0.09 0.155 0.087
0.05 0.058 0.072 0.053 0.046 0.089 0.101 0.064 0.048 0.059
0.076 0.051 0.039 0.06 0.096 0.063 0.05 0.063 0.083 0.052
0.037 0.056 0.156 0.09 0.053 0.06 0.075 0.051 0.042 0.085
0.155 0.088 0.046 0.053 0.073 0.057 0.05 0.087 0.101 0.059
0.039 0.051 0.076 0.058 0.048 0.064 0.078 0.058 0.05 0.056
0.067 0.042 0.034 0.049 0.116 0.078 0.054 0.052 0.062 0.044
0.042 0.069 0.118 0.071 0.044 0.045 0.06 0.049 0.05 0.077
0.081 0.051 0.036 0.042 0.063 0.052 0.047 0.059
The embodiment of the present invention further provides specific numerical values of the dimension reduction matrix, as shown in table two. The 32 numerical values in each row of the dimension reduction matrix are written into table two row by row, the numerical values in each row are written from left to right, and the first numerical value in the first row of table two is the first element in the first row of the dimension reduction matrix.
That is, the elements in table two constitute the dimension reduction matrix, or the elements in table two constitute the transpose of the dimension reduction matrix.
table two:
The dimension reduction matrix is used to perform dimension reduction on the local feature descriptors of any image to be processed. This removes redundant information from the local feature descriptors, reduces the influence of noise on their performance, and improves the performance, in image retrieval and matching, of the Fisher vectors aggregated from the dimension-reduced local feature descriptors.
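Applying the dimension reduction to the descriptors of an image to be processed can be sketched as below, following the sub-steps recited in claim 2: subtract the preset mean vector, form a data matrix, multiply by the dimension reduction matrix, and split the result. The function name `reduce_descriptors` is hypothetical, and the mean vector and matrix used here are random stand-ins, not the values of table one or table two.

```python
import numpy as np

def reduce_descriptors(descriptors, mean_vector, reduction_matrix):
    """Project M x N descriptors to M x K using an N x K reduction matrix."""
    # Subtract the preset mean vector from every descriptor (row layout).
    data_matrix = descriptors - mean_vector
    # Multiply the M x N data matrix by the N x K dimension reduction matrix.
    result_matrix = data_matrix @ reduction_matrix
    # Split the M x K result matrix: each row is one low-dimensional descriptor.
    return [row for row in result_matrix]

rng = np.random.default_rng(0)
descriptors = rng.random((300, 128))   # placeholder descriptors of one image
mean_vector = rng.random(128) * 0.1    # stand-in for the preset mean vector
# Stand-in for the trained matrix: 128 x 32 with orthonormal columns.
reduction_matrix = np.linalg.qr(rng.standard_normal((128, 32)))[0]

low_dim = reduce_descriptors(descriptors, mean_vector, reduction_matrix)
print(len(low_dim), low_dim[0].shape)  # 300 (32,)
```

If the K × N layout of the dimension reduction matrix is used instead, the data matrix is arranged as N × M and the low-dimensional descriptors are read from the columns of the K × M result.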
The low-dimensional local feature descriptors of the above embodiments may have lower time and space complexity when aggregated into the Fisher vector, so that the dimensionality of the Fisher vector obtained by aggregation is also relatively lower, and the space required for compressing the Fisher vector is reduced.
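To make the effect on the Fisher vector size concrete, the following sketch compares vector lengths before and after dimension reduction. It assumes a common Fisher vector formulation in which each Gaussian component contributes gradients with respect to its mean and its variance, so the length is 2 × G × D for G components and D-dimensional descriptors. Neither this formulation nor the value of G is stated in this embodiment; both are assumptions for illustration.

```python
def fisher_vector_length(descriptor_dim, num_gaussians, terms_per_component=2):
    """Length of a Fisher vector under the assumed mean-and-variance formulation."""
    return terms_per_component * num_gaussians * descriptor_dim

NUM_GAUSSIANS = 128  # hypothetical number of Gaussian mixture components

full_length = fisher_vector_length(128, NUM_GAUSSIANS)    # original descriptors
reduced_length = fisher_vector_length(32, NUM_GAUSSIANS)  # low-dimensional descriptors

print(full_length, reduced_length)  # 32768 8192
```

Under these assumptions the Fisher vector shrinks by the same factor of four as the descriptors, regardless of the number of components chosen.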
Further, the above method may be implemented on any terminal, in particular a mobile terminal. Given the wireless network bandwidth available in the prior art, the Fisher vector aggregated from the low-dimensional local feature descriptors obtained in this embodiment can be transmitted faster, so the response time of image retrieval or image classification is shortened; in addition, aggregating the Fisher vectors from the low-dimensional local feature descriptors can improve the discrimination and robustness of the Fisher vectors.
Those of ordinary skill in the art will understand that: all or a portion of the steps of implementing the above-described method embodiments may be performed by hardware associated with program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs steps comprising the method embodiments described above; and the aforementioned storage medium includes: various media that can store program codes, such as ROM, RAM, magnetic or optical disks.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for obtaining a low-dimensional local feature descriptor, comprising:
acquiring a local feature descriptor of an image to be processed;
forming the obtained local feature descriptors into a descriptor set;
reducing the dimension of each local feature descriptor in the descriptor set according to the dimension reduction matrix to obtain a low-dimensional local feature descriptor corresponding to each local feature descriptor; the dimension reduction matrix is obtained by training a preset image data set.
2. The method according to claim 1, wherein performing dimension reduction on each local feature descriptor in the descriptor set according to a dimension reduction matrix to obtain a low-dimensional local feature descriptor corresponding to each local feature descriptor comprises:
subtracting a preset mean vector from each local feature descriptor in the descriptor set to obtain a converted local feature descriptor;
forming the converted local feature descriptors into a data matrix;
multiplying the dimensionality reduction matrix and the data matrix to obtain a result matrix;
splitting the result matrix to obtain a low-dimensional local feature descriptor;
the preset mean vector is obtained by training a preset image data set, and the dimensionality of the preset mean vector is the same as that of the local feature descriptor.
3. The method of claim 2, wherein the forming the converted local feature descriptors into a data matrix comprises:
when the dimensionality of each converted local feature descriptor is N, forming elements on each dimensionality of each local feature descriptor into numerical values on a corresponding row in the data matrix to obtain a data matrix with the dimension of M x N;
or,
when the dimensionality of each converted local feature descriptor is N, forming elements on each dimensionality of each local feature descriptor into numerical values on a corresponding column in the data matrix to obtain a data matrix with the dimension of N x M;
wherein M is the number of the converted local feature descriptors, M is a natural number, and N is equal to 128.
4. The method according to claim 3, wherein the dimensionality reduction matrix is a matrix obtained from the image dataset by principal component analysis, and the dimensionality of the dimensionality reduction matrix is N x K, or the dimensionality of the dimensionality reduction matrix is K x N;
when the dimensionality of the dimensionality reduction matrix is N x K and the dimensionality of the data matrix is M x N, the dimensionality of the result matrix is M x K; or,
when the dimensionality of the dimensionality reduction matrix is K x N and the dimensionality of the data matrix is N x M, the dimensionality of the result matrix is K x M;
where K is equal to 32.
5. The method of claim 4, wherein the splitting the result matrix to obtain the low-dimensional local feature descriptors comprises:
if the dimension of the result matrix is M x K, extracting the numerical value in each row in the result matrix, and taking the extracted numerical value of each row as a low-dimensional local feature descriptor;
or,
if the dimension of the result matrix is K x M, extracting a numerical value in each column in the result matrix, and taking the extracted numerical value in each column as a low-dimensional local feature descriptor;
wherein M is the number of the converted local feature descriptors, M is a natural number, and K is equal to 32.
6. The method of claim 2, wherein the splitting the result matrix to obtain the low-dimensional local feature descriptors comprises:
extracting a numerical value in each row in the result matrix, and taking the extracted numerical value in each row as a low-dimensional local feature descriptor to obtain M low-dimensional local feature descriptors, wherein the dimension of each low-dimensional local feature descriptor is K;
wherein M is the number of the converted local feature descriptors, M is a natural number, and K is equal to 32;
or,
extracting a numerical value in each column in the result matrix, and taking the extracted numerical value in each column as a low-dimensional local feature descriptor to obtain M low-dimensional local feature descriptors, wherein the dimension of each low-dimensional local feature descriptor is K;
wherein M is the number of the converted local feature descriptors, M is a natural number, and K is equal to 32.
7. The method according to claim 1, wherein before the local feature descriptors in the descriptor set are dimension-reduced according to a dimension-reducing matrix to obtain low-dimensional local feature descriptors, the method further comprises:
obtaining a sample matrix of the image dataset;
obtaining a mean vector according to the sample matrix;
centralizing the sample matrix by using the mean vector to obtain a centralized sample matrix;
calculating a covariance matrix of the centered sample matrix;
acquiring an eigenvalue of the covariance matrix and an eigenvector corresponding to the eigenvalue;
sorting the eigenvectors from big to small according to the magnitude of the eigenvalue, and selecting the first K eigenvectors;
forming the first K eigenvectors into the dimensionality reduction matrix;
where K is equal to 32.
8. The method of claim 7, wherein each local feature descriptor of each image in the image dataset corresponds to a row of values in the sample matrix, each image in the image dataset corresponds to a number of rows of sample values in the sample matrix, and there are N sample values in each row of the sample matrix;
the obtaining of the mean vector according to the sample matrix includes:
averaging all values on each column of the sample matrix, the value of the ith dimension of the mean vector being equal to the average value of the ith column of the sample matrix, where i is 1, …, N;
the centralizing the sample matrix by using the mean vector to obtain a centralized sample matrix includes:
subtracting the value of the ith dimension of the mean vector from the ith numerical value of each row of the sample matrix to obtain a centralized sample matrix, wherein i is 1, …, N;
the dimension of the covariance matrix is N x N;
the dimension of each eigenvector is N;
the elements of each of the first K eigenvectors form the numerical values of one row or one column of the dimension reduction matrix;
or,
each local feature descriptor of each image in the image data set corresponds to a column of values in the sample matrix, each image in the image data set corresponds to a plurality of columns of sample values in the sample matrix, and N sample values are provided for each column in the sample matrix;
the obtaining of the mean vector according to the sample matrix includes:
averaging all values in each row of the sample matrix, wherein the value in the ith dimension of the mean vector is equal to the average value in the ith row of the sample matrix, and i is 1, …, N;
the centralizing the sample matrix by using the mean vector to obtain a centralized sample matrix includes:
subtracting the value of the ith dimension of the mean vector from the ith value on each column of the sample matrix to obtain a centralized sample matrix, wherein i is 1, …, N;
the dimension of the covariance matrix is N x N;
the dimension of each eigenvector is N;
the elements of each of the first K eigenvectors form the numerical values of one column or one row of the dimension reduction matrix;
where N equals 128 and K equals 32.
9. The method of claim 8, wherein the image dataset comprises:
planar object images and three-dimensional object images.
10. The method according to any one of claims 2 to 9,
the numerical values in table one are the values of the dimensions of the preset mean vector, arranged in order from left to right, and the first numerical value in table one is the first element of the preset mean vector;
table one:
0.078 0.049 0.035 0.043 0.067 0.055 0.05 0.058 0.116 0.069 0.042 0.045 0.062 0.052 0.054 0.079 0.118 0.077 0.05 0.049 0.06 0.045 0.044 0.072 0.081 0.058 0.047 0.052 0.063 0.041 0.036 0.051 0.096 0.056 0.037 0.052 0.083 0.062 0.05 0.064 0.156 0.084 0.042 0.051 0.075 0.06 0.053 0.09 0.155 0.087 0.05 0.058 0.072 0.053 0.046 0.089 0.101 0.064 0.048 0.059 0.076 0.051 0.039 0.06 0.096 0.063 0.05 0.063 0.083 0.052 0.037 0.056 0.156 0.09 0.053 0.06 0.075 0.051 0.042 0.085 0.155 0.088 0.046 0.053 0.073 0.057 0.05 0.087 0.101 0.059 0.039 0.051 0.076 0.058 0.048 0.064 0.078 0.058 0.05 0.056 0.067 0.042 0.034 0.049 0.116 0.078 0.054 0.052 0.062 0.044 0.042 0.069 0.118 0.071 0.044 0.045 0.06 0.049 0.05 0.077 0.081 0.051 0.036 0.042 0.063 0.052 0.047 0.059
the elements in table two form the dimension reduction matrix, or the elements in table two form the transpose of the dimension reduction matrix;
the numerical values in table two are the values of the rows of the dimension reduction matrix, the numerical values in each row are arranged in order from left to right, and the first numerical value in the first row of table two is the first element in the first row of the dimension reduction matrix;
table two:
CN201410183573.2A 2014-04-30 2014-04-30 Method for acquiring low-dimensional local characteristics descriptor Pending CN104616013A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410183573.2A CN104616013A (en) 2014-04-30 2014-04-30 Method for acquiring low-dimensional local characteristics descriptor

Publications (1)

Publication Number Publication Date
CN104616013A true CN104616013A (en) 2015-05-13

Family

ID=53150450

Country Status (1)

Country Link
CN (1) CN104616013A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102521618A (en) * 2011-11-11 2012-06-27 北京大学 Extracting method for local descriptor, image searching method and image matching method
CN102968632A (en) * 2012-10-15 2013-03-13 北京大学 Method for obtaining compact global characteristic descriptors of images and image searching method
CN103226589A (en) * 2012-10-15 2013-07-31 北京大学 Method for obtaining compact global feature descriptors of image and image retrieval method
CN103218427A (en) * 2013-04-08 2013-07-24 北京大学 Local descriptor extracting method, image searching method and image matching method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
刘红岩: "《商务智能方法与应用》", 31 May 2013, 清华大学出版社 *
吴朝晖 等: "《说话人识别模型与方法》", 31 March 2009, 清华大学出版社 *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105005783A (en) * 2015-05-18 2015-10-28 电子科技大学 Method of extracting classification information from high dimensional asymmetric data
CN104978395A (en) * 2015-05-22 2015-10-14 北京交通大学 Vision dictionary construction and application method and apparatus
CN104978395B (en) * 2015-05-22 2019-05-21 北京交通大学 Visual dictionary building and application method and device
CN106408037A (en) * 2015-07-30 2017-02-15 阿里巴巴集团控股有限公司 Image recognition method and apparatus
CN106408037B (en) * 2015-07-30 2020-02-18 阿里巴巴集团控股有限公司 Image recognition method and device
CN106503143A (en) * 2016-10-21 2017-03-15 广东工业大学 A kind of image search method and device
CN108446890A (en) * 2018-02-26 2018-08-24 平安普惠企业管理有限公司 A kind of examination & approval model training method, computer readable storage medium and terminal device
CN108984340A (en) * 2018-06-06 2018-12-11 深圳先进技术研究院 Fault-tolerant guard method, device, equipment and the storage medium of memory data
CN108984340B (en) * 2018-06-06 2021-07-23 深圳先进技术研究院 Fault-tolerant protection method, device, equipment and storage medium for storage data
CN111612099A (en) * 2020-06-03 2020-09-01 江苏科技大学 Texture image classification method and system based on local sorting difference refinement mode
CN111612099B (en) * 2020-06-03 2022-11-29 江苏科技大学 Texture image classification method and system based on local sorting difference refinement mode

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20150513
