CN104616013A

CN104616013A - Method for acquiring low-dimensional local characteristics descriptor

Info

Publication number: CN104616013A
Application number: CN201410183573.2A
Authority: CN
Inventors: 段凌宇; 林杰; 王哲; 杨爽; 陈杰; 黄铁军; 高文
Original assignee: Peking University
Current assignee: Peking University
Priority date: 2014-04-30
Filing date: 2014-04-30
Publication date: 2015-05-13

Abstract

The invention provides a method for acquiring a low-dimensional local characteristics descriptor. The method comprises the steps of acquiring the local characteristics descriptors of an image to be processed; forming a descriptor set from the acquired local characteristics descriptors; and reducing the dimension of each local characteristics descriptor in the descriptor set according to a dimension reduction matrix to obtain the low-dimensional local characteristics descriptor corresponding to each local characteristics descriptor, wherein the dimension reduction matrix is a matrix obtained by training on a preset image data set. The method reduces the dimensionality of the local characteristics descriptors of the prior art and removes their redundant information.

Description

Method for obtaining low-dimensional local feature descriptor

Technical Field

The embodiment of the invention relates to the field of computers, in particular to a method for acquiring a low-dimensional local feature descriptor.

Background

At present, mobile visual search is used more and more widely, and the industry generally aggregates local feature descriptors into global feature descriptors to realize image retrieval or classification. For example, the local feature descriptors are aggregated into global feature descriptors such as Fisher vectors.

In the prior art, local feature descriptors are aggregated into global feature descriptors for image retrieval or classification as follows: the local feature descriptors of the image are extracted, and Fisher vectors are aggregated directly from them. However, the extracted local feature descriptors have a high dimensionality, so the time and space complexity of aggregating the Fisher vector is high. The high dimensionality of the local feature descriptors also makes the resulting Fisher vector high-dimensional, so the global feature descriptor occupies a large amount of space, which easily causes transmission delay and lengthens the response time of image retrieval or image classification.

In addition, aggregating Fisher vectors directly from the local feature descriptors reduces the discriminative power of the aggregated Fisher vectors and leaves them without robustness, which further reduces the accuracy of image retrieval.

Disclosure of Invention

In order to solve the defects in the prior art, the invention provides a method for acquiring a low-dimensional local feature descriptor, which is used for reducing the dimensionality of the local feature descriptor in the prior art and removing redundant information of the local feature descriptor in the prior art.

The invention provides a method for acquiring a low-dimensional local feature descriptor, which comprises the following steps:

acquiring a local feature descriptor of an image to be processed;

forming the obtained local feature descriptors into a descriptor set;

reducing the dimension of each local feature descriptor in the descriptor set according to the dimension reduction matrix to obtain a low-dimensional local feature descriptor corresponding to each local feature descriptor; the dimension reduction matrix is obtained by training a preset image data set.

Optionally, performing dimension reduction on each local feature descriptor in the descriptor set according to a dimension reduction matrix to obtain a low-dimensional local feature descriptor corresponding to each local feature descriptor, including:

subtracting a preset mean vector from each local feature descriptor in the descriptor set to obtain a converted local feature descriptor;

forming the converted local feature descriptors into a data matrix;

multiplying the dimensionality reduction matrix and the data matrix to obtain a result matrix;

splitting the result matrix to obtain a low-dimensional local feature descriptor;

the preset mean vector is obtained by training a preset image data set, and the dimensionality of the preset mean vector is the same as that of the local feature descriptor.

Optionally, the forming the converted local feature descriptors into a data matrix includes:

when the dimensionality of each converted local feature descriptor is N, forming the elements on each dimension of each local feature descriptor into the numerical values of a corresponding row in the data matrix to obtain a data matrix with the dimension of M x N;

or,

when the dimensionality of each converted local feature descriptor is N, forming elements on each dimensionality of each local feature descriptor into numerical values on a corresponding column in the data matrix to obtain a data matrix with the dimension of N x M;

wherein M is the number of the converted local feature descriptors, M is a natural number, and N is equal to 128.

Optionally, the dimension reduction matrix is a matrix obtained from the image dataset by adopting a principal component analysis method, and the dimension of the dimension reduction matrix is N × K, or the dimension of the dimension reduction matrix is K × N;

when the dimensionality of the dimensionality reduction matrix is N x K and the dimensionality of the data matrix is M x N, the dimensionality of the result matrix is M x K; or,

when the dimensionality of the dimensionality reduction matrix is K x N and the dimensionality of the data matrix is N x M, the dimensionality of the result matrix is K x M;

where K is equal to 32.

Optionally, the splitting the result matrix to obtain a low-dimensional local feature descriptor includes:

if the dimension of the result matrix is M x K, extracting the numerical value in each row in the result matrix, and taking the extracted numerical value of each row as a low-dimensional local feature descriptor;

or,

if the dimension of the result matrix is K x M, extracting a numerical value in each column in the result matrix, and taking the extracted numerical value in each column as a low-dimensional local feature descriptor;

wherein M is the number of the converted local feature descriptors, M is a natural number, and K is equal to 32.

In a preferred implementation, the numerical values in each row of the result matrix are extracted, and the extracted values of each row are taken as one low-dimensional local feature descriptor, to obtain M low-dimensional local feature descriptors, wherein the dimension of each low-dimensional local feature descriptor is K;

or,

the numerical values in each column of the result matrix are extracted, and the extracted values of each column are taken as one low-dimensional local feature descriptor, to obtain M low-dimensional local feature descriptors, wherein the dimension of each low-dimensional local feature descriptor is K;

wherein M is the number of the converted local feature descriptors, M is a natural number, and K is equal to 32.

Optionally, before the performing dimension reduction on the local feature descriptors in the descriptor set according to the dimension reduction matrix to obtain low-dimensional local feature descriptors, the method further includes:

obtaining a sample matrix of the image dataset;

obtaining a mean vector according to the sample matrix;

centralizing the sample matrix by using the mean vector to obtain a centralized sample matrix;

calculating a covariance matrix of the centered sample matrix;

acquiring an eigenvalue of the covariance matrix and an eigenvector corresponding to the eigenvalue;

sorting the eigenvectors from big to small according to the magnitude of the eigenvalue, and selecting the first K eigenvectors;

forming the first K eigenvectors into the dimensionality reduction matrix;

where K is equal to 32.

Optionally, each local feature descriptor of each image in the image dataset corresponds to a row of values in the sample matrix, each image in the image dataset corresponds to a number of rows of sample values in the sample matrix, and there are N sample values in each row of the sample matrix;

the obtaining of the mean vector according to the sample matrix includes:

averaging all values on each column of the sample matrix, the value of the ith dimension of the mean vector being equal to the average value of the ith column of the sample matrix, where i is 1, …, N;

the centralizing the sample matrix by using the mean vector to obtain a centralized sample matrix includes:

subtracting the value of the ith dimension of the mean vector from the ith numerical value of each row of the sample matrix to obtain a centralized sample matrix, wherein i is 1, …, N;

the dimension of the covariance matrix is N x N;

the dimension of each eigenvector is N;

the elements of all dimensions of each of the first K eigenvectors form the numerical values of one row or one column in the dimension reduction matrix;

or,

each local feature descriptor of each image in the image data set corresponds to a column of values in the sample matrix, each image in the image data set corresponds to a plurality of columns of sample values in the sample matrix, and N sample values are provided for each column in the sample matrix;

the obtaining of the mean vector according to the sample matrix includes:

averaging all values in each row of the sample matrix, wherein the value in the ith dimension of the mean vector is equal to the average value in the ith row of the sample matrix, and i is 1, …, N;

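the centralizing the sample matrix by using the mean vector to obtain a centralized sample matrix includes:

subtracting the value of the ith dimension of the mean vector from the ith value on each column of the sample matrix to obtain a centralized sample matrix, wherein i is 1, …, N;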

the dimension of the covariance matrix is N x N;

the dimension of each eigenvector is N;

the elements of all dimensions of each of the first K eigenvectors form the numerical values of one column or one row in the dimension reduction matrix;

where N equals 128 and K equals 32.

Optionally, the image dataset comprises: planar object images and three-dimensional object images.

According to the technical scheme, the method for obtaining the low-dimensional local feature descriptors comprises the steps of obtaining the local feature descriptors of the image to be processed, forming descriptor sets from all the obtained local feature descriptors, and reducing the dimension of each local feature descriptor by using a dimension reduction matrix to obtain the low-dimensional local feature descriptors of each local feature descriptor, so that the dimension of the local feature descriptors in the prior art can be reduced, and redundant information of the local feature descriptors in the prior art can be removed.

Drawings

Fig. 1 is a schematic flow chart of obtaining a low-dimensional local feature descriptor according to an embodiment of the present invention;

Fig. 2 is a schematic flow chart of obtaining a low-dimensional local feature descriptor according to another embodiment of the present invention;

Fig. 3 is a schematic diagram of a gradient direction histogram vector according to an embodiment of the present invention.

Detailed Description

Fig. 1 illustrates a method for obtaining a low-dimensional local feature descriptor according to an embodiment of the present invention. As shown in Fig. 1, the method in this embodiment comprises the following steps.

101. Acquire local feature descriptors of the image to be processed.

For example, the image to be processed may be any image, such as a photograph of a document, a hand-drawn picture, an oil painting, a frame captured from a video, a landmark photograph, or a photograph of an article.

In particular, one or more local feature descriptors of the image to be processed may be obtained in an existing manner. For example, the local feature descriptors may be Scale-Invariant Feature Transform (SIFT) descriptors, Speeded-Up Robust Features (SURF) descriptors, or other local feature descriptors.

It should be understood that SIFT or SURF descriptors may be extracted by an existing extraction method, which is not described in detail in this embodiment. Generally, a SIFT descriptor is 128-dimensional and a SURF descriptor is 64-dimensional.

Optionally, on the basis of the above manner of obtaining local feature descriptors, feature selection or other processing may be applied so that one or more of all the local feature descriptors corresponding to an image are selected.

102. Form the obtained local feature descriptors into a descriptor set.

In this embodiment, all the acquired local feature descriptors form the descriptor set.

103. Reduce the dimension of each local feature descriptor in the descriptor set according to the dimension reduction matrix to obtain a low-dimensional local feature descriptor corresponding to each local feature descriptor.

In this embodiment, the dimension reduction matrix in step 103 may be a matrix obtained by training a preset image data set.

Optionally, before step 103, normalization processing may be further performed on all local feature descriptors in the descriptor set, and then, in step 103, dimension reduction processing may be performed on the normalized local feature descriptors to obtain low-dimensional local feature descriptors corresponding to each local feature descriptor.

The steps of the normalization process are exemplified as follows:

A01. If the local feature descriptors are h_t, t = 0, ..., M-1, each dimension is normalized using the L1 norm:

h'_{t,j} = \frac{h_{t,j}}{\|h_t\|_1}, \quad j = 0, \ldots, 127

where \|h_t\|_1 denotes the sum of the absolute values of all dimensions of the 128-dimensional local feature descriptor vector h_t.

A02. Each dimension is then further normalized using power normalization with parameter 0.5:

h'_{t,j} \leftarrow \operatorname{sgn}(h'_{t,j}) \, |h'_{t,j}|^{0.5}

where |h'_{t,j}| denotes the absolute value of dimension h'_{t,j}.
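The two normalization steps A01 and A02 can be sketched as follows. This is a minimal illustration, assuming NumPy and a descriptor set stored one descriptor per row; the function name `l1_power_normalize` and the `eps` guard against all-zero descriptors are hypothetical and do not appear in the embodiment.

```python
import numpy as np

def l1_power_normalize(descriptors, alpha=0.5, eps=1e-12):
    """Apply L1 normalization (A01), then signed power normalization (A02).

    descriptors: array of shape (M, 128), one local feature descriptor per row.
    eps: hypothetical guard against division by zero (not in the source text).
    """
    h = np.asarray(descriptors, dtype=np.float64)
    # A01: divide every dimension by the sum of absolute values of its descriptor.
    l1 = np.abs(h).sum(axis=1, keepdims=True)
    h = h / np.maximum(l1, eps)
    # A02: h <- sgn(h) * |h|^alpha, with alpha = 0.5 in the embodiment.
    return np.sign(h) * np.abs(h) ** alpha

rng = np.random.default_rng(0)
demo = rng.random((3, 128))          # three synthetic non-negative descriptors
normalized = l1_power_normalize(demo)
```

With parameter 0.5, the squares of the output dimensions equal the L1-normalized absolute values, so each normalized descriptor in this sketch has unit L2 norm.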

It should be noted that the above method may be performed on any device; this embodiment does not limit whether the executing entity is a client or a server.

The method for obtaining the low-dimensional local feature descriptor in the embodiment can reduce the dimension of the local feature descriptor in the prior art and remove redundant information of the local feature descriptor in the prior art.

Fig. 2 illustrates a method for obtaining a low-dimensional local feature descriptor according to another embodiment of the present invention. As shown in Fig. 2, the method in this embodiment comprises the following steps.

201. Acquire local feature descriptors of the image to be processed.

In particular, the manner of obtaining the local feature descriptors of the image to be processed is exemplified as follows:

The first step: the image I to be processed is convolved with a group of Gaussian filters g(σ_k) to obtain Gaussian-blurred images of the image I at different scales in the Gaussian scale space, where σ is the standard deviation of the Gaussian and expresses the scale corresponding to each Gaussian-blurred image in the Gaussian scale space. σ is taken as an exponential power of 2; the scale of the k-th layer is σ_k, the initial scale is σ_0 = 1.6, and K represents the number of sampling layers of the scale space, namely the number of Gaussian filters. The k-th Gaussian-blurred image is I_k, with corresponding scale σ_k, and

I_k = I * g(\sigma_k), \quad k = 0, \ldots, K

The second step: in the Gaussian scale space, each Gaussian-blurred image is convolved with a scale-normalized Laplacian filter to obtain the Laplacian scale-space response, where

f = \begin{bmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{bmatrix}

is the Laplacian operator.
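The filtering with the Laplacian operator f can be sketched as below. This is an illustration only: it applies the 3 × 3 operator to one blurred image and omits the scale normalization factor, whose exact form is not given in the text. The function name `laplacian_response` and the edge-replicating border handling are assumptions.

```python
import numpy as np

# The 3x3 Laplacian operator f from the second step.
LAPLACIAN = np.array([[0.0, 1.0, 0.0],
                      [1.0, -4.0, 1.0],
                      [0.0, 1.0, 0.0]])

def laplacian_response(blurred):
    """Filter a 2-D Gaussian-blurred image with f (no scale normalization).

    Border pixels are handled by replicating the edge, which is an assumption.
    Because f is symmetric, correlation and convolution give the same result.
    """
    img = np.asarray(blurred, dtype=np.float64)
    rows, cols = img.shape
    padded = np.pad(img, 1, mode="edge")
    out = np.zeros_like(img)
    for dy in range(3):
        for dx in range(3):
            # Accumulate the shifted image weighted by the kernel entry.
            out += LAPLACIAN[dy, dx] * padded[dy:dy + rows, dx:dx + cols]
    return out

impulse = np.zeros((5, 5))
impulse[2, 2] = 1.0
response = laplacian_response(impulse)
```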

The third step: local maximum or minimum points in the Laplacian-of-Gaussian scale space are taken as candidate interest points. Each interest point has three attributes: its position coordinates x, y in the corresponding Gaussian-blurred image and the corresponding scale σ_k.
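A possible way to pick such extremum points is sketched below. The text does not specify the neighbourhood used for the comparison; this sketch assumes the common 3 × 3 × 3 neighbourhood across position and adjacent scales, and the function name `find_extrema` is hypothetical.

```python
import numpy as np

def find_extrema(responses):
    """Return (x, y, k) for strict local maxima or minima in a response stack.

    responses: array of shape (K, H, W), one Laplacian response per scale layer.
    The 3x3x3 comparison neighbourhood is an assumption, not taken from the text.
    """
    r = np.asarray(responses, dtype=np.float64)
    points = []
    for k in range(1, r.shape[0] - 1):
        for y in range(1, r.shape[1] - 1):
            for x in range(1, r.shape[2] - 1):
                cube = r[k - 1:k + 2, y - 1:y + 2, x - 1:x + 2]
                # Index 13 is the centre of the flattened 3x3x3 cube.
                others = np.delete(cube.ravel(), 13)
                center = r[k, y, x]
                if center > others.max() or center < others.min():
                    points.append((x, y, k))
    return points

stack = np.zeros((3, 5, 5))
stack[1, 2, 2] = 1.0                 # a single synthetic peak
candidates = find_extrema(stack)
```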

The fourth step: for each interest point, take the circular region of the Gaussian-blurred image I_k of the same scale, centered at x, y with radius mσ, where m is 3.96. Then, for the pixels in the circular region, the gradient of each pixel is calculated according to the following formulas, including the modulus of the gradient m_{I_k}(x, y) and the direction of the gradient θ_{I_k}(x, y):

m_{I_k}(x, y) = \sqrt{(I_k(x + 1, y) - I_k(x - 1, y))^2 + (I_k(x, y + 1) - I_k(x, y - 1))^2}

\theta_{I_k}(x, y) = \arctan \frac{I_k(x, y + 1) - I_k(x, y - 1)}{I_k(x + 1, y) - I_k(x - 1, y)}

The gradient direction of each pixel in the circular region is quantized, by nearest distance, to one of 36 directions that equally divide the circle. The gradient modulus of each pixel is then accumulated as a weight into its direction, yielding a 36-dimensional gradient direction histogram.
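The gradient computation and the 36-direction weighted histogram can be sketched as follows. Two points are assumptions of this sketch rather than statements of the embodiment: `np.arctan2` is used in place of the plain arctangent so that directions cover the full circle, and pixels on the image border are skipped. The function name `orientation_histogram` is hypothetical.

```python
import numpy as np

def orientation_histogram(blurred, cx, cy, radius, bins=36):
    """Accumulate gradient moduli into `bins` equally spaced directions.

    blurred: 2-D Gaussian-blurred image, indexed as blurred[y, x].
    (cx, cy): interest point position; radius: m * sigma in the text.
    """
    img = np.asarray(blurred, dtype=np.float64)
    height, width = img.shape
    hist = np.zeros(bins)
    r = int(np.ceil(radius))
    for y in range(max(1, cy - r), min(height - 1, cy + r + 1)):
        for x in range(max(1, cx - r), min(width - 1, cx + r + 1)):
            if (x - cx) ** 2 + (y - cy) ** 2 > radius ** 2:
                continue  # outside the circular region
            gx = img[y, x + 1] - img[y, x - 1]
            gy = img[y + 1, x] - img[y - 1, x]
            modulus = np.hypot(gx, gy)
            angle = np.arctan2(gy, gx) % (2 * np.pi)   # assumption: full-circle angle
            nearest = int(np.round(angle / (2 * np.pi / bins))) % bins
            hist[nearest] += modulus                   # weight by gradient modulus
    return hist

ramp_x = np.tile(np.arange(21, dtype=np.float64), (21, 1))   # intensity grows with x
hist_x = orientation_histogram(ramp_x, cx=10, cy=10, radius=5.0)
hist_y = orientation_histogram(ramp_x.T, cx=10, cy=10, radius=5.0)
```

For the horizontal ramp every gradient points along the positive x axis, so all weight falls into direction 0; for the transposed ramp it falls into the direction a quarter of the circle away.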

The fifth step: the direction with the largest accumulated value in the histogram is selected as the principal direction θ of the interest point. In addition, if the accumulated value of another direction exceeds 80% of the accumulated value of the principal direction, the interest point is duplicated as a new interest point, and that direction is used as the principal direction of the new interest point.
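The selection of the principal direction and the 80% rule can be illustrated with the short sketch below. It assumes that the histogram directions are equally spaced starting at angle 0; the function name `principal_directions` is hypothetical.

```python
import numpy as np

def principal_directions(hist, ratio=0.8):
    """Return the principal direction followed by any additional directions.

    The first entry is the direction with the largest accumulated value; every
    other direction whose value exceeds `ratio` times that maximum is appended,
    each one standing for a duplicated interest point.
    """
    h = np.asarray(hist, dtype=np.float64)
    bin_width = 2 * np.pi / h.size
    main_bin = int(np.argmax(h))
    threshold = ratio * h[main_bin]
    directions = [main_bin * bin_width]
    for b in range(h.size):
        if b != main_bin and h[b] > threshold:
            directions.append(b * bin_width)
    return directions

example_hist = np.zeros(36)
example_hist[3] = 10.0    # principal direction
example_hist[20] = 9.0    # exceeds 80% of 10, so a new interest point is created
example_hist[5] = 7.0     # does not exceed 80% of 10
angles = principal_directions(example_hist)
```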

Optionally, the interest points are ranked by importance according to attributes such as the position x, y, the scale σ and the direction θ, and the required M points are selected for subsequent global feature calculation.

The sixth step: for each detected interest point, take, on the Gaussian-blurred image I_k of the same scale, the square region centered at x, y with a radius of 3σ, with the coordinate system rotated to align with the principal direction θ. The square region is uniformly divided into 4 × 4 image blocks. After the gradient is calculated for each pixel in an image block, the gradient direction is quantized to 8 directions that equally divide the circle and a gradient direction histogram is calculated, the accumulation using trilinear interpolation. The 8-dimensional vectors corresponding to the gradient direction histograms of the image blocks are then concatenated in order from left to right and from top to bottom, as shown in Fig. 3, yielding a gradient direction histogram vector of 4 × 4 × 8 = 128 dimensions.
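The concatenation order of the block histograms can be made concrete with a small sketch. It assumes the 16 block histograms are already computed and stored in an array indexed as [block row, block column, direction]; the function name `concatenate_block_histograms` is hypothetical.

```python
import numpy as np

def concatenate_block_histograms(block_hists):
    """Concatenate 4x4 block histograms of 8 directions into one 128-d vector.

    block_hists[row, col] is the 8-dimensional histogram of the image block in
    the given row (top to bottom) and column (left to right). Row-major
    flattening visits blocks left to right, then top to bottom.
    """
    blocks = np.asarray(block_hists, dtype=np.float64)
    if blocks.shape != (4, 4, 8):
        raise ValueError("expected an array of shape (4, 4, 8)")
    return blocks.reshape(-1)

blocks = np.arange(128, dtype=np.float64).reshape(4, 4, 8)
vector = concatenate_block_histograms(blocks)
```

Under this layout, direction d of the block in row r and column c lands at index (4r + c) × 8 + d of the 128-dimensional vector.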

Finally, the resulting 128-dimensional gradient direction histogram vector is L2-normalized once. Each dimension is then truncated, that is, any value greater than 0.2 is set to 0.2. The truncated vector is L2-normalized once more, which generates the local feature descriptor.

If the gradient direction histogram vector is h and h_i is the i-th dimension of h, i = 0, ..., 127, the L2 normalization is:

h'_i = \frac{h_i}{\sqrt{\sum_{j=0}^{127} h_j^2}}

where h'_i is the value of the i-th dimension of h after L2 normalization.
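The normalize, truncate, and renormalize sequence can be sketched as follows, assuming NumPy. The function name `finalize_descriptor` and the `eps` guard against a zero vector are hypothetical additions for the illustration.

```python
import numpy as np

def finalize_descriptor(hist, clip=0.2, eps=1e-12):
    """L2-normalize, truncate each dimension at `clip`, then L2-normalize again.

    hist: 128-dimensional gradient direction histogram vector.
    eps: hypothetical guard against division by zero (not in the source text).
    """
    h = np.asarray(hist, dtype=np.float64)
    h = h / max(np.linalg.norm(h), eps)     # first L2 normalization
    h = np.minimum(h, clip)                 # values greater than 0.2 become 0.2
    return h / max(np.linalg.norm(h), eps)  # second L2 normalization

raw = np.ones(128)
raw[0] = 10.0                               # one dominant dimension
descriptor = finalize_descriptor(raw)
```

Truncation limits the influence of a single dominant dimension: in this example the first dimension ends up smaller than it would after plain L2 normalization.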

202. Form the obtained local feature descriptors into a descriptor set.

203. Subtract a preset mean vector from each local feature descriptor in the descriptor set to obtain converted local feature descriptors.

204. Form the converted local feature descriptors into a data matrix.

For example, when the dimension of each converted local feature descriptor is N, the elements in each dimension of each local feature descriptor are combined into a numerical value in a corresponding row in the data matrix to obtain an M × N-dimensional data matrix;

or,

when the dimensionality of each converted local feature descriptor is N, forming elements on each dimensionality of each local feature descriptor into numerical values on a corresponding column in the data matrix to obtain a data matrix with dimensions of N x M;

M is the number of the converted local feature descriptors in the descriptor set, and N is equal to 128.

For example, if in step 201 the dimension of each local feature descriptor is N = 128 and 300 local feature descriptors are obtained, that is, M = 300, then taking the 128 elements of each converted local feature descriptor as one row of the data matrix yields a 300 × 128 data matrix. Of course, if the 128 elements of each converted local feature descriptor are taken as one column of the data matrix, a 128 × 300 data matrix is obtained.

205. Multiply the dimension reduction matrix and the data matrix to obtain a result matrix.

In this embodiment, the dimension reduction matrix may be a matrix obtained from the image dataset by using a principal component analysis method, where the dimension of the dimension reduction matrix is N × K, or the dimension of the dimension reduction matrix is K × N, where K is equal to 32;

As can be seen from the above, each row of the dimension reduction matrix has as many elements as the local feature descriptor has dimensions (for example, 128 elements for a 128-dimensional local feature descriptor), and each column has as many elements as the low-dimensional local feature descriptor has dimensions (for example, 32 elements for a 32-dimensional low-dimensional local feature descriptor), giving a 32 × 128 matrix;

or,

each column of the dimension reduction matrix has as many elements as the local feature descriptor has dimensions (for example, 128), and each row has as many elements as the low-dimensional local feature descriptor has dimensions (for example, 32), giving a 128 × 32 matrix.

Therefore, the dimension reduction matrix is a 128 × 32 or 32 × 128 matrix.

Note that, when the dimension of the dimensionality reduction matrix in this step is N × K and the dimension of the data matrix is M × N, the dimension of the result matrix is M × K.

Or, in this step, when the dimension of the dimensionality reduction matrix is K × N and the dimension of the data matrix is N × M, the dimension of the result matrix is K × M.

Specifically, if the dimension of the data matrix is 300 × 128 and the dimension of the dimension reduction matrix is 128 × 32, the result matrix, obtained as the product of the data matrix and the dimension reduction matrix, has the dimension 300 × 32.

This calculation is a conventional matrix multiplication and is not described in detail in this embodiment.

206. Split the result matrix to obtain the low-dimensional local feature descriptors.

For example, if the dimension of the result matrix is M × K, extracting a value in each row in the result matrix, and using the extracted value of each row as a low-dimensional local feature descriptor;

or if the dimension of the result matrix is K × M, extracting a numerical value in each column in the result matrix, and taking the extracted numerical value in each column as a low-dimensional local feature descriptor.

M and K are as described above.

In a preferred implementation, a numerical value in each row in the result matrix is extracted, the extracted numerical value in each row is used as a low-dimensional local feature descriptor, M low-dimensional local feature descriptors are obtained, and the dimension of each low-dimensional local feature descriptor is K;

or extracting a numerical value in each column in the result matrix, and taking the extracted numerical value in each column as a low-dimensional local feature descriptor to obtain M low-dimensional local feature descriptors, wherein the dimension of each low-dimensional local feature descriptor is K.

That is, each row (or each column) in the result matrix corresponds to one low-dimensional local feature descriptor, and the dimension of the low-dimensional local feature descriptor is K.

For example, if the dimension of the result matrix is 300 × 32, each row of the result matrix corresponds to one reduced local feature descriptor; if the dimension of the result matrix is 32x300, each column of the result matrix corresponds to one reduced local feature descriptor.
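Steps 203 to 206 can be sketched end to end as below, using the row layout (an M × N data matrix and an N × K dimension reduction matrix). The function name `reduce_descriptors` is hypothetical, and the mean vector and dimension reduction matrix in the demonstration are random stand-ins, not the trained values of the embodiment.

```python
import numpy as np

def reduce_descriptors(descriptors, mean_vector, reduction_matrix):
    """Reduce M descriptors of dimension N to M descriptors of dimension K.

    descriptors: (M, N) array, one local feature descriptor per row.
    mean_vector: (N,) preset mean vector.
    reduction_matrix: (N, K) dimension reduction matrix.
    """
    # Steps 203 and 204: subtract the mean and keep one descriptor per row.
    data = np.asarray(descriptors, dtype=np.float64) - mean_vector
    # Step 205: (M, N) times (N, K) gives the (M, K) result matrix.
    result = data @ reduction_matrix
    # Step 206: each row of the result matrix is one low-dimensional descriptor.
    return [row for row in result]

rng = np.random.default_rng(0)
descriptors = rng.random((300, 128))                 # M = 300, N = 128
mean_vector = descriptors.mean(axis=0)               # stand-in for the preset mean
reduction_matrix = rng.standard_normal((128, 32))    # stand-in, K = 32
low_dim = reduce_descriptors(descriptors, mean_vector, reduction_matrix)
```

The column layout gives the same descriptors: multiplying the transposed (K × N) dimension reduction matrix by the transposed (N × M) data matrix produces the transpose of the result matrix, whose columns are then the low-dimensional descriptors.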

Specifically, the reduction of one local feature descriptor to one low-dimensional local feature descriptor in steps 201 to 206 can be expressed by the following formula, with the dimension reduction matrix in its K × N form:

x_t = P (h'_t - \mu)

where x_t is the low-dimensional local feature descriptor, P is the dimension reduction matrix, h'_t is the local feature descriptor, and μ is the preset mean vector.

Because K is 32, the finally obtained low-dimensional local feature descriptor has 32 dimensions. This reduces the dimension of the local feature descriptor of the prior art, removes its redundant information, and limits the influence of noise on the performance of the local feature descriptor.

In particular, aggregating Fisher vectors from the low-dimensional local feature descriptors has lower time and space complexity and produces lower-dimensional Fisher vectors. This reduces the space required for the compressed Fisher vectors and the delay caused by wireless network transmission, and improves the performance of the aggregated Fisher vectors in image retrieval and matching.

In another optional implementation, the aforementioned step 103 in Fig. 1 may specifically include the following exemplary sub-steps A1031 to A1037, which are not shown in the figure.

A1031. Acquire a sample matrix of the image dataset from the image dataset.

For example, a number of local feature descriptors for each image in the image dataset make up the sample values for each row in the sample matrix. Alternatively, the local feature descriptors of each image in the image dataset constitute sample values for each column in the sample matrix.

In particular, the image dataset covers all kinds of images that may occur in practical applications, including planar object images, such as business cards, CD covers, DVD covers, newspapers, paintings, and video frames, as well as three-dimensional object images, such as photographs of landmark buildings and of various three-dimensional real objects. The image dataset should contain a full range of image types in appropriate proportions, for example: 80% planar object images and 20% three-dimensional object images.

A local feature descriptor of an image in the image dataset is obtained in the manner described above for step 201.

Preferably, the dimensionality of the local feature descriptors is 128, and if the number of the obtained local feature descriptors is L and each row of the sample matrix corresponds to one local feature descriptor, an Lx128 sample matrix is obtained; if each column of the sample matrix corresponds to a local feature descriptor, a 128xL sample matrix is obtained.

A1032. Obtain a mean vector from the sample matrix.

For example, if the sample matrix is a matrix of L × 128, then all values on each column of the sample matrix are averaged, and the value of the ith dimension of the mean vector is equal to the average value of the ith column of the sample matrix, where i is 1, …, N;

or,

if the sample matrix is a 128 × L matrix, averaging all values in each row of the sample matrix, where the value of the ith dimension of the preset average vector is equal to the average value in the ith row of the sample matrix, where i is 1, …, N;

A1033. Center the sample matrix using the mean vector to obtain a centered sample matrix.

For example, if the sample matrix is a matrix of L × 128, the i-th dimension of the preset mean vector is subtracted from the i-th value on each row of the sample matrix to obtain a centered sample matrix, where i is 1, …, N;

or,

if the sample matrix is a 128 × L matrix, subtracting the ith dimension value of the preset mean vector from the ith value on each column of the sample matrix to obtain a centralized sample matrix, wherein i is 1, …, N;

and A1034, calculating a covariance matrix of the sample matrix.

Taking the local feature descriptor as an example, a 128 × 128 covariance matrix is obtained.

And A1035, acquiring an eigenvalue of the covariance matrix and an eigenvector corresponding to the eigenvalue.

For example, the eigenvalue of the covariance matrix and the eigenvector corresponding to the eigenvalue may be calculated using an existing eigenvalue decomposition method.

In a specific application, the dimension of each eigenvector is equal to that of the local feature descriptor, namely 128.

A1036. Sort the eigenvectors in descending order of their eigenvalues and select the first K eigenvectors, where K is 32.

A1037. Form the first K eigenvectors into the dimension reduction matrix.

For example, the elements of all dimensions of each of the first K eigenvectors form the values of one row or one column in the dimension reduction matrix;

that is, the first K eigenvectors constitute the dimension reduction matrix, and the N elements of each eigenvector correspond to one column or one row of the dimension reduction matrix.

Specifically, if the 128 elements of each eigenvector correspond to a column of the dimension reduction matrix, a 128 × 32 dimension reduction matrix is obtained; if the 128 elements of each eigenvector correspond to a row of the dimension reduction matrix, a 32 × 128 dimension reduction matrix is obtained.
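Sub-steps A1032 to A1037 can be sketched as follows for the row layout (an L × N sample matrix). This is an illustration under stated assumptions: the covariance is scaled by 1/L, which the text does not specify, `np.linalg.eigh` stands in for "an existing eigenvalue decomposition method", and the function name `train_reduction_matrix` and the random sample matrix are hypothetical.

```python
import numpy as np

def train_reduction_matrix(samples, k=32):
    """Return (mean_vector, reduction_matrix) from an (L, N) sample matrix."""
    x = np.asarray(samples, dtype=np.float64)
    mean_vector = x.mean(axis=0)                       # A1032: column-wise mean
    centered = x - mean_vector                         # A1033: centering
    covariance = centered.T @ centered / x.shape[0]    # A1034: N x N, 1/L scaling assumed
    # A1035: eigh suits symmetric matrices and returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1][:k]          # A1036: largest K eigenvalues
    reduction_matrix = eigenvectors[:, order]          # A1037: one eigenvector per column
    return mean_vector, reduction_matrix

rng = np.random.default_rng(0)
samples = rng.random((500, 128))                       # synthetic stand-in for the dataset
mean_vec, reduction = train_reduction_matrix(samples)
projected = (samples - mean_vec) @ reduction           # 500 x 32
```

Placing each eigenvector in a column gives the 128 × 32 form; transposing the returned matrix gives the 32 × 128 form.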

Optionally, in this embodiment, there are N sample values in each row of the sample matrix of step A1031; the dimension of the covariance matrix in step A1034 is N × N; the dimension of each eigenvector in step A1035 is N;

wherein N equals 128 and K equals 32.

The image dataset of any of the above embodiments comprises at least planar object images and three-dimensional object images. Preferably, if the image dataset includes only planar object images and three-dimensional object images, the proportion of planar object images may be 80% and the proportion of three-dimensional object images may be 20%.

In addition, to better explain the process of obtaining the low-dimensional local feature descriptor in this embodiment, specific values of the preset mean vector are given in Table One below. The values in Table One are the values of the individual dimensions of the mean vector, arranged in order from left to right, and the first value in the first row of Table One is the first element of the mean vector.

Table One:

0.078	0.049	0.035	0.043	0.067	0.055	0.05	0.058	0.116	0.069	0.042	0.045	0.062	0.052	0.054	0.079	0.118	0.077	0.05	0.049
0.06	0.045	0.044	0.072	0.081	0.058	0.047	0.052	0.063	0.041	0.036	0.051	0.096	0.056	0.037	0.052	0.083	0.062	0.05	0.064
0.156	0.084	0.042	0.051	0.075	0.06	0.053	0.09	0.155	0.087	0.05	0.058	0.072	0.053	0.046	0.089	0.101	0.064	0.048	0.059
0.076	0.051	0.039	0.06	0.096	0.063	0.05	0.063	0.083	0.052	0.037	0.056	0.156	0.09	0.053	0.06	0.075	0.051	0.042	0.085
0.155	0.088	0.046	0.053	0.073	0.057	0.05	0.087	0.101	0.059	0.039	0.051	0.076	0.058	0.048	0.064	0.078	0.058	0.05	0.056
0.067	0.042	0.034	0.049	0.116	0.078	0.054	0.052	0.062	0.044	0.042	0.069	0.118	0.071	0.044	0.045	0.06	0.049	0.05	0.077
0.081	0.051	0.036	0.042	0.063	0.052	0.047	0.059

The embodiment of the present invention further provides specific values of the dimension reduction matrix, as shown in Table Two. The 32 values of each row of the dimension reduction matrix are written into Table Two row by row, the values within each row are arranged in order from left to right, and the first value in the first row of Table Two is the first element of the first row of the dimension reduction matrix. That is, the elements in Table Two constitute the dimension reduction matrix, or the elements in Table Two constitute the transpose of the dimension reduction matrix.

Table Two:

Using the dimension reduction matrix to reduce the dimension of the local feature descriptors of any image to be processed removes redundant information from the local feature descriptors, avoids the influence of noise on their performance, and improves the image retrieval and matching performance of the Fisher vector aggregated from the dimension-reduced local feature descriptors.
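How a dimension reduction matrix might be applied to the descriptors of one image can be sketched as below. This is a minimal illustration under stated assumptions: the conversion is assumed to be subtraction of the mean vector, the matrix is assumed to be in the N × K layout, and the function name `reduce_descriptors` and all data are hypothetical stand-ins, not the values of Table One or Table Two.

```python
import numpy as np

def reduce_descriptors(descriptors, mean_vector, reduction_matrix):
    """Project M descriptors of dimension N to dimension K (N x K matrix)."""
    # Assumed conversion: subtract the mean vector from every descriptor.
    data = np.asarray(descriptors, dtype=float) - mean_vector  # M x N data matrix
    result = data @ reduction_matrix                            # M x K result matrix
    return list(result)  # split into one low-dimensional descriptor per row

# Hypothetical stand-ins: 5 descriptors, a random mean vector, and a random
# orthonormal 128 x 32 matrix in place of the trained values.
rng = np.random.default_rng(0)
descriptors = rng.random((5, 128))
mean_vector = rng.random(128)
reduction_matrix, _ = np.linalg.qr(rng.standard_normal((128, 32)))

low_dim = reduce_descriptors(descriptors, mean_vector, reduction_matrix)
print(len(low_dim), low_dim[0].shape)  # 5 (32,)
```

With a K × N matrix layout, the same result would be obtained by multiplying by the transpose instead.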

Aggregating the low-dimensional local feature descriptors of the above embodiments into a Fisher vector has lower time and space complexity, the dimensionality of the resulting Fisher vector is correspondingly lower, and the space required to compress the Fisher vector is reduced.
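The effect on the Fisher vector size can be illustrated with a short calculation. The sketch assumes a commonly used Fisher vector layout in which each Gaussian component contributes one gradient block for the means and one for the variances; the number of Gaussians (128 here) and the choice of gradient components are assumptions, since the text does not specify them.

```python
def fisher_vector_dim(descriptor_dim, num_gaussians, use_variance=True):
    """Length of a Fisher vector under the assumed mean(+variance) layout."""
    per_gaussian = descriptor_dim * (2 if use_variance else 1)
    return num_gaussians * per_gaussian

full = fisher_vector_dim(128, 128)    # original 128-dimensional descriptors
reduced = fisher_vector_dim(32, 128)  # descriptors reduced to K = 32
print(full, reduced)  # 32768 8192
```

Under these assumptions the Fisher vector length scales linearly with the descriptor dimension, so reducing 128 dimensions to 32 shortens it by a factor of four.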

Further, the above method may be implemented on any terminal, in particular a mobile terminal. Given the wireless network bandwidth available in the prior art, the Fisher vector aggregated from the low-dimensional local feature descriptors obtained in this embodiment can be transmitted faster, so the response time of image retrieval or image classification is shortened; in addition, aggregating the Fisher vector from the low-dimensional local feature descriptors can improve the discriminability and robustness of the Fisher vector.

Those of ordinary skill in the art will understand that all or a portion of the steps of the above-described method embodiments may be performed by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the method embodiments described above; the aforementioned storage medium includes various media that can store program code, such as ROM, RAM, and magnetic or optical disks.

Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims

1. A method for obtaining a low-dimensional local feature descriptor, comprising:

acquiring a local feature descriptor of an image to be processed;

forming the obtained local feature descriptors into a descriptor set;

2. The method according to claim 1, wherein performing dimension reduction on each local feature descriptor in the descriptor set according to a dimension reduction matrix to obtain a low-dimensional local feature descriptor corresponding to each local feature descriptor comprises:

grouping the converted local feature descriptors into a data matrix;

3. The method of claim 2, wherein grouping the converted local feature descriptors into a data matrix comprises:

or,

4. The method according to claim 3, wherein the dimensionality reduction matrix is a matrix obtained from the image dataset by principal component analysis, and the dimensionality of the dimensionality reduction matrix is N x K, or the dimensionality of the dimensionality reduction matrix is K x N;

where K is equal to 32.

5. The method of claim 4, wherein the splitting the result matrix to obtain the low-dimensional local feature descriptors comprises:

or,

6. The method of claim 2, wherein the splitting the result matrix to obtain the low-dimensional local feature descriptors comprises:

or,

7. The method according to claim 1, wherein before the local feature descriptors in the descriptor set are dimension-reduced according to a dimension-reducing matrix to obtain low-dimensional local feature descriptors, the method further comprises:

obtaining a sample matrix of the image dataset;

obtaining a mean vector according to the sample matrix;

calculating a covariance matrix of the centered sample matrix;

forming the first K eigenvectors into the dimensionality reduction matrix;

where K is equal to 32.

8. The method of claim 7, wherein each local feature descriptor of each image in the image dataset corresponds to a row of values in the sample matrix, each image in the image dataset corresponds to a number of rows of sample values in the sample matrix, and there are N sample values in each row of the sample matrix;

the obtaining of the mean vector according to the sample matrix includes:

the dimension of the covariance matrix is N x N;

the dimension of each eigenvector is N;

or,

the obtaining of the mean vector according to the sample matrix includes:

the dimension of the covariance matrix is N x N;

the dimension of each eigenvector is N;

where N equals 128 and K equals 32.

9. The method of claim 8, wherein the image dataset comprises:

planar object images and three-dimensional object images.

10. The method according to any one of claims 2 to 9,

the values in Table One are the values of the mean vector;

Table One:

the elements in Table Two form the dimension reduction matrix, or the elements in Table Two form the transpose of the dimension reduction matrix;

Table Two: