WO2014036813A1 - Method and device for extracting image features - Google Patents

Method and device for extracting image features

Info

Publication number
WO2014036813A1
Authority
WO
WIPO (PCT)
Prior art keywords
matrix
extraction
pixel
image
extraction region
Prior art date
Application number
PCT/CN2013/071183
Other languages
French (fr)
Chinese (zh)
Inventor
彭健
叶茂
杨素娟
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2014036813A1


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/56Extraction of image or video features relating to colour

Definitions

  • the invention relates to the field of computer vision, and in particular to a method and a device for extracting image features. Background Art
  • target tracking and detection technology has become a core technology in the field of computer vision, and image feature extraction technology directly affects the accuracy, adaptability and stability of target tracking and detection. Therefore, image feature extraction technology is of significant importance.
  • in the prior art, image features are extracted by using image brightness information: gray level equalization and Gaussian blur processing are performed on the image to eliminate the influence of illumination, and N point pairs are then randomly selected without repetition.
  • the two points of each point pair are taken as diagonal corners to form a rectangular area.
  • the rectangle is divided into two equal areas by its horizontal center line and the sums of the pixels in the two areas are compared: if the latter is larger, 1 is assigned; otherwise, 0 is assigned, and the first bit of data is output. The rectangle is then divided into two equal areas by its vertical center line and the pixel sums of the two areas are compared in the same way: if the latter is larger, 1 is assigned; otherwise, 0 is assigned, and the second bit of data is output. Finally, a feature vector of dimension N is output.
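  • For illustration, a minimal Python sketch of this prior-art brightness-based descriptor is given below. The sampling details (point distribution, rectangle size) are assumptions, since the description above does not fix them, and equalization and Gaussian blur are assumed to have been applied already.

```python
import numpy as np

def prior_art_descriptor(gray, n_pairs=128, rng=None):
    """Sketch of the prior-art descriptor: each random point pair forms the
    diagonal of a rectangle; the rectangle is split by its horizontal and then
    vertical center line, and each split yields one bit (1 if the second half
    has the larger pixel sum, 0 otherwise)."""
    rng = np.random.default_rng(rng)
    h, w = gray.shape
    bits = []
    for _ in range(n_pairs):
        y0 = int(rng.integers(0, h - 2)); y1 = int(rng.integers(y0 + 2, h))
        x0 = int(rng.integers(0, w - 2)); x1 = int(rng.integers(x0 + 2, w))
        rect = gray[y0:y1, x0:x1].astype(np.int64)
        ym, xm = rect.shape[0] // 2, rect.shape[1] // 2
        bits.append(int(rect[ym:].sum() > rect[:ym].sum()))        # horizontal split
        bits.append(int(rect[:, xm:].sum() > rect[:, :xm].sum()))  # vertical split
    return np.array(bits, dtype=np.uint8)
```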
  • the prior art uses the brightness information of the image to calculate the feature vector, but practical applications still face many difficulties, such as illumination changes, occlusion or partial occlusion of the target, changes in the target pose, and nonlinear deformation, which result in low accuracy of the extracted image features.
  • embodiments of the present invention provide a method and apparatus for extracting image features.
  • the technical solution is as follows:
  • a method for extracting image features comprising:
  • the calculation of the scalar values corresponding to the other point pairs is completed in the same manner as for the first point pair, and the scalar values of the preset number of point pairs are combined to obtain the feature vector of the image to be processed.
  • the obtaining a color component of each pixel in the range of the first extraction area to obtain a color quantization matrix corresponding to the first extraction area specifically includes:
  • the image of the first extraction area is converted to a corresponding color space to obtain the color component corresponding to each pixel in the first extraction area; and the color components corresponding to each pixel are weighted and quantized to obtain the color quantization matrix corresponding to the first extraction region.
  • the weighting and quantizing of the color component corresponding to each pixel to obtain the color quantization matrix corresponding to the first extraction region specifically includes: weighting and quantizing the color component of each pixel through the formula f = σ_h·h + σ_s·s + σ_v·v, where
  • h is the hue;
  • s is the saturation;
  • v is the brightness;
  • σ_h, σ_s and σ_v are weighting coefficients.
  • the acquiring the gradient value of each pixel in the range of the first extraction area to obtain the gradient matrix corresponding to the first extraction area specifically includes:
  • the horizontal matrix and the vertical matrix in the Sobel operator are respectively convolved with each pixel in the first extraction region to obtain horizontal and vertical luminance difference approximations;
  • a gradient matrix corresponding to the first extraction region is obtained according to the horizontal and vertical luminance difference approximations.
  • the convolving of the horizontal matrix and the vertical matrix in the Sobel operator with each pixel in the first extraction region to obtain the horizontal and vertical luminance difference approximations specifically includes: convolving the horizontal matrix Sx = [-1 0 +1; -2 0 +2; -1 0 +1] and the vertical matrix Sy = [+1 +2 +1; 0 0 0; -1 -2 -1] with each pixel I in the first extraction region to obtain the horizontal luminance difference approximation Gx = Sx * I and the vertical luminance difference approximation Gy = Sy * I, where I is a pixel in the first extraction region;
  • the obtaining of the gradient matrix corresponding to the first extraction region according to the horizontal and vertical luminance difference approximations specifically includes: obtaining the gradient matrix G_xy corresponding to the first extraction region through the formula G_xy = √(Gx² + Gy²).
  • the merging the color quantization matrix corresponding to the first extraction area and the gradient matrix to obtain the first fusion matrix corresponding to the first extraction area specifically includes:
  • the calculating, according to the first fusion matrix and the second fusion matrix, of the scalar value corresponding to the first set of point pairs specifically includes: expanding the first fusion matrix by rows to form a row vector; sorting the vector elements in the row vector by size to form a new row vector; comparing the element value of each vector element in the new row vector with the element value corresponding to the central pixel of the first extraction region to obtain a first comparison result; processing the second fusion matrix in the same manner to obtain a second comparison result; and obtaining the scalar value by comparing the first comparison result with the second comparison result.
  • an apparatus for extracting image features comprising: a processing module, configured to perform grayscale equalization processing on an image to be processed;
  • a selection module configured to select a preset number of point pairs in the image processed by the processing module;
  • a determining module configured to take a first set of point pairs, and determine a first extraction area and a second extraction area of a preset range centered respectively on the two points in the first set of point pairs;
  • a first obtaining module configured to acquire color components of each pixel in the first extraction area, and obtain a color quantization matrix corresponding to the first extraction area
  • a second acquiring module configured to acquire a gradient value of each pixel in the first extraction area, to obtain a gradient matrix corresponding to the first extraction area
  • a merging module configured to combine a color quantization matrix and a gradient matrix corresponding to the first extraction region, to obtain a first fusion matrix corresponding to the first extraction region;
  • a first repetition module configured to acquire, according to the first fusion matrix corresponding to the first extraction area, a second fusion matrix corresponding to the second extraction area;
  • a calculation module configured to calculate, according to the first fusion matrix and the second fusion matrix, a scalar value corresponding to the first set of point pairs;
  • a second repeating module configured to complete the calculation of the scalar values corresponding to the other point pairs in the same manner as for the first set of point pairs;
  • a combination module configured to combine the scalar values of the preset number of point pairs, to obtain the feature vector of the image to be processed.
  • the first acquiring module specifically includes:
  • a converting unit configured to convert an image of the first extraction area to a corresponding color space, to obtain a color component corresponding to each pixel in the first extraction area
  • an obtaining unit configured to weight quantize the color component corresponding to each pixel obtained by the converting unit, to obtain a color quantization matrix corresponding to the first extraction region.
  • the obtaining unit is specifically configured to weight and quantize the color component of each pixel through the formula f = σ_h·h + σ_s·s + σ_v·v, to obtain the color quantization matrix corresponding to the first extraction region, where
  • h is the hue;
  • s is the saturation;
  • v is the brightness;
  • σ_h, σ_s and σ_v are the weighting coefficients.
  • the second acquiring module specifically includes:
  • an operation unit configured to perform a convolution operation with each pixel in the first extraction region by using the horizontal matrix and the vertical matrix in the Sobel operator, to obtain horizontal and vertical luminance difference approximations; and an acquisition unit, configured to obtain a gradient matrix corresponding to the first extraction region according to the horizontal and vertical luminance difference approximations obtained by the operation unit.
  • the operation unit is specifically configured to convolve the horizontal matrix Sx = [-1 0 +1; -2 0 +2; -1 0 +1] and the vertical matrix Sy = [+1 +2 +1; 0 0 0; -1 -2 -1] in the Sobel operator with each pixel I in the first extraction region, to obtain the horizontal luminance difference approximation Gx = Sx * I and the vertical luminance difference approximation Gy = Sy * I; the acquisition unit is specifically configured to obtain the gradient matrix G_xy corresponding to the first extraction region through the formula G_xy = √(Gx² + Gy²).
  • the merging module is specifically configured to perform a product operation on the color quantization matrix corresponding to the first extraction region and the gradient matrix to obtain a first fusion matrix corresponding to the first extraction region.
  • the calculating module specifically includes:
  • an expansion unit configured to expand the first fusion matrix in rows to form a row vector;
  • a sorting unit configured to sort the vector elements in the row vector obtained by the expansion unit by size, to form a new row vector;
  • a first comparing unit configured to compare the element value of each vector element in the new row vector obtained by the sorting unit with the element value of the vector element corresponding to the central pixel of the first extraction region, to obtain a first comparison result;
  • a re-operation unit configured to process the second fusion matrix in the same manner as the first fusion matrix, to obtain a second comparison result;
  • a second comparing unit configured to obtain a scalar value corresponding to the first set of point pairs by comparing the first comparison result with the second comparison result.
  • the gradient value can describe the target contour of the image; and because color is often closely related to the object or scene contained in the image and, compared with other features, the color feature depends less on the size, direction and viewing angle of the image itself, it has high robustness. Therefore, by extracting image features with gradient values and color components, features with strong descriptive ability and resistance to illumination changes can be extracted, which not only improves the accuracy of the extracted image features but also suits the extraction of image features in many common target tracking and detection techniques.
  • FIG. 1 is a flowchart of a method for extracting image features according to a first embodiment of the present invention
  • FIG. 2 is a flowchart of a method for extracting image features according to a second embodiment of the present invention;
  • FIG. 3 is a diagram of the experimental results of target tracking according to the second embodiment of the present invention;
  • FIG. 4 is a schematic structural diagram of an image feature extraction device according to Embodiment 3 of the present invention;
  • FIG. 5 is a schematic structural diagram of a first acquisition module according to Embodiment 3 of the present invention.
  • FIG. 6 is a schematic structural diagram of a second acquiring module according to Embodiment 3 of the present invention.
  • FIG. 7 is a schematic structural diagram of a computing module according to Embodiment 3 of the present invention.
  • FIG. 8 is a schematic structural diagram of an apparatus for extracting image features according to Embodiment 4 of the present invention.

Detailed Description
  • Embodiment 1: This embodiment provides a method for extracting image features. Referring to FIG. 1, the flow of the method provided in this embodiment is as follows:
  • the color component of each pixel in the first extraction area is obtained, and the color quantization matrix corresponding to the first extraction area is obtained, including but not limited to:
  • the image of the first extraction area is converted to a corresponding color space to obtain the color component corresponding to each pixel, and the color components are weighted and quantized through the formula f = σ_h·h + σ_s·s + σ_v·v to obtain the color quantization matrix corresponding to the first extraction region, where
  • h is the hue;
  • s is the saturation;
  • v is the brightness;
  • σ_h, σ_s and σ_v are the weighting coefficients.
  • the gradient values of the pixels in the first extraction area are obtained, and the gradient matrix corresponding to the first extraction area is obtained, including but not limited to:
  • the horizontal matrix and the vertical matrix in the Sobel operator are respectively convolved with each pixel in the first extraction region to obtain horizontal and vertical luminance difference approximations;
  • a gradient matrix corresponding to the first extraction region is obtained according to the horizontal and vertical luminance difference approximations. Further, the horizontal matrix Sx = [-1 0 +1; -2 0 +2; -1 0 +1] and the vertical matrix Sy = [+1 +2 +1; 0 0 0; -1 -2 -1] in the Sobel operator are respectively convolved with each pixel I in the first extraction region to obtain the horizontal luminance difference approximation Gx = Sx * I and the vertical luminance difference approximation Gy = Sy * I, and the gradient matrix corresponding to the first extraction region is obtained through the formula G_xy = √(Gx² + Gy²).
  • the color quantization matrix corresponding to the first extraction region is fused with the gradient matrix to obtain the first fusion matrix corresponding to the first extraction region, including but not limited to:
  • the color quantization matrix corresponding to the first extraction region is multiplied by the gradient matrix to obtain a first fusion matrix corresponding to the first extraction region.
  • Step 105: Obtain the second fusion matrix corresponding to the second extraction area in the same manner as the first fusion matrix corresponding to the first extraction area is acquired.
  • the scalar value corresponding to the first set of point pairs is calculated according to the first fusion matrix and the second fusion matrix, including but not limited to:
  • the first fusion matrix is expanded by rows to form a row vector; the vector elements in the row vector are sorted by size to form a new row vector; the element value of each vector element in the new row vector is compared with the element value corresponding to the central pixel of the first extraction region to obtain a first comparison result; the second fusion matrix is processed in the same manner to obtain a second comparison result; and the scalar value corresponding to the first set of point pairs is obtained by comparing the two comparison results.
  • the gradient value can describe the target contour of the image; and because color is often closely related to the object or scene contained in the image and, compared with other features, the color feature depends less on the size, direction and viewing angle of the image itself, it has high robustness. Therefore, by extracting image features with gradient values and color components, features with strong descriptive ability and resistance to illumination changes can be extracted, which not only improves the accuracy of the extracted image features but also suits the extraction of image features in many common target tracking and detection techniques. In addition, on the basis of fusing the image color and gradient values, the fusion matrix is expanded by rows and the vector elements in the row vector are sorted by size, and the feature vector is obtained accordingly, so that the target has similar feature vectors at different angles, which increases the rotation resistance of the feature vectors.
  • Embodiment 2: This embodiment provides a method for extracting image features. Referring to FIG. 2, the flow of the method provided in this embodiment is as follows:
  • the image to be processed includes, but is not limited to, an image captured from a video stream produced by a video recording device.
  • Histogram equalization is one of the most common methods for performing grayscale equalization on an image to be processed; this step can therefore turn the histogram distribution of the image to be processed into a uniform histogram distribution. From an information-theoretic point of view, the image with the largest entropy (that is, the largest amount of information) is the equalized image; from an intuitive point of view, histogram equalization increases the contrast of the image. Performing gray scale equalization on the image to be processed in this step can thus effectively eliminate the interference that errors caused by strong changes in lighting would otherwise introduce into feature extraction.
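  • As a minimal sketch of this step (standard histogram equalization, not code from the patent), an 8-bit grayscale image can be equalized as follows; in practice a library routine such as cv2.equalizeHist does the same:

```python
import numpy as np

def equalize_gray(gray):
    """Histogram-equalize an 8-bit grayscale image by mapping the cumulative
    histogram onto [0, 255], making the output histogram roughly uniform."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = np.ma.masked_equal(hist.cumsum(), 0)  # mask empty leading bins
    lut = ((cdf - cdf.min()) * 255.0 / (cdf.max() - cdf.min())).filled(0)
    return lut.astype(np.uint8)[gray]
```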
  • the specific value of the preset number of point-pair groups is not limited in this embodiment.
  • This step takes randomly selecting 100 pairs of points in the processed image as an example, so a total of 200 points are selected. The points should be chosen randomly but tend to be evenly distributed over the processed image, and the two points in each pair should be separated by a certain distance.
  • the first extraction area and the second extraction area of the preset range are determined centered respectively on the two points in the first set of point pairs. For convenience of explanation, only the case of taking the two points of the first pair as centers and taking squares with a side length of 7 pixels as the extraction areas is used here, so the first extraction area and the second extraction area are both 7*7 pixel matrices.
  • the preset ranges of the first extraction area and the second extraction area may be of other sizes or other shapes; this embodiment does not limit the specific preset range or the specific shape of the extraction areas.
  • the color components of each pixel in the first extraction area are obtained, and the color quantization matrix corresponding to the first extraction area is obtained, including but not limited to:
  • the color space may be, for example, the RGB color space composed of the R (red), G (green) and B (blue) color components; the HSV color space composed of the H (hue), S (saturation) and V (value) color components; or the HIS color space composed of the H (hue), S (saturation) and I (intensity) color components. Therefore, this embodiment does not limit the specific color space to which the image of the first extraction region is converted, and merely takes conversion to the HSV color space as an example. Since image-to-color-space conversion technology in the prior art is very mature, the specific conversion of this step can be implemented with existing conversion techniques, which are not described in this embodiment.
  • when acquiring the color quantization matrix corresponding to the first extraction region, this step uses weighting coefficients to enhance the hue information while weakening the brightness and saturation information.
  • the color components are weighted and quantized through the formula f = σ_h·h + σ_s·s + σ_v·v to obtain the color quantization matrix corresponding to the first extraction region, where
  • h is the hue;
  • s is the saturation;
  • v is the brightness;
  • σ_h, σ_s and σ_v are the weighting coefficients.
  • taking the first extraction region determined in step 202 above as a 7*7 pixel matrix as an example, after the color components of each pixel in the first extraction region are weighted and quantized, each pixel yields a corresponding color component f, and this step thus produces a 7*7 color quantization matrix C_7x7.
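  • A sketch of step 203 follows. The RGB-to-HSV conversion uses the standard-library colorsys, and the weighting coefficients σ_h = 0.9, σ_s = 0.05, σ_v = 0.05 are illustrative assumptions: the embodiment only requires that hue be enhanced while saturation and brightness are weakened.

```python
import colorsys
import numpy as np

def color_quantization_matrix(rgb_patch, sigma_h=0.9, sigma_s=0.05, sigma_v=0.05):
    """Compute the color quantization matrix C of an RGB patch (e.g. 7x7x3),
    applying f = sigma_h*h + sigma_s*s + sigma_v*v to each pixel."""
    rows, cols, _ = rgb_patch.shape
    C = np.empty((rows, cols))
    for i in range(rows):
        for j in range(cols):
            r, g, b = rgb_patch[i, j] / 255.0
            h, s, v = colorsys.rgb_to_hsv(r, g, b)
            C[i, j] = sigma_h * h + sigma_s * s + sigma_v * v
    return C
```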
  • Step 204 Obtain a gradient value of each pixel in the first extraction area, and obtain a gradient matrix corresponding to the first extraction area.
  • the gradient values of the pixels in the first extraction region are obtained, and the gradient matrix corresponding to the first extraction region is obtained, including but not limited to:
  • the horizontal matrix and the vertical matrix in the Sobel operator are respectively convolved with each pixel in the first extraction region to obtain horizontal and vertical luminance difference approximations;
  • a gradient matrix corresponding to the first extraction region is obtained according to the horizontal and vertical luminance difference approximations.
  • specifically, the horizontal matrix Sx = [-1 0 +1; -2 0 +2; -1 0 +1] and the vertical matrix Sy = [+1 +2 +1; 0 0 0; -1 -2 -1] in the Sobel operator are convolved with each pixel I in the first extraction region to obtain the horizontal luminance difference approximation Gx = Sx * I and the vertical luminance difference approximation Gy = Sy * I, and the gradient matrix corresponding to the first extraction region is obtained through the formula G_xy = √(Gx² + Gy²).
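  • A sketch of step 204, taking the luminance values of the patch as input; the boundary handling ("symm") is an implementation choice that the embodiment does not specify.

```python
import numpy as np
from scipy.signal import convolve2d

SX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])   # horizontal Sobel matrix
SY = np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]])   # vertical Sobel matrix

def gradient_matrix(gray_patch):
    """Gradient matrix G_xy = sqrt(Gx^2 + Gy^2) of a luminance patch, with Gx
    and Gy obtained by convolving the patch with SX and SY respectively."""
    gx = convolve2d(gray_patch, SX, mode="same", boundary="symm")
    gy = convolve2d(gray_patch, SY, mode="same", boundary="symm")
    return np.sqrt(gx ** 2 + gy ** 2)
```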
  • the color quantization matrix corresponding to the first extraction region is merged with the gradient matrix to obtain a first fusion matrix corresponding to the first extraction region, including but not limited to:
  • the color quantization matrix corresponding to the first extraction region is multiplied by the gradient matrix to obtain a first fusion matrix corresponding to the first extraction region.
  • taking the color quantization matrix C_7x7 corresponding to the first extraction area obtained in step 203 above and the gradient matrix G_7x7 corresponding to the first extraction area obtained in step 204 above as an example, this step obtains the first fusion matrix M_7x7 corresponding to the first extraction area by performing the product operation on C_7x7 and G_7x7.
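  • Step 205 then reduces to one line. An element-wise (Hadamard) product is assumed here because the two matrices are aligned pixel-for-pixel; the text only specifies a "product operation", so this is an interpretation rather than a mandated form.

```python
def fusion_matrix(C, G):
    """Fuse the color quantization matrix C and the gradient matrix G of the
    same extraction area; an element-wise product is assumed."""
    return C * G
```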
  • the second fusion matrix corresponding to the second extraction region is obtained in the same manner: the color components of each pixel in the second extraction region are obtained in the manner provided in step 203 above, yielding the color quantization matrix corresponding to the second extraction region;
  • the gradient values of each pixel in the second extraction region are obtained in the manner of step 204 above, yielding the gradient matrix corresponding to the second extraction region; and the color quantization matrix and the gradient matrix corresponding to the second extraction region are fused in the manner provided in step 205 above, yielding the second fusion matrix corresponding to the second extraction region.
  • the specific implementation of this step includes, but is not limited to, the following steps:
  • Step a: expand the first fusion matrix by rows to form a row vector;
  • Step b: sort the vector elements in the row vector by size to form a new row vector;
  • Step c: compare the element value of each vector element in the new row vector with the element value corresponding to the central pixel of the first extraction region, to obtain a first comparison result;
  • Step d: process the second fusion matrix in the same manner as the first fusion matrix, to obtain a second comparison result;
  • Step e: obtain the scalar value corresponding to the first set of point pairs by comparing the first comparison result with the second comparison result.
  • taking the first fusion matrix as the matrix M_7x7 as an example, an implementation of each step of calculating the scalar value corresponding to the first set of point pairs according to the first fusion matrix and the second fusion matrix is illustrated:
  • Step a: expand the first fusion matrix M_7x7 by rows to form a 49-dimensional row vector f;
  • expanding the matrix by rows means that the last element of the first row is followed by the first element of the second row, the last element of the second row is followed by the first element of the third row, and so on, so that the end of each row is joined to the head of the next row to form a 49-dimensional row vector.
  • Step b: sort the vector elements in the row vector f by size to form a new row vector f';
  • the sorting may be in descending order or in ascending order; this embodiment does not limit the specific sorting manner.
  • Step c: compare the element value of each vector element in the new row vector f' with the element value corresponding to the central pixel of the first extraction region, to obtain a first comparison result;
  • this embodiment does not limit the manner in which the first comparison result is obtained; for example, the element value of each element in the new row vector f' may be compared with the element value corresponding to the central pixel of the first extraction region, recording 1 when it is larger and 0 otherwise.
  • in this way, a 48-dimensional vector can be obtained for the new row vector f', and the value in each dimension of the 48-dimensional vector is either 0 or 1.
  • Step d: process the second fusion matrix in the same manner as the first fusion matrix, to obtain a second comparison result.
  • Step e: obtain the scalar value corresponding to the first set of point pairs by comparing the first comparison result with the second comparison result. Since each of the two points in the first set of point pairs yields, through steps a to c above, a 48-dimensional vector whose elements take only the values 0 or 1, the first comparison result and the second comparison result corresponding to the two points in the first set of point pairs can be compared, and the scalar value corresponding to the first set of point pairs is obtained from the comparison.
  • the first comparison result corresponding to one of the points is denoted M0, and the second comparison result corresponding to the other point is denoted M1. Since the first comparison result and the second comparison result are both 48-dimensional vectors whose elements take only the values 0 or 1, M0 and M1 are binary sequences and can be interpreted as unsigned numbers. Therefore, the scalar value corresponding to the first set of point pairs is obtained by comparing the binary values of M0 and M1.
  • if the binary value of M0 is greater than that of M1, the comparison result is 1 and is taken as the scalar value corresponding to the first set of point pairs; otherwise, the comparison result is 0 and is taken as the scalar value corresponding to the first set of point pairs.
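  • A sketch of steps a to e follows. How the 48-dimensional vector is formed is an assumption (one occurrence of the centre pixel's own element is dropped after sorting); the embodiment only states that each element is compared with the element of the central pixel.

```python
import numpy as np

def comparison_vector(M):
    """Steps a-c: expand the fusion matrix by rows, sort the elements by size,
    and compare each remaining element with the centre pixel's element,
    yielding a 48-bit 0/1 vector for a 7x7 matrix."""
    flat = M.ravel()                              # step a: expand by rows
    center = M[M.shape[0] // 2, M.shape[1] // 2]
    srt = np.sort(flat)                           # step b: sort by size
    srt = np.delete(srt, np.searchsorted(srt, center))  # drop one centre entry
    return (srt > center).astype(np.uint8)        # step c: comparison result

def pair_scalar(M0, M1):
    """Step e: read the two comparison results as unsigned binary numbers and
    output 1 if the first is larger, otherwise 0."""
    v0 = int("".join(map(str, comparison_vector(M0))), 2)
    v1 = int("".join(map(str, comparison_vector(M1))), 2)
    return 1 if v0 > v1 else 0
```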
  • by repeating the above process for each set of point pairs, the scalar value corresponding to each set of point pairs can be calculated, and the result of combining the scalar values of the preset number of point pairs is used as the feature vector of the image to be processed.
  • For example, if 100 pairs of points are selected in the processed image in step 201 above, the scalar values corresponding to the 100 point pairs can be obtained, and combining the scalar values corresponding to the 100 point pairs yields a 100-dimensional feature vector.
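  • Putting steps 201 to 206 together gives the following end-to-end sketch, reusing the helper functions sketched above; the minimum-distance constraint between the two points of a pair is omitted for brevity.

```python
import numpy as np

def feature_vector(image_rgb, gray, n_pairs=100, half=3, rng=None):
    """Sample n_pairs point pairs, build a fusion matrix from the 7x7 patch
    (half=3) around each point, and collect one scalar per pair."""
    rng = np.random.default_rng(rng)
    h, w = gray.shape
    feats = []
    for _ in range(n_pairs):
        fused = []
        for _ in range(2):
            y = int(rng.integers(half, h - half))
            x = int(rng.integers(half, w - half))
            rgb_patch = image_rgb[y - half:y + half + 1, x - half:x + half + 1]
            gray_patch = gray[y - half:y + half + 1, x - half:x + half + 1]
            C = color_quantization_matrix(rgb_patch)
            G = gradient_matrix(gray_patch)
            fused.append(fusion_matrix(C, G))
        feats.append(pair_scalar(fused[0], fused[1]))
    return np.array(feats, dtype=np.uint8)
```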
  • the above image feature extraction method can be applied to scenes such as target tracking and detection. The image feature extraction realized by the above steps is based on the fusion of image color and gradient information, and by sorting the data in the fused matrix, the tracked or detected target has similar feature vectors at different angles, which in turn increases the rotation resistance of the feature vectors.
  • the following takes applying the image feature extraction method provided by this embodiment to a target tracker as an example, explained in combination with experimental data.
  • a video with the target's circumscribed rectangular area marked is used as the experimental input data, and a target tracker based on the image feature extraction scheme provided by this embodiment and one based on the prior-art image feature extraction scheme are run separately.
  • the marked target area is referred to as a positive sample, and the areas around the marked sample area, including areas partially overlapping the positive sample area, are referred to as negative samples. If the overlap between the target area output by the target tracker and the marked target area is less than a threshold, the target tracker is considered not to have detected the target; otherwise, the target tracker is considered to have detected the target.
  • a random forest is used as the classifier, and the precision and recall data of the target tracker based on the technical solution provided by the prior art and of the one based on the technical solution provided by this embodiment are calculated separately.
  • A continuous segment of 125 frames of video data is selected, and the rectangular area where the tracked target appears in the video is marked.
  • the specific marking can be implemented manually, which is not specifically limited in this embodiment.
  • the information of the marked rectangular area of the position can be represented by the horizontal and vertical coordinates of the starting position of the rectangular area of the position and the length and width of the rectangular area of the position.
  • the horizontal and vertical coordinates of the starting position of the position rectangular area are the coordinates of its upper left corner, in units of pixels, with the upper left corner of the image regarded as the coordinate origin.
  • the length and width of the rectangular area of the position can be represented by the distance between two pixels, and the unit is a pixel.
  • the information of the position rectangular area can be expressed as: The coordinates of the starting position are (256, 108), the length is 100 pixels, and the width is 200 pixels.
  • the information of the position rectangular area is recorded in a file together with the frame number (rather than drawn directly in the video); the type of the file recording the position rectangular area information, its storage location and the like are not specifically limited in this embodiment. For a fully occluded target, no location data is recorded; for a partially occluded target, only the coordinate data of the visible area is recorded.
  • the rectangle marking the target area should be as close as possible to the outer contour of the visible area of the tracked target, and the target area data marked above constitute the positive samples.
  • the target to be tracked is the woman's head and the boy's shirt.
  • the frames of the video segment are numbered 1, 2, ..., 125. Since the first frame of the video segment does not necessarily contain the tracking target, the numbering of frames containing the tracking target may not start from the first frame of the video;
  • let i be an arbitrary value from 1 to 125; the processing of the i-th frame of data includes extracting the image feature vector of the i-th frame, and the manner of extracting the feature is as described in steps 201 to 206 above, which is not repeated here.
  • when processing the i-th frame, the target tracker outputs a rectangular area R_i^predict, and the rectangular area where the positive sample is located is denoted R_i^label.
  • the overlap Overlap(R_i^predict, R_i^label) takes values in the interval [0, 1], where 0 means the area R_i^predict and the area R_i^label have no overlapping area, and 1 means that the two areas completely overlap.
  • when Overlap(R_i^predict, R_i^label) is large, the positive sample of the i-th frame is judged as a positive sample by the classifier, that is, the target area output by the classifier substantially overlaps the marked target area; when Overlap(R_i^predict, R_i^label) is small, a negative sample of the i-th frame has been judged as a positive sample by the classifier, that is, the target area output by the classifier is far from the marked target area.
  • recall = (the total number of positive samples judged as positive by the tracker) / (the total number of positive samples);
  • precision = (the total number of positive samples judged as positive by the tracker) / (the total number of positive samples judged as positive by the tracker + the total number of negative samples judged as positive by the tracker).
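  • The evaluation quantities can be computed as below. Intersection-over-union is assumed for the overlap measure, since the text only requires a value in [0, 1] that is 0 for disjoint areas and 1 for complete overlap.

```python
def overlap(rect_a, rect_b):
    """Overlap ratio between two rectangles given as (x, y, width, height);
    intersection-over-union is assumed."""
    ax, ay, aw, ah = rect_a
    bx, by, bw, bh = rect_b
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0

def precision_recall(tp, fp, p):
    """precision = TP / (TP + FP), recall = TP / P, with TP, FP and P as
    defined in the text below."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / p if p else 0.0
    return precision, recall
```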
  • the experimental results are shown in Figure 3.
  • the first and third rows of four pictures show the running effect of the target tracker based on the image feature extraction scheme provided by the prior art; the second and fourth rows of four pictures show the running effect of the target tracker based on the image feature extraction scheme provided by this embodiment.
  • the four pictures on the left track the woman's head, and the four pictures on the right track the boy's shirt.
  • the target tracker experimental data based on the image feature extraction scheme provided by the prior art are shown in Table 1 below.
  • the target tracker experimental data based on the image feature extraction scheme provided by the present embodiment is as shown in Table 2 below:
  • TP indicates the total number of positive samples judged as positive by the target tracker;
  • FP indicates the total number of negative samples judged as positive by the target tracker;
  • P indicates the total number of marked positive samples.
  • it can be seen that the technical solution provided by this embodiment achieves a higher recall rate and precision than the technical solution provided by the prior art, thereby improving the accuracy of the extracted image features.
  • the gradient value can describe the target contour of the image; and because color is often closely related to the object or scene contained in the image and, compared with other features, the color feature depends less on the size, direction and viewing angle of the image itself, it has high robustness. Therefore, by extracting image features with gradient values and color components, features with strong descriptive ability and resistance to illumination changes can be extracted, which not only improves the accuracy of the extracted image features but also suits the extraction of image features in many common target tracking and detection techniques.
  • in addition, on the basis of fusing the image color and gradient values, the fusion matrix is expanded by rows and the vector elements in the row vector are sorted by size, and the feature vector is obtained accordingly, so that the target has similar feature vectors at different angles, which increases the rotation resistance of the feature vectors.
  • Embodiment 3 The embodiment provides an image feature extraction device, which is used to perform the image feature extraction method provided in the first embodiment or the second embodiment.
  • the image feature extraction device includes:
  • a processing module 401 configured to perform grayscale equalization processing on the image to be processed
  • the selection module 402 is configured to select a preset number of point pairs in the image processed by the processing module 401; the determining module 403 is configured to take the first set of point pairs and, centered on the two points in the first set of point pairs, respectively determine the first extraction area and the second extraction area of the preset range;
  • the first obtaining module 404 is configured to obtain color components of each pixel in the first extraction area, and obtain a color quantization matrix corresponding to the first extraction area;
  • the second obtaining module 405 is configured to obtain a gradient value of each pixel in the first extraction area, and obtain a gradient matrix corresponding to the first extraction area.
  • the merging module 406 is configured to combine the color quantization matrix and the gradient matrix corresponding to the first extraction region to obtain a first fusion matrix corresponding to the first extraction region;
  • a first repetition module 407 configured to acquire a second fusion matrix corresponding to the second extraction region, by acquiring a first fusion matrix corresponding to the first extraction region;
  • the calculating module 408 is configured to calculate, according to the first fusion matrix and the second fusion matrix, a scalar value corresponding to the first set of point pairs;
  • a second repeating module 409 configured to perform calculation of a corresponding scalar value of other point pairs according to the first set of point pairs;
  • the combination module 410 is configured to combine the scalar values of the preset number of point pairs, to obtain the feature vector of the image to be processed.
  • the first obtaining module 404 specifically includes:
  • a converting unit 4041 configured to convert an image of the first extraction area to a corresponding color space, to obtain a color component corresponding to each pixel in the first extraction area;
  • the obtaining unit 4042 is configured to weight quantize the color component corresponding to each pixel obtained by the converting unit 4041 to obtain a color quantization matrix corresponding to the first extraction region.
  • the second obtaining module 405 specifically includes:
  • the operation unit 4051 is configured to perform a convolution operation on each pixel in the first extraction region by using the horizontal matrix and the vertical matrix in the Sobel operator, to obtain horizontal and vertical luminance difference approximations;
  • the obtaining unit 4052 is configured to obtain the gradient matrix corresponding to the first extraction region according to the horizontal and vertical luminance difference approximations obtained by the operation unit 4051.
  • the merging module 406 is specifically configured to perform a product operation on the color quantization matrix corresponding to the first extraction region and the gradient matrix to obtain a first fusion matrix corresponding to the first extraction region.
  • the calculating module 408 specifically includes:
  • An expansion unit 4081 configured to expand the first fusion matrix in rows to form a row vector
  • a sorting unit 4082, configured to sort the vector elements in the row vector obtained by the expansion unit 4081 by size, to form a new row vector;
  • the first comparing unit 4083 is configured to compare the element value of each vector in the new row vector obtained by the sorting unit 4082 with the element value of the vector corresponding to the central pixel of the first extraction region, to obtain a first comparison result;
  • a re-operation unit 4084 configured to process the second fusion matrix in the same manner as the first fusion matrix, to obtain a second comparison result;
  • the second comparing unit 4085 is configured to obtain the scalar value corresponding to the first set of point pairs by comparing the first comparison result with the second comparison result.
  • with the device provided in this embodiment, the gradient value can describe the target contour of the image; and because color is often closely related to the object or scene contained in the image and, compared with other features, the color feature depends less on the size, direction and viewing angle of the image itself, it has higher robustness. Therefore, by extracting image features with gradient values and color components, features with strong descriptive ability and resistance to illumination changes can be extracted, which not only improves the accuracy of the extracted image features but also suits the extraction of image features in many common target tracking and detection techniques.
  • in addition, on the basis of fusing the image color and gradient values, the fused matrix is expanded by rows and the vector elements in the row vector are sorted by size, and the feature vector is obtained accordingly, so that the target has similar feature vectors at different angles, which increases the rotation resistance of the feature vectors.
  • FIG. 8 is a block diagram showing the structure of a feature extraction device in an embodiment. The feature extraction device includes at least one processor 801, such as a CPU, at least one network interface 804 or other user interface 803, a memory 805, and at least one communication bus 802.
  • Communication bus 802 is used to implement connection communication between these devices.
  • User interface 803 can be a display, a keyboard or a pointing device.
  • the memory 805 may include a high speed RAM memory and may also include a non-volatile memory, such as at least one disk memory.
  • the memory 805 may optionally include at least one storage device located remotely from the aforementioned processor 801. In some embodiments, the memory 805 stores the following elements, modules or data structures, or a subset thereof, or an extended set thereof:
  • Operating system 806 containing various programs for implementing various basic services and processing hardware-based tasks
  • the application module 807 includes a processing module 401, a selection module 402, a determination module 403, a first acquisition module 404, a second acquisition module 405, a fusion module 406, a first repetition module 407, a calculation module 408, a second repetition module 409, and a combination module 410. For the functions of the above modules, refer to the description of the working principle of FIG. 4; details are not repeated here.
  • with the device provided in this embodiment, the gradient value can describe the target contour of the image; and because color is often closely related to the object or scene contained in the image and, compared with other features, the color feature depends less on the size, direction and viewing angle of the image itself, it has high robustness. Therefore, by extracting image features with gradient values and color components, features with strong descriptive ability and resistance to illumination changes can be extracted, which not only improves the accuracy of the extracted image features but also suits the extraction of image features in many common target tracking and detection techniques;
  • in addition, the vector elements in the row vector are sorted by size and the feature vector is obtained accordingly, so that the target has similar feature vectors at different angles, which increases the rotation resistance of the feature vectors.
  • the image feature extraction device provided in the foregoing embodiments is illustrated only by the division of the above functional modules when extracting image features. In actual applications, the functions may be assigned to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above.
  • the image feature extraction device provided by the above embodiments and the image feature extraction method embodiments belong to the same concept; the specific implementation process is described in detail in the method embodiments and is not repeated here.
  • a person skilled in the art may understand that all or part of the steps of the above embodiments may be completed by hardware, or by a program instructing related hardware; the program may be stored in a computer-readable storage medium.
  • the storage medium mentioned may be a read only memory, a magnetic disk or an optical disk or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to the computer vision field. Disclosed are a method and a device for extracting image features. The method comprises: performing gray scale equalization processing on an image to be processed, and selecting a preset number of point pairs from the image that is processed; determining two extraction areas by using two points in a first point pair as centers; obtaining a color quantization matrix and a gradient matrix corresponding to each extraction area; fusing the color quantization matrix and the gradient matrix corresponding to each extraction area to obtain two fusion matrixes; calculating, according to the two fusion matrixes, a scalar value corresponding to the first point pair; calculating scalar values of other point pairs in the same way; and combining scalar values of the preset number of point pairs to obtain a feature vector of the image to be processed. A gradient value can describe a target outline of an image, and a color relates to an object and a scene contained in the image, featuring low dependency on the image and high robustness. Therefore, the present invention uses a gradient value and a color to extract an image feature, which can improve accuracy of the image feature that is extracted.

Description

Image feature extraction method and device

This application claims priority to Chinese Patent Application No. 201210332125.5, filed with the Chinese Patent Office on September 10, 2012 and entitled "图像特征的提取方法及装置" ("Image feature extraction method and device"), the entire contents of which are incorporated herein by reference.

Technical Field

The present invention relates to the field of computer vision, and in particular, to a method and a device for extracting image features.

Background Art
With the rapid development of computer vision technology, target tracking and detection technology has become a core technology in the field of computer vision, and image feature extraction technology directly affects the accuracy, adaptability and stability of target tracking and detection. Image feature extraction technology is therefore of significant importance.

In the prior art, image features are extracted by using image brightness information. Specifically, gray level equalization and Gaussian blur processing are performed on the image to eliminate the influence of illumination; then N point pairs are randomly selected without repetition, and the two points of each pair are taken as diagonal corners to form a rectangular area. The rectangle is divided into two equal areas by its horizontal center line and the sums of the pixels in the two areas are compared: if the latter is larger, 1 is assigned; otherwise, 0 is assigned, and the first bit of data is output. The rectangle is then divided into two equal areas by its vertical center line and the pixel sums of the two areas are compared in the same way to output the second bit of data. Finally, a feature vector of dimension N is output.

The prior art uses the brightness information of the image to calculate the feature vector, but practical applications still face many difficulties, such as illumination changes, occlusion or partial occlusion of the target, changes in the target pose, and nonlinear deformation, which result in low accuracy of the extracted image features.
Summary of the Invention

In view of this, embodiments of the present invention provide a method and a device for extracting image features. The technical solutions are as follows:
In one aspect, a method for extracting image features is provided, the method comprising:

performing grayscale equalization processing on an image to be processed, and selecting a preset number of point pairs in the processed image;

taking a first point pair, and determining a first extraction area and a second extraction area of a preset range centered respectively on the two points of the first point pair;

obtaining the color component of each pixel within the first extraction area to obtain a color quantization matrix corresponding to the first extraction area, and obtaining the gradient value of each pixel within the first extraction area to obtain a gradient matrix corresponding to the first extraction area;

fusing the color quantization matrix and the gradient matrix corresponding to the first extraction area to obtain a first fusion matrix corresponding to the first extraction area;

obtaining a second fusion matrix corresponding to the second extraction area in the same manner as the first fusion matrix corresponding to the first extraction area;

calculating a scalar value corresponding to the first point pair according to the first fusion matrix and the second fusion matrix;

completing the calculation of the scalar values corresponding to the other point pairs in the same manner as for the first point pair, and combining the scalar values of the preset number of point pairs to obtain the feature vector of the image to be processed.
Optionally, obtaining the color component of each pixel within the first extraction area to obtain the color quantization matrix corresponding to the first extraction area specifically includes:

converting the image of the first extraction area to a corresponding color space to obtain the color component of each pixel within the first extraction area;

weighting and quantizing the color component of each pixel to obtain the color quantization matrix corresponding to the first extraction area.

Optionally, weighting and quantizing the color component of each pixel to obtain the color quantization matrix corresponding to the first extraction area specifically includes:

weighting and quantizing the color component of each pixel through the formula f = σ_h·h + σ_s·s + σ_v·v to obtain the color quantization matrix corresponding to the first extraction area;

where h is the hue, s is the saturation, v is the brightness, and σ_h, σ_s and σ_v are weighting coefficients.
Optionally, obtaining the gradient value of each pixel within the first extraction area to obtain the gradient matrix corresponding to the first extraction area specifically includes:

convolving the horizontal matrix and the vertical matrix of the Sobel operator respectively with each pixel within the first extraction area to obtain horizontal and vertical luminance difference approximations;

obtaining the gradient matrix corresponding to the first extraction area according to the horizontal and vertical luminance difference approximations.

Optionally, convolving the horizontal matrix and the vertical matrix of the Sobel operator respectively with each pixel within the first extraction area to obtain the horizontal and vertical luminance difference approximations specifically includes:

convolving the horizontal matrix Sx = [-1 0 +1; -2 0 +2; -1 0 +1] and the vertical matrix Sy = [+1 +2 +1; 0 0 0; -1 -2 -1] of the Sobel operator respectively with each pixel within the first extraction area to obtain the horizontal luminance difference approximation Gx = Sx * I and the vertical luminance difference approximation Gy = Sy * I, where I is a pixel within the first extraction area.

Obtaining the gradient matrix corresponding to the first extraction area according to the horizontal and vertical luminance difference approximations specifically includes:

obtaining the gradient matrix G_xy corresponding to the first extraction area through the formula G_xy = √(Gx² + Gy²).

Optionally, fusing the color quantization matrix and the gradient matrix corresponding to the first extraction area to obtain the first fusion matrix corresponding to the first extraction area specifically includes:
performing a product operation on the color quantization matrix and the gradient matrix corresponding to the first extraction area to obtain the first fusion matrix corresponding to the first extraction area.

Optionally, calculating the scalar value corresponding to the first point pair according to the first fusion matrix and the second fusion matrix specifically includes:

expanding the first fusion matrix by rows to form a row vector;

sorting the vector elements in the row vector by size to form a new row vector; comparing the element value of each vector element in the new row vector with the element value corresponding to the central pixel of the extraction area to obtain a first comparison result;

processing the second fusion matrix in the same manner as the first fusion matrix to obtain a second comparison result; and obtaining the scalar value corresponding to the first point pair by comparing the first comparison result with the second comparison result.
In another aspect, a device for extracting image features is further provided, the device comprising: a processing module, configured to perform grayscale equalization processing on an image to be processed;

a selection module, configured to select a preset number of point pairs in the image processed by the processing module; a determining module, configured to take a first point pair and determine a first extraction area and a second extraction area of a preset range centered respectively on the two points of the first point pair;

a first obtaining module, configured to obtain the color component of each pixel within the first extraction area, to obtain a color quantization matrix corresponding to the first extraction area;

a second obtaining module, configured to obtain the gradient value of each pixel within the first extraction area, to obtain a gradient matrix corresponding to the first extraction area;

a fusion module, configured to fuse the color quantization matrix and the gradient matrix corresponding to the first extraction area, to obtain a first fusion matrix corresponding to the first extraction area;

a first repetition module, configured to obtain a second fusion matrix corresponding to the second extraction area in the same manner as the first fusion matrix corresponding to the first extraction area;

a calculation module, configured to calculate a scalar value corresponding to the first point pair according to the first fusion matrix and the second fusion matrix;

a second repetition module, configured to complete the calculation of the scalar values corresponding to the other point pairs in the same manner as for the first point pair; and a combination module, configured to combine the scalar values of the preset number of point pairs to obtain the feature vector of the image to be processed.
Optionally, the first obtaining module specifically includes: a converting unit, configured to convert the image of the first extraction region into a corresponding color space to obtain the color component corresponding to each pixel within the first extraction region; and an obtaining unit, configured to weight and quantize the color component corresponding to each pixel obtained by the converting unit to obtain the color quantization matrix corresponding to the first extraction region.
Optionally, the obtaining unit is specifically configured to weight and quantize the color component corresponding to each pixel by the formula $f = \sigma_h h + \sigma_s s + \sigma_v v$ to obtain the color quantization matrix corresponding to the first extraction region, where $h$ is the hue, $s$ is the saturation, $v$ is the value (brightness), and $\sigma_h$, $\sigma_s$, and $\sigma_v$ are weighting coefficients.
Optionally, the second obtaining module specifically includes: an operation unit, configured to convolve the horizontal matrix and the vertical matrix of the Sobel operator with the pixels within the first extraction region, respectively, to obtain horizontal and vertical brightness difference approximations; and an obtaining unit, configured to obtain the gradient matrix corresponding to the first extraction region from the horizontal and vertical brightness difference approximations obtained by the operation unit.
Optionally, the operation unit is specifically configured to convolve the horizontal matrix and the vertical matrix of the Sobel operator,

$$g_x = \begin{pmatrix} -1 & 0 & +1 \\ -2 & 0 & +2 \\ -1 & 0 & +1 \end{pmatrix}, \qquad g_y = \begin{pmatrix} +1 & +2 & +1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{pmatrix},$$

with the pixels within the first extraction region, respectively, to obtain the horizontal brightness difference approximation $G_x = g_x * I$ and the vertical brightness difference approximation $G_y = g_y * I$, where $I$ denotes the pixels within the first extraction region; and the obtaining unit is specifically configured to obtain the gradient matrix $G_{xy}$ corresponding to the first extraction region by the formula $G_{xy} = \sqrt{G_x^2 + G_y^2}$.
Optionally, the fusion module is specifically configured to multiply the color quantization matrix corresponding to the first extraction region by the gradient matrix to obtain the first fusion matrix corresponding to the first extraction region.

Optionally, the calculation module specifically includes: an expansion unit, configured to expand the first fusion matrix row by row to form a row vector; a sorting unit, configured to sort the vector elements in the row vector obtained by the expansion unit by magnitude to form a new row vector; a first comparison unit, configured to compare the element value of each vector in the new row vector obtained by the sorting unit with the element value of the vector corresponding to the center pixel of the first extraction region to obtain a first comparison result; a repetition operation unit, configured to process the second fusion matrix in the same manner as the first fusion matrix to obtain a second comparison result; and a second comparison unit, configured to obtain the scalar value corresponding to the first set of point pairs by comparing the first comparison result with the second comparison result.
The technical solutions provided by the embodiments of the present invention have the following beneficial effects: gradient values can describe the target contour of an image, and color is usually closely related to the objects or scenes contained in an image; compared with other features, color features depend less on the size, orientation, and viewing angle of the image itself and are therefore more robust. By extracting image features using gradient values and color components, features with strong descriptive power and strong resistance to illumination changes can be extracted, which not only improves the accuracy of the extracted image features but is also applicable to image feature extraction in many common target tracking and detection techniques.
DRAWINGS

To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. Apparently, the accompanying drawings in the following description show merely some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.

FIG. 1 is a flowchart of a method for extracting image features according to Embodiment 1 of the present invention; FIG. 2 is a flowchart of a method for extracting image features according to Embodiment 2 of the present invention; FIG. 3 is a schematic diagram of an experimental running effect according to Embodiment 2 of the present invention; FIG. 4 is a schematic structural diagram of an apparatus for extracting image features according to Embodiment 3 of the present invention; FIG. 5 is a schematic structural diagram of a first obtaining module according to Embodiment 3 of the present invention; FIG. 6 is a schematic structural diagram of a second obtaining module according to Embodiment 3 of the present invention; FIG. 7 is a schematic structural diagram of a calculation module according to Embodiment 3 of the present invention; FIG. 8 is a schematic structural diagram of an apparatus for extracting image features according to Embodiment 4 of the present invention.

DETAILED DESCRIPTION
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are further described in detail below with reference to the accompanying drawings.

Embodiment 1

This embodiment provides a method for extracting image features. Referring to FIG. 1, the procedure of the method provided by this embodiment is specifically as follows:
101: Perform grayscale equalization processing on an image to be processed, and select a preset number of sets of point pairs in the processed image;

102: Take a first set of point pairs, and determine a first extraction region and a second extraction region of a preset range centered on the two points of the first set of point pairs, respectively;

103: Obtain the color components of the pixels within the first extraction region to obtain a color quantization matrix corresponding to the first extraction region, and obtain the gradient values of the pixels within the first extraction region to obtain a gradient matrix corresponding to the first extraction region;
Obtaining the color components of the pixels within the first extraction region to obtain the color quantization matrix corresponding to the first extraction region includes, but is not limited to: converting the image of the first extraction region into a corresponding color space to obtain the color component corresponding to each pixel within the first extraction region; and weighting and quantizing the color component corresponding to each pixel to obtain the color quantization matrix corresponding to the first extraction region.

Further, weighting and quantizing the color component corresponding to each pixel to obtain the color quantization matrix corresponding to the first extraction region includes, but is not limited to: weighting and quantizing the color component corresponding to each pixel by the formula $f = \sigma_h h + \sigma_s s + \sigma_v v$ to obtain the color quantization matrix corresponding to the first extraction region, where $h$ is the hue, $s$ is the saturation, $v$ is the value (brightness), and $\sigma_h$, $\sigma_s$, and $\sigma_v$ are weighting coefficients.
Further, obtaining the gradient values of the pixels within the first extraction region to obtain the gradient matrix corresponding to the first extraction region includes, but is not limited to: convolving the horizontal matrix and the vertical matrix of the Sobel operator with the pixels within the first extraction region, respectively, to obtain horizontal and vertical brightness difference approximations; and obtaining the gradient matrix corresponding to the first extraction region from the horizontal and vertical brightness difference approximations. Further, convolving the horizontal matrix and the vertical matrix of the Sobel operator with the pixels within the first extraction region, respectively, to obtain the horizontal and vertical brightness difference approximations includes, but is not limited to:
convolving the horizontal matrix and the vertical matrix of the Sobel operator,

$$g_x = \begin{pmatrix} -1 & 0 & +1 \\ -2 & 0 & +2 \\ -1 & 0 & +1 \end{pmatrix}, \qquad g_y = \begin{pmatrix} +1 & +2 & +1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{pmatrix},$$

with the pixels within the first extraction region, respectively, to obtain the horizontal brightness difference approximation $G_x = g_x * I$ and the vertical brightness difference approximation $G_y = g_y * I$, where $I$ denotes the pixels within the first extraction region. Obtaining the gradient matrix corresponding to the first extraction region from the horizontal and vertical brightness difference approximations specifically includes: obtaining the gradient matrix $G_{xy}$ corresponding to the first extraction region by the formula $G_{xy} = \sqrt{G_x^2 + G_y^2}$.
104: Fuse the color quantization matrix and the gradient matrix corresponding to the first extraction region to obtain a first fusion matrix corresponding to the first extraction region;

Fusing the color quantization matrix and the gradient matrix corresponding to the first extraction region to obtain the first fusion matrix corresponding to the first extraction region includes, but is not limited to: multiplying the color quantization matrix corresponding to the first extraction region by the gradient matrix to obtain the first fusion matrix corresponding to the first extraction region.

105: Obtain a second fusion matrix corresponding to the second extraction region in the same manner as the first fusion matrix corresponding to the first extraction region is obtained;
106: Calculate the scalar value corresponding to the first set of point pairs according to the first fusion matrix and the second fusion matrix;

Specifically, calculating the scalar value corresponding to the first set of point pairs according to the first fusion matrix and the second fusion matrix includes, but is not limited to: expanding the first fusion matrix row by row to form a row vector; sorting the vector elements in the row vector by magnitude to form a new row vector; comparing the element value of each vector in the new row vector with the element value of the vector corresponding to the center pixel of the first extraction region to obtain a first comparison result; processing the second fusion matrix in the same manner as the first fusion matrix to obtain a second comparison result; and obtaining the scalar value corresponding to the first set of point pairs by comparing the first comparison result with the second comparison result.
107: Complete the calculation of the scalar values corresponding to the other point pairs in the same manner as for the first set of point pairs, and combine the scalar values of the preset number of sets of point pairs to obtain the feature vector of the image to be processed.
In the method provided by this embodiment, gradient values can describe the target contour of an image, and color is usually closely related to the objects or scenes contained in an image; compared with other features, color features depend less on the size, orientation, and viewing angle of the image itself and are therefore more robust. By extracting image features using gradient values and color components, features with strong descriptive power and strong resistance to illumination changes can be extracted, which not only improves the accuracy of the extracted image features but is also applicable to image feature extraction in many common target tracking and detection techniques. In addition, on the basis of fusing image color and gradient values, the fusion matrix is expanded row by row and the vector elements in the row vector are sorted by magnitude before the feature vector is derived, so that the target has similar feature vectors at different angles, which increases the rotation resistance of the feature vector.

To describe the method provided by the above embodiment more clearly, the following Embodiment 2 illustrates the method for extracting image features by example, with reference to the above content. See Embodiment 2 below for details:
Embodiment 2

This embodiment provides a method for extracting image features. With reference to the content of Embodiment 1 above and referring to FIG. 2, the procedure of the method provided by this embodiment is specifically as follows:

201: Perform grayscale equalization processing on an image to be processed, and select a preset number of sets of point pairs in the processed image;
The image to be processed includes, but is not limited to, an image captured from a video stream produced by a video recording device. When grayscale equalization processing is performed on the image to be processed, histogram equalization is the most commonly used approach, so this step may process the histogram distribution of the image to be processed into a uniform histogram distribution. From the viewpoint of information theory, the image with the maximum entropy (that is, the maximum amount of information) is the equalized image; intuitively, histogram equalization increases the contrast of the image. Therefore, by performing grayscale equalization processing on the image to be processed, this step can effectively eliminate the interference that errors caused by strong lighting changes impose on feature extraction.
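As a minimal sketch of this preprocessing step, assuming OpenCV (`cv2`) is available (the helper name is illustrative, not from the patent):

```python
import cv2

def equalize(image_bgr):
    """Grayscale equalization: spread the gray-level histogram toward uniform."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)  # single-channel intensity image
    return cv2.equalizeHist(gray)                       # equalized image, higher contrast
```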
When the preset number of sets of point pairs is selected in the processed image, this embodiment does not limit the specific value of the preset number; this step takes randomly selecting 100 sets of point pairs within the processed image as an example, so 200 points are selected in total. The points of the point pairs should be random but tend to be evenly distributed over the processed image, and the two points of a pair should be kept a certain distance apart.
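A hypothetical sketch of this selection (the rejection-sampling strategy and the minimum-distance parameter are illustrative, not prescribed by the patent):

```python
import random

def sample_point_pairs(width, height, n_pairs=100, min_dist=10):
    """Pick n_pairs random point pairs, keeping the two points of a pair apart."""
    pairs = []
    while len(pairs) < n_pairs:
        p = (random.randrange(width), random.randrange(height))
        q = (random.randrange(width), random.randrange(height))
        if (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 >= min_dist ** 2:
            pairs.append((p, q))
    return pairs
```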
202: Take a first set of point pairs, and determine a first extraction region and a second extraction region of a preset range centered on the two points of the first set of point pairs, respectively;

In this step, when the first extraction region and the second extraction region of the preset range are determined centered on the two points of the first set of point pairs, for ease of description, squares with a side length of 7 pixels centered on the two points are taken as the extraction regions here, so both the first extraction region and the second extraction region are 7×7 pixel matrices. Of course, the preset range of the first extraction region and the second extraction region may also have another size or another shape; this embodiment limits neither the specific preset range nor the specific shape of the extraction regions.
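A sketch of cutting one 7×7 extraction region around a point; clamping at the image border is an assumption, since the patent does not specify border handling:

```python
def extract_region(image, center, half=3):
    """Cut a (2*half+1)-pixel square (7x7 by default) around center = (x, y)."""
    x, y = center
    h, w = image.shape[:2]
    x0, x1 = max(x - half, 0), min(x + half + 1, w)  # clamp to the image border
    y0, y1 = max(y - half, 0), min(y + half + 1, h)
    return image[y0:y1, x0:x1]
```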
203: Obtain the color components of the pixels within the first extraction region to obtain a color quantization matrix corresponding to the first extraction region;

Specifically, obtaining the color components of the pixels within the first extraction region to obtain the color quantization matrix corresponding to the first extraction region includes, but is not limited to: converting the image of the first extraction region into a corresponding color space to obtain the color component corresponding to each pixel within the first extraction region; and weighting and quantizing the color component corresponding to each pixel to obtain the color quantization matrix corresponding to the first extraction region.

When the image of the first extraction region is converted into the corresponding color space, multiple color spaces exist in practical applications, for example, the RGB color space composed of the color components R (red), G (green), and B (blue); the HSV color space composed of the color components H (hue), S (saturation), and V (value); the HIS color space composed of the color components H (hue), S (saturation), and I (intensity); and so on. Therefore, this embodiment does not limit the specific color space into which the image of the first extraction region is converted, and conversion to the HSV color space is taken only as an example. Since image-to-color-space conversion techniques in the prior art are mature, the specific conversion process of this step can be implemented with existing conversion techniques and is not described again in this embodiment.

In addition, since color components such as value (brightness) and saturation in the HSV color space have little discriminative power, whereas hue is strongly discriminative, this step strengthens the hue information and weakens the brightness and saturation information through the weighting coefficients when obtaining the color quantization matrix corresponding to the first extraction region. In specific implementation, weighting and quantizing the color component corresponding to each pixel to obtain the color quantization matrix corresponding to the first extraction region includes, but is not limited to: weighting and quantizing the color component corresponding to each pixel by the formula $f = \sigma_h h + \sigma_s s + \sigma_v v$ to obtain the color quantization matrix corresponding to the first extraction region, where $h$ is the hue, $s$ is the saturation, $v$ is the value (brightness), and $\sigma_h$, $\sigma_s$, and $\sigma_v$ are weighting coefficients.

The weighting coefficients $\sigma_h$, $\sigma_s$, and $\sigma_v$ can be set according to actual needs, for example, $\sigma_h = 0.8$, $\sigma_s = 0.1$, $\sigma_v = 0.1$; other weighting coefficient values may also be set, and this embodiment does not limit the specific values of the weighting coefficients. Taking the first extraction region determined in step 202 above, a square of 7×7 pixels, as an example: after color component weighting and quantization is performed on each pixel in the first extraction region, each pixel yields a corresponding color component $f$, so this step yields a 7×7 color quantization matrix $C_{7\times 7}$.
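As a sketch of this step under the example coefficients, assuming OpenCV's HSV conversion (for 8-bit input, H lies in [0, 179] and S, V in [0, 255]; the channels are not rescaled to a common range here, which is an assumption):

```python
import cv2
import numpy as np

def color_quantization_matrix(patch_bgr, s_h=0.8, s_s=0.1, s_v=0.1):
    """f = sigma_h*h + sigma_s*s + sigma_v*v for every pixel of the patch."""
    hsv = cv2.cvtColor(patch_bgr, cv2.COLOR_BGR2HSV).astype(np.float64)
    h, s, v = hsv[..., 0], hsv[..., 1], hsv[..., 2]
    return s_h * h + s_s * s + s_v * v  # the 7x7 color quantization matrix C
```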
204: Obtain the gradient values of the pixels within the first extraction region to obtain a gradient matrix corresponding to the first extraction region;

In this step, obtaining the gradient values of the pixels within the first extraction region to obtain the gradient matrix corresponding to the first extraction region includes, but is not limited to: convolving the horizontal matrix and the vertical matrix of the Sobel operator with the pixels within the first extraction region, respectively, to obtain horizontal and vertical brightness difference approximations; and obtaining the gradient matrix corresponding to the first extraction region from the horizontal and vertical brightness difference approximations.

Convolving the horizontal matrix and the vertical matrix of the Sobel operator with the pixels within the first extraction region, respectively, to obtain the horizontal and vertical brightness difference approximations includes, but is not limited to: convolving the horizontal matrix and the vertical matrix of the Sobel operator,

$$g_x = \begin{pmatrix} -1 & 0 & +1 \\ -2 & 0 & +2 \\ -1 & 0 & +1 \end{pmatrix}, \qquad g_y = \begin{pmatrix} +1 & +2 & +1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{pmatrix},$$

with the pixels within the first extraction region, respectively, to obtain the horizontal brightness difference approximation $G_x = g_x * I$ and the vertical brightness difference approximation $G_y = g_y * I$, where $I$ denotes the pixels within the first extraction region.

Obtaining the gradient matrix corresponding to the extraction region from the horizontal and vertical brightness difference approximations includes, but is not limited to: obtaining the gradient matrix $G_{xy}$ corresponding to the first extraction region by the formula $G_{xy} = \sqrt{G_x^2 + G_y^2}$.
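A sketch of this computation, assuming SciPy's `convolve2d` for the two-dimensional convolution; the `mode="same"` boundary handling keeps the 7×7 shape and is a choice the patent does not specify:

```python
import numpy as np
from scipy.signal import convolve2d

G_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])   # Sobel horizontal kernel g_x
G_Y = np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]])   # Sobel vertical kernel g_y

def gradient_matrix(patch_gray):
    """G_xy = sqrt(G_x^2 + G_y^2), with G_x = g_x * I and G_y = g_y * I."""
    gx = convolve2d(patch_gray, G_X, mode="same")  # horizontal brightness differences
    gy = convolve2d(patch_gray, G_Y, mode="same")  # vertical brightness differences
    return np.sqrt(gx ** 2 + gy ** 2)
```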
205: Fuse the color quantization matrix and the gradient matrix corresponding to the first extraction region to obtain a first fusion matrix corresponding to the first extraction region;

Specifically, fusing the color quantization matrix and the gradient matrix corresponding to the first extraction region to obtain the first fusion matrix corresponding to the first extraction region includes, but is not limited to: multiplying the color quantization matrix corresponding to the first extraction region by the gradient matrix to obtain the first fusion matrix corresponding to the first extraction region.

Taking the color quantization matrix $C_{7\times 7}$ corresponding to the first extraction region obtained in step 203 above and the gradient matrix $G_{7\times 7}$ corresponding to the first extraction region obtained in step 204 above as an example, the first fusion matrix corresponding to the first extraction region obtained in this step is $F_{7\times 7} = G_{7\times 7} * C_{7\times 7}$.
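A one-line sketch of the fusion. An element-wise (Hadamard) product is assumed here, since both matrices hold one value per pixel of the same patch; the text's "product operation" could also be read as a matrix product:

```python
def fusion_matrix(color_matrix, grad_matrix):
    """Fuse the color quantization and gradient matrices of one region."""
    return grad_matrix * color_matrix  # F = G * C, element-wise, still 7x7
```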
206: Obtain a second fusion matrix corresponding to the second extraction region in the same manner as the first fusion matrix corresponding to the first extraction region is obtained;

In this step, when the second fusion matrix corresponding to the second extraction region is obtained in the same manner as the first fusion matrix corresponding to the first extraction region, the color components of the pixels within the second extraction region are obtained in the manner provided in step 203 above, yielding a color quantization matrix corresponding to the second extraction region; the gradient values of the pixels within the second extraction region are obtained in the manner provided in step 204 above, yielding a gradient matrix corresponding to the second extraction region; and the color quantization matrix and the gradient matrix corresponding to the second extraction region are fused in the manner provided in step 205 above, yielding the second fusion matrix corresponding to the second extraction region.
207: Calculate the scalar value corresponding to the first set of point pairs according to the first fusion matrix and the second fusion matrix;

Specifically, the implementation of this step includes, but is not limited to, the following steps:

Step a: Expand the first fusion matrix row by row to form a row vector;

Step b: Sort the vector elements in the row vector by magnitude to form a new row vector;

Step c: Compare the element value of each vector in the new row vector with the element value of the vector corresponding to the center pixel of the first extraction region to obtain a first comparison result;

Step d: Process the second fusion matrix in the same manner as the first fusion matrix to obtain a second comparison result;

Step e: Obtain the scalar value corresponding to the first set of point pairs by comparing the first comparison result with the second comparison result.
For ease of understanding, taking the first fusion matrix as the matrix $F_{7\times 7}$ as an example, the implementation of each of the above steps for calculating the scalar value corresponding to the first set of point pairs according to the first fusion matrix and the second fusion matrix is illustrated as follows:

Step a': Expand the first fusion matrix $F_{7\times 7}$ row by row to form a 49-dimensional row vector $f$. When the first fusion matrix $F_{7\times 7}$ is expanded row by row, this can be done taking a row of the matrix as the unit: the last element of the first row of the matrix is joined to the first element of the second row, the last element of the second row is joined to the first element of the third row, and so on, the tail of each row being joined to the head of the next row, forming the 49-dimensional row vector $f$.

Step b': Sort the vector elements in the row vector $f$ by magnitude to form a new row vector $f'$. The vector elements may be sorted either in descending order or in ascending order; this embodiment does not limit the specific sorting manner.

Step c': Compare the element value of each vector in the new row vector $f'$ with the element value of the vector corresponding to the center pixel of the first extraction region to obtain a first comparison result. This embodiment does not limit the manner of obtaining the first comparison result. For example, when the element value of each vector in the new row vector $f'$ is compared with the element value of the vector corresponding to the center pixel of the first extraction region, if the former is larger the resulting comparison value may be 0 and otherwise 1; alternatively, if the former is larger the resulting comparison value may be 1 and otherwise 0. Whichever convention is used, a 48-dimensional vector is obtained for a new row vector $f'$, and the value in each dimension of the 48-dimensional vector is only the numeric value 0 or 1.

Step d': Process the second fusion matrix in the manner of the first fusion matrix above to obtain the second comparison result.

Step e': When the scalar value corresponding to the first set of point pairs is obtained by comparing the first comparison result with the second comparison result, since each of the two points of the first set of point pairs yields a 48-dimensional vector according to steps a' to c' above, and the value in each dimension of the 48-dimensional vector is only 0 or 1, the first comparison result and the second comparison result corresponding to the two points of the first set of point pairs can be compared, and the scalar value corresponding to the first set of point pairs is obtained according to the comparison result.

For example, for the two points of the first set of point pairs, the first comparison result corresponding to one point is denoted $M_0$, and the second comparison result corresponding to the other point is denoted $M_1$. Since each comparison result is a 48-dimensional vector whose value in each dimension is only 0 or 1, $M_0$ and $M_1$ are both binary sequences and can be used to represent unsigned numbers. Therefore, the scalar value corresponding to the first set of point pairs can be obtained by comparing the binary numbers corresponding to $M_0$ and $M_1$: for example, if $M_0$ is greater than $M_1$, the comparison result is 1 and is taken as the scalar value corresponding to the first set of point pairs; otherwise, the comparison result is 0 and is taken as the scalar value corresponding to the first set of point pairs.
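A sketch of steps a' to e' for one point pair, under two interpretive assumptions: the 48-dimensional result is obtained by dropping one occurrence of the center-pixel value from the 49 sorted elements, and the per-element test uses `>=` (the text allows either direction):

```python
import numpy as np

def comparison_bits(fusion):
    """Flatten row by row, sort, drop the center value once, compare to the center."""
    flat = np.sort(fusion.reshape(-1))        # row-major expansion, sorted by magnitude
    center = fusion[3, 3]                     # value at the 7x7 region's center pixel
    idx = np.nonzero(flat == center)[0][0]    # locate one occurrence of the center value
    rest = np.delete(flat, idx)               # the 48 remaining elements
    return (rest >= center).astype(np.uint8)  # 48-bit comparison sequence of 0s and 1s

def scalar_value(fusion_a, fusion_b):
    """Read the two 48-bit sequences as unsigned numbers M0, M1 and compare them."""
    m0 = int("".join(map(str, comparison_bits(fusion_a))), 2)
    m1 = int("".join(map(str, comparison_bits(fusion_b))), 2)
    return 1 if m0 > m1 else 0
```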
208: Complete the calculation of the scalar values corresponding to the other point pairs in the same manner as for the first set of point pairs, and combine the scalar values of the preset number of sets of point pairs to obtain the feature vector of the image to be processed.

In this step, after the scalar value corresponding to the first set of point pairs has been calculated according to the above steps, the scalar value corresponding to each remaining set of point pairs can be calculated in the same manner, and the result of combining the scalar values of the preset number of sets of point pairs is taken as the feature vector of the image to be processed. For example, if 100 sets of point pairs are selected in the processed image in step 201 above, the scalar values corresponding to the 100 sets of point pairs are obtained, and combining them yields a 100-dimensional feature vector.
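A hypothetical driver that strings the sketches above together (`sample_point_pairs`, `extract_region`, `color_quantization_matrix`, `gradient_matrix`, `fusion_matrix`, and `scalar_value` are the illustrative helpers defined earlier):

```python
import numpy as np

def region_fusion(image_bgr, gray, point):
    """Fusion matrix of the extraction region around one point."""
    c = color_quantization_matrix(extract_region(image_bgr, point))
    g = gradient_matrix(extract_region(gray, point).astype(np.float64))
    return fusion_matrix(c, g)

def feature_vector(image_bgr, gray, pairs):
    """One scalar (bit) per point pair: 100 pairs yield a 100-dimensional vector."""
    return [scalar_value(region_fusion(image_bgr, gray, p),
                         region_fusion(image_bgr, gray, q))
            for p, q in pairs]
```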
It should be noted that the above method for extracting image features can be applied to scenarios such as target tracking and detection. In the image feature extraction process implemented by the above steps, on the basis of fusing the image color and gradient information, sorting the data in the fused matrix enables the tracked or detected target to have similar feature vectors at different angles, thereby increasing the rotation resistance of the feature vector. Next, to describe more clearly the beneficial effects achieved by the above method provided by this embodiment and the differences from the prior art, the application of the method for extracting image features provided by this embodiment to a target tracker is taken as an example and described with reference to experimental data.

In the experiment, video in which the bounding rectangle region of the target has been labeled is used as the experimental input data. A target tracker based on the image feature extraction technique provided by this embodiment and a target tracker based on the image feature extraction technique provided by the prior art are run separately, and each outputs the bounding rectangle region data of the tracked target over 125 consecutive frames. A labeled target region is called a positive sample; the regions around a labeled sample region, and regions whose overlap area with the positive sample region is smaller than a threshold $\alpha$, are called negative samples. If the overlap area between the target region output by the target tracker and the labeled target region is smaller than $\alpha$, the target tracker is considered not to have detected the target; otherwise, the target tracker is considered to have detected the target. On this premise, with a random forest used as the classifier, the precision and recall data of the target trackers based on the technical solution provided by the prior art and on the technical solution provided by this embodiment are calculated separately. In the experiment, $\alpha$ can be set according to the specific application scenario and is not specifically limited by this embodiment; $\alpha = 0.30$ is taken only as an example.
In this experiment, the experimental data is prepared as follows:

1) Collect two video segments with lighting-change scenes that contain targets with significant color features and in which the targets are partially or fully occluded;

2) For each video segment, select 125 consecutive frames of video data and mark the rectangular region of the position where the tracked target appears in the video. The marking can be done manually; this embodiment does not specifically limit it. The information of a marked position rectangle can be represented by the horizontal and vertical coordinates of the starting position of the rectangle together with the length and width of the rectangle. The upper-left corner of the position rectangle can be regarded as the coordinate origin, so the horizontal and vertical coordinates of the starting position of the rectangle are the origin coordinates of its upper-left corner; each pixel position of the rectangle is quantized in units of pixels, and the length and width of the rectangle are each represented by the distance between two pixel points, in pixels. For example, the information of a position rectangle can be expressed as: starting position coordinates (256, 108), length 100 pixels, width 200 pixels. After the rectangular region where the tracked target appears in the video is marked, the information of that rectangle is recorded in a file together with the frame number (it is not drawn directly in the video); this embodiment does not specifically limit the type or storage location of the file that records the position rectangle information. For a fully occluded target, the position data of the target is not recorded; for a partially occluded target, only the coordinate data of the visible region of the target is recorded. The rectangle marking the target region should be as close as possible to the outer contour of the visible region of the tracked target; the target region data marked above constitutes the positive samples. In this experiment, taking the schematic diagram of the experimental running effect shown in FIG. 3 as an example, the tracked targets are a woman's head and a boy's shirt.
3) The positive and negative sample data are recorded in files, with one sample stored per line of a file. Each line contains five values: the first value is the frame number, and the following four values describe the region of the sample in the video image.
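A hypothetical reader for this file format (the function name and the assumption that all five values are integers are illustrative):

```python
def read_samples(path):
    """One sample per line: frame number, then four values describing the region."""
    samples = []
    with open(path) as fh:
        for line in fh:
            frame, x, y, w, h = map(int, line.split())
            samples.append((frame, (x, y, w, h)))
    return samples
```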
In this experiment, the experimental data statistics process is as follows:

1) Number the frames of a video segment 1, 2, ..., 125. Since the first frame of a video segment does not necessarily contain the tracked target, numbering need not start from the first frame of the video but may start from the frame in which the tracked target appears;

2) In the frame preceding the frame labeled number 1 in the video, initialize the target tracker and run it, so that the target tracker is in a normal working state and can process the numbered frame data normally;
3) Process the data of the i-th frame, pause the target tracker, and record the position region data of the target output by the target tracker. Here i is any value from 1 to 125, and processing the data of the i-th frame includes the process of extracting the image feature vector of the i-th frame data; the manner of extracting features is described in detail in steps 201 to 206 above and is not repeated here. When the i-th frame is recorded, the rectangular region output by the target tracker is $R_{predict}$, and the region where the positive sample is located is $R_{target}$. The function $Overlap(R_{predict}, R_{target})$ denotes the ratio of the mutual overlap area of the regions $R_{predict}$ and $R_{target}$ to the area of $R_{target}$, that is,

$$Overlap(R_{predict}, R_{target}) = \frac{S(R_{predict} \cap R_{target})}{S(R_{target})} \times 100\%,$$

where $S$ denotes the rectangle-area function. This ratio takes values in the interval [0, 1], where 0 means that the regions $R_{predict}$ and $R_{target}$ have no overlapping area and 1 means that the two regions overlap completely. For the target region $R_{predict}$ output by the target tracker for the target to be tracked, if $Overlap(R_{predict}, R_{target}) \geq \alpha$, the positive sample of the i-th frame is judged by the classifier as a positive sample, that is, the target region output by the classifier substantially overlaps the labeled target region; if $Overlap(R_{predict}, R_{target}) < \alpha$, the negative sample of the i-th frame is judged by the classifier as a positive sample, that is, the target region output by the classifier is far from the labeled target region. In this experiment, it is necessary to count, for each frame, the total number of positive samples judged by the target tracker as positive samples and the total number of negative samples judged by the target tracker as positive samples.
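A sketch of the Overlap measure, assuming rectangles are given as (x, y, width, height) and the labeled region's area is the denominator:

```python
def overlap(pred, target):
    """Intersection area of the two rectangles over the labeled rectangle's area."""
    ix = max(0, min(pred[0] + pred[2], target[0] + target[2]) - max(pred[0], target[0]))
    iy = max(0, min(pred[1] + pred[3], target[1] + target[3]) - max(pred[1], target[1]))
    return ix * iy / float(target[2] * target[3])
```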
4) For each video target, count the total number of positive samples, the total number of positive samples judged positive by the classifier, and the total number of negative samples misjudged positive by the classifier, and calculate the recall, precision, and F-value according to the following formulas:
recall = (total number of positive samples judged positive by the tracker) / (total number of positive samples)

precision = (total number of positive samples judged positive by the tracker) / (total number of positive samples judged positive + total number of negative samples judged positive)

F-value = (precision × recall × 2) / (precision + recall)
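The three evaluation formulas transcribed directly, with TP, FP, and P as defined in the note following Table 1 below:

```python
def recall(tp, p):
    return tp / float(p)            # positives found among all labeled positives

def precision(tp, fp):
    return tp / float(tp + fp)      # positives found among everything judged positive

def f_value(tp, fp, p):
    pr, rc = precision(tp, fp), recall(tp, p)
    return 2 * pr * rc / (pr + rc)  # harmonic mean of precision and recall
```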
In this experiment, the running effect is shown in FIG. 3. In FIG. 3, the four pictures in the first row and the four pictures in the third row show the running effect of the target tracker based on the image feature extraction scheme provided by the prior art, while the four pictures in the second row and the four pictures in the fourth row show the running effect of the target tracker based on the image feature extraction scheme provided by this embodiment. The four pictures on the left show tracking of the woman's head, and the four pictures on the right show tracking of the man's shirt. The experimental data is shown in Table 3 below. Analysis of the experimental data shows that, when the lighting changes markedly and the tracked target has distinct color features, the target tracker based on the image feature extraction scheme provided by this embodiment achieves better results.

The experimental data of the target tracker based on the image feature extraction scheme provided by the prior art is shown in Table 1 below, and the experimental data of the target tracker based on the image feature extraction scheme provided by this embodiment is shown in Table 2 below:
Table 1

Tracked target    TP    FP    P      TP/P    TP/(TP+FP)    F-value
Woman's head      81    15    103    0.79    0.84          0.81
Man's shirt       78     9     92    0.85    0.90          0.87
Motorcycle        76    12     79    0.96    0.86          0.90

Table 2
Note: TP denotes the total number of positive samples judged positive by the target tracker; FP denotes the total number of negative samples judged positive by the target tracker; P denotes the total number of labeled positive samples.
Table 3
As can be seen from the experimental data shown in Table 3 above, the technical solution provided by this embodiment achieves a higher recall and a higher precision than the technical solution provided by the prior art, and thereby improves the accuracy of the extracted image features.

In the method provided by this embodiment, gradient values can describe the target contour of an image, and color is usually closely related to the objects or scenes contained in an image; compared with other features, color features depend less on the size, orientation, and viewing angle of the image itself and are therefore more robust. By extracting image features using gradient values and color components, features with strong descriptive power and strong resistance to illumination changes can be extracted, which not only improves the accuracy of the extracted image features but is also applicable to image feature extraction in many common target tracking and detection techniques. In addition, on the basis of fusing image color and gradient values, the fused matrix is expanded row by row, the vector elements in the row vector are sorted by magnitude, and the feature vector is derived accordingly, so that the target has similar feature vectors at different angles, which increases the rotation resistance of the feature vector.
Embodiment 3

This embodiment provides an apparatus for extracting image features, configured to perform the method for extracting image features provided in Embodiment 1 or Embodiment 2 above. Referring to FIG. 4, the apparatus for extracting image features includes:

a processing module 401, configured to perform grayscale equalization processing on an image to be processed;

a selection module 402, configured to select a preset number of sets of point pairs in the image processed by the processing module 401;

a determining module 403, configured to take a first set of point pairs and determine a first extraction region and a second extraction region of a preset range centered on the two points of the first set of point pairs, respectively;

a first obtaining module 404, configured to obtain the color components of the pixels within the first extraction region to obtain a color quantization matrix corresponding to the first extraction region;

a second obtaining module 405, configured to obtain the gradient values of the pixels within the first extraction region to obtain a gradient matrix corresponding to the first extraction region;

a fusion module 406, configured to fuse the color quantization matrix and the gradient matrix corresponding to the first extraction region to obtain a first fusion matrix corresponding to the first extraction region;

a first repetition module 407, configured to obtain a second fusion matrix corresponding to the second extraction region in the same manner as the first fusion matrix corresponding to the first extraction region is obtained;

a calculation module 408, configured to calculate the scalar value corresponding to the first set of point pairs according to the first fusion matrix and the second fusion matrix;

a second repetition module 409, configured to complete the calculation of the scalar values corresponding to the other point pairs in the same manner as for the first set of point pairs; and

a combination module 410, configured to combine the scalar values of the preset number of sets of point pairs to obtain the feature vector of the image to be processed.
Further, referring to FIG. 5, the first obtaining module 404 specifically includes:

a converting unit 4041, configured to convert the image of the first extraction region into a corresponding color space to obtain the color component corresponding to each pixel within the first extraction region; and

an obtaining unit 4042, configured to weight and quantize the color component corresponding to each pixel obtained by the converting unit 4041 to obtain the color quantization matrix corresponding to the first extraction region.
Further, the obtaining unit 4042 is specifically configured to weight and quantize the color component corresponding to each pixel by the formula $f = \sigma_h h + \sigma_s s + \sigma_v v$ to obtain the color quantization matrix corresponding to the first extraction region, where $h$ is the hue, $s$ is the saturation, $v$ is the value (brightness), and $\sigma_h$, $\sigma_s$, and $\sigma_v$ are weighting coefficients.

Further, referring to FIG. 6, the second obtaining module 405 specifically includes:

an operation unit 4051, configured to convolve the horizontal matrix and the vertical matrix of the Sobel operator with the pixels within the first extraction region, respectively, to obtain horizontal and vertical brightness difference approximations; and an obtaining unit 4052, configured to obtain the gradient matrix corresponding to the first extraction region from the horizontal and vertical brightness difference approximations obtained by the operation unit 4051.
Further, the operation unit 4051 is specifically configured to convolve the horizontal matrix and the vertical matrix of the Sobel operator,

$$g_x = \begin{pmatrix} -1 & 0 & +1 \\ -2 & 0 & +2 \\ -1 & 0 & +1 \end{pmatrix}, \qquad g_y = \begin{pmatrix} +1 & +2 & +1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{pmatrix},$$

with the pixels within the first extraction region, respectively, to obtain the horizontal brightness difference approximation $G_x = g_x * I$ and the vertical brightness difference approximation $G_y = g_y * I$, where $I$ denotes the pixels within the first extraction region; and the obtaining unit 4052 is specifically configured to obtain the gradient matrix $G_{xy}$ corresponding to the first extraction region by the formula $G_{xy} = \sqrt{G_x^2 + G_y^2}$.
Further, the fusion module 406 is specifically configured to multiply the color quantization matrix corresponding to the first extraction region by the gradient matrix to obtain the first fusion matrix corresponding to the first extraction region.

Further, referring to FIG. 7, the calculation module 408 specifically includes:

an expansion unit 4081, configured to expand the first fusion matrix row by row to form a row vector;

a sorting unit 4082, configured to sort the vector elements in the row vector obtained by the expansion unit 4081 by magnitude to form a new row vector;

a first comparison unit 4083, configured to compare the element value of each vector in the new row vector obtained by the sorting unit 4082 with the element value of the vector corresponding to the center pixel of the first extraction region to obtain a first comparison result;

a repetition operation unit 4084, configured to process the second fusion matrix in the same manner as the first fusion matrix to obtain a second comparison result; and

a second comparison unit 4085, configured to obtain the scalar value corresponding to the first set of point pairs by comparing the first comparison result with the second comparison result.
In the apparatus provided by this embodiment, gradient values can describe the target contour of an image, and color is usually closely related to the objects or scenes contained in an image; compared with other features, color features depend less on the size, orientation, and viewing angle of the image itself and are therefore more robust. By extracting image features using gradient values and color components, features with strong descriptive power and strong resistance to illumination changes can be extracted, which not only improves the accuracy of the extracted image features but is also applicable to image feature extraction in many common target tracking and detection techniques. In addition, on the basis of fusing image color and gradient values, the fused matrix is expanded row by row, the vector elements in the row vector are sorted by magnitude, and the feature vector is derived accordingly, so that the target has similar feature vectors at different angles, which increases the rotation resistance of the feature vector.
Embodiment 4
FIG. 8 is a schematic structural diagram of a feature extraction device in one embodiment. The feature extraction device includes at least one processor 801 (for example, a CPU), at least one network interface 804 or other user interface 803, a memory 805, and at least one communication bus 802. The communication bus 802 implements connection and communication between these components. The user interface 803 may be a display, a keyboard, or a pointing device. The memory 805 may include a high-speed RAM and may also include non-volatile memory, for example at least one disk memory. The memory 805 may optionally include at least one storage device located remotely from the aforementioned processor 801. In some embodiments, the memory 805 stores the following elements, modules, or data structures, or a subset or an extended set of them:
an operating system 806, containing various programs for implementing various basic services and processing hardware-based tasks; and
an application module 807, containing the processing module 401, the selecting module 402, the determining module 403, the first acquiring module 404, the second acquiring module 405, the fusion module 406, the first repeating module 407, the calculation module 408, the second repeating module 409, and the combining module 410. For the functions of these modules, refer to the description of the working principle diagram of FIG. 4; details are not repeated here.
In the device provided by this embodiment, gradient values can describe the target contour of an image, and color is usually closely related to the objects or scenes an image contains; compared with other features, color depends little on the size, orientation, and viewing angle of the image itself and is therefore highly robust. Extracting image features from gradient values and color components thus yields features that are strongly descriptive and resistant to illumination changes, which not only improves the accuracy of the extracted image features but also suits the feature extraction needs of many common target tracking and detection techniques. In addition, on the basis of fusing image color with gradient values, the fused matrix is expanded row by row and the vector elements of the row vector are sorted by magnitude before the feature vector is derived, so that the target has similar feature vectors at different angles, increasing the rotation resistance of the feature vector.
It should be noted that, when the image feature extraction device provided by the above embodiments extracts image features, the division into the functional modules described above is only an example. In practical applications, the above functions may be assigned to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the image feature extraction device provided by the above embodiments and the embodiments of the image feature extraction method belong to the same concept; for the specific implementation process, see the method embodiments, which are not repeated here.
A person of ordinary skill in the art will understand that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium; the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
The above are only preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims

1. A method for extracting image features, wherein the method comprises:

performing grayscale equalization on an image to be processed, and selecting a preset number of point pairs in the processed image;

taking a first point pair, and determining a first extraction region and a second extraction region of a preset range centered respectively on the two points of the first point pair;

acquiring the color component of each pixel within the first extraction region to obtain a color quantization matrix corresponding to the first extraction region, and acquiring the gradient value of each pixel within the first extraction region to obtain a gradient matrix corresponding to the first extraction region;

fusing the color quantization matrix and the gradient matrix corresponding to the first extraction region to obtain a first fusion matrix corresponding to the first extraction region;

obtaining a second fusion matrix corresponding to the second extraction region in the same manner as the first fusion matrix corresponding to the first extraction region is obtained;

calculating the scalar value corresponding to the first point pair according to the first fusion matrix and the second fusion matrix; and

completing the calculation of the scalar values corresponding to the other point pairs in the same manner as for the first point pair, and combining the scalar values of the preset number of point pairs to obtain the feature vector of the image to be processed.
2. The method according to claim 1, wherein acquiring the color component of each pixel within the first extraction region to obtain the color quantization matrix corresponding to the first extraction region specifically comprises:

converting the image of the first extraction region into a corresponding color space to obtain the color component corresponding to each pixel within the first extraction region; and

weighting and quantizing the color component corresponding to each pixel to obtain the color quantization matrix corresponding to the first extraction region.
3. The method according to claim 2, wherein weighting and quantizing the color component corresponding to each pixel to obtain the color quantization matrix corresponding to the first extraction region specifically comprises:

weighting and quantizing the color component corresponding to each pixel through the formula f = σ_h·h + σ_s·s + σ_v·v to obtain the color quantization matrix corresponding to the first extraction region,

wherein h is the hue, s is the saturation, v is the value (brightness), and σ_h, σ_s, σ_v are weighting coefficients.
4. The method according to claim 1, wherein acquiring the gradient value of each pixel within the first extraction region to obtain the gradient matrix corresponding to the first extraction region specifically comprises:

convolving the horizontal matrix and the vertical matrix of the Sobel operator respectively with each pixel within the first extraction region to obtain horizontal and vertical luminance difference approximations; and

obtaining the gradient matrix corresponding to the first extraction region according to the horizontal and vertical luminance difference approximations.
5. The method according to claim 4, wherein convolving the horizontal matrix and the vertical matrix of the Sobel operator respectively with each pixel within the first extraction region to obtain the horizontal and vertical luminance difference approximations specifically comprises:

convolving the horizontal matrix g_x = [-1 0 +1; -2 0 +2; -1 0 +1] and the vertical matrix g_y = [+1 +2 +1; 0 0 0; -1 -2 -1] of the Sobel operator respectively with each pixel within the first extraction region to obtain the horizontal luminance difference approximation G_x = g_x * I and the vertical luminance difference approximation G_y = g_y * I, wherein I is a pixel within the first extraction region; and

wherein obtaining the gradient matrix corresponding to the first extraction region according to the horizontal and vertical luminance difference approximations specifically comprises:

obtaining the gradient matrix G_{x,y} corresponding to the first extraction region through the formula G_{x,y} = √(G_x² + G_y²).
6. The method according to claim 1, wherein fusing the color quantization matrix and the gradient matrix corresponding to the first extraction region to obtain the first fusion matrix corresponding to the first extraction region specifically comprises:

performing a product operation on the color quantization matrix and the gradient matrix corresponding to the first extraction region to obtain the first fusion matrix corresponding to the first extraction region.
7. The method according to claim 1, wherein calculating the scalar value corresponding to the first point pair according to the first fusion matrix and the second fusion matrix specifically comprises:

expanding the first fusion matrix row by row to form a row vector;

sorting the vector elements in the row vector by magnitude to form a new row vector;

comparing each element value in the new row vector with the element value corresponding to the center pixel of the first extraction region to obtain a first comparison result;

processing the second fusion matrix in the same manner as the first fusion matrix to obtain a second comparison result; and

obtaining the scalar value corresponding to the first point pair by comparing the first comparison result with the second comparison result.
8. A device for extracting image features, wherein the device comprises:

a processing module, configured to perform grayscale equalization on an image to be processed;

a selecting module, configured to select a preset number of point pairs in the image processed by the processing module;

a determining module, configured to take a first point pair and determine a first extraction region and a second extraction region of a preset range centered respectively on the two points of the first point pair;

a first acquiring module, configured to acquire the color component of each pixel within the first extraction region to obtain a color quantization matrix corresponding to the first extraction region;

a second acquiring module, configured to acquire the gradient value of each pixel within the first extraction region to obtain a gradient matrix corresponding to the first extraction region;

a fusion module, configured to fuse the color quantization matrix and the gradient matrix corresponding to the first extraction region to obtain a first fusion matrix corresponding to the first extraction region;

a first repeating module, configured to obtain a second fusion matrix corresponding to the second extraction region in the same manner as the first fusion matrix corresponding to the first extraction region is obtained;

a calculation module, configured to calculate the scalar value corresponding to the first point pair according to the first fusion matrix and the second fusion matrix;

a second repeating module, configured to complete the calculation of the scalar values corresponding to the other point pairs in the same manner as for the first point pair; and

a combining module, configured to combine the scalar values of the preset number of point pairs to obtain the feature vector of the image to be processed.
9. The device according to claim 8, wherein the first acquiring module specifically comprises:

a converting unit, configured to convert the image of the first extraction region into a corresponding color space to obtain the color component corresponding to each pixel within the first extraction region; and

an acquiring unit, configured to weight and quantize the color component corresponding to each pixel obtained by the converting unit, to obtain the color quantization matrix corresponding to the first extraction region.
10. The device according to claim 9, wherein the acquiring unit is specifically configured to weight and quantize the color component corresponding to each pixel through the formula f = σ_h·h + σ_s·s + σ_v·v to obtain the color quantization matrix corresponding to the first extraction region, wherein h is the hue, s is the saturation, v is the value (brightness), and σ_h, σ_s, σ_v are weighting coefficients.
11. The device according to claim 8, wherein the second acquiring module specifically comprises:

an operation unit, configured to convolve the horizontal matrix and the vertical matrix of the Sobel operator respectively with each pixel within the first extraction region to obtain horizontal and vertical luminance difference approximations; and

an acquiring unit, configured to obtain the gradient matrix corresponding to the first extraction region according to the horizontal and vertical luminance difference approximations obtained by the operation unit.
12. The device according to claim 11, wherein the operation unit is specifically configured to convolve the horizontal matrix g_x = [-1 0 +1; -2 0 +2; -1 0 +1] and the vertical matrix g_y = [+1 +2 +1; 0 0 0; -1 -2 -1] of the Sobel operator respectively with each pixel within the first extraction region to obtain the horizontal luminance difference approximation G_x = g_x * I and the vertical luminance difference approximation G_y = g_y * I, wherein I is a pixel within the first extraction region; and

the acquiring unit is specifically configured to obtain the gradient matrix G_{x,y} corresponding to the first extraction region through the formula G_{x,y} = √(G_x² + G_y²).
13. The device according to claim 8, wherein the fusion module is specifically configured to perform a product operation on the color quantization matrix and the gradient matrix corresponding to the first extraction region to obtain the first fusion matrix corresponding to the first extraction region.
14. The device according to claim 8, wherein the calculation module specifically comprises:

an expansion unit, configured to expand the first fusion matrix row by row to form a row vector;

a sorting unit, configured to sort the vector elements in the row vector obtained by the expansion unit by magnitude, to form a new row vector;

a first comparing unit, configured to compare each element value in the new row vector obtained by the sorting unit with the element value corresponding to the center pixel of the first extraction region, to obtain a first comparison result;

a repeating operation unit, configured to process the second fusion matrix in the same manner as the first fusion matrix, to obtain a second comparison result; and

a second comparing unit, configured to obtain the scalar value corresponding to the first point pair by comparing the first comparison result with the second comparison result.
PCT/CN2013/071183 2012-09-10 2013-01-31 Method and device for extracting image features WO2014036813A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201210332125.5 2012-09-10
CN201210332125.5A CN103679169B (en) 2012-09-10 2012-09-10 The extracting method of characteristics of image and device

Publications (1)

Publication Number Publication Date
WO2014036813A1 true WO2014036813A1 (en) 2014-03-13

Family

ID=50236494

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2013/071183 WO2014036813A1 (en) 2012-09-10 2013-01-31 Method and device for extracting image features

Country Status (2)

Country Link
CN (1) CN103679169B (en)
WO (1) WO2014036813A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106203448B (en) * 2016-07-08 2019-03-12 南京信息工程大学 A kind of scene classification method based on Nonlinear Scale Space Theory
CN110378376A (en) * 2019-06-12 2019-10-25 西安交通大学 A kind of oil filler object recognition and detection method based on machine vision
CN113291055A (en) * 2021-04-14 2021-08-24 西安理工大学 Artificial intelligent flexographic printing pressure prediction method
CN113409278B (en) * 2021-06-22 2024-04-26 平安健康保险股份有限公司 Image quality detection method, device, equipment and medium
CN114998614B (en) * 2022-08-08 2023-01-24 浪潮电子信息产业股份有限公司 Image processing method, device and equipment and readable storage medium
CN115266538B (en) * 2022-09-30 2022-12-09 维柏思特衬布(南通)有限公司 Woven belt water permeability detection device and method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100573523C (en) * 2006-12-30 2009-12-23 中国科学院计算技术研究所 A kind of image inquiry method based on marking area

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20090026879A (en) * 2007-09-11 2009-03-16 한국전자통신연구원 Apparatus and method for abstracting feature to recognition of a face feature based on a color picture
CN101551823A (en) * 2009-04-20 2009-10-07 浙江师范大学 Comprehensive multi-feature image retrieval method
CN101833767A (en) * 2010-05-10 2010-09-15 河南理工大学 Gradient and color characteristics-based automatic straight line matching method in digital image
CN102156888A (en) * 2011-04-27 2011-08-17 西安电子科技大学 Image sorting method based on local colors and distribution characteristics of characteristic points

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZOU, YU: "Image Retrieval Method based on Color Histogram and Contour Extraction", ELECTRONIC TECHNOLOGY & INFORMATION SCIENCE, CHINA MASTER'S THESES FULL-TEXT DATABASE, September 2011 (2011-09-01), pages 1138 - 1150 *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106250835A (en) * 2016-07-26 2016-12-21 国网福建省电力有限公司 Bird's Nest recognition methods on the transmission line of electricity of feature based identification
CN109190471A (en) * 2018-07-27 2019-01-11 天津大学 The attention model method of video monitoring pedestrian search based on natural language description
CN109190471B (en) * 2018-07-27 2021-07-13 天津大学 Attention model method for video monitoring pedestrian search based on natural language description
CN111753949A (en) * 2019-03-28 2020-10-09 杭州海康威视数字技术股份有限公司 Data block processing method and device and electronic equipment
CN111753949B (en) * 2019-03-28 2024-06-07 杭州海康威视数字技术股份有限公司 Data block processing method and device and electronic equipment
CN112200878A (en) * 2020-10-12 2021-01-08 长春希达电子技术有限公司 Ink color sequencing method of LED display modules in LED display unit
CN112200878B (en) * 2020-10-12 2024-04-02 长春希达电子技术有限公司 Ink color sequencing method for LED display modules in LED display unit
CN113469928A (en) * 2021-07-09 2021-10-01 西北核技术研究所 Image fusion method based on image gradient comparison, storage medium and terminal equipment
CN113469928B (en) * 2021-07-09 2023-06-20 西北核技术研究所 Image fusion method based on image gradient comparison, storage medium and terminal equipment
CN117237384A (en) * 2023-11-16 2023-12-15 潍坊科技学院 Visual detection method and system for intelligent agricultural planted crops
CN117237384B (en) * 2023-11-16 2024-02-02 潍坊科技学院 Visual detection method and system for intelligent agricultural planted crops

Also Published As

Publication number Publication date
CN103679169A (en) 2014-03-26
CN103679169B (en) 2016-12-21

Similar Documents

Publication Publication Date Title
WO2014036813A1 (en) Method and device for extracting image features
JP6438403B2 (en) Generation of depth maps from planar images based on combined depth cues
US10872262B2 (en) Information processing apparatus and information processing method for detecting position of object
WO2020125216A1 (en) Pedestrian re-identification method, device, electronic device and computer-readable storage medium
CN103098076B (en) Gesture recognition system for TV control
US7664315B2 (en) Integrated image processor
US7606414B2 (en) Fusion of color space data to extract dominant color
WO2023082784A1 (en) Person re-identification method and apparatus based on local feature attention
CN110097115B (en) Video salient object detection method based on attention transfer mechanism
US20140079319A1 (en) Methods for enhancing images and apparatuses using the same
CN112287866A (en) Human body action recognition method and device based on human body key points
US9401027B2 (en) Method and apparatus for scene segmentation from focal stack images
CN112287867B (en) Multi-camera human body action recognition method and device
CN109791695A (en) Motion vector image block based determines described piece of variance
US10582179B2 (en) Method and apparatus for processing binocular disparity image
WO2018082308A1 (en) Image processing method and terminal
CN107506738A (en) Feature extracting method, image-recognizing method, device and electronic equipment
CN108229281B (en) Neural network generation method, face detection device and electronic equipment
CN111784658A (en) Quality analysis method and system for face image
CN109064444B (en) Track slab disease detection method based on significance analysis
Tien et al. Shot classification of basketball videos and its application in shooting position extraction
CN111708907A (en) Target person query method, device, equipment and storage medium
Ghosal et al. A geometry-sensitive approach for photographic style classification
CN106402717B (en) A kind of AR control method for playing back and intelligent desk lamp
US20220207261A1 (en) Method and apparatus for detecting associated objects

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13834759

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 13834759

Country of ref document: EP

Kind code of ref document: A1