CN104599286A - Optical flow based feature tracking method and device - Google Patents

Optical flow based feature tracking method and device

Info

Publication number
CN104599286A
Authority
CN
China
Prior art keywords
feature
tracking
tracked
Prior art date
Legal status
Granted
Application number
CN201310529938.8A
Other languages
Chinese (zh)
Other versions
CN104599286B (en)
Inventor
刘阳
张乐
陈敏杰
林福辉
Current Assignee
Spreadtrum Communications Tianjin Co Ltd
Original Assignee
Spreadtrum Communications Tianjin Co Ltd
Priority date
Filing date
Publication date
Application filed by Spreadtrum Communications Tianjin Co Ltd
Priority to CN201310529938.8A
Publication of CN104599286A
Application granted
Publication of CN104599286B
Status: Active
Anticipated expiration


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/20 - Analysis of motion
    • G06T7/207 - Analysis of motion for motion estimation over a hierarchy of resolutions
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 - Image analysis
    • G06T7/20 - Analysis of motion
    • G06T7/269 - Analysis of motion using gradient-based methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T2207/20 - Special algorithmic details
    • G06T2207/20024 - Filtering details
    • G06T2207/20032 - Median filtering

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an optical flow based feature tracking method and device. The method comprises: acquiring the feature points contained in a tracking window; tracking the feature points with a sparse optical flow algorithm; and repositioning a tracked feature point when it lies outside a preset area after tracking, wherein the preset area is the area centered on a median feature point, and the median feature point is the feature point with the smallest sum of distances to all other feature points. With this method, tracking points that may be inaccurate are repositioned during feature tracking, so the accuracy of the feature points is improved and the accuracy of the tracking result is increased.

Description

Feature tracking method and device based on optical flow
Technical Field
The invention relates to the technical field of image processing, in particular to a feature tracking method and device based on optical flow.
Background
With the rapid development of moving-object detection technology, a variety of detection methods have emerged. In the prior art, detection methods are built, for example, on the color features, motion information, or motion model of the moving object. Feature detection and tracking of moving objects are important basic and key technologies in this research: for example, the features of an image sequence of a person's hand or face in motion can be detected and tracked, and recognition of the person's gestures, face and the like can then be realized.
Detection methods based on the color characteristics of the moving target include the mean-shift method, the continuously adaptive mean-shift method and the like, which can track human gestures well in some simple scenes. Detection methods based on the motion information of the moving object include the optical flow method, the Kalman Filter, the Particle Filter and other methods. In the optical flow method, the Motion Field of the moving object is calculated from the changes of pixel intensity, in the time domain and the space domain, of an image sequence containing the moving object, and tracking of the moving object is finally realized. Optical flow methods can be divided into dense optical flow and sparse optical flow according to the number of pixel points required by the calculation. In addition, there are detection methods based on a motion model, in which a 2D or 3D model of the moving target (for example, of a human hand) is first established, and the parameters of the model are iterated and optimized according to the actual conditions during tracking, so that the model continuously adapts to changes of the moving target and tracking of the moving target is realized.
Among the detection methods based on motion information, the optical flow method generally achieves tracking and recognition of a moving object by performing optical flow calculation on a number of feature points in the image sequence and tracking those feature points. During tracking, however, the optical flow method places high demands on the matching between images, and image sequences obtained in some complex scenes may match poorly, so that some feature points are tracked incorrectly and the tracking and recognition of the moving object may be wrong or may fail.
Reference is made to U.S. patent application publication No. US 2013/0259317 A1.
Disclosure of Invention
The invention solves the problem of inaccurate tracking of feature points.
In order to solve the above problems, the present invention provides a feature tracking method based on optical flow, including:
acquiring the feature points contained in a tracking window;
tracking the feature points based on a sparse optical flow algorithm;
and when a tracked feature point is located outside a preset area, repositioning the tracked feature point, wherein the preset area is an area centered on a median feature point, and the median feature point is the tracked feature point with the smallest sum of distances to all the other tracked feature points.
Optionally, the process of acquiring the feature points contained in the tracking window includes:
obtaining an autocorrelation matrix of all pixel points in a tracking window of an image by the following formula:
$$M(x,y)=\begin{bmatrix}\sum_{-K\le i,j\le K} w_{i,j}\, I_x^{2} & \sum_{-K\le i,j\le K} w_{i,j}\, I_x I_y\\[4pt] \sum_{-K\le i,j\le K} w_{i,j}\, I_x I_y & \sum_{-K\le i,j\le K} w_{i,j}\, I_y^{2}\end{bmatrix},$$
where M(x, y) is the autocorrelation matrix of the pixel point with coordinates (x, y); i and j are the index values of a pixel point in the tracking window in the X direction and the Y direction respectively; $w_{i,j}$ is the weight of the pixel point whose X-direction index is i and Y-direction index is j; K is the half-width of the tracking window; and $I_x$ and $I_y$ are the partial derivatives in the X direction and the Y direction of the pixel point whose X-direction index is i and Y-direction index is j;
acquiring the maximum eigenvalue and the minimum eigenvalue of the autocorrelation matrix of each pixel point based on that autocorrelation matrix;
and when λ(min) > A × λ(max), determining the pixel point to be a feature point contained in the tracking window, where λ(max) is the maximum eigenvalue of the autocorrelation matrix of the pixel point, λ(min) is the minimum eigenvalue of the autocorrelation matrix of the pixel point, and A is a feature threshold.
Optionally, the value of the feature threshold is 0.001 to 0.01.
Optionally, the method further includes: after the characteristic points contained in the tracking window are obtained, illumination compensation is carried out on the characteristic points before the characteristic points are tracked based on a sparse optical flow algorithm.
Optionally, the performing illumination compensation on the feature points includes:
based on the formula JnAnd = λ × J +, where λ is a gain coefficient of luminance of the feature point and a bias coefficient of luminance of the feature point, J is a luminance value before the feature point is compensated, and Jn is a luminance value after the feature point is compensated.
Optionally, the process of repositioning the tracked feature points includes:
and repositioning the tracked feature point by the formula N = R × M + (1 − R) × N, where N is the coordinate value of the tracked feature point, R is an update coefficient whose value lies between 0 and 1, and M is the coordinate value of the median feature point.
Optionally, the preset area is a circular region centered on the median feature point with a radius of one half of the side length of the tracking window.
Optionally, the sparse optical flow algorithm is an image pyramid optical flow algorithm.
Optionally, the method further includes: and after the tracked feature points are repositioned, recognizing the gesture of the user based on the tracking result of the feature points in the tracking window.
The present invention also provides an optical flow based feature tracking device, the device including:
an acquisition unit adapted to acquire the feature points contained in a tracking window;
a tracking unit adapted to track the feature points based on a sparse optical flow algorithm;
and a repositioning unit adapted to reposition a tracked feature point when it is located outside a preset area, wherein the preset area is an area centered on a median feature point, and the median feature point is the tracked feature point with the smallest sum of distances to all the other tracked feature points.
Optionally, the apparatus further comprises: a compensation unit adapted to perform illumination compensation on the feature points after the feature points of the image are acquired and before the feature points are tracked based on the sparse optical flow algorithm.
Optionally, the repositioning unit repositions the tracked feature point according to the formula N = R × M + (1 − R) × N, where N is the coordinate value of the tracked feature point, R is an update coefficient whose value lies between 0 and 1, and M is the coordinate value of the median feature point.
Optionally, the apparatus further comprises: a recognition unit adapted to recognize the gesture of the user based on the tracking result of the feature points in the tracking window after the tracked feature points are repositioned.
Compared with the prior art, the technical scheme of the invention has the following advantages:
After the feature points contained in the tracking window are obtained, they are tracked based on a sparse optical flow algorithm, and a tracked feature point located outside a preset area is repositioned; the preset area is an area centered on a median feature point, and the median feature point is the feature point with the smallest sum of distances to all the other feature points. In the technical scheme of the invention, tracking points that may not meet the requirements are repositioned during feature tracking, so the accuracy of the feature points can be improved and the accuracy of the tracking result is improved.
Furthermore, before tracking the feature points based on the optical flow algorithm, illumination compensation is performed on the pixels where the feature points are located through an illumination compensation method, so that images under different illumination conditions can be effectively adjusted, and the accuracy and stability of feature point tracking under different illumination conditions are improved.
Drawings
FIG. 1 is a schematic flow chart of a feature tracking method based on optical flow according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart of the optical flow based feature tracking method according to the first embodiment of the present invention;
FIG. 3 is a flowchart illustrating a method for tracking features based on optical flow according to a second embodiment of the present invention.
Detailed Description
In order to solve the above problem, an aspect of the present invention provides an optical flow based feature tracking method in which, based on the tracking result of the feature points, a tracked feature point is repositioned when its position may be abnormal, for example when the tracked feature point is located outside a preset area.
Fig. 1 is a schematic flow chart of the optical flow-based feature tracking method according to the present invention, and as shown in fig. 1, step S101 is executed first to obtain feature points included in a tracking window.
When a moving object is tracked, a tracking window needs to be determined first, and its size can be set according to the size of the acquired image. The tracking window can be obtained by a variety of methods known to those skilled in the art, such as motion detection, background removal, skin-color detection based on a trained model, and the like. The moving object is contained in the tracking window; for example, the tracking window may contain a hand image or a face image. The tracking window is typically a regularly shaped area, such as a square or rectangular area.
The optical flow method can be understood as calculating optical flow for a group of pixels centered on a feature point in the image sequence, that is, the optical flow is calculated over the picture element where the feature point is located, and the moving target is then tracked based on the calculation result. In this application, a picture element is defined as a region that is centered on a feature point and contains a number of pixel points. When feature points are tracked by the optical flow method, the feature points in the tracking window need to be acquired first, which can be done by various prior-art methods, for example the Shi-Tomasi corner algorithm, the Harris algorithm and the like; no limitation is imposed here.
And step S102 is executed, and the feature points are tracked based on a sparse optical flow algorithm.
After the feature points contained in the tracking window are obtained, the feature points may be tracked by an optical flow algorithm, for example, the sparse optical flow algorithm may be a sparse optical flow algorithm based on an image pyramid.
Feature points can be extracted from the previous frame of image of the image sequence, and the feature points are tracked for the next frame of image by using the image pyramid-based sparse optical flow algorithm to obtain the positions of the feature points in the previous frame of image in the next frame of image.
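For illustration only (this sketch is not part of the patent text), a comparable pyramidal sparse Lucas-Kanade tracker is available in OpenCV; the variable names prev_gray, next_gray and prev_pts below are assumed inputs.

```python
import cv2
import numpy as np

# Assumed inputs: prev_gray / next_gray are consecutive grayscale frames and
# prev_pts is an (N, 1, 2) float32 array of feature points detected in the
# previous frame (e.g., inside the tracking window).
next_pts, status, err = cv2.calcOpticalFlowPyrLK(
    prev_gray, next_gray, prev_pts, None,
    winSize=(15, 15),   # size of the picture element around each feature point
    maxLevel=3)         # extra pyramid levels used for coarse-to-fine estimation

tracked_pts = next_pts[status.ravel() == 1]   # positions of successfully tracked points
```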
And step S103 is executed, and when the tracked feature point is located outside the preset area, the tracked feature point is repositioned. The preset region is a region with a median feature point as a center, and the median feature point is a feature point with the smallest sum of distances from all other feature points among the feature points. In this document, the feature points located outside the preset region after tracking are referred to as designated feature points.
When the feature points are tracked based on the optical flow algorithm, the tracked positions of the feature points can be obtained based on the optical flow algorithm, and if the feature points are located outside the preset area, namely when the tracked feature points are the designated feature points, the positions of the feature points are adjusted in an aggregation mode.
In the aggregation process, the median feature point is determined first: for each tracked feature point in the tracking window, the distances to all the other feature points are accumulated, and the feature point whose accumulated distance is the smallest is taken as the median feature point. The median feature point calculated in this way lies near the center of all the feature points in the tracking window, and feature points with larger accumulated distances are not selected as the median feature point.
After the median feature point is obtained, the preset area can be determined. The preset area is an area centered on the median feature point, and its size may be about 0.8 to 1.2 times the size of the current tracking window; usually, the area corresponding to the size of the current tracking window, centered on the median feature point, is taken as the preset area. For example, the preset area may be a circular region centered on the median feature point whose radius is related to the side length of the tracking window. Taking a square tracking window as an example, the radius of the circular region may be one half of the side length of the tracking window.
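A minimal sketch (not from the patent text) of how the median feature point and the circular preset region could be computed from the tracked point coordinates; the function names and the use of NumPy are assumptions.

```python
import numpy as np

def median_feature_point(points):
    """Return the tracked point whose summed distance to all other tracked points is smallest."""
    pts = np.asarray(points, dtype=np.float64)                      # shape (N, 2)
    dists = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    return pts[np.argmin(dists.sum(axis=1))]

def outside_preset_region(point, median_pt, window_side):
    """True if a tracked point lies outside the circular preset region
    (centre = median feature point, radius = half the tracking-window side)."""
    radius = window_side / 2.0
    return np.linalg.norm(np.asarray(point, dtype=np.float64) - median_pt) > radius
```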
After the preset area is determined, the designated feature points can be aggregated, that is, drawn toward the median feature point. The aggregation may use a fixed aggregation speed or step length, or a variable step-length algorithm; the specific method is not limited here.
In the method, in the process of tracking the characteristic points, the tracking points which may be wrong are repositioned, so that the accuracy of the characteristic points can be improved, and the accuracy of the tracking result is effectively improved.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are further described below.
Example one
In this embodiment, after the feature points included in the tracking window are acquired, the feature points are tracked by a sparse optical flow algorithm based on an image pyramid, and when the tracked feature points are located outside a preset area, the tracked feature points are repositioned.
Fig. 2 is a schematic flowchart of the optical flow-based feature tracking method provided in this embodiment, and as shown in fig. 2, step S201 is first executed to determine a tracking window of an image.
The tracking window may be determined according to the size of the acquired image, and the tracking window may be obtained by various methods known to those skilled in the art, such as motion detection, background removal, skin color detection based on a training model, and the like. In this embodiment, a description is given by taking gesture tracking of a user as an example, and the tracking window contains a hand image of the user.
And S202, acquiring the feature points in the tracking window based on the Shi-Tomasi corner algorithm.
In this embodiment, a method of acquiring feature points by using the Shi-Tomasi corner point algorithm is taken as an example for explanation.
In the Shi-Tomasi corner point algorithm, the autocorrelation matrix of all pixel points in the tracking window of the image is first obtained by formula (1).
$$M(x,y)=\begin{bmatrix}\sum_{-K\le i,j\le K} w_{i,j}\, I_x^{2} & \sum_{-K\le i,j\le K} w_{i,j}\, I_x I_y\\[4pt] \sum_{-K\le i,j\le K} w_{i,j}\, I_x I_y & \sum_{-K\le i,j\le K} w_{i,j}\, I_y^{2}\end{bmatrix} \qquad (1)$$

where M(x, y) is the autocorrelation matrix of the pixel point with coordinates (x, y); i and j are the index values of a pixel point in the tracking window in the X direction and the Y direction respectively; $w_{i,j}$ is the weight of the pixel point whose X-direction index is i and Y-direction index is j; K is the half-width of the tracking window; and $I_x$ and $I_y$ are the partial derivatives in the X direction and the Y direction of the pixel point whose X-direction index is i and Y-direction index is j.
The maximum eigenvalue λ(max) and the minimum eigenvalue λ(min) of each pixel point's autocorrelation matrix are then obtained from the autocorrelation matrices of all the pixel points calculated by formula (1). The method for obtaining the maximum and minimum eigenvalues of an autocorrelation matrix is well known to those skilled in the art and is not described here.
Whether a pixel point is a feature point of the image is then determined by formula (2).
λ(min)>A×λ(max) (2)
where A is the feature threshold, whose value lies between 0.001 and 0.01.
Usually, when a pixel point in the tracking window satisfies formula (2), it can be determined to be a feature point of the image.
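As an illustrative sketch only, formula (2) can be applied per pixel using the two eigenvalues of the autocorrelation matrix, which OpenCV's cornerEigenValsAndVecs returns for every pixel; the window tuple (x, y, w, h), the function name and the default parameter values are assumptions.

```python
import cv2
import numpy as np

def shi_tomasi_points(gray, win, A=0.005, block_size=7, ksize=3):
    """Select feature points inside the tracking window using formula (2):
    keep a pixel when lambda_min > A * lambda_max of its autocorrelation matrix."""
    x, y, w, h = win
    roi = np.float32(gray[y:y + h, x:x + w])
    # per-pixel eigenvalues (and eigenvectors) of the 2x2 autocorrelation matrix
    eig = cv2.cornerEigenValsAndVecs(roi, block_size, ksize)
    lam1, lam2 = eig[:, :, 0], eig[:, :, 1]
    lam_min, lam_max = np.minimum(lam1, lam2), np.maximum(lam1, lam2)
    ys, xs = np.where(lam_min > A * lam_max)          # formula (2)
    return np.stack([xs + x, ys + y], axis=1).astype(np.float32)
```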
And step S203 is executed, and the feature points in the tracking window are tracked based on a sparse optical flow algorithm.
In this embodiment, the image pyramid-based sparse optical flow algorithm tracks feature points in the tracking window.
The sparse optical flow can be understood as image registration of a plurality of pixels taking the feature point as the center between the images of the adjacent frames, namely, the optical flow can be calculated based on the pixels where the feature points are located, and then the gesture of the user is tracked based on the calculation result.
In the sparse optical flow algorithm based on the image pyramid, the optical flow is usually calculated iteratively based on a gradient method, and coarse-to-fine motion estimation is realized in a pyramid mode.
In the process of tracking the feature points in the tracking window with the image pyramid-based sparse optical flow algorithm, for two given consecutive frames, the aim of feature point tracking is, for a picture element I in one frame, to find the corresponding picture element J with similar image intensity in the adjacent frame.
When calculating the optical flow of a pixel, it is necessary to use a residual function ξ (d) shown in equation (3). In the present embodiment, pixel I is taken as a tracking pixel.
$$\xi(d)=\xi(d_x,d_y)=\sum_{x=u_x-w_x}^{u_x+w_x}\;\sum_{y=u_y-w_y}^{u_y+w_y}\bigl(I(x,y)-J(x+d_x,\,y+d_y)\bigr)^{2} \qquad (3)$$

where I and J are corresponding picture elements between adjacent frames; d is the optical flow to be calculated, with $d_x$ and $d_y$ its components in the x and y directions; $u_x$ and $u_y$ are the positions of the feature point in picture element I in the x and y directions; $w_x$ and $w_y$ are the half window widths of picture element I in the x and y directions; I(x, y) is the image intensity of picture element I at (x, y); and $J(x+d_x,\,y+d_y)$ is the image intensity of picture element J at $(x+d_x,\,y+d_y)$.
After the tracking residual ξ (d) of the tracked pixel I is obtained by formula (3), the optical flow can be iteratively calculated by adopting a gradient descent method.
Ideally, the first order differential of the residual function ξ (d) with respect to the optical flow d to be calculated should be zero, as shown in equation (4).
$$\frac{\partial \xi(d)}{\partial d}=\begin{bmatrix}0 & 0\end{bmatrix} \qquad (4)$$
In the specific calculation, the value of $\frac{\partial \xi(d)}{\partial d}$ can be computed by formula (5):

$$\frac{\partial \xi(d)}{\partial d}=-2\sum_{x=u_x-w_x}^{u_x+w_x}\;\sum_{y=u_y-w_y}^{u_y+w_y}\bigl(I(x,y)-J(x+d_x,\,y+d_y)\bigr)\cdot\begin{bmatrix}\dfrac{\partial J}{\partial x} & \dfrac{\partial J}{\partial y}\end{bmatrix} \qquad (5)$$
Applying a first-order Taylor expansion to $J(x+d_x,\,y+d_y)$ gives the result shown in equation (6):

$$\frac{\partial \xi(d)}{\partial d}\approx-2\sum_{x=u_x-w_x}^{u_x+w_x}\;\sum_{y=u_y-w_y}^{u_y+w_y}\left(I(x,y)-J(x,y)-\begin{bmatrix}\dfrac{\partial J}{\partial x} & \dfrac{\partial J}{\partial y}\end{bmatrix}d\right)\cdot\begin{bmatrix}\dfrac{\partial J}{\partial x} & \dfrac{\partial J}{\partial y}\end{bmatrix} \qquad (6)$$
The matrix $\begin{bmatrix}\frac{\partial J}{\partial x} & \frac{\partial J}{\partial y}\end{bmatrix}$ represents the image gradient vector and can be written as formula (7):

$$\nabla I=\begin{bmatrix}I_x\\ I_y\end{bmatrix}=\begin{bmatrix}\dfrac{\partial J}{\partial x} & \dfrac{\partial J}{\partial y}\end{bmatrix}^{T} \qquad (7)$$

where $\forall (x,y)\in[u_x-w_x,\,u_x+w_x]\times[u_y-w_y,\,u_y+w_y]$.
Let $\delta I(x,y)=I(x,y)-J(x,y)$ denote the temporal difference of the image, and let $I_x(x,y)=\frac{\partial I(x,y)}{\partial x}$ and $I_y(x,y)=\frac{\partial I(x,y)}{\partial y}$ denote the spatial derivatives of the image in the x and y directions respectively.
In order to reduce the amount of computation in the iterative optical flow process, after the image is decomposed to a certain pyramid level the image motion between adjacent levels becomes sufficiently small; at that point $\frac{\partial I}{\partial x}$ and $\frac{\partial I}{\partial y}$ can be used in place of $\frac{\partial J}{\partial x}$ and $\frac{\partial J}{\partial y}$, and this substitution satisfies the assumptions of optical flow.
Based on the above analysis, equation (6) can be rewritten as equation (8).
$$\frac{1}{2}\frac{\partial \xi(d)}{\partial d}\approx\sum_{x=u_x-w_x}^{u_x+w_x}\;\sum_{y=u_y-w_y}^{u_y+w_y}\bigl(\nabla I^{T}d-\delta I\bigr)\nabla I^{T} \qquad (8)$$
Based on equation (8), equation (9) can be obtained.
$$\frac{1}{2}\left[\frac{\partial \xi(d)}{\partial d}\right]^{T}\approx\sum_{x=u_x-w_x}^{u_x+w_x}\;\sum_{y=u_y-w_y}^{u_y+w_y}\left(\begin{bmatrix}I_x^{2} & I_xI_y\\ I_xI_y & I_y^{2}\end{bmatrix}d-\begin{bmatrix}I_x\,\delta I\\ I_y\,\delta I\end{bmatrix}\right) \qquad (9)$$
Then, letting

$$G=\sum_{x=u_x-w_x}^{u_x+w_x}\;\sum_{y=u_y-w_y}^{u_y+w_y}\begin{bmatrix}I_x^{2} & I_xI_y\\ I_xI_y & I_y^{2}\end{bmatrix},\qquad b=\sum_{x=u_x-w_x}^{u_x+w_x}\;\sum_{y=u_y-w_y}^{u_y+w_y}\begin{bmatrix}I_x\,\delta I\\ I_y\,\delta I\end{bmatrix},$$

equation (9) can be rewritten as equation (10):

$$\frac{1}{2}\left[\frac{\partial \xi(d)}{\partial d}\right]^{T}\approx Gd-b \qquad (10)$$
The ideal optical flow vector $d_{opt}$ can then be obtained from formula (10), as shown in equation (11):

$$d_{opt}=G^{-1}b \qquad (11)$$
In actual calculation, if an accurate solution of the optical flow is to be obtained, iterative calculation is required, that is, iterative calculation is performed using equation (12).
$$\eta_k=G^{-1}b_k \qquad (12)$$

where G is the Hessian matrix, $b_k$ is the gradient-weighted residual vector at the k-th iteration, and $\eta_k$ is the residual optical flow at the k-th iteration.
After the residual optical flow $\eta_k$ of the k-th iteration is obtained, the estimated optical flow at the k-th iteration can be obtained by equation (13):

$$v_k=v_{k-1}+\eta_k \qquad (13)$$

where $v_k$ is the estimated optical flow at the k-th iteration, $v_{k-1}$ is the estimated optical flow after the (k−1)-th iteration, and $\eta_k$ is the residual optical flow at the k-th iteration.
After a number of iterations, once the convergence condition or the preset number of iterations is reached, the optical flow d is obtained as shown in equation (14):

$$d=\bar{v}^{\,k} \qquad (14)$$

where k is the preset number of iterations or the number of iterations at which the convergence condition is reached, and $\bar{v}^{\,k}$ is the optical flow value calculated when the number of iterations reaches k.
Equation (14) gives the optical flow obtained by iterating multiple times on a single-scale image. To track large motions in complex scenes, coarse-to-fine motion estimation is performed using an image pyramid: based on the single-scale method above, the iterative calculation is first carried out on the coarsest-scale image, the result is then substituted into the next finer scale as the starting point for its iterations, and so on, until the final optical flow result $d_{last}$ is obtained by formula (15):

$$d_{last}=\sum_{L=0}^{L_m}2^{L}d^{L} \qquad (15)$$

where L is the level of the image pyramid, $L\in[0,L_m]$, $L_m$ is the highest level of the image pyramid, L = 0 denotes the original image, and $d^{L}$ is the optical flow result computed at level L.
After the optical flow of the current tracking pixel is obtained by formula (15), the position of the tracked pixel can be obtained, that is, the position of the tracked feature point can be determined.
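The single-scale iteration of equations (11) to (13) can be sketched in a few lines of NumPy; this is a simplified illustration (integer warping, window assumed to stay inside the image) rather than the patent's implementation, and the function name and default values are assumptions.

```python
import numpy as np

def lk_flow_single_scale(I, J, u, half_win=7, n_iters=20, eps=1e-2):
    """Iterative Lucas-Kanade flow for one feature point at a single scale.

    I, J     : consecutive grayscale frames as 2-D arrays
    u        : (ux, uy) integer feature position in I
    half_win : half window width (wx = wy)
    """
    ux, uy = u
    ys = slice(uy - half_win, uy + half_win + 1)
    xs = slice(ux - half_win, ux + half_win + 1)

    I = np.asarray(I, dtype=np.float64)
    J = np.asarray(J, dtype=np.float64)
    Iy, Ix = np.gradient(I)                      # spatial gradients of I (substituted for J's)
    Ix_w, Iy_w = Ix[ys, xs], Iy[ys, xs]

    # G of equation (10): 2x2 matrix accumulated over the window
    G = np.array([[np.sum(Ix_w * Ix_w), np.sum(Ix_w * Iy_w)],
                  [np.sum(Ix_w * Iy_w), np.sum(Iy_w * Iy_w)]])
    G_inv = np.linalg.inv(G)

    v = np.zeros(2)                              # estimated optical flow, refined per iteration
    for _ in range(n_iters):
        dx, dy = int(round(v[0])), int(round(v[1]))          # crude integer warp of the window
        J_w = J[uy - half_win + dy: uy + half_win + 1 + dy,
                ux - half_win + dx: ux + half_win + 1 + dx]
        dI = I[ys, xs] - J_w                     # temporal difference (delta I)
        b_k = np.array([np.sum(Ix_w * dI), np.sum(Iy_w * dI)])
        eta = G_inv @ b_k                        # residual flow, equation (12)
        v += eta                                 # equation (13)
        if np.linalg.norm(eta) < eps:            # convergence condition
            break
    return v                                     # flow d for this scale
```

In the full pyramid scheme, this single-scale solver would be run at each level and the per-level results combined as in formula (15).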
Step S204 is executed after step S203, and it is determined whether all the feature points in the tracking window have been tracked. If yes, go to step S206; otherwise, step S205 is executed to select the next feature point for tracking.
In step S205, one feature point that has not been tracked is selected from the feature points included in the tracking window. When a feature point is selected for tracking in the tracking window, it can be marked to indicate that it has been tracked. Then in step S205, an unmarked feature point can be selected from the feature points contained in the tracking window for tracking. The method for selecting the feature points in the tracking window may be performed in various ways, for example, the feature points may be selected in a random manner.
After step S205, the process returns to step S203, and the selected feature points are continuously tracked.
Until all the feature points in the tracking window have been tracked, that is, when the determination result of step S204 is yes, step S206 is executed.
In step S206, a median feature point is calculated.
After the feature points contained in the tracking window have been tracked, the distances from each tracked feature point to all the other tracked feature points are calculated and summed. This gives, for every tracked feature point, the sum of its distances to all the other tracked feature points, and the feature point with the smallest distance sum is taken as the median feature point.
And step S207 is executed to determine a preset area based on the median feature point.
In this embodiment, a square tracking window and a circular preset area are taken as an example: the preset area may be a circular region centered on the median feature point, and the radius of the circular region may be one half of the side length of the tracking window.
After all the feature points in the tracking window have been tracked, one tracked feature point is selected based on the tracking results obtained in step S203 and the preset area determined in step S207, and step S208 is executed to judge whether the currently tracked feature point is outside the range of the preset area.
If so, step S209 is performed, otherwise step S211 is performed.
Step S209, determining the current feature point as the designated feature point, and repositioning the designated feature point.
When the tracked feature point is outside the preset area, the feature point can be determined to be a designated feature point and needs to be repositioned.
In the present embodiment, the specified feature point can be relocated by the formula (16).
N=R×M+(1-R)×N (16)
In the formula (16), N on the right of the equal sign represents a coordinate value of the specified feature point before updating (i.e., repositioning), N on the left of the equal sign represents a coordinate value obtained after the specified feature point is updated by the formula (16), R is an update coefficient, the value range of R is a numerical value between 0 and 1, and M is a coordinate value of the median feature point.
The position of the designated feature point, that is, the coordinate value of the designated feature point, may be determined in step S203, and in the process of calculating the median feature point in step S206, the position of the median feature point, that is, the coordinate value of the median feature point, may be determined.
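A minimal sketch (an assumed helper, not the patent's code) of the update in formula (16); the example update coefficient R = 0.5 is an assumption, since the description only requires R to lie between 0 and 1.

```python
import numpy as np

def reposition(designated_pt, median_pt, R=0.5):
    """Pull a designated (out-of-region) feature point toward the median feature point
    using formula (16): N = R * M + (1 - R) * N."""
    N = np.asarray(designated_pt, dtype=np.float64)
    M = np.asarray(median_pt, dtype=np.float64)
    return R * M + (1.0 - R) * N
```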
The currently tracked feature points may be repositioned in step S209, and then step S210 is executed to determine whether all the tracked feature points in the tracking window are determined, that is, whether all the tracked feature points in the tracking window are determined to be outside the preset area range. If yes, go to step S212; otherwise, step S211 is executed to select the next tracked feature point.
Step S211, selecting a next tracked feature point.
In step S211, a feature point that is not determined whether it is outside the preset area is selected from the tracked feature points in the tracking window. When a tracked feature point is selected in the tracking window for judgment, the feature point can be marked to indicate that the feature point is judged. Then, when step S211 is executed, an unmarked feature point may be selected from the tracked feature points included in the tracking window for determining whether the unmarked feature point is outside the preset area. The method for selecting the tracked feature points in the tracking window may be performed by various methods, for example, the feature points may be selected by a random selection method.
After step S211, the process returns to step S208, and it is determined whether the selected feature point is outside the preset area range.
Until all the tracked feature points in the tracking window are judged whether to be outside the preset area, that is, when the judgment result of step S210 is yes, step S212 is executed, and the gesture of the user is recognized based on the positions of the feature points in the tracking window.
After all the feature points in the tracking window have been tracked, the positions of the feature points in the tracking window can be determined from the tracking result. From the positions of the feature points before and after tracking, their position-change information, motion-direction-change information and the like can be obtained; based on this information the gesture change of the user can be derived, and the gesture of the user can then be recognized using gesture recognition techniques in the prior art.
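The patent leaves the actual gesture recognition to prior-art techniques; as a purely illustrative cue (not the patent's method), the displacement of the tracked points before and after tracking can be summarized as follows, with the function name an assumption.

```python
import numpy as np

def dominant_motion(pts_before, pts_after):
    """Mean displacement and direction (degrees) of the tracked feature points,
    a simple cue that could feed a swipe-style gesture classifier."""
    disp = np.asarray(pts_after, dtype=np.float64) - np.asarray(pts_before, dtype=np.float64)
    mean_d = disp.mean(axis=0)
    direction_deg = np.degrees(np.arctan2(mean_d[1], mean_d[0]))
    return mean_d, direction_deg
```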
It should be noted that, in this embodiment, when it is determined that a tracked feature point is a designated feature point, the designated feature point is repositioned, then a next tracked feature point is selected, whether the tracked feature point is outside a preset area is determined, that is, whether the tracked feature point is the designated feature point is determined, if the tracked feature point is outside the preset area, the designated feature point is repositioned, and so on until corresponding operations are performed on all tracked feature points. In other embodiments, it may also be possible to sequentially determine whether all the tracked feature points are designated feature points, and after all the designated feature points are determined, sequentially reposition all the designated feature points, where a specific manner is not limited herein.
It should be noted that, in this embodiment, the feature points in the tracking window of the current frame are selected and tracked one by one in the manner described above. After all the feature points have been tracked, the median feature point and the preset area are determined from the tracking results of all the feature points; it is then determined whether any tracked feature points lie outside the preset area, those feature points are repositioned, and the gesture of the user is recognized based on the repositioned feature points. In other embodiments, the median feature point and the preset area may instead be determined from the feature point tracking result of the previous frame: when performing feature tracking on the current frame, one feature point in the tracking window of the current frame is selected and tracked, and it is determined whether the tracked feature point lies outside the preset area determined from the previous frame's tracking result; if so, it is repositioned; otherwise another feature point in the tracking window of the current frame is selected, tracked and judged in the same way, and so on, until all the feature points have been processed.
In this embodiment, in the process of tracking the feature points, the accuracy of the feature points and the accuracy of the tracking result can be improved by repositioning the tracking points which may be wrong.
Example two
In this embodiment, in order to cope with more general illumination changes, after the feature points contained in the tracking window are acquired and before the iterative optical flow calculation, illumination compensation may be performed on the picture element where each feature point is located. The feature points are then tracked based on the image pyramid-based sparse optical flow algorithm, and during tracking the designated feature points are repositioned by the aggregation method.
Fig. 3 is a schematic flowchart of the optical flow-based feature tracking method provided in this embodiment, and as shown in fig. 3, step S301 is first executed to determine a tracking window of an image. Please refer to step S201.
And executing step S302 to obtain the characteristic points in the tracking window.
All the feature points in the tracking window may be obtained based on the Shi-Tomasi corner algorithm; please refer to step S202 in embodiment one.
And executing step S303, and performing illumination compensation on the pixels where all the characteristic points in the tracking window are located.
And before tracking the pixels through an optical flow iterative algorithm, performing illumination compensation on the pixels where the feature points are located.
In this embodiment, illumination compensation is performed using a linear transformation involving a gain and a bias. After the gain coefficient and the bias coefficient are determined, illumination compensation is applied by formula (17) to the picture element where each feature point contained in the tracking window is located.
$$J_n=\lambda\times J+\delta \qquad (17)$$

where λ is the gain coefficient of the luminance of the feature point, δ is the bias coefficient of the luminance of the feature point, J is the luminance value of the feature point before compensation, and $J_n$ is the luminance value after compensation.
In this embodiment the optical flow is calculated over the picture element where a feature point is located, so the parameters in formula (17) can also be understood as follows: λ is the gain coefficient of the brightness of the picture element where the feature point is located, δ is the bias coefficient of that brightness, J is the brightness value of the picture element before compensation, and $J_n$ is the brightness value after compensation.
The gain amplifies the brightness and the bias raises or lowers the brightness value; the gain coefficient and the bias coefficient of the brightness of the picture element where a feature point is located can be obtained by various methods known to those skilled in the art, under the condition that J and $J_n$ have the same mean and variance.
In other embodiments, other illumination compensation methods may also be used to perform illumination compensation on the pixel where the feature point is located, which is not limited herein.
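As one possible sketch of formula (17), the gain and bias can be chosen so that the compensated picture element matches the mean and variance of a reference patch from the other frame; this particular choice of λ and δ is an assumption, and other ways of fixing them are equally admissible.

```python
import numpy as np

def compensate_patch(J_patch, I_patch):
    """Linear illumination compensation Jn = lambda * J + delta (formula (17)),
    with gain/bias chosen to match the reference patch's mean and variance."""
    J = np.asarray(J_patch, dtype=np.float64)
    I_ref = np.asarray(I_patch, dtype=np.float64)
    lam = I_ref.std() / (J.std() + 1e-12)     # gain coefficient lambda
    delta = I_ref.mean() - lam * J.mean()     # bias coefficient delta
    return lam * J + delta                    # compensated brightness Jn
```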
After the pixels of all the feature points contained in the tracking window are subjected to illumination compensation, step S304 is executed, and the feature points in the tracking window are tracked based on a sparse optical flow algorithm.
Step S305 is executed after step S304, and it is determined whether all the feature points in the tracking window have been tracked. If yes, go to step S307; otherwise, step S306 is executed, and the next feature point is selected for tracking.
After step S306, the process returns to step S304, and the selected feature points are continuously tracked.
Until all the feature points in the tracking window have been tracked, that is, when the determination result of step S305 is yes, step S307 is executed.
Step S307, a median feature point is calculated.
After the feature points contained in the tracking window have been tracked, the distances from each tracked feature point to all the other tracked feature points are calculated and summed. This gives, for every tracked feature point, the sum of its distances to all the other tracked feature points, and the feature point with the smallest distance sum is taken as the median feature point.
Step S308 is executed to determine a preset area based on the median feature point.
In this embodiment, the preset area is again described as a circular region: it may be a circular region centered on the median feature point, and the radius of the circular region may be one half of the side length of the tracking window.
After all the feature points in the tracking window have been tracked, one tracked feature point is selected based on the tracking results obtained in step S304 and the preset area determined in step S308, and step S309 is executed to judge whether the currently tracked feature point is outside the range of the preset area.
If the judgment result of step S309 is yes, step S310 is performed; otherwise, step S312 is performed.
Step S310, the currently tracked feature point is determined to be a specified feature point, and the specified feature point is repositioned.
After the specified feature point has been repositioned in step S310, step S311 is executed to determine whether all the tracked feature points in the tracking window have been checked, that is, whether each tracked feature point has been judged against the preset area. If so, go to step S313; otherwise, step S312 is executed to select the next tracked feature point.
Step S312, selecting a next tracked feature point.
After step S312, the process returns to step S309, and determines whether the selected feature point is outside the preset area range.
Once every tracked feature point in the tracking window has been checked against the preset area, that is, when the judgment result of step S311 is yes, step S313 is executed, and the gesture of the user is recognized based on the positions of the feature points in the tracking window.
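The following sketch combines steps S308 to S312 with the repositioning formula N = R × M + (1 − R) × N given in the claims, assuming a circular preset area with a radius equal to half the tracking-window side length; the update coefficient R = 0.5 is only an example value within the claimed range of 0 to 1.

```python
import numpy as np

def reposition_outliers(tracked_pts, median_pt, window_side, R=0.5):
    """Steps S308-S312: points outside the circular preset area centred
    on the median feature point (radius = half the window side length)
    are repositioned with N = R * M + (1 - R) * N."""
    pts = np.asarray(tracked_pts, dtype=np.float32)
    median_pt = np.asarray(median_pt, dtype=np.float32)
    radius = window_side / 2.0
    outside = np.linalg.norm(pts - median_pt, axis=1) > radius  # outside the preset area?
    pts[outside] = R * median_pt + (1.0 - R) * pts[outside]     # pull outliers toward the median point
    return pts
```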
For the details of steps S304 to S313, please refer to steps S203 to S212 of the foregoing embodiment.
After all the feature points in the tracking window have been tracked, information such as the position change and the movement-direction change of the feature points before and after tracking can be obtained from the tracking result. Gesture change information of the user can be derived from this information, and the gesture of the user can then be recognized using existing gesture recognition techniques.
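As a hedged illustration of how the position and movement-direction changes might be summarized for a downstream gesture recognizer (the patent leaves the recognition step to prior-art techniques), the sketch below averages the per-point displacements and reports an overall direction angle; the function name and the averaging choice are assumptions.

```python
import numpy as np

def summarize_motion(pts_before, pts_after):
    """Average the per-point displacements into one motion vector and a
    movement-direction angle (in degrees) for gesture recognition."""
    disp = np.asarray(pts_after, np.float32) - np.asarray(pts_before, np.float32)
    mean_disp = disp.mean(axis=0)                                 # overall position change
    angle = np.degrees(np.arctan2(mean_disp[1], mean_disp[0]))    # overall movement direction
    return mean_disp, angle
```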
In this embodiment, before the feature points are tracked based on the optical flow algorithm, illumination compensation is applied to the pixels where the feature points are located, so that images captured under different illumination conditions can be effectively adjusted and the accuracy and stability of feature point tracking under different illumination conditions are improved.
Although the present invention is disclosed above, the present invention is not limited thereto. Various changes and modifications may be effected therein by one skilled in the art without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (13)

1. An optical flow-based feature tracking method, comprising:
acquiring characteristic points contained in a tracking window;
tracking the feature points based on a sparse optical flow algorithm;
and when a tracked feature point is located outside a preset area, repositioning the tracked feature point, wherein the preset area is an area centered on a median feature point, and the median feature point is the tracked feature point with the smallest sum of distances to all other tracked feature points.
2. The optical flow-based feature tracking method according to claim 1, wherein the process of obtaining the feature points included in the tracking window comprises:
obtaining an autocorrelation matrix of all pixel points in a tracking window of an image by the following formula:
M(x, y) = \begin{bmatrix} \sum_{-K \le i,j \le K} w_{i,j} I_x^2 & \sum_{-K \le i,j \le K} w_{i,j} I_x I_y \\ \sum_{-K \le i,j \le K} w_{i,j} I_x I_y & \sum_{-K \le i,j \le K} w_{i,j} I_y^2 \end{bmatrix}, where M(x, y) denotes the autocorrelation matrix of the pixel point with coordinates (x, y); i and j are the index values of a pixel point in the tracking window in the X direction and the Y direction respectively; w_{i,j} is the weight of the pixel point whose index value in the X direction is i and whose index value in the Y direction is j; K is the half-width value of the tracking window; and I_x and I_y are respectively the partial derivative values in the X direction and in the Y direction of the pixel point whose index value in the X direction is i and whose index value in the Y direction is j;
obtaining the maximum eigenvalue and the minimum eigenvalue of the autocorrelation matrix of each pixel point based on that autocorrelation matrix;
and when λmin > A × λmax, determining the pixel point to be a feature point contained in the tracking window, wherein λmax is the maximum eigenvalue of the autocorrelation matrix of the pixel point, λmin is the minimum eigenvalue of the autocorrelation matrix of the pixel point, and A is a feature threshold.
3. The optical flow-based feature tracking method according to claim 2, wherein the feature threshold ranges from 0.001 to 0.01.
4. The optical flow-based feature tracking method of claim 1, further comprising: after the characteristic points contained in the tracking window are obtained, illumination compensation is carried out on the characteristic points before the characteristic points are tracked based on a sparse optical flow algorithm.
5. The optical flow-based feature tracking method of claim 4, wherein the illumination compensating the feature points comprises:
performing illumination compensation on the feature points contained in the tracking window based on the formula Jn = λ × J + δ, where λ is the gain coefficient of the luminance of the feature point, δ is the bias coefficient of the luminance of the feature point, J is the luminance value of the feature point before compensation, and Jn is the luminance value of the feature point after compensation.
6. The optical flow-based feature tracking method of claim 1, wherein the process of repositioning the tracked feature points comprises:
and repositioning the tracked feature point according to the formula N = R × M + (1 − R) × N, wherein N is the coordinate value of the tracked feature point, R is an update coefficient with a value between 0 and 1, and M is the coordinate value of the median feature point.
7. The optical flow-based feature tracking method according to claim 1, wherein the preset area is a circular area centered on the median feature point and having a radius equal to half the side length of the tracking window.
8. The optical flow-based feature tracking method of claim 1, wherein the sparse optical flow algorithm is an image pyramid optical flow algorithm.
9. The optical flow-based feature tracking method of claim 1, further comprising: and after the tracked feature points are repositioned, recognizing the gesture of the user based on the tracking result of the feature points in the tracking window.
10. An optical flow-based feature tracking apparatus, comprising:
the acquisition unit is suitable for acquiring the characteristic points contained in the tracking window;
a tracking unit adapted to track the feature points based on a sparse optical flow algorithm;
and the repositioning unit is adapted to reposition a tracked feature point when the tracked feature point is located outside a preset area, wherein the preset area is an area centered on a median feature point, and the median feature point is the tracked feature point with the smallest sum of distances to all other tracked feature points.
11. The optical flow-based feature tracking apparatus of claim 10 further comprising: the compensation unit is suitable for performing illumination compensation on the feature points after the feature points of the image are acquired and before the feature points are tracked based on a sparse optical flow algorithm.
12. The optical flow-based feature tracking device according to claim 10, wherein the repositioning unit repositions the tracked feature point according to a formula N = R × M + (1-R) × N, where N is a coordinate value of the tracked feature point, R is an update coefficient, R has a value ranging from 0 to 1, and M is a coordinate value of the median feature point.
13. The optical flow-based feature tracking apparatus of claim 10 further comprising: and the recognition unit is suitable for recognizing the gesture of the user based on the tracking result of the characteristic points in the tracking window after the tracked characteristic points are repositioned.
CN201310529938.8A 2013-10-31 2013-10-31 A kind of characteristic tracking method and device based on light stream Active CN104599286B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310529938.8A CN104599286B (en) 2013-10-31 2013-10-31 A kind of characteristic tracking method and device based on light stream

Publications (2)

Publication Number Publication Date
CN104599286A true CN104599286A (en) 2015-05-06
CN104599286B CN104599286B (en) 2018-11-16

Family

ID=53125036

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310529938.8A Active CN104599286B (en) 2013-10-31 2013-10-31 A kind of characteristic tracking method and device based on light stream

Country Status (1)

Country Link
CN (1) CN104599286B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20130015982A (en) * 2011-08-05 2013-02-14 엘지전자 주식회사 Apparatus and method for tracking car
US20130272570A1 (en) * 2012-04-16 2013-10-17 Qualcomm Incorporated Robust and efficient learning object tracker
CN103136526A (en) * 2013-03-01 2013-06-05 西北工业大学 Online target tracking method based on multi-source image feature fusion
CN103325108A (en) * 2013-05-27 2013-09-25 浙江大学 Method for designing monocular vision odometer with light stream method and feature point matching method integrated

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
杨渊波: "视频序列中人脸检测光流跟踪技术研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106204645A (en) * 2016-06-30 2016-12-07 南京航空航天大学 Multi-object tracking method
CN106204484A (en) * 2016-07-11 2016-12-07 徐州工程学院 A kind of traffic target tracking based on light stream and local invariant feature
CN107169458A (en) * 2017-05-18 2017-09-15 深圳云天励飞技术有限公司 Data processing method, device and storage medium
CN108428249A (en) * 2018-01-30 2018-08-21 哈尔滨工业大学深圳研究生院 A kind of initial position and orientation estimation method based on optical flow tracking and double geometrical models
CN108961308B (en) * 2018-06-01 2021-07-02 南京信息工程大学 Residual error depth characteristic target tracking method for drift detection
CN108961308A (en) * 2018-06-01 2018-12-07 南京信息工程大学 A kind of residual error depth characteristic method for tracking target of drift detection
CN109598744A (en) * 2018-11-29 2019-04-09 广州市百果园信息技术有限公司 A kind of method, apparatus of video tracking, equipment and storage medium
CN110322477B (en) * 2019-06-10 2022-01-04 广州视源电子科技股份有限公司 Feature point observation window setting method, tracking method, device, equipment and medium
CN110322477A (en) * 2019-06-10 2019-10-11 广州视源电子科技股份有限公司 Feature point observation window setting method, tracking method, device, equipment and medium
CN111047626A (en) * 2019-12-26 2020-04-21 深圳云天励飞技术有限公司 Target tracking method and device, electronic equipment and storage medium
CN111047626B (en) * 2019-12-26 2024-03-22 深圳云天励飞技术有限公司 Target tracking method, device, electronic equipment and storage medium
CN111402294A (en) * 2020-03-10 2020-07-10 腾讯科技(深圳)有限公司 Target tracking method, target tracking device, computer-readable storage medium and computer equipment
CN111402294B (en) * 2020-03-10 2022-10-18 腾讯科技(深圳)有限公司 Target tracking method, target tracking device, computer-readable storage medium and computer equipment
CN113496505A (en) * 2020-04-03 2021-10-12 广州极飞科技股份有限公司 Image registration method and device, multispectral camera, unmanned equipment and storage medium
CN113496505B (en) * 2020-04-03 2022-11-08 广州极飞科技股份有限公司 Image registration method and device, multispectral camera, unmanned equipment and storage medium
CN111609868A (en) * 2020-05-29 2020-09-01 电子科技大学 Visual inertial odometer method based on improved optical flow method

Also Published As

Publication number Publication date
CN104599286B (en) 2018-11-16

Similar Documents

Publication Publication Date Title
CN104599286B (en) A kind of characteristic tracking method and device based on light stream
EP2858008B1 (en) Target detecting method and system
KR101837407B1 (en) Apparatus and method for image-based target tracking
CN106228162B (en) A kind of quick object identification method of mobile robot based on deep learning
CN107369166B (en) Target tracking method and system based on multi-resolution neural network
CN107689052B (en) Visual target tracking method based on multi-model fusion and structured depth features
CN110120065B (en) Target tracking method and system based on hierarchical convolution characteristics and scale self-adaptive kernel correlation filtering
CN104881029B (en) Mobile Robotics Navigation method based on a point RANSAC and FAST algorithms
CN104599288A (en) Skin color template based feature tracking method and device
CN107424161B (en) Coarse-to-fine indoor scene image layout estimation method
US20160162738A1 (en) Object tracking device, object tracking method, and object tracking program
CN103106667A (en) Motion target tracing method towards shielding and scene change
CN105787901A (en) A multi-scale velocity field measurement method for adjacent two frames in a sun high-resolution image sequence
CN104933738A (en) Visual saliency map generation method based on local structure detection and contrast
CN104867133A (en) Quick stepped stereo matching method
CN107622507B (en) Air target tracking method based on deep learning
Yam et al. Effective bi-directional people flow counting for real time surveillance system
CN104616324A (en) Target tracking method based on adaptive appearance model and point-set distance metric learning
CN106780567B (en) Immune particle filter extension target tracking method fusing color histogram and gradient histogram
CN108921170B (en) Effective image noise detection and denoising method and system
CN110706208A (en) Infrared dim target detection method based on tensor mean square minimum error
KR20110023468A (en) Apparatus and method for detecting and tracking object based on adaptive background
CN103268482A (en) Low-complexity gesture extracting and gesture depth acquiring method
CN112164093A (en) Automatic person tracking method based on edge features and related filtering
CN104200434A (en) Non-local mean image denoising method based on noise variance estimation

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant