CN115104125A - Optical flow acquisition method and device - Google Patents
- Publication number
- CN115104125A (CN202080096767.2A)
- Authority
- CN
- China
- Prior art keywords
- pixel block
- optical flow
- similarity
- frame image
- gradient
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/223—Analysis of motion using block-matching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/269—Analysis of motion using gradient-based methods
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
Abstract
The application discloses an optical flow acquisition method and device, relating to the field of video image processing, which aim to reduce the time consumed by iterative computation when calculating optical flow. The optical flow acquisition method comprises: determining a first similarity between a first pixel block in the i-1 th frame image and a second pixel block in the i-th frame image; performing at least one of the following two processes: determining a second similarity between the first pixel block and a third pixel block in the i-th frame image, wherein the coordinates of the third pixel block are obtained from the historical optical flow of the first pixel block from the i-2 th frame image to the i-1 th frame image; determining a third similarity between the first pixel block and a fourth pixel block in the i-th frame image, wherein the coordinates of the fourth pixel block are obtained from the optical flow of a pixel block adjacent to the first pixel block; and obtaining a target optical flow of the first pixel block from the i-1 th frame image to the i-th frame image according to the first similarity, the gradient information of the first pixel block, and at least one of the second similarity and the third similarity.
Description
The present disclosure relates to the field of video image processing, and in particular, to a method and an apparatus for acquiring an optical flow.
Optical flow is a concept used in detecting the motion of objects in a field of view; it describes the apparent motion of an observed object, surface, or edge caused by motion relative to an observer. An optical flow algorithm infers the speed and direction of an object's motion by detecting how the intensity of image pixels changes over time. Such algorithms are widely used in image processing, for example for motion detection and motion focus tracking.
In essence, an optical flow algorithm iteratively solves an optimization problem. A corresponding energy function is constructed with the brightness consistency assumption and the motion consistency assumption as core ideas, and the optimal solution of the energy function is approached step by step through a large number of iterations. Because the initial values used in the iteration are not controlled, the iterative calculation is time-consuming.
Disclosure of Invention
The embodiment of the application provides an optical flow acquisition method and device, which are used for reducing the time consumption of iterative computation in the process of computing optical flow.
In order to achieve the above purpose, the embodiment of the present application adopts the following technical solutions:
In a first aspect, an optical flow acquisition method is provided, including: determining a first similarity between a first pixel block in the i-1 th frame image and a second pixel block in the i-th frame image, wherein the coordinates of the second pixel block are the same as the coordinates of the first pixel block, and i is a positive integer; performing at least one of the following two processes: determining a second similarity between the first pixel block and a third pixel block in the i-th frame image, wherein the coordinates of the third pixel block are obtained from the historical optical flow of the first pixel block from the i-2 th frame image to the i-1 th frame image; determining a third similarity between the first pixel block and a fourth pixel block in the i-th frame image, wherein the coordinates of the fourth pixel block are obtained from the optical flows of pixel blocks adjacent to the first pixel block; and obtaining a target optical flow of the first pixel block from the i-1 th frame image to the i-th frame image according to the first similarity, the gradient information of the first pixel block, and at least one of the second similarity and the third similarity.
According to the optical flow acquisition method provided by the embodiment of the application, the first similarity between the first pixel block in the i-1 th frame image and the second pixel block in the i-th frame image is determined; because the coordinates of the second pixel block are the same as those of the first pixel block, this reflects the case in which the first pixel block remains stationary. The second similarity between the first pixel block in the i-1 th frame image and the third pixel block in the i-th frame image is determined; because the coordinates of the third pixel block are obtained from the historical optical flow of the first pixel block from the i-2 th frame image to the i-1 th frame image, this reflects the temporal consistency between that historical optical flow and the optical flow of the first pixel block from the i-1 th frame image to the i-th frame image. The third similarity between the first pixel block and the fourth pixel block in the i-th frame image is determined; because the coordinates of the fourth pixel block are obtained from the optical flows of pixel blocks adjacent to the first pixel block, this reflects the spatial consistency between the optical flow of the first pixel block and those of its adjacent pixel blocks. These similarities are used to choose a better initial optical flow value for the iterative computation, which reduces the number of iterations. A lightweight optical flow algorithm can thus be realized that converges quickly with few iterations.
In one possible embodiment, the coordinates of the third block of pixels are equal to the coordinates of the first block of pixels plus the historical optical flow of the first block of pixels from the i-2 th frame image to the i-1 th frame. The size of the third pixel block is the same as the size of the first pixel block.
In a possible embodiment, the coordinates of the fourth block of pixels are equal to the coordinates of the first block of pixels plus the optical flow of the blocks of pixels adjacent to the first block of pixels. The size of the fourth pixel block is the same as the size of the first pixel block.
In a possible embodiment, obtaining the target optical flow of the first pixel block from the i-1 th frame image to the i-th frame image according to the first similarity, the gradient information of the first pixel block, and at least one of the second similarity and the third similarity includes: determining the highest similarity among the first similarity and at least one of the second similarity and the third similarity, and selecting the optical flow corresponding to the highest similarity as the initial optical flow value; and, in combination with the gradient information of the first pixel block, performing an approximate Gauss-Newton gradient descent iterative solution starting from the initial optical flow value, and taking the iteration result obtained when the iteration exits because an exit condition is met as the target optical flow. That is, choosing the initial optical flow value by highest similarity reduces the number of iterations.
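The iterative solution described above can be illustrated with a textbook Gauss-Newton update for a pure translation. This is only a minimal sketch under stated assumptions, not the approximate solver of the application: the names `sample_block` and `refine_flow`, the bilinear sampling, the central-difference gradients (the application uses a Sobel operator), and the exit condition (step length below `eps` or `max_iter` reached) are all hypothetical choices.

```python
import numpy as np

def sample_block(img, top, left, size):
    """Bilinearly sample a size x size block whose top-left corner is (left, top)."""
    ys = top + np.arange(size)
    xs = left + np.arange(size)
    y0 = np.clip(np.floor(ys).astype(int), 0, img.shape[0] - 2)
    x0 = np.clip(np.floor(xs).astype(int), 0, img.shape[1] - 2)
    fy = (ys - y0)[:, None]
    fx = (xs - x0)[None, :]
    a = img[np.ix_(y0, x0)]
    b = img[np.ix_(y0, x0 + 1)]
    c = img[np.ix_(y0 + 1, x0)]
    d = img[np.ix_(y0 + 1, x0 + 1)]
    return (1 - fy) * ((1 - fx) * a + fx * b) + fy * ((1 - fx) * c + fx * d)

def refine_flow(prev, curr, x, y, size, p0, max_iter=10, eps=1e-3):
    """Refine an initial flow p0 = (dx, dy) for the block at (x, y) in `prev`.

    Assumes the block does not touch the image border and has enough
    texture for the 2x2 system to be solvable.
    """
    prev = prev.astype(float)
    curr = curr.astype(float)
    template = prev[y:y + size, x:x + size]
    # Central-difference gradients of the template (hypothetical stand-in for Sobel).
    gy, gx = np.gradient(prev[y - 1:y + size + 1, x - 1:x + size + 1])
    gx, gy = gx[1:-1, 1:-1], gy[1:-1, 1:-1]
    hessian = np.array([[np.sum(gx * gx), np.sum(gx * gy)],
                        [np.sum(gx * gy), np.sum(gy * gy)]])
    p = np.array(p0, dtype=float)
    for _ in range(max_iter):
        residual = sample_block(curr, y + p[1], x + p[0], size) - template
        rhs = np.array([np.sum(gx * residual), np.sum(gy * residual)])
        dp = np.linalg.solve(hessian, rhs)
        p -= dp  # the template gradients are reused, so the step is subtracted
        if np.hypot(dp[0], dp[1]) < eps:
            break
    residual = sample_block(curr, y + p[1], x + p[0], size) - template
    return p, float(np.sum(residual ** 2))  # refined flow and its SSD energy
```

In a gradient-flat block the 2x2 matrix becomes nearly singular and the update is unreliable, which is in line with the large errors the text attributes to gradient-flat regions.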
In a possible embodiment, obtaining the target optical flow of the first pixel block from the i-1 th frame image to the i-th frame image according to the first similarity, the gradient information of the first pixel block, and at least one of the second similarity and the third similarity includes: determining at least two highest similarities among the first similarity and at least one of the second similarity and the third similarity, and selecting the optical flows respectively corresponding to the at least two highest similarities as at least two initial optical flow values; and, in combination with the gradient information of the first pixel block, performing an approximate Gauss-Newton gradient descent iterative solution starting from each of the at least two initial optical flow values with the same exit condition, and, after the iterations exit because the exit condition is met, selecting the iteration result with the smallest energy function value as the target optical flow. That is, several initial optical flow values with the highest similarities are selected first, and the optical flow is then selected from among their iteration results.
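The selection logic of this embodiment can be sketched as follows. It is an illustration only: `top_k_candidates`, `best_refined_flow`, and the `refine` callable are hypothetical names, and the sketch assumes a SAD- or SSD-type measure, for which the highest similarity is the smallest value.

```python
def top_k_candidates(similarity_by_flow, k=2):
    """Return the k candidate flows with the smallest SAD/SSD value, best first."""
    return sorted(similarity_by_flow, key=similarity_by_flow.get)[:k]

def best_refined_flow(candidates, refine):
    """Refine each candidate with the same solver and keep the lowest-energy result.

    `refine` is any callable mapping an initial flow to (refined_flow, energy).
    """
    results = [refine(flow) for flow in candidates]
    return min(results, key=lambda result: result[1])
```

With `k=1` this reduces to the single-initial-value embodiment; larger `k` trades extra iterations for a lower chance of ending in a poor local minimum.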
In one possible embodiment, the gradient information includes a sum of X-direction gradient values, a sum of squares of X-direction gradient values, a sum of Y-direction gradient values, a sum of squares of Y-direction gradient values, and a sum of products of X-direction and Y-direction gradient values, and the method further includes: convolving each pixel point of the first pixel block with an X-direction Sobel operator to obtain an X-direction gradient matrix, and convolving each pixel point of the first pixel block with a Y-direction Sobel operator to obtain a Y-direction gradient matrix; summing all gradient values of the X-direction gradient matrix to obtain the sum of the X-direction gradient values, and squaring each gradient value of the X-direction gradient matrix and then summing to obtain the sum of squares of the X-direction gradient values; summing all gradient values of the Y-direction gradient matrix to obtain the sum of the Y-direction gradient values, and squaring each gradient value of the Y-direction gradient matrix and then summing to obtain the sum of squares of the Y-direction gradient values; and multiplying the gradient values at the same positions of the X-direction gradient matrix and the Y-direction gradient matrix and then summing to obtain the sum of products of the X-direction and Y-direction gradient values.
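The five sums can be computed as in the following sketch. It assumes the standard 3x3 Sobel kernels, applies them as a correlation, and replicates the block's edge pixels for padding; the kernel orientation, the padding mode, and the names `sobel_gradients` and `gradient_sums` are assumptions, since the text does not fix them.

```python
import numpy as np

def sobel_gradients(block):
    """Return X- and Y-direction Sobel responses with the same shape as the block."""
    p = np.pad(block.astype(float), 1, mode="edge")  # assumed border handling
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    return gx, gy

def gradient_sums(block):
    """Accumulate the five gradient statistics listed in the text."""
    gx, gy = sobel_gradients(block)
    return {
        "sum_gx": float(np.sum(gx)),
        "sum_gx2": float(np.sum(gx * gx)),
        "sum_gy": float(np.sum(gy)),
        "sum_gy2": float(np.sum(gy * gy)),
        "sum_gxgy": float(np.sum(gx * gy)),
    }
```

Because these sums depend only on the first pixel block, they can be computed once per block and reused across all iterations.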
In one possible embodiment, the method further comprises: determining a fourth similarity between the first pixel block and a fifth pixel block in the i-th frame image, wherein the coordinates of the fifth pixel block are obtained from the target optical flow; and determining the confidence of the target optical flow according to the first similarity and the fourth similarity. Limited by the principle of gradient-based optical flow algorithms, large errors easily arise in gradient-flat regions. The application therefore adds a confidence mechanism: an optical flow with a larger error has lower confidence, and an optical flow with a smaller error has higher confidence. The user may select optical flows with high confidence.
In one possible implementation, determining the confidence level of the target optical flow according to the first similarity and the fourth similarity includes: determining the target optical flow to be of low confidence if the ratio of the numerical value of the fourth similarity to the numerical value of the first similarity is greater than a first threshold; otherwise, if the numerical value of the fourth similarity is larger than the second threshold, determining that the target optical flow is low in confidence; otherwise, the target optical flow is determined to be of high confidence.
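The two-threshold decision above can be written as a small function. This is a sketch assuming a SAD- or SSD-type measure (smaller value means more similar); the function name and the threshold values used in the example are hypothetical, and the ratio test is written as a multiplication so that a first similarity of zero does not cause a division error.

```python
def classify_confidence(first_similarity, fourth_similarity,
                        ratio_threshold, value_threshold):
    """Return "low" or "high" following the two-stage threshold test."""
    # Stage 1: fourth/first ratio above the first threshold means low confidence.
    if fourth_similarity > ratio_threshold * first_similarity:
        return "low"
    # Stage 2: an absolute fourth similarity above the second threshold is also low.
    if fourth_similarity > value_threshold:
        return "low"
    return "high"
```

For example, with a ratio threshold of 1.2 and a value threshold of 500, a block whose fourth similarity is 110 against a first similarity of 100 is rated high, whereas 150 against 100 is rated low.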
In one possible embodiment, the method further comprises: an intermediate image is inserted between the i-1 th frame image and the i-th frame image according to the target optical flow. Due to the limitation of the computing power of the processor, the rendering rate of the image is different from the display frame rate, and the time domain super-resolution frame interpolation function can be realized by means of the result of the optical flow acquisition method.
In a second aspect, there is provided an optical flow acquisition apparatus including: a determining module, configured to determine a first similarity between a first pixel block in the i-1 th frame image and a second pixel block in the i-th frame image, wherein the coordinates of the second pixel block are the same as those of the first pixel block, and i is a positive integer; the determining module being further configured to perform at least one of the following two processes: determining a second similarity between the first pixel block and a third pixel block in the i-th frame image, wherein the coordinates of the third pixel block are obtained from the historical optical flow of the first pixel block from the i-2 th frame image to the i-1 th frame image; determining a third similarity between the first pixel block and a fourth pixel block in the i-th frame image, wherein the coordinates of the fourth pixel block are obtained from the optical flows of pixel blocks adjacent to the first pixel block; and an acquisition module, configured to obtain a target optical flow of the first pixel block from the i-1 th frame image to the i-th frame image according to the first similarity, the gradient information of the first pixel block, and at least one of the second similarity and the third similarity.
In one possible embodiment, the coordinates of the third block of pixels are equal to the coordinates of the first block of pixels plus the historical optical flow of the first block of pixels from the i-2 th frame image to the i-1 th frame.
In a possible implementation, the coordinates of the fourth block of pixels are equal to the coordinates of the first block of pixels plus the optical flow of the adjacent block of pixels of the first block of pixels.
In a possible implementation, the acquisition module is specifically configured to: determine the highest similarity among the first similarity and at least one of the second similarity and the third similarity, and select the optical flow corresponding to the highest similarity as the initial optical flow value; and, in combination with the gradient information of the first pixel block, perform an approximate Gauss-Newton gradient descent iterative solution starting from the initial optical flow value, and take the iteration result obtained when the iteration exits because an exit condition is met as the target optical flow.
In a possible implementation, the acquisition module is specifically configured to: determine at least two highest similarities among the first similarity and at least one of the second similarity and the third similarity, and select the optical flows respectively corresponding to the at least two highest similarities as at least two initial optical flow values; and, in combination with the gradient information of the first pixel block, perform an approximate Gauss-Newton gradient descent iterative solution starting from each of the at least two initial optical flow values with the same exit condition, and, after the iterations exit because the exit condition is met, select the iteration result with the smallest energy function value as the target optical flow.
In a possible implementation, the gradient information includes a sum of X-direction gradient values, a sum of squares of X-direction gradient values, a sum of Y-direction gradient values, a sum of squares of Y-direction gradient values, and a sum of products of X-direction and Y-direction gradient values, and the acquisition module is further configured to: convolve each pixel point of the first pixel block with an X-direction Sobel operator to obtain an X-direction gradient matrix, and convolve each pixel point of the first pixel block with a Y-direction Sobel operator to obtain a Y-direction gradient matrix; sum all gradient values of the X-direction gradient matrix to obtain the sum of the X-direction gradient values, and square each gradient value of the X-direction gradient matrix and then sum to obtain the sum of squares of the X-direction gradient values; sum all gradient values of the Y-direction gradient matrix to obtain the sum of the Y-direction gradient values, and square each gradient value of the Y-direction gradient matrix and then sum to obtain the sum of squares of the Y-direction gradient values; and multiply the gradient values at the same positions of the X-direction gradient matrix and the Y-direction gradient matrix and then sum to obtain the sum of products of the X-direction and Y-direction gradient values.
In one possible embodiment, the determining module is further configured to: determining a fourth similarity between the first pixel block and a fifth pixel block in the ith frame of image, wherein the coordinates of the fifth pixel block are obtained by the target optical flow; and determining the confidence of the target optical flow according to the first similarity and the fourth similarity.
In a possible implementation, the determining module is specifically configured to: determining the target optical flow to be of low confidence if the ratio of the numerical value of the fourth similarity to the numerical value of the first similarity is greater than a first threshold; otherwise, if the numerical value of the fourth similarity is larger than the second threshold, determining that the target optical flow is low in confidence; otherwise, the target optical flow is determined to be of high confidence.
In a possible implementation, the optical flow acquisition apparatus further includes a frame interpolation module, configured to insert an intermediate image between the i-1 th frame image and the i-th frame image according to the target optical flow.
In a third aspect, an optical flow obtaining apparatus is provided, including a processor and a memory, wherein: the memory has stored therein computer instructions that are executed by the processor to implement the method of the first aspect and possible implementations thereof.
In a fourth aspect, a computer-readable storage medium is provided, in which computer instructions are stored, which, when run on a computer or a processor, cause the computer or the processor to perform the method of the first aspect and its possible implementations.
In a fifth aspect, there is provided a computer program product comprising instructions which, when executed on a computer or processor, cause the computer or processor to carry out the method of the first aspect and possible implementations thereof.
The technical effects of the second to fifth aspects may be as described with reference to various possible implementations of the first aspect.
Fig. 1 is a schematic flowchart of an optical flow acquisition method according to an embodiment of the present application;
Fig. 2 is a schematic diagram of a pixel block according to an embodiment of the present application;
Fig. 3 is a schematic diagram of optical flow according to an embodiment of the present application;
Fig. 4 is a schematic diagram of a second pixel block according to an embodiment of the present application;
Fig. 5 is a schematic diagram of a third pixel block according to an embodiment of the present application;
Fig. 6 is a schematic diagram of a pixel block adjacent to a first pixel block according to an embodiment of the present application;
Fig. 7 is a schematic diagram of a fourth pixel block according to an embodiment of the present application;
Fig. 8 is a schematic diagram of another fourth pixel block according to an embodiment of the present application;
Fig. 9 is a schematic diagram of a Sobel operator according to an embodiment of the present application;
Fig. 10 is a schematic diagram of an approximate Gauss-Newton gradient descent iterative solution according to an embodiment of the present application;
Fig. 11 is a schematic flowchart of another optical flow acquisition method according to an embodiment of the present application;
Fig. 12 is a schematic flowchart of another optical flow acquisition method according to an embodiment of the present application;
Fig. 13 is a schematic flowchart of another optical flow acquisition method according to an embodiment of the present application;
Fig. 14 is a schematic flowchart of a further optical flow acquisition method according to an embodiment of the present application;
Fig. 15 is a schematic structural diagram of an optical flow acquisition apparatus according to an embodiment of the present application;
Fig. 16 is a schematic structural diagram of another optical flow acquisition apparatus according to an embodiment of the present application.
In the field of video image processing, especially for terminal products, high requirements are placed on the performance and effect of an optical flow algorithm, and it is generally required to acquire full-image optical flow information with high accuracy and good stability at a high speed. Therefore, it is a better solution to implement a lightweight optical flow algorithm by an integrated circuit. However, the conventional optical flow algorithm has a characteristic that a large number of iterative computations are required, so that the improvement of the frame rate is limited, and in general, the lightweight optical flow algorithm also has the problems of large gradient flat area error and poor stability.
For example, for a large number of iterative calculations, each image block is calculated independently, the initial values used are not controlled, and if a poor initial value is used, a large number of iterative calculations are required to gradually approach a better solution. For poor stability, the optical flows of two adjacent frames of images are calculated independently from the two frames of images, and the correlation between the two frames of images is not fully utilized. For the larger error of the gradient flat area, the gradient flat area is difficult to converge and the error is usually larger under the restriction of the current algorithm principle.
The application provides an optical flow acquisition method and device. The first similarity between a first pixel block in the i-1 th frame image and a second pixel block in the i-th frame image is determined; because the coordinates of the second pixel block are the same as those of the first pixel block, this reflects the case in which the first pixel block remains stationary. The second similarity between the first pixel block in the i-1 th frame image and a third pixel block in the i-th frame image is determined; because the coordinates of the third pixel block are obtained from the historical optical flow of the first pixel block from the i-2 th frame image to the i-1 th frame image, this reflects the temporal consistency between that historical optical flow and the optical flow of the first pixel block from the i-1 th frame image to the i-th frame image. The third similarity between the first pixel block and a fourth pixel block in the i-th frame image is determined; because the coordinates of the fourth pixel block are obtained from the optical flows of pixel blocks adjacent to the first pixel block, this reflects the spatial consistency between the optical flow of the first pixel block and those of its adjacent pixel blocks. These similarities are used to choose a better initial optical flow value for the iterative computation, which reduces the number of iterations. A lightweight optical flow algorithm can thus be realized that converges quickly with few iterations.
As shown in fig. 1, an optical flow acquiring method provided in the embodiments of the present application includes steps S101 to S104, where at least one of steps S102 and S103 is performed.
S101, determining a first similarity between a first pixel block in the i-1 th frame image and a second pixel block in the i-th frame image.
In the embodiment of the application, the i-th frame image is the current frame image in a video stream, the i-1 th frame image is the previous frame image (which may also be referred to as a reference frame image), the i-2 th frame image is the frame image two frames before the current frame image, and so on. i is a positive integer.
The image is subjected to grid division to obtain pixel blocks (which may also be referred to as unit blocks), each pixel block includes at least four pixel points (i.e., 2 pixels by 2 pixels), the pixel blocks may be rectangular or square, and the size of the pixel blocks is configurable, for example, 6 pixels by 6 pixels, 8 pixels by 8 pixels, 10 pixels by 10 pixels, 12 pixels by 12 pixels, 14 pixels by 14 pixels, and the like.
Adjacent pixel blocks share at least one common pixel point. For example, as shown in fig. 2, taking a pixel block size of 3 pixels by 3 pixels as an example, each of the pairs A1 and A2, A1 and A3, A1 and A4, A2 and A3, A2 and A4, and A3 and A4 shares common pixel points. The significance of adjacent pixel blocks sharing common pixel points is as follows: if a common pixel point belongs to several pixel blocks, the calculation result for the pixel region formed by the common pixel points can be obtained as a weighted average of the calculation results of the pixel blocks to which it belongs, which better ensures robustness. If the pixel blocks do not overlap, the calculation result of any pixel region is determined solely by the calculation result of a single pixel block, so there is a risk of large individual deviations.
As shown in fig. 2, when a plane coordinate system is established for the image, the pixel point at the upper left corner of the image is generally used as the origin, with the X direction pointing right and the Y direction pointing down. When an object moves, the luminance pattern of its corresponding points in adjacent images also moves, and this apparent motion of the image luminance pattern is the optical flow. Specifically, the optical flow represents the motion speed and motion direction of each pixel between two adjacent frame images. For example, as shown in fig. 3, assume that the optical flow of an object (e.g., a first pixel block) in the i-1 th frame image is (2,3). This means that, relative to its position in the i-1 th frame image, the object has moved 2 pixel points to the right and 3 pixel points down in the i-th frame image; the optical flow (2,3) therefore reflects the offset (displacement) of the same object between two adjacent frame images.
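The averaging over overlapping pixel blocks can be sketched as follows. The text does not specify the weights, so this illustration assumes equal weights for every block covering a pixel point; the function name `blend_block_flows` and the (x, y) block coordinate convention are likewise assumptions.

```python
import numpy as np

def blend_block_flows(height, width, blocks, size):
    """Average per-block flows into a per-pixel flow field of shape (H, W, 2).

    `blocks` is a list of ((x, y), (flow_x, flow_y)) entries, where (x, y)
    is the top-left pixel of a size x size block.
    """
    acc = np.zeros((height, width, 2))
    weight = np.zeros((height, width, 1))
    for (x, y), flow in blocks:
        acc[y:y + size, x:x + size] += np.asarray(flow, dtype=float)
        weight[y:y + size, x:x + size] += 1.0  # assumed equal weighting
    return acc / np.maximum(weight, 1.0)  # uncovered pixels stay at zero
```

A pixel point covered by two blocks with flows (2,0) and (4,0) receives (3,0), so a single outlier block no longer fully determines the result there.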
In the embodiment of the application, the coordinates of the pixel point at the upper left corner of the pixel block are used as the coordinates of the pixel block. Of course, the coordinates of the pixel point in the center of the pixel block may also be used as the coordinates of the pixel block, and the application is not limited.
When the optical flow is calculated, for example, after a row of pixel blocks is sequentially calculated from the pixel block at the upper left corner to the right, the sequential calculation is continued from the pixel block at the leftmost side of the next row to the right. For example, for the example of FIG. 2, the pixel blocks are calculated in the order A1, A2, A3, A4.
The coordinates (for example, the coordinates of the pixel point at the upper left corner) of the second pixel block in the i-th frame image are the same as the coordinates (for example, the coordinates of the pixel point at the upper left corner) of the first pixel block in the i-1 th frame image, and the size of the second pixel block is the same as that of the first pixel block. That is, the coordinates of the first pixel block are used as the coordinates of the second pixel block in the i-th frame image, and a region of the same size as the first pixel block is selected at those coordinates in the i-th frame image as the second pixel block.
The significance of this step is that the target may be stationary in the two frame images, i.e. the pixel blocks at the same position in the two frame images may be the same, and the corresponding optical flow is (0, 0).
Exemplarily, as shown in fig. 4, taking a first pixel block size of 3 pixels by 3 pixels as an example, the coordinates of the first pixel block in the i-1 th frame image are (1,2), the coordinates of the second pixel block in the i-th frame image are (1,2), and the second pixel block has the same size as the first pixel block, namely 3 pixels by 3 pixels.
In image processing, indices for evaluating the similarity between images include, but are not limited to, the sum of absolute differences (SAD), the sum of squared differences (SSD), and normalized cross-correlation (NCC); the application is not limited in this respect. For SAD and SSD, a larger value indicates less similarity and a smaller value indicates more similarity. For NCC, the value is a decimal between 0 and 1; a larger value (close to 1) indicates more similarity, and a smaller value (close to 0) indicates less similarity.
Illustratively, taking SSD as an example, the similarity between two pixel blocks is given by formula 1:
$$\mathrm{SSD}(p) = \sum_{x} \left[ I(W(x;p)) - T(x) \right]^{2} \qquad (1)$$
wherein x is a coordinate in the i-1 th frame image; p is the optical flow (in the first iteration, p is the initial optical flow value); W(x; p) represents the coordinate x plus the optical flow p; I(W(x; p)) is the pixel value at coordinate W(x; p) in the i-th frame image; T(x) is the pixel value at coordinate x in the i-1 th frame image; and I(W(x; p)) - T(x) is the difference between the two pixel values. The sum runs over the coordinates x of the first pixel block.
Specifically, the similarity of the luminance between the pixel blocks may be calculated, or the similarity of the pixel values between the pixel blocks may be calculated, which is not limited in this application.
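The three indices mentioned above can be sketched for two equally sized pixel blocks as follows. The function names are illustrative, and the NCC shown is the plain (not mean-subtracted) form, chosen here as an assumption because for non-negative pixel values it stays within the 0 to 1 range described in the text.

```python
import numpy as np

def sad(block_a, block_b):
    """Sum of absolute differences: smaller means more similar."""
    return float(np.sum(np.abs(block_a.astype(float) - block_b.astype(float))))

def ssd(block_a, block_b):
    """Sum of squared differences: smaller means more similar."""
    diff = block_a.astype(float) - block_b.astype(float)
    return float(np.sum(diff * diff))

def ncc(block_a, block_b):
    """Normalized cross-correlation: closer to 1 means more similar."""
    a = block_a.astype(float)
    b = block_b.astype(float)
    denom = np.sqrt(np.sum(a * a) * np.sum(b * b))
    return float(np.sum(a * b) / denom) if denom > 0 else 0.0
```

The inputs may be luminance blocks or blocks of raw pixel values; the conversion to float avoids overflow when the blocks are stored as 8-bit integers.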
S102, determining a second similarity between the first pixel block and a third pixel block in the ith frame image.
And the coordinates of the third pixel block are obtained from the historical optical flow of the first pixel block from the i-2 frame image to the i-1 frame. Each historical optical flow corresponds to a third pixel block, the coordinates of the third pixel block (for example, the coordinates of the pixel point at the upper left corner) are equal to the coordinates of the first pixel block (for example, the coordinates of the pixel point at the upper left corner) plus the historical optical flow of the first pixel block, and the size of the third pixel block is the same as that of the first pixel block. That is, the historical optical flow of the first pixel block is added with the coordinate of the first pixel block (for example, the coordinate of the pixel point at the upper left corner) to be the coordinate of the third pixel block in the ith frame image (for example, the coordinate of the pixel point at the upper left corner), and the area with the same size as the first pixel block is selected as the third pixel block in the ith frame image based on the coordinate.
Because objects in the real world move consistently, in most cases the current optical flow information can be inferred from historical optical flow information, or the historical optical flow information can serve as a reference for calculating the current optical flow information. This may also be referred to as temporal coherence.
The historical optical flow of the first pixel block can be obtained when the optical flow is calculated iteratively on the (i-1)th frame image, or by filtering estimation (such as linear filtering) over the optical flow in the (i-2)th frame image and the optical flow in the (i-1)th frame image. When both types of historical optical flow are used at the same time, there are two third pixel blocks and two second similarities.
Illustratively, as shown in fig. 5, the coordinates of the first pixel block in the i-1 th frame image are (1,2), the historical optical flow of the first pixel block is (2,3), the coordinates of the third pixel block in the i-th frame image are (3,5), and the size of the third pixel block is the same as the size of the first pixel block, and both are 3 pixels by 3 pixels.
Regarding the similarity between two pixel blocks, see step S101, which is not repeated here.
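The coordinate arithmetic of step S102 can be sketched as follows. The function name `displaced` and the second historical flow value (2, 4) are hypothetical; only the first flow (2, 3) and the block coordinates (1, 2) come from the example of fig. 5.

```python
def displaced(coord, flow):
    """Upper-left coordinate of a candidate block: block coordinate plus optical flow."""
    return (coord[0] + flow[0], coord[1] + flow[1])


first_block_coord = (1, 2)

# Two historical flows for the same block: one obtained while iterating on the
# (i-1)th frame image, one from a filtering estimate (the second value is made up).
historical_flows = [(2, 3), (2, 4)]

# One third pixel block per historical flow.
third_block_coords = [displaced(first_block_coord, f) for f in historical_flows]
```

Each resulting coordinate identifies one third pixel block, and one second similarity is then computed per block as in step S101.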
S103, determining a third similarity between the first pixel block and a fourth pixel block in the ith frame image.
The coordinates of the fourth pixel block are derived from the optical flows of the adjacent pixel blocks of the first pixel block. Each current optical flow corresponds to one fourth pixel block: the coordinates of the fourth pixel block are equal to the coordinates of the first pixel block plus the current optical flow, and the size of the fourth pixel block is the same as that of the first pixel block. That is, the current optical flow of an adjacent pixel block of the first pixel block is added to the coordinates of the first pixel block (for example, the coordinates of its upper-left pixel) to give the coordinates of the fourth pixel block in the ith frame image (for example, the coordinates of its upper-left pixel), and a region of the same size as the first pixel block is selected at those coordinates in the ith frame image as the fourth pixel block.
Since the rigid body motion has consistency, the iteration initial value can be optimized using the current optical flow information of the adjacent pixel blocks as a reference. This may also be referred to as spatial coherence. The current optical flow refers to an optical flow obtained when an optical flow is iteratively calculated for the i-th frame image.
As described above, the embodiment of the present application calculates the optical flows of the pixel blocks in the order from left to right and from top to bottom, and therefore, the optical flows of the pixel blocks above and left of the first pixel block are known. If the optical flows of the pixel blocks are calculated in the reverse order, the optical flows of the pixel blocks below and to the right of the first pixel block are known.
Illustratively, as shown in FIG. 6, each box represents a block of pixels, and the overlapping portions between adjacent blocks of pixels are not shown for clarity of illustration. The adjacent pixel blocks of the first pixel block include a first-order adjacent pixel block which means a pixel block which has obtained an optical flow next to the first pixel block and a second-order adjacent pixel block which is a pixel block which is separated from the first pixel block by one pixel block and has obtained an optical flow. The first-order adjacent pixel blocks are adopted to participate in the calculation, so that the calculation amount can be reduced, and the second-order adjacent pixel blocks are adopted to participate in the calculation, so that the accuracy can be improved.
Exemplarily, as shown in fig. 7, the coordinates of the first pixel block in the (i-1)th frame image are (1,2), the first pixel block has two first-order adjacent pixel blocks (with coordinates (1,1) and (2,1), respectively), and the current optical flows of both first-order adjacent pixel blocks are (2,3). In this case there is one fourth pixel block in the ith frame image, its coordinates are (3,5), and its size is the same as that of the first pixel block, namely 3 pixels by 3 pixels. Illustratively, as shown in fig. 8, if the current optical flows of the two first-order adjacent pixel blocks are (1,4) and (2,3), respectively, there are two fourth pixel blocks in the ith frame image, with coordinates (2,6) and (3,5), respectively.
Regarding the similarity between two pixel blocks, see step S101, it is not repeated here.
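The collection of neighbor flows for step S103 might look like the sketch below. It assumes a left-to-right, top-to-bottom processing order and treats the left, upper-left, upper, and upper-right blocks as the first-order adjacent pixel blocks; that neighbor set, the `flow_grid` layout (indexed by block row and block column, with `None` for blocks not yet processed), and the function name are illustrative assumptions, since fig. 6 is not reproduced here.

```python
def neighbor_flows(flow_grid, col, row):
    """Distinct current flows of already-processed first-order neighbors."""
    offsets = [(-1, 0), (-1, -1), (0, -1), (1, -1)]  # left, upper-left, up, upper-right
    flows = []
    for dc, dr in offsets:
        c, r = col + dc, row + dr
        if 0 <= r < len(flow_grid) and 0 <= c < len(flow_grid[r]):
            flow = flow_grid[r][c]
            # Identical flows would yield the same fourth pixel block, so keep one copy.
            if flow is not None and flow not in flows:
                flows.append(flow)
    return flows


flow_grid = [[(2, 3), (1, 4), (2, 3)],
             [(2, 3), None, None],
             [None, None, None]]

flows = neighbor_flows(flow_grid, col=1, row=1)

first_block_coord = (1, 2)  # pixel coordinate of the first pixel block
fourth_block_coords = [(first_block_coord[0] + fx, first_block_coord[1] + fy)
                       for fx, fy in flows]
```

Removing duplicate flows mirrors the case of fig. 7, where two neighbors with the same optical flow produce a single fourth pixel block.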
S104, obtaining a target optical flow of the first pixel block from the i-1 th frame image to the i-th frame image according to the first similarity and the gradient information of the first pixel block and at least one of the second similarity and the third similarity.
If the image is regarded as a two-dimensional discrete function, the gradient information is the derivative of the two-dimensional discrete function, and the degree of difference between the edge of the object in the image and the background is characterized.
The gradient information includes a sum of X-direction gradient values, a sum of squares of X-direction gradient values, a sum of Y-direction gradient values, a sum of squares of Y-direction gradient values, and a sum of products of X-direction and Y-direction gradient values.
How the gradient information of the pixel block is acquired is described below.
Firstly, each pixel point of a first pixel block is convoluted with an X-direction Sobel operator to obtain an X-direction gradient matrix, and each pixel point of the first pixel block is convoluted with a Y-direction Sobel operator to obtain a Y-direction gradient matrix. The size of the X-direction gradient matrix and the size of the Y-direction gradient matrix are the same as the size of the pixel blocks, namely, each pixel point corresponds to one X-direction gradient value in the X-direction gradient matrix and also corresponds to one Y-direction gradient value in the Y-direction gradient matrix.
Illustratively, as shown in fig. 9, the size of each pixel block A is 4 pixels by 4 pixels, and the size of the Sobel operator is 3 pixels by 3 pixels, where C0 and C1 are preset convolution kernel coefficients. The 4 × 4 = 16 pixel points of the pixel block are traversed in sequence, and the convolution of each pixel point with the Sobel operator is calculated. If the Sobel operator is an X-direction Sobel operator, an X-direction gradient matrix is obtained, and if the Sobel operator is a Y-direction Sobel operator, a Y-direction gradient matrix is obtained.
Then, all gradient values of the X-direction gradient matrix are accumulated to obtain the sum of the X-direction gradient values, and each gradient value of the X-direction gradient matrix is squared and the squares are accumulated to obtain the sum of squares of the X-direction gradient values; all gradient values of the Y-direction gradient matrix are accumulated to obtain the sum of the Y-direction gradient values, and each gradient value of the Y-direction gradient matrix is squared and the squares are accumulated to obtain the sum of squares of the Y-direction gradient values; and the gradient values at the same positions of the X-direction gradient matrix and the Y-direction gradient matrix are multiplied and the products are accumulated to obtain the sum of products of the X-direction and Y-direction gradient values. These accumulated sums are collectively referred to as the gradient information of the pixel block.
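The accumulation described above can be sketched in Python as follows. The kernel coefficients (1 and 2, the common Sobel values) stand in for the preset coefficients C0 and C1, the kernels are applied without flipping, and pixels outside the image are clamped to the border; all three choices, and the function names, are assumptions of this sketch rather than details stated in the application.

```python
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_at(image, x, y, kernel):
    """Apply a 3x3 kernel centered at column x, row y, clamping at the image border."""
    height, width = len(image), len(image[0])
    total = 0
    for ky in range(3):
        for kx in range(3):
            yy = min(max(y + ky - 1, 0), height - 1)
            xx = min(max(x + kx - 1, 0), width - 1)
            total += kernel[ky][kx] * image[yy][xx]
    return total


def gradient_info(image, x0, y0, size):
    """Five accumulated sums for the block with upper-left pixel (x0, y0)."""
    info = {"sum_x": 0, "sum_xx": 0, "sum_y": 0, "sum_yy": 0, "sum_xy": 0}
    for y in range(y0, y0 + size):
        for x in range(x0, x0 + size):
            gx = sobel_at(image, x, y, SOBEL_X)
            gy = sobel_at(image, x, y, SOBEL_Y)
            info["sum_x"] += gx
            info["sum_xx"] += gx * gx
            info["sum_y"] += gy
            info["sum_yy"] += gy * gy
            info["sum_xy"] += gx * gy
    return info


# Horizontal ramp: intensity grows by 10 per column and is constant along rows.
ramp = [[10 * x for x in range(6)] for _ in range(6)]
info = gradient_info(ramp, 1, 1, 4)
```

On the horizontal ramp every pixel of the 4 by 4 block has an X-direction gradient of 80 and a Y-direction gradient of 0, so only the two X-direction sums are non-zero.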
In one possible implementation, the highest similarity (that is, the lowest similarity value) may be determined from the first similarity and at least one of the second similarity and the third similarity, and the optical flow corresponding to the highest similarity may be selected as the initial value of the optical flow. In combination with the gradient information of the first pixel block, an approximate Gauss-Newton gradient descent iterative solution is performed on the initial value of the optical flow, and the iteration result obtained when the iteration exits because an exit condition is met is taken as the target optical flow.
In another possible implementation, at least two highest similarities may be determined from the first similarity and at least one of the second similarity and the third similarity, and the optical flows corresponding to the at least two highest similarities may be selected as at least two initial values of the optical flow. In combination with the gradient information of the first pixel block, an approximate Gauss-Newton gradient descent iterative solution is performed on each of the at least two initial values with the same exit condition, and when the iterations exit because the exit condition is met, the iteration result with the minimum energy function is selected as the target optical flow.
The above implementation, in which the same exit condition is set for all initial values, may of course also be combined with other ways of determining the initial values of the optical flow, which is not limited in this application. For example:
A highest similarity may be determined from the second similarities and a highest similarity may be determined from the third similarities, and the optical flows corresponding to these two highest similarities together with the optical flow corresponding to the first similarity may be used as three initial values of the optical flow.
A highest similarity may be determined from the second similarities and the first similarity and a highest similarity may be determined from the third similarities, and the optical flows corresponding to these two highest similarities may be used as two initial values of the optical flow.
A highest similarity may be determined from the second similarities and a highest similarity may be determined from the third similarities and the first similarity, and the optical flows corresponding to these two highest similarities may be used as two initial values of the optical flow. And so on.
In the process of determining the highest similarity, a threshold value may be combined, and the similarity smaller than the threshold value may be retained.
The present application also does not limit the number of the highest similarities, and for example, optical flows corresponding to a plurality of superior similarities may be selected as the initial values of the optical flows.
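One way to picture the selection of initial values is the sketch below, which ranks candidate optical flows by their similarity value (lower is better) and optionally discards candidates at or above a threshold. The function name, the candidate values, and the threshold are hypothetical; the zero flow stands for the first similarity, since the second pixel block has the same coordinates as the first pixel block.

```python
def pick_initial_flows(candidates, count=1, threshold=None):
    """candidates: list of (flow, similarity_value); return the best `count` flows."""
    kept = [c for c in candidates if threshold is None or c[1] < threshold]
    kept.sort(key=lambda c: c[1])  # smallest value first = highest similarity first
    return [flow for flow, _ in kept[:count]]


candidates = [
    ((0, 0), 5400),  # first similarity: same position, zero flow
    ((2, 3), 120),   # second similarity: historical flow
    ((1, 4), 300),   # third similarity: flow of an adjacent pixel block
]

single = pick_initial_flows(candidates)
best_two = pick_initial_flows(candidates, count=2)
filtered = pick_initial_flows(candidates, count=3, threshold=1000)
```

Each returned flow is then refined separately by the iterative solution, and with several initial values the result with the minimum energy function is kept.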
The following describes the process of performing the approximate Gauss-Newton gradient descent iterative solution on an initial value of the optical flow in combination with the gradient information of the first pixel block.
As shown in fig. 10, the initial value of the optical flow and the gradient information of the first pixel block may be substituted into formula 2 (or formula 3) to perform the approximate Gauss-Newton gradient descent iterative solution. While the exit condition is not satisfied, the result Δp calculated by formula 2 (or formula 3) is applied as negative feedback to the optical flow F (that is, F = F - Δp), and the updated optical flow is substituted into formula 2 (or formula 3) again. Once the exit condition is satisfied, the optical flow that was substituted into formula 2 (or formula 3) on exit is the iteration result.
In formula 2, H is a Hessian matrix obtained from the gradient information of the first pixel block; its size is 2 × 2, where H(0,0) is the sum of squares of the X-direction gradient values, H(0,1) is the sum of products of the X-direction and Y-direction gradient values, and H(1,1) is the sum of squares of the Y-direction gradient values. Formula 2 further contains a first gradient vector of the first pixel block in the (i-1)th frame image, which includes an X-direction gradient value and a Y-direction gradient value, that is, (X-direction gradient value, Y-direction gradient value), and a Jacobian matrix. x is a coordinate in the (i-1)th frame image; p is the optical flow (in the first iteration, p is the initial value of the optical flow); W(x; p) represents the coordinate x plus the optical flow p; I(W(x; p)) is the pixel value of the pixel block at coordinate W(x; p) in the ith frame image; T(x) is the pixel value of the pixel block at coordinate x in the (i-1)th frame image; and I(W(x; p)) - T(x) is the difference between the two pixel values.
Normalizing part of formula 2, in order to improve the robustness of the optical flow acquisition method against regional brightness changes, yields formula 3.
In formula 3, n is the number of pixels in the first pixel block, and B is a second gradient vector of the first pixel block in the (i-1)th frame image, which includes the sum of the X-direction gradient values and the sum of the Y-direction gradient values, that is, (sum of X-direction gradient values, sum of Y-direction gradient values).
Accordingly, the normalization processing can be performed on the formula 1 to improve the anti-interference capability of the optical flow acquisition method for the brightness change of the area, so as to obtain a formula 4.
Wherein n is the number of pixel points in the first pixel block.
The exit conditions include, but are not limited to: a preset number of iterations is reached, the iteration result does not meet the convergence condition, the iteration result exceeds a preset threshold, and the like.
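The iteration with negative feedback F = F - Δp can be illustrated with the sketch below. It assumes the standard Gauss-Newton update for a pure translation, Δp = H⁻¹ Σ ∇Tᵀ [I(W(x; p)) - T(x)], which is consistent with the symbol definitions above but is not a reproduction of formula 2; the normalized variant of formula 3 is not covered. To avoid sub-pixel interpolation, the two frames are modelled as hypothetical functions of real-valued coordinates, and the gradient of the (i-1)th frame is given analytically instead of by a Sobel operator.

```python
def solve_2x2(h, b):
    """Solve H * dp = b for a symmetric 2x2 matrix given as (hxx, hxy, hyy)."""
    hxx, hxy, hyy = h
    det = hxx * hyy - hxy * hxy
    if abs(det) < 1e-12:
        return None  # flat gradient region: no unique solution
    return ((hyy * b[0] - hxy * b[1]) / det,
            (hxx * b[1] - hxy * b[0]) / det)


def refine_flow(template, image, grad, points, flow, max_iters=20, eps=1e-9):
    """Refine an initial flow; exits on iteration count or a negligible update."""
    # The Hessian depends only on the (i-1)th frame block, so it is built once.
    hxx = sum(grad(x, y)[0] ** 2 for x, y in points)
    hxy = sum(grad(x, y)[0] * grad(x, y)[1] for x, y in points)
    hyy = sum(grad(x, y)[1] ** 2 for x, y in points)
    fx, fy = flow
    for _ in range(max_iters):
        bx = by = 0.0
        for x, y in points:
            diff = image(x + fx, y + fy) - template(x, y)  # I(W(x; p)) - T(x)
            gx, gy = grad(x, y)
            bx += gx * diff
            by += gy * diff
        dp = solve_2x2((hxx, hxy, hyy), (bx, by))
        if dp is None:
            break
        fx, fy = fx - dp[0], fy - dp[1]  # negative feedback: F = F - dp
        if abs(dp[0]) + abs(dp[1]) < eps:
            break
    return fx, fy


def template(x, y):            # (i-1)th frame, hypothetical smooth intensity
    return x * x + 2 * y * y


def grad(x, y):                # analytic gradient of the template
    return (2 * x, 4 * y)


def image(x, y):               # ith frame: template moved by (0.5, 0.25)
    return template(x - 0.5, y - 0.25)


points = [(x, y) for y in range(3) for x in range(3)]  # a 3x3 pixel block
flow = refine_flow(template, image, grad, points, (0.0, 0.0))
```

Starting from a zero initial value, the sketch recovers the displacement (0.5, 0.25) within a few iterations; a better initial value, as chosen from the similarities above, would shorten the iteration further.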
Owing to the principle of gradient-based optical flow algorithms, large errors easily arise in regions where the gradient is flat. The present application therefore also adds a confidence mechanism: an optical flow with a larger error has a lower confidence, and an optical flow with a smaller error has a higher confidence. Specifically, as shown in fig. 11, the optical flow acquisition method further includes:
S1101, determining a fourth similarity between the first pixel block and a fifth pixel block in the ith frame image.
The coordinates of the fifth pixel block are derived from the target optical flow. Specifically, the coordinates of the fifth pixel block are equal to the coordinates of the first pixel block plus the target optical flow, and the size of the fifth pixel block is the same as that of the first pixel block.
In this step, similarly to steps S102 and S103, a pixel block in the ith frame image is obtained from an optical flow and the first pixel block, and the similarity between the two pixel blocks is calculated; the details are not repeated here.
And S1102, determining the confidence of the target optical flow according to the first similarity and the fourth similarity.
The first similarity corresponds to the similarity between the same position (that of the first pixel block) in the (i-1)th frame image and the ith frame image before the iterative computation, and the fourth similarity corresponds to the similarity between the same object in the (i-1)th frame image and the ith frame image after the iterative computation.
Specifically, as shown in fig. 12, step S1102 includes:
S11021, if the ratio of the numerical value of the fourth similarity to the numerical value of the first similarity is larger than a first threshold value, determining that the target optical flow is low confidence. Otherwise, step S11022 is executed.
The ratio of the fourth similarity value to the first similarity value is an evaluation of the accuracy of the similarity, and as mentioned above, a smaller similarity value indicates a higher similarity, so that a smaller ratio of the fourth similarity value to the first similarity value indicates a more accurate tracking of the same target after iteration, otherwise indicates a less accurate tracking.
S11022, if the numerical value of the fourth similarity is larger than a second threshold value, determining that the target optical flow is low confidence. Otherwise, step S11023 is executed.
Comparing the value of the fourth similarity with the threshold value is an evaluation of absolute accuracy, and a smaller value of the fourth similarity indicates that the tracking of the same target is more accurate after iteration, otherwise the tracking of the same target is less accurate.
S11023, determining the target optical flow to be high confidence.
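Steps S11021 to S11023 amount to a two-stage test, sketched below. The threshold values in the example calls are made up, and the ratio test is written as a multiplication so that a first similarity of zero does not cause a division by zero; that rewriting is a choice of this sketch.

```python
def flow_confidence(first_similarity, fourth_similarity,
                    ratio_threshold, absolute_threshold):
    """Return "low" or "high" confidence for a target optical flow."""
    # S11021: relative test, fourth / first > first threshold.
    if fourth_similarity > ratio_threshold * first_similarity:
        return "low"
    # S11022: absolute test against the second threshold.
    if fourth_similarity > absolute_threshold:
        return "low"
    # S11023: both tests passed.
    return "high"


good = flow_confidence(5400, 120, ratio_threshold=0.5, absolute_threshold=500)
weak_gain = flow_confidence(5400, 4000, ratio_threshold=0.5, absolute_threshold=500)
still_bad = flow_confidence(600, 550, ratio_threshold=0.95, absolute_threshold=500)
```

The third call shows why both tests are needed: the iteration improved the match slightly, yet the remaining mismatch is too large in absolute terms.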
The optical flow and the confidence coefficient thereof acquired by the optical flow acquisition method are sparse, and the application requirements of motion detection and motion focus tracking can be met. For example, according to the above information, the motion detection of the subject in the field of view can be realized, and the direction and speed of the motion of the subject can be acquired, so as to realize the motion focus tracking function.
The optical flow acquired by the optical flow acquisition method can be densified to obtain a pixel-level optical flow, so that registration alignment between pixels is achieved; this helps improve the performance of multi-frame denoising, multi-frame super-resolution, multi-frame exposure fusion, multi-frame demosaicing, and the like.
The optical flow algorithm can calculate the motion relation between pixel blocks of two frames of images, including the global motion of the whole visual field and the local motion of objects in the scene. Thus, the optical flow information can be considered as a one-to-one mapping established between pixel blocks of two frames of images, i.e. registration alignment between pixel blocks of images.
For applications such as multi-frame denoising, multi-frame super resolution, multi-frame exposure fusion, multi-frame de-mosaic and the like in image processing, the optical flow algorithm can provide a registration and alignment function between pixel blocks of an image for the applications, so that the improvement effect of the applications is assisted, and the matching calculation amount is reduced. Performance is further enhanced if registration alignment between pixels can be achieved.
As shown in fig. 13, the optical flow acquisition method further includes:
S1301, determining a weighting coefficient of the first pixel block according to a first pixel in the (i-1)th frame image and the target optical flow of the first pixel block containing the first pixel.
Assuming that the coordinates of the first pixel in the (i-1)th frame image are (x0, y0), its pixel value is P0, and the optical flow of the first pixel block is (dx, dy), the coordinates of the first pixel are added to the optical flow of the first pixel block to obtain a second pixel in the ith frame image; the coordinates of the second pixel are (x0 + dx, y0 + dy), and the pixel value of the second pixel is P1.
The weighting coefficient of the first pixel block is determined from the pixel value of the first pixel and the pixel value of the second pixel, where j denotes the jth first pixel block (when several first pixel blocks all contain the first pixel, j takes more than one value) and abs denotes the absolute value.
S1302, determining an optical flow of the first pixel according to the weighting coefficient and the target optical flow of the first pixel block.
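A possible reading of steps S1301 and S1302 is sketched below. The weighting formula itself is not reproduced in this text, so the inverse-absolute-difference weight 1 / (abs(P0 - P1) + eps) and the normalized weighted average are assumptions chosen only to illustrate how overlapping pixel blocks could contribute to the optical flow of one pixel; the function name and the sample values are likewise hypothetical.

```python
def pixel_flow(p0, candidates, eps=1.0):
    """Blend block flows into one pixel flow.

    p0: pixel value of the first pixel in the (i-1)th frame image.
    candidates: list of (block_flow, p1), where p1 is the pixel value found in the
    ith frame image after moving the pixel by that block's target optical flow.
    """
    # Assumed weight: blocks whose flow lands on a similar pixel value count more.
    weights = [1.0 / (abs(p0 - p1) + eps) for _, p1 in candidates]
    total = sum(weights)
    fx = sum(w * flow[0] for w, (flow, _) in zip(weights, candidates)) / total
    fy = sum(w * flow[1] for w, (flow, _) in zip(weights, candidates)) / total
    return fx, fy


equal = pixel_flow(100, [((2, 3), 100), ((4, 5), 100)])
biased = pixel_flow(100, [((2, 3), 100), ((4, 5), 103)])
```

With equal pixel differences the two block flows are averaged; when one block's flow lands on a less similar pixel, the result leans toward the other block's flow.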
It should be noted that, when the optical flow acquisition method is applied to multi-frame denoising, a certain amount of pre-denoising may be performed to reduce the interference of noise with the optical flow calculation. When the optical flow acquisition method is applied to multi-frame exposure fusion, images with different exposures may be normalized so as to satisfy the brightness consistency assumption and the like required by the optical flow algorithm.
In addition, due to the limitation of the computing power of the processor, the rendering rate of the image is different from the display frame rate, and the time domain super-resolution frame interpolation function can be realized by means of the result of the optical flow acquisition method.
As shown in fig. 14, the optical flow acquiring method further includes:
S1401, inserting an intermediate image between the (i-1)th frame image and the ith frame image according to the target optical flow.
The optical flow algorithm acquires the motion change relationship between the two real frames of images, and on the basis, the motion relationship at a certain time (or a plurality of times) between the times corresponding to the two frames of images can be estimated. That is, the super-resolution frame interpolation result can be obtained at a low cost by using the reference frame image (e.g., an image with an earlier time) and the mapping relationship estimated by the optical flow.
Assume that a first image I1 and a second image I2 are given, and that an intermediate image I1.5 between the first image I1 and the second image I2 is to be obtained. As described with reference to fig. 13, the optical flow of a pixel point can be obtained from the target optical flow of a pixel block; for example, the optical flow of a pixel point A in the first image I1 is F1_2.
Since the time between the two images is short enough, the motion of an object in the image can be regarded as approximately linear. The optical flow F1_2 can therefore be halved to obtain the optical flow F1_1.5 = (F1_2)/2 of the pixel point A from the first image I1 to the intermediate image I1.5; the optical flow F1_1.5 represents the transformation of the pixel point A from the first image I1 to the intermediate image I1.5. Processing all pixel points in the first image I1 in this way yields the intermediate image I1.5.
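The halving of the optical flow can be sketched as a simple forward warp. Rounding to the nearest pixel, letting the last writer win when two pixels land on the same target, and filling uncovered pixels from the reference image are simplifications assumed for this sketch; the application does not specify how such holes and collisions are handled.

```python
def interpolate_midframe(image, flows):
    """Build I1.5 from I1 by moving every pixel along half of its optical flow."""
    height, width = len(image), len(image[0])
    mid = [[None] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            fx, fy = flows[y][x]
            # F1_1.5 = F1_2 / 2, rounded to the nearest pixel position.
            tx, ty = x + round(fx / 2), y + round(fy / 2)
            if 0 <= tx < width and 0 <= ty < height:
                mid[ty][tx] = image[y][x]
    # Pixels that nothing mapped onto fall back to the reference image.
    for y in range(height):
        for x in range(width):
            if mid[y][x] is None:
                mid[y][x] = image[y][x]
    return mid


frame_1 = [[1, 2, 3, 4, 5]]
flows_1_to_2 = [[(2, 0)] * 5]  # every pixel moves two columns to the right
frame_1_5 = interpolate_midframe(frame_1, flows_1_to_2)
```

In the example every pixel moves two columns between the two real frames, so the intermediate image shows the content shifted by one column.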
It should be noted that, for occlusion scenes and physical interaction scenes, some special consideration is needed when estimating the optical flow information between two frames of images. For example, occlusion detection may be performed by a bidirectional optical flow acquisition method. Specifically, the optical flow of the first pixel block from the (i-1)th frame image to the ith frame image may be solved first and denoted as F; then the optical flow of the first pixel block from the ith frame image to the (i-1)th frame image is solved and denoted as F'. Theoretically, if there is no occlusion and the optical flow calculation is accurate, a target at a position P on the (i-1)th frame image is mapped by the optical flow F to a position P' on the ith frame image, and is then mapped back by the optical flow F' to a position P'' on the (i-1)th frame image, where P'' substantially coincides with P. Otherwise, occlusion can be inferred. In occlusion scenes, the optical flow with the higher confidence is typically used.
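The bidirectional check can be sketched as a round trip through the two optical flows. The tolerance of one pixel, the function names, and the use of callables to look up the flows are assumptions of this sketch.

```python
def is_occluded(p, forward_flow, backward_flow, tolerance=1.0):
    """Round-trip test: P -> P' (by F) -> P'' (by F'); a large |P'' - P| suggests occlusion.

    forward_flow(p) returns F at p in the (i-1)th frame image;
    backward_flow(p) returns F' at p in the ith frame image.
    """
    fx, fy = forward_flow(p)
    p_prime = (p[0] + fx, p[1] + fy)
    bx, by = backward_flow(p_prime)
    p_back = (p_prime[0] + bx, p_prime[1] + by)
    return abs(p_back[0] - p[0]) + abs(p_back[1] - p[1]) > tolerance


def forward(p):
    return (2, 3)


def consistent_backward(p):
    return (-2, -3)


def inconsistent_backward(p):
    return (0, 0)


visible = is_occluded((1, 2), forward, consistent_backward)
hidden = is_occluded((1, 2), forward, inconsistent_backward)
```

A consistent pair of flows returns to the starting position, whereas a backward flow that does not undo the forward flow leaves a large residual and flags the position.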
The embodiment of the application also provides an optical flow acquisition device, and the optical flow acquisition device is used for realizing the various methods. The optical flow acquisition device can be a mobile phone, a tablet, an unmanned aerial vehicle, an automobile, an electric vehicle and the like.
It is understood that, in order to implement the above functions, the optical flow acquisition device includes a hardware structure and/or a software module corresponding to each function. Those skilled in the art will readily appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as hardware or as a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and the design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiment of the present application, the optical flow acquiring apparatus may perform division of the functional modules according to the method embodiment, for example, each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. It should be noted that, in the embodiment of the present application, the division of the module is schematic, and is only one logic function division, and there may be another division manner in actual implementation.
Fig. 15 shows a schematic structural view of an optical flow acquisition device 150. The optical flow acquisition device 150 includes a determination module 1501, an acquisition module 1502, and a frame interpolation module 1503. The determining module 1501 is configured to implement steps S101 to S103 in fig. 1, steps S101 to S103 and S1101 to S1102 in fig. 11, steps S101 to S103, S1101 and S11021 to S11023 in fig. 12, steps S101 to S103 and S1301 to S1302 in fig. 13, and steps S101 to S103 in fig. 14 in the foregoing method embodiment. The obtaining module 1502 is configured to implement step S104 in fig. 1, step S104 in fig. 11, and step S104 in fig. 13 in the foregoing method embodiment. The frame inserting module 1503 is used to implement step S1401 in fig. 14 in the foregoing method embodiment.
Illustratively, the determining module 1501 is configured to determine a first similarity between a first pixel block in the i-1 th frame image and a second pixel block in the i-th frame image; and the coordinates of the second pixel block are the same as the coordinates of the first pixel block, and i is a positive integer.
The determining module 1501 is further configured to perform at least one of the following two processes: determining a second similarity between the first pixel block and a third pixel block in the ith frame image, wherein the coordinates of the third pixel block are obtained from the historical optical flow of the first pixel block from the ith-2 frame image to the ith-1 frame image; determining a third similarity of the first pixel block and a fourth pixel block in the ith frame image, wherein coordinates of the fourth pixel block are obtained from optical flows of adjacent pixel blocks of the first pixel block.
An obtaining module 1502 configured to obtain a target optical flow of the first pixel block from the i-1 th frame image to the i-th frame image according to the first similarity, the at least one of the second similarity and the third similarity, and the gradient information of the first pixel block.
In one possible embodiment, the coordinates of the third block of pixels are equal to the coordinates of the first block of pixels plus the historical optical flow of the first block of pixels from the i-2 th frame image to the i-1 th frame.
In a possible implementation, the coordinates of the fourth block of pixels are equal to the coordinates of the first block of pixels plus the optical flow of the adjacent block of pixels of the first block of pixels.
In a possible implementation, the obtaining module 1502 is specifically configured to: determining the highest similarity according to the first similarity and at least one of the second similarity and the third similarity, and selecting an optical flow corresponding to the highest similarity as an optical flow initial value; and combining the gradient information of the first pixel block, performing approximate Gaussian Newton gradient descent iterative solution on the initial value of the optical flow, and taking an iteration result obtained when the optical flow exits because an exit condition is met as a target optical flow.
In a possible implementation, the obtaining module 1502 is specifically configured to: determining at least two highest similarities according to the first similarity and at least one of the second similarity and the third similarity, and selecting optical flows respectively corresponding to the at least two highest similarities as at least two optical flow initial values; and combining the gradient information of the first pixel block, respectively carrying out approximate Gaussian Newton gradient descent iterative solution on at least two initial values of the optical flow, setting exit conditions to be the same, and selecting an iteration result with the minimum energy function as a target optical flow when the exit conditions are met and the exit is carried out.
In a possible implementation, the gradient information includes a sum of X-direction gradient values, a sum of squares of X-direction gradient values, a sum of Y-direction gradient values, a sum of squares of Y-direction gradient values, and a sum of products of X-direction and Y-direction gradient values, and the obtaining module 1502 is further configured to: convolve each pixel point of the first pixel block with an X-direction Sobel operator to obtain an X-direction gradient matrix, and convolve each pixel point of the first pixel block with a Y-direction Sobel operator to obtain a Y-direction gradient matrix; accumulate all gradient values of the X-direction gradient matrix to obtain the sum of the X-direction gradient values, and square each gradient value of the X-direction gradient matrix and accumulate the squares to obtain the sum of squares of the X-direction gradient values; accumulate all gradient values of the Y-direction gradient matrix to obtain the sum of the Y-direction gradient values, and square each gradient value of the Y-direction gradient matrix and accumulate the squares to obtain the sum of squares of the Y-direction gradient values; and multiply the gradient values at the same positions of the X-direction gradient matrix and the Y-direction gradient matrix and accumulate the products to obtain the sum of products of the X-direction and Y-direction gradient values.
In a possible implementation, the determining module 1501 is further configured to: determining a fourth similarity between the first pixel block and a fifth pixel block in the ith frame of image, wherein the coordinates of the fifth pixel block are obtained by the target optical flow; and determining the confidence of the target optical flow according to the first similarity and the fourth similarity.
In a possible implementation, the determining module 1501 is specifically configured to: determining the target optical flow to be of low confidence if the ratio of the numerical value of the fourth similarity to the numerical value of the first similarity is greater than a first threshold; otherwise, if the numerical value of the fourth similarity is larger than the second threshold, determining that the target optical flow is low in confidence; otherwise, the target optical flow is determined to be of high confidence.
In one possible implementation, the frame insertion module 1503 is configured to: an intermediate image is inserted between the i-1 th frame image and the i-th frame image according to the target optical flow.
In the present embodiment, the optical flow acquisition means 150 is presented in the form of dividing each functional module in an integrated manner. As used herein, a module may refer to a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), an electronic circuit, a processor and memory that execute one or more software or firmware programs, an integrated logic circuit, and/or other devices that provide the described functionality.
Since the optical flow obtaining device 150 provided in this embodiment can perform the above method, the technical effects obtained by the optical flow obtaining device can refer to the above method embodiment, and are not described herein again.
As shown in fig. 16, the embodiment of the present application further provides an optical-flow obtaining apparatus, where optical-flow obtaining apparatus 160 includes processor 1602 and memory 1601, where processor 1602 and memory 1601 are coupled via bus 1603, and computer instructions are stored in memory 1601, and when processor 1602 executes the computer instructions in memory 1601, the optical-flow obtaining method in fig. 1, 11-14 is executed.
An embodiment of the present application further provides a chip, including a processor and an interface, where the processor is configured to call, from a memory, a computer program stored in the memory and run it, so as to execute the optical flow obtaining method in fig. 1, 11-14.
Embodiments of the present application further provide a computer-readable storage medium, in which instructions are stored, and when the instructions in the computer-readable storage medium are executed on a computer or a processor, the computer or the processor is caused to execute the optical flow obtaining method in fig. 1, 11 to 14.
Embodiments of the present application also provide a computer program product containing instructions, which when executed on a computer or a processor, cause the computer or the processor to execute the optical flow obtaining method in fig. 1, 11 to 14.
An embodiment of the present application provides a chip system, which includes a processor configured to support the optical flow acquisition device in executing the optical flow acquisition method in fig. 1, 11-14.
In one possible design, the chip system further includes a memory for storing necessary program instructions and data. The chip system may consist of a chip or an integrated circuit, or may include a chip and other discrete devices, which is not specifically limited in this embodiment of the present application.
The optical flow obtaining device, the chip, the computer storage medium, the computer program product, or the chip system provided in the present application are all configured to execute the method described above, and therefore, the beneficial effects achieved by the optical flow obtaining device, the chip, the computer storage medium, the computer program product, or the chip system may refer to the beneficial effects in the embodiments provided above, and are not described herein again.
The processor in the embodiments of the present application may be a chip. For example, it may be a field programmable gate array (FPGA), an application-specific integrated circuit (ASIC), a system on chip (SoC), a central processing unit (CPU), a network processor (NP), a digital signal processor (DSP), a microcontroller unit (MCU), a programmable logic device (PLD), or another integrated chip.
The memory referred to in embodiments of the present application may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which acts as an external cache. By way of example, but not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), and direct rambus RAM (DR RAM). It should be noted that the memory of the systems and methods described herein is intended to include, without being limited to, these and any other suitable types of memory.
It should be understood that, in the various embodiments of the present application, the sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It can be clearly understood by those skilled in the art that, for convenience and simplicity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. For example, the above-described device embodiments are merely illustrative: the division of the units is only a logical functional division, and other divisions are possible in an actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the shown or discussed mutual coupling, direct coupling, or communication connection may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be in an electrical, mechanical, or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented using a software program, the embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions described in accordance with the embodiments of the application are generated in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored on a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center via a wired connection (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or a wireless connection (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., solid state disk (SSD)), among others.
The above description covers only specific embodiments of the present application, but the protection scope of the present application is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive of within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (18)
- An optical flow acquisition method, comprising: determining a first similarity between a first pixel block in the (i-1)-th frame image and a second pixel block in the i-th frame image, wherein the coordinates of the second pixel block are the same as the coordinates of the first pixel block, and i is a positive integer; performing at least one of the following two processes: determining a second similarity between the first pixel block and a third pixel block in the i-th frame image, wherein the coordinates of the third pixel block are obtained from the historical optical flow of the first pixel block from the (i-2)-th frame image to the (i-1)-th frame image; determining a third similarity between the first pixel block and a fourth pixel block in the i-th frame image, wherein the coordinates of the fourth pixel block are obtained from the optical flow of an adjacent pixel block of the first pixel block; and obtaining a target optical flow of the first pixel block from the (i-1)-th frame image to the i-th frame image according to the first similarity, the gradient information of the first pixel block, and at least one of the second similarity and the third similarity.
- The method according to claim 1, wherein the coordinates of the third pixel block are equal to the coordinates of the first pixel block plus the historical optical flow of the first pixel block from the (i-2)-th frame image to the (i-1)-th frame image.
- The method according to any one of claims 1-2, wherein the coordinates of the fourth pixel block are equal to the coordinates of the first pixel block plus the optical flow of a pixel block adjacent to the first pixel block.
- The method according to any one of claims 1-3, wherein said obtaining a target optical flow of said first pixel block from said (i-1)-th frame image to said i-th frame image according to said first similarity, the gradient information of said first pixel block, and at least one of said second similarity and said third similarity comprises: determining the highest similarity among the first similarity and at least one of the second similarity and the third similarity, and selecting the optical flow corresponding to the highest similarity as an optical flow initial value; and, in combination with the gradient information of the first pixel block, performing an approximate Gauss-Newton gradient descent iterative solution on the optical flow initial value, and taking the iteration result obtained when the iteration exits upon an exit condition being met as the target optical flow.
- The method according to any one of claims 1-3, wherein said obtaining a target optical flow of said first pixel block from said (i-1)-th frame image to said i-th frame image according to said first similarity, the gradient information of said first pixel block, and at least one of said second similarity and said third similarity comprises: determining at least two highest similarities among the first similarity and at least one of the second similarity and the third similarity, and selecting the optical flows respectively corresponding to the at least two highest similarities as at least two optical flow initial values; and, in combination with the gradient information of the first pixel block, performing an approximate Gauss-Newton gradient descent iterative solution on each of the at least two optical flow initial values with the same exit condition, and selecting, from the iteration results obtained when the iterations exit upon the exit condition being met, the one with the minimum energy function as the target optical flow.
- The method according to any one of claims 1 to 5, wherein the gradient information includes a sum of X-direction gradient values, a sum of squares of X-direction gradient values, a sum of Y-direction gradient values, a sum of squares of Y-direction gradient values, and a sum of products of X-direction and Y-direction gradient values, the method further comprising: calculating the convolution of each pixel point of the first pixel block with an X-direction Sobel operator to obtain an X-direction gradient matrix, and calculating the convolution of each pixel point of the first pixel block with a Y-direction Sobel operator to obtain a Y-direction gradient matrix; summing all gradient values of the X-direction gradient matrix to obtain the sum of X-direction gradient values, and squaring each gradient value of the X-direction gradient matrix and then summing to obtain the sum of squares of X-direction gradient values; summing all gradient values of the Y-direction gradient matrix to obtain the sum of Y-direction gradient values, and squaring each gradient value of the Y-direction gradient matrix and then summing to obtain the sum of squares of Y-direction gradient values; and multiplying the gradient values at the same positions of the X-direction gradient matrix and the Y-direction gradient matrix and then summing to obtain the sum of products of X-direction and Y-direction gradient values.
- The method according to any one of claims 1-6, further comprising: determining a fourth similarity between the first pixel block and a fifth pixel block in the i-th frame image, wherein the coordinates of the fifth pixel block are obtained from the target optical flow; and determining a confidence level of the target optical flow according to the first similarity and the fourth similarity.
- The method according to claim 7, wherein determining the confidence level of the target optical flow according to the first similarity and the fourth similarity comprises: determining that the target optical flow is of low confidence if the ratio of the numerical value of the fourth similarity to the numerical value of the first similarity is greater than a first threshold; otherwise, if the numerical value of the fourth similarity is greater than a second threshold, determining that the target optical flow is of low confidence; otherwise, determining that the target optical flow is of high confidence.
- The method according to any one of claims 1-8, further comprising: inserting an intermediate image between the (i-1)-th frame image and the i-th frame image according to the target optical flow.
- An optical flow acquisition apparatus, comprising: a determining module, configured to determine a first similarity between a first pixel block in the (i-1)-th frame image and a second pixel block in the i-th frame image, wherein the coordinates of the second pixel block are the same as the coordinates of the first pixel block, and i is a positive integer; the determining module being further configured to perform at least one of the following two processes: determining a second similarity between the first pixel block and a third pixel block in the i-th frame image, wherein the coordinates of the third pixel block are obtained from the historical optical flow of the first pixel block from the (i-2)-th frame image to the (i-1)-th frame image; determining a third similarity between the first pixel block and a fourth pixel block in the i-th frame image, wherein the coordinates of the fourth pixel block are obtained from the optical flow of an adjacent pixel block of the first pixel block; and an obtaining module, configured to obtain a target optical flow of the first pixel block from the (i-1)-th frame image to the i-th frame image according to the first similarity, the gradient information of the first pixel block, and at least one of the second similarity and the third similarity.
- The optical flow obtaining device according to claim 10, wherein the coordinates of said third pixel block are equal to the coordinates of said first pixel block plus the historical optical flow of said first pixel block from the (i-2)-th frame image to the (i-1)-th frame image.
- The optical flow obtaining device according to any one of claims 10 to 11, wherein the coordinates of said fourth pixel block are equal to the coordinates of said first pixel block plus the optical flow of a pixel block adjacent to said first pixel block.
- The optical flow obtaining device according to any one of claims 10 to 12, wherein the obtaining module is specifically configured to: determine the highest similarity among the first similarity and at least one of the second similarity and the third similarity, and select the optical flow corresponding to the highest similarity as an optical flow initial value; and, in combination with the gradient information of the first pixel block, perform an approximate Gauss-Newton gradient descent iterative solution on the optical flow initial value, and take the iteration result obtained when the iteration exits upon an exit condition being met as the target optical flow.
- The optical flow obtaining device according to any one of claims 10 to 12, wherein the obtaining module is specifically configured to: determine at least two highest similarities among the first similarity and at least one of the second similarity and the third similarity, and select the optical flows respectively corresponding to the at least two highest similarities as at least two optical flow initial values; and, in combination with the gradient information of the first pixel block, perform an approximate Gauss-Newton gradient descent iterative solution on each of the at least two optical flow initial values with the same exit condition, and select, from the iteration results obtained when the iterations exit upon the exit condition being met, the one with the minimum energy function as the target optical flow.
- The optical flow obtaining apparatus according to any one of claims 10 to 14, wherein the gradient information includes a sum of X-direction gradient values, a sum of squares of X-direction gradient values, a sum of Y-direction gradient values, a sum of squares of Y-direction gradient values, and a sum of products of X-direction and Y-direction gradient values, and the obtaining module is further configured to: calculate the convolution of each pixel point of the first pixel block with an X-direction Sobel operator to obtain an X-direction gradient matrix, and calculate the convolution of each pixel point of the first pixel block with a Y-direction Sobel operator to obtain a Y-direction gradient matrix; sum all gradient values of the X-direction gradient matrix to obtain the sum of X-direction gradient values, and square each gradient value of the X-direction gradient matrix and then sum to obtain the sum of squares of X-direction gradient values; sum all gradient values of the Y-direction gradient matrix to obtain the sum of Y-direction gradient values, and square each gradient value of the Y-direction gradient matrix and then sum to obtain the sum of squares of Y-direction gradient values; and multiply the gradient values at the same positions of the X-direction gradient matrix and the Y-direction gradient matrix and then sum to obtain the sum of products of X-direction and Y-direction gradient values.
- The optical flow obtaining apparatus according to any one of claims 10 to 15, wherein the determining module is further configured to: determine a fourth similarity between the first pixel block and a fifth pixel block in the i-th frame image, wherein the coordinates of the fifth pixel block are obtained from the target optical flow; and determine the confidence level of the target optical flow according to the first similarity and the fourth similarity.
- The optical flow obtaining apparatus according to claim 16, wherein the determining module is specifically configured to: determine that the target optical flow is of low confidence if the ratio of the numerical value of the fourth similarity to the numerical value of the first similarity is greater than a first threshold; otherwise, if the numerical value of the fourth similarity is greater than a second threshold, determine that the target optical flow is of low confidence; otherwise, determine that the target optical flow is of high confidence.
- The optical flow obtaining apparatus according to any one of claims 10 to 17, further comprising a frame interpolation module configured to: insert an intermediate image between the (i-1)-th frame image and the i-th frame image according to the target optical flow.
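As a hedged illustration of the candidate selection recited in the method claims (zero displacement, historical optical flow, adjacent-block optical flow), the sketch below scores each candidate and picks the best one as the initial value for the iterative solution. The sum of absolute differences (SAD) is used as the similarity measure purely as an assumption, with a lower SAD treated as a higher similarity; the function names `sad` and `pick_initial_flow`, integer flows, and the lack of bounds checking are likewise illustrative.

```python
def sad(frame_a, frame_b, ax, ay, bx, by, size):
    """Sum of absolute differences between two size x size blocks.

    Assumption: SAD stands in for the similarity measure; a lower value
    means the blocks are more similar.
    """
    return sum(
        abs(frame_a[ay + r][ax + c] - frame_b[by + r][bx + c])
        for r in range(size)
        for c in range(size)
    )


def pick_initial_flow(prev_frame, cur_frame, x, y, size,
                      historical_flow=None, neighbor_flow=None):
    """Return (cost, flow) of the best-matching candidate flow.

    Candidates: zero flow (second pixel block), the block's historical flow
    (third pixel block), and an adjacent block's flow (fourth pixel block).
    Flows are integer (u, v) pairs; all accessed pixels are assumed in bounds.
    """
    candidates = [(0, 0)]
    if historical_flow is not None:
        candidates.append(historical_flow)
    if neighbor_flow is not None:
        candidates.append(neighbor_flow)
    scored = [
        (sad(prev_frame, cur_frame, x, y, x + u, y + v, size), (u, v))
        for u, v in candidates
    ]
    # min() keeps the first candidate on ties, so zero flow wins a tie.
    return min(scored, key=lambda item: item[0])


# Example: a 2x2 pattern at x=1 moves 3 pixels to the right.
prev_frame = [[0, 5, 9, 0, 0, 0, 0, 0], [0, 5, 9, 0, 0, 0, 0, 0]]
cur_frame = [[0, 0, 0, 0, 5, 9, 0, 0], [0, 0, 0, 0, 5, 9, 0, 0]]
best = pick_initial_flow(prev_frame, cur_frame, 1, 0, 2,
                         historical_flow=(3, 0), neighbor_flow=(1, 0))
```

Starting the iteration from the best of a few cheap candidates, rather than always from zero, is what would let the solver converge in fewer iterations when motion is temporally or spatially coherent.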
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2020/075890 WO2021163928A1 (en) | 2020-02-19 | 2020-02-19 | Optical flow obtaining method and apparatus |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115104125A (en) | 2022-09-23 |
Family
ID=77390350
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202080096767.2A Pending CN115104125A (en) | 2020-02-19 | 2020-02-19 | Optical flow acquisition method and device |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN115104125A (en) |
WO (1) | WO2021163928A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116152301A (en) * | 2023-04-24 | 2023-05-23 | 知行汽车科技(苏州)股份有限公司 | Target speed estimation method, device, equipment and medium |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113947116B (en) * | 2021-09-30 | 2023-10-31 | 西安交通大学 | Train track looseness non-contact real-time detection method based on camera shooting |
CN113947608B (en) * | 2021-09-30 | 2023-10-20 | 西安交通大学 | High-precision measurement method for irregular movement of structure based on geometric matching control |
CN116684662A (en) * | 2022-02-22 | 2023-09-01 | 北京字跳网络技术有限公司 | Video processing method, device, equipment and medium |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10515455B2 (en) * | 2016-09-29 | 2019-12-24 | The Regents Of The University Of Michigan | Optical flow measurement |
CN107292912B (en) * | 2017-05-26 | 2020-08-18 | 浙江大学 | Optical flow estimation method based on multi-scale corresponding structured learning |
CN109087338B (en) * | 2017-06-13 | 2021-08-24 | 北京图森未来科技有限公司 | Method and device for extracting image sparse optical flow |
CN107657644B (en) * | 2017-09-28 | 2019-11-15 | 浙江大华技术股份有限公司 | Sparse scene flows detection method and device under a kind of mobile environment |
- 2020-02-19: WO PCT/CN2020/075890 (WO2021163928A1), active, Application Filing
- 2020-02-19: CN CN202080096767.2A (CN115104125A), active, Pending
Also Published As
Publication number | Publication date |
---|---|
WO2021163928A1 (en) | 2021-08-26 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||