CN114187333A - Image alignment method, image alignment device and terminal equipment

Image alignment method, image alignment device and terminal equipment

Info

Publication number
CN114187333A
CN114187333A
Authority
CN
China
Prior art keywords
image frame
motion vector
aligned
image
feature point
Prior art date
Legal status
Pending
Application number
CN202010963340.XA
Other languages
Chinese (zh)
Inventor
潘澄
刘恒芳
Current Assignee
TCL Technology Group Co Ltd
Original Assignee
TCL Technology Group Co Ltd
Priority date
Filing date
Publication date
Application filed by TCL Technology Group Co Ltd
Priority to CN202010963340.XA
Publication of CN114187333A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/30 Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T 7/33 Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • G06T 7/337 Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods involving reference images or patches
    • G06T 7/20 Analysis of motion
    • G06T 7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T 7/248 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving reference images or patches

Abstract

The application is applicable to the technical field of image processing, and provides an image alignment method, an image alignment device, a terminal device and a computer-readable storage medium, wherein the method comprises the following steps: acquiring an image frame to be aligned and a reference image frame; extracting a first feature point in the image frame to be aligned and a second feature point in the reference image frame; calculating a motion vector of the image frame to be aligned according to the first feature point and the second feature point; and aligning the image frame to be aligned to the reference image frame according to the motion vector. By this method, the computational cost of image alignment can be greatly reduced and the efficiency of image alignment improved.

Description

Image alignment method, image alignment device and terminal equipment
Technical Field
The present application relates to the field of image processing technologies, and in particular, to an image alignment method, an image alignment apparatus, a terminal device, and a computer-readable storage medium.
Background
Image alignment has many application scenarios in daily life. For example, when shooting with a mobile phone, hand shake causes small differences between frames that are captured in succession, and in order to obtain better image quality, an image alignment technique must be applied to register the multiple frames in the same coordinate system.
The existing image alignment technology mainly comprises an image alignment technology based on an optical flow method, an image alignment technology based on feature extraction, an image alignment technology based on a pyramid and the like. However, these image alignment techniques are computationally intensive and inefficient.
Disclosure of Invention
In view of this, the present application provides an image alignment method, an image alignment apparatus, a terminal device and a computer readable storage medium, which can greatly reduce the calculation amount of image alignment and improve the efficiency of image alignment.
In a first aspect, the present application provides an image alignment method, including:
acquiring an image frame to be aligned and a reference image frame;
extracting a first characteristic point in the image frame to be aligned and a second characteristic point in the reference image frame;
calculating a motion vector of the image frame to be aligned according to the first characteristic point and the second characteristic point;
and aligning the image frame to be aligned to the reference image frame according to the motion vector.
In a second aspect, the present application provides an image registration apparatus comprising:
the image acquisition unit is used for acquiring an image frame to be aligned and a reference image frame;
a feature point extracting unit, configured to extract a first feature point in the image frame to be aligned and a second feature point in the reference image frame;
a motion vector calculation unit for calculating a motion vector of the image frame to be aligned based on the first feature point and the second feature point;
and the image alignment unit is used for aligning the image frame to be aligned to the reference image frame according to the motion vector.
In a third aspect, the present application provides a terminal device, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the method provided in the first aspect when executing the computer program.
In a fourth aspect, the present application provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the method as provided in the first aspect.
In a fifth aspect, the present application provides a computer program product, which, when run on a terminal device, causes the terminal device to perform the method provided by the first aspect.
As can be seen from the above, in the present application, an image frame to be aligned and a reference image frame are first obtained, a first feature point in the image frame to be aligned and a second feature point in the reference image frame are extracted, a motion vector of the image frame to be aligned is calculated according to the first feature point and the second feature point, and finally the image frame to be aligned is aligned to the reference image frame according to the motion vector. According to the scheme, the characteristic points representing high-frequency information in the image frame to be aligned and the reference image frame are extracted, and the motion vector of the image frame to be aligned is calculated according to the extracted characteristic points, so that the calculated amount of image alignment can be greatly reduced, and the efficiency of image alignment is improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings required for the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of an image alignment method provided in an embodiment of the present application;
FIG. 2 is an exemplary diagram of feature points provided in an embodiment of the present application;
FIG. 3 is a schematic diagram of feature descriptor generation provided by an embodiment of the present application;
FIG. 4 is an exemplary diagram of an image block provided in an embodiment of the present application;
FIG. 5 is a diagram of an example pixel block provided by an embodiment of the present application;
FIG. 6 is a block diagram of an image alignment apparatus according to an embodiment of the present disclosure;
fig. 7 is a schematic structural diagram of a terminal device according to an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when," "upon," "in response to determining," or "in response to detecting." Similarly, the phrase "if it is determined" or "if [a described condition or event] is detected" may be interpreted contextually to mean "upon determining," "in response to determining," "upon detecting [the described condition or event]," or "in response to detecting [the described condition or event]."
Furthermore, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used for distinguishing between descriptions and not necessarily for describing or implying relative importance.
Reference throughout this specification to "one embodiment" or "some embodiments," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," or the like, in various places throughout this specification are not necessarily all referring to the same embodiment, but rather "one or more but not all embodiments" unless specifically stated otherwise. The terms "comprising," "including," "having," and variations thereof mean "including, but not limited to," unless expressly specified otherwise.
Fig. 1 shows a flowchart of an image alignment method provided in an embodiment of the present application, which is detailed as follows:
step 101, acquiring an image frame to be aligned and a reference image frame;
In this embodiment of the application, the terminal device needs to acquire at least two frames of images; for example, the at least two frames may be images continuously captured by the terminal device during shooting. After acquiring the at least two frames, the terminal device selects one frame as the reference image frame, treats all remaining frames as image frames to be aligned, and then aligns each image frame to be aligned to the reference image frame. Optionally, as for the selection of the reference image frame, the sharpness of each of the at least two frames may be calculated and the frame with the highest sharpness selected as the reference image frame, so as to ensure that the image frames aligned to it remain clear; alternatively, the resolution of each of the at least two frames may be calculated and the frame with the highest resolution used as the reference image frame. The selection manner of the reference image frame is not limited here.
For example, the terminal device may continuously capture multiple frames of the same scene within a very short time using the continuous shooting mode of its own camera, for example, 30 frames of the same face within 1 second, denoted face image 1, face image 2, face image 3, …, and face image 30. The sharpness of the 30 face images can then be calculated respectively, the face image with the highest sharpness used as the reference image frame, and the remaining 29 face images used as image frames to be aligned. For example, assuming that face image 1 has the highest sharpness among the 30 frames, face image 1 may be used as the reference image frame, and face images 2, 3, 4, …, and 30 may all be used as image frames to be aligned.
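As an illustrative sketch of this reference-frame selection (not part of the original disclosure), the following Python code picks the sharpest of several grayscale frames; the embodiment does not fix a particular sharpness metric, so the variance-of-Laplacian measure and all function names here are assumptions.

```python
import numpy as np

def sharpness(gray):
    """Variance of a discrete Laplacian response; larger means sharper."""
    g = gray.astype(np.float64)
    lap = (-4.0 * g[1:-1, 1:-1]
           + g[:-2, 1:-1] + g[2:, 1:-1]
           + g[1:-1, :-2] + g[1:-1, 2:])
    return float(lap.var())

def pick_reference(frames):
    """Return (reference_frame, frames_to_align): the sharpest frame is the reference."""
    scores = [sharpness(f) for f in frames]
    ref = int(np.argmax(scores))
    return frames[ref], [f for i, f in enumerate(frames) if i != ref]
```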
102, extracting a first characteristic point in an image frame to be aligned and a second characteristic point in a reference image frame;
In the embodiment of the present application, since each image frame to be aligned needs to be aligned to the reference image frame, and the operations performed on each image frame to be aligned are the same or similar, for convenience of description one frame among the image frames to be aligned will be taken as an example. The first feature points represent high-frequency information in the image frame to be aligned, and the second feature points represent high-frequency information in the reference image frame. By performing feature point detection on the image frame to be aligned and the reference image frame, the first feature points in the image frame to be aligned and the second feature points in the reference image frame can be obtained. Referring specifically to fig. 2, fig. 2 is an example of the first feature points in an image frame to be aligned, where each small circle represents a first feature point. Optionally, in order to balance the effect and efficiency of feature point detection, feature point detection may be performed on the image frame to be aligned and the reference image frame with the Features from Accelerated Segment Test (FAST) algorithm.
Taking the image frame to be aligned as an example, please refer to fig. 3, where each square in fig. 3 represents a pixel point. For a certain pixel point p in the image frame to be aligned, a circle of radius 3 is determined with p as its center, and 16 pixel points lie on this circle. The absolute value d of the difference between the pixel value of each of the 16 pixel points on the circle and the pixel value of p is calculated. If, among the 16 pixel points on the circle, the number of pixel points whose corresponding absolute value d is larger than a threshold t is greater than or equal to 12, the pixel point p is determined to be a corner point. Meanwhile, the confidence of p can be calculated from the sum of the absolute values d corresponding to the 16 pixel points on the circle: the greater this sum, the greater the confidence. On this basis, all corner points in the image frame to be aligned are detected. If the number of detected corner points is smaller than a preset number of corner points (e.g. 500), the threshold t is reduced and all corner points in the image frame to be aligned are detected again in the above manner. If the number of detected corner points is still smaller than the preset number, the threshold t is reduced again and detection is repeated, and so on, until the number of corner points detected in the image frame to be aligned is greater than or equal to the preset number of corner points. Once this is the case, the corner points are sorted in descending order of confidence, and the top preset number of corner points are taken as the first feature points, which improves the accuracy of the FAST algorithm. Since feature point detection on the reference image frame proceeds in the same way as on the image frame to be aligned, the details are not repeated here.
For example, assume the preset number of corner points is 500. With the threshold t set to 10, 300 corner points are detected in the image frame to be aligned; since 300 is less than 500, the threshold t is reduced to 6. With t set to 6, 800 corner points are detected; since 800 is greater than 500, the corner points are sorted in descending order of confidence and the first 500 are taken as the first feature points.
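A minimal Python sketch of this adaptive corner detection, assuming grayscale input. Note that the standard FAST test requires a contiguous arc of circle pixels, whereas this sketch follows the simpler count-based test described above; the threshold step of 4 reproduces the 10 → 6 example, and all names and defaults are illustrative.

```python
import numpy as np

# Offsets of the 16 pixels on a radius-3 circle around a candidate pixel p.
CIRCLE = [(0, 3), (1, 3), (2, 2), (3, 1), (3, 0), (3, -1), (2, -2), (1, -3),
          (0, -3), (-1, -3), (-2, -2), (-3, -1), (-3, 0), (-3, 1), (-2, 2), (-1, 3)]

def detect_corners(gray, t):
    """One detection pass: p is a corner if >= 12 of the 16 circle pixels differ
    from it by more than t; confidence is the sum of the 16 |d| values."""
    g = gray.astype(np.int32)
    h, w = g.shape
    center = g[3:h - 3, 3:w - 3]
    diffs = np.stack([np.abs(g[3 + dy:h - 3 + dy, 3 + dx:w - 3 + dx] - center)
                      for dy, dx in CIRCLE])
    mask = (diffs > t).sum(axis=0) >= 12
    conf = diffs.sum(axis=0)
    ys, xs = np.nonzero(mask)
    return [(y + 3, x + 3, int(conf[y, x])) for y, x in zip(ys, xs)]

def first_feature_points(gray, t=10, preset_count=500, t_step=4):
    """Lower t until at least preset_count corners are found (or t bottoms out),
    then keep the preset_count most confident corners."""
    corners = detect_corners(gray, t)
    while len(corners) < preset_count and t > t_step:
        t -= t_step
        corners = detect_corners(gray, t)
    corners.sort(key=lambda c: c[2], reverse=True)
    return corners[:preset_count]
```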
103, calculating a motion vector of the image frame to be aligned according to the first characteristic point and the second characteristic point;
in this embodiment of the application, the first feature point may represent high-frequency information in the image frame to be aligned, that is, a feature contour of an object in the image frame to be aligned, and the second feature point may represent high-frequency information in the reference image frame, that is, a feature contour of an object in the reference image frame, so that, by analyzing the first feature point and the second feature point, an offset between the object in the image frame to be aligned and the object in the reference image frame may be obtained, and further, a motion vector of the image frame to be aligned is obtained.
Optionally, the step 103 may specifically include:
a1, calculating a global motion vector of the image frame to be aligned according to the first characteristic point and the second characteristic point;
a2, dividing an image frame to be aligned into a preset number of image blocks;
a3, calculating a local motion vector of each image block in the image frame to be aligned according to the global motion vector and the first characteristic point;
and A4, calculating the motion vector according to the global motion vector and the local motion vector.
In the embodiment of the application, by analyzing the first characteristic point and the second characteristic point, the offset between the image frame to be aligned and the reference image frame can be obtained, and further, the global motion vector of the image frame to be aligned is obtained. Next, the image frame to be aligned needs to be divided into a preset number of image blocks, each having an equal area, please refer to fig. 4, where fig. 4 is an example of an image block in the image frame to be aligned, where each square represents an image block. Illustratively, the image frame to be aligned may be divided into 32 × 32 image blocks, each of which is 125 × 94 pixels in size for an image frame to be aligned of 4000 × 3000 pixels in size. The global motion vector represents the offset between the image frame to be aligned and the reference image frame, and the first feature point represents high-frequency information, namely an outline (such as characters, icons and the like), in the image frame to be aligned, so that according to the global motion vector and the first feature point in the image block, an image block similar to the image block in the image frame to be aligned can be matched in the reference image frame, and the two similar image blocks have similar high-frequency information. According to the offset between the two similar image blocks, the local motion vector of the image block in the image frame to be aligned can be obtained. And finally, calculating the motion vector according to the global motion vector and the local motion vector. It should be noted that in the embodiment of the present application, a motion vector is calculated for each image block, that is, the number of motion vectors is the same as the number of image blocks in the image frame to be aligned.
For example, the motion vector of each image block may be the result of summing the local motion vector of that image block with the global motion vector of the image frame to be aligned. Taking a certain image block a as an example, assuming that the local motion vector of the image block a is (0, -1), and the global motion vector of the image frame to be aligned is (2, 2), the motion vector of the image block a is calculated to be (2, 1).
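In code form, this combination is a plain component-wise sum; a one-line sketch (illustrative names) reproducing the example above:

```python
def block_motion_vector(local_mv, global_mv):
    """Motion vector of an image block = its local MV plus the frame's global MV."""
    return (local_mv[0] + global_mv[0], local_mv[1] + global_mv[1])

# Example from the text: local (0, -1) plus global (2, 2) gives (2, 1).
assert block_motion_vector((0, -1), (2, 2)) == (2, 1)
```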
Optionally, the step a1 may specifically include:
a11, generating a feature descriptor of each first feature point and a feature descriptor of each second feature point;
a12, determining a target first feature point matched with the target second feature point;
a13, calculating the offset between the target first characteristic point and the target second characteristic point, and adding the offset into an offset list;
and A14, determining a global motion vector of the image frame to be aligned according to the offset list.
In the embodiment of the present application, the generation manners of the feature descriptors of the first feature points and the feature descriptors of the second feature points are the same, and the generation manner of the feature descriptors of the first feature points is described below by taking a certain first feature point q in the image frame to be aligned as an example. Firstly, a neighborhood window is determined by taking a first feature point q as a center, and 256 pairs of pixel points are randomly selected in the neighborhood window. Then, for each pair of selected pixels, the magnitude relation of the pixel values of two pixels (such as pixel 1 and pixel 2) in the pair of pixels is determined, if the pixel value of pixel 1 is smaller than that of pixel 2, the corresponding value of the pair of pixels is marked as 1, and if the pixel value of pixel 1 is not smaller than that of pixel 2, the corresponding value of the pair of pixels is marked as 0. Finally, the corresponding values of the 256 pairs of pixel points are spliced into a 256-bit binary code, and the binary code is the feature descriptor of the first feature point q.
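A minimal sketch of this binary (BRIEF-style) descriptor. The neighborhood window size is not specified above, so the 31-pixel window is an assumption, and the same randomly sampled pairs must be reused for every feature point in both frames, hence the fixed seed.

```python
import numpy as np

def make_pairs(window=31, n_pairs=256, seed=0):
    """Sample n_pairs point pairs inside a window centered on the feature point.
    The pairs are fixed (seeded) so all descriptors are comparable."""
    half = window // 2
    rng = np.random.default_rng(seed)
    return rng.integers(-half, half + 1, size=(n_pairs, 2, 2))

def descriptor(gray, y, x, pairs):
    """256-bit code: bit k is 1 iff the first point of pair k is darker than the
    second. Assumes (y, x) lies at least half a window away from the border."""
    bits = np.empty(len(pairs), dtype=np.uint8)
    for k, ((dy1, dx1), (dy2, dx2)) in enumerate(pairs):
        bits[k] = 1 if gray[y + dy1, x + dx1] < gray[y + dy2, x + dx2] else 0
    return np.packbits(bits)  # 32 bytes holding the 256-bit binary code
```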
After the feature descriptors of the first feature points and the feature descriptors of the second feature points are generated, the first feature points and the second feature points need to be matched according to the feature descriptors. For convenience of explanation, one second feature point in the reference image frame, referred to as the target second feature point, will be taken as an example. It will be appreciated that, in practice, the operation described for the target second feature point is performed for each second feature point in the reference image frame. Specifically, the target first feature point is determined as follows: the Hamming distance between the feature descriptor of the target second feature point and the feature descriptor of each first feature point in the image frame to be aligned is calculated, and the first feature point with the smallest Hamming distance is determined to be the target first feature point; that is, the feature descriptor of the target second feature point matches the feature descriptor of the target first feature point.
After determining the target first feature point matched with the target second feature point, the offset between the target second feature point and the target first feature point needs to be calculated. For example, assuming that the coordinates of the target second feature point in the reference image frame are (50, 50) and the coordinates of the target first feature point in the image frame to be aligned are (48, 49), the offset between the target second feature point and the target first feature point is (2, 1). The calculated offset between the target second feature point and the target first feature point is added to a preset offset list. After each second feature point in the reference image frame is operated as a target second feature point, the offset list comprises a plurality of offsets, and the offsets contained in the offset list are analyzed, so that the global motion vector of the image frame to be aligned can be determined.
For example, the offset that occurs the most number of times in the offset list may be determined as the global motion vector. For example, assuming that the offset list includes 500 offsets, 400 offsets are (1, 2), 80 offsets are (0, 1), and 20 offsets are (3, 2), the offset (1, 2) with the largest occurrence number is determined as the global motion vector.
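A sketch of this matching-and-voting step (illustrative names; a brute-force matcher is used for clarity):

```python
import numpy as np
from collections import Counter

def hamming(d1, d2):
    """Hamming distance between two packed binary descriptors."""
    return int(np.unpackbits(d1 ^ d2).sum())

def global_motion_vector(ref_pts, ref_descs, tgt_pts, tgt_descs):
    """For each second feature point (reference frame), find the first feature
    point (frame to be aligned) with the smallest Hamming distance, record the
    coordinate offset, and return the most frequent offset in the list."""
    offsets = Counter()
    for (ry, rx), rd in zip(ref_pts, ref_descs):
        dists = [hamming(rd, td) for td in tgt_descs]
        ty, tx = tgt_pts[int(np.argmin(dists))]
        offsets[(ry - ty, rx - tx)] += 1   # e.g. (50, 50) - (48, 49) = (2, 1)
    return max(offsets, key=offsets.get)
```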
Optionally, the step a3 may specifically include:
b1, determining a first pixel block in the target image block by taking the first characteristic point in the target image block as a center;
b2, according to the global motion vector, determining a second pixel block matched with the first pixel block in the reference image frame;
and B3, determining the offset between the first pixel block and the second pixel block as a local motion vector of the target image block.
In the embodiment of the present application, for convenience of description, one image block in the image frame to be aligned, referred to as the target image block, will be taken as an example. It will be appreciated that in practice, the operation described for the target image block is performed for each image block in the image frame to be aligned. Specifically, first, a first pixel block is determined in the target image block according to a preset pixel block size, centered on a first feature point in the target image block, where the pixel block size can be set according to practical requirements, for example, 32 × 32 pixels. Then, a second pixel block matching the first pixel block may be determined in the reference image frame based on the global motion vector, the second pixel block having the same size as the first pixel block. Finally, the offset between the first pixel block and the second pixel block can be calculated from the coordinates of the first pixel block in the image frame to be aligned and the coordinates of the second pixel block in the reference image frame, and this offset is determined as the local motion vector of the target image block. Because the first pixel block is determined in the target image block with the first feature point as its center, and block matching is performed with this first pixel block instead of directly with the whole image block, the image alignment method in the embodiment of the present application requires very little computation and is highly efficient. In addition, since the first feature point in the target image block represents high-frequency information in the target image block, the first pixel block determined around it contains very little low-frequency information, so performing block matching with the first pixel block also reduces interference from low-frequency information in the target image block, thereby improving the accuracy of the calculated local motion vector of the target image block.
For example, assuming that the coordinates of the center point of the second pixel block in the reference image frame are (50, 50) and the coordinates of the center point of the first pixel block in the image frame to be aligned are (48, 49), the offset between the first pixel block and the second pixel block is (2, 1), i.e. the local motion vector of the target image block is (2, 1). As an example, the local motion vectors of the respective image blocks are indicated in the respective image blocks (i.e., squares) in fig. 4.
It should be noted that, because the first feature points are not necessarily uniformly distributed in the image frame to be aligned, some image blocks do not include the first feature point, some image blocks include one first feature point, and some image blocks include more than two first feature points. Based on this, before the step B1 is executed, the following steps are also included:
and if the target image block does not contain the first characteristic point, directly taking the local motion vector of the image block around the target image block as the local motion vector of the target image block. Specifically, the average value of the local motion vectors of the four image blocks of the upper, lower, left, and right adjacent to the target image block may be used as the local motion vector of the target image block. And if the upper image block, the lower image block, the left image block and the right image block which are adjacent to the target image block do not contain the first characteristic point, taking the average value of the local motion vectors of the four image blocks which are adjacent to the target image block and are arranged at the upper left corner, the lower left corner, the upper right corner and the lower right corner as the local motion vector of the target image block. And if the four adjacent image blocks of the upper left corner, the lower left corner, the upper right corner and the lower right corner of the target image block still do not contain the first feature point, taking the global motion vector of the image frame to be aligned as the local motion vector of the target image block. Thereby avoiding blocking effects.
And if the target image block contains the first characteristic point, judging whether the number of the first characteristic points contained in the target image block is larger than a preset characteristic point number threshold value or not. Taking the threshold of the number of feature points as 3 as an example, if the number of first feature points included in the target image block is less than or equal to 3, all the first feature points included in the target image block are retained.
If the number of first feature points contained in the target image block is greater than 3, the first feature points at the edge of the target image block are removed first, where the edge may be the outermost ring of pixels of the target image block or the outermost two rings, and the extent of the edge can be set as required and is not specifically limited here. The remaining first feature points are then sorted in descending order of confidence, only the top 3 first feature points in the target image block are retained, and the rest are removed. It should be noted that if fewer than 3 first feature points remain in the target image block after the edge feature points are removed, all of the remaining first feature points are retained. In this way, the number of first feature points retained in each image block does not exceed the feature point number threshold, which reduces the computation in the subsequent steps B1, B2 and B3.
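A sketch of this per-block pruning, with the feature point number threshold of 3 from the example; the edge margin of 4 pixels is an assumption, since the text leaves the extent of the edge configurable.

```python
def prune_block_features(points, bh, bw, edge_margin=4, max_keep=3):
    """points: list of (y, x, confidence) in block-local coordinates of a
    bh x bw image block. Keep everything if there are at most max_keep points;
    otherwise drop edge points, then keep the max_keep most confident ones
    (or all that remain, if fewer than max_keep survive edge removal)."""
    if len(points) <= max_keep:
        return points
    inner = [p for p in points
             if edge_margin <= p[0] < bh - edge_margin
             and edge_margin <= p[1] < bw - edge_margin]
    inner.sort(key=lambda p: p[2], reverse=True)
    return inner[:max_keep]
```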
It should be understood that, since the number of first feature points included in the target image block may be more than one, there may be more than one first pixel block determined in the target image block. If more than two first pixel blocks are determined in the target image block, for each first pixel block, a second pixel block matching the first pixel block is determined in the reference image frame, and the offset between the first pixel block and the second pixel block is calculated. After obtaining the offset corresponding to each first pixel block in the target image block, an average value of each offset may be determined as a local motion vector of the target image block.
Optionally, the step B2 may specifically include:
b21, determining a search range in the reference image frame according to the global motion vector and the first characteristic point in the target image block;
and B22, searching the second pixel block matched with the first pixel block in the searching range.
In the embodiment of the present application, the search range may be determined in the reference image frame according to the global motion vector and the first feature point in the target image block. Then, a candidate second pixel block can be determined centered on each pixel point in the search range, where each candidate second pixel block has the same size as the first pixel block. Finally, the similarity between each candidate second pixel block and the first pixel block is calculated, and the candidate second pixel block with the highest similarity to the first pixel block is determined as the second pixel block matched with the first pixel block. Alternatively, the Euclidean distance between the candidate second pixel block and the first pixel block may be taken as the similarity measure. Specifically, assuming that the size of the first pixel block is 32 × 32 pixels, the similarity between the candidate second pixel block and the first pixel block may be calculated according to the following formula:

$$L = \sqrt{\sum_{i=1}^{32}\sum_{j=1}^{32}\left(P_r(i,j) - P_t(i,j)\right)^2}$$

where L is the similarity measure (a smaller L indicates a more similar pixel block), $P_r$ is the pixel value of a pixel point in the candidate second pixel block, and $P_t$ is the pixel value of the corresponding pixel point in the first pixel block.
For example, the manner of determining the search range in the reference image frame may be: a central point is selected in the reference image frame, and the offset between the central point and the first characteristic point in the target image block is equal to the global motion vector. Then, a search range is determined in the reference image frame according to a preset search range size with the center point as the center, wherein the search range size can be 16 × 16 pixels.
For example, referring to fig. 5, the target image block in fig. 5 contains a first feature point A with coordinates (7, 7). Assuming that the global motion vector is (2, 3) and the preset search range size is 3 × 3, a search range centered on a center point a is determined in the reference image frame according to the first feature point A, where the coordinates of the center point a are (7 − 2, 7 − 3) = (5, 4). The search range contains 9 pixel points; a candidate second pixel block is determined centered on each of these 9 pixel points, and the similarity between each of the 9 candidate second pixel blocks and the first pixel block is then calculated. Suppose the candidate second pixel block centered on pixel point m has the highest similarity with the first pixel block; the candidate second pixel block centered on m is then used as the second pixel block matched with the first pixel block.
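A sketch of this search, combining steps B21 and B22 above. The block and search-window sizes follow the 32 × 32 and 16 × 16 examples, the sign of the search-range center follows the fig. 5 example, and border clamping is omitted for brevity (real code must keep every patch inside both frames).

```python
import numpy as np

def match_block(ref, tgt, feat_yx, global_mv, block=32, search=16):
    """Find the pixel block in ref most similar (smallest Euclidean distance)
    to the block centered on first feature point feat_yx in tgt, searching a
    search x search window. Returns the local MV: matched center - feat_yx."""
    h = block // 2
    fy, fx = feat_yx
    patch = tgt[fy - h:fy + h, fx - h:fx + h].astype(np.float64)
    # Search-range center, displaced from the feature point by the global MV
    # (sign as in the fig. 5 example: (7, 7) with MV (2, 3) gives (5, 4)).
    cy, cx = fy - global_mv[0], fx - global_mv[1]
    best_d, best_yx = np.inf, (cy, cx)
    for y in range(cy - search // 2, cy + search // 2):
        for x in range(cx - search // 2, cx + search // 2):
            cand = ref[y - h:y + h, x - h:x + h].astype(np.float64)
            d = np.sqrt(((cand - patch) ** 2).sum())   # the formula above
            if d < best_d:
                best_d, best_yx = d, (y, x)
    return (best_yx[0] - fy, best_yx[1] - fx)
```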
Step 104, aligning the image frame to be aligned to a reference image frame according to the motion vector;
in the embodiment of the application, the image frame to be aligned comprises a plurality of image blocks, and after the motion vector of each image block is calculated, the image frame to be aligned can be aligned to the reference image frame according to the motion vector of each image block. Specifically, the image block may be aligned to the reference image frame according to the motion vector of each image block, for example, if the motion vector of a certain image block a in the image frame to be aligned is (m, n), the image block a is aligned to the reference image frame to obtain an image block a ', and then the pixel value of the pixel point with the coordinate (x, y) in the image block a' is equal to the pixel value of the pixel point with the coordinate (x + m, y + n) in the image block a.
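A readable (unoptimized) sketch of this per-block alignment, following the pixel-mapping rule just stated; block sizes and names are illustrative.

```python
import numpy as np

def align_frame(tgt, block_mvs, block_h, block_w):
    """Build the aligned frame block by block: the aligned block takes its
    pixel at (x, y) from (x + m, y + n) in the frame to be aligned, where
    (m, n) is that block's motion vector (as described in the text)."""
    h, w = tgt.shape[:2]
    out = np.zeros_like(tgt)
    for bi in range(0, h, block_h):
        for bj in range(0, w, block_w):
            m, n = block_mvs[bi // block_h][bj // block_w]
            for y in range(bi, min(bi + block_h, h)):
                for x in range(bj, min(bj + block_w, w)):
                    sy, sx = y + n, x + m   # source pixel (rows = y, cols = x)
                    if 0 <= sy < h and 0 <= sx < w:
                        out[y, x] = tgt[sy, sx]
    return out
```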
As can be seen from the above, in the present application, an image frame to be aligned and a reference image frame are first obtained, a first feature point in the image frame to be aligned and a second feature point in the reference image frame are extracted, a motion vector of the image frame to be aligned is calculated according to the first feature point and the second feature point, and finally the image frame to be aligned is aligned to the reference image frame according to the motion vector. According to the scheme, the characteristic points representing high-frequency information in the image frame to be aligned and the reference image frame are extracted, and the motion vector of the image frame to be aligned is calculated according to the extracted characteristic points, so that the calculated amount of image alignment can be greatly reduced, and the efficiency of image alignment is improved.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Fig. 6 shows a block diagram of an image alignment apparatus provided in an embodiment of the present application, and only a part related to the embodiment of the present application is shown for convenience of description.
The image alignment apparatus 600 includes:
an image obtaining unit 601, configured to obtain an image frame to be aligned and a reference image frame;
a feature point extracting unit 602, configured to extract a first feature point in the image frame to be aligned and a second feature point in the reference image frame;
a motion vector calculation unit 603 configured to calculate a motion vector of the image frame to be aligned according to the first feature point and the second feature point;
an image alignment unit 604, configured to align the image frame to be aligned to the reference image frame according to the motion vector.
Optionally, the motion vector calculation unit 603 further includes:
a global motion vector calculating subunit, configured to calculate a global motion vector of the image frame to be aligned according to the first feature point and the second feature point;
the image segmentation subunit is used for segmenting the image frame to be aligned into a preset number of image blocks;
a local motion vector calculating subunit, configured to calculate a local motion vector of each image block in the image frame to be aligned according to the global motion vector and the first feature point;
and the motion vector calculation subunit is used for calculating the motion vector according to the global motion vector and the local motion vector.
Optionally, the global motion vector calculating subunit includes:
a descriptor generation subunit, configured to generate a feature descriptor for each first feature point and a feature descriptor for each second feature point;
a feature point matching subunit, configured to determine a target first feature point that matches a target second feature point, where the target second feature point is any second feature point in the reference image frame, and a feature descriptor of the target second feature point matches a feature descriptor of the target first feature point;
an offset amount operator unit for calculating an offset amount between the target first feature point and the target second feature point and adding the offset amount to an offset amount list;
and the global motion vector determining subunit is used for determining the global motion vector of the image frame to be aligned according to the offset list.
Optionally, the global motion vector determining subunit is specifically configured to determine, as the global motion vector, an offset that occurs the most frequently in the offset list.
Optionally, the local motion vector calculating subunit includes:
a first pixel block determining subunit, configured to determine a first pixel block in a target image block by taking a first feature point in the target image block as a center, where the target image block is any image block in the image frame to be aligned;
a second pixel block determination subunit configured to determine, in the reference image frame, a second pixel block that matches the first pixel block, based on the global motion vector, the second pixel block having a size that is the same as that of the first pixel block;
a local motion vector determining subunit, configured to determine an offset between the first pixel block and the second pixel block as a local motion vector of the target image block.
Optionally, the second pixel block determination subunit includes:
a search range determining subunit, configured to determine a search range in the reference image frame according to the global motion vector and the first feature point in the target image block;
and the second pixel block searching subunit is used for searching the second pixel block matched with the first pixel block in the searching range.
Optionally, the search range determining subunit is specifically configured to select a central point in the reference image frame, where an offset between the central point and a first feature point in the target image block is equal to the global motion vector; and determining the search range in the reference image frame according to a preset size, wherein the search range takes the central point as a center.
Optionally, the image acquiring unit 601 includes:
the image acquisition subunit is used for acquiring at least two frames of images;
and the image frame determining subunit is used for determining one image of the at least two images as the reference image frame and determining the rest images as the image frames to be aligned.
As can be seen from the above, in the present application, an image frame to be aligned and a reference image frame are first obtained, a first feature point in the image frame to be aligned and a second feature point in the reference image frame are extracted, a motion vector of the image frame to be aligned is calculated according to the first feature point and the second feature point, and finally the image frame to be aligned is aligned to the reference image frame according to the motion vector. According to the scheme, the characteristic points representing high-frequency information in the image frame to be aligned and the reference image frame are extracted, and the motion vector of the image frame to be aligned is calculated according to the extracted characteristic points, so that the calculated amount of image alignment can be greatly reduced, and the efficiency of image alignment is improved.
Fig. 7 is a schematic structural diagram of a terminal device according to an embodiment of the present application. As shown in fig. 7, the terminal device 7 of this embodiment includes: at least one processor 70 (only one shown in fig. 7), a memory 71, and a computer program 72 stored in the memory 71 and executable on the at least one processor 70, wherein the processor 70 implements the following steps when executing the computer program 72:
acquiring an image frame to be aligned and a reference image frame;
extracting a first characteristic point in the image frame to be aligned and a second characteristic point in the reference image frame;
calculating a motion vector of the image frame to be aligned according to the first characteristic point and the second characteristic point;
and aligning the image frame to be aligned to the reference image frame according to the motion vector.
Assuming that the above is the first possible embodiment, in a second possible embodiment provided based on the first possible embodiment, the calculating a motion vector of the image frame to be aligned based on the first feature point and the second feature point includes:
calculating a global motion vector of the image frame to be aligned according to the first characteristic point and the second characteristic point;
dividing the image frame to be aligned into a preset number of image blocks;
calculating local motion vectors of all image blocks in the image frame to be aligned according to the global motion vectors and the first characteristic points;
and calculating the motion vector according to the global motion vector and the local motion vector.
In a third possible embodiment based on the second possible embodiment, the calculating a global motion vector of the image frame to be aligned according to the first feature point and the second feature point includes:
generating a feature descriptor of each first feature point and a feature descriptor of each second feature point;
determining a target first feature point matched with a target second feature point, wherein the target second feature point is any one second feature point in the reference image frame, and a feature descriptor of the target second feature point is matched with a feature descriptor of the target first feature point;
calculating the offset between the target first characteristic point and the target second characteristic point, and adding the offset into an offset list;
and determining the global motion vector of the image frame to be aligned according to the offset list.
In a fourth possible implementation manner provided as a basis for the third possible implementation manner, the determining a global motion vector of the image frame to be aligned according to the offset list includes:
and determining the offset with the largest occurrence frequency in the offset list as the global motion vector.
In a fifth possible embodiment based on the second possible embodiment, the calculating a local motion vector of each image block in the image frame to be aligned based on the global motion vector and the first feature point includes:
determining a first pixel block in a target image block by taking a first characteristic point in the target image block as a center, wherein the target image block is any image block in the image frame to be aligned;
determining a second pixel block matched with the first pixel block in the reference image frame according to the global motion vector, wherein the size of the second pixel block is the same as that of the first pixel block;
and determining the offset between the first pixel block and the second pixel block as a local motion vector of the target image block.
In a sixth possible implementation form based on the fifth possible implementation form, the determining a second pixel block matching the first pixel block in the reference image frame according to the global motion vector includes:
determining a search range in the reference image frame according to the global motion vector and a first feature point in the target image block;
searching the second pixel block matched with the first pixel block in the searching range.
In a seventh possible embodiment based on the sixth possible embodiment, the determining a search range in the reference image frame according to the global motion vector and the first feature point in the target image block includes:
selecting a central point in the reference image frame, wherein the offset between the central point and a first characteristic point in the target image block is equal to the global motion vector;
and determining the search range in the reference image frame according to a preset size, wherein the search range takes the central point as a center.
In an eighth possible implementation manner provided on the basis of the first possible implementation manner, the second possible implementation manner, the third possible implementation manner, the fourth possible implementation manner, the fifth possible implementation manner, the sixth possible implementation manner, or the seventh possible implementation manner, the acquiring the image frame to be aligned and the reference image frame includes:
acquiring at least two frames of images;
determining one frame image of the at least two frame images as the reference image frame, and determining the rest images as the image frame to be aligned.
The terminal device 7 may be a desktop computer, a notebook, a palm computer, a cloud server, or other computing devices. The terminal device may include, but is not limited to, a processor 70, a memory 71. Those skilled in the art will appreciate that fig. 7 is only an example of the terminal device 7, and does not constitute a limitation to the terminal device 7, and may include more or less components than those shown, or combine some components, or different components, for example, and may further include input/output devices, network access devices, and the like.
The processor 70 may be a Central Processing Unit (CPU), or another general-purpose processor, a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The storage 71 may be an internal storage unit of the terminal device 7, such as a hard disk or a memory of the terminal device 7. In other embodiments, the memory 71 may also be an external storage device of the terminal device 7, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the terminal device 7. Further, the memory 71 may include both an internal storage unit and an external storage device of the terminal device 7. The memory 71 is used for storing an operating system, an application program, a BootLoader (BootLoader), data, and other programs, such as program codes of the computer programs. The above-mentioned memory 71 may also be used to temporarily store data that has been output or is to be output.
As can be seen from the above, in the present application, an image frame to be aligned and a reference image frame are first obtained, a first feature point in the image frame to be aligned and a second feature point in the reference image frame are extracted, a motion vector of the image frame to be aligned is calculated according to the first feature point and the second feature point, and finally the image frame to be aligned is aligned to the reference image frame according to the motion vector. According to the scheme, the characteristic points representing high-frequency information in the image frame to be aligned and the reference image frame are extracted, and the motion vector of the image frame to be aligned is calculated according to the extracted characteristic points, so that the calculated amount of image alignment can be greatly reduced, and the efficiency of image alignment is improved.
It should be noted that, for the information interaction, execution process, and other contents between the above-mentioned devices/units, the specific functions and technical effects thereof are based on the same concept as those of the embodiment of the method of the present application, and specific reference may be made to the part of the embodiment of the method, which is not described herein again.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned functions may be distributed as different functional units and modules according to needs, that is, the internal structure of the apparatus may be divided into different functional units or modules to implement all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The embodiments of the present application further provide a computer-readable storage medium, where a computer program is stored, and the computer program, when executed by a processor, implements the steps in the above method embodiments.
Embodiments of the present application provide a computer program product, which, when running on a terminal device, causes the terminal device to execute the steps in the above-mentioned method embodiments.
The integrated unit may be stored in a computer-readable storage medium if it is implemented in the form of a software functional unit and sold or used as a separate product. Based on such understanding, all or part of the processes in the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the method embodiments described above. The computer program includes computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include at least: any entity or apparatus capable of carrying the computer program code to a terminal device, a recording medium, a computer memory, a Read-Only Memory (ROM), a Random-Access Memory (RAM), an electrical carrier signal, a telecommunications signal, or a software distribution medium, such as a USB flash disk, a removable hard disk, a magnetic disk or an optical disk. In certain jurisdictions, in accordance with legislation and patent practice, the computer-readable medium may not be an electrical carrier signal or a telecommunications signal.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/network device and method may be implemented in other ways. For example, the apparatus/network device embodiments described above are merely illustrative: the division into modules or units is only a logical functional division, and other divisions are possible in actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
The above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (11)

1. An image alignment method, comprising:
acquiring an image frame to be aligned and a reference image frame;
extracting a first feature point in the image frame to be aligned and a second feature point in the reference image frame;
calculating a motion vector of the image frame to be aligned according to the first feature point and the second feature point;
aligning the image frame to be aligned to the reference image frame according to the motion vector.
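For illustration only, the following is a minimal sketch of the four claimed steps in Python, assuming grayscale inputs, ORB features, and a pure-translation motion model; the claim itself does not mandate any particular detector, matcher, or motion model.

```python
import cv2
import numpy as np

def align(frame_to_align, reference_frame):
    # Step 1: the two acquired frames are assumed to be grayscale uint8 images.
    orb = cv2.ORB_create()

    # Step 2: extract first feature points (frame to align) and
    # second feature points (reference frame).
    kp1, des1 = orb.detectAndCompute(frame_to_align, None)
    kp2, des2 = orb.detectAndCompute(reference_frame, None)

    # Step 3: estimate a motion vector from matched feature points
    # (here: the mean offset over cross-checked Hamming matches).
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)
    offsets = np.array([np.array(kp2[m.trainIdx].pt) - np.array(kp1[m.queryIdx].pt)
                        for m in matches])
    dx, dy = offsets.mean(axis=0)

    # Step 4: align the frame to the reference frame by translating it.
    M = np.float32([[1, 0, dx], [0, 1, dy]])
    h, w = frame_to_align.shape[:2]
    return cv2.warpAffine(frame_to_align, M, (w, h))
```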
2. The image alignment method according to claim 1, wherein said calculating a motion vector of the image frame to be aligned according to the first feature point and the second feature point comprises:
calculating a global motion vector of the image frame to be aligned according to the first feature point and the second feature point;
dividing the image frame to be aligned into a preset number of image blocks;
calculating a local motion vector of each image block in the image frame to be aligned according to the global motion vector and the first feature point;
calculating the motion vector from the global motion vector and the local motion vector.
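A sketch of the block division and of one reading of the final combination step; the per-block combination rule (vector addition of the global and local components) is an assumption, as the claim does not spell out the arithmetic.

```python
import numpy as np

def divide_into_blocks(frame, rows=4, cols=4):
    # Divide the image frame to be aligned into a preset number
    # (rows x cols) of image blocks, returned as (y0, y1, x0, x1) regions.
    h, w = frame.shape[:2]
    ys = np.linspace(0, h, rows + 1, dtype=int)
    xs = np.linspace(0, w, cols + 1, dtype=int)
    return [(ys[i], ys[i + 1], xs[j], xs[j + 1])
            for i in range(rows) for j in range(cols)]

def combined_motion_vector(global_mv, local_mv):
    # Assumed combination: a block's motion vector is the global motion
    # vector refined by that block's local residual.
    return (global_mv[0] + local_mv[0], global_mv[1] + local_mv[1])
```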
3. The image alignment method according to claim 2, wherein said calculating a global motion vector of the image frame to be aligned according to the first feature point and the second feature point comprises:
generating a feature descriptor of each first feature point and a feature descriptor of each second feature point;
determining a target first feature point that matches a target second feature point, wherein the target second feature point is any one second feature point in the reference image frame, and a feature descriptor of the target second feature point matches a feature descriptor of the target first feature point;
calculating an offset between the target first feature point and the target second feature point, and adding the offset to an offset list;
and determining a global motion vector of the image frame to be aligned according to the offset list.
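A sketch of the offset-list construction, assuming ORB descriptors matched with cross-checked Hamming distance; offsets are rounded to integer pixels so that identical offsets can be counted in claim 4.

```python
import cv2

def build_offset_list(frame_to_align, reference_frame):
    # Generate a feature descriptor for each first and second feature point.
    orb = cv2.ORB_create()
    kp1, des1 = orb.detectAndCompute(frame_to_align, None)
    kp2, des2 = orb.detectAndCompute(reference_frame, None)

    # For each second feature point (reference frame), find the first
    # feature point whose descriptor matches it, then record the offset.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    offset_list = []
    for m in matcher.match(des2, des1):  # query = reference, train = to-align
        x1, y1 = kp1[m.trainIdx].pt      # target first feature point
        x2, y2 = kp2[m.queryIdx].pt      # target second feature point
        offset_list.append((round(x1 - x2), round(y1 - y2)))
    return offset_list
```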
4. The image alignment method according to claim 3, wherein said determining a global motion vector of the image frame to be aligned according to the offset list comprises:
determining the offset with the largest occurrence count in the offset list as the global motion vector.
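With integer-quantized offsets, the most frequent offset can be found with a counter; a minimal sketch:

```python
from collections import Counter

def global_motion_vector(offset_list):
    # The offset with the largest occurrence count in the offset list
    # is taken as the global motion vector.
    return Counter(offset_list).most_common(1)[0][0]
```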
5. The image alignment method according to claim 2, wherein said calculating a local motion vector of each image block in the image frame to be aligned according to the global motion vector and the first feature point comprises:
determining a first pixel block in a target image block, the first pixel block being centered on a first feature point in the target image block, wherein the target image block is any image block in the image frame to be aligned;
determining, in the reference image frame according to the global motion vector, a second pixel block matched with the first pixel block, wherein the second pixel block has the same size as the first pixel block;
determining an offset between the first pixel block and the second pixel block as a local motion vector of the target image block.
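A sketch of the per-block matching under an SSD criterion (the claim does not fix the similarity measure); the bounded search range used here anticipates claims 6 and 7 below, and border clamping is omitted for brevity.

```python
import cv2

def local_motion_vector(frame, ref, feat_xy, global_mv, half=8, search=16):
    # First pixel block: a (2*half+1)^2 patch centered on a first feature
    # point of the target image block.
    x, y = int(feat_xy[0]), int(feat_xy[1])
    patch = frame[y - half:y + half + 1, x - half:x + half + 1]

    # Search the reference frame around the globally-shifted position for
    # the best-matching second pixel block of the same size.
    cx, cy = x + int(global_mv[0]), y + int(global_mv[1])
    region = ref[cy - half - search:cy + half + search + 1,
                 cx - half - search:cx + half + search + 1]
    score = cv2.matchTemplate(region, patch, cv2.TM_SQDIFF)
    min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(score)
    bx, by = min_loc  # TM_SQDIFF: the minimum is the best match

    # The local motion vector is the residual offset of the best match
    # relative to the globally-predicted position.
    return (bx - search, by - search)
```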
6. The image alignment method of claim 5, wherein said determining a second pixel block in the reference image frame that matches the first pixel block based on the global motion vector comprises:
determining a search range in the reference image frame according to the global motion vector and a first feature point in the target image block;
searching, within the search range, for a second pixel block matched with the first pixel block.
7. The image alignment method according to claim 6, wherein the determining a search range in the reference image frame according to the global motion vector and the first feature point in the target image block comprises:
selecting a central point in the reference image frame, wherein the offset between the central point and a first feature point in the target image block is equal to the global motion vector;
and determining the search range in the reference image frame according to a preset size, the search range being centered on the central point.
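The window geometry of claims 6-7 in isolation; `preset` (the window side length) and the clamping policy are assumptions:

```python
def search_range(feature_xy, global_mv, frame_shape, preset=33):
    # Central point: the first feature point shifted by the global motion
    # vector; the search range is a preset-size window centered on it,
    # clamped to the reference frame bounds.
    cx = int(round(feature_xy[0] + global_mv[0]))
    cy = int(round(feature_xy[1] + global_mv[1]))
    half = preset // 2
    h, w = frame_shape[:2]
    x0, y0 = max(cx - half, 0), max(cy - half, 0)
    x1, y1 = min(cx + half, w - 1), min(cy + half, h - 1)
    return x0, y0, x1, y1
```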
8. The image alignment method according to any one of claims 1 to 7, wherein the acquiring the image frame to be aligned and the reference image frame comprises:
acquiring at least two frames of images;
determining one frame of the at least two frames of images as the reference image frame, and determining each remaining frame as the image frame to be aligned.
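The claim leaves the selection policy open; a minimal sketch that simply takes the first captured frame as the reference:

```python
def split_frames(frames):
    # frames: a sequence of at least two images; the first becomes the
    # reference image frame, the rest become image frames to be aligned.
    reference_frame, *frames_to_align = frames
    return reference_frame, frames_to_align
```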
9. An image alignment apparatus, comprising:
an image acquisition unit, configured to acquire an image frame to be aligned and a reference image frame;
a feature point extraction unit, configured to extract a first feature point in the image frame to be aligned and a second feature point in the reference image frame;
a motion vector calculation unit, configured to calculate a motion vector of the image frame to be aligned according to the first feature point and the second feature point;
and an image alignment unit, configured to align the image frame to be aligned to the reference image frame according to the motion vector.
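A structural sketch of the four claimed units wired together; the field names and hook signatures are illustrative, not from the specification:

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple
import numpy as np

@dataclass
class ImageAligner:
    # One field per claimed unit; each callable stands in for the
    # corresponding method-claim step.
    acquire: Callable[[], Tuple[np.ndarray, Sequence[np.ndarray]]]
    extract: Callable[[np.ndarray], object]
    motion: Callable[[object, object], Tuple[float, float]]
    warp: Callable[[np.ndarray, Tuple[float, float]], np.ndarray]

    def run(self) -> List[np.ndarray]:
        reference, to_align = self.acquire()
        ref_feats = self.extract(reference)
        return [self.warp(f, self.motion(self.extract(f), ref_feats))
                for f in to_align]
```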
10. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1 to 8 when executing the computer program.
11. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 8.
CN202010963340.XA 2020-09-14 2020-09-14 Image alignment method, image alignment device and terminal equipment Pending CN114187333A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010963340.XA CN114187333A (en) 2020-09-14 2020-09-14 Image alignment method, image alignment device and terminal equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010963340.XA CN114187333A (en) 2020-09-14 2020-09-14 Image alignment method, image alignment device and terminal equipment

Publications (1)

Publication Number Publication Date
CN114187333A true CN114187333A (en) 2022-03-15

Family

ID=80601161

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010963340.XA Pending CN114187333A (en) 2020-09-14 2020-09-14 Image alignment method, image alignment device and terminal equipment

Country Status (1)

Country Link
CN (1) CN114187333A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20210407102A1 (en) * 2020-06-30 2021-12-30 Amlogic (Shanghai) Co., Ltd Static identification area detecting method and module, chip, eletronic device and medium
US11785172B2 (en) * 2020-06-30 2023-10-10 Amlogic (Shanghai) Co., Ltd Static identification area detecting method and module, chip, electronic device and medium
CN114862735A (en) * 2022-05-23 2022-08-05 Oppo广东移动通信有限公司 Image processing method, image processing device, electronic equipment and computer readable storage medium
CN115965848A (en) * 2023-03-13 2023-04-14 腾讯科技(深圳)有限公司 Image processing method and related device

Similar Documents

Publication Publication Date Title
CN114187333A (en) Image alignment method, image alignment device and terminal equipment
US9552642B2 (en) Apparatus and method for tracking object using feature descriptor, and apparatus and method for removing garbage feature
CN108564579B (en) Concrete crack detection method and detection device based on time-space correlation
US20120154638A1 (en) Systems and Methods for Implementing Augmented Reality
JP6997369B2 (en) Programs, ranging methods, and ranging devices
US9195872B2 (en) Object tracking method and apparatus
CN107680112B (en) Image registration method
CN111461998A (en) Environment reconstruction method and device
CN110852311A (en) Three-dimensional human hand key point positioning method and device
KR20120044484A (en) Apparatus and method for tracking object in image processing system
CN112149583A (en) Smoke detection method, terminal device and storage medium
CN112966654A (en) Lip movement detection method and device, terminal equipment and computer readable storage medium
CN108960012B (en) Feature point detection method and device and electronic equipment
CN111062927A (en) Method, system and equipment for detecting image quality of unmanned aerial vehicle
CN111833285A (en) Image processing method, image processing device and terminal equipment
CN110309721B (en) Video processing method, terminal and storage medium
CN109871779B (en) Palm print identification method and electronic equipment
CN114926508B (en) Visual field boundary determining method, device, equipment and storage medium
CN109816709B (en) Monocular camera-based depth estimation method, device and equipment
CN113228105A (en) Image processing method and device and electronic equipment
CN110610178A (en) Image recognition method, device, terminal and computer readable storage medium
WO2022205841A1 (en) Robot navigation method and apparatus, and terminal device and computer-readable storage medium
CN113840135A (en) Color cast detection method, device, equipment and storage medium
CN114219831A (en) Target tracking method and device, terminal equipment and computer readable storage medium
CN111754411A (en) Image noise reduction method, image noise reduction device and terminal equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination