CN115272428A - Image alignment method and device, computer equipment and storage medium - Google Patents


Info

Publication number
CN115272428A
CN115272428A
Authority
CN
China
Prior art keywords: block image, image, block, images, target
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211021665.1A
Other languages
Chinese (zh)
Inventor
林子尧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Oppo Mobile Telecommunications Corp Ltd
Original Assignee
Sonar Sky Information Consulting Co ltd
Application filed by Sonar Sky Information Consulting Co ltd
Priority to CN202211021665.1A
Publication of CN115272428A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/30 Determination of transform parameters for the alignment of images, i.e. image registration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/50 Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/40 Analysis of texture
    • G06T 7/41 Analysis of texture based on statistical description of texture
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V 10/761 Proximity, similarity or dissimilarity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20212 Image combination
    • G06T 2207/20221 Image fusion; Image merging

Abstract

The application relates to an image alignment method and apparatus, a computer device, a storage medium and a computer program product. According to the texture feature information of a reference frame image, the reference frame image is divided into a plurality of first block images with balanced texture complexity. For each first block image, the target second block image with the maximum similarity among the second block images of the other frames of images to be aligned is determined; a motion vector of the target second block image is then determined according to its position in the other frame to be aligned and the position of the corresponding first block image with the maximum similarity in the reference frame image, and image alignment is performed based on the motion vector. Compared with the traditional approach of aligning the whole image, this scheme divides the image into block images with balanced texture complexity, determines the motion vector of each block image based on the inter-block similarity, and aligns the images based on those motion vectors, thereby improving the accuracy of image alignment.

Description

Image alignment method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to an image alignment method, an image alignment apparatus, a computer device, a storage medium, and a computer program product.
Background
With the development of computer technology, people can capture images with mobile devices such as mobile phones and cameras, and multiple frames of images are often synthesized during capture to enhance image quality. Because the multiple frames may be offset from one another due to capture-time differences or hand shake, they need to be aligned before being combined. At present, multi-frame alignment is usually performed on the whole image; however, real images contain local motion and object deformation, so the images cannot be matched accurately.
Current image alignment methods therefore suffer from low alignment accuracy.
Disclosure of Invention
In view of the above, it is necessary to provide an image alignment method, an apparatus, a computer device, a computer-readable storage medium, and a computer program product capable of improving alignment accuracy.
In a first aspect, the present application provides an image alignment method, including:
acquiring at least two frames of images to be aligned, and determining a frame of reference frame image from the at least two frames of images to be aligned;
dividing a plurality of first block images in the reference frame image according to the texture feature information of the reference frame image; the texture complexity of the first block images is within a preset range;
determining at least one second block image from other frames of images to be aligned according to each first block image, obtaining the similarity between the first block image and each second block image in the at least one second block image, and determining a target second block image with the maximum similarity; the second block image corresponds to the first block image in size;
determining a motion vector of the target second block image according to the position of the target second block image in the to-be-aligned image of the other frame and the position of the first block image in the reference frame image;
aligning, based on the motion vector, the target second block image with the first block image corresponding to the target second block image.
In one embodiment, the dividing a plurality of first block images in the reference frame image according to the texture feature information of the reference frame image comprises:
taking the reference frame image as an image to be divided, and dividing the image to be divided into a plurality of candidate block images with the same size;
for each candidate block image, acquiring the texture complexity and the side length of the candidate block image;
if the texture complexity is greater than a first preset complexity threshold, the side length is greater than or equal to a preset side length threshold and the dividing times are less than a preset time threshold, taking the candidate block image as a new image to be divided, and returning to the step of dividing the image to be divided into a plurality of candidate block images with the same size; otherwise, determining the area corresponding to the candidate block image as a target candidate block image in the reference frame image;
and determining a plurality of first block images in the reference frame image according to the determined plurality of target candidate block images.
In one embodiment, the determining a plurality of first block images in the reference frame image according to the determined plurality of target candidate block images includes:
acquiring the texture complexity of the target candidate block images;
merging adjacent target candidate block images with the texture complexity less than or equal to a second preset complexity threshold to obtain merged target candidate block images; the second preset complexity threshold is smaller than the first preset complexity threshold;
and obtaining a plurality of first block images in the reference frame image according to the non-merged target candidate block image and the merged target candidate block image.
In one embodiment, the obtaining the texture complexity of the candidate block image includes:
according to the gray scale difference between each pixel and its neighboring pixels in the candidate block image, obtaining a gray gradient value corresponding to the candidate block image;
determining a standard deviation value corresponding to the candidate block image according to the distance between each pixel in the candidate block image and a preset position in the candidate block image;
determining the entropy value corresponding to the candidate block image according to each gray level in the candidate block image;
and determining the texture complexity of the candidate block image according to the weighted sum of the gray gradient value, the standard difference value and the entropy value.
In one embodiment, the obtaining the similarity between the first block image and each of the at least one second block image comprises:
if the size of the first block image is larger than or equal to the size of a preset area, determining the similarity between the first block image and each second block image in the at least one second block image based on a first similarity comparison strategy;
if the size of the first block image is smaller than a preset area size and the brightness difference value between the second block image and the first block image is smaller than a preset brightness difference threshold value, determining the similarity between the first block image and each second block image in the at least one second block image based on a second similarity comparison strategy;
if the size of the first block image is smaller than a preset area size and the brightness difference value between the second block image and the first block image is larger than or equal to a preset brightness difference threshold value, determining the similarity between the first block image and each second block image in the at least one second block image based on a third similarity comparison strategy;
wherein the first similarity comparison strategy, the second similarity comparison strategy and the third similarity comparison strategy are different from each other.
In one embodiment, the determining the similarity between the first block image and each of the at least one second block image based on the first similarity comparison policy includes:
constructing a homography matrix according to the texture feature information in the first block image and the same texture feature information in each second block image, adjusting each second block image to match the first block image according to the homography matrix, and determining the similarity between the first block image and each second block image in the at least one second block image according to the error value between the adjusted second block image and the first block image; alternatively,
the determining the similarity of the first block image and each of the at least one second block image based on a second similarity comparison policy comprises:
acquiring, from the other frames of images to be aligned, a search area which is larger than the first block image and contains its position, matching the first block image with second block images in the search area, and determining the similarity between the first block image and a second block image in the search area according to the error value between that second block image and the first block image; alternatively,
the determining the similarity of the first block image and each of the at least one second block image based on a third similarity comparison policy comprises:
and acquiring the Hamming distance between the first block image and the second block image, and obtaining the similarity between the first block image and each second block image in the at least one second block image according to the Hamming distance.
In one embodiment, the obtaining the hamming distance between the first block image and the second block image includes:
acquiring the gray values of a central pixel point of the first block image and other pixel points adjacent to the central pixel point, recording other pixel points of which the gray values are greater than the gray value of the central pixel point as first numerical values, and recording other pixel points of which the gray values are less than or equal to the gray value of the central pixel point as second numerical values; the first value and the second value are different;
obtaining a first numerical value string corresponding to the first block image according to the numerical values corresponding to the other pixel points;
acquiring the gray values of a central pixel point of the second block image and other pixel points adjacent to the central pixel point, recording the other pixel points of which the gray values are greater than the gray value of the central pixel point as first numerical values, and recording the other pixel points of which the gray values are less than or equal to the gray value of the central pixel point as second numerical values;
obtaining a second numerical value string corresponding to the second block image according to the numerical values corresponding to the other pixel points;
and carrying out exclusive OR operation on the first numerical string and the second numerical string to obtain the Hamming distance between the first block image and the second block image.
In one embodiment, the determining a motion vector of the target second block image according to the position of the target second block image in the other frame to-be-aligned image and the position of the first block image in the reference frame image includes:
determining an initial motion vector of an object corresponding to the target second block image according to the distance and direction between the position of the target second block image in the other frames of images to be aligned and the position of the corresponding first block image with the maximum similarity in the reference frame image;
obtaining the confidence of the initial motion vector;
if the confidence is smaller than a preset confidence threshold, adjusting the initial motion vector of the object corresponding to the target second block image according to the initial motion vectors and confidences of the objects corresponding to associated target second block images, to obtain the motion vector of the target second block image; wherein an associated target second block image is adjacent to the target second block image and has a confidence greater than or equal to the preset confidence threshold.
In one embodiment, the obtaining the confidence level of the initial motion vector includes:
acquiring a preset normalization parameter corresponding to a similarity comparison strategy of the target second block image and the first block image;
and obtaining the confidence coefficient of the target second block image according to the ratio of the similarity of the target second block image and the corresponding first block image with the maximum similarity to the preset normalization parameter.
In one embodiment, the determining a frame of reference frame image from the at least two images to be aligned includes:
acquiring the sharpness of each frame of image to be aligned in the at least two frames of images to be aligned;
and taking the frame of image to be aligned with the maximum sharpness as the reference frame image.
In a second aspect, the present application provides an image alignment apparatus, the apparatus comprising:
an acquisition module, configured to acquire at least two frames of images to be aligned and determine a frame of reference frame image from the at least two frames of images to be aligned;
the dividing module is used for dividing a plurality of first block images in the reference frame image according to the texture feature information of the reference frame image; the texture complexity of the first block images is within a preset range;
the first determining module is used for determining at least one second block image from other frames of images to be aligned according to each first block image, acquiring the similarity between the first block image and each second block image in the at least one second block image, and determining a target second block image with the maximum similarity; the second block image corresponds to the first block image in size;
a second determining module, configured to determine a motion vector of the target second block image according to a position of the target second block image in the image to be aligned in the other frame and a position of the first block image in the reference frame image;
and an alignment module, configured to align, based on the motion vector, the target second block image with its corresponding first block image.
In a third aspect, the present application provides a computer device comprising a memory storing a computer program and a processor implementing the steps of the method described above when the processor executes the computer program.
In a fourth aspect, the present application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method described above.
In a fifth aspect, the present application provides a computer program product comprising a computer program which, when executed by a processor, performs the steps of the method described above.
According to the image alignment method, the apparatus, the computer device, the storage medium and the computer program product, a frame of reference frame image is determined from at least two frames of images to be aligned; a plurality of first block images with balanced texture complexity are divided from the reference frame image according to its texture feature information; for each first block image, the target second block image with the maximum similarity among the second block images of the other frames of images to be aligned is determined; a motion vector of the target second block image is determined according to its position in the other frame to be aligned and the position of the corresponding first block image with the maximum similarity in the reference frame image; and image alignment is performed based on the motion vector. Compared with the traditional approach of aligning the whole image, this scheme divides the image into block images with balanced texture complexity, determines the motion vector of each block image based on the inter-block similarity, and aligns the images based on those motion vectors, thereby improving the accuracy of image alignment.
Drawings
FIG. 1 is a flow diagram illustrating an image alignment method in one embodiment;
FIG. 2 is a flowchart illustrating the block image dividing step according to an embodiment;
FIG. 3 is a schematic flow chart of the error determination step in one embodiment;
FIG. 4 is a flow chart illustrating the vector correction step in one embodiment;
FIG. 5 is a block diagram showing the structure of an image alignment apparatus according to an embodiment;
FIG. 6 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In an embodiment, as shown in FIG. 1, an image alignment method is provided. The method is described here as applied to a terminal; it may also be applied to a server, or to a system comprising a terminal and a server and implemented through interaction between the terminal and the server. The method includes the following steps:
step S202, at least two frames of images to be aligned are obtained, and a frame of reference frame image is determined from the at least two frames of images to be aligned.
The images to be aligned may be images captured by the camera of a mobile phone or another device. There may be multiple frames of images to be aligned, and noise reduction and HDR (high dynamic range) effects can be achieved by synthesizing the multiple frames. However, due to capture-time differences or hand shake, the frames may be offset from one another, so they need to be aligned before being combined. When aligning the images, the terminal can obtain at least two frames of images to be aligned, which may be images continuously shot within a preset time period or multiple images with the same image content. The terminal can then determine a frame of reference frame image from the at least two frames of images to be aligned.
The reference frame image may be a representative one of the at least two frames of images to be aligned. The terminal may determine the reference frame image according to the sharpness of each frame. For example, in one embodiment, determining a frame of reference frame image from the at least two images to be aligned comprises: acquiring the sharpness of each frame of image to be aligned in the at least two frames; and taking the frame with the maximum sharpness as the reference frame image. That is, the reference frame image may be the sharpest and most stable of the at least two images to be aligned, and the other frames of images to be aligned can be compared and aligned against it, as illustrated in the sketch below.
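The patent does not fix a particular sharpness measure, so the sketch below uses the variance of the Laplacian, a common proxy for sharpness; the function name and the metric itself are illustrative assumptions rather than the patent's method.

```python
# A minimal sketch of sharpness-based reference frame selection.
# The variance-of-Laplacian metric is an assumption; the text only
# says the frame with the maximum sharpness is chosen.
import cv2
import numpy as np

def select_reference_frame(frames):
    """Return the index of the sharpest grayscale frame."""
    def sharpness(img):
        # Higher Laplacian variance means more high-frequency detail,
        # i.e. a sharper, less blurred frame.
        return cv2.Laplacian(img, cv2.CV_64F).var()
    return int(np.argmax([sharpness(f) for f in frames]))
```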
Step S204, dividing a plurality of first block images in the reference frame image according to the texture feature information of the reference frame image; the texture complexity of the first block images is within a preset range.
The texture feature information may be the texture in the reference frame image, i.e. feature information formed by the spatial distribution of gray levels; by calculating the correlation between pixels in the image, it reflects information such as direction, interval and variation amplitude. The terminal can acquire the texture feature information in the reference frame image and divide the reference frame image into a plurality of first block images according to it, where the texture complexity of each divided first block image is within a preset range. That is, the terminal can adaptively divide the reference frame image into a plurality of blocks with balanced texture complexity. The size of each block in the reference frame image may be the same or different, and is determined according to the texture complexity of each block. For example, the terminal may split a block with high texture complexity multiple times, and merge a block with lower complexity with other blocks. In other words, the terminal can segment the reference frame image multiple times to obtain a plurality of block images whose complexity is within the preset range.
The terminal may further determine the texture complexity of the block image in various manners, for example, the terminal may determine the texture complexity of each block image in a manner of gradient value, standard deviation, entropy value, and the like, so as to determine whether to continue segmenting or merging the blocks.
Step S206, determining at least one second block image from the images to be aligned of other frames aiming at each first block image, obtaining the similarity between the first block image and each second block image in the at least one second block image, and determining a target second block image with the maximum similarity; the second block image corresponds to the first block image in size.
The terminal may divide the reference frame image to obtain a plurality of first block images. Among the at least two frames of images to be aligned, besides the reference frame image there may be other frames of images to be aligned, and the terminal may determine second block images in those other frames according to the size of each first block image in the reference frame image. For example, for each first block image, the terminal may determine in each other frame to be aligned one or more second block images whose size is consistent with that of the first block image.
After obtaining the second block images in the other frames of images to be aligned, the terminal may obtain the similarity between each first block image in the reference frame image and the second block images in the other frames. For example, for each first block image, a plurality of second block images are determined in another frame to be aligned, and the similarity of each second block image to the first block image is computed. The terminal may then determine, from the plurality of second block images of the other frame, the target second block image with the maximum similarity, i.e. the part of the other frame most similar to the first block image. Since there may be a plurality of first block images in the reference frame image, after comparing similarities for each first block image, the terminal can obtain the target second block image most similar to each first block image in the other frame. If several second block images share the greatest similarity with a first block image, the terminal may decide which is the target second block image by calculating a matching error between the first block image and those second block images, where the matching error can be determined from the distance information between the first block image and the second block image.
When there are multiple other frames of images to be aligned, the terminal may determine at least one second block image in each of those frames and determine a target second block image in each frame according to the method above.
Step S208, determining a motion vector of the target second block image according to the position of the target second block image in the to-be-aligned image of the other frame and the position of the first block image in the reference frame image.
The target second block image may be the second block image with the maximum similarity to its corresponding first block image in the other frame to be aligned. That is, there is a first block image in the reference frame image that has the greatest similarity with the target second block image. However, due to capture-time differences or hand shake, the position of the target second block image in the image to be aligned may differ from the position of the corresponding first block image in the reference frame image, so the terminal needs to move the target second block image. The terminal can determine the motion vector of the target second block image according to its position in the other frame to be aligned and the position of the corresponding first block image with the maximum similarity in the reference frame. The motion vector may be a vector determining the moving direction and moving distance of the target second block image, so that the target second block image can be aligned with the corresponding first block image.
Step S210, aligning the target second block image and the image of the corresponding first block image based on the motion vector.
Each target second block image has a corresponding motion vector. The terminal may move each target second block image based on its motion vector so that it aligns with the corresponding first block image with the maximum similarity. After moving the target second block images in each frame to be aligned so that each is aligned with its corresponding first block image in the reference frame, the terminal obtains the aligned images, thereby realizing the alignment of the multiple frames. A minimal sketch of this per-block shift is given below.
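The sign convention for applying the vector and the zero-filled output are assumptions; handling of gaps and overlaps between moved blocks is omitted for brevity.

```python
# Shift each matched block by its motion vector so it lines up with the
# corresponding reference block. Sign convention and zero fill are
# assumptions, not specified by the text.
import numpy as np

def align_blocks(frame, blocks_with_vectors):
    """blocks_with_vectors: list of ((x, y, w, h), (u, v)) entries."""
    aligned = np.zeros_like(frame)
    H, W = frame.shape[:2]
    for (x, y, w, h), (u, v) in blocks_with_vectors:
        nx, ny = x - u, y - v  # move the block back onto the reference grid
        if 0 <= nx and 0 <= ny and nx + w <= W and ny + h <= H:
            aligned[ny:ny + h, nx:nx + w] = frame[y:y + h, x:x + w]
    return aligned
```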
In the image alignment method, a frame of reference frame image is determined from at least two frames of images to be aligned; a plurality of first block images with balanced texture complexity are divided from the reference frame image according to its texture feature information; for each first block image, the target second block image with the maximum similarity among the second block images of the other frames to be aligned is determined; a motion vector of the target second block image is determined according to its position in the other frame and the position of the corresponding first block image with the maximum similarity in the reference frame image; and image alignment is performed based on the motion vector. Compared with the traditional approach of aligning the whole image, this scheme divides the image into block images with balanced texture complexity, determines the motion vector of each block image based on the inter-block similarity, and aligns the images based on those motion vectors, thereby improving the accuracy of image alignment.
In one embodiment, dividing a plurality of first block images in a reference frame image according to texture feature information of the reference frame image comprises: taking the reference frame image as an image to be divided, and dividing the image to be divided into a plurality of candidate block images with the same size; aiming at each candidate block image, acquiring the texture complexity and the side length of the candidate block image; if the texture complexity is greater than a first preset complexity threshold, the side length is greater than or equal to a preset side length threshold and the dividing times are less than a preset time threshold, taking the candidate block image as a new image to be divided, and returning to the step of dividing the image to be divided into a plurality of candidate block images with the same size; otherwise, determining the area corresponding to the candidate block image as a target candidate block image in the reference frame image; and determining a plurality of first block images in the reference frame image according to the determined plurality of target candidate block images.
In this embodiment, the terminal may divide a plurality of first block images with balanced texture complexity in the reference frame image based on the texture feature information in the reference frame image. The terminal may first take the reference frame image as an image to be divided, and divide the image to be divided into a plurality of candidate block images of the same size. The terminal may analyze the image of each candidate block, for example, analyze the texture complexity of the image in the block, and determine whether to continue the segmentation or the merging. The number of the candidate block images can be multiple, and for each candidate block image, the terminal can obtain the texture complexity and the side length of the candidate block image, so that the terminal can determine whether the block needs to be further segmented or merged based on the texture complexity and the side length of the block.
As shown in FIG. 2, image 301 may be the result of dividing the reference frame image into a plurality of candidate block images of the same size, such as the four square blocks in image 301. The terminal may detect the texture complexity and side length of each candidate block image. The terminal can determine the texture complexity of a block image in various ways. For example, in one embodiment, obtaining the texture complexity of the candidate block image includes: obtaining a gray gradient value corresponding to the candidate block image according to the gray difference between each pixel and its adjacent pixels; determining a standard deviation value corresponding to the candidate block image according to the distance between each pixel and a preset position in the candidate block image; determining the entropy value corresponding to the candidate block image according to each gray level in the candidate block image; and determining the texture complexity of the candidate block image according to the weighted sum of the gray gradient value, the standard deviation value and the entropy value.
In this embodiment, the terminal may determine the texture complexity of the candidate block image using gray-level gradient values, variance, standard deviation, entropy and the like. The image gradient is commonly used for edge detection and image complexity analysis: the gradient points in the direction of fastest change, pixel values change most at edges, and the gradient value there is larger. The variance and standard deviation describe the degree of dispersion between pixels. The entropy value is a metric describing the amount of information present in the image and indicates its complexity: the higher the complexity of the image, the larger the entropy value, and vice versa. There may be multiple candidate block images; for each one, the terminal may obtain the gray difference between each pixel and its neighboring pixels to obtain the gray gradient value corresponding to the candidate block image, determine the standard deviation value according to the distance between each pixel and a preset position in the candidate block image, and determine the entropy value according to each gray level present in the candidate block image.
Specifically, the gray gradient value can be calculated as follows: grad(x, y) = dx(i, j) + dy(i, j); dx(i, j) = p(i, j) − p(i−1, j); dy(i, j) = p(i, j) − p(i, j−1); where grad(x, y) is the gray gradient value of the candidate block image, dx(i, j) is the gray variation in the x-axis direction, dy(i, j) is the gray variation in the y-axis direction, p(i, j) is the gray value of one pixel, p(i, j) − p(i−1, j) represents the gray variation of two pixels adjacent in the x-axis direction, and p(i, j) − p(i, j−1) represents the gray variation of two pixels adjacent in the y-axis direction. The standard deviation is the square root of the variance, and its formula can be as follows:
σ = sqrt( (1/N) · Σ_{i=1}^{N} (X_i − μ)² )
where X_i represents the position of the i-th pixel in the candidate block image, μ is the preset position in the candidate block image, and N is the total number of pixels in the candidate block image. The formula for the entropy value can be as follows:
H(X) = − Σ_{i=1}^{n} p_i · log(p_i)
h (X) is an entropy value of a candidate patch image, pi may be a probability of occurrence of each gray level of the candidate patch image, the probability may be determined by determining all gray levels occurring in the candidate patch image, and then determining pi based on the number of gray levels occupying the total gray level, and n may be a total number of gray levels in the candidate patch image. After the terminal obtains the grayscale value, the standard difference value and the entropy value, the texture complexity of the candidate block image can be determined according to the weighted sum of the grayscale value, the standard difference value and the entropy value, wherein in the weighted sum, the weights corresponding to the grayscale value, the standard difference value and the entropy value can be set according to actual conditions.
After obtaining the texture complexity and the side length of each candidate block image, the terminal can judge whether the texture complexity is greater than a first preset complexity threshold, whether the side length is greater than or equal to a preset side length threshold, and whether the division times are less than a preset time threshold. If the terminal detects that the texture complexity is larger than a first preset complexity threshold, the side length is larger than or equal to a preset side length threshold and the division times are smaller than a preset times threshold, the terminal can determine that the texture of the candidate block image is complex enough and needs to be divided continuously, the terminal takes the candidate block image as a new image to be divided, and returns to the step of dividing the image to be divided into a plurality of candidate block images with the same size for next division. Specifically, as shown in 302 in fig. 2, after calculating the texture complexity of each candidate block image, the terminal determines that the texture of the candidate block image is sufficiently complex through the above determination, and then divides the candidate block image again.
When detecting that the texture complexity of the candidate block image is less than or equal to the first preset complexity threshold, or the side length is less than the preset side length threshold, or the number of divisions is greater than or equal to the preset times threshold, the terminal may determine that division of the block image should stop, and determine the region corresponding to the candidate block image as a target candidate block image in the reference frame image. There may be a plurality of target candidate block images, and the terminal may finally obtain the division result shown at 304 and 305 in FIG. 2. That is, each block in 305 may be a target candidate block image, and the terminal may determine a plurality of first block images in the reference frame image according to the determined target candidate block images.
Through this embodiment, the terminal can determine the texture complexity of a block image based on various parameters and can segment blocks whose texture complexity is too high, obtaining a plurality of block images with balanced texture complexity in the reference frame, so that the terminal can perform image alignment based on the plurality of first block images in the reference frame image, improving the accuracy of image alignment. A recursive sketch of this split follows.
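The sketch reuses texture_complexity from the sketch above; the threshold values are placeholders, and the quadtree-style four-way split is an assumption consistent with the equal-size division described in the text.

```python
# Recursive split following the three stop conditions in the text:
# complexity at or below the first threshold, side length below the
# minimum, or maximum division depth reached. Thresholds are placeholders.
def split_blocks(img, x, y, size, depth, out,
                 cplx_thresh=60.0, min_side=16, max_depth=4):
    block = img[y:y + size, x:x + size]
    if (texture_complexity(block) > cplx_thresh
            and size >= min_side
            and depth < max_depth):
        half = size // 2
        for dy in (0, half):
            for dx in (0, half):
                split_blocks(img, x + dx, y + dy, half, depth + 1, out,
                             cplx_thresh, min_side, max_depth)
    else:
        out.append((x, y, size))  # a target candidate block

# Usage (assuming image sides are multiples of the initial block size):
# blocks = []
# for y0 in range(0, img.shape[0], 128):
#     for x0 in range(0, img.shape[1], 128):
#         split_blocks(img, x0, y0, 128, 0, blocks)
```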
In one embodiment, determining a plurality of first block images in the reference frame image according to the determined plurality of target candidate block images comprises: acquiring the texture complexity of a plurality of target candidate block images; merging adjacent target candidate block images with the texture complexity less than or equal to a second preset complexity threshold to obtain merged target candidate block images; the second preset complexity threshold is smaller than the first preset complexity threshold; and obtaining a plurality of first block images in the reference frame image according to the non-merged target candidate block image and the merged target candidate block image.
In this embodiment, after dividing the candidate block images, the terminal may obtain a plurality of target candidate block images with lower texture complexity, among which there may be blocks whose texture complexity is too low and whose size is too small, so the terminal may merge such blocks. After obtaining the plurality of target candidate block images, the terminal can obtain their texture complexities and identify those whose texture complexity is less than or equal to the second preset complexity threshold; the terminal can then merge adjacent target candidate block images whose texture complexity is less than or equal to the second preset complexity threshold, obtaining merged target candidate block images. The second preset complexity threshold is smaller than the first preset complexity threshold, so the terminal can keep the texture complexity of the merged target candidate block images within a certain range. Specifically, as shown at 303 in FIG. 2, the terminal may merge adjacent target candidate block images whose texture complexity is too small, so that the terminal can obtain the plurality of first block images in the reference frame image from the non-merged and merged target candidate block images.
Through this embodiment, the terminal can merge blocks whose texture complexity is too low with adjacent blocks, obtaining block images whose texture complexity lies within the preset range in the reference frame image; performing image alignment based on these block images improves the accuracy of image alignment. A simplified merge pass is sketched below.
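This sketch makes strong assumptions: blocks are the axis-aligned squares from the split above, and only equal-size, horizontally adjacent pairs whose complexity clears the second (smaller) threshold are merged; the text itself does not restrict the merge direction or pairing.

```python
# Merge horizontally adjacent, equal-size low-complexity blocks.
# Returned entries are (x, y, width, height); unmerged blocks keep
# their original square size.
def merge_blocks(blocks, img, cplx_thresh2=20.0):
    merged, used = [], set()
    def low(x, y, s):
        return texture_complexity(img[y:y + s, x:x + s]) <= cplx_thresh2
    for i, (x, y, s) in enumerate(blocks):
        if i in used:
            continue
        for j, (x2, y2, s2) in enumerate(blocks):
            if (j != i and j not in used and s2 == s and y2 == y
                    and x2 == x + s and low(x, y, s) and low(x2, y2, s2)):
                merged.append((x, y, 2 * s, s))  # merged pair, double width
                used.update({i, j})
                break
        else:
            merged.append((x, y, s, s))
    return merged
```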
In one embodiment, the obtaining the similarity between the first block image and each of the at least one second block image comprises: if the size of the first block image is larger than or equal to the size of the preset area, determining the similarity between the first block image and each second block image in at least one second block image based on a first similarity comparison strategy; if the size of the first block image is smaller than the size of the preset area and the brightness difference value between the second block image and the first block image is smaller than the preset brightness difference threshold value, determining the similarity between the first block image and each second block image in at least one second block image based on a second similarity comparison strategy; if the size of the first block image is smaller than the size of the preset area and the brightness difference value between the second block image and the first block image is larger than or equal to the preset brightness difference threshold value, determining the similarity between the first block image and each second block image in at least one second block image based on a third similarity comparison strategy; the first similarity comparison strategy, the second similarity comparison strategy and the third similarity comparison strategy are different from each other.
In this embodiment, there may be a plurality of first block images in the reference frame image. For each first block image, the terminal may determine, based on its size, second block images of the same size in the other frames of images to be aligned; there may be a plurality of second block images in each other frame, and the terminal may determine the similarity between a second block image and the first block image in several ways. For example, the terminal may detect the size of the first block image; if the size is greater than or equal to the preset area size, the terminal may determine the similarity between the first block image and each second block image based on the first similarity comparison strategy. If the terminal detects that the size of the first block image is smaller than the preset area size, it may further detect the luminance difference between the first block image and each second block image. If the size is smaller than the preset area size and the luminance difference between the second block image and the first block image is smaller than the preset luminance difference threshold, indicating that the luminance change between the two blocks is balanced, the terminal may determine the similarity based on the second similarity comparison strategy. If the size is smaller than the preset area size and the luminance difference is greater than or equal to the preset luminance difference threshold, indicating that the luminance change between the two blocks is large, the terminal may determine the similarity based on the third similarity comparison strategy.
The first similarity comparison strategy, the second similarity comparison strategy and the third similarity comparison strategy may be different from each other, the first similarity comparison strategy may be a matrix-based comparison strategy, the second similarity comparison strategy may be a comparison strategy based on a local block matching manner, and the third similarity comparison strategy may be a comparison strategy based on a stereo matching manner. For example, in one embodiment, determining the similarity between the first block image and each of the at least one second block image based on the first similarity comparison policy includes: according to the texture feature information in the first block image and the same texture feature information in each second block image, a homography matrix is constructed, each second block image is adjusted to be matched with the first block image according to the homography matrix, and according to the error value of the adjusted second block image and the first block image, the similarity between the first block image and each second block image in at least one second block image is determined; determining the similarity between the first block image and each of the at least one second block image based on a second similarity comparison policy, including: acquiring a search area which is larger than the size of the first block image and contains the first block image from the other frames of images to be aligned, matching the first block image with a second block image in the search area, and determining the similarity between the first block image and the second block image in the search area according to the error value between the second block image and the first block image in the search area; determining the similarity between the first block image and each second block image in the at least one second block image based on a third similarity comparison strategy, including: and acquiring the Hamming distance between the first block image and the second block image, and obtaining the similarity between the first block image and each second block image in at least one second block image according to the Hamming distance.
In this embodiment, the terminal may determine the similarity between blocks based on the matching cost, also referred to as the error value, and may choose the matching-cost method based on the size of the block images and the luminance difference between them. For the case where the size of the first block image is greater than or equal to the preset area size, the terminal may construct a homography matrix according to the texture feature information in the first block image and the same texture feature information in each second block image, adjust each second block image to match the first block image according to the homography matrix, and determine the similarity between the first block image and each second block image based on the error value between the adjusted second block image and the first block image. Specifically, when the first block image is large, for example larger than the preset area size, indicating that it belongs to a flat area or an area with weak texture, the terminal may detect feature points and match the first block image with the second block image in the other frame to be aligned through the homography matrix, obtaining a matching error value E_1_ini between the first block image and the second block image and an initial motion vector V_1_ini between them. The homography matrix is a 3×3 matrix containing rotation, scaling and translation information; the second block image in the other frame to be aligned can be transformed by the homography matrix so as to match the first block image. After obtaining the matching error value E_1_ini, the terminal may determine the similarity between the first block image and each second block image based on the error value; for example, the smaller the error value, the greater the similarity.
For the case where the size of the first block image is smaller than the preset area size and the luminance difference between the second block image and the first block image is smaller than the preset luminance difference threshold, the terminal may obtain, from the other frames of images to be aligned and according to the size of the first block image, a search area that is larger than the first block image and contains its position, match the first block image with second block images in the search area, and determine the similarity between the first block image and a second block image in the search area according to the error value between them. Specifically, the search area may be a search window determined by the terminal in the other frame to be aligned; the terminal determines second block images within the search area and searches for the one most similar to the first block image. When the luminance variations of the first and second block images are uniform, the terminal may estimate the matching cost using a local block-matching method, thereby obtaining a matching error E_2_ini between the first block image and the second block image and an initial motion vector V_2_ini between them. The calculation formula of the second similarity comparison strategy may be as follows:
D_p(u, v) = Σ_{(x, y)} | I(x + u + u₀, y + v + v₀) − T(x, y) |^p
where D_p(u, v) is the matching error value; p is the norm, typically 1 or 2; (u, v) is the candidate motion vector; (u₀, v₀) is the block's position within the frame; I(x + u + u₀, y + v + v₀) ranges over the search area; and T(x, y) is the area of the block image. A sketch of this exhaustive search follows.
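The sketch below implements the cost D_p(u, v) with p = 1 (sum of absolute differences) over a small search window; the search radius and the (x0, y0) layout are assumptions for illustration.

```python
# Exhaustive local block matching using the D_p cost above with p = 1.
# (x0, y0) is the block's position in the reference frame; the search
# radius is a placeholder.
import numpy as np

def best_match(ref_block, frame, x0, y0, radius=8, p=1):
    h, w = ref_block.shape
    ref = ref_block.astype(np.float64)
    best_cost, best_vec = np.inf, (0, 0)
    for v in range(-radius, radius + 1):
        for u in range(-radius, radius + 1):
            x, y = x0 + u, y0 + v
            if x < 0 or y < 0 or x + w > frame.shape[1] or y + h > frame.shape[0]:
                continue  # candidate block would leave the frame
            cand = frame[y:y + h, x:x + w].astype(np.float64)
            cost = np.sum(np.abs(cand - ref) ** p)
            if cost < best_cost:
                best_cost, best_vec = cost, (u, v)
    return best_vec, best_cost  # initial motion vector V_2_ini, error E_2_ini
```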
For the case that the size of the first block image is smaller than the preset area size and the luminance difference value between the second block image and the first block image is greater than or equal to the preset luminance difference threshold, the terminal may obtain the hamming distance between the first block image and the second block image, and obtain the similarity between the first block image and each second block image in the at least one second block image according to the hamming distance. The hamming distance can be determined by stereo matching.
For example, in one embodiment, obtaining the hamming distance between the first block image and the second block image comprises: acquiring gray values of a central pixel point of the first block image and other pixel points adjacent to the central pixel point, recording other pixel points of which the gray values are greater than the gray value of the central pixel point as a first numerical value, and recording other pixel points of which the gray values are less than or equal to the gray value of the central pixel point as a second numerical value; the first value and the second value are different; obtaining a first numerical value string corresponding to the first block image according to the numerical values corresponding to other pixel points; acquiring the gray values of a central pixel point of the second block image and other pixel points adjacent to the central pixel point, recording other pixel points of which the gray values are greater than the gray value of the central pixel point as first numerical values, and recording other pixel points of which the gray values are less than or equal to the gray value of the central pixel point as second numerical values; obtaining a second numerical value string corresponding to the second block image according to the numerical values corresponding to other pixel points; and carrying out exclusive OR operation on the first numerical string and the second numerical string to obtain the Hamming distance between the first block image and the second block image.
In this embodiment, before obtaining the Hamming distance, the terminal may convert the pixels in the first block image and the second block image into numerical values. The terminal can obtain the gray values of the central pixel of the first block image and of the other pixels adjacent to it, record the other pixels whose gray values are greater than that of the central pixel as a first value, and record the other pixels whose gray values are less than or equal to that of the central pixel as a second value, where the first value and the second value are different. The terminal can thus obtain the first numerical string corresponding to the first block image from the values of the other pixels. Likewise, the terminal can obtain the gray values of the central pixel of the second block image and of the other pixels adjacent to it, record those greater than the central pixel as the first value and those less than or equal to it as the second value, and obtain the second numerical string corresponding to the second block image from the values of the other pixels. After obtaining the two numerical strings, the terminal may perform an exclusive OR (XOR) operation on the first numerical string and the second numerical string to obtain the Hamming distance between the first block and the second block, and may further obtain an initial motion vector between the first block image and the second block image, so that the terminal can determine the similarity between the first block image and the second block image based on the Hamming distance.
Specifically, as shown in fig. 3, fig. 3 is a schematic flow chart of the error determination step in one embodiment. When the texture complexity of the first block image is high and the luminance difference between the first block image and the second block image is large, the terminal may perform stereo matching encoding on the differences between the center point of each block image and the pixels adjacent to that center point, obtaining two character strings, for example {11010101} corresponding to the first block image and {10100111} corresponding to the second block image in the figure. Specifically, the terminal may select any point in the block image and draw a rectangle of, for example, 3 × 3 with that point as the center; each point in the rectangle except the center point is compared with the center point, a gray value smaller than the center point is marked as 1, and a gray value larger than the center point is marked as 0, so as to obtain the numerical value of each pixel point. The terminal extracts these numerical values in a fixed order to obtain the first numerical string and the second numerical string. The terminal may then perform an exclusive or operation on the two numerical strings to obtain the hamming distance; for example, for {11010101} and {10100111} the hamming distance is 4, which indicates that the matching error E3_ini between the first block image and the second block image is 4. The terminal may further obtain an initial motion vector V3_ini according to the position difference between the first block image and the second block image, so that the terminal can determine the similarity based on the matching error.
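For concreteness, the following is a minimal sketch of this encoding and Hamming-distance computation, assuming 8-bit grayscale blocks stored as NumPy arrays; the function names, the 3 × 3 window, and the bit order are illustrative assumptions rather than details fixed by this embodiment.

```python
import numpy as np

def census_code(block: np.ndarray, cy: int, cx: int) -> int:
    """Encode the 3x3 neighborhood around (cy, cx) as an 8-bit numeric string.

    Following the example in the text, a neighbor whose gray value is smaller
    than the center pixel is recorded as 1, otherwise as 0.
    """
    center = block[cy, cx]
    bits = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue  # skip the center pixel itself
            bits = (bits << 1) | (1 if block[cy + dy, cx + dx] < center else 0)
    return bits

def hamming_distance(code_a: int, code_b: int) -> int:
    """XOR the two numeric strings and count the differing bits."""
    return bin(code_a ^ code_b).count("1")

# Example from the text: {11010101} and {10100111} differ in 4 bits.
assert hamming_distance(0b11010101, 0b10100111) == 4
```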
Through the above embodiments, the terminal can select different similarity comparison strategies between blocks based on the sizes of the block images and the brightness change values between them, so as to obtain the similarity and the initial motion vector between blocks; the terminal then aligns the blocks based on the similarity and the initial motion vector, which improves the accuracy of image alignment.
In one embodiment, determining a motion vector of the target second block image according to the position of the target second block image in the other frame images to be aligned and the position of the first block image in the reference frame image includes: determining an initial movement vector of an object corresponding to the target second block image according to the distance and the direction of the position of the target second block image in the other frames of images to be aligned and the position of the corresponding first block image with the maximum similarity in the reference frame image; obtaining the confidence coefficient of the initial motion vector; if the confidence coefficient is smaller than a preset confidence coefficient threshold value, adjusting the first initial motion vector of the object corresponding to the target second block image according to the initial motion vector of the object corresponding to the associated target second block image and the confidence coefficient thereof to obtain the motion vector of the target second block image; and the associated target second block image is adjacent to the target second block image, and the confidence coefficient is greater than or equal to the preset confidence coefficient threshold value.
In this embodiment, the terminal may determine the motion vector of the target second block image based on the position of each target second block image in the to-be-aligned image where it is located and the position of the corresponding first block image with the largest similarity in the reference frame image. The terminal may first determine the initial motion vector of the object corresponding to the target second block image according to the distance and direction between the position of the target second block image in the other frame of the to-be-aligned image and the position of the first block image with the largest similarity in the reference frame image. After determining each initial motion vector, the terminal can obtain the confidence of the initial motion vector and check it against a threshold.
The confidence can be obtained based on the matching error, but the matching errors may come from different similarity comparison strategies, so the terminal can determine the confidence after normalizing each matching error. For example, in one embodiment, obtaining the confidence of an initial motion vector comprises: acquiring a preset normalization parameter corresponding to the similarity comparison strategy of the target second block image and the first block image; and obtaining the confidence of the target second block image according to the ratio of the similarity of the target second block image and the corresponding first block image with the maximum similarity to the preset normalization parameter. In this embodiment, when the terminal performs normalization, it may obtain the preset normalization parameter corresponding to the similarity comparison policy between the target second block image and the first block image, and obtain the confidence of the target second block image according to the ratio of the similarity of the target second block image and the corresponding first block image with the maximum similarity to the preset normalization parameter. Specifically, since the blocks have different sizes and the similarity comparison policies used differ, the terminal may first determine the preset normalization parameter N of each similarity comparison policy; for example, if there are three similarity comparison policies, there are three preset normalization parameters N1, N2, and N3. The terminal can then obtain the corresponding confidence under each similarity comparison strategy through the formula Cn = En / Nn, n = 1, 2, 3, where En denotes the matching error obtained by the nth similarity comparison strategy and Cn denotes the confidence calculated from that matching error. Through the above calculation, the terminal places the error values on the same reference for subsequent comparison.
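As a hedged illustration of this normalization step, the sketch below assumes three strategies with hypothetical parameter values for N1, N2, and N3; only the ratio Cn = En / Nn comes from the text.

```python
# Hypothetical preset normalization parameters N_1, N_2, N_3 for the three
# similarity comparison strategies; the values here are assumptions.
PRESET_NORMALIZATION = {1: 255.0, 2: 64.0, 3: 8.0}

def confidence(matching_error: float, strategy: int) -> float:
    """Return C_n = E_n / N_n, placing errors produced by different
    similarity comparison strategies on the same reference scale."""
    return matching_error / PRESET_NORMALIZATION[strategy]
```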
The terminal may obtain a first initial motion vector of the object corresponding to a target second block image whose confidence is smaller than the preset confidence threshold, and a second initial motion vector of the object corresponding to a target second block image whose confidence is greater than or equal to the preset confidence threshold. The terminal may then obtain the second initial motion vector, and its confidence, of the object corresponding to a target second block image that is adjacent to the target second block image and whose confidence is greater than or equal to the preset confidence threshold, and adjust the first initial motion vector according to that second initial motion vector and confidence to obtain the motion vector of the target second block image.
Wherein the adjustment may be an adjustment of the vector's direction and of the moving distance it represents. Specifically, as shown in fig. 4, fig. 4 is a schematic flow chart of the vector correction step in one embodiment. After the similarity comparison, each second block image has a corresponding confidence and initial motion vector. For a block whose confidence is lower than a preset confidence threshold Th_c, the terminal may correct its vector using the adjacent blocks whose confidence is higher than Th_c, by means of a weighted sum, to obtain the final motion vector V_d. The calculation formula is: V_d = (Σi Vi · Ci) / (Σi Ci), for Ci > Th_c, where i indexes the adjacent blocks. As shown in fig. 4, a dotted arrow indicates a block image whose confidence is below the preset confidence threshold and which therefore needs to be corrected, while a solid arrow indicates a high-confidence block image that can be used for the correction; comparing the middle block before and after correction in fig. 4, its vector is corrected and it becomes a high-confidence block image.
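A minimal sketch of this weighted-sum correction follows, assuming 2-D motion vectors stored as NumPy arrays; the fallback behavior when no high-confidence neighbor exists is an assumption not specified in the text.

```python
import numpy as np

def correct_vector(v_low, neighbors, th_c):
    """Correct a low-confidence block's vector by the weighted sum
    V_d = (sum_i V_i * C_i) / (sum_i C_i) over adjacent blocks with C_i > Th_c.

    v_low:     initial motion vector of the low-confidence block (np.ndarray)
    neighbors: list of (vector, confidence) pairs of the adjacent blocks
    th_c:      preset confidence threshold Th_c
    """
    reliable = [(v, c) for v, c in neighbors if c > th_c]
    if not reliable:
        return v_low  # assumption: keep the original vector if no reliable neighbor
    num = sum(v * c for v, c in reliable)
    den = sum(c for _, c in reliable)
    return num / den
```

For example, with neighbors [(np.array([2.0, 0.0]), 0.9), (np.array([0.0, 2.0]), 0.9)] and th_c = 0.5, the corrected vector is [1.0, 1.0].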
Through this embodiment, the terminal can first normalize the confidences, make the confidence judgment on the same reference, and correct a block with lower confidence based on its adjacent high-confidence blocks, so that the terminal can move the object of the block image based on the corrected motion vector, which improves the accuracy of image alignment.
It should be understood that, although the steps in the flowcharts related to the above embodiments are shown in sequence as indicated by the arrows, they are not necessarily executed in that sequence. Unless explicitly stated otherwise herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least a part of the steps in the flowcharts related to the above embodiments may include multiple sub-steps or multiple stages, which are not necessarily performed at the same moment but may be performed at different moments, and their execution order is not necessarily sequential; they may be performed in turns or alternately with other steps or with at least a part of the sub-steps or stages of other steps.
Based on the same inventive concept, the embodiment of the present application further provides an image alignment apparatus for implementing the image alignment method. The implementation scheme for solving the problem provided by the apparatus is similar to the implementation scheme described in the above method, so specific limitations in one or more embodiments of the image alignment apparatus provided below may refer to the limitations on the image alignment method in the foregoing, and details are not described here.
In one embodiment, as shown in fig. 5, there is provided an image alignment apparatus including: an obtaining module 500, a dividing module 502, a first determining module 504, a second determining module 506, and an aligning module 508, wherein:
the acquiring module 500 is configured to acquire at least two frames of images to be aligned, and determine a frame of reference frame image from the at least two frames of images to be aligned.
A dividing module 502, configured to divide a plurality of first block images in a reference frame image according to texture feature information of the reference frame image; the texture complexity of the first block images is within a preset range.
A first determining module 504, configured to determine at least one second block image from the to-be-aligned images of other frames for each first block image, obtain a similarity between each first block image in the reference frame image and each second block image in the at least one second block image, and determine a target second block image with a maximum similarity; the second block image corresponds to the size of the first block image.
The second determining module 506 is configured to determine a motion vector of the target second block image according to the position of the target second block image in the image to be aligned in the other frame and the position of the first block image in the reference frame image.
An alignment module 508, configured to align the target second block image and the image of the corresponding first block image based on the motion vector.
In an embodiment, the dividing module 502 is specifically configured to use the reference frame image as an image to be divided, and divide the image to be divided into a plurality of candidate block images with the same size; aiming at each candidate block image, acquiring the texture complexity and the side length of the candidate block image; if the texture complexity is greater than a first preset complexity threshold, the side length is greater than or equal to a preset side length threshold and the dividing times are less than a preset time threshold, taking the candidate block image as a new image to be divided, and returning to the step of dividing the image to be divided into a plurality of candidate block images with the same size; otherwise, determining the area corresponding to the candidate block image as a target candidate block image in the reference frame image; and determining a plurality of first block images in the reference frame image according to the determined plurality of target candidate block images.
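A possible form of this iterative division is sketched below, assuming a quadtree-style 2 × 2 split and a texture_complexity() helper (one possible form of that helper is sketched after the complexity-measure paragraph below); the split factor and the threshold values are assumptions, not values fixed by this embodiment.

```python
import numpy as np

def divide_blocks(img: np.ndarray, depth: int = 0, t_complex: float = 100.0,
                  min_side: int = 16, max_depth: int = 4):
    """Yield target candidate block images: a block is split again only while
    its texture is too complex, its side length is still at least the preset
    side length threshold, and the division count stays below the preset
    count threshold."""
    h, w = img.shape[:2]
    if (texture_complexity(img) > t_complex and min(h, w) >= min_side
            and depth < max_depth):
        quarters = [img[:h // 2, :w // 2], img[:h // 2, w // 2:],
                    img[h // 2:, :w // 2], img[h // 2:, w // 2:]]
        for sub in quarters:
            yield from divide_blocks(sub, depth + 1, t_complex, min_side, max_depth)
    else:
        yield img
```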
In one embodiment, the dividing module 502 is specifically configured to obtain texture complexity of a plurality of target candidate block images; merging adjacent target candidate block images with the texture complexity less than or equal to a second preset complexity threshold to obtain merged target candidate block images; the second preset complexity threshold is smaller than the first preset complexity threshold; and obtaining a plurality of first block images in the reference frame image according to the non-merged target candidate block image and the merged target candidate block image.
In an embodiment, the dividing module 502 is specifically configured to obtain a gray gradient value corresponding to the candidate block image according to a gray difference between each pixel in the candidate block image and its neighboring pixels; determine a standard deviation value corresponding to the candidate block image according to the distance between each pixel in the candidate block image and a preset position in the candidate block image; determine the entropy value corresponding to the candidate block image according to each gray level in the candidate block image; and determine the texture complexity of the candidate block image according to the weighted sum of the gray gradient value, the standard deviation value and the entropy value.
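A hedged sketch of one way to compute such a weighted-sum complexity measure is given below; the weights, the use of the block's gray-value standard deviation as a stand-in for the "preset position" term, and the 256-bin entropy are illustrative assumptions.

```python
import numpy as np

def texture_complexity(block: np.ndarray, w_grad: float = 0.4,
                       w_std: float = 0.3, w_ent: float = 0.3) -> float:
    g = block.astype(np.float64)
    # Gray gradient term: mean absolute difference to right and lower neighbors.
    grad = (np.abs(np.diff(g, axis=0)).mean()
            + np.abs(np.diff(g, axis=1)).mean()) / 2
    # Standard deviation term (assumed to be the spread of gray values).
    std = g.std()
    # Entropy term over the 256 gray levels.
    hist, _ = np.histogram(g, bins=256, range=(0, 256))
    p = hist / hist.sum()
    ent = -np.sum(p[p > 0] * np.log2(p[p > 0]))
    # Weighted sum of the gray gradient, standard deviation, and entropy terms.
    return w_grad * grad + w_std * std + w_ent * ent
```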
In one embodiment, the first determining module 504 is specifically configured to determine similarity between the first block image and each of the at least one second block image based on a first similarity comparison policy if the size of the first block image is greater than or equal to a predetermined area size; if the size of the first block image is smaller than the size of the preset area and the brightness difference value between the second block image and the first block image is smaller than the preset brightness difference threshold value, determining the similarity between the first block image and each second block image in at least one second block image based on a second similarity comparison strategy; if the size of the first block image is smaller than the size of the preset area and the brightness difference value between the second block image and the first block image is larger than or equal to the preset brightness difference threshold value, determining the similarity between the first block image and each second block image in at least one second block image based on a third similarity comparison strategy; the first similarity comparison strategy, the second similarity comparison strategy and the third similarity comparison strategy are different from each other.
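The selection among the three strategies can be summarized by the small dispatch sketch below; the preset area size and the preset brightness difference threshold are supplied by the caller, and the numbering of the strategies follows the order in which they are described above.

```python
def choose_strategy(block_size: int, preset_area: int,
                    brightness_diff: float, diff_threshold: float) -> int:
    """Pick the similarity comparison strategy per the conditions above."""
    if block_size >= preset_area:
        return 1  # first strategy: homography-based comparison
    if brightness_diff < diff_threshold:
        return 2  # second strategy: search-area block matching
    return 3      # third strategy: Hamming-distance comparison
```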
In an embodiment, the first determining module 504 is specifically configured to construct a homography matrix according to the texture feature information in the first block image and the same texture feature information in each of the second block images, adjust each of the second block images to match the first block image according to the homography matrix, and determine the similarity between the first block image and each of the second block images in the at least one second block image according to an error value between the adjusted second block image and the first block image.
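A hedged OpenCV sketch of such a homography-based comparison follows; ORB features, the brute-force Hamming matcher, RANSAC, and the mean absolute difference as the error value are stand-in assumptions, since the text does not fix the feature type or the error metric.

```python
import cv2
import numpy as np

def homography_error(first: np.ndarray, second: np.ndarray) -> float:
    """Warp the second block onto the first via a homography built from
    matched features, and return the mean absolute residual as the error."""
    orb = cv2.ORB_create()
    k1, d1 = orb.detectAndCompute(first, None)
    k2, d2 = orb.detectAndCompute(second, None)
    # Assumes both blocks yield descriptors and at least 4 good matches,
    # which cv2.findHomography requires.
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d2, d1)
    src = np.float32([k2[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k1[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC)
    warped = cv2.warpPerspective(second, H, (first.shape[1], first.shape[0]))
    return float(np.mean(np.abs(first.astype(np.float64) - warped)))
```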
In an embodiment, the first determining module 504 is specifically configured to obtain a search area that is larger than the size of the first block image and includes the first block image from the other to-be-aligned images, match the first block image with a second block image in the search area, and determine a similarity between the first block image and the second block image in the search area according to an error value between the second block image and the first block image in the search area.
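A minimal sketch of such a search-area match follows, using the sum of absolute differences (SAD) as the error value; SAD and the exhaustive scan are assumptions, as the text only speaks of an error value between blocks.

```python
import numpy as np

def best_match(first: np.ndarray, search_area: np.ndarray):
    """Slide the first block over the search area and return the position
    and error of the best (lowest-SAD) second block."""
    bh, bw = first.shape
    best_err, best_pos = float("inf"), (0, 0)
    for y in range(search_area.shape[0] - bh + 1):
        for x in range(search_area.shape[1] - bw + 1):
            cand = search_area[y:y + bh, x:x + bw]
            err = np.abs(cand.astype(np.int64) - first.astype(np.int64)).sum()
            if err < best_err:
                best_err, best_pos = err, (y, x)
    return best_pos, best_err
```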
In an embodiment, the first determining module 504 is specifically configured to obtain a hamming distance between the first block image and the second block image, and obtain a similarity between the first block image and each of the at least one second block image according to the hamming distance.
In an embodiment, the first determining module 504 is specifically configured to obtain the gray values of the center pixel point and other pixel points adjacent to the center pixel point of the first tile image, record, as the first numerical value, other pixel points whose gray values are greater than the gray value of the center pixel point, and record, as the second numerical value, other pixel points whose gray values are less than or equal to the gray value of the center pixel point; the first value and the second value are different; obtaining a first numerical value string corresponding to the first block image according to the numerical values corresponding to other pixel points; acquiring the gray values of a central pixel point of the second block image and other pixel points adjacent to the central pixel point, recording the other pixel points of which the gray values are greater than the gray value of the central pixel point as a first numerical value, and recording the other pixel points of which the gray values are less than or equal to the gray value of the central pixel point as a second numerical value; obtaining a second numerical value string corresponding to the second block image according to the numerical values corresponding to other pixel points; and carrying out exclusive OR operation on the first numerical string and the second numerical string to obtain the Hamming distance between the first block image and the second block image.
In an embodiment, the second determining module 506 is specifically configured to determine an initial movement vector of an object corresponding to the target second block image according to a distance and a direction between a position of the target second block image in the to-be-aligned image in the other frame and a position of the corresponding first block image with the largest similarity in the reference frame image; obtaining the confidence coefficient of the initial motion vector; if the confidence coefficient is smaller than a preset confidence coefficient threshold value, adjusting the first initial motion vector of the object corresponding to the target second block image according to the initial motion vector of the object corresponding to the associated target second block image and the confidence coefficient thereof to obtain the motion vector of the target second block image; and the associated target second block image is adjacent to the target second block image, and the confidence coefficient is greater than or equal to the preset confidence coefficient threshold.
In an embodiment, the second determining module 506 is specifically configured to obtain a preset normalization parameter corresponding to a similarity comparison policy between the target second block image and the first block image; and obtaining the confidence of the target second block image according to the ratio of the similarity of the target second block image and the corresponding first block image with the maximum similarity to a preset normalization parameter.
In an embodiment, the obtaining module 500 is specifically configured to obtain a definition of each to-be-aligned image in at least two to-be-aligned images; and taking the image to be aligned with the frame with the maximum definition as a reference frame image.
The modules in the above image alignment apparatus can be implemented wholly or partially by software, hardware, or a combination thereof. Each module can be embedded in hardware form in, or be independent of, a processor in the computer device, or can be stored in software form in a memory of the computer device, so that the processor can call and execute the operations corresponding to each module.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 6. The computer device comprises a processor, a memory, a communication interface, a display screen and an input device which are connected through a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless communication can be realized through WIFI, a mobile cellular network, NFC (near field communication) or other technologies. The computer program is executed by a processor to implement an image alignment method. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that the architecture shown in fig. 6 is merely a block diagram of part of the structures associated with the disclosed aspects and does not limit the computer devices to which the disclosed aspects apply; a particular computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory in which a computer program is stored and a processor which, when executing the computer program, implements the image alignment method described above.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which when executed by a processor implements the image alignment method described above.
In an embodiment, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the image alignment method described above.
It should be noted that, the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data for analysis, stored data, presented data, etc.) referred to in the present application are information and data authorized by the user or sufficiently authorized by each party.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments may be implemented by instructing relevant hardware through a computer program, and the computer program may be stored in a non-volatile computer-readable storage medium; when executed, it may include the processes of the embodiments of the methods described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, Resistive Random Access Memory (ReRAM), Magnetoresistive Random Access Memory (MRAM), Ferroelectric Random Access Memory (FRAM), Phase Change Memory (PCM), graphene memory, and the like. Volatile memory can include Random Access Memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases referred to in the various embodiments provided herein may include at least one of relational and non-relational databases; non-relational databases may include, but are not limited to, blockchain-based distributed databases and the like. The processors referred to in the embodiments provided herein may be general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic devices, data processing logic devices based on quantum computing, etc., without limitation.
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction in a combination of these technical features, it should be considered to be within the scope of this specification.
The above examples express only several embodiments of the present application, and their descriptions are specific and detailed, but they should not therefore be construed as limiting the scope of the patent application. It should be noted that, for a person of ordinary skill in the art, several variations and improvements can be made without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the appended claims.

Claims (14)

1. An image alignment method, comprising:
acquiring at least two frames of images to be aligned, and determining a frame of reference frame image from the at least two frames of images to be aligned;
dividing a plurality of first block images in the reference frame image according to the texture feature information of the reference frame image; the texture complexity of the first block images is within a preset range;
determining at least one second block image from other frames of images to be aligned according to each first block image, obtaining the similarity between the first block image and each second block image in the at least one second block image, and determining a target second block image with the maximum similarity; the second block image corresponds to the first block image in size;
determining a movement vector of the target second block image according to the position of the target second block image in the other frame to-be-aligned image and the position of the first block image in the reference frame image;
and aligning the target second block image and the image of the corresponding first block image based on the movement vector.
2. The method according to claim 1, wherein said dividing a plurality of first block pictures in the reference frame picture according to the texture feature information of the reference frame picture comprises:
taking the reference frame image as an image to be divided, and dividing the image to be divided into a plurality of candidate block images with the same size;
aiming at each candidate block image, acquiring the texture complexity and the side length of the candidate block image;
if the texture complexity is greater than a first preset complexity threshold, the side length is greater than or equal to a preset side length threshold and the dividing times are less than a preset time threshold, taking the candidate block image as a new image to be divided, and returning to the step of dividing the image to be divided into a plurality of candidate block images with the same size; otherwise, determining the area corresponding to the candidate block image as a target candidate block image in the reference frame image;
and determining a plurality of first block images in the reference frame image according to the determined target candidate block images.
3. The method of claim 2, wherein said determining a plurality of first block images in the reference frame image according to the determined target candidate block images comprises:
acquiring the texture complexity of the target candidate block images;
merging adjacent target candidate block images with the texture complexity less than or equal to a second preset complexity threshold to obtain merged target candidate block images; the second preset complexity threshold is smaller than the first preset complexity threshold;
and obtaining a plurality of first block images in the reference frame image according to the target candidate block images which are not combined and the combined target candidate block images.
4. The method of claim 2, wherein said obtaining the texture complexity of the candidate block image comprises:
obtaining a gray gradient value corresponding to the candidate block image according to the gray difference value of each pixel and the adjacent pixel in the candidate block image;
determining a standard deviation value corresponding to the candidate block image according to the distance between each pixel in the candidate block image and a preset position in the candidate block image;
determining the entropy value corresponding to the candidate block image according to each gray level in the candidate block image;
and determining the texture complexity of the candidate block image according to the weighted sum of the gray gradient value, the standard difference value and the entropy value.
5. The method of claim 1, wherein the obtaining the similarity between the first block image and each of the at least one second block image comprises:
if the size of the first block image is larger than or equal to the size of a preset area, determining the similarity between the first block image and each second block image in the at least one second block image based on a first similarity comparison strategy;
if the size of the first block image is smaller than a preset area size and the brightness difference value between the second block image and the first block image is smaller than a preset brightness difference threshold value, determining the similarity between the first block image and each second block image in the at least one second block image based on a second similarity comparison strategy;
if the size of the first block image is smaller than a preset area size and the brightness difference value between the second block image and the first block image is larger than or equal to a preset brightness difference threshold value, determining the similarity between the first block image and each second block image in the at least one second block image based on a third similarity comparison strategy;
wherein the first similarity comparison strategy, the second similarity comparison strategy and the third similarity comparison strategy are different from each other.
6. The method according to claim 5, wherein the determining the similarity between the first block image and each of the at least one second block image based on the first similarity comparison policy comprises:
constructing a homography matrix according to the texture feature information in the first block image and the same texture feature information in each second block image, adjusting each second block image to be matched with the first block image according to the homography matrix, and determining the similarity between the first block image and each second block image in the at least one second block image according to the error value of the adjusted second block image and the first block image; or,
the determining the similarity between the first block image and each of the at least one second block image based on a second similarity comparison policy includes:
acquiring a search area which is larger than the size of the first block image and contains the first block image from the images to be aligned of other frames, matching the first block image with a second block image in the search area, and determining the similarity between the first block image and the second block image in the search area according to the error value between the second block image in the search area and the first block image; or,
the determining the similarity of the first block image and each of the at least one second block image based on a third similarity comparison policy comprises:
and acquiring the Hamming distance between the first block image and the second block image, and obtaining the similarity between the first block image and each second block image in the at least one second block image according to the Hamming distance.
7. The method of claim 6, wherein obtaining the Hamming distance between the first block image and the second block image comprises:
acquiring gray values of a central pixel point of the first block image and other pixel points adjacent to the central pixel point, recording the other pixel points of which the gray values are greater than the gray value of the central pixel point as first numerical values, and recording the other pixel points of which the gray values are less than or equal to the gray value of the central pixel point as second numerical values; the first value and the second value are different;
obtaining a first numerical value string corresponding to the first block image according to the numerical values corresponding to the other pixel points;
acquiring the gray values of a central pixel point of the second block image and other pixel points adjacent to the central pixel point, recording the other pixel points of which the gray values are greater than the gray value of the central pixel point as first numerical values, and recording the other pixel points of which the gray values are less than or equal to the gray value of the central pixel point as second numerical values;
obtaining a second numerical value string corresponding to the second block image according to the numerical values corresponding to the other pixel points;
and carrying out exclusive OR operation on the first numerical string and the second numerical string to obtain the Hamming distance between the first block image and the second block image.
8. The method according to claim 5, wherein said determining the motion vector of the target second block image according to the position of the target second block image in the other frame image to be aligned and the position of the first block image in the reference frame image comprises:
determining an initial movement vector of an object corresponding to the target second block image according to the distance and the direction of the position of the target second block image in the other frame to-be-aligned image and the position of the corresponding first block image with the maximum similarity in the reference frame image;
obtaining the confidence of the initial motion vector;
if the confidence degree is smaller than a preset confidence degree threshold value, adjusting the first initial motion vector of the object corresponding to the target second block image according to the initial motion vector of the object corresponding to the associated target second block image and the confidence degree of the initial motion vector, and obtaining the motion vector of the target second block image; wherein the associated target second block image is adjacent to the target second block image and the confidence is greater than or equal to a preset confidence threshold.
9. The method of claim 8, wherein obtaining the confidence level of the initial motion vector comprises:
acquiring a preset normalization parameter corresponding to a similarity comparison strategy of the target second block image and the first block image;
and obtaining the confidence of the target second block image according to the ratio of the similarity of the target second block image and the corresponding first block image with the maximum similarity to the preset normalization parameter.
10. The method according to any one of claims 1 to 9, wherein said determining a frame of reference frame image from the at least two frames of images to be aligned comprises:
acquiring the definition of each frame of image to be aligned in the at least two frames of images to be aligned;
and taking the image to be aligned with the frame with the maximum definition as a reference frame image.
11. An image alignment apparatus, characterized in that the apparatus comprises:
the device comprises an acquisition module, a comparison module and a display module, wherein the acquisition module is used for acquiring at least two frames of images to be aligned and determining a frame of reference frame image from the at least two frames of images to be aligned;
the dividing module is used for dividing a plurality of first block images in the reference frame image according to the texture feature information of the reference frame image; the texture complexity of the first block images is within a preset range;
the first determining module is used for determining at least one second block image from other frames of images to be aligned according to each first block image, acquiring the similarity between the first block image and each second block image in the at least one second block image, and determining a target second block image with the maximum similarity; the second block image corresponds to the first block image in size;
a second determining module, configured to determine a motion vector of the target second block image according to a position of the target second block image in the image to be aligned in the other frame and a position of the first block image in the reference frame image;
and the alignment module is used for aligning the target second block image and the image of the corresponding first block image based on the movement vector.
12. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor realizes the steps of the method of any one of claims 1 to 10 when executing the computer program.
13. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 10.
14. A computer program product comprising a computer program, characterized in that the computer program realizes the steps of the method of any one of claims 1 to 10 when executed by a processor.
CN202211021665.1A 2022-08-24 2022-08-24 Image alignment method and device, computer equipment and storage medium Pending CN115272428A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211021665.1A CN115272428A (en) 2022-08-24 2022-08-24 Image alignment method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211021665.1A CN115272428A (en) 2022-08-24 2022-08-24 Image alignment method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115272428A true CN115272428A (en) 2022-11-01

Family

ID=83752433

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211021665.1A Pending CN115272428A (en) 2022-08-24 2022-08-24 Image alignment method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115272428A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116740182A (en) * 2023-08-11 2023-09-12 摩尔线程智能科技(北京)有限责任公司 Ghost area determining method and device, storage medium and electronic equipment
CN116740182B (en) * 2023-08-11 2023-11-21 摩尔线程智能科技(北京)有限责任公司 Ghost area determining method and device, storage medium and electronic equipment

Similar Documents

Publication Publication Date Title
CN109034078B (en) Training method of age identification model, age identification method and related equipment
US9665789B2 (en) Device and method for analyzing the correlation between an image and another image or between an image and a video
US8718324B2 (en) Method, apparatus and computer program product for providing object tracking using template switching and feature adaptation
EP2839389B1 (en) Image retargeting quality assessment
KR100660725B1 (en) Portable terminal having apparatus for tracking human face
US20150110386A1 (en) Tree-based Linear Regression for Denoising
Pham et al. Efficient image splicing detection algorithm based on markov features
WO2014070489A1 (en) Recursive conditional means image denoising
WO2023098045A1 (en) Image alignment method and apparatus, and computer device and storage medium
CN106407978B (en) Method for detecting salient object in unconstrained video by combining similarity degree
Lecca et al. Comprehensive evaluation of image enhancement for unsupervised image description and matching
Zhang et al. Fine-grained image quality assessment: A revisit and further thinking
Bellavia et al. Experiencing with electronic image stabilization and PRNU through scene content image registration
CN114332183A (en) Image registration method and device, computer equipment and storage medium
CN115272428A (en) Image alignment method and device, computer equipment and storage medium
Chen et al. Face super resolution based on parent patch prior for VLQ scenarios
CN113963072A (en) Binocular camera calibration method and device, computer equipment and storage medium
US20070280555A1 (en) Image registration based on concentric image partitions
CN109063537B (en) Hyperspectral image preprocessing method for unmixing of abnormal small target
WO2022206679A1 (en) Image processing method and apparatus, computer device and storage medium
CN115147296A (en) Hyperspectral image correction method, device, computer equipment and storage medium
CN115550558A (en) Automatic exposure method and device for shooting equipment, electronic equipment and storage medium
Xia et al. A coarse-to-fine ghost removal scheme for HDR imaging
CN115063473A (en) Object height detection method and device, computer equipment and storage medium
Van Vo et al. High dynamic range video synthesis using superpixel-based illuminance-invariant motion estimation

Legal Events

Code  Title
PB01  Publication
SE01  Entry into force of request for substantive examination
TA01  Transfer of patent application right
      Effective date of registration: 20230815
      Address after: Changan town in Guangdong province Dongguan 523860 usha Beach Road No. 18
      Applicant after: GUANGDONG OPPO MOBILE TELECOMMUNICATIONS Corp.,Ltd.
      Address before: Room F, 11/F, Beihai Center, 338 Hennessy Road, Wan Chai District, 810100 Hong Kong, China
      Applicant before: Sonar sky Information Consulting Co.,Ltd.