WO2023245383A1

WO2023245383A1 - Method for aligning multiple image frames, apparatus for aligning multiple image frames, and storage medium

Info

Publication number: WO2023245383A1
Application number: PCT/CN2022/099957
Authority: WO
Inventors: 万韶华
Original assignee: 北京小米移动软件有限公司; 北京小米松果电子有限公司
Priority date: 2022-06-20
Filing date: 2022-06-20
Publication date: 2023-12-28
Also published as: CN117616455A

Abstract

The present disclosure relates to a method and apparatus for aligning multiple image frames, and a storage medium. The method for aligning multiple image frames comprises: acquiring reference frames and support frames of multiple image frames; performing fast Fourier transform on the reference frames, so as to obtain frequency-domain signals of the reference frames, and performing fast Fourier transform on the support frames, so as to obtain frequency-domain signals of the support frames; and aligning the support frames and the reference frames on the basis of the frequency-domain signals of the reference frames and the frequency-domain signals of the support frames, so as to obtain multiple aligned image frames. By means of the embodiments of the present disclosure, fast Fourier transform is performed on reference frames and support frames, so as to convert same from a space domain to a frequency domain and align same in the frequency domain, such that pixel points of images can be prevented from being traversed multiple times, thereby reducing the calculation complexity, and accelerating the alignment.

Description

Multi-frame image alignment method, multi-frame image alignment device and storage medium

Technical field

The present disclosure relates to the field of image processing technology, and in particular to a multi-frame image alignment method, a multi-frame image alignment device and a storage medium.

Background

As smartphones continue to improve, photography has become one of their most important functions. Mobile phone photography is simple to operate, produces images that can be viewed immediately, and is easy to process, so users pay particular attention to camera performance. When the hand shakes or the scene is moving during capture, the photo contains noise and appears unclear. To restore the original high-resolution image from low-quality images, smartphones typically capture multiple frames in succession and align them.

In related technologies, feature-point-based alignment performs feature detection, feature matching, and image transformation on the multi-frame images. However, its computational complexity is high, and the processed image is strongly affected by the noise and brightness of the original images, resulting in a poor user experience.

Summary of the invention

In order to overcome the problems existing in related technologies, the present disclosure provides a multi-frame image alignment method, a multi-frame image alignment device and a storage medium for quickly aligning multi-frame images.

According to a first aspect of the embodiments of the present disclosure, a multi-frame image alignment method is provided, applied to a terminal, including: obtaining a reference frame and a support frame of a multi-frame image; performing a fast Fourier transform on the reference frame to obtain the frequency domain signal of the reference frame, and performing a fast Fourier transform on the support frame to obtain the frequency domain signal of the support frame; and aligning the support frame and the reference frame based on the frequency domain signal of the reference frame and the frequency domain signal of the support frame, to obtain a multi-frame aligned image.

In one embodiment, aligning the support frame and the reference frame based on the frequency domain signal of the reference frame and the frequency domain signal of the support frame includes: determining a first amplitude and a second amplitude based on the two frequency domain signals, where the first amplitude is the amplitude corresponding to the frequency domain signal of the reference frame and the second amplitude is the amplitude corresponding to the frequency domain signal of the support frame; determining, based on the first amplitude and the second amplitude, an alignment parameter between the reference frame and the support frame, the alignment parameter including at least one of the following: an offset, a rotation amount, and a scaling amount; and aligning the support frame and the reference frame based on the alignment parameter.

In one implementation, the alignment parameter includes an offset, and determining the alignment parameter between the reference frame and the support frame based on the first amplitude and the second amplitude includes: determining, based on the first amplitude and the second amplitude, the correlation between the frequency domain signal of the reference frame and the frequency domain signal of the support frame; and determining the offset between the reference frame and the support frame based on the maximum value of the correlation.

In one embodiment, the alignment parameter includes a rotation amount and/or a scaling amount, and determining the alignment parameter between the reference frame and the support frame based on the first amplitude and the second amplitude includes: converting the first amplitude and the second amplitude into polar coordinates respectively to obtain a first polar coordinate amplitude and a second polar coordinate amplitude; and determining the rotation amount and/or the scaling amount between the reference frame and the support frame based on the first polar coordinate amplitude and the second polar coordinate amplitude.

In one implementation, determining an alignment parameter between the reference frame and the support frame based on the first amplitude and the second amplitude includes: determining the cross power spectrum of the first amplitude and the second amplitude; performing an inverse fast Fourier transform on the cross power spectrum to obtain a peak value; and determining the alignment parameter between the reference frame and the support frame based on the peak value.

In one implementation, performing a fast Fourier transform on the reference frame to obtain the frequency domain signal of the reference frame, and performing a fast Fourier transform on the support frame to obtain the frequency domain signal of the support frame, includes: performing Y-channel downsampling on the reference frame and then performing a fast Fourier transform to obtain the frequency domain signal of the reference frame, and performing Y-channel downsampling on the support frame and then performing a fast Fourier transform to obtain the frequency domain signal of the support frame.
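As an illustration only, Y-channel downsampling followed by a fast Fourier transform might look like the sketch below. The BT.601 luma weights, the block-averaging downsampler, the factor of 2, and all function names are assumptions made for this example; the disclosure does not specify them.

```python
import numpy as np

# BT.601 luma weights: an assumption, the source does not say how Y is derived.
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114])

def y_channel(rgb):
    """Luma (Y) plane of an H x W x 3 RGB image with float values."""
    return rgb @ LUMA_WEIGHTS

def downsample(y, factor=2):
    """Downsample by averaging factor x factor blocks (factor is hypothetical)."""
    h, w = y.shape
    h2, w2 = h - h % factor, w - w % factor   # crop so the blocks tile exactly
    blocks = y[:h2, :w2].reshape(h2 // factor, factor, w2 // factor, factor)
    return blocks.mean(axis=(1, 3))

def frequency_signal(rgb, factor=2):
    """Y-channel downsampling followed by a 2-D FFT."""
    return np.fft.fft2(downsample(y_channel(rgb), factor))

rgb = np.ones((64, 96, 3))            # toy all-white frame
spectrum = frequency_signal(rgb)      # spectrum of the 32 x 48 downsampled Y plane
```

Working on a single downsampled channel shrinks the array that the transform has to process, which is one plausible reason for this step.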

According to a second aspect of the embodiments of the present disclosure, a multi-frame image alignment device is provided, applied to a terminal, including: an acquisition unit for acquiring a reference frame and a support frame of a multi-frame image; and a processing unit for performing a fast Fourier transform on the reference frame to obtain the frequency domain signal of the reference frame, performing a fast Fourier transform on the support frame to obtain the frequency domain signal of the support frame, and aligning the support frame and the reference frame based on the frequency domain signal of the reference frame and the frequency domain signal of the support frame, to obtain a multi-frame aligned image.

In one implementation, the processing unit aligns the support frame and the reference frame based on the frequency domain signal of the reference frame and the frequency domain signal of the support frame in the following manner: determining a first amplitude and a second amplitude based on the two frequency domain signals, where the first amplitude is the amplitude corresponding to the frequency domain signal of the reference frame and the second amplitude is the amplitude corresponding to the frequency domain signal of the support frame; determining, based on the first amplitude and the second amplitude, an alignment parameter between the reference frame and the support frame, the alignment parameter including at least one of the following: an offset, a rotation amount, and a scaling amount; and aligning the support frame and the reference frame based on the alignment parameter.

In one implementation, the alignment parameter includes an offset, and the processing unit determines the alignment parameter between the reference frame and the support frame based on the first amplitude and the second amplitude in the following manner: determining, based on the first amplitude and the second amplitude, the correlation between the frequency domain signal of the reference frame and the frequency domain signal of the support frame; and determining the offset between the reference frame and the support frame based on the maximum value of the correlation.

In one implementation, the alignment parameter includes a rotation amount and/or a scaling amount, and the processing unit determines the alignment parameter between the reference frame and the support frame based on the first amplitude and the second amplitude in the following manner: converting the first amplitude and the second amplitude into polar coordinates respectively to obtain a first polar coordinate amplitude and a second polar coordinate amplitude; and determining the rotation amount and/or the scaling amount between the reference frame and the support frame based on the first polar coordinate amplitude and the second polar coordinate amplitude.

In one implementation, the processing unit determines the alignment parameter between the reference frame and the support frame based on the first amplitude and the second amplitude in the following manner: determining the cross power spectrum of the first amplitude and the second amplitude; performing an inverse fast Fourier transform on the cross power spectrum to obtain a peak value; and determining the alignment parameter between the reference frame and the support frame based on the peak value.

In one implementation, the processing unit performs a fast Fourier transform on the reference frame to obtain the frequency domain signal of the reference frame, and performs a fast Fourier transform on the support frame to obtain the frequency domain signal of the support frame, in the following manner: performing Y-channel downsampling on the reference frame and then performing a fast Fourier transform to obtain the frequency domain signal of the reference frame, and performing Y-channel downsampling on the support frame and then performing a fast Fourier transform to obtain the frequency domain signal of the support frame.

According to a third aspect of the embodiments of the present disclosure, a multi-frame image alignment device is provided, including: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to execute the multi-frame image alignment method described in any embodiment of the first aspect.

According to a fourth aspect of the embodiments of the present disclosure, a non-transitory computer-readable storage medium is provided. The storage medium stores instructions, and when the instructions in the storage medium are executed by a processor of a terminal, the terminal is enabled to execute the multi-frame image alignment method described in any embodiment of the first aspect.

The technical solutions provided by the embodiments of the present disclosure may include the following beneficial effects: a fast Fourier transform is performed on the reference frame and the support frame to obtain their frequency domain signals, and the reference frame and the support frame are aligned based on those frequency domain signals. Converting the reference frame and the support frame from the spatial domain to the frequency domain reduces the computational complexity and accelerates alignment.

It should be understood that the foregoing general description and the following detailed description are exemplary and explanatory only, and do not limit the present disclosure.

Description of the drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure.

Figure 1 is a flow chart of a multi-frame image alignment method according to an exemplary embodiment.

Figure 2 is a flowchart of a multi-frame image alignment method according to an exemplary embodiment.

Figure 3 is a flowchart of a multi-frame image alignment method according to an exemplary embodiment.

Figure 4 is a flow chart of a multi-frame image alignment method according to an exemplary embodiment.

Figure 5 is a block diagram of a multi-frame image alignment device according to an exemplary embodiment.

Figure 6 is a block diagram of a multi-frame image alignment device according to an exemplary embodiment.

Detailed description

Exemplary embodiments will be described in detail herein, examples of which are illustrated in the accompanying drawings. When the following description refers to the drawings, the same numbers in different drawings refer to the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure.

As terminals such as smartphones continue to be upgraded, the demands on their application functions keep rising, and the camera function of the terminal is constantly being optimized. Mobile phone photography is simple to operate, produces images that can be viewed immediately, and is easy to process, so users pay particular attention to camera performance. When the hand shakes or the scene is moving during capture, the photo contains noise and appears unclear. To restore the original high-resolution image from low-quality images without increasing cost, smartphones often use a super-resolution algorithm as the image restoration algorithm, and multi-frame super-resolution has an obvious advantage in restoration quality over single-frame super-resolution. In a multi-frame super-resolution algorithm, multiple low-quality frames are captured quickly and continuously when a picture is taken and are fused by the super-resolution algorithm into one high-quality image. Because the handheld terminal shakes, the continuously captured frames are not aligned.

In related technologies, alignment algorithms fall into two major categories. The first is feature-point-based alignment, which performs feature detection, feature matching, and image transformation on the multi-frame images. The second is block-matching-based alignment, a typical example being HDR+ proposed by Google. However, feature-point-based alignment has high computational complexity, and the processed image is strongly affected by the noise and brightness of the original images; block-matching-based alignment produces uneven noise in photos taken at high sensitivity and has some tone mapping flaws, which results in a poor user experience.

Therefore, the present disclosure provides a multi-frame image alignment method applied to a terminal. The method includes: obtaining a reference frame and a support frame from the multi-frame images collected by the terminal; performing a fast Fourier transform on the reference frame to transform it from the spatial domain to the frequency domain and obtain the frequency domain signal of the reference frame, and performing a fast Fourier transform on the support frame to transform it from the spatial domain to the frequency domain and obtain the frequency domain signal of the support frame; and calculating the correlation between the frequency domain signal of the reference frame and the frequency domain signal of the support frame to determine the alignment parameters of the support frame and the reference frame, thereby aligning the multi-frame images. Performing the fast Fourier transform on the reference frame and the support frame and aligning them in the frequency domain reduces the computational complexity, accelerates alignment, and improves the alignment result.

Figure 1 is a flow chart of a multi-frame image alignment method according to an exemplary embodiment. As shown in Figure 1, the multi-frame image alignment method can be used in a terminal. The embodiments of the present disclosure do not limit the type of terminal to which the method is applied; examples of terminals include mobile phones, tablets, laptops, and wearable devices. The multi-frame image alignment method includes the following steps.

In step S11, the reference frames and supporting frames of the multi-frame image are obtained.

In the embodiment of the present disclosure, when the user presses the camera button, the terminal continuously collects multiple frames of images. Because the handheld mobile phone shakes, the continuously collected frames are not aligned. The clearest frame among the collected frames is selected as the reference frame, and the remaining frames are support frames. The reference frame can be selected by methods in related technologies. In one implementation, for example, the average gradient method can be used to calculate the average gradient of each frame; the larger the average gradient value, the clearer the image.
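The reference-frame selection described above can be sketched as follows. This is a minimal sketch assuming 2-D grayscale NumPy arrays; the function names and the particular average-gradient formula (mean of sqrt((gx² + gy²)/2), one common definition) are assumptions, since the disclosure only names the method.

```python
import numpy as np

def average_gradient(img):
    """Average gradient of a 2-D grayscale image; larger values suggest a sharper image."""
    gy, gx = np.gradient(img.astype(np.float64))
    return float(np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2.0)))

def split_reference_and_support(frames):
    """Pick the frame with the largest average gradient as reference; the rest are support frames."""
    scores = [average_gradient(f) for f in frames]
    ref_idx = int(np.argmax(scores))
    support = [f for i, f in enumerate(frames) if i != ref_idx]
    return ref_idx, frames[ref_idx], support

# Toy burst: one sharp frame and two softened versions of it.
rng = np.random.default_rng(0)
sharp = rng.random((64, 64))
blurred = (sharp
           + np.roll(sharp, 1, axis=0)
           + np.roll(sharp, 1, axis=1)
           + np.roll(sharp, (1, 1), axis=(0, 1))) / 4.0   # 2 x 2 box blur
frames = [blurred, sharp, 0.5 * (sharp + blurred)]

ref_idx, reference, support = split_reference_and_support(frames)
```

In this toy burst the unblurred frame has the largest average gradient, so it is chosen as the reference frame.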

In one example, each frame of the multi-frame image is divided into several image blocks of the same size with partially overlapping edges. For each group of image blocks at the same position, the clearest block is selected as the reference frame, and the remaining blocks are support frames.
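A block-wise variant of the selection could be sketched like this. The tile size of 32, the overlap of 8, the sharpness measure, and the function names are hypothetical values and helpers chosen for the example; the disclosure does not fix them.

```python
import numpy as np

def sharpness(block):
    """Average gradient of a block (one common sharpness measure, assumed here)."""
    gy, gx = np.gradient(block.astype(np.float64))
    return float(np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2.0)))

def tile_origins(length, size, overlap):
    """Start positions of equally sized tiles whose edges overlap by `overlap` pixels."""
    stride = size - overlap
    return list(range(0, length - size + 1, stride))

def reference_index_per_tile(frames, size=32, overlap=8):
    """For every tile position, return the index of the frame whose block is sharpest."""
    h, w = frames[0].shape
    choice = {}
    for y in tile_origins(h, size, overlap):
        for x in tile_origins(w, size, overlap):
            scores = [sharpness(f[y:y + size, x:x + size]) for f in frames]
            choice[(y, x)] = int(np.argmax(scores))
    return choice

rng = np.random.default_rng(0)
sharp = rng.random((80, 80))
blurred = (sharp + np.roll(sharp, 1, axis=0) + np.roll(sharp, 1, axis=1)
           + np.roll(sharp, (1, 1), axis=(0, 1))) / 4.0   # 2 x 2 box blur
choice = reference_index_per_tile([blurred, sharp])
```

With an 80 x 80 frame, tiles of 32 pixels and a stride of 24 cover the frame exactly with a 3 x 3 grid of overlapping blocks.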

In step S12, fast Fourier transform is performed on the reference frame to obtain the frequency domain signal of the reference frame, and fast Fourier transform is performed on the support frame to obtain the frequency domain signal of the support frame.

In the embodiment of the present disclosure, it is assumed that the reference frame is f(x,y) and the support frame is g(x,y). Perform fast Fourier transform on the reference frame to transform the reference frame from the spatial domain to the frequency domain to obtain the frequency domain signal F(u,v) of the reference frame. Perform fast Fourier transform on the support frame to transform the support frame from the spatial domain to the frequency domain to obtain the frequency domain signal G(u,v) of the support frame.

In step S13, based on the frequency domain signal of the reference frame and the frequency domain signal of the supporting frame, the supporting frame and the reference frame are aligned to obtain aligned multi-frame images.

Fast Fourier transform can convert reference frames and support frames from the spatial domain to the frequency domain. Aligning frequency domain images can speed up the calculation of alignment, avoid multiple traversals of pixels in the image, and reduce computational complexity.

According to the embodiments of the present disclosure, the reference frame and the support frames are obtained from the multi-frame images collected by the terminal, and the fast Fourier transform converts them from the spatial domain to the frequency domain, so that the multi-frame images can be aligned in the frequency domain. This reduces the computational complexity and speeds up alignment.

Figure 2 is a flow chart of a multi-frame image alignment method according to an exemplary embodiment. As shown in Figure 2, the multi-frame image alignment method is used in a terminal. The embodiments of the present disclosure do not limit the type of terminal to which the method is applied. Aligning the support frame and the reference frame based on the frequency domain signal of the reference frame and the frequency domain signal of the support frame includes the following steps.

In step S21, the first amplitude and the second amplitude are determined based on the frequency domain signal of the reference frame and the frequency domain signal of the support frame; the first amplitude is the amplitude corresponding to the frequency domain signal of the reference frame, and the second amplitude is the amplitude corresponding to the frequency domain signal of the support frame.

In the embodiment of the present disclosure, the fast Fourier transform converts the reference frame and the support frame from the spatial domain to the frequency domain to obtain the spectrograms of the reference frame and the support frame. A spectrogram describes the relationship between the frequency and the amplitude of a signal and can also be understood as a frequency distribution curve, whose independent variable is frequency: the horizontal axis is frequency and the vertical axis is the amplitude of the signal at that frequency. The amplitude corresponding to the frequency domain signal of the reference frame is subsequently called the first amplitude, and the amplitude corresponding to the frequency domain signal of the support frame is subsequently called the second amplitude.
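Step S21 can be illustrated with a short sketch that takes the amplitude of each frequency domain signal as the magnitude of the complex spectrum. The toy frames, the circular shift, and the variable names are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.random((32, 32))                       # toy reference frame f(x, y)
g = np.roll(f, shift=(4, -3), axis=(0, 1))     # toy support frame: circularly shifted copy

F = np.fft.fft2(f)                             # frequency domain signal of the reference frame
G = np.fft.fft2(g)                             # frequency domain signal of the support frame

first_amplitude = np.abs(F)                    # amplitude of the reference spectrum
second_amplitude = np.abs(G)                   # amplitude of the support spectrum

# A pure circular shift changes only the phase of the spectrum, not its magnitude.
amplitudes_match = np.allclose(first_amplitude, second_amplitude)
```

In this toy case the two magnitudes coincide even though the complex spectra differ, which reflects the Fourier shift theorem.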

In step S22, based on the first amplitude value and the second amplitude value, an alignment parameter between the reference frame and the support frame is determined, where the alignment parameter includes at least one of an offset amount, a rotation amount, and a scaling amount.

In the embodiment of the present disclosure, because the handheld terminal shakes, the multiple frames continuously collected by the terminal are not aligned. The reference frame and the support frame differ by at least one alignment parameter among a positional offset, an angular rotation, and a scaling of the image size. The alignment parameters of the reference frame and the support frame are calculated based on the first amplitude and the second amplitude in order to align the multi-frame images.

In step S23, the support frame and the reference frame are aligned based on the alignment parameters.

In the embodiment of the present disclosure, after determining at least one of offset, rotation, and scaling that the support frame needs to perform relative to the reference frame according to the alignment parameters, the support frame and the reference frame are aligned.
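For the offset-only case, step S23 might be sketched as below. The helper name is hypothetical, and treating the shift as circular is a simplifying assumption; a real pipeline would more likely crop or warp the frame.

```python
import numpy as np

def apply_offset(support, dx, dy):
    """Undo an offset g(x, y) = f(x - dx, y - dy) by shifting the support frame back (circularly)."""
    return np.roll(support, shift=(-dy, -dx), axis=(0, 1))

ref = np.arange(30.0).reshape(5, 6)                 # toy reference frame
sup = np.roll(ref, shift=(2, 1), axis=(0, 1))       # support frame with dy = 2, dx = 1
aligned = apply_offset(sup, dx=1, dy=2)             # shifted back onto the reference
```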

In the embodiment of the present disclosure, the reference frame and the support frame are aligned according to the correlation theorem. The correlation reaches its maximum when the reference frame and the support frame match best. Multiplying the Fourier transform of the support frame by the complex conjugate of the Fourier transform of the reference frame gives the Fourier transform of the cross-correlation of the two images, that is, the correlation between the reference frame and the support frame is calculated. The offset is determined based on the maximum value of the correlation.

In an example, when the alignment parameter between the reference frame and the support frame is an offset, let the reference frame be f(x,y) and the support frame be g(x,y). f(x,y) and g(x,y) are not aligned: there are displacements Δx and Δy in the x and y directions, subsequently called the offset, that is, g(x,y)=f(x-Δx, y-Δy). A fast Fourier transform is performed on the reference frame to obtain its frequency domain signal and the first amplitude F(u,v), and on the support frame to obtain its frequency domain signal and the second amplitude G(u,v). According to the correlation theorem, F(CC)=F^*(u,v)G(u,v), the correlation between the reference frame and the support frame can be calculated, where F^*(u,v) denotes the complex conjugate of F(u,v) and F(CC) denotes the Fourier transform of the cross-correlation between the reference frame and the support frame. Multi-frame image alignment can thus be expressed as calculating the correlation between two images: the correlation reaches its maximum when the two images match best. Therefore, the displacements Δx and Δy of the support frame relative to the reference frame in the x and y directions are determined from the position of the maximum of the correlation obtained from F(CC), which gives the offset.
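The offset estimation in this example can be sketched as follows, assuming the shift is circular and by whole pixels. The function name and the toy frames are illustrative assumptions, not part of the disclosure.

```python
import numpy as np

def estimate_offset(ref, sup):
    """Estimate (dx, dy) with sup(x, y) = ref(x - dx, y - dy) from F(CC) = conj(F) * G."""
    F = np.fft.fft2(ref)
    G = np.fft.fft2(sup)
    cc = np.fft.ifft2(np.conj(F) * G).real          # cross-correlation in the spatial domain
    dy, dx = np.unravel_index(np.argmax(cc), cc.shape)
    h, w = ref.shape
    if dy > h // 2:                                 # map wrapped indices to negative shifts
        dy -= h
    if dx > w // 2:
        dx -= w
    return int(dx), int(dy)

rng = np.random.default_rng(1)
f = rng.random((48, 64))                            # toy reference frame
g = np.roll(f, shift=(3, -5), axis=(0, 1))          # toy support frame: dy = 3, dx = -5
offset = estimate_offset(f, g)
```

The inverse transform is what turns F(CC) back into a correlation surface whose maximum can be located.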

According to an embodiment of the present disclosure, the cross-correlation between f(x,y) and g(x,y) is defined as follows:

$$CC(u,v) = \sum_{x}\sum_{y} f(x,y)\, g(x+u,\, y+v)$$

CC(u,v) represents the cross-correlation between the reference frame and the support frame. Calculating the value of CC(u,v) at a single (u,v) requires traversing all (x,y), so calculating all values of CC(u,v) requires traversing all (x,y) multiple times, and the computational complexity is very high. Therefore, a fast Fourier transform is performed on the reference frame to obtain the frequency domain signal of the reference frame, and on the support frame to obtain the frequency domain signal of the support frame, which converts both frames from the spatial domain to the frequency domain. Alignment in the frequency domain reduces the complexity of calculating the correlation, speeds up alignment, and improves the alignment result.
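The following sketch compares a direct spatial-domain evaluation of the cross-correlation with the frequency-domain route. It assumes circular (wrap-around) indexing so that the two results agree exactly; the function names are hypothetical.

```python
import numpy as np

def cross_correlation_direct(f, g):
    """CC(u, v) = sum over (x, y) of f(x, y) * g(x + u, y + v), with wrap-around indexing.

    Every lag (u, v) traverses all pixels, so the cost grows with (H * W) ** 2.
    """
    h, w = f.shape
    cc = np.zeros((h, w))
    for v in range(h):
        for u in range(w):
            # np.roll by (-v, -u) yields the array whose entry (y, x) is g(y + v, x + u)
            cc[v, u] = np.sum(f * np.roll(g, shift=(-v, -u), axis=(0, 1)))
    return cc

def cross_correlation_fft(f, g):
    """Same quantity via the correlation theorem: inverse FFT of conj(F) * G."""
    return np.fft.ifft2(np.conj(np.fft.fft2(f)) * np.fft.fft2(g)).real

rng = np.random.default_rng(2)
f = rng.random((8, 8))
g = rng.random((8, 8))
cc_direct = cross_correlation_direct(f, g)
cc_fft = cross_correlation_fft(f, g)
```

The direct version visits every pixel once per lag, whereas the FFT version needs only three transforms of cost on the order of H·W·log(H·W).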

In the embodiment of the present disclosure, when the alignment parameters between the reference frame and the support frame are an offset, a rotation amount, and a scaling amount, the reference frame $f(x,y)$ and the support frame $g(x,y)$ are not aligned: there are displacements $\Delta x$ and $\Delta y$ in the x and y directions, as well as a rotation amount $\theta_0$ and a scaling amount $s$. The relationship between the reference frame $f(x,y)$ and the support frame $g(x,y)$ is then

$$g(x,y) = f\big(s(x\cos\theta_0 + y\sin\theta_0) - \Delta x,\; s(-x\sin\theta_0 + y\cos\theta_0) - \Delta y\big)$$

A fast Fourier transform is performed on the reference frame $f(x,y)$ to obtain the frequency domain signal of the reference frame and the corresponding amplitude $M_F(u,v)$, subsequently called the first amplitude. A fast Fourier transform is performed on the support frame $g(x,y)$ to obtain the frequency domain signal of the support frame and the corresponding amplitude $M_G(u,v)$, subsequently called the second amplitude. It then follows that

$$M_G(u,v) = s^{2}\, M_F\big(s^{-1}(u\cos\theta_0 + v\sin\theta_0),\; s^{-1}(-u\sin\theta_0 + v\cos\theta_0)\big)$$

Converting $M_G(u,v)$ and $M_F(u,v)$ into polar coordinates $(\lambda,\theta)$, where $\lambda$ is the logarithm of the radius, gives $M_{GLP}(\lambda,\theta) = s^{2} M_{FLP}(\lambda - \log s,\; \theta - \theta_0)$, where $\log s$ is the difference along the log-radius axis between the amplitudes of the reference frame and the support frame in polar coordinates, and $\theta_0$ is the difference in angle.
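The step from the Cartesian relationship to the polar one can be made explicit. The derivation below assumes the substitution $u = e^{\lambda}\cos\theta$, $v = e^{\lambda}\sin\theta$ with a natural logarithm; the source does not state the substitution, but the $\log s$ shift implies a logarithmic radius.

```latex
\begin{aligned}
s^{-1}(u\cos\theta_0 + v\sin\theta_0)
  &= s^{-1} e^{\lambda}(\cos\theta\cos\theta_0 + \sin\theta\sin\theta_0)
   = e^{\lambda - \log s}\cos(\theta - \theta_0), \\
s^{-1}(-u\sin\theta_0 + v\cos\theta_0)
  &= s^{-1} e^{\lambda}(\sin\theta\cos\theta_0 - \cos\theta\sin\theta_0)
   = e^{\lambda - \log s}\sin(\theta - \theta_0), \\
M_{GLP}(\lambda,\theta)
  &= M_G(e^{\lambda}\cos\theta,\; e^{\lambda}\sin\theta)
   = s^{2}\, M_{FLP}(\lambda - \log s,\; \theta - \theta_0).
\end{aligned}
```

Under this substitution, scaling and rotation both become plain shifts along the two axes, so they can be found with the same correlation approach used for the offset.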

Here, $M_{FLP}(\lambda,\theta)$ is the amplitude of the reference frame in polar coordinates, subsequently called the first polar coordinate amplitude, and $M_{GLP}(\lambda,\theta)$ is the amplitude of the support frame in polar coordinates, subsequently called the second polar coordinate amplitude. According to the correlation theorem, the correlation between the reference frame and the support frame can be calculated as:

$$M(CC) = M_{FLP}^{*}(\lambda,\theta)\, M_{GLP}(\lambda,\theta),$$

where $M_{FLP}^{*}(\lambda,\theta)$ denotes the complex conjugate of $M_{FLP}(\lambda,\theta)$, and $M(CC)$ denotes the Fourier transform of the cross-correlation between the reference frame and the support frame in polar coordinates. Multi-frame image alignment can thus be expressed as calculating the correlation between two images: the correlation reaches its maximum when the two images match best. Therefore, in the embodiment of the present disclosure, the log-radius coordinate corresponding to the maximum of the correlation given by $M(CC)$ yields $\log s$ and hence the scaling amount $s$ between the reference frame and the support frame, and the angle coordinate corresponding to that maximum is used as the rotation amount between the reference frame and the support frame.

In the embodiment of the present disclosure, the reference frame and the support frame can be aligned using the calculated scaling amount and rotation amount.

According to the embodiments of the present disclosure, not only the offset of the support frame relative to the reference frame but also its rotation amount and/or scaling amount relative to the reference frame can be calculated, which further improves alignment accuracy, yields a better alignment result, and improves the user experience.
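The rotation and scaling estimation can be illustrated on synthetic data. The sketch below does not process real images: it assumes an analytic, point-symmetric amplitude function as a stand-in for the amplitude spectrum, samples it directly on a grid of log-radius and angle values, and picks a scale and rotation that fall exactly on grid steps. All names, grid sizes, and blob positions are hypothetical.

```python
import numpy as np

def amplitude_f(u, v):
    """Synthetic point-symmetric amplitude M_F(u, v), a stand-in for a real amplitude spectrum."""
    centers = [(6.0, 1.0), (-2.0, 7.0), (3.0, 4.0)]
    out = np.zeros_like(u, dtype=np.float64)
    for cu, cv in centers:
        for sign in (1.0, -1.0):                      # mirror each blob through the origin
            out += np.exp(-((u - sign * cu) ** 2 + (v - sign * cv) ** 2) / (2 * 0.7 ** 2))
    return out

def amplitude_g(u, v, s, theta0):
    """M_G(u, v) = s^2 * M_F(rotated and scaled frequency coordinates), as in the text."""
    c, sn = np.cos(theta0), np.sin(theta0)
    return s ** 2 * amplitude_f((u * c + v * sn) / s, (-u * sn + v * c) / s)

n_lam, n_theta = 64, 90
d_lam, d_theta = 0.05, np.pi / n_theta                # angle covers [0, pi) because of symmetry
s_true, theta0_true = np.exp(4 * d_lam), 7 * d_theta  # chosen to fall on whole grid steps

lam, theta = np.meshgrid(np.arange(n_lam) * d_lam,
                         np.arange(n_theta) * d_theta, indexing="ij")
u, v = np.exp(lam) * np.cos(theta), np.exp(lam) * np.sin(theta)
m_flp = amplitude_f(u, v)                             # first polar coordinate amplitude
m_glp = amplitude_g(u, v, s_true, theta0_true)        # second polar coordinate amplitude

# Correlate the two polar amplitudes via the FFT and locate the maximum.
cc = np.fft.ifft2(np.conj(np.fft.fft2(m_flp)) * np.fft.fft2(m_glp)).real
k, m = (int(i) for i in np.unravel_index(np.argmax(cc), cc.shape))
if k > n_lam // 2:
    k -= n_lam
if m > n_theta // 2:
    m -= n_theta

s_est = float(np.exp(k * d_lam))                      # scaling amount from the log-radius shift
theta0_est = m * d_theta                              # rotation amount from the angle shift
```

Because a real amplitude spectrum is symmetric about the origin, the rotation found this way is only determined modulo 180 degrees. For real frames the amplitudes would also have to be resampled onto such a grid by interpolation, which this sketch sidesteps.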

Figure 3 is a flowchart of yet another multi-frame image alignment method according to an exemplary embodiment. As shown in Figure 3, the multi-frame image alignment method is used in a terminal. The embodiments of the present disclosure do not limit the type of terminal to which the method is applied. The multi-frame image alignment method includes step S31, step S32, and step S33. Step S31 is performed in a manner similar to step S21 in Figure 2 and is not described again here.

In step S32, the cross power spectrum of the first amplitude and the second amplitude is determined.

In the embodiment of the present disclosure, when calculating the correlation between the reference frame and the supporting frame, no normalization operation is performed on the brightness of the image pixels. If the exposure time or ISO of the two images is different, resulting in a difference in the brightness of the images, then it is impossible to find a position where the two images can be aligned. Therefore, the present disclosure introduces a normalization factor to obtain the cross power spectrum of the frequency domain signal of the reference frame and the frequency domain signal of the support frame, thereby reducing the impact of image brightness on alignment accuracy.

A fast Fourier transform is performed on the reference frame to obtain its frequency domain signal and the first amplitude $F(u,v)$, and on the support frame to obtain its frequency domain signal and the second amplitude $G(u,v)$. The cross power spectrum of the frequency domain signal of the reference frame and the frequency domain signal of the support frame is calculated as

$$Q(u,v) = \frac{F^{*}(u,v)\, G(u,v)}{\left|F^{*}(u,v)\, G(u,v)\right|},$$

where $F^{*}(u,v)$ denotes the complex conjugate of $F(u,v)$, and $Q(u,v)$ denotes the cross power spectrum.
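The effect of the normalization factor can be sketched as follows. The function name, the small `eps` guard against division by zero, and modelling the brightness difference as a single gain factor are assumptions made for the example.

```python
import numpy as np

def cross_power_spectrum(F, G, eps=1e-12):
    """Q = conj(F) * G / |conj(F) * G|; eps (an assumption) avoids division by zero."""
    R = np.conj(F) * G
    return R / (np.abs(R) + eps)

rng = np.random.default_rng(3)
f = rng.random((32, 32))                              # toy reference frame
g = np.roll(f, shift=(2, 5), axis=(0, 1))             # toy support frame
g_dark = 0.4 * g                                      # same frame with lower brightness

F = np.fft.fft2(f)
Q_normal = cross_power_spectrum(F, np.fft.fft2(g))
Q_dark = cross_power_spectrum(F, np.fft.fft2(g_dark))
```

Every entry of the cross power spectrum has unit magnitude, so a global brightness gain on the support frame leaves it essentially unchanged.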

In step S33, the peak value is obtained after performing inverse fast Fourier transform on the cross power spectrum, and the alignment parameter between the reference frame and the support frame is determined based on the peak value.

The alignment parameter determined based on the peak value in the embodiment of the present disclosure may include at least one of a translation amount, a rotation amount, and a scaling amount.

The following is an exemplary explanation of the process of performing inverse fast Fourier transform based on the cross power spectrum to obtain the peak value, and determining the alignment parameters based on the peak value.

In one implementation of the present disclosure, the peak value is obtained by inversely transforming the cross power spectrum through the following formula:

$$q(x,y) = \mathcal{F}^{-1}\{Q(u,v)\}$$

where $\mathcal{F}^{-1}$ represents the inverse fast Fourier transform. The peak of q(x, y) represents the position with the strongest correlation between the reference frame and the support frame.

Among them, the abscissa corresponding to the position of the peak is the horizontal offset Δx of the support frame relative to the reference frame, and the ordinate corresponding to the position of the peak is the vertical offset Δy of the support frame relative to the reference frame.
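The peak search described above can be sketched as follows. The function name `estimate_translation` and the constant `eps` are assumptions of this example, and the shift is simulated with a circular `np.roll`, which real frames only approximate. Because the discrete transform is periodic, a peak index past the midpoint of an axis is interpreted here as a negative offset.

```python
import numpy as np


def estimate_translation(ref, sup, eps=1e-12):
    """Return (dy, dx): offset of the support frame relative to the reference frame."""
    F = np.fft.fft2(ref)
    G = np.fft.fft2(sup)
    product = np.conj(F) * G
    Q = product / (np.abs(product) + eps)      # normalized cross power spectrum
    q = np.real(np.fft.ifft2(Q))               # correlation surface
    peak_row, peak_col = np.unravel_index(np.argmax(q), q.shape)
    # The FFT is periodic, so peaks past the midpoint stand for negative offsets.
    h, w = q.shape
    dy = peak_row - h if peak_row > h // 2 else peak_row
    dx = peak_col - w if peak_col > w // 2 else peak_col
    return int(dy), int(dx)


rng = np.random.default_rng(1)
ref = rng.random((64, 64))
sup = np.roll(ref, shift=(3, -5), axis=(0, 1))          # dy = 3, dx = -5
dy, dx = estimate_translation(ref, sup)
aligned = np.roll(sup, shift=(-dy, -dx), axis=(0, 1))   # undo the estimated offset
```

Here the column of the peak gives the horizontal offset and the row gives the vertical offset; rolling the support frame back by the estimated amounts reproduces the reference frame in this synthetic case.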

Further, in the embodiment of the present disclosure, when determining the rotation amount θ₀ and/or the scaling amount s between the reference frame and the support frame, the relationship between the reference frame f(x, y) and the support frame g(x, y) is expressed as:

$$g(x,y) = f\big(s(x\cos\theta_0 + y\sin\theta_0) - \Delta x,\; s(-x\sin\theta_0 + y\cos\theta_0) - \Delta y\big)$$

A fast Fourier transform is performed on the reference frame and the support frame to obtain the first amplitude M_F(u, v) and the second amplitude M_G(u, v). Converting the first amplitude and the second amplitude to polar coordinates (λ, θ) gives the first amplitude M_FLP(λ, θ) and the second amplitude M_GLP(λ, θ) in polar coordinates. In polar coordinates, the first amplitude and the second amplitude have the following correspondence:

$$M_{GLP}(\lambda,\theta) = s^{2}\,M_{FLP}(\lambda - \log s,\; \theta - \theta_0)$$

where log s is the difference along the radial axis between the amplitudes of the reference frame and the support frame in polar coordinates, and θ₀ is the difference in angle.
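The reason the polar conversion helps can be checked numerically: if the radial coordinate is taken as the logarithm of the radius, scaling a point by s and rotating it by θ₀ becomes a plain shift of (log s, θ₀). The sketch below uses only the standard library; the helper names `to_log_polar` and `scale_and_rotate`, and the sample values, are hypothetical.

```python
import math


def to_log_polar(u, v):
    """Map a Cartesian point (u, v) to log-polar coordinates (lambda, theta)."""
    radius = math.hypot(u, v)
    return math.log(radius), math.atan2(v, u)


def scale_and_rotate(u, v, s, theta0):
    """Scale a point by s and rotate it counter-clockwise by theta0 (radians)."""
    c, sn = math.cos(theta0), math.sin(theta0)
    return s * (u * c - v * sn), s * (u * sn + v * c)


s, theta0 = 1.5, math.radians(20.0)
u, v = 3.0, 4.0
lam1, th1 = to_log_polar(u, v)
lam2, th2 = to_log_polar(*scale_and_rotate(u, v, s, theta0))
d_lambda = lam2 - lam1      # shift along the radial axis, equal to log(s)
d_theta = th2 - th1         # shift along the angular axis, equal to theta0
```

A rotation and a scaling therefore turn into a translation in (λ, θ), which can be found with the same peak search that is used for the offset.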

When calculating the cross power spectrum of the frequency domain signal of the reference frame and the frequency domain signal of the support frame in polar coordinates, the following formula can be used:

$$Q(\lambda,\theta) = \frac{M_{FLP}^{*}(\lambda,\theta)\,M_{GLP}(\lambda,\theta)}{\left|M_{FLP}^{*}(\lambda,\theta)\,M_{GLP}(\lambda,\theta)\right|}$$

where M_FLP*(λ, θ) represents the complex conjugate of M_FLP(λ, θ), and Q(λ, θ) represents the cross power spectrum.

Further, the inverse fast Fourier transform is performed on the cross power spectrum through the following formula:

$$q(\lambda,\theta) = \mathcal{F}^{-1}\{Q(\lambda,\theta)\}$$

where $\mathcal{F}^{-1}$ stands for the inverse fast Fourier transform.

In the embodiment of the present disclosure, the radial coordinate of the polar coordinate corresponding to the peak of q(λ, θ) is used as the scaling amount between the reference frame and the support frame, and the angle of the polar coordinate corresponding to the peak of q(λ, θ) is used as the rotation amount between the reference frame and the support frame.
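Reading the scaling amount and the rotation amount off the peak can be sketched as below. Several things here are assumptions rather than part of the disclosure: the amplitudes are taken to be already resampled onto a log-polar grid whose rows step by `d_lambda` in log-radius and whose columns step by `d_theta` in angle, a 2-D FFT of those arrays is taken before the cross power spectrum is formed, and the second array is a circularly shifted copy of the first, standing in for a scaled and rotated frame.

```python
import numpy as np


def rotation_scale_from_log_polar(m_flp, m_glp, d_lambda, d_theta, eps=1e-12):
    """Estimate (scale, rotation in radians) from two log-polar amplitude arrays.

    Rows index the log-radius lambda (step d_lambda); columns index the angle
    theta (step d_theta). Both step sizes are assumptions of this sketch.
    """
    A = np.fft.fft2(m_flp)
    B = np.fft.fft2(m_glp)
    product = np.conj(A) * B
    Q = product / (np.abs(product) + eps)     # normalized cross power spectrum
    q = np.real(np.fft.ifft2(Q))
    row, col = np.unravel_index(np.argmax(q), q.shape)
    n_lambda, n_theta = q.shape
    # Wrap peaks past the midpoint to negative shifts.
    if row > n_lambda // 2:
        row -= n_lambda
    if col > n_theta // 2:
        col -= n_theta
    scale = float(np.exp(row * d_lambda))     # radial shift log s -> s
    rotation = float(col * d_theta)           # angular shift theta_0
    return scale, rotation


rng = np.random.default_rng(2)
n_lambda, n_theta = 64, 90
d_lambda = np.log(32.0) / n_lambda
d_theta = np.pi / n_theta
m_flp = rng.random((n_lambda, n_theta))
m_glp = np.roll(m_flp, shift=(4, -7), axis=(0, 1))   # synthetic stand-in
scale, rotation = rotation_scale_from_log_polar(m_flp, m_glp, d_lambda, d_theta)
```

In this synthetic case the peak sits four rows and minus seven columns from the origin, so the estimate is exp(4·d_lambda) for the scale and −7·d_theta for the rotation. The resolution of both estimates is set by the grid steps.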

According to embodiments of the present disclosure, the cross power spectrum of the frequency domain signal of the reference frame and the frequency domain signal of the support frame is obtained, and an inverse fast Fourier transform is performed on the cross power spectrum, which normalizes the image brightness. The alignment parameters between the reference frame and the support frame are then determined based on the peak value obtained from the cross power spectrum, and the reference frame and the support frame are aligned. This improves the alignment accuracy and reduces the impact of image brightness on the alignment accuracy.

Figure 4 is a flow chart of yet another multi-frame image alignment method according to an exemplary embodiment. As shown in Figure 4, the multi-frame image alignment method is used in a terminal. The type of terminal used is not limited.

Referring to Figure 4, Y-channel downsampling is performed on the reference frame and the support frame respectively, and a fast Fourier transform is then performed to obtain the frequency domain signals of the reference frame and the support frame. The cross power spectrum of the frequency domain signals of the reference frame and the support frame is determined, an inverse fast Fourier transform is performed on the cross power spectrum to obtain the peak value, and the alignment parameters of the reference frame and the support frame are determined based on the peak value, so that the reference frame and the support frame are aligned. Determining the alignment parameters of the reference frame and the support frame according to the peak value is similar to step S33 in FIG. 3, and will not be described again in this disclosure.

In the embodiment of the present disclosure, Y channel downsampling can be selectively turned on or off. When the alignment accuracy requirement is low, Y channel downsampling can be turned on to reduce the resolution of the reference frame and the support frame accordingly. Aligning multiple frames on a reduced-resolution image can speed up the alignment.
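The flow of Figure 4 with the optional downsampling switch can be sketched as follows. The helper names, the block-averaging downsampler, the factor of 4, and the use of a circular `np.roll` to apply the offset are all assumptions of this example; the inputs are taken to be Y (luma) planes already.

```python
import numpy as np


def downsample_y(y, factor):
    """Downsample a Y (luma) plane by averaging factor x factor blocks."""
    h, w = y.shape
    h, w = h - h % factor, w - w % factor          # crop to a multiple of factor
    blocks = y[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))


def phase_correlation_offset(ref, sup, eps=1e-12):
    """(dy, dx) of sup relative to ref, from the peak of the inverse transform."""
    product = np.conj(np.fft.fft2(ref)) * np.fft.fft2(sup)
    q = np.real(np.fft.ifft2(product / (np.abs(product) + eps)))
    row, col = np.unravel_index(np.argmax(q), q.shape)
    h, w = q.shape
    return (int(row - h if row > h // 2 else row),
            int(col - w if col > w // 2 else col))


def align(ref_y, sup_y, downsample=True, factor=4):
    """Align sup_y to ref_y; Y-channel downsampling can be switched on or off."""
    if downsample:
        dy, dx = phase_correlation_offset(downsample_y(ref_y, factor),
                                          downsample_y(sup_y, factor))
        dy, dx = dy * factor, dx * factor          # back to full-resolution pixels
    else:
        dy, dx = phase_correlation_offset(ref_y, sup_y)
    return np.roll(sup_y, shift=(-dy, -dx), axis=(0, 1)), (dy, dx)


rng = np.random.default_rng(3)
ref_y = rng.random((128, 128))
sup_y = np.roll(ref_y, shift=(8, -12), axis=(0, 1))
aligned_fast, offset_fast = align(ref_y, sup_y, downsample=True)
aligned_full, offset_full = align(ref_y, sup_y, downsample=False)
```

With downsampling on, the transforms run on a 32×32 plane instead of 128×128, but the offset is only resolved in multiples of the factor, which matches the trade-off between speed and accuracy described above. The test offsets were chosen as multiples of 4 so that both paths agree exactly.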

It should be noted that those skilled in the art can understand that the various implementations/embodiments mentioned above in the embodiments of the present disclosure can be used in conjunction with the foregoing embodiments or can be used independently. Whether used alone or in conjunction with the foregoing embodiments, the implementation principles are similar. In the implementation of the present disclosure, some embodiments are described in terms of implementations used together. Of course, those skilled in the art can understand that such illustrations do not limit the embodiments of the present disclosure.

Based on the same concept, embodiments of the present disclosure also provide a multi-frame image alignment device.

It can be understood that, in order to implement the above functions, the multi-frame image alignment device provided by the embodiment of the present disclosure includes hardware structures and/or software modules corresponding to each function. Combined with the units and algorithm steps of each example disclosed in the embodiments of the present disclosure, the embodiments of the present disclosure can be implemented in the form of hardware or a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving the hardware depends on the specific application and design constraints of the technical solution. Those skilled in the art can use different methods to implement the described functions for each specific application, but such implementation should not be considered to go beyond the scope of the technical solutions of the embodiments of the present disclosure.

FIG. 5 is a block diagram of a multi-frame image alignment device 100 according to an exemplary embodiment. Referring to FIG. 5, the device includes an acquisition unit 101 and a processing unit 102.

The acquisition unit 101 is configured to acquire reference frames and support frames of multi-frame images.

The processing unit 102 is configured to perform fast Fourier transform on the reference frame to obtain the frequency domain signal of the reference frame, and perform fast Fourier transform on the support frame to obtain the frequency domain signal of the support frame; and, based on the frequency domain signal of the reference frame and the frequency domain signal of the support frame, align the support frame and the reference frame to obtain a multi-frame aligned image.

In the embodiment of the present disclosure, the processing unit 102 aligns the support frame and the reference frame based on the frequency domain signal of the reference frame and the frequency domain signal of the support frame in the following manner: based on the frequency domain signal of the reference frame and the frequency domain signal of the support frame, determine the first amplitude and the second amplitude, where the first amplitude is the amplitude corresponding to the frequency domain signal of the reference frame, and the second amplitude is the amplitude corresponding to the frequency domain signal of the support frame; based on the first amplitude and the second amplitude, determine an alignment parameter between the reference frame and the support frame, where the alignment parameter includes at least one of the following: an offset, a rotation amount, and a scaling amount; and based on the alignment parameter, align the support frame and the reference frame.

In the embodiment of the present disclosure, the alignment parameter includes an offset; the processing unit 102 determines the alignment parameter between the reference frame and the support frame based on the first amplitude and the second amplitude in the following manner: based on the first amplitude and the second amplitude, determine the correlation between the frequency domain signal of the reference frame and the frequency domain signal of the support frame; and based on the maximum value of the correlation, determine the offset between the reference frame and the support frame.

In the embodiment of the present disclosure, the alignment parameter includes a rotation amount and/or a scaling amount; the processing unit 102 determines the alignment parameter between the reference frame and the support frame based on the first amplitude and the second amplitude in the following manner: convert the first amplitude and the second amplitude to polar coordinates respectively to obtain the first polar coordinate amplitude and the second polar coordinate amplitude; and based on the first polar coordinate amplitude and the second polar coordinate amplitude, determine the rotation amount and/or the scaling amount between the reference frame and the support frame.

In the embodiment of the present disclosure, the processing unit 102 determines the alignment parameter between the reference frame and the support frame based on the first amplitude and the second amplitude in the following manner: determine the cross power spectrum of the first amplitude and the second amplitude; perform inverse fast Fourier transform on the cross power spectrum to obtain the peak value, and determine the alignment parameter between the reference frame and the support frame based on the peak value.

In the embodiment of the present disclosure, the processing unit 102 performs fast Fourier transform on the reference frame to obtain the frequency domain signal of the reference frame, and performs fast Fourier transform on the support frame to obtain the frequency domain signal of the support frame, in the following manner: Y-channel downsampling is performed on the reference frame and fast Fourier transform is then performed to obtain the frequency domain signal of the reference frame, and Y-channel downsampling is performed on the support frame and fast Fourier transform is then performed to obtain the frequency domain signal of the support frame.

Regarding the devices in the above embodiments, the specific manner in which each module performs operations has been described in detail in the embodiments related to the method, and will not be described in detail here.
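As a rough software analogue of the unit structure, the acquisition unit and the processing unit could be modeled as two small classes. This is only an illustrative sketch: the class and method names are hypothetical, taking the first frame of the burst as the reference frame is an assumption, and only the offset is estimated.

```python
import numpy as np


class AcquisitionUnit:
    """Splits a burst into one reference frame and the remaining support frames."""

    def acquire(self, frames, ref_index=0):
        # Taking frame ref_index as the reference is an assumption of this sketch.
        ref = frames[ref_index]
        supports = [f for i, f in enumerate(frames) if i != ref_index]
        return ref, supports


class ProcessingUnit:
    """Aligns each support frame to the reference frame in the frequency domain."""

    def _offset(self, ref, sup, eps=1e-12):
        product = np.conj(np.fft.fft2(ref)) * np.fft.fft2(sup)
        q = np.real(np.fft.ifft2(product / (np.abs(product) + eps)))
        row, col = np.unravel_index(np.argmax(q), q.shape)
        h, w = q.shape
        return (int(row - h if row > h // 2 else row),
                int(col - w if col > w // 2 else col))

    def process(self, ref, supports):
        aligned = [ref]
        for sup in supports:
            dy, dx = self._offset(ref, sup)
            aligned.append(np.roll(sup, shift=(-dy, -dx), axis=(0, 1)))
        return aligned


rng = np.random.default_rng(4)
base = rng.random((32, 32))
burst = [base,
         np.roll(base, shift=(1, 2), axis=(0, 1)),
         np.roll(base, shift=(-3, 4), axis=(0, 1))]
ref, supports = AcquisitionUnit().acquire(burst)
aligned_frames = ProcessingUnit().process(ref, supports)
```

Keeping acquisition and processing separate mirrors the split between units 101 and 102; the processing class could be extended with the rotation and scaling estimates without changing the acquisition side.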

FIG. 6 is a block diagram of a device 200 for multi-frame image alignment according to an exemplary embodiment. For example, the device 200 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or the like.

Referring to Figure 6, device 200 may include one or more of the following components: processing component 202, memory 204, power component 206, multimedia component 208, audio component 210, input/output (I/O) interface 212, sensor component 214, and communication component 216.

Processing component 202 generally controls the overall operations of device 200, such as operations associated with display, phone calls, data communications, camera operations, and recording operations. The processing component 202 may include one or more processors 220 to execute instructions to complete all or part of the steps of the above method. Additionally, processing component 202 may include one or more modules that facilitate interaction between processing component 202 and other components. For example, processing component 202 may include a multimedia module to facilitate interaction between multimedia component 208 and processing component 202.

Memory 204 is configured to store various types of data to support operations at device 200. Examples of such data include instructions for any application or method operating on device 200, contact data, phonebook data, messages, pictures, videos, etc. Memory 204 may be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.

Power component 206 provides power to various components of device 200. Power component 206 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power to device 200.

Multimedia component 208 includes a screen that provides an output interface between the device 200 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide action. In some embodiments, multimedia component 208 includes a front-facing camera and/or a rear-facing camera. When the device 200 is in an operating mode, such as a shooting mode or a video mode, the front camera and/or the rear camera may receive external multimedia data. Each front-facing camera and rear-facing camera can be a fixed optical lens system or have a focal length and optical zoom capabilities.

Audio component 210 is configured to output and/or input audio signals. For example, audio component 210 includes a microphone (MIC) configured to receive external audio signals when device 200 is in operating modes, such as call mode, recording mode, and voice recognition mode. The received audio signals may be further stored in memory 204 or sent via communications component 216. In some embodiments, audio component 210 also includes a speaker for outputting audio signals.

The I/O interface 212 provides an interface between the processing component 202 and a peripheral interface module, which may be a keyboard, a click wheel, a button, etc. These buttons may include, but are not limited to: Home button, Volume buttons, Start button, and Lock button.

Sensor component 214 includes one or more sensors for providing various aspects of status assessment for device 200. For example, the sensor component 214 can detect the open/closed state of the device 200 and the relative positioning of components, such as the display and keypad of the device 200. The sensor component 214 can also detect a change in position of the device 200 or a component of the device 200, the presence or absence of user contact with the device 200, the orientation or acceleration/deceleration of the device 200, and temperature changes of the device 200. Sensor component 214 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. Sensor component 214 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 214 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.

Communication component 216 is configured to facilitate wired or wireless communication between apparatus 200 and other devices. Device 200 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In one exemplary embodiment, the communication component 216 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communications component 216 also includes a near field communications (NFC) module to facilitate short-range communications. For example, the NFC module can be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.

In an exemplary embodiment, device 200 may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors or other electronic components, for executing the above method.

In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, such as the memory 204 including instructions, where the instructions can be executed by the processor 220 of the device 200 to complete the above method. For example, the non-transitory computer-readable storage medium may be ROM, random access memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.

It can be further understood that "plurality" in this disclosure refers to two or more, and other quantifiers are similar. "And/or" describes the relationship between related objects, indicating that there can be three relationships. For example, A and/or B can mean: A exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the related objects are in an "or" relationship. The singular forms "a", "said" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.

It is further understood that the terms "first", "second", etc. are used to describe various information, but the information should not be limited to these terms. These terms are only used to distinguish information of the same type from each other and do not imply a specific order or importance. In fact, expressions such as "first" and "second" can be used interchangeably. For example, without departing from the scope of the present disclosure, the first information may also be called second information, and similarly, the second information may also be called first information.

It will be further understood that although the operations are described in a specific order in the drawings of the embodiments of the present disclosure, this should not be understood as requiring that these operations be performed in the specific order shown or in serial order, or that all of the operations shown be performed, to obtain the desired results. In certain circumstances, multitasking and parallel processing may be advantageous.

Other embodiments of the disclosure will be readily apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure that follow the general principles of the disclosure and include common knowledge or customary technical means in the technical field that are not disclosed in the disclosure.

It is to be understood that the present disclosure is not limited to the precise structures described above and illustrated in the accompanying drawings, and various modifications and changes may be made without departing from the scope thereof. The scope of the disclosure is limited only by the appended claims.

Claims

A multi-frame image alignment method, characterized in that, applied to a terminal, the method includes:

Obtain the reference frames and supporting frames of multi-frame images;

Perform fast Fourier transform on the reference frame to obtain the frequency domain signal of the reference frame, and perform fast Fourier transform on the support frame to obtain the frequency domain signal of the support frame;

Based on the frequency domain signal of the reference frame and the frequency domain signal of the support frame, the support frame and the reference frame are aligned to obtain a multi-frame aligned image.
The method of claim 1, wherein aligning the support frame and the reference frame based on the frequency domain signal of the reference frame and the frequency domain signal of the support frame includes:

Based on the frequency domain signal of the reference frame and the frequency domain signal of the support frame, determine a first amplitude and a second amplitude, where the first amplitude is the amplitude corresponding to the frequency domain signal of the reference frame, The second amplitude is the amplitude corresponding to the frequency domain signal of the support frame;

Based on the first amplitude and the second amplitude, an alignment parameter between the reference frame and the support frame is determined, the alignment parameter including at least one of the following: an offset, a rotation amount, and a scaling amount;

The support frame and the reference frame are aligned based on the alignment parameter.
The method of claim 2, wherein the alignment parameter includes an offset;

Determining the alignment parameter between the reference frame and the support frame based on the first amplitude and the second amplitude includes:

Based on the first amplitude and the second amplitude, determine the correlation between the frequency domain signal of the reference frame and the frequency domain signal of the support frame;

Based on the maximum value of the correlation, an offset between the reference frame and the support frame is determined.
The method according to claim 2, wherein the alignment parameter includes a rotation amount and/or a scaling amount;

Determining the alignment parameter between the reference frame and the support frame based on the first amplitude and the second amplitude includes:

Convert the first amplitude value and the second amplitude value to polar coordinates respectively to obtain the first polar coordinate amplitude value and the second polar coordinate amplitude value;

An amount of rotation and/or scaling between the reference frame and the support frame is determined based on the first polar magnitude and the second polar magnitude.
The method according to any one of claims 2 to 4, wherein determining the alignment parameter between the reference frame and the support frame based on the first amplitude and the second amplitude includes:

determining a cross power spectrum of the first amplitude and the second amplitude;

A peak value is obtained after performing an inverse fast Fourier transform on the cross power spectrum, and an alignment parameter between the reference frame and the support frame is determined based on the peak value.
The method according to claim 1, wherein performing fast Fourier transform on the reference frame to obtain the frequency domain signal of the reference frame, and performing fast Fourier transform on the support frame to obtain the frequency domain signal of the support frame, includes:

Perform Y-channel downsampling on the reference frame and then perform fast Fourier transform to obtain the frequency domain signal of the reference frame, and perform Y-channel downsampling on the support frame and then perform fast Fourier transform to obtain the frequency domain signal of the support frame.
A multi-frame image alignment device, characterized in that it is applied to a terminal, and the device includes:

An acquisition unit is used to acquire the reference frame and support frame of the multi-frame image;

A processing unit configured to perform fast Fourier transform on the reference frame to obtain the frequency domain signal of the reference frame, and perform fast Fourier transform on the support frame to obtain the frequency domain signal of the support frame; And based on the frequency domain signal of the reference frame and the frequency domain signal of the support frame, the support frame and the reference frame are aligned to obtain a multi-frame aligned image.
The device according to claim 7, wherein the processing unit aligns the supporting frame and the reference frame based on the frequency domain signal of the reference frame and the frequency domain signal of the supporting frame in the following manner:

Based on the frequency domain signal of the reference frame and the frequency domain signal of the support frame, determine a first amplitude and a second amplitude, where the first amplitude is the amplitude corresponding to the frequency domain signal of the reference frame, The second amplitude is the amplitude corresponding to the frequency domain signal of the support frame;

Based on the first amplitude and the second amplitude, an alignment parameter between the reference frame and the support frame is determined, the alignment parameter including at least one of the following: an offset, a rotation amount, and a scaling amount;

The support frame and the reference frame are aligned based on the alignment parameter.
The device of claim 8, wherein the alignment parameter includes an offset;

The processing unit determines the alignment parameter between the reference frame and the support frame based on the first amplitude and the second amplitude in the following manner:

Based on the first amplitude and the second amplitude, determine the correlation between the frequency domain signal of the reference frame and the frequency domain signal of the support frame;

Based on the maximum value of the correlation, an offset between the reference frame and the support frame is determined.
The device according to claim 8, wherein the alignment parameter includes an amount of rotation and/or an amount of scaling;

The processing unit determines the alignment parameter between the reference frame and the support frame based on the first amplitude and the second amplitude in the following manner:

Convert the first amplitude and the second amplitude to polar coordinates respectively to obtain the first polar coordinate amplitude and the second polar coordinate amplitude;

An amount of rotation and/or scaling between the reference frame and the support frame is determined based on the first polar coordinate amplitude and the second polar coordinate amplitude.
The device according to any one of claims 8 to 10, characterized in that the processing unit determines the alignment parameter between the reference frame and the support frame based on the first amplitude and the second amplitude in the following manner:

determining a cross power spectrum of the first amplitude and the second amplitude;

A peak value is obtained after performing an inverse fast Fourier transform on the cross power spectrum, and an alignment parameter between the reference frame and the support frame is determined based on the peak value.
The device according to claim 7, characterized in that the processing unit performs fast Fourier transform on the reference frame to obtain the frequency domain signal of the reference frame, and performs fast Fourier transform on the support frame to obtain the frequency domain signal of the support frame, in the following manner:

Perform Y-channel downsampling on the reference frame and then perform fast Fourier transform to obtain the frequency domain signal of the reference frame, and perform Y-channel downsampling on the support frame and then perform fast Fourier transform to obtain the frequency domain signal of the support frame.
A multi-frame image alignment device, characterized by including:

processor;

Memory used to store instructions executable by the processor;

Wherein, the processor is configured to perform the multi-frame image alignment method according to any one of claims 1 to 6.
A non-transitory computer-readable storage medium, wherein when instructions in the storage medium are executed by a processor of a terminal, the terminal is enabled to perform the multi-frame image alignment method according to any one of claims 1 to 6.