CN112261387A - Image fusion method and device for multi-camera module, storage medium and mobile terminal - Google Patents

Image fusion method and device for multi-camera module, storage medium and mobile terminal

Info

Publication number
CN112261387A
CN112261387A (application number CN202011513656.5A)
Authority
CN
China
Prior art keywords: image, alignment, camera, feature point, alignment error
Legal status: Granted
Application number: CN202011513656.5A
Other languages: Chinese (zh)
Other versions: CN112261387B (en)
Inventors: 陈岩 (Chen Yan), 李怀东 (Li Huaidong), 姬长胜 (Ji Changsheng)
Current Assignee: Spreadtrum Communications Shanghai Co Ltd
Original Assignee: Spreadtrum Communications Shanghai Co Ltd
Application filed by Spreadtrum Communications Shanghai Co Ltd
Priority to CN202011513656.5A
Publication of CN112261387A
Application granted
Publication of CN112261387B
Legal status: Active

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00: Television systems
    • H04N7/18: Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • H04N7/183: Closed-circuit television [CCTV] systems for receiving images from a single remote source
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00: Details of television systems
    • H04N5/222: Studio circuitry; Studio devices; Studio equipment
    • H04N5/262: Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N5/265: Mixing

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Studio Devices (AREA)

Abstract

An image fusion method and device for a multi-camera module, a storage medium and a mobile terminal are provided, wherein the multi-camera module comprises a first camera and a second camera, and the method comprises the following steps: acquiring a first image and a second image, wherein the first image is acquired by the first camera, and the second image is acquired by the second camera; coarsely aligning the first image to the second image based on a perspective transformation; calculating the alignment error of the second image relative to the first image after coarse alignment; determining an alignment correction amount for each pixel in the first image according to the alignment error and correcting, to obtain a first image precisely aligned to the second image, recorded as a processed first image; and fusing the processed first image into the second image to obtain a fused image. The scheme of the invention can greatly improve the imaging quality of the multi-camera module through field-of-view fusion, and has low computational complexity.

Description

Image fusion method and device for multi-camera module, storage medium and mobile terminal
Technical Field
The invention relates to the technical field of image processing, in particular to an image fusion method and device for a multi-camera module, a storage medium and a mobile terminal.
Background
With the development of smartphones, the photographing function of smartphones has become more and more important. To deliver a better photographing experience, the number of cameras mounted on smartphones keeps increasing; for example, phones have gradually evolved from a single camera to dual cameras, triple cameras or even more. The additional cameras allow the image quality of photos taken by smartphones to improve continuously.
Taking the currently popular wide-angle and telephoto dual-camera mobile phones as an example, imaging on such phones is realized by a wide-angle lens and a telephoto lens respectively. That is, in the wide-angle lens imaging focal section, the image produced by the dual-camera phone is obtained by correspondingly digitally zooming the image captured by the wide-angle lens; in the telephoto lens imaging focal section, the image produced by the phone is obtained by digitally zooming the image captured by the telephoto lens.
Compared with traditional single-camera phones, multi-camera phones can obtain higher-quality images. However, due to lens characteristics, the color, brightness and sharpness of images captured by lenses of different focal lengths may differ significantly. During zooming on a multi-camera phone, if no processing is performed, switching between lenses of different focal lengths feels jarring to the user and is not as smooth and natural as optical zooming.
On the other hand, although the image quality of photos taken by multi-camera phones is improved over early single-camera phones, within a single lens focal section (excluding the focal section corresponding to the lens with the longest focal length), digital zoom still plays an important role when the zoom magnification is large, so the image quality still degrades.
Field-of-view fusion can mitigate both problems, but existing field-of-view fusion schemes generally have defects that result in poor quality of the fused image. Moreover, implementing the existing schemes on a smartphone involves high computational complexity, which seriously affects device power consumption.
Disclosure of Invention
The technical problem solved by the invention is how to improve the image quality of images obtained by field-of-view fusion.
In order to solve the above technical problem, an embodiment of the present invention provides an image fusion method for a multi-camera module, where the multi-camera module includes a first camera and a second camera, and the method includes: acquiring a first image and a second image, wherein the first image is acquired by the first camera, and the second image is acquired by the second camera; coarsely aligning the first image to a second image based on a perspective transformation; calculating the alignment error of the second image relative to the first image after rough alignment; determining the alignment correction amount of each pixel in the first image according to the alignment error and correcting to obtain a first image which is precisely aligned to the second image and recording the first image as a processed first image; and fusing the processed first image into the second image to obtain a fused image.
Optionally, the acquiring the first image and the second image includes: and respectively preprocessing the image acquired by the first camera and the image acquired by the second camera to obtain the first image and the second image with the same resolution and magnification.
Optionally, the coarsely aligning the first image to the second image based on the perspective transformation includes: calculating and matching to obtain a matching feature point pair of the first image and the second image, wherein the matching feature point pair refers to a matching result of a first image feature point in the first image and a second image feature point in the second image; calculating to obtain a global homography matrix based on the matched feature point pairs; performing a perspective transformation based on the global homography to coarsely align the first image to a second image.
Optionally, the number of the matching feature point pairs is multiple, and calculating an alignment error of the second image with respect to the first image after the rough alignment includes: and for each matching characteristic point pair, calculating the alignment error of the matching characteristic point pair aligned by the global homography matrix.
Optionally, the calculating an alignment error of the second image with respect to the first image after the coarse alignment includes: and for each second image feature point in the second image, calculating the deviation between the mapping coordinate of the second image feature point on the first image after rough alignment and a reference coordinate to obtain the alignment error of the second image feature point, wherein the reference coordinate is the coordinate of the first image feature point matched with the second image feature point in the first image on the first image.
Optionally, the alignment error includes an alignment error of at least one matching feature point pair of the first image and the second image, where the matching feature point pair is a matching result of a first image feature point in the first image and a second image feature point in the second image; determining and correcting the alignment correction amount of each pixel in the first image according to the alignment error to obtain a first image after fine alignment to the second image comprises: calculating to obtain the alignment error of the residual pixels in the second image except the second image feature point in the rough alignment relative to the corresponding pixels in the first image by taking the alignment error of the at least one matching feature point pair and the coordinates of the at least one matching feature point pair on the respective images as references; and for each pixel of the first image, determining and correcting the alignment correction of the pixel according to the alignment error of the pixel and the corresponding pixel in the second image to obtain the processed first image.
Optionally, the calculating, with reference to the alignment error of the at least one matching feature point pair and the coordinates of the at least one matching feature point pair on the respective images, the alignment error of the remaining pixels, excluding the second image feature point, in the second image after the rough alignment, with respect to the corresponding pixel in the first image includes: and carrying out interpolation calculation according to the alignment error of the at least one matching characteristic point pair and the coordinates of the at least one matching characteristic point pair on the respective images to obtain the alignment error of the residual pixels, except the second image characteristic point, in the second image relative to the corresponding pixels in the first image after rough alignment.
Optionally, the calculating, with reference to the alignment error of the at least one matching feature point pair and the coordinates of the at least one matching feature point pair on the respective images, the alignment error of the remaining pixels, excluding the second image feature point, in the second image after the rough alignment, with respect to the corresponding pixel in the first image includes: establishing a mathematical model based on a preset basis function by taking the alignment error of the at least one matching characteristic point pair and the coordinates of the at least one matching characteristic point pair on the respective images as anchor points, wherein the mathematical model is used for describing the alignment error of each pixel in the first image and the second image after rough alignment; and determining the alignment error of the residual pixels except the second image characteristic point in the second image relative to the corresponding pixels in the first image after the rough alignment based on the mathematical model.
Optionally, after calculating an alignment error of the second image relative to the first image after the rough alignment, before determining and correcting an alignment correction amount of each pixel in the first image according to the alignment error, the method further includes: judging whether the alignment error exceeds a preset correction range; and when the judgment result shows that the alignment error does not exceed the preset correction range, determining the alignment correction amount of each pixel in the first image according to the alignment error and correcting.
Optionally, the fusing the processed first image into the second image to obtain a fused image includes: determining a minimum circumscribed rectangular area containing all matched feature point pairs on the second image, wherein the matched feature point pairs refer to matching results of first image feature points in the first image and second image feature points in the second image; and replacing the second image in the minimum circumscribed rectangular area with the processed first image to obtain the fused image.
Optionally, the field of view of the first camera is included in the field of view of the second camera, or the field of view of the second camera is included in the field of view of the first camera.
In order to solve the above technical problem, an embodiment of the present invention further provides an image fusion apparatus for a multi-camera module, where the multi-camera module includes a first camera and a second camera, and the apparatus includes: the acquisition module is used for acquiring a first image and a second image, wherein the first image is acquired by the first camera, and the second image is acquired by the second camera; a coarse alignment module to coarsely align the first image to a second image based on a perspective transformation; the processing module is used for calculating the alignment error of the second image relative to the first image after rough alignment; a correction module, configured to determine an alignment correction amount of each pixel in the first image according to the alignment error and correct the alignment correction amount to obtain a first image that is precisely aligned to the second image, and record the first image as a processed first image; and the fusion module is used for fusing the processed first image into the second image to obtain a fused image.
To solve the above technical problem, an embodiment of the present invention further provides a storage medium, on which a computer program is stored, and the computer program executes the steps of the above method when being executed by a processor.
In order to solve the above technical problem, an embodiment of the present invention further provides a mobile terminal, including a memory and a processor, where the memory stores a computer program capable of running on the processor, and the processor executes the steps of the method when running the computer program, and the mobile terminal further includes the multi-camera module.
Compared with the prior art, the technical scheme of the embodiment of the invention has the following beneficial effects:
the embodiment of the invention provides an image fusion method for a plurality of camera modules, wherein the plurality of camera modules comprise a first camera and a second camera, and the method comprises the following steps: acquiring a first image and a second image, wherein the first image is acquired by the first camera, and the second image is acquired by the second camera; coarsely aligning the first image to a second image based on a perspective transformation; calculating the alignment error of the second image relative to the first image after rough alignment; determining the alignment correction amount of each pixel in the first image according to the alignment error and correcting to obtain a first image which is precisely aligned to the second image and recording the first image as a processed first image; and fusing the processed first image into the second image to obtain a fused image.
Compared with existing field-of-view fusion schemes, this embodiment can greatly improve the imaging quality of the multi-camera module through field-of-view fusion, with low computational complexity. Specifically, on the basis of the global perspective transformation, the alignment error between the two images after the perspective transformation is measured and corrected. Because the embodiment directly targets point-to-point alignment, the final alignment effect is better, the image quality of the fused image is greatly improved, and the method is more robust to parallax between different images. Therefore, by providing an alignment quality optimization scheme for field-of-view fusion, the alignment level of different lens images in the fused image is greatly improved. Further, compared with alignment optimization algorithms adopted in the prior art, only one perspective transformation is performed, so the computational complexity of a device executing this field-of-view fusion scheme is low.
Further, after the alignment error of the second image relative to the first image after coarse alignment is calculated, it is judged whether the alignment error exceeds a preset correction range; when the judgment result shows that the alignment error does not exceed the preset correction range, the alignment correction amount of each pixel in the first image is determined according to the alignment error and corrected. Thereby, device power consumption and imaging quality can be balanced. If the alignment error is too large (i.e., exceeds the preset correction range), only the second image obtained by digital zooming is used as the final output; this avoids a meaningless fusion that consumes device power while still yielding an unsatisfactory fused image. If the alignment error is within an acceptable range (i.e., falls within the preset correction range), the correction and compensation steps of this embodiment can be performed to obtain a fused image of higher quality.
Drawings
FIG. 1 is a flowchart of an image fusion method for a multi-camera module according to an embodiment of the present invention;
FIG. 2 is a schematic diagram illustrating the calculation of a fusion region in a typical application scenario according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating the calculation of alignment errors in a typical application scenario according to an embodiment of the present invention;
FIG. 4 is a flowchart of one embodiment of step S104 of FIG. 1;
FIG. 5 is a schematic structural diagram of an image fusion apparatus for a multi-camera module according to an embodiment of the present invention.
Detailed Description
As described in the background, existing field-of-view fusion schemes generally have defects that result in poor quality of the fused image.
Specifically, early single-camera phones achieved zooming only in a digital zoom mode, that is, an image interpolation algorithm was used to interpolate the initial image to change its size. When the zoom factor is large, the image quality of an image obtained by digital zooming degrades significantly. Some image capturing apparatuses, such as single-lens reflex cameras, image through an optical zoom lens, and the images obtained at different focal lengths have high image quality. However, an optical zoom lens is not suitable for a smartphone due to volume and cost. To improve imaging quality at acceptable volume and cost, multi-camera phones have become the trend in the current smartphone market.
To improve the imaging quality of mobile phones while controlling cost and volume, the currently popular approach is to configure several cameras with different characteristics on one phone. Representative camera assemblies (also called lens assemblies) include: (i) a dual-camera phone consisting of a wide-angle lens and a telephoto lens; (ii) a triple-camera phone consisting of an ultra-wide-angle lens, a wide-angle lens and a telephoto lens; (iii) a triple-camera phone consisting of a wide-angle lens, a telephoto lens and an ultra-telephoto lens.
A multi-camera phone divides the whole zoom range into several focal sections according to the lens combination; in each focal section, the image shot by the phone is obtained by digitally zooming the image captured by the lens corresponding to that focal section. Taking the wide-angle and telephoto dual-camera phone in (i) as an example, this type of phone divides the entire zoom range into a wide-angle focal section and a telephoto focal section. In the wide-angle focal section, the image shot by the phone is obtained by digitally zooming the image captured by the wide-angle lens; in the telephoto focal section, it is obtained by digitally zooming the image captured by the telephoto lens. The imaging principle of phones with other lens combinations is similar. Compared with early single-camera, digital-zoom phones, the quality of images obtained by multi-camera phones is improved.
However, due to lens characteristics, the color, brightness and sharpness of images shot by lenses in different focal sections differ. During zooming, if no processing is performed, the lens switching feels jarring and is not as smooth and natural as optical zooming. On the other hand, although the image quality of photos taken by multi-camera phones is improved over early single-camera phones, within a single lens focal section (excluding the focal section corresponding to the lens with the longest focal length), digital zoom still plays an important role when the zoom magnification is large, so the image quality still degrades.
Field-of-view fusion can mitigate both of the above problems. Taking the wide-angle and telephoto dual-camera phone in (i) as an example, since the field angle (field of view for short) of the image obtained by the telephoto lens is smaller than that of the image obtained by the wide-angle lens, within the focal section of the wide-angle lens the image obtained by the telephoto lens can be fused, after corresponding scaling, into the image generated by the wide-angle lens. After this processing, in the focal section of the wide-angle lens, the shot image is fused with the sharper image obtained by the telephoto lens, so the image quality is improved, and switching from the wide-angle lens to the telephoto lens also becomes smoother and more natural. Phones with other dual-camera combinations, such as ultra-wide-angle and wide-angle, or telephoto and ultra-telephoto, can be fused similarly.
However, existing field-of-view fusion schemes generally have defects that result in poor quality of the fused image. Specifically, when lens switching is required during zooming, most current multi-camera phones switch lenses directly, without any processing of the image. Although this method is simple and easy to implement, direct lens switching feels jarring due to lens characteristics and the influence of digital zoom; the transition is not as natural as that of an optical zoom lens, and digital zoom within each focal section can still degrade the image quality of photos taken by the phone.
Still taking the wide-angle and telephoto dual-camera phone in (i) as an example, in the field-of-view fusion process it is necessary to fuse the scaled, small-field-of-view image shot by the telephoto lens into the corresponding position of the large-field-of-view image shot by the wide-angle lens. Thus, image alignment becomes a key point in the field-of-view fusion problem. If the images are not well aligned, the junction between the wide-angle content and the telephoto content in the fused image will be visibly misaligned, so the image quality of the fused image will be poor.
The simplest image alignment method at present is image global alignment. Global alignment of images is achieved by computing a transformation matrix from pairs of matching feature points detected in two images, and then using the transformation matrix to act on each pixel coordinate of one image (e.g., a zoomed telephoto image) and aligning the result to another image (e.g., a zoomed wide-angle image). When the image is two-dimensional, the alignment operation is performed in a homogeneous coordinate system, and the calculated global transformation matrices are all in a 3 × 3 form.
Currently common image global transformations mainly comprise the affine transformation, whose corresponding matrix is an affine transformation matrix, and the perspective transformation, whose corresponding matrix is a homography matrix. The affine transformation preserves parallelism between parallel lines, and the affine transformation matrix has 6 degrees of freedom. The perspective transformation assumes that the position change between points is caused by planar motion, and the homography matrix has 8 degrees of freedom. Among the global transformations adopted by existing field-of-view fusion techniques, the perspective transformation is the most widely applied because it is the least restrictive, i.e., the most general.
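For illustration (added here, not quoted from the patent), the perspective transformation in homogeneous coordinates can be written as:

    \begin{bmatrix} u \\ v \\ w \end{bmatrix}
    = H \begin{bmatrix} x \\ y \\ 1 \end{bmatrix},
    \qquad
    H = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix},
    \qquad
    (x', y') = \left( \frac{u}{w}, \frac{v}{w} \right)

Since H is defined only up to scale, it has 8 degrees of freedom; the affine transformation is the special case whose last row is fixed to (0, 0, 1), leaving 6 degrees of freedom and preserving parallelism.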
However, the inventors of the present application found through analysis that, in practice, image content is variable, and the theoretical premise of the perspective transformation, namely that position changes between matching feature points are caused by planar motion, is often not satisfied. That is, in an actual application scene, the positional relationship between the images of the same scene captured by the two cameras of a multi-camera phone is not necessarily a purely planar position change. Therefore, the image alignment based on a global perspective transformation adopted by existing field-of-view fusion schemes is often unsatisfactory.
In order to solve the above technical problem, an embodiment of the present invention provides an image fusion method for a multi-camera module, where the multi-camera module includes a first camera and a second camera, and the method includes: acquiring a first image and a second image, wherein the first image is acquired by the first camera, and the second image is acquired by the second camera; coarsely aligning the first image to a second image based on a perspective transformation; calculating the alignment error of the second image relative to the first image after rough alignment; determining the alignment correction amount of each pixel in the first image according to the alignment error and correcting to obtain a first image which is precisely aligned to the second image and recording the first image as a processed first image; and fusing the processed first image into the second image to obtain a fused image.
This embodiment can greatly improve the imaging quality of the multi-camera module through field-of-view fusion, reduces the discomfort caused by lens switching, and has low computational complexity. Specifically, on the basis of the global perspective transformation, the alignment error between the two images after the perspective transformation is measured and corrected. Because the embodiment directly targets point-to-point alignment, the final alignment effect is better, the image quality of the fused image is greatly improved, and the method is more robust to parallax between different images. Therefore, by providing an alignment quality optimization scheme for field-of-view fusion, the alignment level of different lens images in the fused image is greatly improved. Further, compared with alignment optimization algorithms adopted in the prior art, only one perspective transformation is performed, so the computational complexity of a device executing this field-of-view fusion scheme is low.
Taking a wide-angle and telephoto dual-camera phone as an example, in the wide-angle lens imaging focal section, this embodiment can fuse the image shot by the telephoto lens into the image obtained by the wide-angle lens. After the processing of this embodiment, the image quality of the image obtained in the wide-angle focal section can be obviously improved, and the process of switching from the wide-angle lens to the telephoto lens becomes smoother and more natural.
In the embodiment of the invention, a wide-angle lens is a lens with a large field of view and a small focal length, generally capable of shooting scenes with a wide field of view.
In the embodiment of the invention, a telephoto lens is a lens with a small field of view and a large focal length, capable of shooting scenes with a small field of view but rich detail.
In the embodiment of the invention, zooming refers to changing the focal length and the field angle (field of view for short) to zoom in or out on a shot object, so as to shoot objects at different distances.
In the embodiment of the invention, digital zooming refers to scaling an image by an interpolation algorithm; the image quality gradually deteriorates as the zoom factor increases.
In the embodiment of the invention, optical zooming refers to changing the focal length and the field angle by changing the distance between optical lenses so as to achieve the purpose of zooming.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below.
Fig. 1 is a flowchart of an image fusion method for a multi-camera module according to an embodiment of the present invention.
The multi-camera module can be integrated into a mobile terminal such as a smartphone, an iPad or another tablet computer. The multi-camera module can include a plurality of camera modules (cameras for short), with different cameras corresponding to different focal sections. The present embodiment takes a multi-camera phone as an example.
The multi-camera module can comprise a first camera and a second camera, with the field of view of the first camera contained in the field of view of the second camera. For example, the first camera may be a telephoto lens (also called a telephoto camera) and the second camera a wide-angle lens (also called a wide-angle camera): the image obtained by the wide-angle camera has a large field of view but lower resolution, while the image obtained by the telephoto camera has higher sharpness but a smaller field of view, which is contained within the field of view of the wide-angle image.
In addition to the above wide-angle and telephoto combination, other dual-camera combinations, such as wide-angle plus ultra-wide-angle or telephoto plus ultra-telephoto, can also compute high-quality field-of-view fused images by aligning the images obtained by the respective lenses. The embodiment can also be applied to a triple-camera phone or even phones with more cameras; for such phones, the two cameras to be fused can be determined according to the current focal length before executing this embodiment.
Next, the present embodiment will be specifically described by taking a wide-angle, telephoto dual-camera mobile phone as an example.
To improve the image quality of images obtained by a wide-angle and telephoto dual-camera phone in the wide-angle camera's focal section, and to make the effect of switching from the wide-angle camera to the telephoto camera natural and smooth, this embodiment fuses the image formed by the telephoto camera into the imaging focal section of the wide-angle camera and aligns the wide-angle and telephoto images. According to this embodiment, within the imaging focal section of the wide-angle camera, a high-quality output image can be obtained by combining the wide-angle image and the telephoto image through field-of-view fusion. The output image refers to the image finally presented on the display screen of the dual-camera phone.
Specifically, referring to fig. 1, the image fusion method for a multi-camera module according to this embodiment may include the following steps:
step S101, acquiring a first image and a second image, wherein the first image is acquired by the first camera, and the second image is acquired by the second camera;
step S102, roughly aligning the first image to a second image based on perspective transformation;
step S103, calculating the alignment error of the second image relative to the first image after rough alignment;
step S104, determining and correcting the alignment correction amount of each pixel in the first image according to the alignment error to obtain a first image which is precisely aligned to the second image, and recording the first image as a processed first image;
and step S105, fusing the processed first image into the second image to obtain a fused image.
More specifically, when the first camera is a telephoto camera and the second camera is a wide-angle camera, the corresponding first image is a telephoto image and the second image is a wide-angle image.
In one implementation, the first image may be a result of a pre-processing of an image captured by a first camera, and similarly, the second image may be a result of a pre-processing of an image captured by a second camera.
Accordingly, the step S101 may include the steps of: and respectively preprocessing the image acquired by the first camera and the image acquired by the second camera to obtain the first image and the second image with the same resolution and magnification.
Specifically, in order to perform field fusion, the images obtained by the initial tele and wide cameras need to be preprocessed to obtain a preprocessed wide image and tele image with the same resolution and the same magnification. In this embodiment, the preprocessed wide-angle image and telephoto image having the same resolution and the same magnification are respectively referred to as a second image and a first image.
For example, the tele image initially captured by the tele camera may be reduced (i.e., down-sampled) by a factor of K2/K1 to obtain an image of the same size as the initial wide-angle image, which is referred to as the first image. Here, K2 is the default optical magnification of the wide-angle camera, and K1 is the default optical magnification of the tele camera.
For another example, the wide-angle image initially captured by the wide-angle camera may be center-cropped and scaled to obtain an image with the same resolution as the scaled tele image (i.e., the first image); this image is referred to as the second image.
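As a rough sketch of this preprocessing (assuming OpenCV; the magnification values and the `zoom` handling are illustrative assumptions, not taken from the patent):

    import cv2

    K1 = 3.0  # assumed default optical magnification of the tele camera
    K2 = 1.0  # assumed default optical magnification of the wide camera

    def preprocess(tele_raw, wide_raw, zoom):
        """Produce a first image (tele) and a second image (wide) with the
        same resolution and magnification, per one plausible reading."""
        # Down-sample the tele frame by K2/K1 so its content scale matches
        # the initial wide frame -> first image.
        first = cv2.resize(tele_raw, None, fx=K2 / K1, fy=K2 / K1,
                           interpolation=cv2.INTER_AREA)
        # Digital zoom of the wide frame: center-crop by the current zoom
        # factor (zoom >= 1), then upscale back to full resolution
        # -> second image.
        h, w = wide_raw.shape[:2]
        ch, cw = int(h / zoom), int(w / zoom)
        y0, x0 = (h - ch) // 2, (w - cw) // 2
        second = cv2.resize(wide_raw[y0:y0 + ch, x0:x0 + cw], (w, h),
                            interpolation=cv2.INTER_LINEAR)
        return first, second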
In one implementation, the step S102 may include the steps of: calculating and matching to obtain a matching feature point pair of the first image and the second image, wherein the matching feature point pair refers to a matching result of a first image feature point in the first image and a second image feature point in the second image; calculating to obtain a global homography matrix based on the matched feature point pairs; performing a perspective transformation based on the global homography to coarsely align the first image to a second image.
Specifically, a certain image feature point calculation method may be selected to calculate and match feature point pairs to obtain distribution coordinates of the matched feature point pairs. For example, the first image and the second image each include a plurality of pixels, and in step S102, the pixels in the first image and the pixels in the second image may be matched based on the image feature point calculation method to obtain a plurality of matching feature point pairs. For the sake of convenience of distinction, the feature points extracted from the pixels of the first image are referred to as first image feature points, and the feature points extracted from the pixels of the second image are referred to as second image feature points.
For example, the image feature point calculation method may include the Scale-Invariant Feature Transform (SIFT), the Speeded-Up Robust Features algorithm (SURF), and Oriented FAST and Rotated BRIEF (ORB).
The matching characteristic point pairs can be obtained by real-time detection and matching.
Further, after the matching feature point pairs of the first image and the second image are calculated, a global homography matrix can be calculated by using the matching feature point pairs, and the global homography matrix is used for preliminarily aligning the first image and the second image (namely the preprocessed wide image and the tele image).
For example, referring to FIG. 2, to determine the region of the second image into which the tele image is to be fused (fusion region for short), feature point detection and matching may be performed on the first image and the second image; the wide-angle image region (after digital zooming) enclosed by the minimum bounding rectangle containing all the matching feature points is the region into which the tele image is to be fused. Here, a matching feature point refers to the second image feature point of a matching feature point pair.
Further, the global homography matrix may be a 3 × 3 matrix, which is applied to preliminarily align the first image to the second image.
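A sketch of this coarse-alignment step, assuming OpenCV (the detector choice, matcher settings and RANSAC threshold are illustrative; following the error computation described below, H is taken to map second-image coordinates onto the first image, and the first image is resampled into the second image's frame with the inverse mapping):

    import cv2
    import numpy as np

    def match_and_estimate_homography(first, second):
        """Detect and match feature points, then estimate the 3x3 global
        homography H mapping second-image coordinates to first-image
        coordinates (RANSAC rejects mismatched pairs)."""
        orb = cv2.ORB_create(nfeatures=2000)
        kp1, des1 = orb.detectAndCompute(first, None)
        kp2, des2 = orb.detectAndCompute(second, None)
        matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
        matches = matcher.match(des2, des1)
        pts2 = np.float32([kp2[m.queryIdx].pt for m in matches])
        pts1 = np.float32([kp1[m.trainIdx].pt for m in matches])
        H, inlier_mask = cv2.findHomography(pts2, pts1, cv2.RANSAC, 3.0)
        keep = inlier_mask.ravel() == 1
        return H, pts1[keep], pts2[keep]

    def coarse_align(first, second, H):
        """Resample the first image into the second image's frame.
        WARP_INVERSE_MAP makes warpPerspective sample `first` at H(q)
        for each destination pixel q, i.e. preliminary alignment by H."""
        h, w = second.shape[:2]
        return cv2.warpPerspective(first, H, (w, h),
                                   flags=cv2.WARP_INVERSE_MAP)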
If image alignment is performed using only the global homography matrix, alignment errors inevitably exist. Specifically, during the field-of-view fusion process, i.e., the process of fusing the (scaled) tele image (the first image) into the digitally zoomed wide-angle image (the second image), the images need to be registered and aligned; otherwise, misalignment will appear in the fused image. Currently, a global homography transformation matrix is calculated from all detected matching feature point pairs, and the (scaled) tele image and the wide-angle image are then aligned by a perspective transformation using that matrix.
However, the perspective transformation is only suitable for describing planar position changes between images. Therefore, for flexible and variable real scenes, the positional relationship between the wide-angle image and the tele image cannot always be described by a single global perspective transformation, and the resulting image alignment is often unsatisfactory.
Accordingly, after performing step S102 to coarsely align the first image and the second image based on the global perspective transformation, the present embodiment performs subsequent steps to correct the alignment error between the coarsely aligned first image and second image.
In one implementation, for each second image feature point in the second image, a deviation between mapping coordinates of the second image feature point on the first image after rough alignment and reference coordinates may be calculated to obtain an alignment error of the second image feature point, where the reference coordinates are coordinates of a first image feature point in the first image, which matches the second image feature point, on the first image.
In particular, the alignment error may comprise an alignment error of at least one matching feature point pair of the first and second images. That is, for each matching feature point pair, the alignment error refers to a deviation between a mapping coordinate of a first image feature point in the matching feature point pair, which is mapped onto a second image after perspective transformation, and a coordinate of a second image feature point in the matching feature point pair, which is mapped onto the second image.
In other words, the step S103 may include the steps of: and for each matching characteristic point pair, calculating the alignment error of the matching characteristic point pair aligned by the global homography matrix.
For example, the global homography matrix calculated when the global perspective transformation is performed in step S102 may be multiplied by the coordinates of all the matching feature points in the second image, and then the coordinates of the corresponding matching feature points in the first image are subtracted from each resulting coordinate, so as to obtain the alignment error of the matched feature points of the two images after alignment by the global homography matrix, as shown in FIG. 3.
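Continuing the sketch above (same assumed names and the same convention for H), the per-pair error computation might look like:

    import cv2
    import numpy as np

    def feature_alignment_errors(H, pts1, pts2):
        """Alignment error of each matched pair after coarse alignment:
        map the second-image feature points through H, then subtract the
        coordinates of their matched first-image feature points (FIG. 3)."""
        mapped = cv2.perspectiveTransform(pts2.reshape(-1, 1, 2), H)
        return mapped.reshape(-1, 2) - pts1  # per-point (dx, dy) residuals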
In one implementation, the fine alignment may refer to aligning and correcting the first image to the second image, where the aligning refers to the coarse alignment performed in step S102, and the correcting refers to correcting an alignment error generated after the coarse alignment in step S102.
Specifically, referring to fig. 4, the step S104 may include the steps of:
step S1041, taking the alignment error of the at least one matching feature point pair and the coordinates of the at least one matching feature point pair on each image as a reference, and calculating an alignment error of the remaining pixels in the second image after coarse alignment, except for the second image feature point, relative to the corresponding pixel in the first image;
step S1042, for each pixel of the first image, determining an alignment correction amount of the pixel according to an alignment error between the pixel and a corresponding pixel in the second image, and correcting to obtain the processed first image.
Further, the step S1041 may include the steps of: and carrying out interpolation calculation according to the alignment error of the at least one matching characteristic point pair and the coordinates of the at least one matching characteristic point pair on the respective images to obtain the alignment error of the residual pixels, except the second image characteristic point, in the second image relative to the corresponding pixels in the first image after rough alignment. That is, from the alignment errors of known points (i.e., matching feature points), interpolation determines the alignment errors of the remaining points in the image.
For example, a mathematical model may be established based on a preset basis function with the alignment error of the at least one matching feature point pair and the coordinates of the at least one matching feature point pair on the respective images as anchor points, wherein the mathematical model is used for describing the alignment error of each pixel in the first image and the second image after rough alignment. That is, the alignment error that occurs after each pixel point in the two images is aligned by the global homography matrix, that is, the alignment correction that needs to be performed on each pixel, can be described based on the mathematical model.
Further, an alignment error of remaining pixels in the second image except for the second image feature points after the rough alignment with respect to corresponding pixels in the first image is determined based on the mathematical model. That is, using the alignment correction, each pixel of the image preliminarily aligned by the global homography matrix is subjected to corresponding position correction, so as to obtain the first image and the second image after alignment optimization.
For example, the preset basis functions may be radial basis functions.
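A sketch of steps S1041 and S1042 using SciPy's radial basis function interpolator (the thin-plate-spline kernel, the smoothing value and the sign convention of the pixel shift are assumptions; per-pixel evaluation is shown for clarity, though evaluating the model on a coarse grid and upsampling would be cheaper in practice):

    import cv2
    import numpy as np
    from scipy.interpolate import RBFInterpolator

    def fine_align(first_coarse, pts2, errors):
        """Densify the sparse per-feature alignment errors with an RBF
        model anchored at the matched feature coordinates, then shift
        every pixel of the coarsely aligned first image by its predicted
        correction (sign convention assumed)."""
        h, w = first_coarse.shape[:2]
        rbf = RBFInterpolator(pts2, errors, kernel='thin_plate_spline',
                              smoothing=1.0)
        ys, xs = np.mgrid[0:h, 0:w].astype(np.float32)
        coords = np.column_stack([xs.ravel(), ys.ravel()])
        dense = rbf(coords).reshape(h, w, 2).astype(np.float32)
        # remap(): out(x, y) = in(map_x(x, y), map_y(x, y)); offsetting
        # the maps by the predicted residual applies the correction.
        return cv2.remap(first_coarse, xs + dense[..., 0],
                         ys + dense[..., 1],
                         interpolation=cv2.INTER_LINEAR)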
In one implementation, the step S105 may include the steps of: determining, on the second image, a minimum bounding rectangular region (i.e., the fusion region shown in FIG. 2 and FIG. 3) containing all the matched feature point pairs; and replacing the second image within the minimum bounding rectangular region with the processed first image to obtain the fused image.
For example, in the field-of-view fused image, the image inside the bounding rectangle uses information from the preprocessed and alignment-optimized tele image, and the image outside the rectangle uses information from the wide-angle image.
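A minimal sketch of this replacement (same assumed names; `pts2` are the matched feature coordinates on the second image, and the processed first image is assumed to already sit in the second image's frame):

    import cv2
    import numpy as np

    def fuse(second, first_fine, pts2):
        """Inside the minimum bounding rectangle of all matched feature
        points on the second image (the fusion region of FIG. 2), take
        the processed first image; elsewhere keep the second image."""
        x, y, w, h = cv2.boundingRect(pts2.astype(np.float32))
        fused = second.copy()
        fused[y:y + h, x:x + w] = first_fine[y:y + h, x:x + w]
        return fused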
In a variation, after the step S103, the method of this embodiment may further include the step of: judging whether the alignment error exceeds a preset correction range; and when the judgment result shows that the alignment error does not exceed the preset correction range, executing step S104 to determine and correct the alignment correction amount of each pixel in the first image according to the alignment error.
Thereby, device power consumption and imaging quality can be balanced. If the alignment error is too large (i.e., exceeds the preset correction range), only the second image obtained by digital zooming is used as the final output; this avoids a meaningless fusion that consumes device power while still yielding an unsatisfactory fused image. If the alignment error is within an acceptable range (i.e., falls within the preset correction range), the correction and compensation steps of this embodiment can be performed to obtain a fused image of higher quality.
For example, the preset correction range may be an alignment error of about 3 to 4.
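One plausible reading of this check (the threshold value and the use of the maximum error magnitude are assumptions; "3 to 4" is interpreted here in pixels):

    import numpy as np

    MAX_CORRECTION = 4.0  # assumed upper bound of the preset correction range

    def should_fine_align(errors):
        """Gate the fine-alignment step: if the coarse-alignment error is
        too large, skip fusion and output the digitally zoomed second
        image directly."""
        return float(np.linalg.norm(errors, axis=1).max()) <= MAX_CORRECTION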
In one variation, the field of view of the second camera may be contained within the field of view of the first camera. That is, the second camera may be a telephoto lens, and the first camera may be a wide-angle lens.
In a variation, in the step S103, the global homography matrix may be multiplied by coordinates of all matching feature points of the tele image, and then the coordinates of the matching feature points of the corresponding wide image are subtracted from each calculated coordinate, so as to obtain an alignment error of the matching feature points of the two images after being aligned by the global homography matrix.
Based on this, in order to further improve the alignment effect in the field-of-view fused image, the present embodiment proposes an alignment optimization algorithm on top of the global perspective transformation. The algorithm first calculates the alignment error of the matched feature point pairs of the (scaled) tele image and the wide-angle image after alignment by the global homography matrix, then establishes a mathematical model from the calculated alignment errors and the coordinate positions of the matched feature points, and computes the alignment error correction required at each position in the field-of-view fusion region.
In this embodiment, only one global homography matrix needs to be calculated, so the proposed alignment optimization algorithm has lower computational complexity than existing alignment optimization algorithms based on local multi-homography transformations. In addition, because the embodiment directly targets point-to-point alignment, the alignment effect is good, robustness to parallax between different images is strong, and noticeably fewer distortion artifacts appear in the resulting field-of-view fused image.
Further, in the field-of-view fusion process, the embodiment uses two cameras with different characteristics (such as a wide-angle camera and a telephoto camera) combined with corresponding algorithms to approximate the imaging effect of an optical zoom lens, which would otherwise be costly and bulky.
Fig. 5 is a schematic structural diagram of an image fusion apparatus for a multi-camera module according to an embodiment of the present invention. Those skilled in the art understand that the image fusion apparatus 5 for a multi-camera module according to the present embodiment can be used to implement the method solutions described in the embodiments of fig. 1 to 4.
Specifically, many camera modules include first camera and second camera.
Further, referring to fig. 5, the image fusion apparatus 5 for multiple camera modules according to this embodiment may include: an obtaining module 51, configured to obtain a first image and a second image, where the first image is obtained through the first camera and the second image is obtained through the second camera; a rough alignment module 52 for rough aligning the first image to a second image based on a perspective transformation; a processing module 53, configured to calculate an alignment error of the second image with respect to the first image after the coarse alignment; a correction module 54, configured to determine an alignment correction amount of each pixel in the first image according to the alignment error and correct the alignment correction amount to obtain a first image that is precisely aligned to the second image, and record the first image as a processed first image; and a fusion module 55, configured to fuse the processed first image into the second image to obtain a fused image.
For more details of the working principle and working mode of the image fusion apparatus 5 for multiple camera modules, reference may be made to the related descriptions in fig. 1 to fig. 4, which are not repeated herein.
Further, the embodiment of the present invention also discloses a storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the method technical solution described in the embodiments shown in fig. 1 to fig. 4 is executed. Preferably, the storage medium may include a computer-readable storage medium such as a non-volatile (non-volatile) memory or a non-transitory (non-transient) memory. The storage medium may include ROM, RAM, magnetic or optical disks, etc.
Further, an embodiment of the present invention further discloses a mobile terminal, which includes a memory and a processor, where the memory stores a computer program capable of running on the processor, and the processor executes the technical solution of the method in the embodiment shown in fig. 1 to 4 when running the computer program, and the mobile terminal further includes the multi-camera module. Specifically, the mobile terminal may be a multi-camera mobile phone.
Although the present invention is disclosed above, the present invention is not limited thereto. Various changes and modifications may be effected therein by one skilled in the art without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (14)

1. An image fusion method for a plurality of camera modules, the plurality of camera modules comprising a first camera and a second camera, the method comprising:
acquiring a first image and a second image, wherein the first image is acquired by the first camera, and the second image is acquired by the second camera;
coarsely aligning the first image to a second image based on a perspective transformation;
calculating the alignment error of the second image relative to the first image after rough alignment;
determining the alignment correction amount of each pixel in the first image according to the alignment error and correcting to obtain a first image which is precisely aligned to the second image and recording the first image as a processed first image;
and fusing the processed first image into the second image to obtain a fused image.
2. The method of claim 1, wherein the acquiring the first image and the second image comprises:
and respectively preprocessing the image acquired by the first camera and the image acquired by the second camera to obtain the first image and the second image with the same resolution and magnification.
3. The method of claim 1, wherein coarsely aligning the first image to the second image based on the perspective transformation comprises:
calculating and matching to obtain a matching feature point pair of the first image and the second image, wherein the matching feature point pair refers to a matching result of a first image feature point in the first image and a second image feature point in the second image;
calculating to obtain a global homography matrix based on the matched feature point pairs;
performing a perspective transformation based on the global homography to coarsely align the first image to a second image.
4. The method of claim 3, wherein the number of the matching feature point pairs is plural, and the calculating an alignment error of the second image relative to the first image after the rough alignment comprises:
and for each matching characteristic point pair, calculating the alignment error of the matching characteristic point pair aligned by the global homography matrix.
5. The method of claim 1, wherein calculating the alignment error of the second image relative to the first image after the rough alignment comprises:
and for each second image feature point in the second image, calculating the deviation between the mapping coordinate of the second image feature point on the first image after rough alignment and a reference coordinate to obtain the alignment error of the second image feature point, wherein the reference coordinate is the coordinate of the first image feature point matched with the second image feature point in the first image on the first image.
6. The method of claim 1, wherein the alignment error comprises an alignment error of at least one matching feature point pair of the first image and the second image, wherein the matching feature point pair is a matching result of a first image feature point in the first image and a second image feature point in the second image; determining and correcting the alignment correction amount of each pixel in the first image according to the alignment error to obtain a first image after fine alignment to the second image comprises:
calculating to obtain the alignment error of the residual pixels in the second image except the second image feature point in the rough alignment relative to the corresponding pixels in the first image by taking the alignment error of the at least one matching feature point pair and the coordinates of the at least one matching feature point pair on the respective images as references;
and for each pixel of the first image, determining and correcting the alignment correction of the pixel according to the alignment error of the pixel and the corresponding pixel in the second image to obtain the processed first image.
7. The method of claim 6, wherein calculating the alignment error of the remaining pixels of the second image after the rough alignment, excluding the feature point of the second image, relative to the corresponding pixel of the first image, based on the alignment error of the at least one matching feature point pair and the coordinates of the at least one matching feature point pair on the respective images comprises:
and carrying out interpolation calculation according to the alignment error of the at least one matching characteristic point pair and the coordinates of the at least one matching characteristic point pair on the respective images to obtain the alignment error of the residual pixels, except the second image characteristic point, in the second image relative to the corresponding pixels in the first image after rough alignment.
8. The method of claim 6, wherein calculating the alignment error of the remaining pixels in the second image, other than the second-image feature points, relative to the corresponding pixels in the first image after the coarse alignment, based on the alignment error of the at least one matched feature point pair and the coordinates of the at least one matched feature point pair on the respective images, comprises:
establishing a mathematical model based on a preset basis function, with the alignment error of the at least one matched feature point pair and the coordinates of the at least one matched feature point pair on the respective images as anchor points, wherein the mathematical model describes the alignment error of each pixel of the first image and the second image after the coarse alignment;
and determining, based on the mathematical model, the alignment error of the remaining pixels in the second image, other than the second-image feature points, relative to the corresponding pixels in the first image after the coarse alignment.
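Claim 8's basis-function model could be realized with radial basis functions anchored at the matched feature points; the thin-plate-spline kernel below is one assumption among many admissible basis functions.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

def error_model(pts2, errors):
    # Fit a smooth R^2 -> R^2 model of the alignment error, with the
    # feature point coordinates serving as the claim's anchor points.
    return RBFInterpolator(pts2, errors, kernel='thin_plate_spline')

# Usage sketch: evaluate the model at any pixel coordinates, e.g.
# model = error_model(pts2, errors)
# err = model(np.array([[120.0, 45.0]]))  # returns the (dx, dy) there
```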
9. The method of claim 1, further comprising, after calculating the alignment error of the second image relative to the first image after the coarse alignment and before determining the alignment correction amount of each pixel in the first image according to the alignment error and correcting:
judging whether the alignment error exceeds a preset correction range;
and when the judgment result is that the alignment error does not exceed the preset correction range, determining the alignment correction amount of each pixel in the first image according to the alignment error and correcting.
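A sketch of the gate in claim 9. The magnitude test and the threshold value are illustrative assumptions; the patent only requires comparing the error against a preset correction range.

```python
import numpy as np

MAX_CORRECTION_PX = 8.0  # preset correction range (assumed value)

def within_correction_range(errors, limit=MAX_CORRECTION_PX):
    """errors: (N, 2) per-pair alignment errors in pixels."""
    # Correct only if no residual error exceeds the preset range;
    # otherwise the fine-alignment correction is skipped.
    return bool(np.all(np.linalg.norm(errors, axis=1) <= limit))
```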
10. The method of claim 1, wherein fusing the processed first image into the second image to obtain the fused image comprises:
determining, on the second image, a minimum circumscribed rectangular area containing all matched feature point pairs, a matched feature point pair being the matching result of a first-image feature point in the first image and a second-image feature point in the second image;
and replacing the content of the second image within the minimum circumscribed rectangular area with the processed first image to obtain the fused image.
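A hedged sketch of this fusion step: locate the minimum circumscribed rectangle of the matched feature points on the second image and replace that region with the corresponding region of the processed first image, which is assumed to share the second image's coordinate frame after fine alignment.

```python
import numpy as np

def fuse(img2, processed1, pts2):
    # Minimum circumscribed rectangle of the matched feature points on
    # the second image; pts2 columns are assumed to be (x, y).
    x0, y0 = np.floor(pts2.min(axis=0)).astype(int)
    x1, y1 = np.ceil(pts2.max(axis=0)).astype(int)
    fused = img2.copy()
    # Replace the rectangle with the finely aligned first image, whose
    # pixel grid lines up with img2's after fine alignment.
    fused[y0:y1, x0:x1] = processed1[y0:y1, x0:x1]
    return fused
```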
11. The method of any one of claims 1 to 10, wherein the field of view of the first camera is contained within the field of view of the second camera, or the field of view of the second camera is contained within the field of view of the first camera.
12. An image fusion apparatus for a multi-camera module, the multi-camera module comprising a first camera and a second camera, wherein the apparatus comprises:
an acquisition module, configured to acquire a first image and a second image, wherein the first image is acquired by the first camera and the second image is acquired by the second camera;
a coarse alignment module, configured to coarsely align the first image to the second image based on a perspective transformation;
a processing module, configured to calculate an alignment error of the second image relative to the first image after the coarse alignment;
a correction module, configured to determine an alignment correction amount of each pixel in the first image according to the alignment error and correct it, to obtain a first image finely aligned to the second image, recorded as the processed first image;
and a fusion module, configured to fuse the processed first image into the second image to obtain a fused image.
13. A storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, performs the steps of the method according to any one of claims 1 to 11.
14. A mobile terminal comprising a memory and a processor, the memory having stored thereon a computer program operable on the processor, wherein the processor, when executing the computer program, performs the steps of the method according to any one of claims 1 to 11, and the mobile terminal further comprises the multi-camera module.
CN202011513656.5A 2020-12-21 2020-12-21 Image fusion method and device for multi-camera module, storage medium and mobile terminal Active CN112261387B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011513656.5A CN112261387B (en) 2020-12-21 2020-12-21 Image fusion method and device for multi-camera module, storage medium and mobile terminal

Publications (2)

Publication Number Publication Date
CN112261387A (en) 2021-01-22
CN112261387B (en) 2021-03-23

Family

ID=74225809

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011513656.5A Active CN112261387B (en) 2020-12-21 2020-12-21 Image fusion method and device for multi-camera module, storage medium and mobile terminal

Country Status (1)

Country Link
CN (1) CN112261387B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180286072A1 (en) * 2016-08-29 2018-10-04 PerceptIn, Inc. Mapping Optimization in Autonomous and Non-Autonomous Platforms
CN109285136A (en) * 2018-08-31 2019-01-29 清华-伯克利深圳学院筹备办公室 A kind of Multiscale Fusion method, apparatus, storage medium and the terminal of image
CN110288511A (en) * 2019-05-10 2019-09-27 台州宏达电力建设有限公司台州经济开发区运检分公司 Minimum error joining method, device, electronic equipment based on double camera image
CN110751692A (en) * 2019-09-06 2020-02-04 深圳为工智能科技有限公司 Camera imaging error calibration method and correction method
CN111741281A (en) * 2020-06-30 2020-10-02 Oppo广东移动通信有限公司 Image processing method, terminal and storage medium

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113014811A (en) * 2021-02-26 2021-06-22 Oppo广东移动通信有限公司 Image processing apparatus, image processing method, image processing device, and storage medium
WO2022226701A1 (en) * 2021-04-25 2022-11-03 Oppo广东移动通信有限公司 Image processing method, processing apparatus, electronic device, and storage medium
CN113793259A (en) * 2021-11-15 2021-12-14 深圳思谋信息科技有限公司 Image zooming method, computer device and storage medium
CN114820314A (en) * 2022-04-27 2022-07-29 Oppo广东移动通信有限公司 Image processing method and device, computer readable storage medium and electronic device
CN116051435A (en) * 2022-08-23 2023-05-02 荣耀终端有限公司 Image fusion method and electronic equipment
CN116051435B (en) * 2022-08-23 2023-11-07 荣耀终端有限公司 Image fusion method and electronic equipment
CN116320784A (en) * 2022-10-27 2023-06-23 荣耀终端有限公司 Image processing method and device
CN116320784B (en) * 2022-10-27 2023-11-28 荣耀终端有限公司 Image processing method and device

Also Published As

Publication number Publication date
CN112261387B (en) 2021-03-23

Similar Documents

Publication Publication Date Title
CN112261387B (en) Image fusion method and device for multi-camera module, storage medium and mobile terminal
US12022196B2 (en) Dual aperture zoom camera with video support and switching / non-switching dynamic control
WO2021208371A1 (en) Multi-camera zoom control method and apparatus, and electronic system and storage medium
CN107948519B (en) Image processing method, device and equipment
CN110493525B (en) Zoom image determination method and device, storage medium and terminal
CN111669493B (en) Shooting method, device and equipment
CN106210501B (en) Image synthesizing method and image processing apparatus
US8189960B2 (en) Image processing apparatus, image processing method, program and recording medium
WO2017016050A1 (en) Image preview method, apparatus and terminal
US20160142627A1 (en) Image capturing device and digital zooming method thereof
CN110868541B (en) Visual field fusion method and device, storage medium and terminal
CN111062881A (en) Image processing method and device, storage medium and electronic equipment
CN103986867A (en) Image shooting terminal and image shooting method
WO2014023231A1 (en) Wide-view-field ultrahigh-resolution optical imaging system and method
US20130113962A1 (en) Image processing method for producing background blurred image and image capturing device thereof
CN111815517B (en) Self-adaptive panoramic stitching method based on snapshot pictures of dome camera
CN111935398B (en) Image processing method and device, electronic equipment and computer readable medium
WO2021017532A1 (en) Image fusion method and apparatus, storage medium, and terminal
CN114339042A (en) Image processing method and device based on multiple cameras and computer readable storage medium
CN112215906A (en) Image processing method and device and electronic equipment
US8731327B2 (en) Image processing system and image processing method
CN110784642B (en) Image processing apparatus, control method thereof, storage medium, and imaging apparatus
CN114092316A (en) Image processing method, apparatus and storage medium
JP2017108267A (en) Image processing apparatus, imaging apparatus, image processing method, program, and storage medium
JP2024002631A (en) Image processing device, imaging apparatus, image processing method, computer program, and recording medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant