CN113452969B - Image processing method and device - Google Patents

Image processing method and device

Info

Publication number
CN113452969B
CN113452969B (application CN202110189448.2A)
Authority
CN
China
Prior art keywords
image
camera
terminal
color
magnification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110189448.2A
Other languages
Chinese (zh)
Other versions
CN113452969A (en)
Inventor
邵纬航
王银廷
张一帆
张丽萍
王提政
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Publication of CN113452969A publication Critical patent/CN113452969A/en
Application granted granted Critical
Publication of CN113452969B publication Critical patent/CN113452969B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/80Camera processing pipelines; Components thereof
    • H04N23/84Camera processing pipelines; Components thereof for processing colour signals
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/70Circuitry for compensating brightness variation in the scene
    • H04N23/73Circuitry for compensating brightness variation in the scene by influencing the exposure time

Abstract

The application discloses an image processing method and device, relates to the technical field of image processing, and is beneficial to optimizing the color, contrast or dynamic range of an image, making the optimized image more objective and improving robustness. The method is applied to a terminal comprising a first camera and a second camera, wherein the magnification of the first camera is not less than that of the second camera, and the method comprises the following steps: acquiring a first image of a first scene in the current shooting environment through the first camera; when the chroma or the brightness of the first image does not meet an expected condition, acquiring a second image of the first scene through the second camera, wherein the second image has a chroma or brightness better than that of the first image; and optimizing the first image according to the color or the brightness of the second image to obtain a third image. The second image has a better image style than the first image, and the third image has a better image style than the first image. The image style includes at least one of color, contrast, and dynamic range.

Description

Image processing method and device
This application claims priority to Chinese patent application No. 202010226081.2, entitled "Image processing method and apparatus", filed with the China National Intellectual Property Administration on March 26, 2020, which is incorporated herein by reference in its entirety.
Technical Field
The present application relates to the field of image processing technologies, and in particular, to an image processing method and apparatus.
Background
Optimization of image color at the time of taking a picture has long been a focus of research and commercial use. At present, color optimization is generally performed based on a single-frame image, that is, the color of the image is optimized based on the characteristics of the captured image itself. This approach does not collect objective environment information, so objective and controllable robustness is difficult to guarantee; that is, the final effect is prone to biased distortion, which results in a poor user experience.
Disclosure of Invention
The embodiment of the application provides an image processing method and device, which are beneficial to optimizing the color, contrast or dynamic range of an image, enabling the optimized image to be more objective and improving the robustness.
To achieve the above objective, the following technical solutions are provided:
In a first aspect, an image processing method is provided. The method is applied to a terminal that includes a first camera and a second camera, where the magnification of the first camera is not less than the magnification of the second camera, and the method includes: acquiring a first image of a first scene in a current shooting environment through the first camera; when the chroma or the brightness of the first image does not meet an expected condition, acquiring a second image of the first scene through the second camera, wherein the second image has a higher chroma or luminance than the first image; and optimizing the first image according to the color or brightness of the second image to obtain a third image.
In one possible design, when the color of the second image is better than the color of the first image, the color of the third image is better than the color of the first image (condition 1); or, when the contrast of the second image is higher than that of the first image, the contrast of the third image is higher than that of the first image (condition 2); alternatively, when the dynamic range of the second image is larger than that of the first image, the dynamic range of the third image is larger than that of the first image (condition 3). At least two of the above conditions 1,2, and 3 may be satisfied at the same time.
In this technical solution, the terminal optimizes the first image by using a second image of the same scene whose color, contrast or dynamic range is better than that of the first image. Because the collection of objective environment information is taken into account, compared with the prior-art solution of optimizing color from a single-frame image, this helps the optimized image represent the real scene more objectively and improves robustness.
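Purely as an illustration of the flow described above (not the patent's own code), the first-aspect method can be sketched as follows; the camera objects and the helpers meets_expected_condition and optimize_with_reference are hypothetical placeholders for the steps named in the text:

def capture_and_optimize(first_camera, second_camera, scene):
    # Step 1: capture a first image of the first scene with the first camera
    first_image = first_camera.capture(scene)
    # Step 2: check whether the chroma or brightness of the first image meets
    # the expected condition (see the five cases detailed below)
    if meets_expected_condition(first_image):       # placeholder, not a real API
        return first_image
    # Step 3: capture a second image of the same scene with the second camera,
    # whose chroma or luminance is higher than that of the first image
    second_image = second_camera.capture(scene)
    # Step 4: optimize the color / contrast / dynamic range of the first image
    # according to the second image, keeping its texture (image content)
    return optimize_with_reference(first_image, second_image)  # placeholder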
In one possible design, the chroma or luminance of the first image not satisfying the expected condition includes: the color saturation of the first image does not reach a first preset threshold, where the value range of the first preset threshold is [0.6, 0.8]; or the brightness of the first image does not reach a second preset threshold, where the value range of the second preset threshold is [80, 100]; or the contrast of the first image does not reach a third preset threshold, where the value range of the third preset threshold is [40, 60]; or the dynamic range of the first image does not reach a fourth preset threshold, where the value range of the fourth preset threshold is [6, 8]; or the hue of the first image is different from that of the second image, and the confidence of the hue of the second image is higher than that of the hue of the first image; or the sensitivity (ISO) of the first camera in the current shooting environment is greater than a first threshold.
Specifically, let R, G, and B be the red, green, and blue channels of the first image in RGB format, respectively. The preset condition may be one of the following five cases; in actual use, the preset conditions may include, but are not limited to, these five cases.
Case 1-saturation does not reach the first threshold.
The saturation S is given by
max=max(R,G,B);
min=min(R,G,B);
S=(max-min)/max;
A saturation value can be computed for each pixel of the image, and the saturation values are then averaged over the whole image; the average saturation of the whole first image is compared with the preset first threshold. The value range of the first threshold includes [0.6, 0.8].
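A minimal NumPy sketch of this check (the function name, array layout and zero-max guard are assumptions, not part of the patent):

import numpy as np

def mean_saturation(img_rgb):
    # img_rgb: H x W x 3 array holding the R, G, B channels
    rgb = img_rgb.astype(np.float64)
    mx = rgb.max(axis=2)
    mn = rgb.min(axis=2)
    s = (mx - mn) / np.maximum(mx, 1e-6)  # S = (max - min) / max, guarded against max = 0
    return s.mean()                       # average saturation of the whole image

# the first threshold is taken from the stated range [0.6, 0.8], e.g. 0.7:
# condition_not_met = mean_saturation(first_image) < 0.7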
Case 2-brightness does not reach the second threshold:
the calculation formula of the brightness is
V = max(R, G, B); or alternatively,
Y = 0.299*R + 0.587*G + 0.114*B, where the pixel values of the R, G and B channels are in the range 0-255.
Comparing the global brightness of the first image to a second threshold; the second threshold may take 80-100.
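A corresponding sketch for the brightness check; averaging the per-pixel values into a global brightness is an assumption consistent with the comparison described above:

import numpy as np

def global_brightness(img_rgb, use_luma=True):
    # img_rgb: H x W x 3 array, 8-bit R, G, B values in the range 0-255
    rgb = img_rgb.astype(np.float64)
    if use_luma:
        v = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]  # Y formula
    else:
        v = rgb.max(axis=2)  # V = max(R, G, B)
    return v.mean()          # global (average) brightness of the image

# the second threshold is taken from the stated range 80-100, e.g. 90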
Case 3-contrast does not reach the third threshold:
For the i-th pixel in the image, assume that it has δi adjacent pixels.
The contrast C of the whole image is then
C = [ Σi Σj∈N(i) (Ii - Ij)^2 ] / Σi δi;
where N(i) is the set of the δi pixels adjacent to the i-th pixel and Ii denotes the grey value of the i-th pixel.
Comparing the global contrast of the first image to a third threshold; the value range of the third threshold includes [40,60], for example, the third threshold is 50.
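A sketch of this contrast computation; taking the neighbour set to be the 4-neighbourhood and counting each neighbour pair once are assumptions:

import numpy as np

def global_contrast(gray):
    # gray: H x W grey-level image (for example the Y channel of the first image)
    g = gray.astype(np.float64)
    diff_sq, pairs = 0.0, 0
    for dy, dx in ((0, 1), (1, 0)):  # horizontal and vertical neighbour pairs
        a = g[: g.shape[0] - dy, : g.shape[1] - dx]
        b = g[dy:, dx:]
        diff_sq += np.sum((a - b) ** 2)  # squared grey-level differences
        pairs += a.size                  # number of neighbour pairs accumulated
    return diff_sq / pairs

# the third threshold is taken from the stated range [40, 60], e.g. 50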
Case 4-dynamic range does not reach the fourth threshold:
the formula for the dynamic range D is:
D=log2(max(R,G,B)-min(R,G,B));
or
D=log2(max(Y)-min(Y)),Y=0.299*R+0.587*G+0.114*B。
Note that the dynamic range is computed either over the full image or by dividing the image into several sub-images and computing it for each separately; it is meaningless for a single pixel.
Comparing the global dynamic range of the first image to a fourth threshold; the value range of the fourth threshold includes [6,8], for example, the fourth threshold is 7.
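A sketch of the dynamic-range check, computed here over the full image; applying the same function to each sub-image covers the block-wise variant mentioned above:

import numpy as np

def dynamic_range(img_rgb):
    rgb = img_rgb.astype(np.float64)
    y = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
    spread = y.max() - y.min()
    return np.log2(spread) if spread > 0 else 0.0  # D = log2(max(Y) - min(Y))

# the fourth threshold is taken from the stated range [6, 8], e.g. 7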
Case 5-confidence of hue of second image higher than confidence of hue of the first image
The hue H formula is:
if (the R component is the maximum of the three channels), H = (G - B)/(max - min) × 60;
if (the G component is the maximum of the three channels), H = 120 + (B - R)/(max - min) × 60;
if (the B component is the maximum of the three channels), H = 240 + (R - G)/(max - min) × 60;
if (H < 0), H = H + 360 (understood as an assignment statement).
If abs(H1 - H2)/H2 is greater than 0.1, where H1 and H2 denote the hue of the first image and the hue of the second image respectively, the confidence of the color of the second image is considered higher than that of the color of the first image.
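A sketch of the hue computation and the confidence comparison; treating H1 and H2 as the average hues of the first and second images is an assumption:

import numpy as np

def mean_hue(img_rgb):
    rgb = img_rgb.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    mx, mn = rgb.max(axis=2), rgb.min(axis=2)
    d = np.maximum(mx - mn, 1e-6)                    # guard against max == min
    h = np.where(mx == r, (g - b) / d * 60,
        np.where(mx == g, 120 + (b - r) / d * 60,
                          240 + (r - g) / d * 60))
    h = np.where(h < 0, h + 360, h)                  # if (H < 0), H = H + 360
    return h.mean()

def second_hue_more_confident(first_image, second_image):
    h1, h2 = mean_hue(first_image), mean_hue(second_image)
    return abs(h1 - h2) / h2 > 0.1                   # threshold given in the text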
In one possible design, the color of the second image is better than the color of the first image, including at least one of the following conditions: the second image has a higher chroma than the first image; the second image has a luminance greater than the luminance of the first image.
In one possible design, the image content of the third image is the same as (or approximately the same as) the image content of the first image, which may be embodied in the texture information of the third image being the same as (or approximately the same as) the texture information of the first image. This helps optimize at least one of the color, contrast and dynamic range of the first image while ensuring that the texture information is unchanged or changes only slightly, so that the texture of the image is closer to that of the real scene, the image effect is improved, and user experience is improved.
In one possible design, the light sensing performance of the second camera is greater than the light sensing performance of the first camera. This helps to make at least one of the color, contrast, or dynamic range of the second image better than the first image.
In one possible design, the aperture of the second camera is larger than the aperture of the first camera. This helps to make at least one of the color, contrast, or dynamic range of the second image better than the first image.
In one possible design, the exposure duration when the second camera acquires the second image is greater than the exposure duration when the first camera acquires the first image. This helps to make at least one of the color, contrast, or dynamic range of the second image better than the first image.
In one possible design, the second camera captures the second image using a larger ISO than the first camera captures the first image. This helps to make at least one of the color, contrast, or dynamic range of the second image better than the first image.
In one possible design, the magnification range of the second camera includes [0.5, 1), and the magnification range of the first camera includes [1, 20]. For example, the second camera is a wide-angle camera, and the first camera is a 1X camera, a 3X camera, a 10X camera, or the like.
In one possible design, the magnification of the second camera is 1 and the magnification range of the first camera includes (1, 20]. For example, the second camera is a 1X camera, and the first camera is a 3X camera or a 10X camera, etc.
In one possible design, the method further includes: acquiring N frames of images aiming at a first scene through a first camera, wherein N is an integer greater than or equal to 1; and carrying out multi-frame noise reduction (or multi-frame fusion) according to the N frames of images and the first image to obtain a fourth image. Wherein the image content of the fourth image is the same as the image content of the first image. In this case, the optimizing the first image according to the second image to obtain a third image includes: and optimizing the fourth image according to the second image to obtain a third image. That is to say, the terminal may optimize an image obtained by fusing a plurality of images captured by the same camera, to obtain the third image.
In one possible design, the method further includes: respectively acquiring N1 frame images and N2 frame images for a first scene through a first camera and a second camera; wherein N1 and N2 are integers greater than or equal to 1; and carrying out multi-frame noise reduction (or multi-frame fusion) according to the N1 frame image, the N2 frame image and the first image to obtain a fifth image. Wherein the image content of the fifth image is the same as the image content of the first image. In this case, the optimizing the first image according to the second image to obtain a third image includes: and optimizing the fifth image according to the second image to obtain a third image. That is to say, the terminal can optimize the image obtained by fusing a plurality of images shot by different cameras to obtain a third image.
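A minimal sketch of the fusion step; plain averaging of registered frames is used here only as a stand-in for the multi-frame noise reduction referred to above, and the variable names are illustrative:

import numpy as np

def fuse_frames(frames):
    # frames: list of registered H x W x 3 images of the first scene
    stack = np.stack([f.astype(np.float64) for f in frames], axis=0)
    return stack.mean(axis=0)  # simple multi-frame noise reduction / fusion

# fourth_image = fuse_frames([first_image] + n_frames_from_first_camera)
# fifth_image  = fuse_frames([first_image] + n1_frames + n2_frames)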
In one possible design, the method further includes: when the shooting magnification of the terminal for the first scene is [1, A), selecting a camera with a magnification of 1 in the terminal as the first camera; or when the shooting magnification of the terminal for the first scene is [A, B], selecting a camera with a magnification of A in the terminal as the first camera; or when the shooting magnification of the terminal for the first scene is greater than C, selecting a camera with a magnification of C in the terminal as the first camera.
For example, when the shooting magnification of the terminal for the first scene is [1, 3), a camera with a magnification of 1 in the terminal is selected as the first camera; or when the shooting magnification of the terminal for the first scene is [3, 7), a camera with a magnification of 3 in the terminal is selected as the first camera; or when the shooting magnification of the terminal for the first scene is greater than 10, a camera with a magnification of 10 in the terminal is selected as the first camera.
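The example magnification bands can be written as a small selection routine; the dictionary of camera objects is hypothetical, and the band [7, 10) follows the design given further below:

def select_first_camera(shooting_magnification, cameras):
    # cameras: dict mapping a camera's own magnification to the camera object
    if 1 <= shooting_magnification < 3:
        return cameras[1]    # 1x camera
    if 3 <= shooting_magnification < 7:
        return cameras[3]    # 3x camera
    if 7 <= shooting_magnification < 10:
        return cameras[10]   # 10x camera, with a 3x camera added as a third camera
    return cameras[10]       # shooting magnification greater than 10: 10x camera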
In one possible design, the terminal further comprises a third camera, the third camera is a camera different from the first camera and the second camera in the terminal, and the multiplying power of the third camera is not greater than that of the first camera; the method further comprises the following steps: respectively acquiring N3 frames of images and N4 frames of images for a first scene through a first camera and a third camera; wherein N3 and N4 are integers greater than or equal to 1; and carrying out multi-frame noise reduction (or multi-frame fusion) according to the N3 frame image, the N4 frame image and the first image to obtain a sixth image. Wherein the image content of the sixth image is the same as the image content of the first image. In this case, the optimizing the first image according to the second image to obtain a third image includes: and optimizing the sixth image according to the second image to obtain a third image. That is to say, the terminal may perform optimization based on the image obtained by fusing the plurality of images captured by the at least three cameras, so as to obtain the third image.
In one possible design, the method further includes: when the shooting magnification of the terminal for the first scene is [7, 10), a camera with a magnification of 10 in the terminal is selected as the first camera, and a camera with a magnification of 3 in the terminal is selected as the third camera.
In one possible design, the method further includes: when the shooting magnification of the terminal for the first scene is [1, A), selecting a camera with a magnification of 1 or less than 1 in the terminal as the second camera; or when the shooting magnification of the terminal for the first scene is [A, B], selecting a camera with a magnification less than or equal to A in the terminal as the second camera; or when the shooting magnification of the terminal for the first scene is greater than C, selecting a camera with a magnification less than or equal to C in the terminal as the second camera.
For example, when the shooting magnification of the terminal for the first scene is [1, 3), a camera with a magnification of 1 or less than 1 in the terminal is selected as the second camera;
or when the shooting magnification of the terminal for the first scene is [3, 7), a camera with a magnification of 3, 1 or less than 1 in the terminal is selected as the second camera;
or when the shooting magnification of the terminal for the first scene is [7, 10), a camera with a magnification of 3, 1 or less than 1 in the terminal is selected as the second camera;
or when the shooting magnification of the terminal for the first scene is greater than 10, a camera with a magnification of 10, 3, 1 or less than 1 in the terminal is selected as the second camera.
In one possible design, optimizing the first image based on the second image to obtain a third image includes: acquiring color correction matrices (CCM matrices) of at least two sub-images in the first image, where the CCM matrix of a first sub-image is used to represent the mapping relationship between the features of the first sub-image and the features of a second sub-image in the second image, the first sub-image and the second sub-image are images of the same object, and the features include at least one of color, contrast, or dynamic range; obtaining CCM matrices of pixels in the first image based on the CCM matrices of the at least two sub-images, where the CCM matrix of a first pixel is used to represent the mapping relationship between the feature of the first pixel and the feature of a second pixel in the second image, and the first pixel and the second pixel correspond to the same image content; and deriving the third image based on the CCM matrices of the pixels in the first image and the first image. The CCM matrix of each sub-image is determined first, and then the CCM matrix of each pixel is obtained through interpolation by a conventional method. In this way, the computational complexity is reduced, the computation time is shortened, and the performance overhead is effectively controlled.
That the first pixel and the second pixel correspond to the same image content can be understood as follows: after the first image and the second image are registered, the position of the first pixel in the first image is the same as (or approximately the same as) the position of the second pixel in the second image.
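A sketch of the last two steps of this design: interpolating the per-sub-image CCM matrices to a per-pixel CCM and applying it. A uniform grid of sub-images and bilinear interpolation are assumptions; only the "per-sub-image CCM, then per-pixel CCM, then apply" structure comes from the text above:

import numpy as np

def apply_per_pixel_ccm(first_image, ccm_grid):
    # first_image: H x W x 3; ccm_grid: Gh x Gw x 3 x 3 CCM matrices, one per sub-image
    h, w, _ = first_image.shape
    gh, gw = ccm_grid.shape[:2]
    ys = np.linspace(0, gh - 1, h)           # position of each pixel row in the grid
    xs = np.linspace(0, gw - 1, w)           # position of each pixel column in the grid
    y0, x0 = np.floor(ys).astype(int), np.floor(xs).astype(int)
    y1, x1 = np.minimum(y0 + 1, gh - 1), np.minimum(x0 + 1, gw - 1)
    fy = (ys - y0)[:, None, None, None]
    fx = (xs - x0)[None, :, None, None]
    ccm = ((1 - fy) * (1 - fx) * ccm_grid[y0][:, x0] +   # bilinear blend of the four
           (1 - fy) * fx       * ccm_grid[y0][:, x1] +   # surrounding sub-image CCMs
           fy       * (1 - fx) * ccm_grid[y1][:, x0] +
           fy       * fx       * ccm_grid[y1][:, x1])    # result: H x W x 3 x 3
    # third image: apply each pixel's own CCM to that pixel's RGB vector
    return np.einsum('hwij,hwj->hwi', ccm, first_image.astype(np.float64))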
In one possible design, obtaining a CCM matrix for at least two sub-images in a first image comprises: acquiring a CCM matrix of at least two sub-images using a first neural network; the first neural network is used for analyzing the characteristic and the texture information of the first image and the characteristic and the texture information of the second image to obtain a CCM matrix of at least two sub-images.
In one possible design, optimizing the first image based on the second image to obtain a third image includes:
optimizing the first image by using a second neural network and the second image to obtain a third image; the second neural network is used for optimizing the image style of the image with poor image style by using the image with good image style. This helps to make the optimization result better.
In a second aspect, an image processing method is provided. The method is applied to a terminal that includes a first camera and a second camera, where the magnification of the first camera is not less than the magnification of the second camera, and the method includes: when the sensitivity (ISO) of the first camera in the current shooting environment is greater than a first threshold: acquiring a first image of a first scene in the current shooting environment through the first camera; acquiring a second image of the first scene through the second camera, wherein the color of the second image is closer to the true color of the first scene than the color of the first image; and optimizing the first image according to the second image to obtain a third image, where the color of the third image is closer to the true color of the first scene than the color of the first image. This embodiment is particularly suitable for scenes in which the color of the captured image differs too much from the color of the real scene, that is, a color cast occurs; in such scenes, the color of the optimized image is closer to the color of the real scene than the color of the image before optimization.
In one possible design, the second image and the first image satisfy at least one of the following conditions: the chromaticity of the second image is closer to the true chromaticity of the first scene than the chromaticity of the first image; the second image has a luminance closer to the true luminance of the first scene than the luminance of the first image.
In any of the possible designs of the first aspect, some or all of the features provided may be provided as possible designs of the second aspect without conflict. For example, the image content of the third image is the same (or approximately the same) as the image content of the first image. For example, the light sensing performance of the second camera is greater than that of the first camera. For example, the aperture of the second camera is larger than the aperture of the first camera. For example, the exposure duration when the second camera acquires the second image is longer than the exposure duration when the first camera acquires the first image. For example, the ISO used when the second camera acquires the second image is larger than the ISO used when the first camera acquires the first image. For example, the magnification range of the second camera includes [0.5,1 ], and the magnification range of the first camera includes [1,20]. For example, when the first image is optimized, the CCM matrix of the sub-image in the first image is calculated, then the CCM matrix of the pixels in the first image is obtained through interpolation, and then the first image is optimized based on the CCM matrix of the pixels.
In a third aspect, an image processing method is provided, where the method is applied to a terminal, where the terminal includes a target camera, and the method includes: acquiring a first image aiming at a first scene in the current shooting environment through the target camera under the current shooting parameters; when the color or the brightness of the first image does not meet preset conditions, the ISO or the exposure time of the current shooting parameters is adjusted upwards, and a second image is collected for the first scene through the target camera under the adjusted shooting parameters; wherein the second image has a color or brightness that is better than the first image; and optimizing the first image according to the color or the brightness of the second image to obtain a third image.
In this method, the terminal can perform image optimization using images captured by the same camera under different parameters; compared with shooting with two cameras, this gives better adaptability and consistency between the images.
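A compact sketch of this single-camera flow; the parameter handling, the capture API and the helper names are hypothetical placeholders for the steps described above:

def capture_and_optimize_single_camera(camera, scene, params):
    # Step 1: capture a first image under the current shooting parameters
    first_image = camera.capture(scene, params)
    if meets_preset_condition(first_image):        # color / brightness check, placeholder
        return first_image
    # Step 2: adjust ISO (or exposure time) upwards and capture a second image
    boosted = dict(params, iso=params['iso'] * 2)  # the factor 2 is illustrative only
    second_image = camera.capture(scene, boosted)
    # Step 3: optimize the first image according to the second image
    return optimize_with_reference(first_image, second_image)  # placeholder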
In one possible design, the color of the third image is better than the color of the first image when the color of the second image is better than the color of the first image; or when the contrast of the second image is higher than that of the first image, the contrast of the third image is higher than that of the first image; or when the dynamic range of the second image is larger than that of the first image, the dynamic range of the third image is larger than that of the first image; wherein the image content of the third image is the same as the image content of the first image.
In one possible design, the chromaticity or luminance of the first image not satisfying a desired condition includes: the color saturation of the first image does not reach a first preset threshold value; the value range of the first preset threshold is [0.6,0.8]; or the brightness of the first image does not reach a second preset threshold value; the value range of the second preset threshold is [80,100]; or the contrast of the first image does not reach a third preset threshold; the value range of the third preset threshold is [40,60]; or the dynamic range of the first image does not reach a fourth preset threshold; the value range of the fourth preset threshold is [6,8]; or the hue of the first image is different from that of the second image, and the confidence of the hue of the second image is higher than that of the hue of the first image.
In one possible design, the optimizing the first image according to the second image to obtain a third image includes: optimizing the first image by using a second neural network and the second image to obtain a third image; the second neural network is used for carrying out image style optimization on the image with the poor image style by utilizing the image with the good image style.
It should be understood that, provided no natural laws are violated, the relevant steps in the above three aspects can be freely combined, substituted, explained and cross-referenced. For example, a sub-step in the second or third aspect may refer to the description of the same or similar sub-step in the first aspect, and the same applies to explanatory references and equivalent replacements. Details are not repeated here.
In a fourth aspect, an image processing apparatus is provided, which may be a terminal, a chip or a system of chips.
In one possible design, the apparatus may be configured to perform any one of the methods provided in the first, second or third aspects above. The present application may divide the functional modules of the apparatus according to any one of the methods provided by the first aspect. For example, each functional module may be divided in accordance with each function, or two or more functions may be integrated into one processing module. Illustratively, the device can be divided into a processing unit, a sending unit and the like according to functions. The above descriptions of possible technical solutions and beneficial effects executed by each divided functional module may refer to the technical solutions provided by the first aspect or the second aspect or their corresponding possible designs, and are not described herein again.
In another possible design, the apparatus includes: a memory for storing computer instructions and one or more processors for invoking the computer instructions to perform any of the methods as provided by the first aspect and any of its possible designs, or the second aspect and any of its possible designs.
In a fifth aspect, a terminal is provided, including: the system comprises a processor, a memory and at least two cameras, wherein the at least two cameras are used for shooting images, the memory is used for storing computer programs and instructions, and the processor is used for calling the computer programs and the instructions and executing any one of the methods provided by the first aspect or the second aspect in cooperation with the at least two cameras.
In a sixth aspect, a computer-readable storage medium is provided, such as a non-transitory computer-readable storage medium, having stored thereon a computer program (or instructions) which, when run on a computer, causes the computer to perform any one of the methods provided by any one of the possible implementations of the first, second or third aspect.
In a seventh aspect, a computer program product is provided, which, when run on a computer, causes the computer to perform any one of the methods provided in any one of the possible implementations of the first, second or third aspect.
It is understood that any one of the image processing apparatus, the computer storage medium, the computer program product or the chip system provided above can be applied to the corresponding method provided above, and therefore, the beneficial effects achieved by the method can refer to the beneficial effects in the corresponding method, and are not described herein again.
In the present application, the names of the image processing apparatus or the functional modules described above do not limit the devices or the functional modules themselves, and in actual implementation, the devices or the functional modules may appear by other names. Insofar as the functions of the respective devices or functional modules are similar to those of the present application, they fall within the scope of the claims of the present application and their equivalents.
These and other aspects of the present application will be more readily apparent from the following description.
Drawings
Fig. 1 is a schematic diagram of a hardware structure of a terminal applicable to an embodiment of the present application;
fig. 2 is a block diagram of a software structure of a terminal applicable to the embodiment of the present application;
FIG. 3 is a flowchart of a method for image processing according to an embodiment of the present disclosure;
FIG. 4 is a flowchart of a method for acquiring a first image and a second image according to an embodiment of the present disclosure;
FIG. 5 is a flowchart of another method for acquiring a first image and a second image according to an embodiment of the present disclosure;
FIG. 6 is a flowchart of a method for optimizing a first image according to an embodiment of the present disclosure;
fig. 7 is a schematic diagram of dividing a first image by using mesh averaging according to an embodiment of the present application;
fig. 8A is a schematic diagram of a process for optimizing a first image based on a CCM matrix according to an embodiment of the present application;
FIG. 8B is a schematic diagram illustrating a comparison of the first image, the second image, and the third image when the first image is optimized based on the CCM matrix according to an embodiment of the present application;
FIG. 9 is a flowchart of another method for optimizing a first image according to an embodiment of the present disclosure;
fig. 10 is a schematic diagram illustrating a network structure and a logic function of a neural network according to an embodiment of the present application;
fig. 11 is a schematic diagram of another image processing method provided in the embodiment of the present application;
fig. 12 is a schematic structural diagram of a terminal according to an embodiment of the present application.
Detailed Description
In the embodiments of the present application, the words "exemplary" or "such as" are used to mean serving as an example, instance, or illustration. Any embodiment or design described herein as "exemplary" or "such as" is not necessarily to be construed as preferred or advantageous over other embodiments or designs. Rather, use of the words "exemplary" or "such as" is intended to present related concepts in a concrete fashion.
In the embodiments of the present application, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the embodiments of the present application, "a plurality" means two or more unless otherwise specified.
The image processing method provided in the embodiments of the present application can be applied to a terminal. The terminal may be any terminal with a camera, such as a smartphone, a tablet computer, a wearable device, an AR/VR device, a personal computer (PC), a personal digital assistant (PDA), or a netbook, or any other terminal capable of implementing the embodiments of the present application. The present application does not limit the specific form of the terminal. A wearable device, also called a wearable smart device, is a general term for devices designed and developed by applying wearable technology to everyday wear, such as glasses, gloves, watches, clothing, and shoes. A wearable device is a portable device that is worn directly on the body or integrated into the user's clothing or accessories. A wearable device is not only a hardware device, but also realizes powerful functions through software support, data interaction, and cloud interaction. In a broad sense, wearable smart devices include devices that are full-featured, large in size, and able to implement all or part of their functions without relying on a smartphone, such as smart watches or smart glasses, as well as devices that focus on only one type of application function and need to be used together with other devices such as smartphones, for example various smart bracelets and smart jewelry for physical sign monitoring.
In the present application, the structure of the terminal may be as shown in fig. 1. As shown in fig. 1, the terminal 100 may include a processor 110, an external memory interface 120, an internal memory 121, a Universal Serial Bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a Subscriber Identity Module (SIM) card interface 195, and the like. The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.
The number of the cameras 193 included in the terminal 100 may be one or more. If the terminal 100 includes a camera, the embodiment of the present application supports the camera to operate in different operating states. Reference may be made to the following for a detailed description of the operating state. If the terminal 100 includes a plurality of cameras, the magnifications of the plurality of cameras may be different. The plurality of cameras include the first camera and the second camera described in the embodiments of the present application.
It is to be understood that the illustrated structure of the present embodiment does not constitute a specific limitation to the terminal 100. In other embodiments, terminal 100 may include more or fewer components than shown, or some components may be combined, some components may be split, or a different arrangement of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
Processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc. The different processing units may be separate devices or may be integrated into one or more processors. For example, in the present application, when the sensitivity (ISO) of the first camera in the first scene is greater than a first threshold, the processor 110 may control the first camera to acquire the first image for the first scene and control the second camera to acquire the second image for the first scene, and may optimize the first image according to the second image to obtain the third image. Specific implementations can be found in the following.
The controller may be, among other things, a neural center and a command center of the terminal 100. The controller can generate an operation control signal according to the instruction operation code and the timing signal to complete the control of instruction fetching and instruction execution.
A memory may also be provided in the processor 110 for storing instructions and data. In some embodiments, the memory in the processor 110 is a cache memory. The memory may hold instructions or data that have just been used or recycled by the processor 110. If the processor 110 needs to reuse the instruction or data, it can be called directly from the memory. Avoiding repeated accesses reduces the latency of the processor 110, thereby increasing the efficiency of the system.
In some embodiments, processor 110 may include one or more interfaces. The interface may include an integrated circuit (I2C) interface, an integrated circuit built-in audio (I2S) interface, a Pulse Code Modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a Mobile Industry Processor Interface (MIPI), a general-purpose input/output (GPIO) interface, a Subscriber Identity Module (SIM) interface, and/or a Universal Serial Bus (USB) interface, etc.
MIPI interfaces may be used to connect processor 110 with peripheral devices such as display screen 194, camera 193, and the like. The MIPI interface includes a Camera Serial Interface (CSI), a Display Serial Interface (DSI), and the like. In some embodiments, processor 110 and camera 193 communicate through a CSI interface to implement the capture functionality of terminal 100. The processor 110 and the display screen 194 communicate through the DSI interface to implement the display function of the terminal 100.
The GPIO interface may be configured by software. The GPIO interface may be configured as a control signal and may also be configured as a data signal. In some embodiments, a GPIO interface may be used to connect the processor 110 with the camera 193, the display 194, the wireless communication module 160, the audio module 170, the sensor module 180, and the like. The GPIO interface may also be configured as an I2C interface, I2S interface, UART interface, MIPI interface, and the like.
The USB interface 130 is an interface conforming to the USB standard specification, and may specifically be a Mini USB interface, a Micro USB interface, a USB Type C interface, or the like. The USB interface 130 may be used to connect a charger to charge the terminal 100, and may also be used to transmit data between the terminal 100 and peripheral devices. And the earphone can also be used for connecting an earphone and playing audio through the earphone. The interface may also be used to connect other terminals, such as AR devices, etc.
It should be understood that the interface connection relationship between the modules illustrated in the present embodiment is only an exemplary illustration, and does not limit the structure of the terminal 100. In other embodiments of the present application, the terminal 100 may also adopt different interface connection manners or a combination of multiple interface connection manners in the above embodiments.
The power management module 141 is used to connect the battery 142, the charging management module 140 and the processor 110. The power management module 141 receives input from the battery 142 and/or the charging management module 140, and supplies power to the processor 110, the internal memory 121, the display 194, the camera 193, the wireless communication module 160, and the like. The power management module 141 may also be used to monitor parameters such as battery capacity, battery cycle count, battery state of health (leakage, impedance), etc. In some other embodiments, the power management module 141 may also be disposed in the processor 110. In other embodiments, the power management module 141 and the charging management module 140 may be disposed in the same device.
The wireless communication function of the terminal 100 may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, a modem processor, a baseband processor, and the like.
The terminal 100 implements a display function through the GPU, the display screen 194, and the application processor, etc. The GPU is a microprocessor for image processing, and is connected to the display screen 194 and an application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. The processor 110 may include one or more GPUs that execute program instructions to generate or alter display information.
The display screen 194 is used to display images, video, and the like. The display screen 194 includes a display panel. The display panel may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, a quantum dot light-emitting diode (QLED), or the like. In some embodiments, the terminal 100 may include 1 or N display screens 194, N being a positive integer greater than 1.
A series of Graphical User Interfaces (GUIs) may be displayed on the display screen 194 of the terminal 100, which are the main screens of the terminal 100. Generally, the size of the display 194 of the terminal 100 is fixed, and only a limited number of controls can be displayed in the display 194 of the terminal 100. A control is a GUI element, which is a software component contained in an application program and controls all data processed by the application program and interactive operations related to the data, and a user can interact with the control through direct manipulation (direct manipulation) to read or edit information related to the application program. Generally, a control may include a visual interface element such as an icon, button, menu, tab, text box, dialog box, status bar, navigation bar, widget, and the like.
The terminal 100 can implement a photographing function through the ISP, the camera 193, the video codec, the GPU, the display screen 194, and the application processor, etc.
The ISP is used to process the data fed back by the camera 193. For example, when a user takes a picture, the shutter is opened, light is transmitted to the camera photosensitive element through the lens, an optical signal is converted into an electric signal, and the camera photosensitive element transmits the electric signal to the ISP for processing and converting into an image visible to the naked eye. The ISP can also carry out algorithm optimization on the noise, brightness and skin color of the image. The ISP can also optimize parameters such as exposure, color temperature and the like of a shooting scene. In some embodiments, the ISP may be provided in camera 193.
The camera 193 is used to capture still images or video. The object generates an optical image through the lens and projects the optical image to the photosensitive element. The photosensitive element may be a Charge Coupled Device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The light sensing element converts the optical signal into an electrical signal, which is then passed to the ISP where it is converted into a digital image signal. And the ISP outputs the digital image signal to the DSP for processing. The DSP converts the digital image signal into image signal in standard RGB, YUV and other formats. In some embodiments, terminal 100 may include 1 or N cameras 193, N being a positive integer greater than 1.
The digital signal processor is used to process digital signals, and can process digital image signals as well as other digital signals. For example, when the terminal 100 selects a frequency bin, the digital signal processor is configured to perform a Fourier transform or the like on the frequency bin energy.
Video codecs are used to compress or decompress digital video. The terminal 100 may support one or more video codecs. In this way, the terminal 100 can play or record video in a variety of encoding formats, such as: moving Picture Experts Group (MPEG) 1, MPEG2, MPEG3, MPEG4, and the like.
The NPU is a neural-network (NN) computing processor that processes input information quickly by using a biological neural network structure, for example, by using a transfer mode between neurons of a human brain, and can also learn by itself continuously. The NPU can implement applications such as intelligent recognition of the terminal 100, for example: image recognition, face recognition, speech recognition, text understanding, and the like.
The external memory interface 120 may be used to connect an external memory card, such as a Micro SD card, to extend the memory capability of the terminal 100. The external memory card communicates with the processor 110 through the external memory interface 120 to implement a data storage function. For example, files such as music, video, etc. are saved in an external memory card.
The internal memory 121 may be used to store computer-executable program code, which includes instructions. The processor 110 executes various functional applications of the terminal 100 and data processing by executing instructions stored in the internal memory 121. For example, in the present embodiment, the processor 110 may acquire the pose of the terminal 100 by executing instructions stored in the internal memory 121. The internal memory 121 may include a program storage area and a data storage area. The storage program area may store an operating system, an application program (such as a sound playing function, an image playing function, etc.) required by at least one function, and the like. The storage data area may store data (e.g., audio data, a phonebook, etc.) created during use of the terminal 100, and the like. In addition, the internal memory 121 may include a high-speed random access memory, and may further include a nonvolatile memory, such as at least one magnetic disk storage device, a flash memory device, a universal flash memory (UFS), and the like. The processor 110 executes various functional applications of the terminal 100 and data processing by executing instructions stored in the internal memory 121 and/or instructions stored in a memory provided in the processor.
The terminal 100 may implement an audio function through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the earphone interface 170D, and the application processor. Such as music playing, recording, etc.
The audio module 170 is used to convert digital audio information into analog audio signals for output, and also used to convert analog audio inputs into digital audio signals. The audio module 170 may also be used to encode and decode audio signals. In some embodiments, the audio module 170 may be disposed in the processor 110, or some functional modules of the audio module 170 may be disposed in the processor 110.
The speaker 170A, also called a "horn", is used to convert the audio electrical signal into an acoustic signal. The terminal 100 can listen to music through the speaker 170A or listen to a handsfree call.
The receiver 170B, also called "earpiece", is used to convert the electrical audio signal into an acoustic signal. When the terminal 100 receives a call or voice information, it can receive voice by bringing the receiver 170B close to the human ear.
The microphone 170C, also referred to as a "microphone," is used to convert sound signals into electrical signals. When making a call or transmitting voice information, the user can input a voice signal to the microphone 170C by speaking near the microphone 170C through the mouth. The terminal 100 may be provided with at least one microphone 170C. In other embodiments, the terminal 100 may be provided with two microphones 170C to achieve a noise reduction function in addition to collecting sound signals. In other embodiments, the terminal 100 may further include three, four or more microphones 170C to collect sound signals, reduce noise, identify sound sources, implement directional recording functions, and so on.
The earphone interface 170D is used to connect a wired earphone. The earphone interface 170D may be the USB interface 130, or may be a 3.5 mm open mobile terminal platform (OMTP) standard interface or a Cellular Telecommunications Industry Association of the USA (CTIA) standard interface.
The pressure sensor 180A is used for sensing a pressure signal, and can convert the pressure signal into an electrical signal. In some embodiments, the pressure sensor 180A may be disposed on the display screen 194. The pressure sensor 180A can be of a wide variety, such as a resistive pressure sensor, an inductive pressure sensor, a capacitive pressure sensor, and the like. The capacitive pressure sensor may be a sensor comprising at least two parallel plates having an electrically conductive material. When a force acts on the pressure sensor 180A, the capacitance between the electrodes changes. The terminal 100 determines the intensity of the pressure according to the change in the capacitance. When a touch operation is applied to the display screen 194, the terminal 100 detects the intensity of the touch operation according to the pressure sensor 180A. The terminal 100 may also calculate the touched position based on the detection signal of the pressure sensor 180A. In some embodiments, the touch operations that are applied to the same touch position but different touch operation intensities may correspond to different operation instructions. For example: and when the touch operation with the touch operation intensity smaller than the first pressure threshold value acts on the short message application icon, executing an instruction for viewing the short message. And when the touch operation with the touch operation intensity larger than or equal to the first pressure threshold value acts on the short message application icon, executing an instruction of newly building the short message.
The gyro sensor 180B may be used to determine a motion attitude of the terminal 100. In some embodiments, the angular velocity of terminal 100 about three axes (i.e., x, y, and z axes) may be determined by gyroscope sensor 180B. The gyro sensor 180B may be used for photographing anti-shake. Illustratively, when the shutter is pressed, the gyro sensor 180B detects a shake angle of the terminal 100, calculates a distance to be compensated for by the lens module according to the shake angle, and allows the lens to counteract the shake of the terminal 100 by a reverse movement, thereby achieving anti-shake. The gyroscope sensor 180B may also be used for navigation, somatosensory gaming scenes.
The air pressure sensor 180C is used to measure air pressure. In some embodiments, the terminal 100 calculates an altitude from the barometric pressure measured by the barometric pressure sensor 180C to assist in positioning and navigation.
The magnetic sensor 180D includes a Hall sensor. The terminal 100 may detect the opening and closing of a flip holster using the magnetic sensor 180D. In some embodiments, when the terminal 100 is a flip device, the terminal 100 may detect the opening and closing of the flip according to the magnetic sensor 180D, and then set features such as automatic unlocking upon flipping open according to the detected opening or closing state of the holster or the flip cover.
The acceleration sensor 180E may detect the magnitude of acceleration of the terminal 100 in various directions (generally, three axes). The magnitude and direction of gravity can be detected when the terminal 100 is stationary. The method can also be used for recognizing the terminal gesture, and is applied to horizontal and vertical screen switching, pedometers and other applications.
The distance sensor 180F is used to measure distance. The terminal 100 may measure distance by infrared or laser. In some shooting scenarios, the terminal 100 may use the distance sensor 180F to measure distance to achieve fast focusing.
The proximity light sensor 180G may include, for example, a light emitting diode (LED) and a light detector, such as a photodiode. The light emitting diode may be an infrared light emitting diode. The terminal 100 emits infrared light outward through the light emitting diode and detects infrared light reflected from a nearby object using the photodiode. When sufficient reflected light is detected, it can be determined that there is an object near the terminal 100; when insufficient reflected light is detected, the terminal 100 may determine that there is no object near the terminal 100. The terminal 100 can use the proximity light sensor 180G to detect that the user is holding the terminal 100 close to the ear during a call, so as to automatically turn off the screen to save power. The proximity light sensor 180G may also be used in holster mode and pocket mode to automatically unlock and lock the screen.
The ambient light sensor 180L is used to sense the ambient light level. The terminal 100 may adaptively adjust the brightness of the display 194 according to the perceived ambient light level. The ambient light sensor 180L may also be used to automatically adjust the white balance when taking a picture. The ambient light sensor 180L may also cooperate with the proximity light sensor 180G to detect whether the terminal 100 is in a pocket to prevent accidental touches.
The fingerprint sensor 180H is used to collect a fingerprint. The terminal 100 can utilize the collected fingerprint characteristics to realize fingerprint unlocking, access to an application lock, fingerprint photographing, fingerprint incoming call answering, and the like.
The temperature sensor 180J is used to detect temperature. In some embodiments, the terminal 100 executes a temperature processing strategy using the temperature detected by the temperature sensor 180J. For example, when the temperature reported by the temperature sensor 180J exceeds a threshold, the terminal 100 performs a reduction in the performance of the processor located near the temperature sensor 180J, so as to reduce power consumption and implement thermal protection. In other embodiments, terminal 100 heats battery 142 when the temperature is below another threshold to avoid a low temperature causing abnormal shutdown of terminal 100. In other embodiments, when the temperature is lower than a further threshold, the terminal 100 performs boosting on the output voltage of the battery 142 to avoid abnormal shutdown due to low temperature.
The touch sensor 180K is also called a "touch device". The touch sensor 180K may be disposed on the display screen 194, and the touch sensor 180K and the display screen 194 form a touch screen, which is also called a "touch screen". The touch sensor 180K is used to detect a touch operation applied thereto or nearby. The touch sensor can communicate the detected touch operation to the application processor to determine the touch event type. Visual output associated with the touch operation may be provided through the display screen 194. In other embodiments, the touch sensor 180K may be disposed on the surface of the terminal 100 at a different position than the display screen 194.
The bone conduction sensor 180M can acquire a vibration signal. In some embodiments, the bone conduction sensor 180M may acquire a vibration signal of a bone mass vibrated by the human vocal part. The bone conduction sensor 180M may also contact the human pulse to receive a blood pressure pulsation signal. In some embodiments, the bone conduction sensor 180M may also be disposed in a headset, integrated into a bone conduction headset. The audio module 170 may parse out a voice signal based on the vibration signal, acquired by the bone conduction sensor 180M, of the bone mass vibrated by the vocal part, so as to implement a voice function. The application processor can parse heart rate information based on the blood pressure pulsation signal acquired by the bone conduction sensor 180M, so as to implement a heart rate detection function.
The keys 190 include a power-on key, a volume key, and the like. The keys 190 may be mechanical keys. Or may be touch keys. The terminal 100 may receive a key input, and generate a key signal input related to user setting and function control of the terminal 100.
The motor 191 may generate a vibration cue. The motor 191 may be used for incoming call vibration cues, as well as for touch vibration feedback. For example, touch operations applied to different applications (e.g., photographing, audio playing, etc.) may correspond to different vibration feedback effects. The motor 191 may also produce different vibration feedback effects for touch operations applied to different areas of the display screen 194. Different application scenes (such as time reminding, receiving information, alarm clock, game, etc.) can also correspond to different vibration feedback effects. The touch vibration feedback effect may also support customization.
Indicator 192 may be an indicator light that may be used to indicate a state of charge, a change in charge, or a message, missed call, notification, etc.
In addition, an operating system runs on the above components, for example, the iOS operating system developed by Apple, the open-source Android operating system developed by Google, or the Windows operating system developed by Microsoft. Applications may be installed and run on the operating system.
The operating system of the terminal 100 may adopt a layered architecture, an event-driven architecture, a micro-core architecture, a micro-service architecture, or a cloud architecture. The embodiment of the present application takes an Android system with a layered architecture as an example, and exemplarily illustrates a software structure of the terminal 100.
Fig. 2 is a block diagram of a software configuration of the terminal 100 according to the embodiment of the present application.
The layered architecture divides the software into several layers, each layer having a clear role and division of labor. The layers communicate with each other through a software interface. In some embodiments, the Android system is divided into four layers, an application layer, an application framework layer, an Android runtime (Android runtime) and system library, and a kernel layer from top to bottom.
The application layer may include a series of application packages. As shown in fig. 2, the application package may include applications such as camera, gallery, calendar, phone call, map, navigation, WLAN, bluetooth, music, video, short message, etc. For example, when taking a picture, a camera application may access a camera interface management service provided by the application framework layer.
The application framework layer provides an Application Programming Interface (API) and a programming framework for the application program of the application layer. The application framework layer includes a number of predefined functions. As shown in FIG. 2, the application framework layers may include a window manager, content provider, view system, phone manager, resource manager, notification manager, and the like. For example, in the embodiment of the present application, when taking a picture, the application framework layer may provide an API related to a picture taking function for the application layer, and provide a camera interface management service for the application layer, so as to implement the picture taking function.
The window manager is used for managing window programs. The window manager can obtain the size of the display screen, judge whether a status bar exists, lock the screen, intercept the screen and the like.
The content provider is used to store and retrieve data and make it accessible to applications. The data may include video, images, audio, calls made and received, browsing history and bookmarks, phone books, etc.
The view system includes visual controls such as controls to display text, controls to display pictures, and the like. The view system may be used to build applications. The display interface may be composed of one or more views. For example, the display interface including the short message notification icon may include a view for displaying text and a view for displaying pictures.
The phone manager is used to provide a communication function of the terminal 100. Such as management of call status (including on, off, etc.).
The resource manager provides various resources for the application, such as localized strings, icons, pictures, layout files, video files, and the like.
The notification manager enables the application to display notification information in the status bar, can be used to convey notification-type messages, can disappear automatically after a short dwell, and does not require user interaction. Such as a notification manager used to inform download completion, message alerts, etc. The notification manager may also be a notification that appears in the form of a chart or scroll bar text at the top status bar of the system, such as a notification of a background running application, or a notification that appears on the screen in the form of a dialog window. For example, text information is prompted in the status bar, a prompt tone is given, the terminal vibrates, an indicator light flashes, and the like.
The Android Runtime comprises a core library and a virtual machine. The Android runtime is responsible for scheduling and managing an Android system.
The core library comprises two parts: one part is the functional interfaces that need to be called by the Java language, and the other part is the core library of Android.
The application layer and the application framework layer run in a virtual machine. The virtual machine executes java files of the application layer and the application framework layer as binary files. The virtual machine is used for performing the functions of object life cycle management, stack management, thread management, safety and exception management, garbage collection and the like.
The system library may include a plurality of functional modules. For example: surface managers (surface managers), media Libraries (Media Libraries), three-dimensional graphics processing Libraries (e.g., openGL ES), 2D graphics engines (e.g., SGL), and the like.
The surface manager is used to manage the display subsystem and provide fusion of 2D and 3D layers for multiple applications.
The media library supports playback and recording of a variety of commonly used audio and video formats, as well as still image files and the like. The media library may support a variety of audio and video encoding formats, such as MPEG4, H.264, MP3, AAC, AMR, JPG, and PNG.
The three-dimensional graphic processing library is used for realizing three-dimensional graphic drawing, image rendering, synthesis, layer processing and the like.
The 2D graphics engine is a drawing engine for 2D drawing.
The kernel layer is a layer between hardware and software. The kernel layer at least comprises a display driver, a camera driver, an audio driver, and a sensor driver.
Although the Android system is taken as an example for description in the embodiments of the present application, the basic principle is also applicable to a terminal based on an os such as iOS or Windows.
The following describes the workflow of the software and hardware of the terminal 100 in conjunction with fig. 1 and the photographing scene.
The touch sensor 180K receives the touch operation and reports it to the processor 110, so that the processor 110 starts the camera application in response to the touch operation and displays a user interface of the camera application on the display screen 194. For example, after receiving a touch operation on the camera application icon, the touch sensor 180K reports the touch operation on the camera application to the processor 110, so that the processor 110 starts the camera application in response to the touch operation and displays the user interface of the camera on the display screen 194. In addition, in the embodiment of the present application, the terminal 100 may start the camera application in other manners and display the user interface of the camera application on the display screen 194. For example, when the screen is off, when a lock screen interface is displayed, or when a user interface is displayed after unlocking, the terminal 100 may start the camera application in response to a voice instruction or a shortcut operation of the user, and display the user interface of the camera application on the display screen 194.
When the terminal is used for shooting, the color and light effect of the shot image may be poor due to environmental factors of the current scene. For example, in a dim light scene, a high dynamic range scene, or an ultra-high dynamic range scene, the color shading effect of the captured image is worse than that of an image captured when the scene is bright. In view of this, optimizing the color and light of the image to improve the user experience is a hot spot of research and commercial use.
In order to solve the problem of optimizing the color and light of an image, an embodiment of the present application provides an image processing method. The method is applied to a terminal, the terminal includes a first camera and a second camera, and the magnification of the first camera is greater than the magnification of the second camera. The method comprises the following steps: when the ISO of the first camera in the first scene is larger than a first threshold, acquiring a first image for the first scene through the first camera, and acquiring a second image for the first scene through the second camera, where the color shading degree of the second image is higher than that of the first image; and optimizing the first image according to the second image to obtain a third image, where the color shading degree of the third image is higher than that of the first image. In the embodiment of the application, the terminal uses an image with a high color shading degree to optimize an image with a low color shading degree of the same scene, which takes objectively acquired environment information into account.
In the present embodiment, the "image style" may be understood as the "color shading of an image". For example, the image style (or the color shading of the image) may include at least one of the color, contrast, and dynamic range of the image. For another example, that the image style of the second image is better than the image style of the first image is equivalent to that the color shading degree of the second image is higher than the color shading degree of the first image.
Embodiments of the present application will be described in detail below with reference to the accompanying drawings.
Fig. 3 is a flowchart of a method for processing an image according to an embodiment of the present disclosure. The method shown in fig. 3 comprises the following steps:
s101: before a terminal collects an image for a first scene in a current shooting environment, a first camera and a second camera are determined. Wherein, the first camera is used for shooting an image to be optimized (such as a first image in the present application), and the second camera is used for shooting a reference image (i.e. an image used in the process of optimizing the image to be optimized, such as a second image in the present application).
In one implementation, the ISO of the first camera in the current shooting environment is greater than a first threshold.
For the same camera, the value of the ISO can be adjusted automatically in different environments. The first threshold is greater than or equal to the ISO value of the first camera at the critical state between a bright region and a dark region. The value of the first threshold is related to the specification of the first camera, the critical state used to define bright and dark regions, and the like. The parameters used to characterize the specification of a camera may include focal length, magnification, photosensitivity, and the like. In one example, the first threshold may be around 200.
In this implementation, the first scene is colloquially a dim light scene, such as a night scene, or a dark area scene during the day.
In another implementation, the first scene is a high dynamic range scene or an ultra-high dynamic range scene. Specifically, if the ratio of the maximum value to the minimum value of the brightness of the preview image when the first camera shoots the first scene is greater than a threshold, it is determined that the first scene is a high dynamic range scene or an ultrahigh dynamic range scene.
In another implementation, the first scene is a high contrast scene. Specifically, if the difference between the maximum value and the minimum value of the contrast of the preview image when the first camera shoots the first scene is greater than a threshold, the first scene is a high-contrast scene.
In another implementation, the first scenario is an exception scenario. Here, the abnormal scene means that the image obtained by capturing the first scene cannot truly reflect the scene due to environmental factors and the like. For example, the texture information of an object in an image obtained by capturing the scene is distorted or color cast (i.e., color changes) occurs.
It can be understood that, the embodiment of the present application may optimize color shadow information of an image, and therefore, the embodiment of the present application is particularly suitable for scenes with poor shooting effect, such as the above-mentioned night scenes or abnormal scenes.
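The scene-determination conditions above can be illustrated by the following minimal sketch in Python; the threshold values, the use of the preview luma as the brightness measure, and the precomputed local-contrast map are assumptions for illustration only, not values prescribed by the embodiment.

```python
import numpy as np

ISO_DIM_LIGHT_THRESHOLD = 200       # example "first threshold" on the first camera's ISO
DYNAMIC_RANGE_RATIO_THRESHOLD = 50  # ratio of max to min preview luminance (assumed)
CONTRAST_DIFF_THRESHOLD = 200       # difference between max and min local contrast (assumed)

def is_dim_light_scene(first_camera_iso: float) -> bool:
    """Dim light scene: the first camera's ISO exceeds the first threshold."""
    return first_camera_iso > ISO_DIM_LIGHT_THRESHOLD

def is_high_dynamic_range_scene(preview_luma: np.ndarray) -> bool:
    """(Ultra-)high dynamic range scene: max/min luminance ratio of the preview exceeds a threshold."""
    lo = max(float(preview_luma.min()), 1.0)  # avoid dividing by zero
    hi = float(preview_luma.max())
    return hi / lo > DYNAMIC_RANGE_RATIO_THRESHOLD

def is_high_contrast_scene(local_contrast: np.ndarray) -> bool:
    """High-contrast scene: max-min difference of a precomputed local-contrast map exceeds a threshold."""
    return float(local_contrast.max() - local_contrast.min()) > CONTRAST_DIFF_THRESHOLD
```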
Optionally, the magnification of the first camera is greater than the magnification of the second camera.
In one implementation, the second camera's magnification range includes [0.5,1 ], and the first camera's magnification range includes [1,20]. For example, the second camera is a wide-angle camera, and the first camera is a 1X camera (i.e., a camera with a magnification of 1), a 3X camera (i.e., a camera with a magnification of 3), or a 10X camera (i.e., a camera with a magnification of 10).
In another implementation, the magnification of the second camera is 1 and the magnification of the first camera belongs to the interval (1, 20]. For example, the second camera is a 1X camera and the first camera is a 3X camera or a 10X camera.
In another implementation, the magnification of the second camera belongs to the interval (1, 10), and the magnification of the first camera belongs to the interval [10, 20]. For example, the second camera is a 3X camera and the first camera is a 10X camera.
Optionally, the magnification of the first camera is equal to the magnification of the second camera. For example, the magnifications of the first camera and the second camera are both less than 1, or both equal to 1, or both greater than 1, for example, the first camera and the second camera are both wide-angle cameras, or both 1X cameras, or both 3X cameras, or both 10X cameras.
The foregoing is merely an example, and specific values of the magnification of the camera applied in the embodiments of the present application are not limited.
Optionally, the second camera is used for shooting black-and-white images, and the first camera is used for shooting color images. Illustratively, the first camera is a 1X camera, and the second camera is a wide-angle camera or a 1X camera. Of course, this is not limited thereto.
Of course, both the second camera and the first camera may be used to capture color images. Hereinafter, the first image and the second image are both color images for illustration, and are described herein in a unified manner, which is not described in detail below.
Optionally, the terminal includes a plurality of cameras, and the terminal may determine the first camera and the second camera based on the following steps:
When the shooting magnification of the terminal for the first scene is [1, 3), a camera with a magnification of 1 in the terminal is selected as the first camera. In this case, a camera with a magnification of 1 or less than 1 in the terminal may be selected as the second camera.
When the shooting magnification of the terminal for the first scene is [3, 7), a camera with a magnification of 3 in the terminal is selected as the first camera. In this case, a camera with a magnification of 3, 1, or less than 1 in the terminal may be selected as the second camera.
When the shooting magnification of the terminal for the first scene is [7, 10), a camera with a magnification of 10 in the terminal is selected as the first camera. In this case, a camera with a magnification of 3, 1, or less than 1 in the terminal may be selected as the second camera.
When the shooting magnification of the terminal for the first scene is greater than 10, a camera with a magnification of 10 in the terminal is selected as the first camera. In this case, a camera with a magnification of 10, 3, 1, or less than 1 in the terminal may be selected as the second camera.
More generally, when the shooting magnification of the terminal for the first scene is [1, A), a camera with a magnification of 1 or a main camera in the terminal is selected as the first camera;
or, when the shooting magnification of the terminal for the first scene is [A, B], a camera with a magnification of A or a main camera in the terminal is selected as the first camera;
or, when the shooting magnification of the terminal for the first scene is greater than B, a camera with a magnification of C or a main camera in the terminal is selected as the first camera;
or, when the shooting magnification of the terminal for the first scene is greater than C, a camera with a magnification of C or a main camera in the terminal is selected as the first camera;
wherein 1 < A < B < C.
After the first camera is determined, a camera with a magnification less than or equal to that of the first camera in the terminal can be selected as the second camera.
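The camera-selection rules above can be summarized by the following sketch, which hard-codes the example magnification breakpoints (1x, 3x, 10x) and assumes a sub-1x wide-angle camera (0.6x here) is present; actual devices and the generic breakpoints A, B, and C are device-specific.

```python
# A minimal sketch of the camera-selection rules, assuming a shooting magnification >= 1.
def select_cameras(zoom: float):
    """Return (first_camera_magnification, candidate_second_camera_magnifications)."""
    if 1 <= zoom < 3:
        return 1, [1, 0.6]
    if 3 <= zoom < 7:
        return 3, [3, 1, 0.6]
    if 7 <= zoom < 10:
        return 10, [3, 1, 0.6]
    return 10, [10, 3, 1, 0.6]      # zoom >= 10

# Example: at 5x zoom, the 3x camera shoots the image to be optimized, and the
# reference image may come from the 3x, 1x, or wide-angle camera.
print(select_cameras(5.0))
```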
S102: the terminal collects a first image for a first scene through the first camera, and collects a second image for the first scene through the second camera. The color light and shadow degree of the second image is higher than that of the first image.
The embodiment of the present application does not limit the shooting time interval between the first image and the second image. For example, the shooting time interval between the first image and the second image is less than a threshold. The longer the shooting time interval, the higher the probability of shake, drastic environmental changes, and the like, and therefore the higher the probability that the contents of the first image and the second image differ; theoretically, the smaller the shooting time interval between the first image and the second image, the better. Alternatively, the shooting time interval between the two may be on the order of milliseconds or tens of milliseconds.
The degree of color shading of the image may be characterized based on at least one of color, contrast, and dynamic range of the image.
Optionally, if a preset condition is satisfied between the second image and the first image, it is indicated that the color shading degree of the second image is higher than that of the first image. The preset condition includes at least one of:
condition 1: the color of the second image is better than the color of the first image.
Specifically, the color of the pixel representing an object in the second image is better than the color of the pixel representing the object in the first image, and the number of pixels having the feature in the second image is greater than the threshold. Colloquially, the overall color of the second image is better than the overall color of the first image.
Alternatively, if the signal-to-noise ratio of the colors of the second image is higher than that of the colors of the first image, it is indicated that the colors of the second image are better than the colors of the first image.
Optionally, if the chromaticity of the second image is greater than the chromaticity of the first image, and/or the luminance of the second image is greater than the luminance of the first image, it is indicated that the color of the second image is better than the color of the first image.
Condition 2: the contrast of the second image is higher than the contrast of the first image.
Condition 3: the dynamic range of the second image is greater than the dynamic range of the first image.
The embodiment of the present application provides the following implementation manners, which help to realize that the color shadow degree of the second image is higher than that of the first image:
mode 1: the light sensing performance of the second camera is higher than that of the first camera.
The light sensing performance of the camera is an inherent characteristic of the camera and is used for representing the imaging capability of the camera. Generally, the larger the area of a sensor (sensor) in a camera, the stronger the light sensitivity of the camera, and in the case where the other conditions are the same, the stronger the light sensitivity of the camera, the higher the degree of color shading of an image obtained by capturing.
Mode 2: the aperture of the second camera is larger than the aperture of the first camera.
Mode 3: the exposure duration when the second camera collects the second image is longer than the exposure duration when the first camera collects the first image. For example, the exposure time for the second camera to capture the second image is in the order of one hundred milliseconds, e.g., T a Which may be about 500 milliseconds, and the exposure time when the first camera captures the second image is on the order of several to tens of milliseconds.
Mode 4: the ISO used when the second camera collects the second image is larger than the ISO used when the first camera collects the first image.
The optimal ISO value of the camera is different under different illumination. The optimal ISO value under a certain illuminance is an ISO value when the color shading degree of the image is the highest (or the color shading degree reaches a certain threshold) under the illuminance. The optimal ISO value under any illuminance can be obtained based on experiments, which is not limited in the embodiment of the present application. Under the same illumination, when the ISO value is smaller than or equal to the optimal ISO value, the larger the ISO value is, the higher the color light and shadow degree of the shot image is; when the ISO value is larger than the optimal ISO value, the larger the ISO value is, the lower the color shadow degree of the shot image is.
The above mode 1 and mode 2 can be understood as that based on the hardware attribute of the camera, the color shadow degree of the second image is higher than that of the first image. The above-mentioned modes 3 and 4 can be understood as adjusting parameters of the camera based on software to realize that the color shadow degree of the second image is higher than that of the first image. The above-described modes 3 and 4 can be applied to the case where the light sensing performance of the second camera is the same as or not much different from that of the first camera.
The aperture and the exposure time of the camera affect the amount of light entering the camera, and the photosensitive area of the sensor and the ISO of the camera determine the camera's capacity for receiving light. Generally, the larger the aperture, the longer the exposure time, the larger the photosensitive area, and the higher the ISO, the greater the amount of light that the camera can actually receive in the end. However, it is not the case that the larger the received light amount, the better: for a specific shooting environment, there is an exposure critical value. If the received light amount is smaller than the exposure critical value, the larger the received light amount, the better the color shading effect of the shot image. If the received light amount is equal to or greater than the exposure critical value, the larger the received light amount, the more the color shading effect of the captured image may deteriorate due to overexposure. As an example, the above-described modes 3 and 4 can be applied to a scene in which the received light amount is smaller than the exposure critical value.
In addition, in the case where there is no conflict, a plurality of the above-described modes 1 to 4 can be used in combination to form a new implementation. For example, the above-described modes 3 and 4 may be used in combination.
Optionally, S102 may be implemented by the following mode A or mode B:
Mode A: As shown in fig. 4, S102 may include the following steps S21 to S25:
S21: The terminal determines the illumination of the first scene.
S22: the terminal acquires a first exposure time and a first ISO of the first camera based on the determined illumination of the first scene. The specific implementation of this step may refer to the prior art, and is not described herein again.
S23: the terminal shoots a first scene by using the first exposure duration and the first ISO to obtain a first image.
S24: the terminal determines a second exposure time and a second ISO for shooting a second camera based on at least one of the first exposure time and the first ISO.
The relationship among the exposure amount EV, the exposure time T, and the ISO of the camera satisfies: EV = log2(T × ISO). The terminal may pre-store the following association relationship: the correspondence between the exposure amount (or the exposure time and/or the ISO) of the first camera and the exposure amount (or the exposure time and/or the ISO) of the second camera; and then determine the second exposure duration and the second ISO based on the pre-stored correspondence and the first exposure duration and first ISO of the first camera determined in S22. The embodiment of the present application does not limit the concrete form of the correspondence; for example, the correspondence may be a functional relationship, a table, or the like.
Optionally, the value of T_a is first obtained based on a predefined correspondence between T_a and T_m, a predefined correspondence between T_a and "T_m and ISO_m", a predefined association between T_a and ISO_m, or the like, where T_a denotes the second exposure duration, T_m denotes the first exposure duration, and ISO_m denotes the first ISO. Then, the value of ISO_a is obtained based on at least one of the parameters T_m, ISO_m, and T_a, where ISO_a denotes the second ISO.
In one example, the terminal may adjust ISO_a so that the exposure amount EV_a = log2(T_a × ISO_a) used by the second camera to acquire the second image is larger than the exposure amount EV_m = log2(T_m × ISO_m) used by the first camera to acquire the first image by a certain amount, i.e., EV_a = EV_m + f, where f is a predefined value.
In another example, the value of ISO_a is obtained based on a predefined association between ISO_a and ISO_m, a predefined association between ISO_a and "T_m and ISO_m", or a predefined association between ISO_a and T_m.
S25: and the terminal shoots the first scene by using the second exposure duration and the second ISO to obtain a second image.
Mode B: as shown in fig. 5, S102 may include the following steps S31 to S36:
S31 to S33: Reference may be made to S21 to S23 described above.
S34: the terminal determines the ISO of the camera corresponding to the illumination section where the illumination of the first scene is located based on the corresponding relation between the illumination sections of the plurality of scenes and the ISO of the camera, and takes the determined ISO as the ISO of the second camera.
The correspondence may be predefined. The ISO of the camera in the corresponding relationship may specifically be the optimal ISO value of the camera, and reference may be made to the above for the description of the optimal ISO value, which is not described herein again.
It should be noted that, when an image is shot with the optimal ISO value under a certain illumination, the color shading degree of the shot image is guaranteed to be high, but texture information may be sacrificed, the noise requirement may be relaxed, and the registration degree (i.e., the degree of registration between different images collected in the same scene) may be reduced; that is, when an image is captured using the optimal ISO value, the color shading information of the image is recorded with high quality at the expense of texture information, a relaxed noise level, and the upper limit of misalignment tolerance.
S35: the terminal determines a second exposure time of the second camera based on at least one of the first exposure time and the first ISO. Reference may be made to the above for specific implementations.
S36: reference may be made to S25 described above.
Optionally, based on the above mode A, when designing the predefined correspondence used in executing S24, the "optimal ISO value" may be considered as a factor that affects the value of f, so that the value of the second ISO obtained by executing S24 is closer to the optimal ISO value.
It should be noted that, in the above, the first image and the second image are obtained by shooting with different cameras respectively. In addition to this, the first image and the second image may be obtained by any one of the following means:
in one implementation, the first image and the second image can be captured by the same camera under different working conditions. For example, the camera can take a second image when the flash is turned on and a first image when the flash is turned off; or the camera shoots to obtain the first image and the second image by adjusting the exposure time and ISO of the camera.
In another implementation, the second image may be an image with the highest degree of color shading or with a degree of color shading higher than a certain threshold, among images captured by a plurality of cameras.
In another implementation manner, either the first image or the second image may be obtained by fusing multiple frames of images, rather than being directly captured by a camera. The multiple frames may be captured by the same camera, by different cameras, or partly by the same camera and partly by different cameras; the embodiment of the present application does not limit the shooting process.
Example 1: the method further comprises the following steps: acquiring N frames of images aiming at a first scene through a first camera, wherein N is an integer greater than or equal to 1; and carrying out multi-frame noise reduction according to the N frames of images and the first image to obtain a fourth image. In this case, optimizing the first image according to the second image to obtain a third image may include: and optimizing the fourth image according to the second image to obtain a third image. That is, the terminal optimizes the fused image based on the plurality of images shot by the same camera to obtain the third image.
Example 2: the method further comprises the following steps: respectively acquiring N1 frame images and N2 frame images for a first scene through a first camera and a second camera; wherein N1 and N2 are integers greater than or equal to 1; and performing multi-frame noise reduction according to the N1 frame image, the N2 frame image and the first image to obtain a fifth image. In this case, optimizing the first image according to the second image to obtain the third image may include: and optimizing the fifth image according to the second image to obtain a third image. That is to say, the terminal optimizes the fused image based on a plurality of images shot by different cameras to obtain a third image.
As an example, the above examples 1 and 2 may be applicable to the cases where the shooting magnification of the terminal for the first scene is [1,3 ], [3, 7), or greater than 10, and for these cases, reference may be made to the above for specific implementation manners of how to select the first camera and the second camera, which is not described herein again.
Example 3: the terminal further comprises a third camera, and the multiplying power of the third camera is not larger than that of the first camera. The method further comprises the following steps: respectively acquiring N3 frames of images and N4 frames of images for a first scene through a first camera and a third camera; wherein N3 and N4 are integers greater than or equal to 1; and carrying out multi-frame noise reduction according to the N3 frame images, the N4 frame images and the first image to obtain a sixth image. In this case, optimizing the first image according to the second image to obtain the third image may include: and optimizing the sixth image according to the second image to obtain a third image. The third camera is a camera different from the first camera and the second camera in the terminal. That is to say, the terminal may perform optimization based on the image obtained by fusing the plurality of images captured by the at least three cameras, so as to obtain the third image.
As an example, the above example 3 may be applicable to the case where the shooting magnification of the terminal for the first scene is [7, 10); for this case, reference may be made to the above for the specific implementation of how to select the first camera and the second camera, which is not described herein again.
The technical solutions provided in examples 1 to 3 are helpful for making the color shadow degree of the optimized third image higher, so as to improve the user experience.
In specific implementations, there are other methods for generating and fusing frames, and the embodiments of the present application are not limited thereto.
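The multi-frame noise reduction mentioned in examples 1 to 3 can be illustrated by the following sketch, which assumes the auxiliary frames are already registered to the first image and uses plain temporal averaging as a stand-in for whatever fusion the terminal actually performs.

```python
import numpy as np

def multi_frame_denoise(first_image: np.ndarray, aux_frames: list) -> np.ndarray:
    """Fuse the first image with N registered auxiliary frames (examples 1 to 3)."""
    stack = np.stack([first_image.astype(np.float32)]
                     + [f.astype(np.float32) for f in aux_frames], axis=0)
    fused = stack.mean(axis=0)  # averaging suppresses temporally independent noise
    return np.clip(fused, 0, 255).astype(first_image.dtype)
```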
S103: the terminal determines whether the second image is available.
In the process of capturing the second image from the first scene, abnormal situations such as lens occlusion, environmental drastic change, very serious hand shake, and serious blur may be encountered, which may result in the second image obtained by capturing having no reference value (or a low reference value), that is, the second image being unavailable, and therefore, in the embodiment of the present application, before optimizing the chromaticity of the first image, it may be determined whether the second image is available.
Optionally, S103 may include the following steps S103A to S103B:
S103A: the terminal preprocesses the second image so that the viewing ranges of the first image and the processed second image are the same or similar (i.e. the difference between the viewing ranges is within a preset range), and the sizes (including width and height) of the first image and the processed second image are the same or similar (i.e. the difference between the sizes is within a preset range). Specifically, the method comprises the following steps:
Step 1: When the shooting condition of the first image and the shooting condition of the second image meet the preset condition, the terminal performs processing such as zooming and cropping on the second image, so that the view range of the processed second image is close to the view range of the first image (namely, the difference of the view ranges is within the preset range).
Wherein the preset condition may include the following condition 1 or condition 2:
condition 1: the first image is taken under zoom conditions; and, the second image is photographed under a non-zoom condition, or the second image is photographed under a zoom condition, but a zoom magnification used when the second image is captured is different from a zoom magnification used when the first image is captured.
Condition 2: the first image is taken under non-zoom conditions; and, the second image is photographed under zoom conditions.
It should be noted that step 1 is optional. If the first image and the second image are both taken under non-zoom conditions, or if the first image and the second image are both taken under zoom conditions but the zoom magnification used when the first image and the second image are acquired is the same; then, the viewing ranges of the first image and the second image are usually close (including the same or close), and in this case, the following step 2 may be directly performed without performing the processing such as the zoom cropping on the second image.
Step 2: the terminal may adjust the sizes of the first image and the second image with similar viewing ranges) to be the same (or similar), and the specific implementation manner may refer to the prior art.
It should be noted that step S103A is optional, and if the framing ranges of the captured first image and the captured second image are similar and the sizes of the captured first image and the captured second image are the same or similar, the terminal may not execute step S103A.
S103B: the terminal judges whether the first image and the processed second image can be effectively registered under the condition that the brightness histogram is corrected to be equivalent (namely the difference of the brightness histogram is within a preset range). If the registration is valid, then the second image is available; otherwise, the second image is not available.
The embodiment of the application does not limit the measurement method of effective registration. For example, the measurement method may include: finding matched feature point pairs between the first image and the processed second image by using a classical method such as SIFT or SURF. If the number of matched feature point pairs between the two images is less than or equal to a preset value, the two images cannot be effectively registered; otherwise, the two images are considered to be effectively registered. For the specific implementation method, reference may be made to the prior art, which is not described herein again.
It should be noted that the technical solution provided in the embodiment of the present application has strong robustness for a second image obtained by shooting under abnormal conditions such as shake and blur; in such cases, the result of the effective-registration determination using a classical SIFT or SURF method is usually "unavailable". In a possible implementation manner, a binary-classification AI network can be trained specifically to implement the determination of effective registration, which is more flexible and accurate than the classical methods. The embodiment of the present application does not limit the specific method of training the binary-classification AI network and the like.
It should be noted that if the second image does not need to be preprocessed (i.e., S103A is not executed) and S103B is executed directly, then in S103B the terminal determines whether the first image and the second image can be effectively registered when the luminance histograms are corrected to be equivalent.
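A minimal sketch of the effective-registration determination in S103B using the classical SIFT pipeline is given below; it assumes OpenCV (version 4.4 or later) is available, that both images have already been brought to comparable sizes and luminance histograms, and that MIN_MATCHED_PAIRS stands in for the preset value.

```python
import cv2

MIN_MATCHED_PAIRS = 30  # assumed preset value

def second_image_available(first_img, second_img) -> bool:
    """Return True if enough matched feature point pairs are found (effective registration)."""
    sift = cv2.SIFT_create()
    _, des1 = sift.detectAndCompute(first_img, None)
    _, des2 = sift.detectAndCompute(second_img, None)
    if des1 is None or des2 is None:
        return False
    matches = cv2.BFMatcher().knnMatch(des1, des2, k=2)
    # Lowe's ratio test keeps only reliable feature point pairs.
    good = [p[0] for p in matches if len(p) == 2 and p[0].distance < 0.75 * p[1].distance]
    return len(good) > MIN_MATCHED_PAIRS
```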
S104: if the second image is available (i.e., the first and second images are successfully registered), the first image is optimized based on the second image, resulting in a third image. And the color shadow degree of the third image is better than that of the first image.
For the determination of the degree of color and light of the two images, reference may be made to the above description, which is not repeated herein.
Based on the above explanation of the relationship between the color shading degree of the second image and that of the first image, and the requirement that the color shading degree of the third image is better than that of the first image, it can be derived that:
the first image, the second image and the third image satisfy at least one of the following conditions: if the color of the second image is better than the color of the first image, the color of the third image is better than the color of the first image; if the contrast of the second image is higher than that of the first image, the contrast of the third image is higher than that of the first image; if the dynamic range of the second image is greater than the dynamic range of the first image, the dynamic range of the third image is greater than the dynamic range of the first image.
Optionally, the image content of the third image is the same as the image content of the first image; equivalently, the texture information of the third image is the same as the texture information of the first image. "Same" here may mean the same within a certain tolerance.
Alternatively, if, due to abnormal situations such as shaking or blurring, some regions (denoted as first-type regions) exist in the first image but no corresponding regions can be found for them in the second image (where a region and its corresponding region represent the same object), while other regions (denoted as second-type regions) exist in the first image and corresponding regions can be found for them in the second image, then the second image may still be considered available. In this case, "the second image is available" specifically means that the regions in the second image corresponding to the second-type regions are available.
Based on this, the terminal can mark out, through a ghost detection method, the regions (i.e., the first-type regions) in the first image for which no corresponding region can be found in the second image. The embodiment of the present application does not limit the ghost detection method. Further, after executing S104 to obtain the third image, the terminal may leave unprocessed the regions in the third image corresponding to the first-type regions (i.e., the regions representing the same objects as the first-type regions), and perform gradual blending between these regions and the other regions at their boundaries to remove the obvious boundary between them, thereby improving the image quality.
Optionally, if the second image is available, before performing S104, the method may further include: the terminal registers the first image and the second image (specifically, the second image obtained after the preprocessing) by using, but not limited to, classical SIFT, SURF, and the like; of course, an AI network may also be used for registration. Performing S104 after this registration step helps make the optimization result more accurate.
After executing the above S104:
in one implementation, the terminal may display the third image obtained in S104.
In another implementation, in specific implementations it may be desired to adjust only a certain range of pixel values (or a certain region), while keeping other locations unchanged. Based on this, the embodiment of the present application provides the following technical solution:
Firstly, the terminal acquires a first target sub-image in the first image, where the first target sub-image is a region of the first image whose color shading information is not desired to be changed; specifically, it may be a highlight or even overexposed region in the first image, a region containing a specific color, a region defined based on semantic content or the like, or a region obtained by a network such as an attention network. Secondly, the terminal acquires a second target sub-image in the third image, where the second target sub-image and the first target sub-image describe the same object. Then, the terminal updates (or replaces) the second target sub-image in the third image with the first target sub-image to obtain a fourth image; or the terminal splices the first target sub-image with the part of the third image other than the second target sub-image to obtain the fourth image. For example: final result = straight-out result × (1 - mask) + first image × mask, where the final result represents the fourth image, the straight-out result represents the third image, the mask represents the position of the first target sub-image in the first image (equivalently, the position of the second target sub-image in the third image), the first image contributes the first target sub-image, and the straight-out result contributes the second target sub-image. Subsequently, the terminal may display the fourth image. The color shading degree of the fourth image is higher than that of the first image, and the image content of the fourth image is the same as (or similar to) that of the first image.
Optionally, the mask may instead be a region circled out on the second image, which is often more efficient; that is, a region of the second image with poor effect, such as a severely overexposed region, is not desired to be applied to the first image. The circled mask is then blurred by Gaussian blur or down-/up-sampling to construct a gradual transition band, so that it can be applied to the first image without considering the misalignment between the first image and the second image, and the expected effect is ensured.
Optionally, the mask may also be solved by a network, such as an attention network.
Optionally, the input to the network may be the first image, the second image, or a combination of the first image and the second image, based on any of the above approaches.
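The mask-based blending described above can be sketched as follows; the overexposure threshold, the Gaussian kernel size used to build the transition band, and the example mask definition are illustrative assumptions.

```python
import cv2
import numpy as np

def highlight_mask(first_image: np.ndarray, threshold: int = 240) -> np.ndarray:
    """Example mask: 1 where the first image is close to overexposure, 0 elsewhere."""
    return (first_image.max(axis=-1) >= threshold).astype(np.float32)

def blend_with_mask(first_image: np.ndarray, third_image: np.ndarray,
                    mask: np.ndarray, blur_ksize: int = 31) -> np.ndarray:
    """Apply: final = straight-out x (1 - mask) + first image x mask, with a soft transition band."""
    soft_mask = cv2.GaussianBlur(mask, (blur_ksize, blur_ksize), 0)
    if soft_mask.ndim == 2 and first_image.ndim == 3:
        soft_mask = soft_mask[..., None]  # broadcast over the color channels
    fused = (third_image.astype(np.float32) * (1.0 - soft_mask)
             + first_image.astype(np.float32) * soft_mask)
    return np.clip(fused, 0, 255).astype(first_image.dtype)
```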
Hereinafter, a specific implementation of S104 is explained:
optionally, the optimizing, by the terminal, the first image by using the second image to obtain a third image may include:
mode 1: and representing the mapping relation between the color shadow information of the first image and the second image by using the CCM matrix, thereby optimizing the first image based on the CCM matrix to obtain a third image.
Specifically, as shown in fig. 6, mode 1 may include the following steps S41 to S45:
S41: The terminal respectively extracts features of the first image and the second image to obtain a first tensor and a second tensor; the first tensor is used for representing texture information and color shadow information of the first image, and the second tensor is used for representing texture information and color shadow information of the second image. The first tensor and the second tensor have the same size.
For example, the terminal uses an AI technique, such as a neural network, to perform feature extraction on the first image and the second image, respectively, to obtain a first tensor and a second tensor. The method for extracting the features of the image by using the neural network by the terminal can refer to the prior art, and is not described herein again. Of course, in a specific implementation, the terminal may also use a conventional technique instead of the AI technique to perform feature extraction on the first image and the second image, which is not limited in this embodiment of the present application.
When this step is executed, the second image may specifically be the preprocessed second image obtained in step S103A.
Taking the first image and the second image both being color images as an example, before feature extraction is performed on the first image and the second image by the third neural network, the terminal may perform the following steps. First, the first image and the second image are resized to the same size, for example by scaling the second image so that the first image and the scaled second image have the same size; if the sizes of the first image and the second image are already the same after S102 is performed, this step may be skipped. Then, the adjusted first image and second image are input to the third neural network. The sizes of the adjusted first image and second image are denoted as H × W × 3, where H and W respectively represent the width and the height of the adjusted first image and second image, and 3 represents the three RGB channels. For example, the values of H and W may be predefined in the third neural network.
In one example, the third neural network may eventually divide the first and second images into h x w image blocks, respectively. Where h and w denote the numbers of image blocks divided in the width and height directions of the first image (or the second image), respectively. For example, the values of h and w may be predefined in the third neural network.
Optionally, if the convolution kernel of the convolution operation used in the third neural network is a square convolution kernel, H/h = W/w = D needs to be satisfied, where D is a positive integer. In this case, the size of each image block is D × D.
Alternatively, assuming that the sizes of the first tensor and the second tensor are denoted as h_f × w_f × c_f, then H/h_f = W/w_f = D1 is satisfied, where D1 is a positive integer. c_f is the number of features; for example, if 10 texture features and 10 color features are extracted, c_f is 20. Illustratively, the values of h_f, w_f, and c_f may be predefined in the third neural network.
Optionally, the terminal may use the same third neural network, or may use different third neural networks to extract the features of the first image and the features of the second image.
S42: and the terminal performs superposition operation (concat) on the first tensor and the second tensor based on the characteristic dimensionality of the first tensor and the second tensor to obtain a target tensor. Or, performing subtraction operation on the first tensor and the second tensor to obtain the target tensor.
Taking the sizes of the first tensor and the second tensor both being h_f × w_f × c_f as an example, the size of the target tensor obtained by superposing the first tensor and the second tensor along the feature dimension is h_f × w_f × 2c_f.
The tensor obtained by subtracting two tensors is a tensor composed of the elements obtained by subtracting the elements at the same positions in the two tensors. Taking the sizes of the first tensor and the second tensor both being h_f × w_f × c_f as an example, the size of the target tensor obtained by subtracting the first tensor and the second tensor is h_f × w_f × c_f. Since "A - B" is equivalent to "A + (-B)", subtracting the second tensor from the first tensor is equivalent to adding the first tensor to the opposite of the second tensor.
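A minimal sketch of S42 is given below, assuming the two feature tensors have already been extracted with the same shape (h_f, w_f, c_f); whether the first neural network consumes the concatenation or the difference is a design choice.

```python
import numpy as np

def combine_features(t1: np.ndarray, t2: np.ndarray, mode: str = "concat") -> np.ndarray:
    """Build the target tensor from the first and second feature tensors."""
    assert t1.shape == t2.shape, "the first and second tensors must have the same size"
    if mode == "concat":
        return np.concatenate([t1, t2], axis=-1)  # target tensor of shape (h_f, w_f, 2*c_f)
    return t1 - t2                                # element-wise subtraction, shape (h_f, w_f, c_f)
```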
S43: the terminal acquires a CCM matrix of at least two sub-images using a first neural network. The first neural network is configured to analyze color shading information and texture information of the first image and color shading information and texture information of the second image to obtain a CCM matrix of at least two sub-images, for example, a CCM matrix of each sub-image into which the first image is divided.
Optionally, the input information of the first neural network includes a target tensor. The output information of the first neural network includes the CCM matrix for each sub-image in the first image.
The at least two sub-images include a first sub-image, and the CCM matrix of the first sub-image is used for representing the mapping relation between the color of the first sub-image and the color of a second sub-image in the second image. The first sub-image and the second sub-image are images of the same object. The first sub-image may be any one of at least two sub-images included in the first image. "object" refers to the same part/portion of the same object, for example, the object may be the eyes, nose, mouth, etc. of a person, and as another example, the object may be the left index finger portion or the right thumb portion of a person.
Optionally, the sub-image satisfies one of the following conditions:
condition 1: the sub-images are image blocks in the first image, the different image blocks being of the same size.
A frame image may include at least two image blocks. The image blocks are rectangular, which may be square, for example. Based on the condition 1, one sub-picture can be regarded as one picture block in one frame picture.
Condition 2: the sub-images are determined based on the similarity of the pixels in the first image, and the similarity of the pixels in the same sub-image is greater than or equal to a third threshold. Optionally, the different similarities in different sub-images are smaller than a third threshold.
This is a technical solution proposed in consideration of "in a normal case, in the same image, the similarity of pixels describing the same object is not greatly different, but the similarity of pixels describing different objects is greatly different". For example, assuming that a person and a building are included in the first image, there may be a case where the similarity between pixels describing the person and pixels describing the building are different greatly, and the similarity between pixels describing the building is not different greatly, in which case, the pixels describing the person may be collectively regarded as one sub-image, and the pixels describing the building may be collectively regarded as one sub-image. For another example, for pixels describing a person, there may be a large difference in similarity between pixels describing a person's coat and pixels describing a person's hair, in which case the pixels describing a person's coat may be collectively regarded as one sub-image and the pixels describing a person's hair may be collectively regarded as one sub-image.
It will be appreciated that the third threshold may be the same size or different sizes for different sub-images.
In order to distinguish from the pixel blocks, in some descriptions of the embodiments of the present application, each sub-image obtained by dividing the image based on the similarity is referred to as one pixel group. One frame image may include one or more pixel groups. Based on the condition 2, one sub-image can be regarded as one pixel group in one frame image.
Condition 3: the sub-images are determined based on the similarity of the pixels in the image blocks in the first image, the sizes of the different image blocks are the same, and the similarity of the pixels in the same sub-image of the same image block is greater than a third threshold.
Based on condition 3, one sub-image can be regarded as one pixel group in one image block of one frame image.
Condition 3 divides one frame of image at a finer granularity than condition 1 or condition 2. In this way, more CCM matrices are obtained when S43 is performed, which makes the interpolation result obtained when S44 is performed more accurate.
Based on condition 3, the output result of the first neural network may be of size h × w × (n_g × n_m). The process of the first neural network processing the first image and the second image may correspond to: evenly dividing the first image into h × w pixel blocks, where the pixels in each pixel block are divided into n_g pixel groups according to their similarity; for each pixel group, the first neural network outputs n_m CCM matrix elements. Illustratively, n_m is generally 9, corresponding to the number of elements of a 3 × 3 CCM matrix for the y, u, and v channels. Illustratively, if the scheme is desired to process only chrominance, i.e., only the u channel and the v channel, n_m may take a value less than 9, such as 6; this is equivalent to not processing the 3 elements corresponding to the y channel. Of course, the output CCM matrix in this case is applied in the YUV domain.
The above S43 may be considered as an implementation manner of acquiring the CCM matrix of at least two sub-images, and the implementation is not limited thereto. For example, the CCM matrix for at least two sub-images may be obtained using conventional methods rather than AI techniques.
S44: the terminal obtains a CCM matrix for the pixels in the first image based on the CCM matrices for the at least two sub-images. For example, the terminal performs interpolation operation based on the CCM matrices of the at least two sub-images to obtain a CCM matrix of each pixel in the first image. The first image includes a first pixel, and the first pixel may be any one of pixels in the first image. The CCM matrix of the first pixel is used to characterize a mapping between the color of the first pixel and the color of a second pixel in the second image. The first pixel and the second pixel correspond to the same image content.
As shown in fig. 7, a schematic diagram of dividing the first image into an even grid is shown. Let netout[i, j, 0 : n_g*n_m], with 0 <= i < h and 0 <= j < w, denote the CCM data output by the first neural network for the center point (namely the grid point) of the (i, j)-th region of the first image; the CCM matrix of each pixel in the first image can then be interpolated from the grid points adjacent to it. Taking each pixel as a center, a square region is enclosed whose area may be equal to that of each of the equal square regions into which the first image is divided, and all grid points falling in this square region are used for interpolation to obtain the CCM matrix of that pixel. Under this rule, as shown in fig. 7, it is easy to see that there are three types of interpolation regions (a first type, a second type, and a third type), distinguished by the number of grid points required for interpolation.
Denote the similarity of one pixel of the first image as s, and let the number of grid points used for interpolation be M. For the m-th grid point, 0 <= m < M, the data participating in the interpolation are netout[i_m, j_m, n_ms*n_m : (n_ms+1)*n_m] and netout[i_m, j_m, (n_ms+1)*n_m : (n_ms+2)*n_m], where, for the m-th grid point, the pixel similarity s lies between group n_ms and group n_ms+1, with 0 <= n_ms < n_g - 1; the pixel can here be understood as belonging to both groups. As for the similarity measure, the simplest choice is the y-channel value of the pixel: for grouping, the y-channel value range [0, 255] is divided evenly into n_g groups, and n_ms is then the integer part of s*(n_g-1)/255. Of course, other calculation methods for the similarity exist: y, u and v may be considered jointly, a correlation may be computed, or a dedicated network may even be trained to compute it. More generally, the similarity measure used for each interpolation grid point may differ, as long as the continuity of the interpolation can be guaranteed. To guarantee continuity, for a single grid point, all pixels of the first image lying in a square region of side length 2D centered on that grid point are measured against that grid point with the same similarity measure.
The interpolation method may be linear, cubic, Lanczos, or the like. Taking the linear case as an example: for the m-th grid point, let the distances from the pixel similarity s to group n_ms and group n_ms+1 be L_(ms) and L_(ms+1) respectively; then A_m = netout[i_m, j_m, n_ms*n_m : (n_ms+1)*n_m] * L_(ms+1) / (L_(ms) + L_(ms+1)) + netout[i_m, j_m, (n_ms+1)*n_m : (n_ms+2)*n_m] * L_(ms) / (L_(ms) + L_(ms+1)). With the coordinates of the pixel and of the M interpolation grid points known, bilinear interpolation is then performed over the values A_m to obtain the CCM matrix of the pixel.
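The interpolation just described can be sketched as follows, assuming the simplest similarity measure (the y-channel value) and, for the spatial step, a simple inverse-distance blend as a stand-in for the bilinear interpolation over adjacent grid points; netout, the grid-point list and all parameter names are assumptions used only for illustration.

```python
import numpy as np

def group_index_and_weights(s, n_g):
    """Simplest similarity measure: the pixel's y-channel value s in [0, 255].
    Returns the lower group index n_ms and the distances of s to groups n_ms and n_ms+1."""
    pos = s * (n_g - 1) / 255.0
    n_ms = min(int(pos), n_g - 2)      # ensure 0 <= n_ms < n_g - 1
    l_lo = pos - n_ms                  # distance to group n_ms
    l_hi = 1.0 - l_lo                  # distance to group n_ms + 1
    return n_ms, l_lo, l_hi

def ccm_for_pixel(netout, grid_pts, s, px, py, n_g, n_m=9):
    """Interpolate one pixel's CCM matrix from its neighbouring grid points (linear variant).

    grid_pts: iterable of (i_m, j_m, gx, gy) - grid indices and grid-point image coordinates.
    """
    n_ms, l_lo, l_hi = group_index_and_weights(s, n_g)
    mats, weights = [], []
    for i_m, j_m, gx, gy in grid_pts:
        a_lo = netout[i_m, j_m, n_ms * n_m:(n_ms + 1) * n_m]
        a_hi = netout[i_m, j_m, (n_ms + 1) * n_m:(n_ms + 2) * n_m]
        # Linear interpolation across the two similarity groups (the A_m above).
        a_m = a_lo * l_hi / (l_lo + l_hi) + a_hi * l_lo / (l_lo + l_hi)
        mats.append(a_m.reshape(3, 3))
        # Inverse-distance weight: a simple stand-in for the bilinear spatial interpolation.
        weights.append(1.0 / (abs(px - gx) + abs(py - gy) + 1e-6))
    w = np.asarray(weights)
    w /= w.sum()
    return sum(wi * mi for wi, mi in zip(w, mats))
```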
When grid division is used, the network outputs local CCM matrices. The network may synchronously output a global CCM matrix, and for each pixel the local CCM matrix and the global CCM matrix are then combined to obtain the final CCM matrix. The combination may be a proportional linear addition.
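A proportional linear addition of a local and a global CCM matrix can be as simple as the following; the matrices shown are placeholders, and the blending ratio alpha is an assumed value, not fixed by the description above.

```python
import numpy as np

# Placeholder matrices; in practice ccm_local comes from the grid output for the pixel
# and ccm_global from the network's synchronously output global CCM matrix.
ccm_local = np.eye(3) + 0.05 * np.random.randn(3, 3)
ccm_global = np.eye(3) + 0.02 * np.random.randn(3, 3)

alpha = 0.7  # blending ratio (assumption)
ccm_final = alpha * ccm_local + (1 - alpha) * ccm_global
```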
Optionally, after the terminal obtains the CCM matrix of each pixel, the terminal may perform a recheck and correction using the relevant hardware parameters of the shot. If the calculated CCM matrix is clearly unreasonable (for example, the first image or the second image is abnormal, or the calculation is wrong), the CCM matrix is further optimized or the scheme is terminated. In particular, environmental information may also be recorded synchronously with an additional sensor and used as a check (for example, to deduce whether the captured first image or second image is normal).
S45: the terminal optimizes the first image based on the CCM matrix of the pixels (e.g., each pixel) in the first image to obtain a third image. Specifically, for any pixel in the first image (denoted as a target pixel), the terminal multiplies the pixel value of the target pixel by the CCM matrix of the target pixel to obtain the pixel value of a third pixel in the third image, where the third pixel and the target pixel correspond to the same image content.
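As a sketch of S45, the per-pixel multiplication can be written in one vectorized step; whether the CCM multiplies the pixel value on the left or the right is not spelled out above, so the left-multiplication below is an assumption.

```python
import numpy as np

def apply_ccm(first_image, ccm_per_pixel):
    """Apply each pixel's 3x3 CCM matrix to its 3-channel value.

    first_image:   (H, W, 3) array, e.g. YUV or RGB pixel values
    ccm_per_pixel: (H, W, 3, 3) array holding one CCM matrix per pixel
    """
    # For every pixel: third_pixel = CCM @ target_pixel.
    third_image = np.einsum('hwij,hwj->hwi', ccm_per_pixel, first_image)
    return third_image
```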
In the mode 1, the first image is optimized by using the CCM matrix method, which helps to ensure that the texture information and the definition of the first image are unchanged (or substantially unchanged) after being processed. In addition, in the above S43 to S44, the CCM matrix of each sub-image is obtained through the neural network, and then the CCM matrix of each pixel is obtained through the conventional interpolation method. Therefore, compared with the technical scheme of directly obtaining the CCM matrix of each pixel based on the neural network in the prior art, the method is beneficial to reducing the calculation complexity, thereby shortening the calculation time and effectively controlling the performance overhead.
As shown in fig. 8A, a schematic diagram of a process for optimizing a first image based on a CCM matrix is shown. For the specific process description, reference may be made to the above description, which is not repeated herein.
As shown in fig. 8B, a schematic comparison of the first image, the second image and the third image is given for the case in which the first image is optimized based on the CCM matrix. As can be seen from fig. 8B, this embodiment can "migrate" (without being limited to) the color and shading information of the second image to the corresponding positions of the first image. Comparing the third image with the first image, their texture information and definition are substantially consistent; that is, the original texture information and definition of the first image are substantially unchanged before and after optimization. Moreover, the comparison shows that the vividness and saturation of the green plants in the third image are better than in the first image, and that the brightness of the ground and the stone column in the third image is better than in the first image; that is, the original color and lighting of the first image are improved after optimization.
The above description is given by way of example of "first obtaining the CCM matrices of the plurality of sub-images included in the first image, and then performing interpolation based on these CCM matrices to obtain the CCM matrix of each pixel in the first image". Optionally, the first image may not be divided into sub-images; that is, a CCM matrix is obtained at the granularity of the whole image, and this CCM matrix of the first image is used as the CCM matrix of each pixel in the first image. This helps to reduce the computational complexity and the computational performance overhead.
Mode 2: the third image is directly output using a neural network, i.e., each pixel of the third image is directly output. As shown in fig. 9, the method may specifically include the following steps S51 to S53:
S51 to S52: reference may be made to S41 to S42 described above.
S53: and the terminal optimizes the first image by using the second neural network and the second image to obtain a third image. The second neural network is used for optimizing the image style of the image with the poor image style by utilizing the image with the good image style.
Optionally, the input information of the second neural network includes a target tensor, the target tensor is used for representing texture information and color shadow information of the first image, and the texture information and the color shadow information of the second image. The output information of the second neural network includes a third image.
Alternatively, the second neural network may be an AI network of the Unet type.
Optionally, as shown in fig. 10, a schematic structural and logical diagram of a second neural network provided in the embodiments of the present application is shown.
The second neural network can apply the encoding-decoding idea: the left side of the network is regarded as the coding layer of the first image, and the right side as its decoding layer. The coding layer part comprises a texture information layer and a color shadow information layer, wherein the texture information layer encodes the texture information of the first image and the color shadow information layer encodes the color shadow information of the first image. The requirements on the coding layer include: when the texture information and color shadow information of two frames of the same first image are input to the network and concatenated (concat) into one tensor, the color shadow information layer faithfully records the original color shadow information of the first image, so that the network output is strictly equal to the first image. When the texture information and color shadow information of the first image and of the second image are input to the network and concatenated into one tensor, the network learns the color shadow information of the second image, which guides the color shadow information layer of the first image to change while leaving the texture information layer unaffected (or affected only to a small extent), and the final network output is the expected result image after the color shadow information of the first image has been optimized (namely the third image).
Specifically, as shown in fig. 10, the second neural network performs convolution operations on a first sub-tensor and a second sub-tensor in the target tensor respectively, where the first sub-tensor contains the texture information in the target tensor and the second sub-tensor contains the color shadow information in the target tensor. In a specific implementation, multiple convolution operations may be performed. Then, the tensor obtained by the convolution operations on the first sub-tensor and the tensor obtained by the convolution operations on the second sub-tensor are superposed to obtain a superposed tensor. Finally, the superposed tensor, the first sub-tensor, and the second sub-tensor are combined to obtain the third image.
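For intuition only, the following PyTorch-style sketch mirrors the decoupled structure just described; the module name, channel sizes, number of convolutions and the final fusion convolution are assumptions, not the actual network of this embodiment.

```python
import torch
import torch.nn as nn

class DecoupledBlock(nn.Module):
    """Illustrative sketch of the decoupled processing described above (not the exact network).

    The target tensor is split into a texture sub-tensor and a color shadow sub-tensor;
    each branch is convolved separately, the two results are superposed, and the sum is
    fused with both sub-tensors to produce the output image. Channel sizes are assumed.
    """
    def __init__(self, tex_ch=16, col_ch=16, out_ch=3):
        super().__init__()
        self.tex_conv = nn.Sequential(nn.Conv2d(tex_ch, 32, 3, padding=1), nn.ReLU(),
                                      nn.Conv2d(32, 32, 3, padding=1), nn.ReLU())
        self.col_conv = nn.Sequential(nn.Conv2d(col_ch, 32, 3, padding=1), nn.ReLU(),
                                      nn.Conv2d(32, 32, 3, padding=1), nn.ReLU())
        # Fuse the superposed branch output with both original sub-tensors.
        self.fuse = nn.Conv2d(32 + tex_ch + col_ch, out_ch, 3, padding=1)

    def forward(self, texture_sub, color_sub):
        t = self.tex_conv(texture_sub)    # convolutions on the texture sub-tensor
        c = self.col_conv(color_sub)      # convolutions on the color shadow sub-tensor
        merged = t + c                    # superpose the two branch results
        out = self.fuse(torch.cat([merged, texture_sub, color_sub], dim=1))
        return out                        # stands in for the third image
```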
The practical benefits of such a network design include: while the network processes the color shadow information of the first image, the texture information of the first image is not changed (or is changed only to a small extent). In addition, because the network structurally decouples texture information from color shadow information, the processing is, on the one hand, more targeted and more accurate, less prone to introducing artifacts, and does not affect texture information and definition while changing the color shadow information; on the other hand, it helps to reduce the performance overhead of network processing. For example, let the depths of the texture information layer before and after convolution be d1in and d1out, the depths of the color shadow information layer before and after convolution be d2in and d2out, and the total depths before decoupling be d1in + d2in and d1out + d2out; then, as long as the convolution kernel sizes are comparable, the overhead of the single convolution before decoupling is generally much larger than the combined overhead of the two convolutions after decoupling that process the texture information layer and the color shadow information layer separately.
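The claimed saving can be checked with a rough per-pixel multiply count of k*k*d_in*d_out for a k × k convolution; the depths below are assumed values chosen only to make the comparison concrete.

```python
k = 3                     # convolution kernel size
d1_in, d1_out = 16, 16    # texture information layer depths (assumed)
d2_in, d2_out = 16, 16    # color shadow information layer depths (assumed)

# Rough per-pixel multiply count of a k x k convolution: k*k*d_in*d_out.
coupled   = k * k * (d1_in + d2_in) * (d1_out + d2_out)   # one joint convolution
decoupled = k * k * (d1_in * d1_out + d2_in * d2_out)     # two separate convolutions
print(coupled, decoupled)  # 9216 vs 4608: decoupling halves the cost in this example
```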
In the mode 2, the optimization step is performed by inputting feature information (including texture information and color shadow information) of an image into the second neural network at the granularity of the image. In practical implementation, feature information (including texture information and color shadow information) representing the same corresponding image block in the first image and the second image may also be input into the second neural network to perform the optimization step with the granularity of the sub-image (such as the image block); and then, splicing the optimization results of the corresponding sub-images in the first image and the second image to obtain a third image. Transition fusion may be required during splicing, and the specific implementation manner may refer to the prior art.
In addition to the above-described mode 1 and mode 2, S104 may be implemented using a conventional method, such as a method of solving an optimization equation, color shade information matching between the first image and the second image, and the like.
Fig. 11 is a schematic diagram of an image processing method according to an embodiment of the present application. The method is applied to a terminal, the terminal comprises a first camera and a second camera, the multiplying power of the first camera is not less than that of the second camera, and the method can comprise the following steps:
S201: the terminal acquires a first image for a first scene in the current shooting environment through the first camera. Optionally, when the sensitivity ISO of the first camera in the current shooting environment is greater than the first threshold, the first camera collects the first image for the first scene in the current shooting environment.
S202: the terminal acquires a second image for the first scene through the second camera; wherein the color of the second image is closer to the true color of the first scene than the color of the first image.
Optionally, the second image and the first image satisfy at least one of the following conditions: the chromaticity of the second image is closer to the true chromaticity of the first scene than the chromaticity of the first image; the second image has a luminance closer to the true luminance of the first scene than the luminance of the first image.
S203: the terminal optimizes the first image according to the second image to obtain a third image; the third image has a color that is closer to the true color of the first scene than the color of the first image.
Optionally, the image content of the third image is the same as the image content of the first image.
The embodiment is particularly suitable for scenes in which the color difference between the shot image and the real scene is too large, namely, the color cast phenomenon occurs. In this embodiment, the conditions that the first camera and the second camera satisfy, how to select the first camera and the second camera, how to optimize the first image (or optimize the first image obtained after fusion) using the second image, and the like can be referred to above. In principle, reference may be made to the above for any alternative implementation based on the present embodiments, without conflict.
It is understood that, in order to implement the functions of the above embodiments, the terminal includes a hardware structure and/or a software module corresponding to each function. Those of skill in the art will readily appreciate that the various illustrative elements and method steps described in connection with the embodiments disclosed herein may be implemented as hardware or combinations of hardware and computer software. Whether a function is performed as hardware or computer software driven hardware depends on the particular application scenario and design constraints imposed on the solution.
Fig. 12 includes a schematic structural diagram of a possible image processing apparatus provided in an embodiment of the present application. The image processing devices can be used for realizing the functions of the terminal in the method embodiment, so that the beneficial effects of the method embodiment can be realized. In the embodiment of the present application, the image processing apparatus may be the terminal 100 shown in fig. 1, and may also be a module (e.g., a chip) applied to the terminal. The following description will be given taking as an example that the image processing apparatus 131 is a module (e.g., a chip) in a terminal.
The terminal 13 includes an image processing apparatus 131, a first camera 132 and a second camera 133 as shown in fig. 12. The magnification of the first camera 132 is not less than that of the second camera 133. The image processing apparatus 131 may include: a control unit 1311 and an optimization unit 1312.
In some embodiments, the control unit 1311 is configured to control the first camera 132 to capture a first image for a first scene in the current shooting environment and control the second camera 133 to capture a second image for the first scene when the ISO of the first camera 132 in the current shooting environment is greater than a first threshold. An optimizing unit 1312 is configured to optimize the first image according to the second image to obtain a third image. For example, in conjunction with fig. 3, the control unit 1311 may be used to perform S102, and the optimization unit 1312 may be used to perform S104. Wherein when the color of the second image is better than the color of the first image, the color of the third image is better than the color of the first image; or when the contrast of the second image is higher than that of the first image, the contrast of the third image is higher than that of the first image; or when the dynamic range of the second image is larger than that of the first image, the dynamic range of the third image is larger than that of the first image.
Optionally, the color of the second image is better than the color of the first image, including at least one of the following conditions: the second image has a higher chroma than the first image; the second image has a luminance greater than the luminance of the first image.
In other embodiments, the control unit 1311 is configured to control the first camera 132 to capture a first image for a first scene in the current shooting environment, and control the second camera 133 to capture a second image for the first scene, when the sensitivity ISO of the first camera 132 in the current shooting environment is greater than a first threshold; wherein the color of the second image is closer to the true color of the first scene than the color of the first image; an optimizing unit 1312 configured to optimize the first image according to the second image to obtain a third image; the third image has a color that is closer to the true color of the first scene than the color of the first image. For example, in conjunction with fig. 11, the control unit 1311 may be used to perform S201 and S202, and the optimization unit 1312 may be used to perform S203.
Optionally, the second image and the first image satisfy at least one of the following conditions: the chromaticity of the second image is closer to the true chromaticity of the first scene than the chromaticity of the first image; the second image has a luminance closer to the true luminance of the first scene than the luminance of the first image.
Based on any of the above embodiments, the following optional implementation manners are provided:
optionally, the image content of the third image is the same as the image content of the first image.
Optionally, the light sensing performance of the second camera 133 is greater than that of the first camera 132.
Optionally, the aperture of the second camera 133 is larger than the aperture of the first camera 132.
Optionally, the exposure duration when the second camera 133 acquires the second image is longer than the exposure duration when the first camera 132 acquires the first image.
Optionally, the ISO used when the second camera 133 acquires the second image is larger than the ISO used when the first camera 132 acquires the first image.
Alternatively, the magnification range of the second camera 133 includes [0.5,1 ], and the magnification range of the first camera 132 includes [1,20].
Alternatively, the magnification of the second camera 133 is 1, and the range of the magnification of the first camera 132 includes (1, 20).
As shown in fig. 12, optionally, the image processing apparatus 131 further includes: a noise reduction unit 1313.
Optionally, the control unit 1311 is further configured to control the first camera 132 to acquire N frames of images for the first scene, where N is an integer greater than or equal to 1; a denoising unit 1313, configured to perform multi-frame denoising according to the N frames of images and the first image to obtain a fourth image; the optimizing unit 1312 is specifically configured to optimize the fourth image according to the second image to obtain a third image. Wherein the image content of the fourth image is the same as the image content of the first image.
Optionally, the control unit 1311 is further configured to control the first camera 132 and the second camera 133 to capture N1 frame images and N2 frame images for the first scene respectively; wherein N1 and N2 are integers greater than or equal to 1; a noise reduction unit 1313, configured to perform multi-frame noise reduction according to the N1 frame image, the N2 frame image, and the first image, to obtain a fifth image; the optimizing unit 1312 is specifically configured to optimize the fifth image according to the second image to obtain a third image. Wherein the image content of the fifth image is the same as the image content of the first image.
Optionally, the terminal 13 further includes a third camera 134, and a magnification of the third camera 134 is smaller than a magnification of the first camera 132; the control unit 1311 is further configured to control the first camera 132 and the third camera 134 to acquire N3 frames of images and N4 frames of images for the first scene, respectively; wherein N3 and N4 are integers greater than or equal to 1; a denoising unit 1313, configured to perform multi-frame denoising according to the N3 frame image, the N4 frame image, and the first image, to obtain a sixth image; the optimizing unit 1312 is specifically configured to optimize the sixth image according to the second image to obtain a third image. Wherein the image content of the sixth image is the same as the image content of the first image.
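For intuition, the multi-frame noise reduction used to obtain the fourth, fifth, or sixth image can be approximated by averaging aligned frames with the first image; real pipelines also perform alignment and weighting, so the following is only an assumption-laden sketch and not the noise reduction unit's actual algorithm.

```python
import numpy as np

def multiframe_denoise(frames, first_image):
    """Average already-aligned frames with the first image to suppress noise.

    frames:      iterable of (H, W, C) arrays collected for the same scene
    first_image: (H, W, C) array; the result keeps the image content of the first image
    """
    stack = np.stack(list(frames) + [first_image], axis=0).astype(np.float32)
    return stack.mean(axis=0)
```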
As shown in fig. 12, optionally, the image processing apparatus 131 further includes: a selecting unit 1314.
Optionally, the selecting unit 1314 is configured to:
when the shooting magnification of the terminal 13 for the first scene is [1, 3), a camera with a magnification of 1 in the terminal 13 is selected as the first camera 132.
Alternatively, when the shooting magnification of the terminal 13 for the first scene is [3,7 ], a camera with a magnification of 3 in the terminal 13 is selected as the first camera 132.
Alternatively, when the shooting magnification of the terminal 13 for the first scene is greater than 10, a camera with a magnification of 10 in the terminal 13 is selected as the first camera 132.
Alternatively, when the shooting magnification of the terminal 13 for the first scene is [7,10 ), a camera with a magnification of 10 in the terminal 13 is selected as the first camera 132, and a camera with a magnification of 3 in the terminal is selected as the third camera 134.
Optionally, the selecting unit 1314 is further configured to:
when the shooting magnification of the terminal 13 for the first scene is [1, 3), selecting a camera with the magnification of 1 or less than 1 in the terminal 13 as the second camera 133;
or, when the shooting magnification of the terminal 13 for the first scene is [3, 7), a camera with the magnification of 3, 1, or less than 1 in the terminal 13 is selected as the second camera 133;
or, when the shooting magnification of the terminal 13 for the first scene is [7, 10), a camera with the magnification of 3, 1, or less than 1 in the terminal 13 is selected as the second camera 133;
alternatively, when the shooting magnification of the terminal 13 for the first scene is greater than 10, a camera with a magnification of 10, 3, 1, or less than 1 in the terminal 13 is selected as the second camera 133. A combined sketch of these selection rules follows.
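Putting the rules together, a sketch of the selection logic might look as follows; the dictionary keys, the handling of a shooting magnification of exactly 10, and the particular second-camera choice picked in each range are assumptions (the rules above permit several choices).

```python
def select_cameras(zoom, cams):
    """Pick (first_camera, second_camera, third_camera_or_None) by shooting magnification.

    `cams` is assumed to map a magnification to a camera handle, e.g.
    {0.5: wide_cam, 1: main_cam, 3: tele3_cam, 10: tele10_cam}; the key 0.5 stands in
    for "a camera with a magnification smaller than 1".
    """
    if 1 <= zoom < 3:
        return cams[1], cams[0.5], None        # second camera: 1x or sub-1x
    if 3 <= zoom < 7:
        return cams[3], cams[1], None          # second camera: 3x, 1x or sub-1x
    if 7 <= zoom < 10:
        return cams[10], cams[1], cams[3]      # third camera (3x) aids multi-frame denoising
    if zoom >= 10:
        return cams[10], cams[3], None         # second camera: 10x, 3x, 1x or sub-1x
    raise ValueError("zoom below 1x is not covered by the rules above")
```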
Optionally, the optimizing unit 1312 is specifically configured to: acquiring a color correction matrix CCM matrix of at least two sub-images in a first image; the CCM matrix of the first sub-image is used for representing the mapping relation between the characteristic of the first sub-image and the characteristic of a second sub-image in the second image; the first sub-image and the second sub-image are images of the same object; the characteristic comprises at least one of color, contrast, or dynamic range; obtaining a CCM matrix of pixels in the first image based on the CCM matrices of the at least two sub-images; the CCM matrix of the first pixel is used for representing the mapping relation between the characteristic of the first pixel and the characteristic of a second pixel in the second image; the first pixel and the second pixel correspond to the same image content; a third image is derived based on the CCM matrix of pixels in the first image and the first image. For example, in conjunction with FIG. 6, optimization unit 1312 may be used to perform S43-S45.
Optionally, when the optimization unit 1312 executes obtaining the CCM matrix of at least two sub-images in the first image, it is specifically configured to: acquiring a CCM matrix of at least two sub-images using a first neural network; the first neural network is used for analyzing the characteristic and texture information of the first image and the characteristic and texture information of the second image to obtain a CCM matrix of at least two sub-images.
Optionally, the optimizing unit 1312 is specifically configured to: optimizing the first image by using a second neural network and the second image to obtain a third image; the second neural network is used for optimizing the image style of the image with poor image style by using the image with good image style.
For the detailed description of the above alternative modes, reference may be made to the foregoing method embodiments, which are not described herein again. In addition, for any explanation and beneficial effect description of the image processing apparatus 131 provided above, reference may be made to the corresponding method embodiment described above, and details are not repeated.
As an example, with reference to fig. 1, the functions of any one of the control unit 1311, the optimization unit 1312, the noise reduction unit 1313, and the selection unit 1314 may be implemented by the processor 110 calling the program code stored in the internal memory 121.
Another embodiment of the present application further provides a computer-readable storage medium, where instructions are stored in the computer-readable storage medium, and when the instructions are executed by a terminal, the terminal performs each step in the method flow shown in the foregoing method embodiment.
In some embodiments, the disclosed methods may be implemented as computer program instructions encoded on a computer-readable storage medium in a machine-readable format or encoded on other non-transitory media or articles of manufacture.
It should be understood that the arrangements described herein are for illustrative purposes only. Thus, those skilled in the art will appreciate that other arrangements and other elements (e.g., machines, interfaces, functions, orders, and groupings of functions, etc.) can be used instead, and that some elements may be omitted altogether depending upon the desired results. In addition, many of the described elements are functional entities that may be implemented as discrete or distributed components or in conjunction with other components, in any suitable combination and location.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When a software program is used, the implementation may take the form of a computer program product, in whole or in part. The computer program product includes one or more computer instructions. The processes or functions according to the embodiments of the present application are generated in whole or in part when the computer instructions are loaded and executed on a computer. The computer may be a general purpose computer, a special purpose computer, a computer network, or another programmable device. The computer instructions may be stored on a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example from one website, computer, server, or data center to another via a wired (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, radio, microwave) connection. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center integrating one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a Solid State Disk (SSD)), among others.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily think of the changes or substitutions within the technical scope of the present invention, and shall cover the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims (42)

1. An image processing method is applied to a terminal, the terminal comprises a first camera and a second camera, the magnification of the first camera is not less than that of the second camera, and the photosensitivity of the second camera is greater than that of the first camera, and the method comprises the following steps:
acquiring a first image for a first scene in a current shooting environment through the first camera;
when the chromaticity or brightness of the first image does not satisfy a desired condition,
acquiring a second image for the first scene by the second camera; wherein the second image has a higher chroma or luminance than the first image;
optimizing the first image according to the color or brightness of the second image to obtain a third image, wherein the texture information of the third image is the same as that of the first image;
optimizing the first image according to the color or brightness of the second image to obtain a third image, including:
optimizing the first image based on the CCM matrix of the pixels in the first image to obtain the third image, wherein the first image comprises first pixels, and the CCM matrix of the first pixels is used for representing the mapping relation between the characteristics of the first pixels and the characteristics of second pixels in the second image; the first pixel and the second pixel correspond to the same image content; the characteristic comprises at least one of color, contrast, or dynamic range; the CCM matrix of pixels in the first image is obtained by analyzing feature and texture information of the first image and feature and texture information of the second image.
2. The method of claim 1,
when the color of the second image is better than the color of the first image, the color of the third image is better than the color of the first image; or,
when the contrast of the second image is higher than that of the first image, the contrast of the third image is higher than that of the first image; or,
when the dynamic range of the second image is larger than that of the first image, the dynamic range of the third image is larger than that of the first image.
3. The method of claim 1, wherein the image content of the third image is the same as the image content of the first image.
4. The method of claim 1, wherein an aperture of the second camera is larger than an aperture of the first camera.
5. The method of claim 1, wherein an exposure time period during which the second camera acquires the second image is greater than an exposure time period during which the first camera acquires the first image.
6. The method of claim 1, wherein the ISO used when the second camera captures the second image is greater than the ISO used when the first camera captures the first image.
7. The method of claim 1,
the magnification range of the second camera comprises [0.5,1 ], and the magnification range of the first camera comprises [1,20];
alternatively, the magnification of the second camera is 1, and the magnification range of the first camera includes (1, 20).
8. The method of any one of claims 1 to 7, further comprising:
acquiring N frames of images for the first scene through the first camera, wherein N is an integer greater than or equal to 1;
performing multi-frame noise reduction according to the N frames of images and the first image to obtain a fourth image; wherein the image content of the fourth image is the same as the image content of the first image;
optimizing the first image according to the second image to obtain a third image, including:
and optimizing the fourth image according to the second image to obtain the third image.
9. The method according to any one of claims 1 to 7, further comprising:
acquiring N1 frames of images and N2 frames of images for the first scene through the first camera and the second camera respectively; wherein N1 and N2 are integers greater than or equal to 1;
performing multi-frame noise reduction according to the N1 frame image, the N2 frame image and the first image to obtain a fifth image; wherein the image content of the fifth image is the same as the image content of the first image;
optimizing the first image according to the second image to obtain a third image, including:
and optimizing the fifth image according to the second image to obtain the third image.
10. The method of claim 9, further comprising:
when the shooting magnification of the terminal for the first scene is [1, 3), selecting a camera with the magnification of 1 in the terminal as the first camera;
or when the shooting magnification of the terminal for the first scene is [3, 7), selecting a camera with the magnification of 3 in the terminal as the first camera;
or when the shooting magnification of the terminal for the first scene is larger than 10, selecting a camera with the magnification of 10 in the terminal as the first camera.
11. The method according to any one of claims 1 to 7, wherein the terminal further comprises a third camera, the third camera is a camera different from the first camera and the second camera in the terminal, and the magnification of the third camera is not greater than that of the first camera; the method further comprises the following steps:
acquiring N3 frames of images and N4 frames of images for the first scene through the first camera and the third camera respectively; wherein N3 and N4 are integers greater than or equal to 1;
performing multi-frame noise reduction according to the N3 frame image, the N4 frame image and the first image to obtain a sixth image; wherein the image content of the sixth image is the same as the image content of the first image;
optimizing the first image according to the second image to obtain a third image, including:
and optimizing the sixth image according to the second image to obtain the third image.
12. The method of claim 11, further comprising:
when the shooting magnification of the terminal for the first scene is [7,10 ], selecting a camera with the magnification of 10 in the terminal as the first camera, and selecting a camera with the magnification of 3 in the terminal as the third camera.
13. The method of claim 12, further comprising:
when the shooting magnification of the terminal for the first scene is [1, 3), selecting a camera with the magnification of 1 or less than 1 in the terminal as the second camera;
or when the shooting magnification of the terminal for the first scene is [3, 7), selecting a camera with the magnification of 3, 1 or less than 1 in the terminal as the second camera;
or when the shooting magnification of the terminal for the first scene is [7,10 ), selecting a camera with the magnification of 3, 1 or less than 1 in the terminal as the second camera;
or when the shooting magnification of the terminal for the first scene is larger than 10, selecting a camera with the magnification of 10, 3, 1 or smaller than 1 in the terminal as the second camera.
14. The method of any of claims 1-7, 10, and 12-13, wherein the optimizing the first image based on the color or brightness of the second image to obtain a third image further comprises:
acquiring a color correction matrix CCM matrix of at least two sub-images in the first image; the CCM matrix of the first sub-image is used for representing the mapping relation between the characteristics of the first sub-image and the characteristics of a second sub-image in the second image; the first sub-image and the second sub-image are images of the same object; the characteristic comprises at least one of color, contrast, or dynamic range;
obtaining a CCM matrix of pixels in the first image based on the CCM matrices of the at least two sub-images;
the acquiring a CCM matrix of at least two sub-images in the first image comprises:
acquiring a CCM matrix of the at least two sub-images using a first neural network; the first neural network is used for analyzing the characteristic and texture information of the first image and the characteristic and texture information of the second image to obtain a CCM matrix of the at least two sub-images.
15. The method of any of claims 1-7 and 12-13, wherein the chrominance or luminance of the first image not satisfying the expected condition comprises:
the color saturation of the first image does not reach a first preset threshold value; the value range of the first preset threshold is [0.6,0.8]; or,
the brightness of the first image does not reach a second preset threshold value; the value range of the second preset threshold is [80,100]; or,
the contrast of the first image does not reach a third preset threshold; the value range of the third preset threshold is [40,60]; or,
the dynamic range of the first image does not reach a fourth preset threshold; the value range of the fourth preset threshold is [6,8]; or,
the hue of the first image is different from the hue of the second image, and the confidence of the hue of the second image is higher than the confidence of the hue of the first image.
16. The method of any of claims 1-7, 10 and 12-13, wherein optimizing the first image from the second image to obtain a third image comprises:
optimizing the first image by using a second neural network and the second image to obtain a third image; the second neural network is used for carrying out image style optimization on the image with the poor image style by utilizing the image with the good image style.
17. An image processing apparatus, wherein the apparatus is applied to a terminal, the terminal further comprises a first camera and a second camera, a magnification of the first camera is not less than a magnification of the second camera, and a light sensing performance of the second camera is greater than a light sensing performance of the first camera, the apparatus comprising:
the control unit is used for controlling the first camera to acquire a first image aiming at a first scene in the current shooting environment and controlling the second camera to acquire a second image aiming at the first scene when the chromaticity or the brightness of the first image does not meet the expected condition; wherein the second image has a higher chroma or luminance than the first image;
the optimization unit is used for optimizing the first image according to the color or the brightness of the second image to obtain a third image, and the texture information of the third image is the same as that of the first image;
the optimization unit is further configured to: optimizing the first image based on the CCM matrix of the pixels in the first image to obtain the third image, wherein the first image comprises first pixels, and the CCM matrix of the first pixels is used for representing the mapping relation between the characteristics of the first pixels and the characteristics of second pixels in the second image; the first pixel and the second pixel correspond to the same image content; the characteristic comprises at least one of color, contrast, or dynamic range; the CCM matrix of pixels in the first image is obtained by analyzing feature and texture information of the first image and feature and texture information of the second image.
18. The apparatus of claim 17,
when the color of the second image is better than the color of the first image, the color of the third image is better than the color of the first image; or,
when the contrast of the second image is higher than that of the first image, the contrast of the third image is higher than that of the first image; or,
when the dynamic range of the second image is larger than that of the first image, the dynamic range of the third image is larger than that of the first image.
19. The apparatus of claim 17, wherein the image content of the third image is the same as the image content of the first image.
20. The apparatus of claim 17, wherein the aperture of the second camera is larger than the aperture of the first camera.
21. The apparatus of claim 17, wherein an exposure time period for the second camera to acquire the second image is longer than an exposure time period for the first camera to acquire the first image.
22. The apparatus of claim 17, wherein the ISO used when the second camera captures the second image is greater than the ISO used when the first camera captures the first image.
23. The apparatus of claim 17,
the magnification range of the second camera comprises [0.5,1 ], and the magnification range of the first camera comprises [1,20];
alternatively, the magnification of the second camera is 1, and the magnification range of the first camera includes (1, 20).
24. The apparatus of any one of claims 17 to 23,
the control unit is further configured to control the first camera to acquire N frames of images for the first scene, where N is an integer greater than or equal to 1;
the device further comprises: the noise reduction unit is used for carrying out multi-frame noise reduction according to the N frames of images and the first image to obtain a fourth image; wherein the image content of the fourth image is the same as the image content of the first image;
the optimization unit is specifically configured to optimize the fourth image according to the second image to obtain the third image.
25. The apparatus of any one of claims 17 to 23,
the control unit is further configured to control the first camera and the second camera to acquire N1 frames of images and N2 frames of images respectively for the first scene; wherein N1 and N2 are integers greater than or equal to 1;
the device further comprises: the noise reduction unit is used for carrying out multi-frame noise reduction according to the N1 frame image, the N2 frame image and the first image to obtain a fifth image; wherein the image content of the fifth image is the same as the image content of the first image;
the optimization unit is specifically configured to optimize the fifth image according to the second image to obtain the third image.
26. The apparatus according to claim 25, further comprising a selection unit configured to:
when the shooting magnification of the terminal for the first scene is [1, 3), selecting a camera with the magnification of 1 in the terminal as the first camera;
or when the shooting magnification of the terminal for the first scene is [3, 7), selecting a camera with the magnification of 3 in the terminal as the first camera;
or when the shooting magnification of the terminal for the first scene is larger than 10, selecting a camera with the magnification of 10 in the terminal as the first camera.
27. The apparatus according to any one of claims 17 to 23, wherein the terminal further comprises a third camera, the third camera is a camera of the terminal different from the first camera and the second camera, and a magnification of the third camera is not greater than a magnification of the first camera;
the control unit is further configured to control the first camera and the third camera to acquire N3 frames of images and N4 frames of images respectively for the first scene; wherein N3 and N4 are integers greater than or equal to 1;
the device further comprises: the noise reduction unit is used for carrying out multi-frame noise reduction according to the N3 frame image, the N4 frame image and the first image to obtain a sixth image; wherein the image content of the sixth image is the same as the image content of the first image;
the optimization unit is specifically configured to optimize the sixth image according to the second image to obtain the third image.
28. The apparatus according to claim 27, further comprising a selection unit configured to:
when the shooting magnification of the terminal for the first scene is [7,10 ], selecting a camera with the magnification of 10 in the terminal as the first camera, and selecting a camera with the magnification of 3 in the terminal as the third camera.
29. The apparatus of claim 28, wherein the selection unit is further configured to:
when the shooting magnification of the terminal for the first scene is [1, 3), selecting a camera with the magnification of 1 or less than 1 in the terminal as the second camera;
or when the shooting magnification of the terminal for the first scene is [3, 7), selecting a camera with the magnification of 3, 1 or less than 1 in the terminal as the second camera;
or when the shooting magnification of the terminal for the first scene is [7,10 ), selecting a camera with the magnification of 3, 1 or less than 1 in the terminal as the second camera;
or when the shooting magnification of the terminal for the first scene is larger than 10, selecting a camera with the magnification of 10, 3, 1 or smaller than 1 in the terminal as the second camera.
30. The apparatus according to any one of claims 17 to 23, 26 and 28, wherein the optimization unit is further configured to:
acquiring a color correction matrix CCM matrix of at least two sub-images in the first image; the CCM matrix of the first sub-image is used for representing the mapping relation between the characteristics of the first sub-image and the characteristics of a second sub-image in the second image; the first sub-image and the second sub-image are images of the same object; the characteristic comprises at least one of color, contrast, or dynamic range;
obtaining a CCM matrix of pixels in the first image based on the CCM matrices of the at least two sub-images;
wherein the optimization unit is further specifically configured to: acquiring a CCM matrix of the at least two sub-images using a first neural network; the first neural network is used for analyzing the characteristic and texture information of the first image and the characteristic and texture information of the second image to obtain a CCM matrix of the at least two sub-images.
31. The apparatus of any one of claims 17 to 23, 26 and 28,
the chrominance or luminance of the first image not satisfying the expected condition comprises:
the color saturation of the first image does not reach a first preset threshold value; the value range of the first preset threshold is [0.6,0.8]; or,
the brightness of the first image does not reach a second preset threshold value; the value range of the second preset threshold is [80,100]; or,
the contrast of the first image does not reach a third preset threshold; the value range of the third preset threshold is [40,60]; or,
the dynamic range of the first image does not reach a fourth preset threshold; the value range of the fourth preset threshold is [6,8]; or,
the hue of the first image is different from the hue of the second image, and the confidence of the hue of the second image is higher than the confidence of the hue of the first image.
32. The apparatus of any one of claims 17 to 23, 26 and 28,
the optimization unit is specifically configured to: optimizing the first image by using a second neural network and the second image to obtain a third image; the second neural network is used for carrying out image style optimization on the image with the poor image style by utilizing the image with the good image style.
33. A terminal, comprising: a processor, a memory, and at least two cameras for capturing images, the memory for storing computer programs and instructions, the processor for invoking the computer programs and instructions to perform the method of any of claims 1-16 in cooperation with the at least two cameras.
34. An image processing method is applied to a terminal, wherein the terminal comprises a target camera, and the method comprises the following steps:
under the current shooting parameters, acquiring a first image aiming at a first scene in the current shooting environment through the target camera;
when the color or brightness of the first image does not satisfy a preset condition,
adjusting the ISO or exposure time of the current shooting parameter, and acquiring a second image aiming at the first scene through the target camera under the adjusted shooting parameter; wherein the second image is of a better color or brightness than the first image;
optimizing the first image according to the color or brightness of the second image to obtain a third image, wherein the texture information of the third image is the same as that of the first image;
optimizing the first image according to the color or brightness of the second image to obtain a third image, including:
optimizing the first image based on the CCM matrix of the pixels in the first image to obtain the third image, wherein the first image comprises the first pixels, and the CCM matrix of the first pixels is used for representing the mapping relation between the characteristics of the first pixels and the characteristics of the second pixels in the second image; the first pixel and the second pixel correspond to the same image content; the characteristic comprises at least one of color, contrast, or dynamic range; the CCM matrix of pixels in the first image is obtained by analyzing feature and texture information of the first image and feature and texture information of the second image.
35. The method of claim 34,
when the color of the second image is better than the color of the first image, the color of the third image is better than the color of the first image; or,
when the contrast of the second image is higher than that of the first image, the contrast of the third image is higher than that of the first image; or,
when the dynamic range of the second image is larger than that of the first image, the dynamic range of the third image is larger than that of the first image;
wherein the image content of the third image is the same as the image content of the first image.
36. The method of claim 34,
the chrominance or luminance of the first image not meeting a desired condition comprises:
the color saturation of the first image does not reach a first preset threshold value; the value range of the first preset threshold is [0.6,0.8]; or,
the brightness of the first image does not reach a second preset threshold value; the value range of the second preset threshold is [80,100]; or,
the contrast of the first image does not reach a third preset threshold; the value range of the third preset threshold is [40,60]; or,
the dynamic range of the first image does not reach a fourth preset threshold; the value range of the fourth preset threshold is [6,8]; or,
the hue of the first image is different from the hue of the second image, and the confidence of the hue of the second image is higher than the confidence of the hue of the first image.
37. The method of any of claims 34-36, wherein said optimizing said first image from said second image, resulting in a third image, comprises:
optimizing the first image by using a second neural network and the second image to obtain a third image; the second neural network is used for carrying out image style optimization on the image with the poor image style by utilizing the image with the good image style.
38. An image processing device is characterized in that the device is applied to a terminal, and the terminal comprises a target camera; the device comprises:
the control unit is used for acquiring a first image aiming at a first scene in the current shooting environment through the target camera under the current shooting parameters; when the color or the brightness of the first image does not meet preset conditions, the ISO or the exposure time of the current shooting parameters is adjusted upwards, and a second image is collected for the first scene through the target camera under the adjusted shooting parameters; wherein the second image is of a better color or brightness than the first image;
the optimization unit is used for optimizing the first image according to the color or the brightness of the second image to obtain a third image, and the texture information of the third image is the same as that of the first image;
the optimization unit is further configured to: optimizing the first image based on the CCM matrix of the pixels in the first image to obtain the third image, wherein the first image comprises first pixels, and the CCM matrix of the first pixels is used for representing the mapping relation between the characteristics of the first pixels and the characteristics of second pixels in the second image; the first pixel and the second pixel correspond to the same image content; the characteristic comprises at least one of color, contrast, or dynamic range; the CCM matrix of pixels in the first image is obtained by analyzing feature and texture information of the first image and feature and texture information of the second image.
39. The apparatus of claim 38,
when the color of the second image is better than the color of the first image, the color of the third image is better than the color of the first image; or,
when the contrast of the second image is higher than that of the first image, the contrast of the third image is higher than that of the first image; or,
when the dynamic range of the second image is larger than that of the first image, the dynamic range of the third image is larger than that of the first image;
wherein the image content of the third image is the same as the image content of the first image.
40. The apparatus of claim 38,
wherein the color or brightness of the first image not meeting the preset condition comprises:
the color saturation of the first image does not reach a first preset threshold; the value range of the first preset threshold is [0.6, 0.8]; or,
the brightness of the first image does not reach a second preset threshold; the value range of the second preset threshold is [80, 100]; or,
the contrast of the first image does not reach a third preset threshold; the value range of the third preset threshold is [40, 60]; or,
the dynamic range of the first image does not reach a fourth preset threshold; the value range of the fourth preset threshold is [6, 8]; or,
the hue of the first image is different from the hue of the second image, and the confidence of the hue of the second image is higher than the confidence of the hue of the first image.
41. The apparatus according to any of claims 38-40, wherein the optimization unit is specifically configured to: optimize the first image by using a second neural network and the second image to obtain the third image; the second neural network is used to optimize the image style of an image with a poorer image style by using an image with a better image style.
42. A terminal, comprising: a processor, a memory, and a target camera for capturing images, the memory for storing computer programs and instructions, the processor for invoking the computer programs and instructions to perform the method of any of claims 34-37 in cooperation with the target camera.
CN202110189448.2A 2020-03-26 2021-02-19 Image processing method and device Active CN113452969B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010226081 2020-03-26
CN2020102260812 2020-03-26

Publications (2)

Publication Number Publication Date
CN113452969A CN113452969A (en) 2021-09-28
CN113452969B true CN113452969B (en) 2023-03-24

Family

ID=77808811

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110189448.2A Active CN113452969B (en) 2020-03-26 2021-02-19 Image processing method and device

Country Status (1)

Country Link
CN (1) CN113452969B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116347212B (en) * 2022-08-05 2024-03-08 荣耀终端有限公司 Automatic photographing method and electronic equipment
CN117745622A (en) * 2024-02-21 2024-03-22 深圳市盘古环保科技有限公司 Garbage leachate membrane concentrate catalytic oxidation device

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105635568B (en) * 2015-12-25 2019-09-17 青岛海信移动通信技术股份有限公司 Image processing method and mobile terminal in a kind of mobile terminal
CN105827964B (en) * 2016-03-24 2019-05-17 维沃移动通信有限公司 A kind of image processing method and mobile terminal
TWI581632B (en) * 2016-06-23 2017-05-01 國立交通大學 Image generating method and image capturing device
CN107507250B (en) * 2017-06-02 2020-08-21 北京工业大学 Surface color and tongue color image color correction method based on convolutional neural network
CN107770438B (en) * 2017-09-27 2019-11-29 维沃移动通信有限公司 A kind of photographic method and mobile terminal
CN110197463B (en) * 2019-04-25 2023-01-03 深圳大学 High dynamic range image tone mapping method and system based on deep learning

Also Published As

Publication number Publication date
CN113452969A (en) 2021-09-28

Similar Documents

Publication Publication Date Title
WO2020168956A1 (en) Method for photographing the moon and electronic device
CN113364971B (en) Image processing method and device
CN111327814A (en) Image processing method and electronic equipment
CN109559270B (en) Image processing method and electronic equipment
CN113747085B (en) Method and device for shooting video
WO2021078001A1 (en) Image enhancement method and apparatus
CN112532892B (en) Image processing method and electronic device
CN112887583A (en) Shooting method and electronic equipment
CN110602403A (en) Method for taking pictures under dark light and electronic equipment
CN113542580B (en) Method and device for removing light spots of glasses and electronic equipment
CN113810603B (en) Point light source image detection method and electronic equipment
WO2023015991A1 (en) Photography method, electronic device, and computer readable storage medium
CN115689963B (en) Image processing method and electronic equipment
CN113747058B (en) Image content shielding method and device based on multiple cameras
CN113452969B (en) Image processing method and device
CN113170037A (en) Method for shooting long exposure image and electronic equipment
CN116348917A (en) Image processing method and device
CN115115679A (en) Image registration method and related equipment
WO2022057384A1 (en) Photographing method and device
CN112150499A (en) Image processing method and related device
US20230014272A1 (en) Image processing method and apparatus
CN114283195B (en) Method for generating dynamic image, electronic device and readable storage medium
CN114079725B (en) Video anti-shake method, terminal device, and computer-readable storage medium
CN114463191A (en) Image processing method and electronic equipment
WO2024078275A1 (en) Image processing method and apparatus, electronic device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant