CN116012262A - Image processing method, model training method and electronic device

Info

Publication number
CN116012262A
Authority
CN
China
Prior art keywords
image
parameter
pixel point
vector
model
Prior art date
Legal status
Granted
Application number
CN202310222718.4A
Other languages
Chinese (zh)
Other versions
CN116012262B (en)
Inventor
陈彬
Current Assignee
Honor Device Co Ltd
Original Assignee
Honor Device Co Ltd
Priority date
Filing date
Publication date
Application filed by Honor Device Co Ltd
Priority to CN202310222718.4A
Publication of CN116012262A
Application granted
Publication of CN116012262B
Legal status: Active

Landscapes

  • Image Processing (AREA)

Abstract

The application provides an image processing method, a model training method and an electronic device, and relates to the field of image technologies. The scheme reduces the computation required for real-time denoising and deblurring. The specific scheme is as follows: in response to a first operation indicating that an image is to be acquired, a first interface is displayed; during display of the first interface, a first original RAW image is acquired, where the first RAW image is image data collected by an image sensor in the electronic device; after noise is removed from the first RAW image, the first RAW image is converted into a first image, where the first image is an RGB image; a first parameter is predicted according to a first model and the first image, where the first parameter includes a first vector corresponding to a first pixel point, and the first vector is the predicted position offset vector of the first pixel point before and after image blur is removed from the first image; after the first parameter is determined, the position of the first pixel point in the first image is moved to obtain a second image with the image blur removed; and the second image is refreshed into the first interface.

Description

Image processing method, model training method and electronic device
Technical Field
The present disclosure relates to the field of image technologies, and in particular, to an image processing method, a model training method, and an electronic device.
Background
Shooting is a way for users to record their lives. However, most users lack extensive shooting experience, so captured image data inevitably suffers from problems such as noise and blurring; the worse the ambient lighting conditions, the more severe the noise and blurring in the captured image data, and the harder they are to remove.
In the related art, an image processing model with a complex structure has to be adopted to denoise and deblur image data in a single step. However, such an image processing model is large, requires substantial system resources to run, and cannot be deployed directly on electronic devices with limited system resources, so those devices cannot deblur and denoise captured image data in real time.
Disclosure of Invention
The embodiments of the present application provide an image processing method, a model training method and an electronic device, in which image noise is removed in the RAW domain and image blur is removed in the RGB domain. This lowers the task difficulty of the image processing model and simplifies its structure, so the model can be deployed in a wide range of electronic devices, alleviating the difficulty that electronic devices with limited system resources have in denoising and deblurring captured image data in real time.
In order to achieve the above purpose, the embodiments of the present application adopt the following technical solutions:
In a first aspect, an image processing method provided in an embodiment of the present application is applied to an electronic device in which a first model is configured, where the first model is used to predict how each pixel point of a same frame of color-mode (RGB) image changes position before and after image blur is removed. The method includes: in response to a first operation indicating that an image is to be acquired, displaying a first interface; during display of the first interface, acquiring a first original RAW image, where the first RAW image is image data collected by an image sensor in the electronic device; after removing noise from the first RAW image, converting the first RAW image into a first image, where the first image is the RGB image; predicting a first parameter according to the first model and the first image, where the first parameter includes a first vector corresponding to a first pixel point, and the first vector is the predicted position offset vector of the first pixel point before and after image blur is removed from the first image; after the first parameter is determined, moving the position of the first pixel point in the first image to obtain a second image with the image blur removed; and refreshing the second image into the first interface.
In the above embodiment, on the one hand, noise is eliminated in the RAW domain and deblurring is then performed in the RGB domain, which reduces the difficulty of handling image blur and noise, increases processing speed, and benefits scenarios where acquired images must be processed in real time.
On the other hand, after using the first model to predict the position offsets (i.e., the first parameter) of the pixel points in the first image before and after image blur is removed, the electronic device can eliminate the image blur simply by moving those pixel points within the first image. That is, the process is simple to implement while still ensuring that the image blur can be removed. Compared with removing image blur and noise in a single step with one image processing model, the computation load is markedly reduced and the deblurring process occupies fewer system resources, which alleviates the difficulty that electronic devices with limited system resources have in denoising and deblurring captured image data in real time.
In some embodiments, a second model for eliminating noise in RAW images is configured in the electronic device, and before converting the first RAW image into the first image, the method further includes: eliminating the noise in the first RAW image by using the second model.
In the above embodiment, exploiting the property that noise in the RAW domain is linear, the second model directly eliminates the noise in the first RAW image. The implementation is simple, and a lightweight second model can complete noise elimination effectively, which both reduces the system resources the second model occupies and lowers the difficulty of the subsequent blur elimination.
In some embodiments, moving the position of the first pixel point in the first image includes: querying the first vector corresponding to the first pixel point in the first parameter; and moving the first pixel point in the first image according to the first vector.
In the above embodiment, the first pixel point is moved in the first image according to the predicted first parameter, thereby effectively removing the image blur.
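The following is a minimal NumPy sketch of this per-pixel move; the function name warp_by_flow, the H x W x 2 flow layout and the (dy, dx) vector convention are illustrative assumptions, not details fixed by this application.

    import numpy as np

    def warp_by_flow(image: np.ndarray, flow: np.ndarray) -> np.ndarray:
        """Move every pixel of `image` (H x W x C) by its offset vector in
        `flow` (H x W x 2, stored as (dy, dx)), as a rough stand-in for the
        pixel-moving step described above; nearest-neighbor placement."""
        h, w = image.shape[:2]
        out = np.zeros_like(image)
        ys, xs = np.mgrid[0:h, 0:w]
        # Target position of each pixel = source position + predicted offset.
        ty = np.clip(np.rint(ys + flow[..., 0]).astype(int), 0, h - 1)
        tx = np.clip(np.rint(xs + flow[..., 1]).astype(int), 0, w - 1)
        out[ty, tx] = image[ys, xs]
        return out

A production warp module would more likely sample backward with bilinear interpolation to avoid holes; the forward scatter above simply mirrors the "move the pixel point by its vector" wording.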
In some embodiments, before moving the position of the first pixel point in the first image, the method includes: determining a second parameter, where the second parameter includes a second vector corresponding to the first pixel point, and the second vector is the position offset vector of the first pixel point before and after image jitter is eliminated from the first image; and fusing the first parameter and the second parameter to determine a third parameter, where the third parameter includes a third vector corresponding to the first pixel point. Moving the position of the first pixel point in the first image then includes: querying the third vector corresponding to the first pixel point in the third parameter; and moving the first pixel point in the first image according to the third vector.
In the above embodiment, the first pixel point is moved in the first image according to the fused third parameter, so that image blur is removed and image jitter is eliminated in the same pass. Deforming the first image only once avoids occupying system resources repeatedly and simplifies the image processing.
In some embodiments, fusing the first parameter and the second parameter to determine the third parameter includes: querying the first vector corresponding to the first pixel point in the first parameter, and querying the second vector corresponding to the first pixel point in the second parameter; and superposing the first vector and the second vector to obtain the third vector corresponding to the first pixel point.
In other embodiments, fusing the first parameter and the second parameter to determine the third parameter includes: querying the first vector corresponding to the first pixel point in the first parameter, and querying the second vector corresponding to the first pixel point in the second parameter; superposing the first vector and the second vector to obtain a fourth vector; and calibrating the fourth vector according to preconfigured calibration parameters to obtain the third vector corresponding to the first pixel point, where the calibration parameters include a rotation angle and a translation distance.
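As a concrete reading of "superposing" and "calibrating", the sketch below adds the two offset fields element-wise and then applies a rotation angle and a translation distance to each fused vector. The function name, the (dy, dx) layout, and treating the calibration as a rigid 2-D transform are assumptions made for illustration.

    import numpy as np

    def fuse_flows(deblur_flow: np.ndarray, jitter_flow: np.ndarray,
                   angle_rad: float = 0.0,
                   shift: tuple = (0.0, 0.0)) -> np.ndarray:
        """Superpose the deblurring offsets (first parameter) with the
        de-jitter offsets (second parameter), then calibrate the summed
        vectors with a rotation angle and a translation distance."""
        fused = deblur_flow + jitter_flow            # vector superposition
        c, s = np.cos(angle_rad), np.sin(angle_rad)
        rot = np.array([[c, -s], [s, c]])            # 2x2 rotation matrix
        fused = fused @ rot.T                        # rotate each offset vector
        return fused + np.asarray(shift)             # translate each vector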
In a second aspect, an image processing method provided in an embodiment of the present application is applied to an electronic device in which a third model and a fourth model are configured. The third model is used to eliminate noise in a RAW image and to predict how each pixel point of a same frame of RAW image changes position before and after image blur is removed, and the fourth model is used to predict how each pixel point of a same frame of RGB image changes position before and after image blur is removed. The method includes: in response to a second operation indicating that an image is to be acquired, displaying a second interface; during display of the second interface, acquiring a second RAW image, where the second RAW image is an image collected by an image sensor in the electronic device; determining a noise-free third RAW image and predicting a fourth parameter according to the third model and the second RAW image, where the fourth parameter includes a fifth vector corresponding to a second pixel point, and the fifth vector is the predicted position offset vector of the second pixel point before and after image blur is removed from the third RAW image; converting the third RAW image into a third image, where the third image is the RGB image; predicting a fifth parameter according to the fourth model, the fourth parameter and the third image, where the fifth parameter includes a sixth vector corresponding to the second pixel point, and the sixth vector is the predicted position offset vector of the second pixel point before and after image blur is removed from the third image; after the fifth parameter is determined, moving the position of the second pixel point in the third image to obtain a fourth image with the image blur removed; and refreshing the fourth image into the second interface.
In the above embodiment, the third model computes the RAW-domain deblurring intermediate parameter (i.e., the fourth parameter) while removing image noise in the RAW domain, reusing the model's computing capacity to the greatest extent. In addition, the RGB image may be downsampled in the RGB domain, so that the fourth model, combined with the fourth parameter, needs less computation to predict the fifth parameter. This makes a noise- and blur-removal network model easier to apply on electronic devices with limited system resources.
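The data flow of this second pipeline can be summarized in the short sketch below; every callable is a placeholder assumed for illustration, since the application does not define any programming interface.

    # Hypothetical sketch of the second-aspect pipeline; all names are assumed.
    def process_frame_v2(raw2, third_model, fourth_model, isp, warp):
        # Third model: denoise in the RAW domain and, in the same pass,
        # predict a RAW-domain offset field (the fourth parameter).
        raw3, raw_flow = third_model(raw2)
        rgb3 = isp(raw3)                           # RAW -> RGB conversion
        # The fourth model reuses the RAW-domain flow as a prior, so it can
        # work from a (possibly downsampled) RGB image with little compute.
        rgb_flow = fourth_model(rgb3, raw_flow)    # the fifth parameter
        return warp(rgb3, rgb_flow)                # deblurred fourth image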
In some embodiments, moving the position of the second pixel point in the third image includes: querying the sixth vector corresponding to the second pixel point in the fifth parameter; and moving the second pixel point in the third image according to the sixth vector to obtain the fourth image.
In some embodiments, before moving the position of the second pixel point in the third image, the method includes: determining a sixth parameter, where the sixth parameter includes a seventh vector corresponding to the second pixel point, and the seventh vector is the predicted position offset vector of the second pixel point before and after image jitter is eliminated from the third image; and fusing the sixth parameter and the fifth parameter to determine a seventh parameter, where the seventh parameter includes an eighth vector corresponding to the second pixel point. Moving the position of the second pixel point in the third image then includes: querying the eighth vector corresponding to the second pixel point in the seventh parameter; and moving the second pixel point in the third image according to the eighth vector to obtain the fourth image.
In some embodiments, fusing the sixth parameter and the fifth parameter to determine the seventh parameter includes: querying the sixth vector corresponding to the second pixel point in the fifth parameter, and querying the seventh vector corresponding to the second pixel point in the sixth parameter; and superposing the sixth vector and the seventh vector to obtain the eighth vector corresponding to the second pixel point in the seventh parameter.
In other embodiments, fusing the sixth parameter and the fifth parameter to determine the seventh parameter includes: querying the sixth vector corresponding to the second pixel point in the fifth parameter, and querying the seventh vector corresponding to the second pixel point in the sixth parameter; superposing the sixth vector and the seventh vector to obtain a ninth vector; and calibrating the ninth vector according to preconfigured calibration parameters to obtain the eighth vector corresponding to the second pixel point in the seventh parameter, where the calibration parameters include a rotation angle and a translation distance.
In a third aspect, an embodiment of the present application provides a model training method. The method includes: acquiring training sample data, where the training sample data includes a first sample image and a second sample image, the second sample image is image data obtained by eliminating image blur from the first sample image, and both are RGB images; processing the first sample image with a preconfigured initial model to obtain an eighth parameter, where the eighth parameter includes a tenth vector corresponding to a third pixel point, and the tenth vector is the predicted position offset vector of the third pixel point before and after image blur is removed from the first sample image; moving the third pixel point in the first sample image according to the tenth vector in the eighth parameter to obtain a first RGB image; iterating the model parameters of the initial model according to the difference between the second sample image and the first RGB image; and after the initial model is trained to convergence, obtaining the first model.
In a fourth aspect, an embodiment of the present application provides a model training method. The method includes: acquiring training sample data, where the training sample data includes a third sample image and a fourth sample image, the fourth sample image is image data obtained by removing noise and image blur from the third sample image, and both are RAW images; processing the third sample image with a preconfigured initial model to obtain a fourth RAW image and a ninth parameter, where the ninth parameter includes an eleventh vector corresponding to a fourth pixel point, and the eleventh vector is the predicted position offset vector of the fourth pixel point before and after image blur is removed from the fourth RAW image; moving the fourth pixel point in the fourth RAW image according to the eleventh vector in the ninth parameter to obtain a sixth sample image; iterating the model parameters of the initial model according to the difference between the sixth sample image and the fourth sample image; and after the initial model is trained to convergence, obtaining the third model.
In a fifth aspect, an embodiment of the present application provides a model training method. The method includes: acquiring training sample data, where the training sample data includes a third sample image and a fifth sample image, the third sample image is a RAW image, and the fifth sample image is the RGB image converted from the third sample image after noise and image blur are removed; processing the third sample image with a preconfigured third model to obtain a fifth RAW image and a tenth parameter, where the third model is used to eliminate noise in a RAW image and to predict how each pixel point of a same frame of RAW image changes position before and after image blur is removed, the fifth RAW image is noise-free image data, the tenth parameter includes a twelfth vector corresponding to a fifth pixel point, and the twelfth vector is the predicted position offset vector of the fifth pixel point before and after image blur is removed from the fifth RAW image; after converting the fifth RAW image into a second RGB image, processing the second RGB image and the tenth parameter with a preconfigured initial model to obtain an eleventh parameter, where the eleventh parameter includes a thirteenth vector corresponding to a sixth pixel point, and the thirteenth vector is the predicted position offset vector of the sixth pixel point before and after image blur is removed from the second RGB image; moving the sixth pixel point in the second RGB image according to the thirteenth vector in the eleventh parameter to obtain a third RGB image; iterating the model parameters of the initial model according to the difference between the third RGB image and the fifth sample image; and after the initial model is trained to convergence, obtaining the fourth model.
In a sixth aspect, an electronic device provided in an embodiment of the present application includes one or more processors and a memory. The memory is coupled to the processor and stores computer program code, the computer program code including computer instructions which, when executed by the one or more processors, cause the electronic device to perform the methods of the first through fifth aspects and their possible embodiments.
In a seventh aspect, embodiments of the present application provide a computer storage medium including computer instructions that, when executed on an electronic device, cause the electronic device to perform the method in the first aspect, the second aspect, the third aspect, the fourth aspect, the fifth aspect, and possible embodiments thereof.
In an eighth aspect, the present application provides a computer program product for, when run on an electronic device as described above, causing the electronic device to perform the method of the first, second, third, fourth, fifth aspects and possible embodiments thereof as described above.
It will be appreciated that the electronic device, the computer storage medium and the computer program product provided in the above aspects all correspond to the methods provided above; therefore, for the beneficial effects they can achieve, reference may be made to the beneficial effects of the corresponding methods, which are not repeated here.
Drawings
Fig. 1 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application;
Fig. 2 is a schematic diagram of a software and hardware structure of an electronic device according to an embodiment of the present application;
Fig. 3 is a first schematic diagram of an image processing principle according to an embodiment of the present application;
Fig. 4 is an exemplary diagram of the physical meaning of a position offset vector according to an embodiment of the present application;
Fig. 5 is a first flowchart of the steps of an image processing method according to an embodiment of the present application;
Fig. 6 is an exemplary diagram of a morphing (warp) process on an RGB image according to an embodiment of the present application;
Fig. 7 is a second schematic diagram of an image processing principle according to an embodiment of the present application;
Fig. 8 is a second flowchart of the steps of an image processing method according to an embodiment of the present application;
Fig. 9 is a third schematic diagram of an image processing principle according to an embodiment of the present application;
Fig. 10 is a third flowchart of the steps of an image processing method according to an embodiment of the present application;
Fig. 11 is a fourth schematic diagram of an image processing principle according to an embodiment of the present application;
Fig. 12 is an exemplary diagram of a chip system according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application. In the description of the present application, unless otherwise indicated, "at least one" means one or more, and "a plurality" means two or more. In addition, to describe the technical solutions of the embodiments clearly, words such as "first" and "second" are used to distinguish identical or similar items having substantially the same function and effect. Those skilled in the art will appreciate that the words "first", "second" and the like do not limit quantity or execution order, and do not necessarily indicate different items.
Shooting is a way for users to record their lives. However, most users lack extensive shooting experience, and captured image data inevitably suffers from problems such as image noise and image blur. These image problems affect not only the user's evaluation of image quality but also the user's evaluation of the shooting device (i.e., the electronic device that captured the image data).
Consequently, preprocessing captured image data, for example denoising and deblurring it, is a basic requirement for electronic devices.
In some embodiments, an image processing model for removing noise and removing blur can be configured in the electronic device. Thus, after the electronic device collects the image data, the image processing model can be utilized in real time to perform denoising and deblurring processing on the collected image data. The specific process can be as follows:
After an original (RAW) image is acquired by the image sensor of the electronic device, the RAW image may be converted into a color-mode (RGB) image using an image signal processor (ISP). An RGB image is formed by superimposing the three primary colors (i.e., red, green and blue); the principle of converting a RAW image into an RGB image may refer to the related art and is not described here.
Then, the RGB image is input into an image processing model. The image processing model may output noise-and blur-removed image data after processing the RGB image.
Of course, in different shooting environments, even with the same electronic device, the noise and blur levels of the captured image differ. For example, in a scene with lower light, the noise and blur of a captured image are harder to remove, and for such images an image processing model with a more complex structure is necessary.
It will be appreciated that the more complex the structure of the image processing model, the more storage resources are needed to deploy it and the more system resources (including memory resources and computing resources) are needed to run it.
However, once such a complex image processing model is configured on an electronic device with limited system resources (such as a mobile phone, tablet computer or other mobile device), the large amount of system resources the model occupies directly affects the device's other normal services.
To address the above problems, an embodiment of the present application provides an image processing method applicable to an electronic device with a shooting function. After the image sensor acquires a RAW image, noise is first removed from it in the RAW domain (for example, by the lightweight denoising model described below). Then, the ISP converts the denoised RAW image into an RGB image, yielding a noise-free RGB image. Next, a lightweight deblurring model predicts the intermediate parameters corresponding to the RGB image. The intermediate parameters are a predicted set of matrix parameters that indicate the difference of the same RGB image before and after blur removal. In this way, the electronic device can use the RGB image and its corresponding intermediate parameters to obtain a deblurred RGB image, i.e., image data with both noise and blur removed.
Because noise in a RAW image is linear, it is comparatively easy to remove; therefore, in this embodiment, a simple, lightweight denoising model can complete the denoising task effectively.
In addition, a noise-free RAW image, once converted into an RGB image, has clearer display-content outlines than a noisy one, which reduces the difficulty of blur elimination to a certain extent. Meanwhile, during blur removal the deblurring model only needs to predict the difference between the RGB image before deblurring and the RGB image after deblurring; it does not need to compute the actual deblurred and denoised image content. This lowers the task difficulty of the deblurring model, so the deblurring model can also be a lightweight model.
In summary, only two lightweight neural network models need to be configured in the electronic device, and they complete the denoising and deblurring of image data in cascade (that is, denoising in the RAW domain and deblurring in the RGB domain). On the premise of achieving ideal denoising and deblurring effects, the system resources occupied in the electronic device by the image processing models (the denoising model and the deblurring model) are effectively reduced.
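Put together, the cascade can be sketched as follows; denoise_model, isp, deblur_model and warp_by_flow stand in for the two lightweight networks, the ISP and the warp module, and are assumptions for illustration only.

    # Minimal sketch of the cascaded pipeline: denoise in RAW, deblur in RGB.
    def process_frame(raw_image, denoise_model, isp, deblur_model, warp_by_flow):
        clean_raw = denoise_model(raw_image)  # RAW-domain denoising
        rgb = isp(clean_raw)                  # RAW -> RGB (noise-free, still blurred)
        flow = deblur_model(rgb)              # intermediate parameters: per-pixel offsets
        return warp_by_flow(rgb, flow)        # deformed, deblurred RGB frame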
By way of example, the electronic device in the embodiments of the present application may be a mobile phone, a tablet computer, a smart watch, a desktop device, a laptop device, a handheld computer, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a cellular phone, a personal digital assistant (PDA), an augmented reality (AR)/virtual reality (VR) device, or a device including a plurality of cameras; the embodiments of the present application do not limit the specific form of the electronic device.
Referring to fig. 1, a schematic structural diagram of an electronic device 100 according to an embodiment of the present application is provided. As shown in fig. 1, the electronic device 100 may include: processor 110, external memory interface 120, internal memory 121, universal serial bus (universal serial bus, USB) interface 130, charge management module 140, power management module 141, battery 142, antenna 1, antenna 2, mobile communication module 150, wireless communication module 160, audio module 170, speaker 170A, receiver 170B, microphone 170C, headset interface 170D, sensor module 180, keys 190, motor 191, indicator 192, camera 193, display 194, and subscriber identity module (subscriber identification module, SIM) card interface 195, etc.
The sensor module 180 may include a pressure sensor, a gyroscope sensor, a barometric sensor, a magnetic sensor, an acceleration sensor, a distance sensor, a proximity sensor, a fingerprint sensor, a temperature sensor, a touch sensor, an ambient light sensor, a bone conduction sensor, and the like.
It is to be understood that the structure illustrated in this embodiment does not constitute a specific limitation on the electronic device 100. In other embodiments, the electronic device 100 may include more or fewer components than shown, or certain components may be combined, or certain components may be split, or the components may be arranged differently. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The processor 110 may include one or more processing units, such as: the processor 110 may include an application processor (application processor, AP), a modem processor, a graphics processor (graphics processing unit, GPU), an image signal processor (image signal processor, ISP), a controller, a memory, a video codec, a digital signal processor (digital signal processor, DSP), a baseband processor, and/or a neural network processor (neural-network processing unit, NPU), etc. Wherein the different processing units may be separate devices or may be integrated in one or more processors.
The controller may be a neural hub and command center of the electronic device 100. The controller can generate operation control signals according to the instruction operation codes and the time sequence signals to finish the control of instruction fetching and instruction execution.
A memory may also be provided in the processor 110 for storing instructions and data. In some embodiments, the memory in the processor 110 is a cache memory. The memory may hold instructions or data that the processor 110 has just used or recycled. If the processor 110 needs to reuse the instruction or data, it can be called directly from the memory. Repeated accesses are avoided and the latency of the processor 110 is reduced, thereby improving the efficiency of the system.
In some embodiments, the processor 110 may include one or more interfaces. The interfaces may include an integrated circuit (inter-integrated circuit, I2C) interface, an integrated circuit built-in audio (inter-integrated circuit sound, I2S) interface, a pulse code modulation (pulse code modulation, PCM) interface, a universal asynchronous receiver transmitter (universal asynchronous receiver/transmitter, UART) interface, a mobile industry processor interface (mobile industry processor interface, MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (subscriber identity module, SIM) interface, and/or a universal serial bus (universal serial bus, USB) interface, among others.
It should be understood that the connection relationship between the modules illustrated in this embodiment is only illustrative, and does not limit the structure of the electronic device 100. In other embodiments, the electronic device 100 may also employ different interfaces in the above embodiments, or a combination of interfaces.
The electronic device 100 implements display functions through a GPU, a display screen 194, an application processor, and the like. The GPU is a microprocessor for image processing, and is connected to the display 194 and the application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. Processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
The display screen 194 is used to display images, videos, and the like. The display 194 includes a display panel. The display panel may employ a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, a quantum dot light-emitting diode (QLED), or the like.
The electronic device 100 may implement photographing functions through an ISP, a camera 193, a video codec, a GPU, a display screen 194, an application processor, and the like.
The ISP is used to process data fed back by the camera 193. For example, when photographing, the shutter is opened, light is transmitted to the camera photosensitive element (image sensor) through the lens, the optical signal is converted into an electric signal, and the camera photosensitive element transmits the electric signal to the ISP for processing, so that the electric signal is converted into an image visible to the naked eye. ISP can also optimize the noise, brightness and skin color of the image. The ISP can also optimize parameters such as exposure, color temperature and the like of a shooting scene. In some embodiments, the ISP may be provided in the camera 193.
The camera 193 is used to capture still images or video. The object generates an optical image through the lens and projects the optical image onto the photosensitive element. The photosensitive element may be a charge coupled device (charge coupled device, CCD) or a Complementary Metal Oxide Semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal, which is then transferred to the ISP to be converted into a digital image signal. The ISP outputs the digital image signal to the DSP for processing. The DSP converts the digital image signal into an image signal in a standard RGB, YUV, or the like format. In some embodiments, electronic device 100 may include N cameras 193, N being a positive integer greater than 1.
The digital signal processor is used to process digital signals, and can process other digital signals in addition to digital image signals. For example, when the electronic device 100 selects a frequency point, the digital signal processor is used to perform a Fourier transform on the frequency point energy, and the like.
Video codecs are used to compress or decompress digital video. The electronic device 100 may support one or more video codecs, so that it can play or record video in a variety of encoding formats, such as moving picture experts group (MPEG)-1, MPEG-2, MPEG-3 and MPEG-4.
The NPU is a neural-network (NN) computing processor, and can rapidly process input information by referencing a biological neural network structure, for example, referencing a transmission mode between human brain neurons, and can also continuously perform self-learning. Applications such as intelligent awareness of the electronic device 100 may be implemented through the NPU, for example: image recognition, face recognition, speech recognition, text understanding, etc.
Fig. 2 is a block diagram of the software and hardware structure of the electronic device 100 according to an embodiment of the present application. The layered architecture divides the software and hardware into several layers, each with a clear role and division of labor; the layers communicate with each other through software interfaces. In some embodiments, the electronic device may include an application layer, an application framework layer (framework layer for short), a hardware abstraction layer (HAL), a kernel layer (Kernel, also referred to as the driver layer), and a hardware layer.
The application layer (Application) may include a series of application packages, for example camera, gallery, calendar, phone call, map, navigation, WLAN, Bluetooth, music, video, short message and desktop launcher (Launcher) applications.
For example, as shown in fig. 2, the application layer may include a camera system application (also referred to as a camera application), and of course, may also include other third party applications capable of instructing a camera to take a photograph, such as a short video application, a live broadcast application, and the like.
In the above examples, the camera system application includes a view interface, where the view interface may be a preview view interface prior to actually taking a picture or video, or a capture view interface during capturing a video. The camera system application may present the image stream reported by the bottom layer in the corresponding view finding interface, where the bottom layer may be a layer that does not directly interact with the user, such as a HAL layer, a kernel layer, or a hardware layer.
In addition, short video applications may also include a view finding interface similar to camera system applications. The live application may comprise a live interface. Both the viewfinder interface and the live interface may present the image stream reported by the bottom layer, for example, the image stream collected by the hardware layer and transferred to the application layer via the kernel layer, the HAL layer and the framework layer.
The Framework layer (Framework) provides an application programming interface (application programming interface, API) and programming Framework for the application programs of the application layer. The application framework layer includes a number of predefined functions. As shown in fig. 2, the framework layer may provide a Camera API (Camera API), a Camera Service (Camera Service), a Camera expansion Service (Camera Service Extra), a hardware development kit (hardware software development kit, hw SDK), and the like.
The Camera API serves as the interface through which the bottom layer (such as the hardware abstraction layer) and the application layer interact. Specifically, the Camera API may receive camera control instructions from the upper layer (e.g., the application layer), such as instructions for turning on the camera, starting image acquisition, and turning off the camera. The camera control instruction from the upper layer is then transmitted to the camera of the hardware layer through the framework layer, the HAL layer and the kernel layer, thereby controlling the working state of the camera.
In the embodiment of the application, when the application layer interacts with a user and triggers a scene of starting image acquisition, the application layer can call a Camera API, and a control instruction for indicating to acquire an image is transmitted to the Camera through the frame layer, the HAL layer and the kernel layer. The camera can respond to the control instruction for indicating to collect the image, collect the image data and transmit the obtained image data to the application layer through the kernel layer, the HAL layer and the frame layer. The image data transferred to the application layer may be RGB data from which noise and blur have been removed.
After the application layer obtains the image data from the camera, the application layer can refresh the Surface view corresponding to the preview interface or the live broadcast interface in real time, for example, store the image data from the camera to the cache area of the Surface view, and then refresh the image data to the Surface view in sequence according to the acquisition sequence of the image data. Thus, the electronic device can display the image data acquired by the camera in real time.
The HAL layer connects the framework layer and the kernel layer. For example, the HAL layer may pass data transparently between the framework layer and the kernel layer. Of course, the HAL layer may also process data from the kernel layer and then transmit it to the framework layer; for example, it may translate the kernel layer's parameters about hardware devices into a software programming language recognizable by the framework layer and the application layer. For example, the HAL layer may include a Camera HAL, which can schedule the kernel layer and control the working state of the camera.
The kernel layer includes a camera driver, the image signal processor ISP, the denoising model, and the deblurring model. The image signal processor ISP, the denoising model and the deblurring model may be arranged independently of the camera; in other embodiments, they may be provided inside the camera.
The image signal processor ISP, the denoising model, the deblurring model and the camera are the main devices for capturing image data (e.g., video or pictures). The optical signal reflected by the viewfinder environment passes through the camera lens, falls on the image sensor, and is converted into an electrical signal, thereby obtaining a RAW image. Here, a RAW image may refer to a Bayer-format image, i.e., an image that contains only red, green and blue (the three primary colors) pixel points. A RAW image is also commonly referred to as the raw image data acquired by an image sensor.
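For a concrete picture of such a mosaic, the snippet below tiles a 2x2 RGGB pattern, one common Bayer arrangement; the specific pattern is an assumption for illustration, since the application does not fix one.

    import numpy as np

    # A 2x2 RGGB tile; repeating it across the sensor gives the Bayer mosaic,
    # in which each pixel site samples exactly one of the three primary colors.
    tile = np.array([['R', 'G'],
                     ['G', 'B']])
    print(np.tile(tile, (2, 2)))   # a 4x4 excerpt of the RAW image layout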
In addition, after being processed by the image signal processor ISP, the RAW image may be transmitted to the upper layers through the camera driver as a raw parameter stream (i.e., an image stream). The camera driver may also receive notifications from the upper layers (such as a notification indicating that the camera should be turned on or off) and, according to the notification, send a functional processing parameter stream to the camera device so as to turn the corresponding camera on or off.
The hardware layers include various types of hardware devices shown in fig. 1, such as sensors, processors, cameras, and the like.
In addition, the electronic device may also include software and hardware modules not shown in fig. 2, such as a morphing (warp) module. It can be appreciated that the warp module can move the pixel points in an RGB image so as to deform the RGB image and obtain the required image data.
In some embodiments, the warp module may be a software module, and in other embodiments, the warp module may also be a hardware module, which is not specifically limited in this embodiment of the present application.
In the embodiment of the application, the user may instruct the camera to start image acquisition by interacting with an application program (such as a camera system application, a short video application, or a live broadcast application) in the application layer. As shown in fig. 3, after the camera starts image capturing, an optical signal reflected by the viewfinder environment is irradiated on the image sensor 302 through the camera lens 301 in the camera, and the image sensor 302 can convert the optical signal into an electrical signal, so as to obtain a corresponding RAW image, for example, a RAW image containing noise and having a blur problem. Then, noise in the RAW image is eliminated by using the denoising model. Then, the noise-removed RAW image is transmitted to the image signal processor ISP, and is converted into an RGB image by the image signal processor ISP, wherein the RGB image may have a problem of image blurring. Of course, compared with the related art, the RGB image obtained by the image signal processor ISP is already free of noise.
Thereafter, the RGB image is processed using the deblurring model to predict the intermediate parameters corresponding to the RGB image. It will be appreciated that the intermediate parameters may indicate the difference between the RGB image before and after deblurring. Illustratively, each element of the intermediate parameters is associated with a pixel in the RGB image, and the value of each element indicates how the associated pixel changes position before and after the RGB image is deblurred.
Then, using the warp module in combination with the above intermediate parameters, the blurred RGB image is deformed into an RGB image with the blur removed.
Compared with the related art, while achieving the same denoising and deblurring effects, the configured neural network models (the denoising model and the deblurring model) are lighter and simpler, are better suited to electronic devices with limited system resources, and allow captured image data to be denoised and deblurred in real time while image data (video or pictures) is being shot.
The implementation principle of the image processing method provided in the embodiment of the present application is described below with reference to the accompanying drawings.
In some embodiments, the denoising model may be obtained by training a lightweight initial neural network model, giving it the capability of denoising in the RAW domain. Similarly, the deblurring model may be obtained by training a lightweight initial neural network model, giving it the capability of predicting the intermediate parameters, which may indicate the difference of an RGB image before and after deblurring.
The lightweight initial neural network model may be a model whose volume is smaller than a preset value, a model with a simple structure, or a model whose number of network layers is smaller than a preset number. In short, it is a model that needs few resources to deploy and few resources to run; with such a model, inference is faster, image data can be deblurred and denoised in real time, and the model is suitable for a wider range of electronic devices.
In some embodiments, the denoising model and the deblurring model may be trained in multiple ways. Different training modes yield denoising and deblurring models with somewhat different functions, and the flows in which such models are applied in practice may differ accordingly. The training processes of several types of denoising and deblurring models are described below as examples:
the first way to train the denoising model may be to train the denoising model with noise cancellation in the RAW domain in conjunction with the image processor ISP. Illustratively, the training process may be as follows:
first, a plurality of sets of training samples 1 are acquired. For example, each set of training samples 1 corresponds to one frame of sample RAW image 1 and RGB image 1, the RGB image 1 being an RGB image converted by the sample RAW image 1 and noise in the image having been eliminated.
Second, during training, the sample RAW images 1 in each set of training samples 1 may be input into a preselected initial neural network model 1. Wherein the initial neural network model 1 is a pre-selected lightweight model.
After sample RAW image 1 is processed by the initial neural network model 1, a RAW image 2 is obtained. Then, RAW image 2 is processed by the image signal processor ISP to obtain an RGB image 2. Finally, the model parameters in the initial neural network model 1 are iterated based on the difference between RGB image 2 and RGB image 1 (e.g., the Euclidean distance between RGB image 2 and RGB image 1).
It will be appreciated that, in the initial stage of training, the RAW image 2 that the initial neural network model 1 produces from the input sample RAW image 1 may be close to random data; of course, this RAW image 2 is still data associated with sample RAW image 1, but the noise removal in RAW image 2 is random.
After the initial neural network model 1 has been trained with a large number of training samples 1, its model parameters become more accurate and its RAW-domain denoising capability becomes stronger. That is, at this stage, after sample RAW image 1 in a training sample 1 is processed, the resulting RAW image 2 has the same content as sample RAW image 1 but with the noise successfully eliminated.
In summary, after extensive training the initial neural network model 1 gradually converges, yielding a denoising model capable of eliminating noise in the RAW domain, which may also be referred to as denoising model 1. The method for determining whether an initial neural network model has converged may refer to the related art and is not described here.
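The training loop just described might look like the PyTorch-style sketch below. The data loader, the differentiability of the ISP step, and the plain Euclidean-distance loss are assumptions made for illustration.

    import torch

    def train_denoiser(model, isp, loader, epochs=10, lr=1e-4):
        """Sketch of training denoising model 1: `loader` yields pairs of
        (sample RAW image 1, RGB image 1); `isp` is assumed differentiable
        here so the RGB-space loss can reach the model's parameters."""
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        for _ in range(epochs):
            for raw1, rgb1 in loader:
                raw2 = model(raw1)              # denoised RAW prediction
                rgb2 = isp(raw2)                # RAW image 2 -> RGB image 2
                loss = torch.norm(rgb2 - rgb1)  # Euclidean distance, as above
                opt.zero_grad()
                loss.backward()
                opt.step()
        return model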
In some embodiments, the denoising model 1 may be trained in an electronic device, that is, after the initial neural network model 1 is configured in the electronic device, the electronic device has the denoising model 1 after multiple training.
In other embodiments, the denoising model 1 may be configured into the electronic device after training is completed on another device. The other device may have the same model of image signal processor ISP as the electronic device. That is, the initial neural network model 1 may be configured into the other device and trained there with training samples 1 to obtain denoising model 1, and denoising model 1 is then configured into the electronic device.
The first way to train the deblurring model may be to train, in combination with the warp module, a deblurring model that predicts the intermediate parameter flow1 (also referred to as intermediate parameter 1). The intermediate parameter flow1 may indicate the difference of the same RGB image before and after blur removal. Illustratively, the training process may be as follows:
First, a plurality of sets of training samples 2 are acquired. For example, each set of training samples 2 contains one frame of sample RGB image 3, which is noise-free but blurred. In addition, each set of training samples 2 further includes an RGB image 4, which has the same content as RGB image 3 and is noise-free, but with the blur removed. Illustratively, removing the blur from RGB image 3 yields RGB image 4.
Second, during training, the RGB images 3 in each set of training samples 2 may be input into a preselected initial neural network model 2. Wherein the initial neural network model 2 is a pre-selected lightweight model.
After the initial neural network model 2 has processed the RGB image 3, a set of intermediate parameters flow1 may be output. The intermediate parameter flow1 is a predicted set of matrix parameters, each element in the intermediate parameter flow1 being associated with a pixel in the RGB image 3. The value of each element in the intermediate parameter flow1 indicates the position offset between the pixel point in the RGB image 3 and the corresponding pixel point in the RGB image 4.
It will be appreciated that each pixel in the RGB image 3 corresponds to a corresponding pixel in the RGB image 4. The corresponding pixels in RGB image 3 and RGB image 4 may be referred to as a set of corresponding pixels, for example, as shown in fig. 4, the pixel 401 in RGB image 3 and the pixel 402 in RGB image 4 are a set of corresponding pixels, and of course, other pixels in RGB image 3 also correspond to a corresponding pixel in RGB image 4. In this way, the above-mentioned intermediate parameter flow1 includes the predicted positional offset vector between the pixel 401 and the pixel 402. The positional offset vector may be a direction vector between the pixel point 401 and the pixel point 402 after the two are projected on the same plane.
In one implementation, pixel point 401 is projected onto a third plane according to its position in RGB image 3, and pixel point 402 is projected onto the same plane according to its position in RGB image 4; the direction vector from pixel point 401 to pixel point 402 on that plane may then serve as the position offset vector between them.
In another implementation, after pixel point 402 is projected onto RGB image 3, the direction vector from pixel point 401 to pixel point 402 within RGB image 3 may also serve as the position offset vector between them.
For example, the position of the pixel point 401 in the RGB image 3 is the first row and the first column, the position of the pixel point 402 in the RGB image 4 is the third row and the third column, and after the pixel point 402 is projected onto the third row and the third column in the RGB image 3, the determination of the corresponding position offset vector may be the direction vector of "the first row and the first column" to "the third row and the third column". In addition, the intermediate parameter flow1 further includes a predicted position offset vector between other corresponding pixel points.
In some embodiments, after the initial neural network model 2 predicts a set of intermediate parameters flow1, the warp module may deform RGB image 3 according to flow1 to obtain an RGB image 5. The principle of deforming RGB image 3 may refer to the functional principle of the warp module and is not repeated here. Of course, deforming RGB image 3 according to flow1 may proceed as follows: traverse the pixel points in RGB image 3 in sequence; for each traversed pixel point, look up its position offset vector in flow1 and move the pixel point according to that vector; once all pixel points in RGB image 3 have been traversed, the deformed RGB image 5 is obtained.
Finally, the model parameters in the initial neural network model 2 are iterated based on the differences between the RGB image 5 and the RGB image 4, e.g., the euclidean distance between the RGB image 5 and the RGB image 4.
Or, the actual position offset between the corresponding pixel points in the RGB image 3 and the RGB image 4 is acquired, and the actual intermediate parameter flow2 (which may also be referred to as intermediate parameter 2) is determined. Then, each model parameter in the initial neural network model 2 is iterated according to the difference between the intermediate parameter flow2 and the intermediate parameter flow 1.
It will be appreciated that, in the initial stage of training, the intermediate parameter flow1 obtained by the initial neural network model 2 based on the input RGB image 3 has randomness, that is, in this stage, the predicted intermediate parameter flow1 has a very high probability of inaccuracy, and the difference between the obtained RGB image 5 and the RGB image 4 is very large after the RGB image 3 is deformed according to the intermediate parameter flow 1.
Of course, after the initial neural network model 2 has been trained on a large number of different training samples 2, it gradually converges: its model parameters become increasingly accurate, and so does its prediction of flow1. At this stage, the RGB image 5 deformed according to the predicted flow1 also grows closer to the corresponding RGB image 4. In other words, extensive training yields an initial neural network model 2 that can accurately predict the difference between RGB images before and after deblurring, namely a deblurring model. Because a deblurring model trained in this manner must be used together with the denoising model 1, it may also be called deblurring model 1.
In some embodiments, the deblurring model 1 may be trained on the electronic device itself; that is, after the initial neural network model 2 is configured on the electronic device, the device obtains the deblurring model 1 after multiple rounds of training.
In other embodiments, the deblurring model 1 may be trained on another device and then configured on the electronic device. The other device may have the same warp module as the electronic device. That is, the initial neural network model 2 may be configured on the other device and trained with training samples 2 to obtain the deblurring model 1, which is then configured on the electronic device.
It can be understood that the denoising model 1 and the deblurring model 1, used together, can resolve both the noise and the blur in an image. In other possible embodiments, the denoising model 1 and the deblurring model 1 may also be trained jointly. In this case, each training sample includes a sample RAW image 1 and a corresponding denoised and deblurred RGB image (referred to simply as the sample RGB image).
During training, the sample RAW image 1 is processed in sequence by the initial neural network model 1 and the image signal processor (ISP) to obtain a corresponding RGB image 2. RGB image 2 is then processed in sequence by the initial neural network model 2 and the warp module to obtain an output RGB image. Finally, the model parameters of the initial neural network model 1 and the initial neural network model 2 are iterated according to the difference between the output RGB image and the sample RGB image, yielding the denoising model 1 and the deblurring model 1; a joint-training sketch follows.
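A joint-training step might look as follows, under the assumption that the ISP and the warp module can be represented by differentiable stand-ins so the loss can propagate through the whole chain; every name is hypothetical:

    import torch
    import torch.nn.functional as F

    def joint_train_step(denoise_net, deblur_net, isp_fn, warp_fn,
                         optimizer, sample_raw1, sample_rgb):
        rgb2 = isp_fn(denoise_net(sample_raw1))  # RAW denoising, then ISP conversion
        flow = deblur_net(rgb2)                  # predict the offset field
        out_rgb = warp_fn(rgb2, flow)            # deform per the predicted field
        loss = F.mse_loss(out_rgb, sample_rgb)   # compare with the sample RGB image
        optimizer.zero_grad()
        loss.backward()                          # iterate both models' parameters
        optimizer.step()
        return loss.item()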
In summary, the embodiments of the present application do not specifically limit how the denoising model 1 and the deblurring model 1 are trained, provided that after training the denoising model 1 can eliminate noise in the RAW domain and the deblurring model 1 can predict the intermediate parameter flow1.
In some embodiments, in the case where the electronic device is configured with the denoising model 1 and the deblurring model 1, as shown in fig. 5, the above-described image processing method may include the steps of:
S101, in response to an instruction indicating acquisition of image data, a RAW image 3 is acquired.
In some embodiments, the electronic device may generate instructions indicating acquisition of image data based on business requirements when running a camera-enabled application, such as when running a camera application.
For example, after the electronic device starts the camera application in response to a user indication, a camera preview interface may be displayed. Since the camera preview interface must display the preview image captured by the camera, this event (displaying the camera preview interface) triggers the service requirement to activate the camera; that is, an instruction to capture image data is generated before the camera preview interface is actually displayed.
Also by way of example, when the electronic device runs a short video application, a short video shooting interface may be displayed in response to a user operation indicating video shooting. The short video shooting interface likewise needs to display the video data (continuous frames of image data) captured by the camera, so this event (displaying the short video shooting interface) also triggers the service requirement to activate the camera; that is, an instruction to capture image data is generated before the short video shooting interface is actually displayed.
In some embodiments, upon detecting the instruction to collect image data, the application layer of the electronic device may call a Camera API and deliver the instruction to the camera through the Camera HAL and the camera driver. The camera then initiates image acquisition in response to the instruction. In this step, the image sensor of the camera senses the optical signal from the framing environment to obtain a RAW image, namely RAW image 3.
S102, eliminating noise in the RAW image 3 by using the denoising model 1 to obtain a RAW image 4.
The denoising model 1 performs denoising in the RAW domain. In some embodiments, RAW image 3 is input to the denoising model 1, which, after processing, outputs RAW image 4.
S103, an RGB image 6 is generated from the RAW image 4 using the image signal processor ISP.
In some embodiments, RAW image 4 contains only red, blue, and green pixels, whereas each pixel in RGB image 6 contains components of all three color channels (red, blue, and green). When converting RAW image 4 into an RGB image, the ISP must accurately restore the colors absent from RAW image 4, that is, restore the actual colors of the photographed subject.
Of course, the ISP may also apply one or more of the following processes to the converted RGB image: dark-current removal (eliminating current noise), shading correction (compensating for the brightness attenuation and color shift caused by the lens), dead-pixel removal (eliminating defective sensor pixels), denoising, 3A (auto white balance, auto focus, auto exposure), gamma correction (optimizing the brightness mapping curve for local and global contrast), rotation (angle change), sharpening (sharpness adjustment), scaling (zooming in and out), color space conversion (converting to a different color space for processing), color enhancement (optional color adjustment), and the like.
In some embodiments, the RAW image 4 may be processed by the image signal processor ISP to obtain the RGB image 6.
S104, determining a corresponding intermediate parameter flow1 according to the RGB image 6 and the deblurring model 1.
In some embodiments, after the deblurring model 1 processes RGB image 6, it outputs an intermediate parameter flow1. flow1 is a set of parameters predicted by the deblurring model 1 that indicates the positional offset vector between each set of corresponding pixels in RGB image 6 and the deblurred RGB image 6.
S105, according to the intermediate parameter flow1, the RGB image 6 is deformed, and the RGB image 7 is obtained.
In some embodiments, a warp module may be used to deform the RGB image 6 according to the intermediate parameter flow1. In the case where the accuracy of the deblurring model 1 reaches a predetermined standard, the resulting RGB image 7 has no problem of blurring.
It can be understood that the intermediate parameter flow1 includes multiple sets of positional offset vectors between corresponding pixels, and each set of corresponding pixels includes one pixel of RGB image 6; that is, in flow1, every pixel of RGB image 6 may have a corresponding positional offset vector.
Thus, using the warp module, the deformation of RGB image 6 according to flow1 may proceed as follows: traverse each pixel in RGB image 6; for each pixel, query its positional offset vector in flow1 and move the pixel according to that vector. After all pixels in RGB image 6 have been moved, the deformed RGB image 7 is obtained, i.e., image data with noise and blur removed, serving as the image data acquired by the camera.
For example, as shown in fig. 6, RGB image 6 includes a pixel 601 whose positional offset vector in flow1 may be the direction vector 602. Pixel 601 is located in the first row and first column, and direction vector 602 may indicate that pixel 601 should be moved to the third row and third column. Accordingly, after RGB image 6 is deformed, pixel 601 is located in the third row and third column of the resulting RGB image 7.
S106, displaying the RGB image 7.
In some embodiments, after RGB image 7 is obtained, it may be passed to the application layer through the HAL layer and the framework layer, and displayed via a SurfaceView; that is, the electronic device is controlled to display noise-free, blur-free image data.
In some embodiments, in a scene where the electronic device collects video data or a preview video stream, each frame of RAW image collected by the image sensor may be processed frame by frame according to S101 to S106; after each RAW image is converted into a noise-free, blur-free RGB image, the resulting images are displayed frame by frame. In this way, the electronic device removes the noise and blur problems of the video data, or of the preview video stream, in real time during shooting.
That is, after the denoising model 1 and the deblurring model 1 described above are configured on the electronic device, once the camera in the electronic device starts operating, the processing proceeds as shown in fig. 7:
The optical signal reflected from the framing environment passes through the camera lens 701 and strikes the image sensor 702, which converts it into an electrical signal to obtain the corresponding RAW image 3. RAW image 3 is then processed with the denoising model 1 to produce RAW image 4, which contains no noise.
Then, the ISP converts RAW image 4 into RGB image 6, which is free of noise but may still be blurred. To eliminate the possible blur in RGB image 6, it is further processed with the deblurring model 1 to obtain the corresponding intermediate parameter flow1. Finally, the warp module deforms RGB image 6 according to flow1 to obtain the blur-free RGB image 7.
In the above embodiment, while acquiring image data, the electronic device completes noise cancellation in the RAW domain with a lightweight model, and in the RGB domain likewise uses a lightweight model combined with the warp module to eliminate image blur. This preserves image quality while reducing the system resources occupied by the image processing models (the denoising and deblurring models), improving the general applicability of the method; the end-to-end flow is sketched below.
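The whole method can be summarized in a short sketch (reusing the warp helper sketched earlier; the callables stand for assumed interfaces, not actual device APIs):

    def process_frame_method1(raw3, denoise_model_1, isp_fn, deblur_model_1):
        # S101 to S106 for one frame; capture and display happen outside.
        raw4 = denoise_model_1(raw3)    # S102: remove noise in the RAW domain
        rgb6 = isp_fn(raw4)             # S103: convert RAW image 4 to RGB image 6
        flow1 = deblur_model_1(rgb6)    # S104: predict intermediate parameter flow1
        rgb7 = warp(rgb6, flow1)        # S105: deform RGB image 6 into RGB image 7
        return rgb7                     # S106: RGB image 7 is then displayed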
A second way to train the denoising model is to train it in combination with the warp module. The resulting denoising model can not only remove noise in the RAW domain but also predict an intermediate parameter flow3 (also called intermediate parameter 3). flow3 is similar to flow1; the difference is that flow3 indicates, for the same frame of RAW image, the difference before and after blur is eliminated.
Illustratively, the training process may be as follows:
First, multiple sets of training samples 3 are acquired. For example, each set of training samples 3 comprises a frame of sample RAW image 5 and a corresponding RGB image 8, where RGB image 8 is free of noise and has no blur problem. Illustratively, RGB image 8 can be obtained by converting sample RAW image 5 into an RGB image and then applying denoising and deblurring processing.
Second, during training, the sample RAW images 5 in each set of training samples 3 may be input into a preselected initial neural network model 3.
The initial neural network model 3 is a preselected lightweight model with one input port and two output ports. Illustratively, one output port outputs a predicted RAW image 6, which is a predicted noise-free RAW image; the other outputs the intermediate parameter flow3, the predicted positional offsets between corresponding pixels in RAW image 6 and RAW image 7. Illustratively, RAW image 7 is a denoised and deblurred RAW image, obtainable by applying denoising and deblurring processing to RAW image 5 in the RAW domain.
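One plausible shape for such a one-input, two-output lightweight network, sketched in PyTorch; the layer sizes and the single-channel RAW input are assumptions made only for illustration:

    import torch
    import torch.nn as nn

    class DenoiseDeblurNet(nn.Module):
        # One input port (the sample RAW image 5) and two output ports:
        # a denoised RAW prediction (RAW image 6) and an offset field (flow3).
        def __init__(self, ch: int = 16):
            super().__init__()
            self.backbone = nn.Sequential(
                nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            )
            self.denoise_head = nn.Conv2d(ch, 1, 3, padding=1)  # predicts RAW image 6
            self.flow_head = nn.Conv2d(ch, 2, 3, padding=1)     # predicts flow3 (dr, dc)

        def forward(self, raw: torch.Tensor):
            feat = self.backbone(raw)
            return self.denoise_head(feat), self.flow_head(feat)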
After the sample RAW image 5 is processed by the initial neural network model 3, a RAW image 6 is obtained. Then, the RAW image 6 is deformed by the warp module according to the intermediate parameter flow3, and a RAW image 8 is obtained. For the process of deforming the RAW image 6, reference may be made to the process of deforming the RGB image 6 in the foregoing embodiment, which is not described herein.
After RAW image 8 is obtained, it is processed by the image signal processor ISP to obtain RGB image 9. Finally, the model parameters in the initial neural network model 3 are iterated based on the difference between RGB image 9 and RGB image 8, e.g., the Euclidean distance between them.
It will be appreciated that, at the initial stage of training, the RAW image 6 produced by the initial neural network model 3 from the input sample RAW image 5 may be largely random data; there is of course a correlation between this RAW image 6 and sample RAW image 5, the randomness at this stage showing up in the degree of denoising. Likewise, the intermediate parameter flow3 produced by the model may be a set of random parameters, i.e., very likely inaccurate, so the RAW image 8 obtained according to flow3 also differs greatly from RAW image 7, and the RGB image 9 converted from RAW image 8 may still suffer from noise, blur, and similar problems.
After the initial neural network model 3 has been trained on a large number of training samples 3, its accuracy grows steadily. At this stage, processing the sample RAW image 5 in training sample 3 with the model yields a RAW image 6 whose content matches sample RAW image 5 but from which the noise has been successfully eliminated.
Meanwhile, the intermediate parameter flow3 predicted by the model becomes increasingly accurate; that is, at this stage, the RAW image 8 obtained by deforming RAW image 6 according to the predicted flow3 grows closer to the corresponding RAW image 7. Accordingly, the RGB image 9 obtained by passing RAW image 8 through the ISP likewise grows closer to the RGB image 8 in training sample 3.
That is, after extensive training the initial neural network model 3 converges, yielding a model capable of eliminating both noise and blur in the RAW domain, namely the denoising model 2.
Also by way of example, the training process may also be as follows:
First, multiple sets of training samples 4 are acquired. For example, each set of training samples 4 comprises a frame of sample RAW image 5 and its corresponding RAW image 7, where RAW image 7 is obtained by applying denoising and deblurring processing to sample RAW image 5 in the RAW domain.
Second, during training, the sample RAW images 5 in each set of training samples 4 may be input into a preselected initial neural network model 3.
After the sample RAW image 5 is processed by the initial neural network model 3, a RAW image 6 is obtained. Then, the RAW image 6 is deformed by the warp module according to the intermediate parameter flow3, and a RAW image 8 is obtained. For the process of deforming the RAW image 6, reference may be made to the process of deforming the RGB image 6 in the foregoing embodiment, which is not described herein.
After RAW image 8 is obtained, the model parameters in the initial neural network model 3 are iterated according to the difference between RAW image 8 and RAW image 7, e.g., the Euclidean distance between them. In this way, once the initial neural network model 3 converges, the denoising model 2 is obtained.
In some embodiments, in the case where the electronic device is configured with the denoising model 2, as shown in fig. 8, the above image processing method may include the steps of:
S201, in response to an instruction indicating acquisition of image data, a RAW image 9 is acquired.
In some embodiments, RAW image 9 is similar to RAW image 3: both are RAW images acquired by the image sensor in response to an instruction to acquire image data. The implementation details of S201 are the same as those of S101 and are not repeated here.
S202, using the denoising model 2 to process the RAW image 9 to obtain a RAW image 10 and a corresponding intermediate parameter flow3.
The denoising model 2 can complete the elimination of noise in the RAW domain, and can also predict the corresponding intermediate parameter flow3. In some embodiments, the RAW image 9 may be input to the denoising model 2, and after being processed by the denoising model 2, the denoising model 2 outputs a RAW image 10 without noise and a corresponding intermediate parameter flow3.
S203, according to the intermediate parameter flow3, the RAW image 10 is deformed, and the RAW image 11 is obtained.
In some embodiments, the warp module may be used to deform the RAW image 10 according to the intermediate parameter flow 3. In the case where the accuracy of the denoising model 2 reaches a predetermined standard, the resulting RAW image 11 has no problem of blurring.
It can be understood that the intermediate parameter flow3 includes multiple sets of positional offset vectors between corresponding pixels, and each set of corresponding pixels includes one pixel of RAW image 10; that is, in flow3, every pixel of RAW image 10 may have a corresponding positional offset vector.
Thus, using the warp module, the deformation of RAW image 10 according to flow3 may proceed as follows: traverse the pixels in RAW image 10; for each pixel, look up its positional offset vector in flow3 and move the pixel within RAW image 10 according to that vector. After all pixels in RAW image 10 have been traversed, RAW image 11 is obtained.
S204, the RGB image 10 is generated from the RAW image 11 and the image signal processor ISP.
In some embodiments, the implementation details of S204 are the same as those of S103 and are not repeated here. Since noise and blur have already been removed from RAW image 11, the RGB image 10 converted from it is likewise free of noise and blur, and serves as the image data acquired by the camera.
S205, the RGB image 10 is displayed.
In some embodiments, in a scene where the electronic device collects video data or a preview video stream, each frame of RAW image collected by the image sensor may be processed frame by frame according to S201 to S205; after each RAW image is converted into a noise-free, blur-free RGB image, the resulting images are displayed frame by frame.
That is, after the denoising model 2 described above is configured on the electronic device, once the camera in the electronic device starts operating, the processing proceeds as shown in fig. 9:
The optical signal reflected from the framing environment passes through the camera lens 801 and strikes the image sensor 802, which converts it into an electrical signal to obtain the corresponding RAW image 9. RAW image 9 is then processed with the denoising model 2 to produce RAW image 10, which contains no noise, together with the corresponding intermediate parameter flow3. flow3 indicates the difference between RAW image 10 and the deblurred RAW image 10.
Then, the warp module deforms RAW image 10 according to the obtained flow3 to produce RAW image 11; finally, the ISP converts RAW image 11 into RGB image 10, which is displayed. The displayed RGB image 10 is thus image data that contains no noise and has no blur problem.
In the above embodiment, while acquiring image data, the electronic device uses a lightweight model to eliminate noise in the RAW domain and, still in the RAW domain, combines the warp module to eliminate image blur. In other embodiments, only the noise may be removed in the RAW domain; the image blur is then removed in the RGB domain by combining the deblurring model 2 with the warp module. The deblurring model 2 can be obtained by the second way of training the deblurring model.
The second way to train the deblurring model is to train it in combination with the denoising model 2, the image signal processor ISP, and the warp module. The resulting deblurring model can predict an intermediate parameter flow4 (also called intermediate parameter 4), which indicates the difference between RGB images before and after blur is eliminated. flow4 is more accurate than flow1.
Illustratively, the training process may be as follows:
First, multiple sets of training samples 3 are acquired. The sample RAW image 5 in each set of training samples 3 is processed with the denoising model 2 to obtain the corresponding intermediate parameter flow3 and the denoised RAW image 6. RAW image 6 is then processed by the ISP to obtain the corresponding RGB image 11.
Second, during training, the intermediate parameter flow3 and RGB image 11 corresponding to each set of training samples 3 may be input into a preselected initial neural network model 4, which is a lightweight model.
After the initial neural network model 4 processes RGB image 11 and flow3, the intermediate parameter flow4 is obtained: the predicted positional offsets between corresponding pixels in RGB image 11 and RGB image 8. flow4 is similar to flow1 in the previous embodiment; however, because the RAW-domain flow3 is taken into account when predicting it, flow4 is more accurate than flow1.
Then, the warp module deforms RGB image 11 according to flow4 to obtain RGB image 12. For details of deforming RGB image 11, refer to the deformation of RGB image 6 in the foregoing embodiments, not repeated here.
Finally, the model parameters in the initial neural network model 4 are iterated based on the difference between RGB image 12 and RGB image 8, e.g., the Euclidean distance between them. After extensive training, the model parameters become increasingly accurate; the converged model can then be configured on the electronic device as the deblurring model 2, to be used together with the denoising model 2.
In some embodiments, in the case where the electronic device is configured with the denoising model 2 and the deblurring model 2, as shown in fig. 10, the above-described image processing method may include the steps of:
S301, in response to an instruction indicating acquisition of image data, a RAW image 9 is acquired.
S302, the RAW image 9 is processed by using the denoising model 2, and a RAW image 10 and a corresponding intermediate parameter flow3 are obtained.
In some embodiments, the implementation details of S301 and S302 may refer to S201 and S202 in the foregoing embodiments, which are not described herein.
S303, the RGB image 13 is generated from the RAW image 10 and the image signal processor ISP.
In some embodiments, the implementation principle of S303 is the same as that of S103, and will not be described herein.
S304, according to the RGB image 13 and the corresponding intermediate parameter flow3, the corresponding intermediate parameter flow4 is predicted by combining the deblurring model 2.
S305, according to the intermediate parameter flow4, the RGB image 13 is deformed, and the RGB image 14 is obtained.
In some embodiments, the warp module deforms RGB image 13 according to flow4 to obtain RGB image 14, with noise and blur removed, as the image data acquired by the camera. For the deformation of RGB image 13, refer to the deformation of RGB image 6 in the foregoing embodiments, not repeated here.
S306, the RGB image 14 is displayed.
In some embodiments, after RGB image 14 is obtained, the warp module may pass it to the application layer through the HAL layer and the framework layer. The application layer may then control the electronic device to display RGB image 14 via a SurfaceView, i.e., to display noise-free, blur-free image data.
In some embodiments, in a scene where the electronic device collects video data or a preview video stream, each frame of RAW image collected by the image sensor may be processed frame by frame according to S301 to S305; after each RAW image is converted into a noise-free, blur-free RGB image, the resulting images are displayed frame by frame. In this way, the electronic device removes the noise and blur problems of the video data, or of the preview video stream, in real time during shooting.
That is, after the denoising model 2 and the deblurring model 2 described above are configured on the electronic device, once the camera in the electronic device starts operating, the processing proceeds as shown in fig. 11:
The optical signal reflected from the framing environment passes through the camera lens 1001 and strikes the image sensor 1002, which converts it into an electrical signal to obtain the corresponding RAW image 9. RAW image 9 is then processed with the denoising model 2 to produce RAW image 10, which contains no noise, together with the corresponding intermediate parameter flow3.
Then, the ISP converts RAW image 10 into RGB image 13, which is free of noise but may still be blurred. To eliminate the possible blur in RGB image 13, the deblurring model 2 further processes RGB image 13 and flow3 to obtain the corresponding intermediate parameter flow4. Finally, the warp module deforms RGB image 13 according to flow4 to obtain the blur-free RGB image 14.
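This third pipeline can likewise be summarized in a sketch (reusing the warp helper sketched earlier; the callables are assumed interfaces), with the denoising model 2 returning both of its outputs:

    def process_frame_method3(raw9, denoise_model_2, isp_fn, deblur_model_2):
        # S301 to S306 for one frame; capture and display happen outside.
        raw10, flow3 = denoise_model_2(raw9)  # S302: denoised RAW + RAW-domain flow3
        rgb13 = isp_fn(raw10)                 # S303: convert RAW image 10 to RGB image 13
        flow4 = deblur_model_2(rgb13, flow3)  # S304: flow3 guides the flow4 prediction
        rgb14 = warp(rgb13, flow4)            # S305: deform RGB image 13 into RGB image 14
        return rgb14                          # S306: RGB image 14 is then displayed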
In the above embodiment, while acquiring image data, the electronic device completes noise cancellation in the RAW domain with a lightweight model, and in the RGB domain likewise uses a lightweight model combined with the warp module to remove image blur. This preserves image quality while reducing the system resources occupied by the image processing models (the denoising and deblurring models), improving the general applicability of the method.
In addition, in the embodiments of the present application, using the warp module during blur removal reduces the workload of the image processing model (e.g., the deblurring model). This lowers the specification requirements on the image processing model to a certain extent, i.e., effectively reduces the system resources it occupies.
In some embodiments, after determining the intermediate parameter flow1 or flow4, the electronic device may perform the deformation of RGB image 6 or RGB image 13, in combination with flow1 or flow4 respectively, when the image anti-shake flow triggers the warp module.
It can be understood that during the image anti-shake processing, the warp module is also enabled to perform the anti-shake processing on the RGB image output by the image signal processor ISP.
Taking anti-shake processing on RGB image 6 as an example: during this processing, the difference between RGB image 6 and the shake-eliminated RGB image 6 may be calculated, e.g., as an intermediate parameter flow5 (also called intermediate parameter 5). Each pixel in RGB image 6 corresponds to one pixel in the shake-eliminated RGB image 6; such a pair may likewise be called corresponding pixels. In this example, flow5 includes the positional offset between each set of corresponding pixels in RGB image 6 and the shake-eliminated RGB image 6. In some embodiments, after flow5 is determined, the warp module may be enabled to deform RGB image 6 according to flow5, eliminating the image shake problem in RGB image 6.
Taking anti-shake processing on RGB image 13 as an example: during this processing, the difference between RGB image 13 and the shake-eliminated RGB image 13 may be calculated, e.g., as an intermediate parameter flow6 (also called intermediate parameter 6). Each pixel in RGB image 13 corresponds to one pixel in the shake-eliminated RGB image 13; such a pair may likewise be called corresponding pixels. In this example, flow6 includes the positional offset between each set of corresponding pixels in RGB image 13 and the shake-eliminated RGB image 13. In some embodiments, after flow6 is determined, the warp module may be enabled to deform RGB image 13 according to flow6, eliminating the image shake problem in RGB image 13.
In addition, the details of obtaining the intermediate parameter flow5 or the intermediate parameter flow6 may refer to the related technology of image anti-shake, which is not described herein.
In the above embodiment, the electronic device can use a single pass of the warp module to perform both the shake-elimination and the deblurring on the image data (RGB image 6 or RGB image 13), simplifying its processing flow and avoiding repeated occupation of system resources.
In some embodiments, if the electronic device has acquired flow1 but not yet flow5, it may wait for the anti-shake processing of RGB image 6 to yield flow5, and only then deform RGB image 6 according to both flow1 and flow5.
Similarly, if the electronic device has acquired flow4 but not yet flow6, it may wait for the anti-shake processing of RGB image 13 to yield flow6, and only then deform RGB image 13 according to both flow4 and flow6.
That is, illustratively, S105 may be skipped after step S104. Instead, during the anti-shake processing of RGB image 6, once flow5 is determined, flow5 and flow1 may be fused into a target flow; the warp module then deforms RGB image 6 according to the target flow to obtain an RGB image free of shake, noise, and blur, which serves as the image data acquired by the camera and is displayed.
Compared with the foregoing embodiment in which RGB image 6 is deformed according to flow1, the difference is that the positional offset vector of each pixel is now queried in the target flow.
In addition, as one implementation, fusing flow5 and flow1 may proceed as follows: for each pixel in RGB image 6, acquire its positional offset vector 1 in flow5 and its positional offset vector 2 in flow1, then add the two vectors to obtain the positional offset vector of that pixel in the target flow.
In other embodiments, after positional offset vector 1 and positional offset vector 2 are added, the resulting vector may further be calibrated with specific parameters (e.g., an image rotation parameter) to obtain the pixel's positional offset vector in the target flow; a fusion sketch follows.
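A minimal fusion sketch, assuming both parameters are stored as dense (H, W, 2) offset fields so that per-pixel vector addition suffices; handling of pixels present in only one field is omitted here:

    import numpy as np

    def fuse_flows(flow_deblur: np.ndarray, flow_stab: np.ndarray) -> np.ndarray:
        # e.g. flow1 (deblur offsets) + flow5 (anti-shake offsets) = target flow
        return flow_deblur + flow_stab

The target flow would then be passed to the warp module in place of flow1, so the image is deformed only once.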
Also illustratively, S305 may be skipped after step S304. Instead, during the anti-shake processing of RGB image 13, once flow6 is determined, flow6 and flow4 may be fused into a target flow; the warp module then deforms RGB image 13 according to the target flow to obtain an RGB image free of shake, noise, and blur, which serves as the image data acquired by the camera and is displayed.
Compared with the foregoing embodiment in which RGB image 13 is deformed according to flow4, the difference is likewise that the positional offset vector of each pixel is queried in the target flow.
In addition, as one implementation, fusing flow6 and flow4 may proceed as follows: for each pixel in RGB image 13, acquire its positional offset vector 3 in flow6 and its positional offset vector 4 in flow4, then add the two vectors to obtain the positional offset vector of that pixel in the target flow.
In other embodiments, after positional offset vector 3 and positional offset vector 4 are added, the resulting vector may further be calibrated with specific parameters (e.g., an image rotation parameter) to obtain the pixel's positional offset vector in the target flow.
In other possible embodiments, the electronic device may also enable the warp module to perform the anti-shake processing and the deblurring processing separately.
Illustratively, after S103, the anti-shake processing may be started first to obtain the shake-eliminated RGB image 6. Then, the corresponding intermediate parameter flow1 is determined from the shake-eliminated RGB image 6 and the deblurring model 1, and the shake-eliminated RGB image 6 is deformed according to flow1 to obtain an RGB image 7 with noise, blur, and shake all removed.
Also illustratively, after S303, the anti-shake processing may be started first to obtain the shake-eliminated RGB image 13. Then, the corresponding intermediate parameter flow4 is determined from the shake-eliminated RGB image 13 (together with the intermediate parameter flow3) and the deblurring model 2, and the shake-eliminated RGB image 13 is deformed according to flow4 to obtain an RGB image 14 with noise, blur, and shake all removed.
Further illustratively, after S105, anti-shake processing is performed on the RGB image 7, so as to obtain an RGB image from which noise, blur, and shake are removed.
Further illustratively, after S305, anti-shake processing is performed on the RGB image 14, so as to obtain an RGB image with noise, blur and shake removed.
In some embodiments, in scenes with high real-time requirements, such as video capture, each acquired frame may be processed according to the methods in the foregoing embodiments; adopting a hardware warp module during this processing further improves the real-time performance of deblurring.
The embodiment of the application also provides an electronic device, which may include: a memory and one or more processors. The memory is coupled to the processor. The memory is for storing computer program code, the computer program code comprising computer instructions. The computer instructions, when executed by the processor, cause the electronic device to perform the steps performed by the handset in the embodiments described above. Of course, the electronic device includes, but is not limited to, the memory and the one or more processors described above.
In some embodiments, the electronic device may be configured with a first model, namely the deblurring model 1 in the foregoing embodiments; that is, the first model can predict, for a given frame, the position change of each pixel of the RGB image before and after image blur is removed. In a scene where the user needs the electronic device to take a photograph, the user may perform an operation indicating photographing, i.e., a first operation. For example, the first operation may be an operation indicating that the camera system application should be opened, whereupon the electronic device displays a photographing preview interface (which may also be referred to as a first interface) in response to the first operation. Also by way of example, the first operation may be an operation indicating that a short video application should capture video, whereupon the electronic device displays a video capture interface (which may also be referred to as a first interface). Further illustratively, the first operation may be an operation indicating that a live-streaming application should start a live broadcast, whereupon the electronic device displays a live broadcast interface (which may also be referred to as a first interface).
During the display of the first interface, the electronic device may acquire a first RAW image, such as RAW image 3, through the image sensor. After noise has been removed from the first RAW image (the noise-removed first RAW image may also be referred to as RAW image 4), it is converted into a first image (RGB image 6); for example, the ISP converts the first RAW image (i.e., RAW image 4) into a first image in RGB mode.
After the first image is obtained, a first parameter, i.e., the intermediate parameter flow1 in the foregoing embodiments, is predicted from the first model and the first image. The first parameter includes a predicted first vector: the predicted positional offset vector of a first pixel point before and after image blur is removed from the first image. For example, the first vector is the positional offset vector between a first position and a second position, where the first position is the position of the first pixel point in the first image and the second position is the predicted position of the first pixel point in the deblurred first image. In the deblurred first image, any pixel whose position has changed may be referred to as a first pixel point; there may be one or more first pixel points, depending on the actual situation. Since each first pixel point corresponds to a first vector, the first parameter can indicate the position change of each pixel in the first image before and after blur removal.
After the first parameter is determined, the electronic device may eliminate the image blur in the first image by deforming it; the image data obtained from this deformation may be referred to as a second image. Finally, the electronic device may display the second image in the first interface.
In other embodiments, the second model, i.e. the denoising model 1 mentioned in the previous embodiment, may also be configured in the electronic device. That is, the second model can realize a function of eliminating noise in the RAW image. In this way, the electronic device may use the second model to cancel noise in the first RAW image before converting the first RAW image into the first image, and after the noise in the first RAW image is canceled, convert the first RAW image into the first image, so that the obtained first image is also the noise-canceled image.
As one implementation, the electronic device deforms the first image to obtain the second image as follows: query the first parameter for the first vector corresponding to the first pixel point. For example, when there are multiple first pixel points, each may be traversed in turn; for each one, the corresponding first vector is queried in the first parameter, and the first pixel point is moved in the first image according to that vector, yielding the second image (i.e., RGB image 7).
As another implementation, before deforming the first image to obtain the second image, the electronic device may start an anti-shake processing flow based on the first image. In this process, a second parameter (i.e., the intermediate parameter flow5 or flow6 in the foregoing embodiments) may be obtained. The second parameter includes a second vector: the predicted positional offset vector of the first pixel point before and after image shake is removed from the first image. For example, the second vector indicates the positional offset vector between the first position and a third position, where the third position is the predicted position of the first pixel point in the shake-eliminated first image; if the position of the first pixel point is predicted to be unchanged, the third position coincides with the first position. The second parameter may further include the positional offset vectors of other pixels, i.e., pixels in the first image other than the first pixel point; these vectors indicate the position changes of those pixels before and after image shake is eliminated.
In the above embodiment, to reduce the number of deformation passes over the image and thereby reduce power consumption, the electronic device may fuse the first parameter and the second parameter to determine a third parameter (i.e., the target flow) before deforming the first image, that is, before moving any pixel in the first image.
For example, the position offset vectors corresponding to the same pixel point in the first parameter and the second parameter may be superimposed to obtain the third parameter.
Of course, if a pixel has a corresponding first vector in the first parameter, but does not have a corresponding second vector in the second parameter, the first vector of the pixel is taken as the corresponding positional offset vector of the pixel in the third parameter. Also, if a pixel has a corresponding second vector in the second parameter, but does not have a corresponding first vector in the first parameter, the second vector of the pixel is taken as the corresponding positional offset vector of the pixel in the third parameter.
For example, a first vector corresponding to a first pixel point is queried in a first parameter, and a second vector corresponding to the first pixel point is queried in a second parameter; and superposing the first vector and the second vector, so that a third vector corresponding to the first pixel point can be obtained and used as a position offset vector corresponding to the first pixel point in a third parameter.
For example, the position offset vectors corresponding to the same pixel point in the first parameter and the second parameter may be superimposed, and then the obtained position offset vectors are calibrated according to the pre-configured calibration parameters to obtain the third parameter.
If a pixel has a corresponding first vector in a first parameter but does not have a corresponding second vector in a second parameter, the first vector of the pixel is calibrated as the corresponding positional offset vector of the pixel in a third parameter. Also, if a pixel has a corresponding second vector in the second parameter, but does not have a corresponding first vector in the first parameter, the second vector of the pixel is calibrated as the corresponding positional offset vector of the pixel in the third parameter.
For example, the first vector corresponding to the first pixel point is queried in the first parameter, and the second vector corresponding to the first pixel point is queried in the second parameter; the two are superimposed to obtain a fourth vector, which is then calibrated according to preconfigured calibration parameters, including a rotation angle and a translation distance, to obtain the third vector corresponding to the first pixel point.
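A hedged sketch of one possible calibration, reading the rotation angle and translation distance as a 2-D rigid transform applied to the superimposed vector; the text does not spell out the exact formula, so this form is an assumption:

    import numpy as np

    def calibrate_vector(fourth_vec, angle_rad: float, translation) -> np.ndarray:
        # Rotate the summed offset vector by angle_rad, then translate it.
        c, s = np.cos(angle_rad), np.sin(angle_rad)
        rot = np.array([[c, -s],
                        [s,  c]])
        return rot @ np.asarray(fourth_vec, float) + np.asarray(translation, float)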
In the above example, after determining the third parameter, the electronic device may query it for the third vector corresponding to the first pixel point and move the first pixel point in the first image according to that vector; this may be referred to simply as moving the first pixel point according to the third parameter. In the same way, the other pixels in the first image may also be moved according to the third parameter. The electronic device thereby obtains a second image free of blur, noise, and image shake.
In other embodiments, a third model (i.e., denoising model 2) and a fourth model (i.e., deblurring model 2) may also be configured in the electronic device. Illustratively, the third model may eliminate noise in the RAW image, and predict the position change of each pixel point in the RAW image of the same frame before and after removing the image blur. Also for example, the fourth model may predict the position change of each pixel point in the same frame RGB image before and after removing the image blur.
Likewise, the electronic device displays a second interface in response to a second operation indicating image acquisition. The second operation is similar to the first operation, and the second interface to the first interface, so they are not described again here.
During the display of the second interface, the electronic device may acquire a second RAW image (i.e., RAW image 9) through the image sensor.
The electronic device may further determine a noise-free third RAW image (i.e., RAW image 10) and predict a fourth parameter (i.e., the intermediate parameter flow3) according to the third model and the second RAW image. The fourth parameter includes a predicted fifth vector: the predicted positional offset vector of a second pixel point before and after image blur is removed from the third RAW image. For example, the fifth vector is the positional offset vector between a fourth position and a fifth position, where the fourth position is the position of the second pixel point in the third RAW image and the fifth position is its predicted position in the deblurred third RAW image. The second pixel point in the third RAW image is analogous to the first pixel point in the first image: it is a pixel that must be moved during blur elimination, and there may be one or more such pixels, depending on the actual situation.
In addition, after obtaining the third RAW image and the fourth parameter, the electronic device may convert the third RAW image into a third image (e.g., RGB image 13) in RGB mode.
In this way, the electronic device can predict a fifth parameter (e.g., the intermediate parameter flow4) based on the fourth model, the fourth parameter, and the third image. The fifth parameter includes a predicted sixth vector: the predicted positional offset vector of the second pixel point before and after image blur is removed from the third image. For example, the sixth vector is the positional offset vector between a sixth position and a seventh position, where the sixth position is the position of the second pixel point in the third image and the seventh position is its position in the deblurred third image.
After determining the fifth parameter, the electronic device may deform the third image, e.g., move the position of the second pixel in the third image, resulting in a fourth image (RGB image 14) that is noiseless and unblurred, and refresh the fourth image into the second interface.
In some embodiments, the electronic device deforms the third image (e.g., moves the position of the second pixel point in it) to obtain the fourth image as follows: query the fifth parameter for the sixth vector corresponding to the second pixel point, then move the second pixel point in the third image according to the sixth vector to obtain the fourth image.
In other embodiments, before deforming the third image (e.g., moving the position of the second pixel point in it) to obtain the fourth image, the electronic device may start an image anti-shake processing flow based on the third image. During this processing, the electronic device may obtain a sixth parameter (e.g., the intermediate parameter flow5 or flow6), which includes a seventh vector corresponding to the second pixel point: the predicted positional offset vector of the second pixel point before and after image shake is removed from the third image. For example, the seventh vector is the positional offset vector between the sixth position and an eighth position, where the eighth position is the predicted position of the second pixel point in the shake-eliminated third image. In some cases, pixels other than the second pixel point may also change position in the shake-eliminated third image, so the sixth parameter may further include the positional offset vectors corresponding to those other pixels.
In the above embodiment, before the third image is deformed (e.g., before the position of the second pixel point is moved) to obtain the fourth image, the electronic device may further fuse the sixth parameter and the fifth parameter to determine a seventh parameter, i.e., the target flow.
For example, the position offset vectors corresponding to the same pixel point in the sixth parameter and the fifth parameter may be superimposed to obtain the seventh parameter.
Of course, if a pixel has a corresponding sixth vector in the fifth parameter, but does not have a corresponding seventh vector in the sixth parameter, the sixth vector of the pixel is taken as the corresponding positional offset vector of the pixel in the seventh parameter. Also, if one pixel has a corresponding seventh vector in the sixth parameter, but does not have a corresponding sixth vector in the fifth parameter, the seventh vector of the pixel is taken as the corresponding positional offset vector of the pixel in the seventh parameter.
For example, a sixth vector corresponding to the second pixel point is queried in the fifth parameter, and a seventh vector corresponding to the second pixel point is queried in the sixth parameter; the sixth vector and the seventh vector are superimposed, and thus an eighth vector corresponding to the second pixel point can be obtained as a positional displacement vector corresponding to the second pixel point in the seventh parameter.
For example, the position offset vectors corresponding to the same pixel point in the sixth parameter and the fifth parameter may be superimposed, and then the obtained position offset vector may be calibrated according to the pre-configured calibration parameter to obtain the seventh parameter.
If a pixel has a corresponding sixth vector in the fifth parameter, but does not have a corresponding seventh vector in the sixth parameter, the sixth vector of the pixel is calibrated as the corresponding positional offset vector of the pixel in the seventh parameter. Also, if a pixel has a corresponding seventh vector in the sixth parameter, but does not have a corresponding sixth vector in the fifth parameter, the seventh vector of the pixel is calibrated as the corresponding positional offset vector of the pixel in the seventh parameter.
For example, the sixth vector corresponding to the second pixel point is queried in the fifth parameter, and the seventh vector corresponding to the second pixel point is queried in the sixth parameter; the two are superimposed to obtain a ninth vector, which is then calibrated according to preconfigured calibration parameters, including a rotation angle and a translation distance, to obtain the eighth vector corresponding to the second pixel point.
In the above example, after determining the seventh parameter, the electronic device may query it for the eighth vector corresponding to the second pixel point and move the second pixel point in the third image according to that vector; this may be referred to simply as moving the second pixel point according to the seventh parameter. In the same way, the other pixels in the third image may also be moved according to the seventh parameter. The electronic device thereby obtains a fourth image free of blur and noise.
In other embodiments, the electronic device may also train the models required by the above embodiments, e.g., the first model, the second model, the third model, the fourth model, and so on.
Illustratively, the process of the electronic device training the first model is as follows:
The electronic device may obtain training sample data, such as training sample 2, comprising a first sample image (e.g., sample RGB image 3) and a second sample image (e.g., RGB image 4), where the second sample image is the image data obtained by eliminating the image blur in the first sample image.
The electronic device may process the first sample image with a preconfigured initial model (e.g., initial neural network model 2) to obtain an eighth parameter (e.g., an intermediate parameter flow 1). The eighth parameter includes a predicted tenth vector, which is the predicted position offset vector of the third pixel point before and after the image blur is removed from the first sample image; for example, the tenth vector is the position offset vector between a ninth position and a tenth position, where the ninth position is the position of the third pixel point in the first sample image, and the tenth position is the predicted position of the third pixel point in the first sample image from which the image blur is removed. The third pixel point is a pixel point in the first sample image, and is also a pixel point whose position changes after the blur of the first sample image is removed.
The electronic device deforms the first sample image according to the tenth vector in the eighth parameter (for example, moves the third pixel point in the first sample image according to the tenth vector) to obtain a first RGB image (e.g., RGB image 5), and then iterates the model parameters of the initial model according to the difference between the second sample image and the first RGB image; after the initial model is trained to convergence, the first model is obtained. It can be appreciated that the manner of determining whether a neural network model has converged may refer to the related art, and is not described herein.
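As a rough sketch of this training loop, the following PyTorch-style code assumes the initial model maps a blurred RGB tensor to a dense offset field, deforms the input with that field, and minimizes an L1 difference against the sharp sample. The toy one-layer model, the random stand-in data, and the choice of optimizer and loss are illustrative assumptions; likewise, the sketch uses backward bilinear warping (each output pixel samples its offset source location) because it is differentiable, as a stand-in for the forward pixel movement described above.

import torch
import torch.nn.functional as F

def warp_by_flow(img, flow):
    # Deform img (N,C,H,W) by a per-pixel offset field flow (N,2,H,W):
    # each output pixel samples the input at its offset position (backward warp).
    n, _, h, w = img.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys)).float().to(img.device)   # (2,H,W), x before y
    coords = base.unsqueeze(0) + flow                     # sampling positions
    gx = 2.0 * coords[:, 0] / (w - 1) - 1.0               # normalize to [-1, 1]
    gy = 2.0 * coords[:, 1] / (h - 1) - 1.0
    grid = torch.stack((gx, gy), dim=-1)                  # (N,H,W,2)
    return F.grid_sample(img, grid, mode="bilinear", align_corners=True)

model = torch.nn.Conv2d(3, 2, kernel_size=3, padding=1)  # toy stand-in for initial neural network model 2
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
blurred = torch.rand(1, 3, 64, 64)   # first sample image (random stand-in data)
sharp = torch.rand(1, 3, 64, 64)     # second sample image (blur-free ground truth)
for step in range(200):
    flow = model(blurred)                    # eighth parameter: predicted offset field
    deblurred = warp_by_flow(blurred, flow)  # deformed first sample image (first RGB image)
    loss = F.l1_loss(deblurred, sharp)       # difference vs. the second sample image
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                         # iterate model parameters toward convergence

A real training run would iterate over a dataset of paired blurred and sharp images and stop according to a convergence criterion, as noted above.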
Illustratively, the process of the electronic device training the third model is as follows:
the electronic device obtains training sample data, such as training sample 4. The training sample data includes a third sample image (sample RAW image 5) and a fourth sample image (RAW image 7 corresponding to the sample RAW image 5), which is image data obtained by removing noise and image blur from the third sample image.
The electronic device processes the third sample image by using a preconfigured initial model (initial neural network model 3) to obtain a fourth RAW image (RAW image 6) and a ninth parameter (intermediate parameter flow 3). The ninth parameter includes a predicted eleventh vector, which is the predicted position offset vector of the fourth pixel point before and after the image blur is removed from the fourth RAW image; for example, the eleventh vector is the position offset vector between an eleventh position and a twelfth position, where the eleventh position is the position of the fourth pixel point in the fourth RAW image, and the twelfth position is the predicted position of the fourth pixel point in the fourth RAW image from which the image blur is removed. The fourth pixel point is a pixel point in the fourth RAW image, and is also a pixel point whose position changes after the image blur of the fourth RAW image is removed.
The electronic device may deform the fourth RAW image according to the eleventh vector in the ninth parameter (for example, move the position of the fourth pixel point in the fourth RAW image according to the eleventh vector) to obtain a sixth sample image, and iterate the model parameters of the initial model according to the difference between the sixth sample image and the fourth sample image; after the initial model is trained to convergence, the third model is obtained. It can be appreciated that the manner of determining whether a neural network model has converged may refer to the related art, and is not described herein.
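Under the same illustrative assumptions as the previous sketch (the warp_by_flow helper, toy modules, random stand-in data), the joint denoise-and-deblur objective of the third model might look as follows; the two-headed DenoiseFlowNet is a hypothetical stand-in, not the architecture of initial neural network model 3.

import torch
import torch.nn.functional as F
# warp_by_flow is the helper defined in the previous sketch

class DenoiseFlowNet(torch.nn.Module):
    # Hypothetical two-headed model: one head predicts a denoised RAW image,
    # the other a per-pixel position offset field.
    def __init__(self, channels=1):
        super().__init__()
        self.denoise_head = torch.nn.Conv2d(channels, channels, 3, padding=1)
        self.flow_head = torch.nn.Conv2d(channels, 2, 3, padding=1)

    def forward(self, x):
        return self.denoise_head(x), self.flow_head(x)

model = DenoiseFlowNet()                    # stand-in for initial neural network model 3
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
noisy_blurred = torch.rand(1, 1, 64, 64)    # third sample image (stand-in RAW data)
clean_sharp = torch.rand(1, 1, 64, 64)      # fourth sample image (noise- and blur-free)
for step in range(200):
    denoised, flow = model(noisy_blurred)   # fourth RAW image + ninth parameter
    restored = warp_by_flow(denoised, flow) # sixth sample image
    loss = F.l1_loss(restored, clean_sharp) # difference vs. the fourth sample image
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

Because the loss is taken after the deformation, the gradient drives both heads at once: the denoise head learns to remove noise, and the flow head learns offsets whose warp removes the blur.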
Illustratively, the process of the electronic device training the fourth model is as follows:
the electronic device obtains training sample data, such as training sample 3. The training sample data includes a third sample image (sample RAW image 5) and a fifth sample image (e.g., RGB image 8), wherein the third sample image is a RAW image, and the fifth sample image is an RGB image obtained by converting the third sample image after removing noise and image blur.
The electronic device processes the third sample image by using a preconfigured third model to obtain a fifth RAW image (noise-removed RAW image 6) and a tenth parameter (intermediate parameter flow 3). The tenth parameter includes a predicted twelfth vector, which is the predicted position offset vector of the fifth pixel point before and after the image blur is removed from the fifth RAW image; for example, the twelfth vector is the position offset vector between a thirteenth position and a fourteenth position, where the thirteenth position is the position of the fifth pixel point in the fifth RAW image, and the fourteenth position is the predicted position of the fifth pixel point in the fifth RAW image from which the image blur is removed. The fifth pixel point is a pixel point in the fifth RAW image, and is also a pixel point whose position changes after the blur of the fifth RAW image is removed.
After converting the fifth RAW image into a second RGB image (e.g., RGB image 11), the electronic device processes the second RGB image and the tenth parameter with a preconfigured initial model (e.g., initial neural network model 4) to obtain an eleventh parameter (e.g., an intermediate parameter flow 4). The eleventh parameter includes a predicted thirteenth vector, which is the predicted position offset vector of the sixth pixel point before and after the image blur is removed from the second RGB image; for example, the thirteenth vector is the position offset vector between a fifteenth position and a sixteenth position, where the fifteenth position is the position of the sixth pixel point in the second RGB image, and the sixteenth position is the predicted position of the sixth pixel point in the second RGB image from which the image blur is removed. The sixth pixel point is a pixel point in the second RGB image whose position changes after the image blur of the second RGB image is removed. The second RGB image is then deformed according to the thirteenth vector in the eleventh parameter (for example, the position of the sixth pixel point in the second RGB image is moved according to the thirteenth vector) to obtain a third RGB image (e.g., RGB image 12).
The electronic device iterates the model parameters of the initial model according to the difference between the third RGB image and the fifth sample image; after the initial model is trained to convergence, the fourth model is obtained. It can be appreciated that the manner of determining whether a neural network model has converged may refer to the related art, and is not described herein.
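The cascaded training of the fourth model can be sketched in the same illustrative style, assuming the DenoiseFlowNet stand-in from the previous snippet is already trained and frozen, and reducing the RAW-to-RGB conversion to a toy 1 x 1 convolution in place of a full demosaicing pipeline; all module and variable names here are assumptions for illustration.

import torch
import torch.nn.functional as F
# DenoiseFlowNet and warp_by_flow are the helpers from the previous sketches

third_model = DenoiseFlowNet().requires_grad_(False)  # preconfigured, frozen third model
raw_to_rgb = torch.nn.Conv2d(1, 3, kernel_size=1).requires_grad_(False)  # toy RAW-to-RGB stand-in
fourth_model = torch.nn.Conv2d(3 + 2, 2, 3, padding=1)  # consumes second RGB image + tenth parameter
optimizer = torch.optim.Adam(fourth_model.parameters(), lr=1e-4)

raw = torch.rand(1, 1, 64, 64)           # third sample image (stand-in RAW data)
target_rgb = torch.rand(1, 3, 64, 64)    # fifth sample image (clean RGB ground truth)
for step in range(200):
    with torch.no_grad():
        denoised_raw, flow10 = third_model(raw)  # fifth RAW image + tenth parameter
    rgb = raw_to_rgb(denoised_raw)               # second RGB image
    flow11 = fourth_model(torch.cat((rgb, flow10), dim=1))  # eleventh parameter
    deblurred = warp_by_flow(rgb, flow11)        # third RGB image
    loss = F.l1_loss(deblurred, target_rgb)      # difference vs. the fifth sample image
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

Only the fourth model is updated; the frozen third model supplies the denoised RAW image and the tenth parameter that the fourth model refines in RGB space.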
The embodiment of the application also provides a chip system, which can be applied to the electronic equipment in the previous embodiment. As shown in fig. 12, the system-on-chip includes at least one processor 2201 and at least one interface circuit 2202. The processor 2201 may be a processor in an electronic device as described above. The processor 2201 and the interface circuit 2202 may be interconnected by wires. The processor 2201 may receive and execute computer instructions from the memory of the electronic device described above through the interface circuit 2202. The computer instructions, when executed by the processor 2201, cause the electronic device to perform the steps performed by the handset in the embodiments described above. Of course, the chip system may also include other discrete devices, which are not specifically limited in this embodiment of the present application.
It will be clearly understood by those skilled in the art from the foregoing description of the embodiments that, for convenience and brevity of description, only the division into the above functional modules is illustrated by way of example; in practical applications, the above functions may be allocated to different functional modules as required, that is, the internal structure of the apparatus may be divided into different functional modules to implement all or part of the functions described above. For the specific working processes of the systems, devices, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated herein.
The functional units in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated units, if implemented in the form of software functional units and sold or used as standalone products, may be stored in a computer-readable storage medium. Based on such an understanding, the technical solutions of the embodiments of the present application, in essence, or the part thereof contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes media capable of storing program code, such as a flash memory, a removable hard disk, a read-only memory, a random access memory, a magnetic disk, or an optical disc.
The foregoing is merely a specific implementation of the embodiments of the present application, but the protection scope of the embodiments of the present application is not limited thereto, and any changes or substitutions within the technical scope disclosed in the embodiments of the present application should be covered by the protection scope of the embodiments of the present application. Therefore, the protection scope of the embodiments of the present application shall be subject to the protection scope of the claims.

Claims (16)

1. An image processing method, which is applied to an electronic device, wherein a first model is configured in the electronic device, the first model is used for predicting the position change condition of each pixel point in a same frame of RGB color mode image before and after image blurring is removed, and the method comprises the following steps:
responsive to a first operation indicating to acquire an image, displaying a first interface;
during the display of the first interface, acquiring a first original RAW image, wherein the first RAW image is image data acquired by an image sensor in the electronic equipment;
after removing noise from the first RAW image, converting the first RAW image into a first image, wherein the first image is the RGB image;
predicting a first parameter according to the first model and the first image, wherein the first parameter comprises a first vector corresponding to a first pixel point, and the first vector is a predicted position offset vector of the first pixel point before and after the image blurring of the first image is removed;
After the first parameter is determined, the position of the first pixel point in the first image is moved, and a second image with image blurring removed is obtained;
refreshing the second image into the first interface.
2. The method of claim 1, wherein the electronic device includes a second model therein for canceling noise in the RAW image, the method further comprising, prior to the converting the first RAW image into the first image:
and eliminating noise in the first RAW image by using the second model.
3. The method of claim 1 or 2, wherein said moving the position of the first pixel point in the first image comprises:
querying the first vector corresponding to the first pixel point in the first parameter;
and moving the first pixel point in the first image according to the first vector.
4. The method according to claim 1 or 2, wherein before the moving the position of the first pixel point in the first image, the method further comprises:
determining a second parameter, wherein the second parameter comprises a second vector corresponding to the first pixel point, and the second vector is a position offset vector of the first pixel point before and after the first image is subjected to image blurring removal;
Fusing the first parameter and the second parameter, and determining a third parameter, wherein the third parameter comprises a third vector corresponding to the first pixel point;
the moving the position of the first pixel point in the first image includes: querying the third vector corresponding to the first pixel point in the third parameter;
and moving the first pixel point in the first image according to the third vector.
5. The method of claim 4, wherein the fusing the first and second parameters to determine a third parameter comprises:
inquiring a first vector corresponding to the first pixel point in the first parameter, and inquiring a second vector corresponding to the first pixel point in the second parameter;
and superposing the first vector and the second vector to obtain a third vector corresponding to the first pixel point.
6. The method of claim 4, wherein the fusing the first and second parameters to determine a third parameter comprises:
inquiring a first vector corresponding to the first pixel point in the first parameter, and inquiring a second vector corresponding to the first pixel point in the second parameter;
Superposing the first vector and the second vector to obtain a fourth vector;
and calibrating the fourth vector according to a preset calibration parameter to obtain a third vector corresponding to the first pixel point, wherein the calibration parameter comprises a rotation angle and a translation distance.
7. An image processing method is characterized by being applied to an electronic device, wherein a third model and a fourth model are configured in the electronic device; the third model is used for eliminating noise in a RAW image and predicting the position change condition of each pixel point in the same frame of RAW image before and after removing image blurring, and the fourth model is used for predicting the position change condition of each pixel point in the same frame of RGB image before and after removing image blurring, and the method comprises the following steps:
responsive to a second operation indicating to acquire the image, displaying a second interface;
during the display of the second interface, acquiring a second RAW image, wherein the second RAW image is an image acquired by an image sensor in the electronic equipment;
determining a noise-free third RAW image and predicting a fourth parameter according to the third model and the second RAW image, wherein the fourth parameter comprises a fifth vector corresponding to a second pixel point, and the fifth vector is a predicted position offset vector of the second pixel point before and after the third RAW image removes image blurring;
Converting the third RAW image into a third image, wherein the third image is the RGB image;
predicting a fifth parameter according to the fourth model, the fourth parameter and the third image, wherein the fifth parameter comprises a sixth vector corresponding to the second pixel point, and the sixth vector is a predicted position offset vector of the second pixel point before and after the image blurring of the third RAW image is removed;
after the fifth parameter is determined, the position of the second pixel point in the third image is moved, and a fourth image with image blurring removed is obtained;
refreshing the fourth image into the second interface.
8. The method of claim 7, wherein said moving the position of the second pixel point in the third image comprises:
inquiring the sixth vector corresponding to the second pixel point in the fifth parameter;
and moving the second pixel point in the third image according to the sixth vector to obtain the fourth image.
9. The method of claim 7, wherein before the moving the position of the second pixel point in the third image, the method further comprises:
Determining a sixth parameter, wherein the sixth parameter comprises a seventh vector corresponding to the second pixel point, and the seventh vector is a predicted position offset vector of the second pixel point before and after the third image eliminates image jitter;
fusing the sixth parameter and the fifth parameter, and determining a seventh parameter, wherein the seventh parameter comprises an eighth vector corresponding to the second pixel point;
the moving the position of the second pixel point in the third image includes: in the seventh parameter, inquiring the eighth vector corresponding to the second pixel point;
and according to the eighth vector, moving the second pixel point in the third image to obtain the fourth image.
10. The method of claim 9, wherein the fusing the sixth parameter and the fifth parameter to determine a seventh parameter comprises:
inquiring the sixth vector corresponding to the second pixel point in the fifth parameter, and inquiring the seventh vector corresponding to the second pixel point in the sixth parameter;
and superposing the sixth vector and the seventh vector to obtain an eighth vector corresponding to the second pixel point in the seventh parameter.
11. The method of claim 9, wherein the fusing the sixth parameter and the fifth parameter to determine a seventh parameter comprises:
inquiring the sixth vector corresponding to the second pixel point in the fifth parameter, and inquiring the seventh vector corresponding to the second pixel point in the sixth parameter;
superposing the sixth vector and the seventh vector to obtain a ninth vector corresponding to the second pixel point in the seventh parameter;
and calibrating the ninth vector according to a pre-configured calibration parameter to obtain an eighth vector corresponding to the second pixel point in the seventh parameter, wherein the calibration parameter comprises a rotation angle and a translation distance.
12. A method of model training, the method comprising:
acquiring training sample data, wherein the training sample data comprises a first sample image and a second sample image, the second sample image is image data obtained by eliminating image blurring in the first sample image, and the first sample image and the second sample image are RGB images;
processing the first sample image by using a preconfigured initial model to obtain an eighth parameter, wherein the eighth parameter comprises a tenth vector corresponding to a third pixel point, and the tenth vector is a predicted position offset vector of the third pixel point before and after the image blur of the first sample image is removed;
According to a tenth vector in the eighth parameter, moving the third pixel point in the first sample image to obtain a first RGB image;
iterating model parameters of the initial model according to the difference between the second sample image and the first RGB image;
after the initial model is trained to converge, a first model is obtained.
13. A method of model training, the method comprising:
acquiring training sample data, wherein the training sample data comprises a third sample image and a fourth sample image, the fourth sample image is image data obtained by removing noise and image blurring of the third sample image, and the third sample image and the fourth sample image are RAW images;
processing the third sample image by using a preconfigured initial model to obtain a fourth RAW image and a ninth parameter, wherein the ninth parameter comprises an eleventh vector corresponding to a fourth pixel point, and the eleventh vector is a predicted position offset vector of the fourth pixel point before and after image blurring of the fourth RAW image is removed;
according to the eleventh vector in the ninth parameter, moving the fourth pixel point in the fourth RAW image to obtain a sixth sample image;
Iterating model parameters of the initial model according to the difference between the sixth sample image and the fourth sample image;
and after the initial model is trained to be converged, obtaining a third model.
14. A method of model training, the method comprising:
acquiring training sample data, wherein the training sample data comprises a third sample image and a fifth sample image, the third sample image is a RAW image, and the fifth sample image is an RGB image obtained by converting the third sample image after removing noise and image blurring;
processing the third sample image by using a preconfigured third model to obtain a fifth RAW image and a tenth parameter, wherein the third model is used for eliminating noise in the RAW image and predicting the position change condition of each pixel point in the same frame of RAW image before and after image blurring is removed, the fifth RAW image is noise-free image data, the tenth parameter comprises a twelfth vector corresponding to a fifth pixel point, and the twelfth vector is a predicted position offset vector of the fifth pixel point before and after image blurring is removed in the fifth RAW image;
After converting the fifth RAW image into a second RGB image, processing the second RGB image and the tenth parameter by using a preconfigured initial model to obtain an eleventh parameter, wherein the eleventh parameter comprises a thirteenth vector corresponding to a sixth pixel point, and the thirteenth vector is a predicted position offset vector of the sixth pixel point before and after the second RGB image is subjected to image blurring removal;
according to the thirteenth vector in the eleventh parameter, moving the sixth pixel point in the second RGB image to obtain a third RGB image;
iterating model parameters of the initial model according to the difference between the third RGB image and the fifth sample image;
after the initial model is trained to converge, a fourth model is obtained.
15. An electronic device, comprising one or more processors and a memory; the memory is coupled to the one or more processors and is configured to store computer program code, the computer program code comprising computer instructions which, when executed by the one or more processors, cause the electronic device to perform the method of any one of claims 1-14.
16. A computer storage medium comprising computer instructions which, when run on an electronic device, cause the electronic device to perform the method of any of claims 1-14.
CN202310222718.4A 2023-03-09 2023-03-09 Image processing method, model training method and electronic equipment Active CN116012262B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310222718.4A CN116012262B (en) 2023-03-09 2023-03-09 Image processing method, model training method and electronic equipment

Publications (2)

Publication Number Publication Date
CN116012262A true CN116012262A (en) 2023-04-25
CN116012262B CN116012262B (en) 2023-08-15

Family

ID=86021323

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310222718.4A Active CN116012262B (en) 2023-03-09 2023-03-09 Image processing method, model training method and electronic equipment

Country Status (1)

Country Link
CN (1) CN116012262B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070297019A1 (en) * 2006-06-23 2007-12-27 Alessandro Foi Apparatus, method, mobile station and computer program product for noise estimation, modeling and filtering of a digital image
WO2016199214A1 (en) * 2015-06-09 2016-12-15 株式会社日立国際電気 Image processing device, image processing method, and vehicle-installed camera
CN107833194A (en) * 2017-11-21 2018-03-23 长沙全度影像科技有限公司 A kind of unzoned lens image recovery method of combination RAW image denoising
CN114298942A (en) * 2021-12-31 2022-04-08 Oppo广东移动通信有限公司 Image deblurring method and device, computer readable medium and electronic equipment
CN114937072A (en) * 2022-05-27 2022-08-23 北京达佳互联信息技术有限公司 Image processing method and device, electronic equipment and computer readable storage medium

Also Published As

Publication number Publication date
CN116012262B (en) 2023-08-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant