CN114926351A - Image processing method, electronic device, and computer storage medium - Google Patents


Info

Publication number
CN114926351A
Authority
CN
China
Prior art keywords: image, moon, area, model, region
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210382866.8A
Other languages
Chinese (zh)
Other versions
CN114926351B (en)
Inventor
应国豪
曹瑞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Honor Device Co Ltd
Original Assignee
Honor Device Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Honor Device Co Ltd
Priority to CN202210382866.8A
Publication of CN114926351A
Application granted
Publication of CN114926351B
Legal status: Active
Anticipated expiration

Classifications

    • G06T 5/73 — Image enhancement or restoration: deblurring; sharpening
    • G06N 3/04 — Neural networks: architecture, e.g. interconnection topology
    • G06N 3/08 — Neural networks: learning methods
    • G06T 7/194 — Image analysis: segmentation or edge detection involving foreground-background segmentation
    • H04N 23/951 — Computational photography systems using two or more images to influence resolution, frame rate or aspect ratio
    • G06T 2207/20081 — Indexing scheme, special algorithmic details: training; learning
    • G06T 2207/20221 — Indexing scheme, special algorithmic details: image fusion; image merging


Abstract

The application discloses an image processing method, an electronic device, and a computer storage medium. It relates to the field of computer technology and addresses the problem of insufficient image definition. The scheme is as follows: in response to a shooting operation, a first image and a second image are acquired, and a first image area is obtained from the first image, the first image area being the image area of a first subject; the first image area is feature-matched against a 3D model of the first subject to obtain a reference image area; the first image area and the reference image area are input into a high-definition processing model, which outputs a second image area; and the second image area is fused with a third image area to obtain a third image. The third image area is the image area of the shooting background obtained from the first image and/or the second image. Because the second image area incorporates the high-definition characteristics of the reference image area, the third image obtained by fusing the second image area and the third image area is sharper than the first image and the second image.

Description

Image processing method, electronic device, and computer storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to an image processing method, an electronic device, and a computer storage medium.
Background
The moon is a very common shooting subject, and to meet users' needs when photographing it, some mobile phones on the market offer a moon mode. When a mobile phone shoots the moon in the moon mode, it can capture the moon image intelligently.
However, moon images shot in the conventional moon mode are still not sharp enough: their definition falls far short of moon images shot with a single-lens reflex (SLR) camera, so they cannot meet users' requirements for moon-image definition.
Disclosure of Invention
The application provides an image processing method, an electronic device and a computer storage medium, aiming to solve the problem of insufficient image definition.
In order to achieve the above object, the present application provides the following technical solutions:
in a first aspect, the present application provides an image processing method applied to an electronic device, where the image processing method includes:
in response to the photographing operation, a first image (which may be, for example, a first frame image mentioned in the embodiment described below) and a second image (which may be, for example, a second frame image mentioned in the embodiment described below) are acquired. The shooting scene of the first image and the second image is the same. The first image and the second image each include a first subject, and the exposure time period of the first image is shorter than that of the second image.
A first image region (for example, the moon region of the first frame image mentioned below) is obtained from the first image, where the first image region is the image region of the first subject in the first image. The first subject may be the moon mentioned below, or may be the subject of another fixed scene, such as a starry sky or the Mona Lisa portrait.
The first image region is feature-matched with the 3D model of the first subject, and a reference image region (for example, a reference moon region mentioned below) is obtained by matching from the 3D model of the first subject. The 3D model of the first subject has better image quality than the first image region. The image quality may be specifically understood as attributes that can reflect the image quality, such as definition, beautification degree of the image, and the like.
The first image area and the reference image area are input into the high-definition processing model, and a second image area (for example, a high-definition moon area mentioned below) is obtained by the output of the high-definition processing model, wherein the second image area is an image area in which the features of the first image area and the reference image area are fused, and the high-definition processing model is obtained by training the neural network model.
The second image region and the third image region are fused to obtain a third image (for example, a third frame image mentioned below). The third image area is an image area of a shooting background obtained according to the first image and/or the second image.
In the embodiment of the application, after the first image and the second image are acquired, the first image area is obtained from the first image with shorter exposure time and higher definition of the first main body. And then, carrying out feature matching on the first image area and the 3D model of the first subject, and matching to obtain a reference image area from the 3D model of the first subject. Since the image quality of the 3D model of the first subject is better than the first image region, the image quality of the reference image region is also better than the first image region. And inputting the first image area and the reference image area into a high-definition processing model, and outputting by the high-definition processing model to obtain a second image area. The second image area is obtained more clearly because the second image area fuses the characteristics of the first image area and the reference image area. And finally, fusing the second image area and the third image area to obtain a third image, wherein the definition of the third image is superior to that of the first image and the second image. And the third image area is an image area of a shooting background obtained according to the first image and/or the second image. For example, the third image area may be obtained by cutting out the image area of the shooting background directly from the first image or the second image, or may be obtained by fusing the first image and the second image and then cutting out the image area of the shooting background from the fused image. The third image comprises the second image area fused with the high-definition reference image area characteristic, so that the quality of the third image is higher than that of the first image and the second image, and the definition of the image is improved.
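The end-to-end flow described above can be summarized by the following sketch. Every callable passed in is a hypothetical placeholder standing for the corresponding step (subject cropping, 3D-model matching, the trained high-definition model, background generation, and fusion); it is not an API defined by this application.

```python
def process_capture(first_image, second_image,
                    crop_subject, match_model, hd_model, build_background, fuse):
    # Crop the first subject (e.g. the moon) from the short-exposure first image.
    first_region = crop_subject(first_image)
    # Match the cropped region against the high-quality 3D model of the subject.
    reference_region = match_model(first_region)
    # The trained network fuses the two regions into a sharper subject region.
    second_region = hd_model(first_region, reference_region)
    # The background comes from the first and/or second image (e.g. an HDR merge).
    third_region = build_background(first_image, second_image)
    # Paste the enhanced subject back onto the background to form the third image.
    return fuse(second_region, third_region)
```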
In one possible implementation, the high-definition processing model is obtained by training a neural network model through a plurality of sample images (for example, the low-definition moon sample mentioned below) of the first subject, a reference image corresponding to each sample image (for example, the reference moon sample corresponding to the low-definition moon sample mentioned below), and a target image corresponding to each sample image (for example, the target moon sample corresponding to the low-definition moon sample mentioned below). And the reference image corresponding to the sample image is obtained by performing feature matching on the sample image and the 3D model of the first subject.
In another possible implementation manner, before performing feature matching on the first image region and the 3D model of the first subject, and obtaining the reference image region by matching from the 3D model of the first subject, the method further includes:
the first image area is divided into a plurality of blocks, resulting in a plurality of first image blocks (for example, a group a moon taps mentioned below). The 3D model of the first subject is divided into a plurality of blocks, resulting in a plurality of second image blocks (e.g., the B groups of moon crop patches mentioned below).
The feature matching of the first image area and the 3D model of the first subject is performed, and a reference image area is obtained by matching from the 3D model of the first subject, including: and aiming at each first image block, respectively performing feature matching on the first image block and each second image block to obtain a reference image block (reference patch), wherein the reference image block is a second image block matched with the features of the first image block. And combining all the reference image blocks to obtain a reference image area.
In another possible implementation manner, for each first image block, performing feature matching on the first image block and each second image block, and obtaining a reference image block by matching, the method includes:
and respectively calculating the similarity between each first image block and each second image block aiming at each first image block, and determining the second image block with the highest similarity as a reference image block. Specifically, the similarity may be represented by a cosine distance and a cosine similarity.
In another possible implementation manner, the generating process of the third image area includes: generating a high dynamic range (HDR) image (such as the HDR image mentioned below) from the first image and the second image, and cropping out the background area in the HDR image to obtain the third image area (such as the HDR background area mentioned below).
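The application does not name a particular HDR algorithm, so the following sketch uses Mertens exposure fusion from OpenCV as one plausible stand-in for merging the short-exposure and long-exposure frames into an HDR-like background source.

```python
import cv2
import numpy as np

def hdr_like_merge(short_exposure_bgr, long_exposure_bgr):
    """Merge a short-exposure and a long-exposure frame of the same scene.
    Mertens exposure fusion is an assumed stand-in for the HDR step."""
    merge = cv2.createMergeMertens()
    fused = merge.process([short_exposure_bgr, long_exposure_bgr])  # float32 in [0, 1]
    return np.clip(fused * 255, 0, 255).astype(np.uint8)
```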
In another possible implementation, cropping out a background region in the HDR image to obtain a third image region includes:
and identifying the HDR image, identifying an image area where the first main body is located from the HDR image, and removing the image area where the first main body is located from the HDR image to obtain a third image area.
In another possible implementation, obtaining a first image region from a first image includes: the first image is identified, the image area where the first main body is located is identified from the first image, and the image area where the first main body is located is cut out from the first image to obtain the first image area.
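Assuming the recognition step yields a bounding box for the first subject, the two cropping operations above can be illustrated as follows; zero-filling the removed subject area is merely one simple interpretation of "removing" it, and the images are assumed to be NumPy-style arrays.

```python
def crop_subject(image, box):
    """Return the subject region given a detected box (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return image[y1:y2, x1:x2].copy()

def remove_subject(image, box, fill_value=0):
    """Return a background-only copy with the subject box blanked out.
    Zero-filling is one simple interpretation, used for illustration only."""
    x1, y1, x2, y2 = box
    background = image.copy()
    background[y1:y2, x1:x2] = fill_value
    return background
```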
In another possible implementation manner, before acquiring the first image and the second image in response to the shooting operation, the method further includes:
and displaying a shooting interface, wherein the shooting interface is used for displaying the image acquired by the camera of the electronic equipment. When it is recognized that the first subject is included in the acquired image, a first mode (for example, a telescopic mode mentioned below) is entered.
In another possible implementation manner, a method for constructing a high definition processing model includes:
constructing a data sample set; the set of data samples includes: the method comprises the steps of obtaining a plurality of sample images of a first subject, a reference image corresponding to each sample image and a target image corresponding to each sample image, wherein the reference image corresponding to the sample images is obtained by performing feature matching on the sample images and a 3D model of the first subject.
And for each sample image, inputting the sample image and the reference image corresponding to the sample image into the neural network model, and obtaining and outputting an output image corresponding to each sample image by the neural network model.
And adjusting parameters in the neural network model according to the error between the output image corresponding to each sample image output by the neural network model and the target image corresponding to the sample image until the error between the output image of each sample image output by the adjusted neural network model and the target image corresponding to the sample image meets the preset convergence condition, and determining the adjusted neural network model as a high-definition processing model.
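A minimal training loop consistent with this description is sketched below in PyTorch. The L1 loss, the Adam optimizer, and the concrete convergence test are assumptions for illustration; the application only requires that parameters be adjusted until the error between the output image and the target image meets a preset convergence condition.

```python
import torch
import torch.nn as nn

def train_hd_model(model, loader, epochs=10, lr=1e-4, tol=1e-3):
    """Illustrative training loop for the high-definition processing model.
    `loader` yields (sample, reference, target) image tensors."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.L1Loss()                          # assumed error measure
    for epoch in range(epochs):
        epoch_loss = 0.0
        for sample, reference, target in loader:
            output = model(sample, reference)        # network fuses sample + reference
            loss = criterion(output, target)         # error against the target image
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
            epoch_loss += loss.item()
        if epoch_loss / max(len(loader), 1) < tol:   # preset convergence condition
            break
    return model
```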
In another possible implementation manner, after the second image region and the third image region are fused to obtain the third image, the method further includes: saving the third image to the gallery application.
In a second aspect, the present application provides an electronic device, comprising: one or more processors and a memory coupled to the one or more processors for storing computer program code comprising computer instructions which, when executed by the one or more processors, cause the electronic device to perform the image processing method of any of the first aspects as described above.
In one possible implementation, the electronic device further includes a neural-network processing unit (NPU), where the NPU is used to run the high-definition processing model.
In a third aspect, the present application provides a computer storage medium comprising computer instructions which, when run on an electronic device, cause a processor in the electronic device to perform the image processing method according to any of the first aspects described above.
It should be appreciated that the description of technical features, solutions, benefits, or similar language throughout this application does not imply that all of the features and advantages may be realized in any single embodiment. Rather, it should be appreciated that any discussion of a feature or advantage is meant to encompass a particular feature, aspect, or advantage in at least one embodiment. Therefore, the descriptions of technical features, technical solutions or advantages in the present specification do not necessarily refer to the same embodiment. Furthermore, the technical features, technical solutions and advantages described in the present embodiments may also be combined in any suitable manner. One skilled in the relevant art will recognize that an embodiment may be practiced without one or more of the specific features, aspects, or advantages of a particular embodiment. In other embodiments, additional features and advantages may be recognized in certain embodiments that may not be present in all embodiments.
Drawings
FIG. 1 is a view of a scene in which a photograph is taken in the moon mode;
fig. 2 is a first flowchart of an image processing method according to an embodiment of the present application;
FIG. 3 is a hardware block diagram of an electronic device provided herein;
FIG. 4 is a software architecture diagram of an electronic device provided herein;
FIG. 5 is a second flowchart of an image processing method according to an embodiment of the present application;
fig. 6 is a 2D display diagram of a 3D high definition moon model provided in an embodiment of the present application;
fig. 7 is a schematic diagram of a feature matching process of a 3D high-definition moon model according to an embodiment of the present application;
fig. 8 is a structural diagram of a high definition processing model provided in an embodiment of the present application;
FIG. 9 is a schematic diagram of the feature processing procedure of the HD processing model shown in FIG. 8;
FIG. 10 is a display interface of a third frame image obtained by the method shown in FIG. 5;
fig. 11 is a third flowchart of an image processing method according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings in the embodiments of the present application. The terminology used in the following examples is for the purpose of describing particular embodiments only and is not intended to limit the application. As used in the specification of this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, such as "one or more", unless the context clearly indicates otherwise. It should also be understood that in the embodiments of the present application, "one or more" means one, two or more; "and/or" describes the association relationship of associated objects and indicates that three relationships can exist; for example, A and/or B may represent: A alone, both A and B, or B alone, where A and B may be singular or plural. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
Reference throughout this specification to "one embodiment" or "some embodiments," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," or the like, in various places throughout this specification are not necessarily all referring to the same embodiment, but rather "one or more but not all embodiments" unless specifically stated otherwise. The terms "comprising," "including," "having," and variations thereof mean "including, but not limited to," unless expressly specified otherwise.
In the embodiments of the present application, "a plurality of" means two or more. It should also be noted that, in the description of the embodiments of the present application, the terms "first", "second", and the like are used only to distinguish one description from another and are not intended to indicate relative importance or order.
To make the technical solution of the present application clearer, a scene of shooting the moon in the moon mode according to an embodiment of the present application is described first.
The mobile phone displays the home interface shown in (1) of fig. 1, on which a plurality of applications such as camera, contacts, phone, messages, clock, calendar and gallery are displayed. The user clicks the icon 101 of the camera application, and the mobile phone starts the camera application in response to the user's operation and enters the camera interface shown in (2) of fig. 1. The AI control 102 on the interface shown in (2) of fig. 1 is in the on state, and the viewfinder frame 103 is used to acquire the image for the shooting preview and display the preview image in real time. The camera interface shown in (2) of fig. 1 is currently in the photographing mode, and the user can slide to switch to the video recording mode, portrait mode, professional mode, and so on. Also shown in (2) of fig. 1 is a magnification adjustment control 104. The magnification adjustment control 104 may be displayed as a control bar distributed in dots as shown in (2) of fig. 1, or, in other embodiments, as a straight-line control bar. Taking the dotted control bar as an example, the user can change the magnification by dragging the position of the magnification adjustment control 104. For example, "1×" in (2) of fig. 1 indicates that the current magnification is 1×; the user can drag the control 104 shown in (2) of fig. 1 to increase the magnification it displays, change the magnification value, and take a moon photograph of the size desired by the user. After the user increases the magnification, the mobile phone enters the interface shown in (3) of fig. 1.
In the interface shown in (3) of fig. 1, the magnification adjustment control 104 shows that the magnification is 10×, and the viewfinder frame 103 of the mobile phone obtains an image of the enlarged moon. At this time, the AI function intelligently recognizes the moon scene, the mobile phone automatically enters the moon mode, and the moon mode icon 105 on the interface prompts the user that the moon mode has been entered. After the mobile phone enters the moon mode, the user clicks the shooting control 106; the mobile phone responds to the user operation, shoots and processes a moon image with a clear outline in the moon mode, and enters the interface shown in (4) of fig. 1. A thumbnail of the moon image captured in the moon mode is shown in the album control 107 displayed in (4) of fig. 1. By clicking the album control 107, the mobile phone enters the detail interface of the moon image shown in (5) of fig. 1. The detail interface displays the moon image 108 obtained by moon-mode processing.
A moon image may be understood as an image whose subject includes at least the moon. The moon mode may be understood as the shooting mode used when the mobile phone shoots the moon; it may also be called a moon-watching mode, a moon-viewing mode, and so on, and the name of the mode is not limited in the embodiments of the present application. The process of using the camera application shown in fig. 1 is only one way of triggering the moon mode; other scenarios may also trigger the mobile phone to enter the moon mode.
As shown in fig. 2, in the scenario of shooting the moon in the moon mode mentioned in the embodiments of the present application, the process of producing a moon image in the moon mode may be as follows: after the moon mode is entered, the moon is shot to obtain two different frames, a long-frame moon image and a short-frame moon image. Both are frames of the moon shot by the mobile phone in the same scene, but the exposure duration of the long-frame moon image is longer than that of the short-frame moon image. For example, the exposure duration of the short-frame moon image is 5 ms and that of the long-frame moon image is 10 ms.
After the mobile phone obtains the long-frame moon image and the short-frame moon image, it performs moon recognition on both images, crops (Crop) the recognized moon region out of the short-frame moon image, and then performs high-definition processing on the cropped moon region. The moon region is also cut out of the long-frame image, and high dynamic range (HDR) processing is performed on the remaining background region. Finally, the processed moon region and the background region are fused to obtain the moon image. The moon mode may process a moon image obtained when the user takes a photograph or when the user records a video, which is not limited in the embodiments of the present application.
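The final fusion step can be pictured as pasting the processed moon region back into the background at its original coordinates, as in the sketch below; a hard paste is shown for simplicity, whereas a real pipeline would typically blend the seam, which is not detailed here.

```python
def paste_region(background, region, box):
    """Paste the enhanced subject region back at its original coordinates.
    `region` is assumed to have exactly the shape of the box (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    out = background.copy()
    out[y1:y2, x1:x2] = region   # hard paste; seam blending is omitted
    return out
```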
Due to the performance limitations of the mobile phone and other factors, the high-definition processing effect on the moon region is poor, and the definition of the processed moon image differs considerably from that of a high-definition moon image shot with an SLR camera.
In order to improve the definition of a moon subject shot by an electronic device such as a mobile phone, an embodiment of the application provides an image processing method, and in the process of performing high-definition processing on a moon region, features of a high-definition moon image are fused through a neural network so as to improve the definition of the moon region. The image processing method provided by the embodiment of the application can be suitable for the electronic equipment with the camera, such as a mobile phone, a tablet computer, a notebook computer, an intelligent watch and the like.
Fig. 3 shows a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 3, the electronic device may include a processor 310, an external memory interface 320, an internal memory 321, a camera 330, and a display screen 340.
It is to be understood that the illustrated structure of the present embodiment does not constitute a specific limitation to the electronic device. In other embodiments, an electronic device may include more or fewer components than shown, or some components may be combined, some components may be split, or a different arrangement of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
Processor 310 may include one or more processing units, such as: the processor 310 may include an Application Processor (AP), a modem processor, a Graphics Processing Unit (GPU), an Image Signal Processor (ISP), a controller, a video codec, a Digital Signal Processor (DSP), a baseband processor, and/or a neural-Network Processing Unit (NPU), etc. The different processing units may be separate devices or may be integrated into one or more processors. For example, in the present application, the processor 310 may execute any image processing method proposed in the embodiments of the present application, and specifically, refer to the related description of fig. 5.
A memory may also be provided in the processor 310 for storing instructions and data. In some embodiments, the memory in the processor 310 is a cache memory. The memory may hold instructions or data that have just been used or recycled by the processor 310. If the processor 310 needs to reuse the instruction or data, it can be called directly from the memory. Avoiding repeated accesses reduces the latency of the processor 310, thereby increasing the efficiency of the system.
The NPU is a neural-network (NN) computing processor that processes input information quickly by using a biological neural network structure, for example, by using a transfer mode between neurons of a human brain, and can also learn by itself continuously. The NPU can realize applications such as intelligent cognition of electronic equipment, for example: image recognition, face recognition, speech recognition, text understanding, and the like. For example, the present application may run the image processing model mentioned in the embodiment of the present application through the NPU to output a moon region with high definition, which may be specifically described with reference to step S506 of fig. 5.
The external memory interface 320 may be used to connect an external memory card, such as a Micro SD card, to extend the storage capability of the electronic device. The external memory card communicates with the processor 310 through the external memory interface 320 to implement a data storage function. For example, files such as music, video, etc. are saved in an external memory card.
The internal memory 321 may be used to store computer-executable program code, which includes instructions. The processor 310 executes various functional applications of the electronic device and data processing by executing instructions stored in the internal memory 321. For example, in the present embodiment, the processor 310 may perform image processing by executing instructions stored in the internal memory 321.
The electronic device implements display functions via the GPU, the display screen 340, and the application processor, etc. The GPU is a microprocessor for image processing, and is connected to the display screen 340 and an application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. The processor 310 may include one or more GPUs that execute program instructions to generate or alter display information.
The display screen 340 is used to display images, video, and the like. The display screen 340 includes a display panel. The display panel may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, a quantum dot light-emitting diode (QLED), or the like. In some embodiments, the electronic device may include 1 or N display screens 340, where N is a positive integer greater than 1.
A series of graphical user interfaces (GUIs) may be displayed on the display screen 340 of the electronic device, including the home screen of the electronic device. For example, in the embodiment of the present application, the display screen 340 may display a moon image processed in the moon mode.
The electronic device may implement the shooting function through the ISP, the camera 330, the video codec, the GPU, the display screen 340, and the application processor, etc.
The ISP is used to process the data fed back by the camera 330. For example, when a photo is taken, the shutter is opened, light is transmitted to the camera photosensitive element through the lens, the optical signal is converted into an electrical signal, and the camera photosensitive element transmits the electrical signal to the ISP for processing and converting into an image visible to naked eyes. The ISP can also carry out algorithm optimization on noise, brightness and skin color of the image. The ISP can also optimize parameters such as exposure, color temperature and the like of a shooting scene. In some embodiments, the ISP may be provided in camera 330.
The camera 330 is used to capture still images or video. The object generates an optical image through the lens and projects the optical image to the photosensitive element. The photosensitive element may be a Charge Coupled Device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The light sensing element converts the optical signal into an electrical signal, which is then passed to the ISP where it is converted into a digital image signal. And the ISP outputs the digital image signal to the DSP for processing. The DSP converts the digital image signal into image signal in standard RGB, YUV and other formats. In some embodiments, the electronic device may include 1 or N cameras 330, N being a positive integer greater than 1. For example, in the embodiment of the present application, the camera 330 may be used to capture an image or video of a subject carrying the moon.
The digital signal processor is used for processing digital signals, and can process digital image signals and other digital signals. For example, when the electronic device is in frequency bin selection, the digital signal processor is used for performing fourier transform and the like on the frequency bin energy.
Video codecs are used to compress or decompress digital video. The electronic device may support one or more video codecs. Thus, the electronic device can play or record video in a variety of encoding formats, such as: moving Picture Experts Group (MPEG) 1, MPEG2, MPEG3, MPEG4, and the like.
In addition, an operating system runs on the above components. Such as an iOS operating system, an Android open source operating system, a Windows operating system, etc. A running application may be installed on the operating system.
The operating system of the electronic device may employ a layered architecture, an event-driven architecture, a micro-core architecture, a micro-service architecture, or a cloud architecture. The embodiment of the application takes an Android system with a layered architecture as an example, and exemplarily illustrates a software structure of an electronic device.
Fig. 4 is a block diagram of a software structure of an electronic device according to an embodiment of the present application.
The layered architecture divides the software into several layers, each with a clear role and division of labor. The layers communicate with each other through software interfaces. In some embodiments, the Android system is divided into four layers: from top to bottom, the application layer, the application framework layer, the Android runtime and system libraries, and the kernel layer.
The application layer may include a series of application packages. As shown in fig. 4, the application package may include camera, gallery, calendar, phone call, map, navigation, WLAN, bluetooth, music, video, short message, etc. applications. For example, in embodiments of the present application, a camera application may be used to capture images or video including a moon subject.
The application framework layer provides an Application Programming Interface (API) and a programming framework for the application program of the application layer. The application framework layer includes a number of predefined functions. As shown in FIG. 4, the application framework layers may include a window manager, content provider, view system, phone manager, resource manager, notification manager, and the like.
The window manager is used for managing window programs. The window manager can obtain the size of the display screen, judge whether a status bar exists, lock the screen, intercept the screen and the like.
The content provider is used to store and retrieve data and make it accessible to applications. The data may include video, images, audio, calls made and received, browsing history and bookmarks, phone books, etc.
The view system includes visual controls such as controls to display text, controls to display pictures, and the like. The view system may be used to build applications. The display interface may be composed of one or more views. For example, the display interface including the short message notification icon may include a view for displaying text and a view for displaying pictures.
The phone manager is used to provide communication functions of the electronic device. Such as management of call status (including on, off, etc.).
The resource manager provides various resources for the application, such as localized strings, icons, pictures, layout files, video files, and the like.
The notification manager enables the application to display notification information in the status bar, can be used to convey notification-type messages, can disappear automatically after a short dwell, and does not require user interaction. Such as a notification manager used to notify download completion, message alerts, etc. The notification manager may also be a notification that appears in the form of a chart or scroll bar text at the top status bar of the system, such as a notification of a background running application, or a notification that appears on the screen in the form of a dialog window. For example, prompting text information in the status bar, sounding a prompt tone, vibrating the electronic device, flashing an indicator light, etc.
The Android Runtime comprises a core library and a virtual machine. The Android runtime is responsible for scheduling and managing an Android system.
The core library comprises two parts: one part is a function which needs to be called by java language, and the other part is a core library of android.
The application layer and the application framework layer run in a virtual machine. The virtual machine executes java files of the application layer and the application framework layer as binary files. The virtual machine is used for performing the functions of object life cycle management, stack management, thread management, safety and exception management, garbage collection and the like.
The system library may include a plurality of functional modules. For example: surface managers (surface managers), Media Libraries (Media Libraries), three-dimensional graphics processing Libraries (e.g., OpenGL ES), 2D graphics engines (e.g., SGL), and the like.
The surface manager is used to manage the display subsystem and provide fusion of 2D and 3D layers for multiple applications.
The media library supports a variety of commonly used audio, video format playback and recording, and still image files, among others. The media library may support a variety of audio-video encoding formats such as MPEG4, h.264, MP3, AAC, AMR, JPG, PNG, and the like.
The three-dimensional graphic processing library is used for realizing three-dimensional graphic drawing, image rendering, synthesis, layer processing and the like.
The 2D graphics engine is a drawing engine for 2D drawing.
The kernel layer is a layer between hardware and software. The inner core layer at least comprises a display driver, a camera driver, an audio driver and a sensor driver.
Although the Android system is taken as an example for description in the embodiments of the present application, the basic principle is also applicable to electronic devices based on an os such as iOS or Windows.
Embodiments of the present application will be described below with specific reference to fig. 5 to 11. For convenience of description, the electronic device is a mobile phone as an example.
Fig. 5 is an image processing method provided in an embodiment of the present application, which is applied to the electronic device mentioned in the foregoing embodiment of the present application, taking the electronic device as a mobile phone as an example, and the image processing method may include the following steps:
s501, displaying a shooting interface, wherein the shooting interface is used for displaying images collected by the camera.
The user can preview the image collected by the camera through the shooting interface, and then shooting operation can be performed when the image satisfied by the user is previewed. The shooting operation may be a shooting operation or a recording operation.
Illustratively, a shooting control can be included on the shooting interface. The shooting control is used for recording or shooting images collected by the camera. For example, the shooting interface in the shooting mode shown in fig. 1 (2) may be a shooting interface that displays an image captured by a camera in the finder frame 103, or a shooting interface that is displayed by a camera in other operation modes such as a video recording mode and a night view mode. The embodiment of the application does not limit the specific display form of the shooting interface.
The method for triggering and displaying the shooting interface includes a variety of ways, for example, the shooting interface may be displayed in response to an operation of starting a camera application, the shooting interface may be displayed in response to an operation of calling a camera by a third-party application, and the shooting interface may be displayed by triggering a mobile phone to display the shooting interface by voice. For example, reference may be made to the description related to (1) and (2) of fig. 1 regarding the display of the shooting interface. The embodiment of the application does not limit the manner of triggering the display of the shooting interface.
It should be noted that after step S501 is executed, if the mobile phone does not exit the shooting interface, the mobile phone is always in a state of collecting an image through the camera and displaying the collected image on the shooting interface, and the subsequent steps executed in fig. 5 do not affect the process of displaying the image collected by the camera by the shooting interface.
S502, entering the moon mode when the acquired image is recognized to include a moon subject.
Specifically, the mobile phone can intelligently identify the shooting subject from the image acquired by the camera and determine the current shooting scene. The shooting subject can be understood as the main person or object being photographed. When the acquired image includes a moon subject, the current scene is determined to be a moon scene, and the moon mode is entered so that the image including the moon subject is processed in the moon mode.
In some embodiments, before performing step S502, the method further includes: and opening an AI identification function. For example, refer to the description about the AI control in fig. 1, where when the AI recognition function is turned off, the mobile phone does not perform subject recognition on the captured image, and when the AI recognition function is turned on, the mobile phone starts to recognize the captured image.
In some embodiments, when the mobile phone enters the moon mode, a moon-mode control may be displayed on the shooting interface to prompt the user that the moon mode has been entered.
Specifically, there are many ways to identify the subject in the image, for example, the subject in the image may be identified through a neural network model, and the identification way is not limited in the embodiment of the present application.
S503, in response to the shooting operation, acquiring a first frame image and a second frame image, where the first frame image and the second frame image are acquired with different exposure durations and both include the moon, and the exposure duration of the first frame image is shorter than that of the second frame image.
The image contents of the first frame image and the second frame image are consistent, that is, the shooting subject and the shooting background included in the first frame image and the second frame image are consistent, but the exposure duration when the first frame image is acquired is different from the exposure duration when the second frame image is acquired, so that the image effects of the first frame image and the second frame image are different. In some embodiments, it may be understood that the exposure start time of the first frame image and the exposure start time of the second frame image coincide, but the exposure end time of the first frame image is earlier than that of the second frame image. Here, the first frame image may be understood as the short frame moon image mentioned in the foregoing embodiment of the present application, and the second frame image may be understood as the long frame moon image mentioned in the foregoing embodiment of the present application.
Images with shorter exposure times will be sharper when displaying image areas with stronger light (e.g., the moon area of the first frame image), while images with longer exposure times will be sharper when displaying image areas with weaker light (e.g., the background area in the second frame image).
In other embodiments, the first frame image and the second frame image may differ in exposure duration or in other types of shooting parameters. That is, step S503 may be understood as one specific implementation of obtaining images that include the moon and are acquired under a plurality of different shooting parameters, and the number of frames obtained in step S503 includes, but is not limited to, two; more than two frames may be obtained.
For the same shooting scene (that is, the same shooting subject and shooting background), the images acquired under different shooting parameters have different image effects. The better-rendered image areas under the different image effects can then be fused to form a new image with higher definition and better light rendering.
The shooting operation may be understood as an operation of triggering the camera application to acquire an image, process the image, and store the image in the gallery. For example, the shooting operation may be an operation of clicking the shooting control by the user in the shooting mode, or an operation of clicking the recording control by the user in the recording mode, and the shooting operation may be one operation or a plurality of operations. For example, reference may be made to the associated description of the user triggering the camera application to take a moon image in fig. 1. In other embodiments, the shooting operation may also be an operation that triggers shooting in a manner of voice of a user, inputting characters, and the like, and the specific form of the shooting operation is not limited in the embodiments of the present application.
It should be noted that, in the process of acquiring the first frame image and the second frame image, a series of processes of intelligently generating images, such as intelligent focusing and intelligent exposure parameter adjustment, may also be included, which is not limited in the embodiment of the present application.
S504, cropping (Crop) the moon region out of the first frame image.
Here, the moon region may be understood as the image region in which the moon subject is displayed. The brightness of the moon region is stronger than that of the background region (the regions other than the moon subject are collectively referred to as the background region). Since the embodiment of the present application mainly aims to obtain an image including a clear moon, the moon region needs to be cropped out separately and given dedicated high-definition processing. Because the exposure duration of the first frame image is shorter than that of the second frame image, the definition of the moon region in the first frame image is higher than that in the second frame image. The moon region is therefore cropped from the first frame image, in which the moon is sharper, for high-definition processing.
In some embodiments, step S504 may be performed by identifying the moon region of the first frame image and then cropping the identified region out to obtain the moon region of the first frame image. There are many ways to identify the moon region of the first frame image; for example, deep learning may be used, such as RetinaNet. That is, the first frame image is input into RetinaNet, the four coordinates of the moon in the first frame image are obtained, and the moon region is cropped from the first frame image according to the detected moon coordinates, thereby obtaining the moon region of the first frame image. It should be noted that, besides RetinaNet, target detection algorithms such as R-CNN and YOLO may also be used to determine the position of the moon region in the first frame image before cropping it out.
And S505, carrying out feature matching on the moon region of the first frame image and the 3D high-definition moon model, and matching to obtain a reference moon region.
The 3D high-definition moon model can be understood as a three-dimensional high-definition moon image. For example, as shown in fig. 6, fig. 6 is an example when a 3D high definition moon model adopts a 2D presentation. The 2D high-definition moon image under any angle can be observed in the 3D high-definition moon model.
In order to enable the moon region of the first frame image to have high definition after processing, the moon region close to the moon region of the first frame image and having extremely high definition may be selected first, and used as a reference for performing high definition processing on the moon region of the first frame image subsequently.
In this embodiment, since the 2D high-definition moon image at any angle can be viewed in the 3D high-definition moon model, an image area highly similar to the moon area of the first frame image in the 3D high-definition moon model can be found in a manner of performing feature matching between the moon area of the first frame image and the 3D high-definition moon model, and used as a reference moon area obtained by matching.
There are many ways to perform feature matching on the moon region of the first frame image and the 3D high-definition moon model. For example, the similarity between the moon region of the first frame image and each region in the 3D high-definition moon model may be directly calculated to find a region in the 3D high-definition moon model with the highest similarity to the moon region, which is used as the reference moon region obtained by matching.
Specifically, the feature of the moon region of the first frame image is represented in a vector form, the feature in the 3D high-definition moon model is also represented in a vector form, and then the cosine distance between the feature vector of the first frame image and the feature vector of the 3D high-definition moon model is obtained. The cosine distance is 1-cosine similarity, and its value range is [0, 2], that is, the cosine distance of two identical vectors is 0. When analyzing the similarity between two feature vectors, the cosine similarity is often used, and the value range is [ -1, 1 ]. The cosine distance can also be used to represent the similarity between two vectors, and the smaller the cosine distance, the higher the similarity between two vectors. The process of performing feature matching on the moon region of the first frame of image and each region in the 3D high-definition moon model can be understood as a process of calculating cosine distances between the moon region of the first frame of image and each region in the 3D high-definition moon model, and after the cosine distances between the moon region of the first frame of image and each region in the 3D high-definition moon model are calculated, determining a region in the 3D high-definition moon model with the smallest cosine distance as a reference moon region obtained by matching, that is, the region is considered to have the highest similarity with the moon region of the first frame of image.
For example, step S505 may also be performed as shown in fig. 7: the moon region of the first frame image (abbreviated as the moon crop) is divided into a number of small blocks (patches), yielding a plurality of moon crop patches (collectively referred to as the group A moon crop patches); the 3D high-definition moon model (also referred to as the moon 3D model) is likewise cut into a number of patches, yielding the group B moon crop patches. The group A and group B moon crop patches may be of the same size. For each patch in group A, a distance metric between that patch and each patch in group B is calculated, that is, the patch is matched against every patch in group B. Illustratively, the distance metric may be the cosine distance, and the patch with the smallest cosine distance is then selected from group B as the reference patch that matches the group A patch. After all the reference patches have been selected from group B, they are stitched together according to the coordinates (index) of each reference patch, finally forming the complete reference moon (that is, the reference moon region mentioned in step S505). The coordinates of a reference patch may be the same as those of the group A patch it matches. In other embodiments, the patch with the second smallest cosine distance may instead be selected as the reference patch used to form the reference moon; that is, more than one reference moon may be obtained in the embodiments of the present application. In forming the reference moon, each patch of the moon region of the first frame image is feature-matched separately, and the reference patches finally matched to all the patches together form the reference moon region. The similarity between the reference moon region and the moon region of the first frame image is therefore high, and the more patches the moon region of the first frame image is divided into, the more accurate (that is, the more similar) the final reference moon.
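The patch-level flow of fig. 7 can be sketched as follows. The patch size of 32, the use of non-overlapping tiles, and the representation of the 3D model as a list of rendered 2D views are assumptions for illustration; `similarity` is any block-similarity callable, for example the cosine-similarity helper sketched earlier.

```python
import numpy as np

def split_into_patches(image, patch):
    """Cut an H x W (x C) array into non-overlapping patch x patch tiles,
    remembering each tile's top-left coordinate."""
    h, w = image.shape[:2]
    return [((y, x), image[y:y + patch, x:x + patch])
            for y in range(0, h - patch + 1, patch)
            for x in range(0, w - patch + 1, patch)]

def build_reference_moon(moon_crop, model_views, similarity, patch=32):
    """Sketch of the fig. 7 flow: for each group-A patch of the captured moon,
    pick the most similar group-B patch cut from 2D views of the 3D model and
    stitch the winners back at the A-patch coordinates."""
    b_patches = [p for view in model_views for _, p in split_into_patches(view, patch)]
    reference = np.zeros_like(moon_crop)
    for (y, x), a in split_into_patches(moon_crop, patch):
        scores = [similarity(a, b) for b in b_patches]          # higher = more similar
        reference[y:y + patch, x:x + patch] = b_patches[int(np.argmax(scores))]
    return reference
```

Patches that do not fill the crop exactly are left untouched in this sketch; a practical implementation would handle the borders and could also keep several candidate patches per location.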
S506, inputting the moon region of the first frame image and the reference moon region into the high-definition processing model, which outputs the high-definition moon region.
The high-definition processing model can be obtained by training a neural network model through a plurality of low-definition moon samples, a reference moon sample corresponding to each low-definition moon sample and a target moon sample corresponding to each low-definition moon sample. The reference moon sample corresponding to the low-definition moon sample is obtained by performing feature matching on the low-definition moon sample and the 3D high-definition moon model, and the specific process may refer to the related description in step S505, which is not described herein again.
After the moon region of the first frame image and the reference moon region obtained in step S505 are input into the high-definition processing model, the model fuses the features of the reference moon region and outputs the high-definition moon region. Because the model incorporates the features of the high-definition reference moon region during processing, the output moon region is sharper.
For example, the high-definition processing model may be the high-definition processing model 801 shown in fig. 8, in which the input and output, the convolution layers, the deconvolution layers, and the fusion layers of the high-definition processing model 801 are distinguished by the colors shown in legend 802. Specifically, the high-definition processing model 801 may execute step S506 as follows: the moon region of the first frame image (hereinafter referred to as moon crop) and the reference moon region (hereinafter referred to as reference) are input into the high-definition processing model 801; the moon crop is processed by the 3-2-1-64 convolution kernel of the first convolution layer to obtain a feature vector, and this feature vector is then convolved in sequence by the 3-2-1-64 convolution kernel of the second convolution layer, the 3-2-1-96 convolution kernel of the third convolution layer, and the 3-1-1-64 convolution kernel of the fourth convolution layer.
Here, a 3-2-1-64 convolution kernel may be understood as a convolution with kernel size 3, stride 2, padding 1, and 64 output channels (i.e. 64 kernels of size 3 × 3); a 3-2-1-96 convolution kernel as kernel size 3, stride 2, padding 1, and 96 output channels; and a 3-1-1-64 convolution kernel as kernel size 3, stride 1, padding 1, and 64 output channels. Note that when the stride of a convolution layer is 2, the spatial size of the feature map is reduced to one half of the original size, and when the stride is 1, the size is unchanged.
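For example, the halving of the spatial size with a stride of 2 follows from the standard convolution output-size formula, which can be checked as follows (illustrative only):

```python
def conv_out_size(in_size: int, kernel: int = 3, stride: int = 2, padding: int = 1) -> int:
    # Standard convolution output-size formula: floor((H + 2p - k) / s) + 1.
    return (in_size + 2 * padding - kernel) // stride + 1

print(conv_out_size(512, kernel=3, stride=2, padding=1))  # 256: stride 2 halves the feature map
print(conv_out_size(256, kernel=3, stride=1, padding=1))  # 256: stride 1 keeps the size unchanged
```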
The reference is processed in a similar manner to the moon crop: it is passed in sequence through the 3-2-1-64 convolution kernel of the first convolution layer, the 3-2-1-64 convolution kernel of the second convolution layer and the 3-2-1-96 convolution kernel of the third convolution layer, so that these convolution kernels extract the reference feature (referred to as ref-feature for short). In each fusion layer (fusion) of the high-definition processing model, the feature vectors fed into that fusion layer (i.e. the arrows pointing to the fusion layer in the figure) are concatenated. For example, the fusion layer 8011 in 801 concatenates the feature vector processed by the 3-2-1-96 convolution kernel, the feature vector processed by the 3-1-1-64 convolution kernel, and the ref-feature. The other fusion layers in the high-definition processing model 801 are processed similarly and are not described herein again.
Continuing to refer to the model 801 shown in fig. 8, 801 further includes 3-2-1-128 deconvolution kernels in its deconvolution layers, which perform deconvolution processing on the feature vectors output by the fusion layers. For example, the feature vector output by the fusion layer 8011 in fig. 8 is processed by a 3-2-1-128 deconvolution kernel and then passed to the next fusion layer.
After the input moon crop and reference are processed by the multiple convolution layers, fusion layers and deconvolution layers, the high-definition processing model 801 shown in fig. 8 fuses the features of the moon crop and the reference and further sharpens them, and the feature vector finally output (output) is the feature vector of the high-definition moon region. The high-definition moon region is displayed according to this feature vector; its definition is higher than that of the moon region of the first frame image and is close to that of a moon photographed with a single-lens reflex camera.
Here, a 3-2-1-128 deconvolution kernel is understood as a deconvolution (transposed convolution) with kernel size 3, stride 2, padding 1, and 128 output channels. A deconvolution layer is similar to a convolution layer, except that it enlarges the spatial size of the feature map: when the stride is 2, the size is enlarged to twice the original.
For example, the high-definition processing model 801 may be as shown in fig. 9, and its processing procedure is as follows: the moon crop and the reference, each with 512 × 512 × 3 features, are input; the 3-2-1-64 convolution kernel of the first layer convolves them into 256 × 256 × 64 features; similarly, the 3-2-1-64 convolution kernel of the second layer produces 128 × 128 × 64 features, the 3-2-1-96 convolution kernel of the third convolution layer produces 64 × 64 × 96 features, and the 3-1-1-64 convolution kernel of the fourth layer produces 64 × 64 × 64 features. At each fusion layer, the features fed into that layer are concatenated together; for example, the fusion layer 8011 concatenates the 64 × 64 × 96, 64 × 64 × 64 and 64 × 64 × 96 features into 64 × 64 × 256 features. The other fusion layers are processed similarly and can be read directly from fig. 9, which is not described herein again. The deconvolution layers are processed similarly to the convolution layers, except that the spatial size of the features is enlarged; for example, the 64 × 64 × 256 features from the fusion layer 8011 are deconvolved into 128 × 128 × 128 features. The high-definition processing model 801 finally outputs the 512 × 512 × 3 features of the high-definition moon region. Therefore, the size of the image region remains unchanged after processing by the high-definition processing model.
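Illustratively, an encoder-fusion-decoder structure of this kind may be sketched as follows; PyTorch, the exact layer wiring and the single fusion stage are assumptions made for illustration, and the sketch reproduces only the building blocks of model 801 rather than its exact topology in fig. 8 and fig. 9:

```python
import torch
import torch.nn as nn

def conv_block(cin: int, cout: int, stride: int) -> nn.Sequential:
    # "3-s-1-c" notation of the text: kernel size 3, the given stride, padding 1, c output channels.
    return nn.Sequential(nn.Conv2d(cin, cout, kernel_size=3, stride=stride, padding=1), nn.ReLU())

class HDMoonSketch(nn.Module):
    # Simplified encoder-fusion-decoder; it mirrors the building blocks of model 801, not its exact topology.
    def __init__(self):
        super().__init__()
        # Moon-crop branch: 3-2-1-64, 3-2-1-64, 3-2-1-96, 3-1-1-64.
        self.m1, self.m2 = conv_block(3, 64, 2), conv_block(64, 64, 2)
        self.m3, self.m4 = conv_block(64, 96, 2), conv_block(96, 64, 1)
        # Reference branch: 3-2-1-64, 3-2-1-64, 3-2-1-96 -> ref-feature.
        self.r1, self.r2, self.r3 = conv_block(3, 64, 2), conv_block(64, 64, 2), conv_block(64, 96, 2)
        # Fusion (channel concatenation) followed by stride-2 deconvolutions back to the input resolution.
        self.up1 = nn.ConvTranspose2d(96 + 64 + 96, 128, 3, stride=2, padding=1, output_padding=1)
        self.up2 = nn.ConvTranspose2d(128, 64, 3, stride=2, padding=1, output_padding=1)
        self.up3 = nn.ConvTranspose2d(64, 3, 3, stride=2, padding=1, output_padding=1)

    def forward(self, moon_crop: torch.Tensor, reference: torch.Tensor) -> torch.Tensor:
        m3_out = self.m3(self.m2(self.m1(moon_crop)))           # 64 x 64, 96 channels
        m4_out = self.m4(m3_out)                                 # 64 x 64, 64 channels
        ref_feature = self.r3(self.r2(self.r1(reference)))       # 64 x 64, 96 channels
        fused = torch.cat([m3_out, m4_out, ref_feature], dim=1)  # fusion layer: concatenate along channels
        return self.up3(self.up2(self.up1(fused)))               # deconvolve back to 512 x 512 x 3

x = torch.randn(1, 3, 512, 512)
print(HDMoonSketch()(x, x).shape)  # torch.Size([1, 3, 512, 512]): spatial size is preserved
```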
In some embodiments, the training process of the high-definition processing model may be: constructing a data sample set. The data sample set includes: a plurality of low-definition moon samples, a reference moon sample corresponding to each low-definition moon sample, and a target moon sample corresponding to each low-definition moon sample. The definition of the target moon sample is higher than that of the low-definition moon sample, and the definition of the reference moon sample is also higher than that of the low-definition moon sample. The low-definition moon sample, the reference moon sample and the target moon sample are all images of moon regions shot in the same shooting scene. The target moon sample may be understood as the output that the high-definition processing model is ultimately expected to produce.
For each low-definition moon sample, the low-definition moon sample and its corresponding reference moon sample are input into the neural network model, and the neural network model obtains and outputs the output image data corresponding to that low-definition moon sample. The parameters of the neural network model are then adjusted according to the error between the output image data corresponding to each low-definition moon sample and the target moon sample corresponding to that low-definition moon sample, until the error between the output image data produced by the adjusted neural network model for each low-definition moon sample and the corresponding target moon sample satisfies a preset convergence condition, at which point the adjusted neural network model is determined as the high-definition processing model. It should be noted that the high-definition processing model may be obtained by offline training or online training, and if the mobile phone includes an NPU, the high-definition processing model may be run on the NPU.
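Illustratively, the training procedure described above may be sketched as follows; the L1 loss, the Adam optimizer and the form of the convergence condition are assumptions, since the present application does not specify them:

```python
import torch
import torch.nn as nn

def train_hd_model(model: nn.Module, samples, epochs: int = 100, target_loss: float = 1e-3) -> nn.Module:
    # samples is assumed to be a list of (low_def_moon, reference_moon, target_moon) tensor triples.
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    criterion = nn.L1Loss()  # pixel-wise error w.r.t. the target moon sample (one possible choice)
    for _ in range(epochs):
        total = 0.0
        for low_def, reference, target in samples:
            output = model(low_def, reference)   # fuse the reference features into the low-definition sample
            loss = criterion(output, target)     # error between the output image data and the target sample
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()                     # adjust the parameters of the neural network model
            total += loss.item()
        if total / max(len(samples), 1) < target_loss:  # preset convergence condition (assumed form)
            break
    return model
```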
Through steps S504 to S506, the features of the high-definition reference moon region are fused into the moon region of the first frame image, and a higher-definition moon region is output by the high-definition processing model, thereby completing the high-definition processing of the moon region.
S507, generating an HDR image according to the first frame image and the second frame image.
Specifically, the first frame image and the second frame image are fused by the HDR technique to obtain an HDR image with higher image quality. Since the moon region is bright while the background at night is dark, a clear moon needs to be shot with a short exposure and the background needs to be shot with a long exposure. However, a long exposure can overexpose the brighter areas in the background and lose detail. To avoid overexposure of the background area in the second frame image, the first frame image is used to supplement the details of the second frame image through the HDR technique, so that an HDR image with higher image quality is generated by fusion.
For example, the HDR image may be generated by multiplying the pixels of the first frame image (the short frame) by the corresponding sensitivity (gain) and then fusing them with the pixels of the long frame image, thereby supplementing the details of the overexposed regions of the long frame.
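Illustratively, this kind of exposure fusion may be sketched as follows; the gain value, the overexposure threshold and the linear blending rule are assumptions for illustration only:

```python
import numpy as np

def fuse_hdr(short_frame: np.ndarray, long_frame: np.ndarray, gain: float = 4.0,
             overexposed_threshold: float = 0.95) -> np.ndarray:
    # Both frames are float arrays normalized to [0, 1]; the gain compensates the short exposure.
    boosted_short = np.clip(short_frame * gain, 0.0, 1.0)
    # Weight toward the short frame wherever the long frame is close to saturation (overexposed).
    luminance = long_frame.mean(axis=-1, keepdims=True)
    w_short = np.clip((luminance - overexposed_threshold) / (1.0 - overexposed_threshold), 0.0, 1.0)
    return w_short * boosted_short + (1.0 - w_short) * long_frame
```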
In some embodiments, step S507 may be performed by an HDR image Sensor (Sensor). For the process of generating the HDR image, reference may be made to contents related to the HDR technology in an operating system such as Android, which is not described herein again.
S508, cropping out the HDR background region from the HDR image.
Here, the HDR background region may be understood as the background portion of the HDR image. After the processing in step S507, the background region of the HDR image no longer suffers from overexposure and its image quality is high, so the background region has been beautified. The background region of the HDR image can therefore be extracted, and the high-definition moon region obtained in step S506 can be fused with the HDR background region to obtain a high-definition, beautified image.
Specifically, step S508 may be executed by removing the moon region from the HDR image, thereby obtaining the HDR background region. For example, the moon region in the HDR image may be identified first; for the specific identification process, reference may be made to the related content in step S504, which is not described herein again. After the moon region is identified, it is cropped out, and the remaining part is the HDR background region.
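Illustratively, this cropping step may be sketched as follows, under the assumption that the identified moon region is available as a binary mask (the identification itself follows step S504 and is not reproduced here):

```python
import numpy as np

def crop_background(hdr_image: np.ndarray, moon_mask: np.ndarray) -> np.ndarray:
    # moon_mask is a boolean H x W array that is True inside the identified moon region.
    background = hdr_image.copy()
    background[moon_mask] = 0  # remove the moon pixels; the remaining pixels form the HDR background region
    return background
```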
In other embodiments, the moon region in the second frame image may be removed first to obtain the background region of the second frame image, and then step S507 is performed, that is, the HDR background region is generated by using the background regions of the first frame image and the second frame image. That is, using the first frame image and the second frame image, there are many ways to obtain the HDR background region, including but not limited to the contents of the embodiments of the present application.
It should be noted that the processes of generating the HDR background region from step S507 to step S508 and the processes of generating the high definition moon region from step S504 to step S506 are not mutually affected, and may be executed in parallel or in sequence, and step S507 and step S504 only need to be executed after step S503 is completed, which is not limited in the present application.
S509, fusing the high-definition moon region and the HDR background region to obtain a third frame image.
Wherein the third frame image may be understood as an image that is finally stored in a gallery (or album). The shooting scene of the third frame image is consistent with the first frame image and the second frame image, but the third frame image fuses the high-definition moon region after high-definition processing and the HDR background region after HDR processing, so the definition of the third frame image is higher than that of the first frame image and the second frame image.
Specifically, after the mobile phone enters the telescopic mode, the user may tap the shooting control 106 shown in (3) of fig. 1 to shoot. In response to the user's operation of tapping the shooting control 106, the mobile phone performs steps S503 to S509, saves the third frame image obtained in step S509, and displays it in the album icon, for example the album icon 107 shown in (4) of fig. 1. The user can subsequently tap the album icon 107, and the mobile phone responds to this operation by displaying the detail interface of the third frame image. Alternatively, the user can find the album storing the third frame image through the gallery application and enter that album to view the detail interface of the third frame image. After the third frame image is saved, there are various ways to display its detail interface, which are not described in detail in the embodiments of the present application.
Illustratively, after the mobile phone performs steps S501 to S509, the user clicks the album icon on the shooting interface, and enters a detail interface of the third frame image shown in fig. 10, and the detail interface of the third frame image displays the third frame image 1001. As can be seen from fig. 10, since the third frame image incorporates the high-definition feature of the reference moon region, the contour of the moon is clearer in the third frame image 1001 processed by the method shown in fig. 5 than in the image 108 shown in (5) of fig. 1.
It should be understood that the final saved third frame image is the image processed by the method shown in fig. 5, and the above processes are all completed inside the mobile phone until the final third frame image is saved in the album.
Optionally, the above-described processing takes a certain amount of time. For example, the processing may take 2 seconds: after the user clicks the shooting control to shoot, the mobile phone may first save an unprocessed image (for example, the first frame image or the second frame image), and when the 2 seconds of processing are finished, the mobile phone automatically replaces the saved image with the processed, clear third frame image, so that the clear and bright third frame image is finally stored in the album.
In some embodiments, to avoid a visible seam at the edge where the two regions meet when the high-definition moon region and the HDR background region are directly stitched together, the embodiments of the present application may perform fusion processing on the edge region between the high-definition moon region and the HDR background region. Illustratively, the fusion processing may assign different weights according to the distance to the dividing line, thereby blending the edge region between the high-definition moon region and the HDR background region.
In some embodiments, the edge region may be understood as the region where the high-definition moon region and the HDR background region overlap, and the dividing line lies within the edge region. For example, the region within 32 pixels of the dividing line may be selected as the edge region. For each pixel in the edge region, its value is computed by fusing the pixel value from the high-definition moon region with the pixel value from the HDR background region: the closer the pixel is to the high-definition moon region, the larger the weight given to the pixel value from the high-definition moon region; the closer it is to the HDR background region, the larger the weight given to the pixel value from the HDR background region. It should be noted that there are many algorithms for fusing the high-definition moon region and the HDR background region, including but not limited to those proposed in the embodiments of the present application.
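Illustratively, the distance-weighted blending of the edge region may be sketched as follows; the signed-distance representation and the linear weighting are assumptions, while the 32-pixel band comes from the example above:

```python
import numpy as np

def blend_edge(moon: np.ndarray, background: np.ndarray, signed_dist: np.ndarray,
               band: int = 32) -> np.ndarray:
    # signed_dist: per-pixel signed distance to the dividing line (positive toward the moon side).
    # Within +/- band pixels of the line, the weights vary linearly with the distance to the line.
    w_moon = np.clip((signed_dist + band) / (2.0 * band), 0.0, 1.0)[..., None]
    return w_moon * moon + (1.0 - w_moon) * background
```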
For example, the process of performing steps S504 to S509 may be as shown in fig. 11: after a first frame image (hereinafter referred to as the short frame) and a second frame image (hereinafter referred to as the long frame) are acquired, the moon region of the short frame is cropped to obtain the moon region of the first frame image (hereinafter referred to as moon crop); the moon crop is then feature-matched against the moon 3D model (also referred to as the 3D high-definition moon model), and a reference moon region (hereinafter referred to as reference moon) is obtained from the moon 3D model by matching. The reference moon and the moon crop are input into a neural network (for example, the high-definition processing model mentioned above) to fuse their features and perform high-definition processing, and the neural network obtains and outputs the fused high-definition moon (which may also be called the high-definition moon region). On the other hand, an HDR image is generated using the long frame and the short frame, completing the HDR processing of the background region. The HDR background region is then cropped out of the HDR image, and finally the cropped HDR background region and the fused high-definition moon are fused to obtain the result (result), namely the third frame image.
It should be noted that the image processing method provided in the embodiments of the present application is applicable not only to scenes in which a moon subject is photographed, but also to other fixed subjects with strong and unique prior knowledge, such as a starry-sky subject, a Mona Lisa portrait subject, or a Louvre statue subject.
It should be further noted that, the processing operation on the image in the embodiment of the present application may be understood as a processing process on image data, and after the processing of the embodiment shown in fig. 5 is performed on the image data, the finally obtained image data of the third frame image may be controlled by the GPU to be displayed on the display screen, so that the user may see the third frame image with clear and high-definition moon outline from the display screen.
The present embodiment also provides a computer-readable storage medium including instructions that, when run on an electronic device, cause the electronic device to execute the steps of any image processing method set forth in the embodiments of the present application.
The present embodiment also provides a computer program product containing instructions, which when run on an electronic device, causes the electronic device to execute the relevant method steps of any image processing method as proposed in the embodiments of the present application, so as to implement the methods in the embodiments described above.
The present embodiment also provides a control device, which includes a processor and a memory, where the memory is used to store a computer program code, and the computer program code includes computer instructions, and when the processor executes the computer instructions, the control device executes any one of the steps of the processing method of image data as proposed in the embodiments of the present application to implement the method in the above embodiments. The control device may be an integrated circuit IC or a system on chip SOC. The integrated circuit may be a general-purpose integrated circuit, a field programmable gate array FPGA, or an application specific integrated circuit ASIC.
Through the above description of the embodiments, it is clear to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional modules is merely used as an example, and in practical applications, the above function distribution may be completed by different functional modules according to needs, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the above described functions. For the specific working processes of the system, the apparatus and the unit described above, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not described here again.
In the several embodiments provided in this embodiment, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, each functional unit in each embodiment of the present embodiment may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present embodiment essentially or partially contributes to the prior art, or all or part of the technical solution may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) or a processor to execute all or part of the steps of the method described in the embodiments. And the aforementioned storage medium includes: various media that can store program code, such as flash memory, removable hard drive, read-only memory, random-access memory, magnetic or optical disk, etc.
The above description is only an embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions within the technical scope of the present disclosure should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (13)

1. An image processing method applied to an electronic device, the image processing method comprising:
acquiring a first image and a second image in response to a photographing operation; the shooting scenes of the first image and the second image are the same; the first image and the second image each comprise a first subject; the exposure time of the first image is shorter than that of the second image;
obtaining a first image area from the first image; the first image area is an image area of the first subject in the first image;
performing feature matching on the first image area and the 3D model of the first subject, and obtaining a reference image area from the 3D model of the first subject by matching; the image quality of the 3D model of the first subject is better than that of the first image area;
inputting the first image area and the reference image area into a high-definition processing model, and outputting the high-definition processing model to obtain a second image area; the second image area is an image area which fuses the characteristics of the first image area and the reference image area; the high-definition processing model is obtained by training a neural network model;
fusing the second image area and the third image area to obtain a third image; the third image area is an image area of a shooting background obtained according to the first image and/or the second image.
2. The method according to claim 1, wherein the high-definition processing model is obtained by training a neural network model through a plurality of sample images of a first subject, a reference image corresponding to each sample image and a target image corresponding to each sample image; and the reference image corresponding to the sample image is obtained by performing feature matching on the sample image and the 3D model of the first subject.
3. The method according to claim 1 or 2, wherein before the feature matching of the first image region with the 3D model of the first subject and the matching of the reference image region from the 3D model of the first subject, further comprising:
dividing the first image area into a plurality of blocks to obtain a plurality of first image blocks;
dividing the 3D model of the first subject into a plurality of blocks to obtain a plurality of second image blocks;
the performing feature matching on the first image region and the 3D model of the first subject, and obtaining a reference image region from the 3D model of the first subject by matching includes:
aiming at each first image block, respectively performing feature matching on the first image block and each second image block to obtain a reference image block; the reference image block is a second image block matched with the features of the first image block;
and combining all the reference image blocks to obtain a reference image area.
4. The method according to claim 3, wherein said performing feature matching on each of the first image blocks and each of the second image blocks to obtain a reference image block comprises:
for each first image block, respectively calculating the similarity between the first image block and each second image block;
and determining the second image block with the highest similarity as the reference image block.
5. The method according to claim 1, wherein the generating of the third image area comprises:
generating a High Dynamic Range (HDR) image according to the first image and the second image;
and cutting out a background area in the HDR image to obtain the third image area.
6. The method of claim 5, wherein the cropping out the background region in the HDR image to obtain the third image region comprises:
identifying the HDR image;
identifying an image area where the first subject is located from the HDR image;
and removing the image area where the first subject is located from the HDR image to obtain a third image area.
7. The method of claim 1, wherein deriving the first image region from the first image comprises:
identifying the first image;
identifying an image area where the first subject is located from the first image;
and cutting out an image area where the first main body is located from the first image to obtain a first image area.
8. The method of claim 1, wherein prior to acquiring the first image and the second image in response to the capturing operation, further comprising:
displaying a shooting interface; the shooting interface is used for displaying images collected by a camera of the electronic equipment;
the first mode is entered when it is recognized that the first subject is included in the captured image.
9. The method of claim 1, wherein the method for constructing the high-definition processing model comprises:
constructing a data sample set; the set of data samples includes: a plurality of sample images of a first subject, a reference image corresponding to each of the sample images, and a target image corresponding to each of the sample images; the reference image corresponding to the sample image is obtained by performing feature matching on the sample image and the 3D model of the first subject;
for each sample image, inputting the sample image and a reference image corresponding to the sample image into a neural network model, and obtaining and outputting an output image corresponding to each sample image by the neural network model;
and adjusting parameters in the neural network model according to the error between the output image corresponding to each sample image output by the neural network model and the target image corresponding to the sample image until the error between the output image of each sample image output by the adjusted neural network model and the target image corresponding to the sample image meets a preset convergence condition, and determining the adjusted neural network model as a high-definition processing model.
10. The method according to claim 1, wherein after fusing the second image region and the third image region to obtain a third image, further comprising:
saving the third image to a gallery application.
11. An electronic device, comprising: one or more processors and memory;
the memory coupled with the one or more processors for storing computer program code comprising computer instructions which, when executed by the one or more processors, cause the electronic device to perform the image processing method of any of claims 1-10.
12. The electronic device of claim 11, further comprising: a neural network processor NPU, the NPU to run the high definition processing model.
13. A computer storage medium comprising computer instructions which, when run on an electronic device, cause a processor in the electronic device to perform the image processing method of any one of claims 1 to 10.
CN202210382866.8A 2022-04-12 2022-04-12 Image processing method, electronic device, and computer storage medium Active CN114926351B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210382866.8A CN114926351B (en) 2022-04-12 2022-04-12 Image processing method, electronic device, and computer storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210382866.8A CN114926351B (en) 2022-04-12 2022-04-12 Image processing method, electronic device, and computer storage medium

Publications (2)

Publication Number Publication Date
CN114926351A true CN114926351A (en) 2022-08-19
CN114926351B CN114926351B (en) 2023-06-23

Family

ID=82806735

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210382866.8A Active CN114926351B (en) 2022-04-12 2022-04-12 Image processing method, electronic device, and computer storage medium

Country Status (1)

Country Link
CN (1) CN114926351B (en)

Patent Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002269546A (en) * 2001-03-14 2002-09-20 Atr Onsei Gengo Tsushin Kenkyusho:Kk Automatic face-tracking system and automatic face- tracking method
CN101610411A (en) * 2009-07-16 2009-12-23 中国科学技术大学 A kind of method and system of video sequence mixed encoding and decoding
US20200184278A1 (en) * 2014-03-18 2020-06-11 Z Advanced Computing, Inc. System and Method for Extremely Efficient Image and Pattern Recognition and Artificial Intelligence Platform
CN105872471A (en) * 2016-04-26 2016-08-17 张锡华 Method for technically obtaining moon image
KR101794731B1 (en) * 2016-11-10 2017-11-08 한국과학기술연구원 Method and device for deforming a template model to create animation of 3D character from a 2D character image
CN106851115A (en) * 2017-03-31 2017-06-13 联想(北京)有限公司 A kind of image processing method and device
CN107393000A (en) * 2017-08-24 2017-11-24 广东欧珀移动通信有限公司 Image processing method, device, server and computer-readable recording medium
CN109767485A (en) * 2019-01-15 2019-05-17 三星电子(中国)研发中心 Image processing method and device
CN109951633A (en) * 2019-02-18 2019-06-28 华为技术有限公司 A kind of method and electronic equipment shooting the moon
CN110166707A (en) * 2019-06-13 2019-08-23 Oppo广东移动通信有限公司 Image processing method, device, electronic equipment and storage medium
CN112532856A (en) * 2019-09-17 2021-03-19 中兴通讯股份有限公司 Shooting method, device and system
CN112712470A (en) * 2019-10-25 2021-04-27 华为技术有限公司 Image enhancement method and device
CN110913131A (en) * 2019-11-21 2020-03-24 维沃移动通信有限公司 Moon shooting method and electronic equipment
CN111784697A (en) * 2020-06-30 2020-10-16 北京百度网讯科技有限公司 Image processing method, image processing device, electronic equipment and storage medium
CN113347364A (en) * 2021-06-21 2021-09-03 努比亚技术有限公司 Moon image shooting method and device and computer readable storage medium
CN114255263A (en) * 2021-12-24 2022-03-29 中国科学院光电技术研究所 Self-adaptive spatial dim-and-weak star recognition method based on background recognition

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHAO Gang; GUO Xiaokang; LIU Dezheng; WANG Zhongren: "Fast recognition and localization of CAD models of random workpieces in point cloud scenes" *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116051391A (en) * 2022-08-27 2023-05-02 荣耀终端有限公司 Image processing method and electronic equipment
CN116051391B (en) * 2022-08-27 2023-09-22 荣耀终端有限公司 Image processing method and electronic equipment
CN116152123A (en) * 2023-04-21 2023-05-23 荣耀终端有限公司 Image processing method, electronic device, and readable storage medium
CN116152123B (en) * 2023-04-21 2023-09-19 荣耀终端有限公司 Image processing method, electronic device, and readable storage medium

Also Published As

Publication number Publication date
CN114926351B (en) 2023-06-23

Similar Documents

Publication Publication Date Title
CN113592887A (en) Video shooting method, electronic device and computer-readable storage medium
US20220417416A1 (en) Photographing method in telephoto scenario and mobile terminal
CN116324878A (en) Segmentation for image effects
CN114926351A (en) Image processing method, electronic device, and computer storage medium
CN113938602B (en) Image processing method, electronic device, chip and readable storage medium
US20230188830A1 (en) Image Color Retention Method and Device
WO2024041394A1 (en) Photographing method and related apparatus
CN115689963A (en) Image processing method and electronic equipment
CN112866557A (en) Composition recommendation method and electronic device
CN108259767B (en) Image processing method, image processing device, storage medium and electronic equipment
CN116916151B (en) Shooting method, electronic device and storage medium
US20230353864A1 (en) Photographing method and apparatus for intelligent framing recommendation
CN114640798B (en) Image processing method, electronic device, and computer storage medium
CN108495038B (en) Image processing method, image processing device, storage medium and electronic equipment
CN113891008B (en) Exposure intensity adjusting method and related equipment
CN115225756A (en) Method for determining target object, shooting method and device
CN114693538A (en) Image processing method and device
CN116363017B (en) Image processing method and device
CN114697525B (en) Method for determining tracking target and electronic equipment
CN115631098B (en) Antireflection method and device
CN116708996B (en) Photographing method, image optimization model training method and electronic equipment
CN116757963B (en) Image processing method, electronic device, chip system and readable storage medium
WO2024088074A1 (en) Method for photographing moon and electronic device
CN114245011B (en) Image processing method, user interface and electronic equipment
CN117689545A (en) Image processing method, electronic device, and computer-readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant