CN117689545A - Image processing method, electronic device, and computer-readable storage medium


Info

Publication number: CN117689545A
Application number: CN202410148313.5A
Authority: CN (China)
Prior art keywords: image, model, feature vector, parameters, inputting
Legal status: Pending (the legal status is an assumption and is not a legal conclusion)
Other languages: Chinese (zh)
Inventor: 韩新杰
Current assignee: Honor Device Co Ltd
Original assignee: Honor Device Co Ltd
Application filed by: Honor Device Co Ltd
Priority to: CN202410148313.5A

Abstract

The present disclosure relates to the field of computer technology, and in particular, to an image processing method, an electronic device, and a computer-readable storage medium. In order to improve the highlight effect of a photographed image at a target viewing angle generated by a neural radiance field (NeRF) model, multiple groups of polarized images may be used as training data for the NeRF model, so that, over multiple training iterations, the NeRF model learns the different highlight effects corresponding to the different polarization parameters of the polarized images. Then, when the trained NeRF model is used to adjust the viewing angle of a photographed image, the photographed image and a target polarization parameter may be input into the NeRF model, so that the finally generated photographed image at the target viewing angle exhibits the highlight effect corresponding to the target polarization parameter. In this way, the highlight effect of the photographed image at the target viewing angle generated based on the NeRF model is effectively improved, and the image texture of the photographed image at the target viewing angle is improved.

Description

Image processing method, electronic device, and computer-readable storage medium
Technical Field
The present invention relates to the field of computer technology, and in particular, to an image processing method, an electronic device, and a computer readable storage medium.
Background
The neural radiance field (NeRF) is a rapidly developing deep learning model in computer vision that is widely used in novel view synthesis tasks. For example, the spatial position coordinates of each pixel point of a captured image, together with direction parameters characterizing the target viewing angle, may be input into a trained NeRF model. The NeRF model then predicts the pixel color of the pixel point corresponding to each spatial position coordinate at the target viewing angle, and thereby generates a captured image at the target viewing angle.
It can be understood that the NeRF model predicts the pixel color of each spatial position coordinate in the captured image at the target viewing angle based only on that spatial position coordinate and the direction parameter; that is, the NeRF model predicts the pixel color of each pixel point independently and does not consider the spatial relationship between a pixel point and its neighboring pixel points. It will be appreciated that the spatial relationship may include edge details, texture details, highlight details, and the like, between pixel points. Because the NeRF model does not consider the spatial relationship between pixel points, the texture of the generated captured image at the target viewing angle may be poor. For example, the captured image at the target viewing angle may lack highlight details and the like.
Specifically, referring to the schematic effect diagram shown in fig. 1, in which the mobile phone 100 generates a photographed image at a target viewing angle based on the NeRF model, the mobile phone 100 generates a photographed image 100a in response to a photographing operation of the user, and then, according to a target viewing angle input by the user, generates a photographed image 100b of the photographed image 100a at the target viewing angle based on the NeRF model. Since the NeRF model does not consider the spatial relationship between pixel points, the photographed image 100b at the target viewing angle has defects such as a lack of highlight details.
Disclosure of Invention
The application provides an image processing method, an electronic device, and a computer-readable storage medium, which can improve the highlight details of the photographed image at the target viewing angle generated by the NeRF model and thereby improve the image texture.
In a first aspect, the present application provides an image processing method applied to an electronic device, the method including: acquiring a first image; and inputting the first image and a first polarization parameter into a first image processing model to obtain a second image, where the second image includes first highlight information corresponding to the first polarization parameter.
Here, the first image may be the photographed image hereinafter, the first polarization parameter may be the target polarization parameter hereinafter, and the second image may be the photographed image at the target viewing angle hereinafter: the low-resolution photographed image at the target viewing angle with highlight details in embodiment 1, or the low-resolution photographed image at the original viewing angle with highlight details in embodiment 2.
The first highlight information may be the highlight effect corresponding to the target polarization parameter hereinafter. In this manner, highlight details can be added to the photographed image at the target viewing angle, improving the image texture.
In a possible implementation of the first aspect described above, the first image processing model comprises a NeRF model.
Here, the input data of the NeRF model includes the first image and the first polarization parameter. Accordingly, the image processing method provided in this application can add, to the first image, highlight details corresponding to the first polarization parameter through the NeRF model, so as to improve the image texture.
In one possible implementation of the first aspect, the direction parameters of the first image include: a first direction parameter characterizing a viewing angle of the first image, or a second direction parameter, wherein the second direction parameter characterizes a viewing angle different from the viewing angle of the first image.
Here, the input data of the NeRF model may also include direction parameters. For example, when the input direction parameter is the first direction parameter representing the shooting angle of view of the first image, as in the application scenario shown in embodiment 2 below, the second image generated based on the NeRF model provided in this application may be an image with highlight details whose viewing angle is the same as the shooting angle of view of the first image, for example, the low-resolution photographed image at the original viewing angle with highlight details in embodiment 2.
Likewise, when the input direction parameter characterizes a shooting angle of view different from that of the first image, as in the application scenario shown in embodiment 1 below, that is, when the direction parameter input into the NeRF model is the aforementioned second direction parameter characterizing the target viewing angle (for example, the target direction parameter hereinafter), the second image generated based on the NeRF model provided in this application may be an image with highlight details whose viewing angle differs from the shooting angle of view of the first image, for example, the photographed image at the target viewing angle with highlight details generated based on the process described below in fig. 2.
The first image processing model may also be a signed distance function network (SDF-Net). Specifically, for the SDF-Net model, the polarization parameter may be spliced into the input data of the SDF-Net model, so that the SDF-Net model can also add, based on the input first polarization parameter, the highlight effect corresponding to the first polarization parameter to the generated second image. The training and application processes of the SDF-Net model are substantially the same as the training and application processes of the NeRF model described below, and the effects can be the same, so the training and application processes of the SDF-Net model are not repeated here.
In one possible implementation of the first aspect, the image parameters of the first image at least include a position parameter of each pixel of the first image and a direction parameter of the first image, and the inputting the first image and the first polarization parameter to the first image processing model to obtain the second image includes: inputting image parameters of a first image and first polarization parameters into a first image processing model to obtain pixel color parameters of each pixel point; and generating a second image according to the pixel color parameters of each pixel point.
Here, the position parameters of the respective pixels of the first image may be the position parameters (x, y, z) hereinafter, the direction parameter of the first image may be the target direction parameter (θ, Φ) hereinafter, and the image parameters of the first image, together with the first polarization parameter, input into the first image processing model may be the six-dimensional data (x, y, z, θ, Φ, p) hereinafter. The pixel color parameter of each pixel point output by the first image processing model may be the pixel color obtained based on the NeRF model hereinafter.
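For illustration of this step, the following is a minimal PyTorch-style sketch of querying the first image processing model with the six-dimensional data of each pixel point and assembling the second image from the returned pixel colors. The function and variable names, the tensor shapes, and the assumption that the model maps (x, y, z, θ, Φ, p) directly to an RGB color are illustrative only; the patent does not specify a concrete implementation.

    import torch

    def render_second_image(model, positions, direction, polarization, height, width):
        # positions: (H*W, 3) spatial position coordinates (x, y, z) of the pixels' spatial points
        # direction: (2,) target direction parameters (theta, phi)
        # polarization: scalar target polarization parameter p
        n = positions.shape[0]
        dirs = direction.unsqueeze(0).expand(n, -1)            # repeat (theta, phi) for every pixel
        pol = torch.full((n, 1), float(polarization))          # repeat p for every pixel
        six_d = torch.cat([positions, dirs, pol], dim=-1)      # (H*W, 6): (x, y, z, theta, phi, p)
        with torch.no_grad():
            colors = model(six_d)                              # (H*W, 3) pixel color parameters
        return colors.reshape(height, width, 3)                # assemble the second image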
In a possible implementation of the first aspect, after obtaining the second image, the method further includes: inputting a second image, a first polarization parameter, a first reference image and a second reference image into the second image processing model to obtain a third image, wherein the first reference image comprises a high-resolution polarization image which is the same as a shooting object and/or a shooting scene of the first image, and the image parameter of the first reference image at least comprises a position parameter of each pixel point of the first reference image, a direction parameter of the first reference image and a polarization parameter; the second reference image is a low-resolution polarized image obtained by inputting image parameters of the first reference image into the first image processing model.
Here, the second image may be the low-resolution photographed image at the target viewing angle with highlight details generated based on the NeRF model in embodiment 1, or the low-resolution photographed image at the original viewing angle with highlight details generated based on the NeRF model in embodiment 2.
The first reference image may be the high-resolution reference image correspondingly matched to the photographed image hereinafter. The second reference image may be the degraded low-resolution reference image obtained by inputting the high-resolution reference image into the NeRF model hereinafter.
The third image may be the super-resolution photographed image at the target viewing angle with highlight details in embodiment 1, or the super-resolution photographed image at the original viewing angle with highlight details in embodiment 2.
Specifically, high frequency information such as edge details, texture details and the like can be introduced into the image generated by the second image processing model based on the high resolution reference image, and residual features can be introduced into the image generated by the second image processing model based on the low resolution reference image so as to further repair the high frequency information lost or distorted due to the degradation process.
Based on the foregoing, the definition of the low-resolution photographed image at the target viewing angle or the original viewing angle with high light details generated by the NeRF model can be improved, and the super-resolution photographed image at the target viewing angle or the original viewing angle with high light details can be generated. It will be appreciated that the second image processing model may be any image processing model capable of introducing high frequency information into the second image based on the high resolution reference image and/or the low resolution reference image. The foregoing high frequency information or details include, but are not limited to, edge information, texture information, highlight information, and the like.
In one possible implementation of the first aspect, the second image processing model may include a reference-based super-resolution (RefSR) model.
Based on the RefSR model, the image processing method provided in this application can improve the resolution and the image texture of the image generated by the NeRF model.
In one possible implementation of the first aspect, the second image processing model includes a first model, a second model, and a third model, and the inputting the second image, the first polarization parameter, the first reference image, and the second reference image into the second image processing model, to obtain a third image includes: inputting a first reference image, a second image and a first polarization parameter into the first model to obtain a first feature vector; inputting a second reference image and a second image into the second model to obtain a second feature vector; inputting the first characteristic vector and the second characteristic vector into a third model to obtain a third image; the third image is a super-resolution image, and the third image comprises first high-light information corresponding to the first polarization parameter and first high-frequency information corresponding to the first feature vector and the second feature vector.
Here, the first model may be a high frequency model obtained by high frequency modeling of the RefSR model hereinafter, the second model may be a degradation model obtained by degradation modeling of the RefSR model hereinafter, and the third model may be a fusion module of the RefSR model hereinafter.
Specifically, the first feature vector may be the high-frequency feature output by the high-frequency model of the RefSR model hereinafter when the high-resolution reference image and the low-resolution photographed image, in combination with the target polarization parameter, are input into it.
The second feature vector may be the residual feature output by inputting the low-resolution photographed image at the target viewing angle with highlight details, the low-resolution reference image, the high-resolution reference image, and the target polarization parameter into the degradation model in the RefSR model hereinafter.
The third image may be a super-resolution photographed image under a target view angle or an original view angle with highlight details obtained by fusing the residual features and the high-frequency features based on a fusion module in the RefSR model. Here, the first high frequency information may be high frequency information corresponding to the aforementioned high frequency feature and residual feature in the third image, for example, edge details, texture details, and the like.
It will be appreciated that, based on the above, the photographed image can be adjusted to a super-resolution photographed image at the target viewing angle or at the original viewing angle with highlight details.
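The division of labour among the first, second, and third models can be pictured with the following minimal PyTorch-style sketch. The class and argument names are assumptions made for illustration; the patent only states that the high-frequency model, the degradation model, and the fusion module of the RefSR model play these roles, without fixing their interfaces.

    import torch.nn as nn

    class SecondImageProcessingModel(nn.Module):
        def __init__(self, high_freq_model, degradation_model, fusion_module):
            super().__init__()
            self.high_freq_model = high_freq_model      # "first model": outputs the first feature vector
            self.degradation_model = degradation_model  # "second model": outputs the second feature vector
            self.fusion_module = fusion_module          # "third model": fuses both into the third image

        def forward(self, second_image, polarization, first_reference, second_reference):
            first_vec = self.high_freq_model(first_reference, second_image, polarization)
            second_vec = self.degradation_model(second_reference, second_image)
            third_image = self.fusion_module(first_vec, second_vec)   # super-resolution output
            return third_image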
In one possible implementation manner of the first aspect, inputting the first reference image, the second image, and the first polarization parameter to the first model, to obtain the first feature vector includes: performing space-to-depth rearrangement operation on the up-sampling result of the second image and the first reference image to obtain a third feature vector; inputting the third feature vector into an encoder in the first model to obtain a fourth feature vector; splicing the first polarization parameter into a fourth feature vector based on the first model to obtain a fifth feature vector; the fifth feature vector is input to a decoder in the first model to obtain a first feature vector.
Here, based on the up-sampling of the second image, the image quality of the low-resolution photographed image at the target viewing angle or at the original viewing angle with highlight details generated by the NeRF model can be preliminarily improved.
The aforementioned space-to-depth rearrangement operation may be a space-to-depth (S2D) operation hereinafter. Through the S2D operation, feature fusion between the upsampling result of the second image and the first reference image can be realized, and the fusion result can be a third feature vector. It will be appreciated that the third feature vector may be a feature map that fuses the upsampled result of the second image and the first reference image.
Further, the input of the third feature vector to the encoder in the first model to obtain the fourth feature vector may include inputting a feature map, which is a combination of the up-sampling result of the low resolution captured image and the high resolution reference image, to the encoder in the high frequency model, and outputting the high frequency feature vector (as an example of the fourth feature vector) by the encoder.
Splicing the first polarization parameter into the fourth feature vector based on the first model to obtain the fifth feature vector may include splicing the target polarization parameter into the high-frequency feature vector output by the encoder of the high-frequency model. Correspondingly, the fifth feature vector may be a high-frequency feature vector that includes the target polarization parameter. Here, the target polarization parameter may be spliced into the high-frequency feature vector based on the concat function; the specific splicing manner is not limited in this application.
Further, inputting the fifth feature vector into the decoder in the first model to obtain the first feature vector may include inputting the high-frequency feature vector including the target polarization parameter into the decoder in the high-frequency model to obtain the high-frequency feature output by the high-frequency model.
It will be appreciated that the high frequency characteristics of the high frequency model output may be used to construct the aforementioned first high frequency information.
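A minimal sketch of this high-frequency branch is given below. It assumes torch.nn.functional.pixel_unshuffle as the space-to-depth rearrangement, an arbitrary convolutional encoder and decoder, and that the polarization parameter is spliced in as one extra feature channel; the up-sampling factor and channel handling are illustrative assumptions rather than the concrete design of the patent.

    import torch
    import torch.nn.functional as F

    def high_freq_forward(encoder, decoder, second_image, first_reference, polarization, scale=4):
        up = F.interpolate(second_image, scale_factor=scale, mode="bicubic")   # up-sample the low-resolution image
        stacked = torch.cat([up, first_reference], dim=1)                      # stack with the high-resolution reference
        third_vec = F.pixel_unshuffle(stacked, downscale_factor=scale)         # S2D rearrangement: third feature vector
        fourth_vec = encoder(third_vec)                                        # fourth feature vector
        p = torch.full_like(fourth_vec[:, :1], float(polarization))            # polarization parameter as an extra channel
        fifth_vec = torch.cat([fourth_vec, p], dim=1)                          # splice (concat) p: fifth feature vector
        return decoder(fifth_vec)                                              # first feature vector (high-frequency feature)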
In one possible implementation manner of the first aspect, inputting the second reference image and the second image to the second model to obtain the second feature vector includes: inputting a second reference image and a second image into the second model to obtain a sixth feature vector; and performing depth-to-space rearrangement operation on the sixth feature vector to obtain a second feature vector.
Here, inputting the second reference image and the second image into the second model to obtain the sixth feature vector may include inputting the low-resolution reference image and the low-resolution photographed image into the degradation model to obtain the residual feature vector (as an example of the sixth feature vector) output by the degradation model.
Here, the depth-to-space rearrangement operation may be a depth-to-space (D2S) operation hereinafter. Further, the D2S operation is performed on the sixth feature vector, and a residual feature (as an example of the second feature vector) after feature rearrangement may be obtained, so as to perform feature fusion with the high-frequency feature.
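The residual branch can be sketched in the same style, assuming torch.nn.functional.pixel_shuffle as the depth-to-space rearrangement; the channel layout and up-scale factor are again illustrative assumptions.

    import torch
    import torch.nn.functional as F

    def degradation_forward(degradation_model, second_reference, second_image, scale=4):
        stacked = torch.cat([second_reference, second_image], dim=1)       # both low-resolution inputs
        sixth_vec = degradation_model(stacked)                             # sixth feature vector (residual features)
        second_vec = F.pixel_shuffle(sixth_vec, upscale_factor=scale)      # D2S rearrangement: second feature vector
        return second_vec                                                  # to be fused with the high-frequency feature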
In a second aspect, the present application provides a model training method, applied to an electronic device, where the method includes: acquiring a plurality of fourth images, wherein the fourth images comprise high-resolution polarized images obtained by shooting the same shooting object based on different shooting visual angles, and the image parameters of the fourth images at least comprise the position parameters of all pixel points of the fourth images, the direction parameters of the fourth images and the polarization parameters; inputting image parameters of a fourth image into the first image processing model to obtain training pixel color parameters of each pixel point of the fourth image; calculating a loss value through a loss function according to the training pixel color parameters and the actual color parameters of each pixel point of the fourth image; and adjusting parameters of the first image processing model so that the loss value is located in a preset interval.
Here, the fourth image may be the high-resolution reference image hereinafter, and the plurality of fourth images may be the plurality of sets of high-resolution reference images hereinafter. The position parameters of the respective pixels of the fourth image may be the position parameters (x, y, z) of the high-resolution reference image hereinafter. The direction parameter of the fourth image may be the direction parameter (θ, Φ) characterizing the shooting viewing angle of the high-resolution reference image hereinafter. The polarization parameter of the fourth image may be the polarization parameter p hereinafter, characterizing the polarization direction of the polarization filter or polarization lens that captured the high-resolution reference image.
The training pixel color parameter of each pixel point of the fourth image may be the pixel color, output by the NeRF model, that the spatial point corresponding to the pixel point presents at the shooting viewing angle of the high-resolution reference image. Here, the actual pixel color parameter of each pixel point of the fourth image may be the original pixel color of that pixel point in the high-resolution reference image.
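For illustration, one training iteration of the second aspect may be sketched as follows, assuming that the model maps the six-dimensional image parameters of the fourth image to per-pixel colors and that a mean-squared-error loss stands in for the unspecified loss function; the names and shapes are assumptions.

    import torch
    import torch.nn.functional as F

    def train_step(model, optimizer, image_params, actual_colors):
        # image_params: (N, 6) rows of (x, y, z, theta, phi, p) for sampled pixels of the fourth image
        # actual_colors: (N, 3) actual pixel color parameters of those pixels
        pred_colors = model(image_params)                 # training pixel color parameters
        loss = F.mse_loss(pred_colors, actual_colors)     # loss value computed via the loss function
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()                                  # adjust parameters of the first image processing model
        return loss.item()                                # checked against the preset interval by the caller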
In a possible implementation of the second aspect, the first image processing model includes a NeRF model.
Here, the training data input to the NeRF model includes a fourth image having a highlight effect corresponding to the polarization parameter, and therefore training the NeRF model based on the fourth image and the polarization parameter of the fourth image can cause the NeRF model to learn the correspondence between the polarization parameter and the highlight effect. Furthermore, in the process of applying the NeRF model, the highlight effect corresponding to the target polarization parameter can be added to the input shooting image based on the input target polarization parameter.
In a possible implementation of the second aspect, the image processing method further includes: and obtaining a fifth image based on the training pixel color parameters of each pixel point of the fourth image, wherein the fifth image is a low-resolution polarized image corresponding to the fourth image.
Here, the fifth image may be a low resolution reference image into which the high resolution reference image is degraded through the NeRF model. It can be understood that, based on the foregoing manner, the pixel color output by the NeRF model is affected by the polarization parameter of the high-resolution reference image, so that after a plurality of training iteration processes are performed, the loss value is located in a preset interval, and the low-resolution polarized image output by the NeRF model based on the pixel color of each pixel point can also exhibit a highlight effect corresponding to the polarization parameter. Here, the present application does not make a limiting description about a preset interval in which the loss value should be located.
In a possible implementation of the second aspect, the image processing method further includes: inputting the fourth image, the fifth image and polarization parameters of the fourth image into the second image processing model to obtain a sixth image; calculating a loss value through a loss function according to the sixth image and the fourth image; and adjusting parameters of the second image processing model so that the loss value is located in a preset interval.
Here, the fifth image may include a low resolution reference image and/or a low resolution reference image at a new view angle hereinafter. The sixth image may be a super resolution reference image at a new view angle hereinafter.
Specifically, high frequency information such as edge details, texture details and the like can be introduced into the image generated by the second image processing model based on the high resolution reference image, and residual features can be introduced into the image generated by the second image processing model based on the low resolution reference image and/or the low resolution reference image under a new view angle to further repair the high frequency information lost or distorted due to the degradation process. The definition of the low resolution reference image generated by the NeRF model can be improved based on the foregoing. It will be appreciated that the second image processing model may be any image processing model capable of introducing high frequency information to the low resolution reference image generated by the NeRF model based on the high resolution reference image and/or the low resolution reference image. The foregoing high frequency information or details include, but are not limited to, edge information, texture information, highlight information, and the like.
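The corresponding training step of the second image processing model can be sketched briefly as well, reusing the SecondImageProcessingModel structure shown earlier and assuming an L1 reconstruction loss, since the patent does not name the loss function.

    import torch.nn.functional as F

    def refsr_train_step(refsr, optimizer, fourth_image, fifth_image, polarization):
        # fourth_image: high-resolution reference image; fifth_image: its degraded low-resolution counterpart
        sixth_image = refsr(fifth_image, polarization, fourth_image, fifth_image)
        loss = F.l1_loss(sixth_image, fourth_image)       # compare the sixth image against the fourth image
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()                                  # adjust parameters of the second image processing model
        return loss.item()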
In a possible implementation of the second aspect, the second image processing model includes a RefSR model.
Based on the RefSR model, the image processing method provided in this application can improve the resolution and the image texture of the image generated by the NeRF model.
In one possible implementation of the second aspect, the second image processing model includes a first model, a second model, and a third model, and the inputting, into the second image processing model, the fourth image, the fifth image, and polarization parameters of the fourth image, to obtain a sixth image includes: inputting polarization parameters of the fourth image, the fifth image and the fourth image into a first model to obtain a first training feature vector; inputting the fifth image into a second model to obtain a second training feature vector; and inputting the first training feature vector and the second training feature vector into a third model to obtain a sixth image.
Here, the first model may be a high frequency model obtained by high frequency modeling of the RefSR model hereinafter, the second model may be a degradation model obtained by degradation modeling of the RefSR model hereinafter, and the third model may be a fusion module of the RefSR model hereinafter.
Specifically, the first training feature vector may be the high-frequency feature obtained by inputting the high-resolution reference image, the low-resolution reference image at the new viewing angle, and the polarization parameter corresponding to the new viewing angle into the high-frequency model. It can be understood that the polarization parameter corresponding to the new viewing angle may also be the polarization parameter of the high-resolution reference image that corresponds to the low-resolution reference image at the new viewing angle, that is, of the high-resolution reference image from which that low-resolution reference image was obtained by degradation after being input into the NeRF model.
Corresponding to a scenario in which the fifth image includes the low-resolution reference image and the low-resolution reference image at the new viewing angle, inputting the fifth image into the second model to obtain the second training feature vector may include inputting the low-resolution reference image at the new viewing angle into the degradation model to obtain the residual feature.
Inputting the first training feature vector and the second training feature vector into the third model to obtain the sixth image may include inputting the high-frequency features obtained based on the high-frequency model and the residual features obtained based on the degradation model into the fusion module to obtain the super-resolution reference image at the new viewing angle.
In one possible implementation manner of the second aspect, inputting the polarization parameters of the fourth image, the fifth image, and the fourth image into the first model to obtain the first training feature vector includes: performing space-to-depth rearrangement operation on the upsampling result of the fifth image and the fourth image to obtain a third training feature vector; inputting the third training feature vector into an encoder in the first model to obtain a fourth training feature vector; splicing the polarization parameters of the fourth image to the fourth training feature vector to obtain a fifth training feature vector; and inputting the fifth training feature vector into a decoder in the first model to obtain a first training feature vector.
Here, corresponding to a scenario in which the fifth image includes the low-resolution reference image and the low-resolution reference image at the new viewing angle, up-sampling the fifth image includes up-sampling the low-resolution reference image at the new viewing angle in the fifth image. In this manner, the image quality of the low-resolution reference image at the new viewing angle generated by the NeRF model can be preliminarily improved.
The aforementioned space-to-depth rearrangement operation may be a space-to-depth (S2D) operation hereinafter. Through the S2D operation, feature fusion between the up-sampling result of the low-resolution reference image at the new viewing angle and the high-resolution reference image can be realized, and the fusion result may be the third training feature vector. It will be appreciated that the third training feature vector may be a feature map that fuses the up-sampling result of the low-resolution reference image at the new viewing angle and the high-resolution reference image.
Further, inputting the third training feature vector into the encoder in the first model to obtain a fourth training feature vector may include inputting a feature map, which is a combination of the upsampling result of the low resolution reference image at the new view angle and the high resolution reference image, into the encoder in the high frequency model, and outputting the high frequency feature vector (as an example of the fourth training feature vector) through the encoder.
Splicing the polarization parameter of the fourth image into the fourth training feature vector based on the first model to obtain the fifth training feature vector may include splicing the polarization parameter of the fourth image into the high-frequency feature vector output by the encoder of the high-frequency model. Correspondingly, the fifth training feature vector may be a high-frequency feature vector that includes the polarization parameter of the fourth image. Here, the polarization parameter of the fourth image may be spliced into the high-frequency feature vector based on the concat function; the specific splicing manner is not limited in this application.
Further, inputting the fifth training feature vector into the decoder in the first model to obtain the first training feature vector may include inputting the high-frequency feature vector including the polarization parameter of the fourth image into the decoder in the high-frequency model to obtain the high-frequency feature output by the high-frequency model.
It will be appreciated that the high frequency features output by the high frequency model may be used to construct the high frequency information of the sixth image.
In a possible implementation manner of the second aspect, inputting the fifth image into the second model to obtain the second training feature vector includes: inputting a fifth image into the second model to obtain a sixth training feature vector; and performing depth-to-space rearrangement operation on the sixth training feature vector to obtain a second training feature vector.
Here, inputting the fifth image into the second model to obtain the sixth training feature vector may include inputting the low-resolution reference image at the new viewing angle into the degradation model to obtain the residual feature vector (as an example of the sixth training feature vector) output by the degradation model.
Here, the depth-to-space rearrangement operation may be a depth-to-space (D2S) operation hereinafter. Further, performing the D2S operation on the sixth training feature vector may yield a residual feature after feature rearrangement (as an example of the second training feature vector), so as to perform feature fusion with the high-frequency feature.
In a third aspect, the present application provides an electronic device, including: one or more processors; one or more memories; the one or more memories store one or more programs that, when executed by the one or more processors, cause the electronic device to perform the image processing methods provided by the foregoing first aspect and various possible implementations of the first aspect, or to perform the model training methods provided by the foregoing second aspect and various possible implementations of the second aspect.
In a fourth aspect, the present application provides a computer readable medium having instructions stored thereon which, when executed on a computer, cause the computer to perform the image processing methods provided by the foregoing first aspect and various possible implementations of the first aspect, or to perform the model training methods provided by the foregoing second aspect and various possible implementations of the second aspect.
In a fifth aspect, the present application provides a computer program product comprising computer programs/instructions which when executed by a processor implement the image processing methods provided by the foregoing first aspect and the various possible implementations of the first aspect, or implement the model training methods provided by the foregoing second aspect and the various possible implementations of the second aspect.
The advantages of the second to fifth aspects may be referred to in the description of the first aspect and the various possible implementations of the first aspect, and are not described here in detail.
Drawings
Fig. 1 is a schematic diagram showing the effect of generating a photographed image under a target view angle based on a NeRF model;
fig. 2 is a schematic diagram of a process of generating a photographed image with a target viewing angle having a highlight effect based on a NeRF model according to an embodiment of the present application;
fig. 3 is a schematic flow chart of an image processing method according to an embodiment of the present application;
fig. 4a is a schematic diagram showing the effect of inputting a target direction parameter and a target polarization parameter by a user according to an embodiment of the present application;
fig. 4b is a schematic diagram showing the effect of inputting the target direction parameter and the target polarization parameter by the user according to another embodiment of the present application;
fig. 4c shows an application scenario of the image processing method provided in the embodiment of the present application;
fig. 5 is a schematic flow chart of another image processing method according to an embodiment of the present application;
fig. 6 is a schematic diagram illustrating the effect of inputting target polarization parameters by a user according to an embodiment of the present application;
fig. 7a is a schematic diagram of a training process of a NeRF model according to an embodiment of the present application;
fig. 7b is a schematic diagram illustrating a training process of a RefSR model according to an embodiment of the present application;
fig. 7c is a schematic diagram illustrating a process of generating a high-resolution photographed image based on a NeRF model and a RefSR model according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of a mobile phone 100 according to an embodiment of the present application;
fig. 9 is a block diagram of a software structure of a mobile phone 100 according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions of the embodiments of the present application will be described in detail below with reference to the accompanying drawings and specific embodiments of the present application.
It will be appreciated that the electronic devices to which the image processing method provided in the embodiments of the present application is applicable may include, but are not limited to, mobile phones, tablet computers, desktop computers, laptop computers, handheld computers, netbooks, augmented reality (AR)/virtual reality (VR) devices, smart televisions, smart watches and other wearable devices, servers, mobile email devices, in-vehicle devices, and other televisions or electronic devices embedded with or coupled to one or more processors.
In order to facilitate understanding of the solutions in the embodiments of the present application by those skilled in the art, the process and principle of generating a photographed image at a target viewing angle based on the NeRF model will be explained first.
In the training of the NeRF model, the training data of the NeRF model may be a plurality of sets of photographed images, where each set of photographed images may include a plurality of photographed images captured at a plurality of viewing angles of the same subject. The input data of the NeRF model may include the position parameters (x, y, z) of the respective pixel points of the captured image and the direction parameters (θ, Φ) representing the target viewing direction. The position parameter (x, y, z) of a pixel point may be the three-dimensional position coordinate, in a world coordinate system, of the spatial point corresponding to that pixel point. The target viewing direction may be the shooting viewing angle to be obtained based on the NeRF model, θ may be the camera pose parameter corresponding to the target viewing direction, and Φ may be the camera intrinsic parameter corresponding to the target viewing direction, where the camera intrinsic parameter may include parameters such as a focal length, a distortion coefficient, and a scaling factor that can represent the target viewing direction.
In each training iteration, a batch of pixel points may be selected from the training data, a plurality of sampling points may be selected on the ray from the camera to the spatial point corresponding to each pixel point, and the color value and volume density value of each sampling point may be predicted based on the NeRF model. The color values and volume density values of the sampling points are then integrated based on the volume rendering technique, and the resulting integration result can be used as the predicted pixel color of the pixel generated by that ray projection.
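The integration mentioned above is the standard NeRF volume-rendering composition along one ray; a minimal sketch with illustrative variable names is given below (the patent does not spell out the exact formulation, so this serves only as an example of the technique).

    import torch

    def composite_ray(colors, densities, deltas):
        # colors: (S, 3) predicted color values of the S sampling points along one ray
        # densities: (S,) predicted volume density values; deltas: (S,) spacing between consecutive samples
        alpha = 1.0 - torch.exp(-densities * deltas)              # opacity contributed by each sampling point
        trans = torch.cumprod(1.0 - alpha + 1e-10, dim=0)         # accumulated transmittance along the ray
        trans = torch.cat([torch.ones(1), trans[:-1]])            # the first sample is not occluded
        weights = alpha * trans
        return (weights[:, None] * colors).sum(dim=0)             # predicted pixel color of the pixel generated by this ray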
Here, the loss between the predicted pixel color and the actual pixel color may be calculated based on a loss function, parameters in the NeRF model may be adjusted by a gradient descent method or the like, and the loss may be converged to a satisfactory level over multiple training iterations, so as to improve the prediction accuracy of the NeRF model.
Based on the above training process, in a scene where a captured image at a target viewing angle is generated by applying a NeRF model, position parameters (x, y, z) of respective pixels of the captured image and direction parameters (θ, Φ) representing the target viewing angle may be input to the trained NeRF model. The NeRF model may predict color values and bulk density values of a number of sampling points corresponding to the pixel points based on the input five-dimensional data (x, y, z, θ, Φ). Furthermore, the NeRF model can integrate the color values and the volume density values of a plurality of sampling points based on the volume rendering technology to obtain the pixel color of the pixel generated by the light projection. The pixel color of each generated pixel point can be used for generating a shooting image under the target visual angle. It is understood that the photographed image at the target viewing angle may be an image after changing the photographed viewing angle of the photographed image to the target viewing angle based on the NeRF model.
It can be seen from the training process and the application process of the NeRF model that the NeRF model determines the pixel color of each pixel point independently, without considering the influence of the spatial relationship between different pixel points on the pixel colors. As a result, the generated photographed image at the target viewing angle may lack highlight details and the like.
In view of the above-described lack of highlight details in the photographed image at the target viewing angle generated by the NeRF model, the image processing method provided in this application, which can improve the highlight details in the photographed image at the target viewing angle, is described in detail below.
It can be understood that the polarized filters or polarized lenses with different polarization parameters can filter reflected light rays in different directions, so that polarized images shot by the polarized camera can show high light effects corresponding to the polarization parameters of the polarized camera. Here, the polarization camera includes an electronic device provided with a polarization filter or a polarization lens, and the polarization parameter of the polarization camera may be a polarization parameter of the polarization filter or the polarization lens, and the polarization parameter may include a polarization direction or the like.
Specifically, taking a portrait as an example, part of the light projected onto the non-highlight areas of the portrait can be filtered out by adjusting the polarization parameter of the polarization filter or polarization lens. It will be appreciated that after part of the light projected onto the non-highlight areas of the portrait is filtered out, the pixel colors of the non-highlight areas of the portrait may change; for example, the purity, brightness, and the like of the pixels of the non-highlight areas may become lower. Therefore, adjusting the polarization parameter can enlarge the differences between the non-highlight areas and the highlight areas, including the differences in pixel purity and brightness between the two kinds of areas. In this way, the highlight areas of the portrait can be made more prominent, so that the portrait in the captured polarized image shows a pronounced highlight effect.
Therefore, in order to solve the technical problem that the shot image under the target view angle generated by the NeRF model lacks highlight details, the application provides an image processing method. In order to improve the highlight effect in the photographed image under the target visual angle, the method can take a plurality of groups of polarized images as training data of the NeRF model, so that the NeRF model can learn different highlight effects corresponding to different polarization parameters of the polarized images through a plurality of training iterative processes. For example, the NeRF model may learn the degree of influence of different polarization parameters on the pixel color of each pixel in the polarized image, and so on. Furthermore, when the trained NeRF model is used to adjust the viewing angle of the photographed image, the photographed image and the target polarization parameter may be input to the NeRF model, so that the photographed image under the finally generated target viewing angle may exhibit a highlight effect corresponding to the target polarization parameter. By the method, the highlight effect of the photographed image at the target visual angle generated based on the NeRF model can be effectively improved, and the image texture of the photographed image at the target visual angle is improved.
Specifically, fig. 2 shows a schematic diagram of a process of generating a photographed image at a target viewing angle with a high light effect by applying the NeRF model provided in the present application.
Referring to fig. 2, in order for the NeRF model to generate a photographed image at a target viewing angle with a high light effect, polarization parameters may be spliced into input data of the NeRF model such that the input data of the NeRF model is expanded into position parameters, direction parameters, and polarization parameters. Correspondingly, when the trained NeRF model is used to adjust the viewing angle of the captured image, the input data of the NeRF model may be six-dimensional data (x, y, z, θ, Φ, p) corresponding to the pixels in the captured image. The position parameter (x, y, z) may be a spatial position coordinate of a spatial point corresponding to each pixel point in the captured image, the (θ, Φ) may be a target direction parameter for representing the adjusted target viewing angle, and the p may be a target polarization parameter corresponding to the desired highlight effect.
Furthermore, the positional multilayer perceptron (positional MLP) in the NeRF model can obtain the volume density values of a plurality of sampling points corresponding to a spatial point based on the spatial position coordinates (x, y, z) of the spatial point corresponding to the pixel point. The directional multilayer perceptron (directional MLP) in the NeRF model can obtain the color value of each sampling point based on the target direction parameter (θ, Φ), the target polarization parameter p, and the feature vector of the sampling point output by the positional MLP. The plurality of sampling points corresponding to the spatial point may include a plurality of sampling points selected on the ray from the camera to the spatial point. Then, based on the volume rendering technique, the NeRF model integrates the volume density values and color values of the plurality of sampling points corresponding to the spatial point, so that the pixel color of the pixel point corresponding to that spatial point at the target viewing angle can be obtained.
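The two-branch structure described above can be pictured with the following minimal PyTorch-style sketch: a positional MLP maps (x, y, z) to a volume density and a feature vector, and a directional MLP maps that feature vector together with (θ, Φ) and the polarization parameter p to a color value. The layer sizes are arbitrary and the positional encoding used by typical NeRF implementations is omitted for brevity; this is an illustration, not the exact network of the patent.

    import torch
    import torch.nn as nn

    class PolarizedNeRF(nn.Module):
        def __init__(self, feat_dim=256):
            super().__init__()
            self.positional_mlp = nn.Sequential(
                nn.Linear(3, feat_dim), nn.ReLU(),
                nn.Linear(feat_dim, feat_dim + 1))                # feature vector plus volume density
            self.directional_mlp = nn.Sequential(
                nn.Linear(feat_dim + 3, feat_dim // 2), nn.ReLU(),
                nn.Linear(feat_dim // 2, 3))                      # RGB color value

        def forward(self, xyz, theta_phi, p):
            # xyz: (N, 3) sampling point coordinates; theta_phi: (N, 2); p: (N,) polarization parameter
            out = self.positional_mlp(xyz)                        # (N, feat_dim + 1)
            feat, sigma = out[:, :-1], torch.relu(out[:, -1])     # split feature vector and volume density
            d_in = torch.cat([feat, theta_phi, p[:, None]], dim=-1)   # splice (theta, phi) and p into the feature
            color = torch.sigmoid(self.directional_mlp(d_in))     # color value of the sampling point
            return color, sigma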
Based on the foregoing, it can be appreciated that the pixel color of each pixel point obtained with the NeRF model provided in this application is affected by the target polarization parameter. Accordingly, a photographed image at the target viewing angle generated from the pixel colors of the pixel points can exhibit the highlight effect corresponding to the target polarization parameter. For example, as shown in fig. 2, when the photographed image and the target polarization parameter are input into the NeRF model provided in this application, the output photographed image at the target viewing angle can show the highlight effect corresponding to the target polarization parameter. Through this NeRF model and the image processing method, the defect that the photographed image at the target viewing angle generated by the NeRF model lacks highlight details can be effectively overcome, and the image texture of the photographed image at the target viewing angle generated based on the NeRF model can be improved.
Based on the foregoing principle explanation of the image processing method provided in the present application, a detailed description will be given below of specific implementation procedures of the image processing method provided in the present application in different application scenarios with reference to the accompanying drawings and different embodiments.
It should be further noted that, in the embodiments of the present application, the steps of the methods and flows are numbered for convenience of reference rather than to limit their order; where an order between steps exists, the textual description shall prevail.
Embodiment 1
This embodiment describes in detail a specific implementation process of the image processing method provided in the embodiments of the present application in a scenario in which the user needs to adjust the viewing angle of a photographed image.
First, fig. 3 shows a flowchart of an image processing method provided in an embodiment of the present application, and a detailed description will be given below of a specific process of applying the image processing method provided in the embodiment of the present application to an electronic device in conjunction with fig. 3.
It will be appreciated that the electronic device implementing the steps of the flow described in fig. 3 may be the handset 100 described above. For convenience of description, the mobile phone 100 is taken as an execution body in the following description of each step, and the execution body will not be described in detail.
Specifically, the implementation flow of the image processing method provided in the embodiment of the present application may include the following steps:
300: a captured image is acquired in response to a first user operation.
In one example manner, the photographed image may be acquired in real time by the mobile phone 100 based on a photographing operation of the user. In this example manner, the first user operation may be a photographing operation performed by the user based on a photographing application in the mobile phone 100, for example, the photographing operation may be a clicking operation of a shutter control in the photographing application by the user, a convenient operation in which photographing can be achieved, or the like. The convenient operation may be a clicking operation of a physical key or a virtual key such as a volume key or a power key of the mobile phone 100, a voice operation of dictating a specific password, a gesture operation of performing a specific gesture operation, or the like. Here, the present application does not make a restrictive explanation of the first user operation of acquiring the user-captured image.
In this example manner, the image processing method provided in the present application may be integrated in the foregoing photographing application, so as to implement automatic adjustment of a photographed image or provide a manual adjustment scheme to a user directly in the photographing application based on the following steps 301 to 304 after the user completes a photographing operation based on the photographing application. Here, the photographing application may be a system application in the mobile phone 100, for example, a camera application, or may be a third party application installed in the mobile phone 100 or an applet usable by the mobile phone 100. The present application does not make a limiting explanation of the shooting application in which the image processing method is integrated.
In another example manner, the captured image may be an image that has been stored in the cell phone 100. In this example manner, the first user operation may be a user selection operation of a photographed image stored in the mobile phone 100. Here, the image processing method provided by the present application may be integrated in an image processing application, and the user may perform a selection operation of a photographed image based on an image selection function provided by the image processing application. Further, automatic adjustment of the captured image or providing a manual adjustment scheme to the user may be implemented in the image processing application based on steps 301 to 304 described below.
Here, the image processing application may be the aforementioned photographing application, that is, the photographing application may perform the immediate optimization of the photographed image photographed by the user in real time, or may perform the post-optimization of the photographed image stored in the mobile phone 100 selected by the user. The image processing application may also be other applications for post-optimizing the captured image stored in the cell phone 100. Specifically, the image processing application may be a system application of the mobile phone 100, such as a gallery application (or album application). Or may be a third party application installed in the handset 100 or an applet or the like that the handset 100 may use. The present application does not make a limiting explanation of the image processing application integrating the image processing method.
301: and acquiring a target direction parameter and a target polarization parameter corresponding to the shot image.
For example, the target direction parameter and the target polarization parameter corresponding to the photographed image may be preset by the mobile phone 100, and it may be understood that the user may also modify the target direction parameter and the target polarization parameter preset by the mobile phone 100. In addition, the target direction parameter and the target polarization parameter corresponding to the photographed image may also be determined based on the input of the user.
In one example manner, the mobile phone 100 may preset the target polarization parameter and the target direction parameter. Taking the example that the mobile phone 100 performs real-time optimization, based on the photographing application, of a photographed image captured by the user in real time, when the user captures the photographed image in step 300, the photographing application may acquire the target polarization parameter and the target direction parameter preset in the mobile phone 100 before displaying the photographed image to the user, and generate the super-resolution photographed image at the target viewing angle with highlight details based on steps 302 to 303 below according to the target polarization parameter and the target direction parameter. Further, the super-resolution photographed image at the target viewing angle with highlight details may be displayed to the user based on step 304 below.
In another example, the mobile phone 100 may also preset only the target polarization parameter and determine the target direction parameter based on the user input. For example, the photographing application may instantly optimize the photographed image based on the target polarization parameters preset by the mobile phone 100 to generate a super-resolution photographed image with high light detail based on steps 302 to 303 described below. It can be appreciated that at this time, the angle of view of the super-resolution captured image with highlight detail coincides with the original captured image. Further, the mobile phone 100 may save the super-resolution photographed image with highlight detail, and may perform viewing angle adjustment on the super-resolution photographed image with highlight detail based on a target direction parameter input by a user to generate the super-resolution photographed image at the target viewing angle with highlight detail based on steps 302 to 303 described below.
Correspondingly, the mobile phone 100 may preset only the target direction parameter and determine the target polarization parameter based on user input. In this scenario, the process by which the mobile phone 100 generates the super-resolution photographed image at the target viewing angle with highlight details is substantially the same as that described above and is not repeated here.
In yet another example manner, the cell phone 100 may determine the target direction parameter and the target polarization parameter based on user input. Taking the example of the mobile phone 100 performing the post-optimization on the photographed image stored in the mobile phone 100 based on the image processing application, fig. 4a shows an effect diagram of inputting the target direction parameter and the target polarization parameter by the user.
For example, referring to fig. 4a, when a user selects a photographed image 400a based on an image selection function provided by an image processing application, the image processing application may display the photographed image 400a in the adjustment interface 400. Further, the user may input the target direction parameter according to the perspective adjustment control 400b in the adjustment interface 400. For example, the view angle adjustment control 400b may be a slider control, and different positions of the slider may correspond to different target direction parameters. Accordingly, the user can make a viewing angle adjustment by sliding the viewing angle adjustment control 400b. After the user slides the view angle adjustment control 400b, the image processing application may obtain a target direction parameter corresponding to the sliding position at this time, and display the generated super-resolution captured image 400c of the target view angle in the adjustment interface 400 based on steps 302 to 304 described below according to the target direction parameter.
Further, the user may continue to input the target polarization parameter according to the highlight adjustment control 400d in the adjustment interface 400. For example, the highlight adjustment control 400d may be a slider control, and different positions of the slider may correspond to different target polarization parameters. Accordingly, the user can perform highlight adjustment on the super-resolution photographed image 400c at the target viewing angle by sliding the highlight adjustment control 400d. After the user slides the highlight adjustment control 400d, the image processing application may obtain the target polarization parameter corresponding to the sliding position at this time, and display the generated super-resolution captured image 400e with highlight detail under the target viewing angle in the adjustment interface 400 based on the following steps 302 to 304 according to the target polarization parameter. The super-resolution photographed image 400e may have a highlight region 400f.
In addition, fig. 4b shows another schematic effect of the user inputting the target direction parameter and the target polarization parameter. Referring to fig. 4b, the image processing application may also obtain the corresponding target polarization parameter and target direction parameter after the user slides the viewing angle adjustment control 400b and the highlight adjustment control 400d, and display the generated super-resolution captured image 400e with highlight detail at the target viewing angle in the adjustment interface 400 based on steps 302 to 304 described below.
In addition, it can be appreciated that the user may also input the target polarization parameter for highlighting adjustment and then input the target direction parameter for viewing angle adjustment according to the adjustment interface 400. Here, the present application does not provide a restrictive explanation of the adjustment process performed by the user according to the adjustment interface 400.
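As an illustrative aside, the mapping from slider positions to target parameters is not specified in this application; the following is a minimal sketch assuming the slider reports a normalized position in [0, 1], that the target direction parameter is expressed as a yaw offset in degrees, and that the target polarization parameter is a polarization direction in degrees. All of these ranges, names, and the choice of Python are assumptions made only for illustration.

```python
# Hypothetical sketch: map normalized slider positions to target parameters.
# The actual value ranges and parameterization are not specified in this application.

def slider_to_direction(position: float, max_yaw_deg: float = 30.0) -> float:
    """Map a slider position in [0, 1] to a target viewing-angle offset in degrees."""
    return (position - 0.5) * 2.0 * max_yaw_deg

def slider_to_polarization(position: float) -> float:
    """Map a slider position in [0, 1] to a polarization direction in [0, 180) degrees."""
    return position * 180.0

# Example: slider at 0.75 -> +15 degree view offset, 135 degree polarization direction.
target_direction = slider_to_direction(0.75)
target_polarization = slider_to_polarization(0.75)
```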
302: Determine a high-resolution reference image corresponding to the photographed image.
Illustratively, as can be seen from the foregoing description of fig. 2, the NeRF model provided in the present application can enhance the highlight details of the captured image. However, since the NeRF model does not consider the influence of the spatial relationship between different pixel points on the pixel color, the generated captured image under the target viewing angle may lack edge details, texture details, and the like, in addition to the highlight details. It will be appreciated that the lack of edge details, texture details, etc. will result in lower resolution of the captured image at the target viewing angle generated by the NeRF model, i.e., the captured image at the target viewing angle generated by the NeRF model may be a low-resolution captured image at the target viewing angle with highlight detail. Therefore, the image processing method provided by the application can also be combined with a reference-based super-resolution (RefSR) model, and edge details and texture details in the low-resolution captured image at the target viewing angle output by the NeRF model are supplemented based on the high-resolution reference image so as to improve the resolution of the image. Here, for the sake of description consistency, details of improving the resolution of the image based on the RefSR model will be specifically described in the following descriptions of fig. 7b and fig. 7c, and are not repeated herein.
Here, the mobile phone 100 may store one or a plurality of high-resolution reference images, and the high-resolution reference images may be polarized images captured by an electronic device having a polarization lens incorporated therein or a polarization filter attached thereto. The polarization parameters corresponding to the different high resolution reference images may be different, for example, a polarization lens or a polarization filter used to capture the different high resolution reference images may have different polarization parameters. The polarization parameters may include parameters such as polarization direction. The electronic device that captures the high-resolution reference image may be the mobile phone 100 or may be another electronic device other than the mobile phone 100.
For example, the high resolution reference image corresponding to the photographed image determined by the mobile phone 100 may be a high resolution reference image whose photographed content is the same as or similar to that of the photographed image. For example, after the mobile phone 100 acquires the shot image based on the above step 300, the shot scene and/or the shot object of the shot image and the high-resolution reference image may be identified, and further, the high-resolution reference image in which the shot scene and/or the shot object are identical or similar to the shot image may be used as the high-resolution reference image corresponding to the shot image. The shooting scene includes night scene, daytime scene, blue sky scene, rainy sky scene, starry sky scene, self-timer scene, dim light scene, strong light scene, etc., and the shooting object includes person, animal, flower, tree, lake, mountain, etc. The present application does not make a limiting description of a shooting scene and a shooting object.
Here, if the photographed object of the photographed image obtained in the foregoing step 300 is a person, since the persons photographed by the mobile phone 100 tend to recur, face recognition may be performed on the person in the photographed image, and one or more high-resolution reference images stored in the mobile phone 100 whose photographed object is the same person may be used as the high-resolution reference images corresponding to the photographed image. It can be understood that optimizing the photographed image based on high-resolution reference images of the same photographed person can effectively improve the optimization effect.
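To make step 302 concrete, the following is a minimal sketch of one possible way to select the reference image by matching recognized scene and subject labels. The functions `classify_scene` and `classify_subjects`, the library layout, and the scoring rule are assumptions introduced for illustration, not features defined by this application.

```python
# Hypothetical sketch of step 302: pick the stored high-resolution reference image
# whose recognized scene/subject labels best match those of the captured image.

def select_reference_image(captured, reference_library, classify_scene, classify_subjects):
    cap_scene = classify_scene(captured)               # e.g. "night", "daytime", "backlit"
    cap_subjects = set(classify_subjects(captured))    # e.g. {"person", "tree"}

    best, best_score = None, -1
    for ref in reference_library:                      # each ref: dict with image + cached labels
        score = 0
        if ref["scene"] == cap_scene:
            score += 1
        score += len(cap_subjects & set(ref["subjects"]))
        if score > best_score:
            best, best_score = ref, score
    return best  # highest-overlap reference; ties resolved by library order
```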
303: a super-resolution captured image is generated based on the captured image, the high-resolution reference image, the target direction parameter, and the target polarization parameter.
For example, for a scene where the mobile phone 100 acquires the target direction parameter and the target polarization parameter of the photographed image based on the foregoing step 301, the mobile phone 100 may generate the super-resolution photographed image under the target viewing angle with high light detail based on the photographed image, the high-resolution reference image, the target direction parameter and the target polarization parameter.
It can be understood that the foregoing scenario in which the mobile phone 100 acquires the target direction parameter and the target polarization parameter of the captured image includes, but is not limited to, the scenario in which the mobile phone 100 presets the target polarization parameter and the target direction parameter in the foregoing step 301, and the scenario in which the corresponding target polarization parameter and the target direction parameter are acquired based on the user operation shown in fig. 4b in the foregoing step 301.
The process of obtaining the super-resolution photographed image 400e at the target view angle with highlight detail based on the photographed image 400a will be described below taking the aforementioned scene shown in fig. 4b as an example.
Specifically, the mobile phone 100 may input the high-resolution reference image determined based on the foregoing step 302 into the NeRF model. Taking the example of the mobile phone 100 optimizing the captured image based on the image processing application, the image processing application may input the high-resolution reference image determined in the foregoing step 302 into the foregoing NeRF model to obtain a low-resolution reference image degraded by the NeRF model. It will be appreciated that the high-resolution reference image and the low-resolution reference image are essentially the same image; the low-resolution reference image is the result of degrading the high-resolution reference image through the image processing of the NeRF model. Here, for the sake of description consistency, the specific process of obtaining the low-resolution reference image from the high-resolution reference image will be described in detail in the following description of fig. 7a, and is not repeated herein.
And, the image processing application may input the photographed image 400a acquired based on the foregoing step 300, the target direction parameter and the target polarization parameter acquired based on the foregoing step 301 into the NeRF model to obtain a low resolution photographed image at the target viewing angle with high light detail. Here, for the sake of description consistency, a specific process of obtaining a low resolution captured image with high light detail at a target viewing angle based on the captured image, the target direction parameter and the target polarization parameter will be described in detail in the following description of fig. 7c, which is not repeated here.
Further, the image processing application may input the aforementioned low-resolution reference image, the low-resolution photographed image at the target viewing angle with highlight detail, the high-resolution reference image, and the target polarization parameter into the RefSR model, so as to introduce high-frequency information such as edge detail and texture detail based on the high-resolution reference image, and to introduce residual features based on the low-resolution reference image to further repair the high-frequency information lost or distorted by the degradation process. It is understood that, based on the foregoing, the sharpness of the low-resolution photographed image at the target viewing angle with highlight detail can be improved, and the super-resolution photographed image at the target viewing angle with highlight detail can be generated. For example, the super-resolution photographed image 400e at the target viewing angle with highlight detail output by the RefSR model can be obtained in the foregoing manner. Here, for descriptive consistency, the specific process of obtaining the super-resolution photographed image at the target viewing angle with highlight detail based on the low-resolution reference image, the low-resolution photographed image at the target viewing angle with highlight detail, and the high-resolution reference image will be described in detail in the following descriptions of fig. 7b and fig. 7c, and is not repeated herein.
Therefore, based on the above, the image processing method provided in the present application may perform viewing angle adjustment and highlight adjustment on the captured image based on the NeRF model, and perform super-resolution (SR) reconstruction based on the RefSR model, so as to obtain a super-resolution captured image with higher resolution and more high-frequency detail. In other words, the user's need to adjust the viewing angle of the captured image and generate a captured image at the target viewing angle is met based on the NeRF model, while the defect that the captured image at the target viewing angle generated by the NeRF model lacks high-frequency detail is overcome, yielding the super-resolution captured image at the target viewing angle with highlight detail. For example, for the scene shown in fig. 4b, the viewing angle and the highlight of the captured image 400a may be adjusted based on the foregoing, resulting in the super-resolution captured image 400e at the target viewing angle with highlight detail.
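Taken together, steps 300 to 304 can be summarized in the following sketch. The function and method names (`nerf.degrade`, `nerf.render`, `select_reference`, `refsr`) are hypothetical placeholders for the trained NeRF model, the trained RefSR model, and the reference-selection routine described above; the internals of these models are shown later in the descriptions of figs. 7a to 7c.

```python
# Minimal sketch of the application-level flow of steps 300-304 (fig. 4b scene).
# All names below are placeholders, not APIs defined by this application.

def optimize_captured_image(captured, target_direction, target_polarization,
                            nerf, refsr, select_reference, reference_library):
    # Step 302: pick a high-resolution reference with matching content.
    hr_ref = select_reference(captured, reference_library)

    # Step 303: degrade the reference through the NeRF model, render the captured
    # image at the target view with the requested highlight effect, then let the
    # RefSR model restore edge and texture detail.
    lr_ref = nerf.degrade(hr_ref)
    lr_target = nerf.render(captured, target_direction, target_polarization)
    sr_target = refsr(lr_target, lr_ref, hr_ref, target_polarization)

    return sr_target  # step 304: displayed in the adjustment interface
```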
The process of obtaining the super-resolution photographed image 400e at the target view angle with highlight detail based on the photographed image 400a will be described below taking the aforementioned scene shown in fig. 4a as an example.
For example, for the aforementioned scenario shown in fig. 4a, when the image processing application first acquires the corresponding target direction parameter based on the operation of the user sliding view angle adjustment control 400b, the image processing application may generate the super-resolution captured image 400c at the target view angle based on the captured image 400a, the high-resolution reference image, and the target direction parameter. When the image processing application acquires the corresponding target polarization parameter based on the operation of the user sliding the highlight adjustment control 400d, the image processing application may generate the super-resolution captured image 400e at the target viewing angle with highlight detail based on the super-resolution captured image 400c at the target viewing angle, the high-resolution reference image, and the target polarization parameter. Here, the process of obtaining the super-resolution photographed image 400c from the photographed image 400a and obtaining the super-resolution photographed image 400e from the super-resolution photographed image 400c is substantially the same as the process of obtaining the super-resolution photographed image 400e based on the photographed image 400a, and will not be described herein.
304: and displaying the generated super-resolution photographed image at the target viewing angle.
Illustratively, the super-resolution photographed image obtained by the mobile phone 100 based on the steps 300 to 303 includes: a super-resolution photographed image at a target viewing angle with high light detail generated based on the photographed image, the high-resolution reference image, the target direction parameter, and the target polarization parameter, for example, the aforementioned super-resolution photographed image 400e; a super-resolution photographed image at a target viewing angle generated based on the photographed image, the high-resolution reference image, and the target direction parameter, for example, the aforementioned super-resolution photographed image 400c. Here, specific contents of the super-resolution captured image at the target angle of view are not described in a limiting manner.
Specifically, for the aforementioned scene shown in fig. 4a, the image processing application may display a super-resolution captured image 400c at the target viewing angle generated based on the captured image 400a, the high-resolution reference image, and the target direction parameter in the adjustment interface 400. The super-resolution photographed image 400e at the target viewing angle with high light detail generated based on the super-resolution photographed image 400c, the high-resolution reference image, and the target polarization parameter may be displayed in the adjustment interface 400.
Specifically, for the aforementioned scene shown in fig. 4b, the image processing application may display, in the adjustment interface 400, a super-resolution captured image 400e at the target viewing angle with highlight detail generated based on the captured image 400a, the high-resolution reference image, the target direction parameter, and the target polarization parameter.
In addition, referring to fig. 4c, taking an example of a case where the mobile phone 100 performs post-optimization on the photographed image stored in the mobile phone 100 based on the foregoing image processing application, the image processing application may extract a part of the image in the photographed image, and further, adjust the viewing angle and the highlight for the part of the image.
For example, the captured image 401a acquired based on the foregoing step 300 may include a portrait 401b and a background 401c. Further, the image processing application may extract the portrait 401b from the captured image 401a based on a user operation and display the portrait 401b in the adjustment interface 400. Here, the user operation for extracting the portrait 401b from the captured image 401a may be a single-finger or multi-finger operation on the portrait 401b, for example, a single-finger long press on the portrait 401b, a double-finger tap on the portrait 401b, or the like. It may also be a tap on a matting control (not shown in the figure) in the adjustment interface 400, or the like. Here, the present application does not limit the user operation for extracting the portrait 401b. In addition, the partial image extracted by the image processing application based on the user operation may also be the face of a person in the portrait 401b or the like, and the specific content of the partial image extracted based on the user operation is not limited in the present application.
Further, the image processing application may perform highlight adjustment and viewing angle adjustment on the portrait 401b based on the foregoing steps 301 to 303, generate a super-resolution captured image 401d at the target viewing angle with a highlight region, and display the super-resolution captured image 401d in the adjustment interface 400.
Here, the user may save the super-resolution captured image 401d separately, and the user may, for example, make the super-resolution captured image 401d into an avatar, a sticker (emoji pack), or wallpaper, or combine it with other images. The present application does not limit the specific application scenario after the user saves the super-resolution captured image 401d. It is understood that the user may also paste the super-resolution captured image 401d back into the captured image 401a.
Examples
This embodiment of the application describes in detail a specific implementation process of the image processing method provided in the embodiment of the application in an application scenario in which the user only needs to improve the texture of the photographed picture and does not need to adjust the viewing angle.
First, fig. 5 shows a flowchart of another image processing method provided in the embodiment of the present application, and a detailed description will be given below of a specific process of applying the image processing method provided in the embodiment of the present application to an electronic device in conjunction with fig. 5.
It will be appreciated that the electronic device implementing the steps of the flow described in fig. 5 may be the handset 100 described above. For convenience of description, the mobile phone 100 is taken as an execution body in the following description of each step, and the execution body will not be described in detail.
Specifically, the implementation flow of the image processing method provided in the embodiment of the present application may include the following steps:
500: a captured image is acquired in response to a first user operation.
Here, the specific process of acquiring the photographed image based on the first user operation by the mobile phone 100 may be referred to the specific description in the foregoing step 300, which is not described herein.
501: and acquiring a target polarization parameter corresponding to the shot image.
In one example manner, the mobile phone 100 may preset the target polarization parameter. For example, taking the example that the mobile phone 100 performs real-time optimization, based on the photographing application, of the image just photographed by the user: after the user photographs the image based on step 500, the photographing application may acquire the target polarization parameter preset in the mobile phone 100 before displaying the photographed image to the user, and generate the super-resolution photographed image with highlight detail based on the following steps 502 to 503 according to the target polarization parameter. Further, the super-resolution photographed image with highlight detail may be displayed to the user based on step 504 described below. It will be appreciated that, since the present embodiment does not adjust the photographing viewing angle of the photographed image, the generated super-resolution photographed image may be a super-resolution photographed image at the original viewing angle with highlight detail. Here, the original viewing angle is the photographing viewing angle of the photographed image acquired in step 500, that is, the photographing viewing angle corresponding to the direction parameter of the photographed image.
In another example manner, the cell phone 100 may determine the target polarization parameter based on user input. Taking the example of the mobile phone 100 performing the post-optimization on the photographed image stored in the mobile phone 100 based on the aforementioned image processing application, fig. 6 shows an effect of inputting the target polarization parameter by the user.
For example, referring to fig. 6, when the user selects the photographed image 600a based on the image selection function provided by the image processing application, the image processing application may display the photographed image 600a in the adjustment interface 600. Further, the user may input the target polarization parameter according to the highlight adjustment control 600b in the adjustment interface 600. For example, the highlight adjustment control 600b may be a slider control, and different positions of the slider may correspond to different target polarization parameters. Thus, the user can perform highlight adjustment on the photographed image by sliding the highlight adjustment control 600b. After the user slides the highlight adjustment control 600b, the image processing application may acquire the target polarization parameter corresponding to the sliding position at this time, and display the generated super-resolution photographed image 600c with highlight detail in the adjustment interface 600 based on steps 502 to 504 described below according to the target polarization parameter. The super-resolution photographed image 600c may have a highlight region 600d.
502: and determining a high-resolution reference image corresponding to the shot image.
As can be seen from the foregoing description of fig. 2, since the NeRF model does not consider the influence of the spatial relationship between different pixel points on the pixel color, the generated captured image at the target viewing angle may lack edge details, texture details, and the like, in addition to the highlight details. It can be appreciated that lack of edge details, texture details, etc. will result in lower resolution of the captured image at the target viewing angle generated by the NeRF model, i.e., the image generated based on the NeRF model provided by the embodiments of the present application may be a low resolution image. Therefore, the image processing method provided by the application can be combined with the RefSR model, and the edge details and the texture details in the low-resolution image with the high light effect output by the NeRF model are supplemented based on the high-resolution reference image, so that the resolution of the image is improved. Here, for the sake of description consistency, details of improving the resolution of the image based on the RefSR model will be specifically described in the following description of fig. 7c, which is not repeated herein.
The specific process of determining the high resolution reference image corresponding to the shot image by the mobile phone 100 can be referred to in the above-mentioned step 302, and will not be described herein.
503: a super-resolution captured image is generated based on the captured image, the high-resolution reference image, and the target polarization parameter.
For example, for a scene where the mobile phone 100 acquires the target polarization parameters of the photographed image based on the foregoing step 501, the mobile phone 100 may generate a super-resolution photographed image with high light detail based on the photographed image, the high-resolution reference image, and the target polarization parameters. The super-resolution photographed image with high-light details may be, for example, the super-resolution photographed image 600c with the high-light region 600d in fig. 6.
Specifically, the process of generating the super-resolution captured image with high-light details based on the captured image, the high-resolution reference image, and the target polarization parameter by the mobile phone 100 is substantially the same as the process of generating the super-resolution captured image with high-light details based on the captured image, the high-resolution reference image, the target direction parameter, and the target polarization parameter by the mobile phone 100 in the foregoing step 303, and will not be described herein.
It will be appreciated that, in this embodiment, since the user does not need to adjust the shooting angle of view, the direction parameter input into the NeRF model may be the direction parameter of the photographed image itself, that is, a direction parameter (as an example of a first direction parameter) characterizing the shooting angle of view of the photographed image. Furthermore, a low-resolution photographed image with highlight detail at the original viewing angle can be obtained through the NeRF model according to the photographed image and the target polarization parameter. The aforementioned original viewing angle may be the photographing viewing angle of the photographed image. Further, according to the photographed image, the target polarization parameter, and the high-resolution reference image, the super-resolution photographed image with highlight detail at the original viewing angle can be obtained through the RefSR model.
504: and displaying the super-resolution shooting image.
Illustratively, the super-resolution photographed image obtained by the mobile phone 100 based on the steps 500 to 503 includes a super-resolution photographed image with high-light details generated based on the photographed image, the high-resolution reference image, and the target polarization parameter, for example, the aforementioned super-resolution photographed image 600c with the high-light region 600 d. Here, the present application does not make a limiting description of the specific contents of the super-resolution captured image.
Specifically, for the aforementioned scene shown in fig. 6, the image processing application may display a super-resolution captured image 600c generated based on the captured image 600a, the high-resolution reference image, and the target polarization parameter in the adjustment interface 600.
Here, the NeRF model provided in the embodiment of the present application strengthens the highlight region 600d of the captured image 600a, and in addition, the RefSR model provided in the embodiment of the present application increases texture details, edge details, and the like of the super-resolution captured image 600c while maintaining the highlight region 600 d. It will be appreciated that the super-resolution captured image 600c may be a captured image 600a with enhanced highlight detail, texture detail, edge detail. That is, the highlight effect and definition of the photographed image 600a are effectively improved based on the foregoing steps 500 to 504.
The specific process of generating a super-resolution photographed image with highlight detail at a target viewing angle according to the image processing method provided by the present application will be described in detail below, based on the NeRF model and the RefSR model, with reference to the accompanying drawings.
First, a training process of the NeRF model and the RefSR model in the image processing method provided in the present application will be described in detail.
Illustratively, fig. 7a shows a schematic diagram of a training process of the NeRF model provided in the present application.
Referring to fig. 7a, the training data of the NeRF model may be multiple sets of high-resolution reference images (as one example of the aforementioned multiple sets of polarized images). Each set of high-resolution reference images may include a plurality of polarized images of the same subject photographed at a plurality of viewing angles, the polarized images being photographed by a camera provided with a polarization filter or a polarization lens. Different shooting viewing angles may correspond to different direction parameters, and different polarization filters or polarization lenses may correspond to different polarization parameters.
The input data of the NeRF model may include position parameters (x, y, z) of the respective pixels of each high-resolution reference image, direction parameters (θ, Φ) representing the photographing viewing angle of the high-resolution reference image, and a polarization parameter p representing the polarization direction of the polarization filter or polarization lens used to photograph the high-resolution reference image. The position parameter (x, y, z) of a pixel point may be the three-dimensional position coordinate, in the world coordinate system, of the spatial point corresponding to the pixel point. θ may be the camera pose corresponding to the photographing viewing angle of the high-resolution reference image, and Φ may be the camera intrinsic parameters corresponding to that photographing viewing angle. p may be the polarization direction of the polarization filter or polarization lens that captured the high-resolution reference image.
Furthermore, the positional MLP in the NeRF model may obtain the volume density values of a plurality of sampling points corresponding to a spatial point and the feature vector of each sampling point based on the spatial position coordinates (x, y, z) of the spatial point corresponding to the pixel point. The directional MLP in the NeRF model may obtain the color value of a sampling point based on the direction parameters (θ, Φ) of the high-resolution reference image, the polarization parameter p of the high-resolution reference image, and the feature vector of the sampling point output by the positional MLP. The plurality of sampling points corresponding to the spatial point may include a plurality of sampling points selected along the ray from the camera to the spatial point.
Furthermore, the volume density values and the color values of a plurality of sampling points corresponding to the spatial point can be integrated based on the volume rendering model in the NeRF model, so as to obtain the pixel color of the pixel point corresponding to the spatial point output by the NeRF model under the shooting view angle of the high-resolution reference image. It can be understood that the pixel color of the pixel points obtained based on the foregoing manner is affected by the polarization parameter of the high-resolution reference image, and therefore, the image output by the NeRF model based on the pixel color of each pixel point can exhibit a highlight effect corresponding to the polarization parameter.
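As an illustration of the structure just described, the following PyTorch sketch shows a positional MLP that predicts a volume density and a feature vector, a directional MLP whose color prediction is additionally conditioned on the polarization parameter p, and a volume-rendering step that integrates the samples of one ray into a pixel color. The layer widths, depths, and raw (unencoded) inputs are assumptions made for this sketch and are not taken from this application; a practical NeRF would also apply positional encoding to the inputs.

```python
import torch
import torch.nn as nn


class PolarizedNeRF(nn.Module):
    """Sketch of a NeRF whose color branch is conditioned on a polarization parameter p.
    Widths and depths are illustrative, not taken from this application."""

    def __init__(self, pos_dim=3, dir_dim=2, pol_dim=1, hidden=256, feat=128):
        super().__init__()
        # Positional MLP: (x, y, z) -> volume density sigma + feature vector.
        self.positional = nn.Sequential(
            nn.Linear(pos_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.sigma_head = nn.Linear(hidden, 1)
        self.feat_head = nn.Linear(hidden, feat)
        # Directional MLP: (feature, theta, phi, p) -> RGB color.
        self.directional = nn.Sequential(
            nn.Linear(feat + dir_dim + pol_dim, hidden // 2), nn.ReLU(),
            nn.Linear(hidden // 2, 3), nn.Sigmoid(),
        )

    def forward(self, xyz, view_dir, polarization):
        h = self.positional(xyz)                      # [N_samples, hidden]
        sigma = torch.relu(self.sigma_head(h))        # non-negative volume density
        feat = self.feat_head(h)                      # per-sample feature vector
        rgb = self.directional(torch.cat([feat, view_dir, polarization], dim=-1))
        return sigma, rgb


def volume_render(sigma, rgb, deltas):
    """Integrate the densities and colors of the samples along one ray into a pixel color."""
    alpha = 1.0 - torch.exp(-sigma.squeeze(-1) * deltas)                   # [N_samples]
    trans = torch.cumprod(
        torch.cat([torch.ones(1), 1.0 - alpha + 1e-10], dim=0), dim=0)[:-1]
    weights = alpha * trans                           # contribution of each sample
    return (weights.unsqueeze(-1) * rgb).sum(dim=0)   # [3] pixel color
```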
It can be appreciated that, since the determination process of the NeRF model for the pixel colors of each pixel point is independent, the influence of the spatial relationship between different pixel points on the pixel colors is not considered, and thus, the result output by the NeRF model may lack high-frequency information such as edge details, texture details, and the like, and thus, the high-resolution reference image may be degraded into the low-resolution reference image through the NeRF model. That is, the high resolution reference image may output a degradation result retaining high light information through the NeRF model, for example, a low resolution reference image having a high light effect may be output.
Furthermore, the loss between the pixel color output by the NeRF model and the actual pixel color in the high-resolution reference image can be calculated based on a loss function, parameters in the NeRF model can be adjusted by a gradient descent method or the like, and the loss converges to a satisfactory level through multiple training iterations, thereby improving the accuracy of the pixel colors finally determined by the NeRF model.
It will be appreciated that the higher the accuracy of the pixel colors determined by the NeRF model, the more faithfully the low-resolution reference image it generates can exhibit the highlight effect corresponding to its polarization parameter. Accordingly, when the NeRF model is applied, the photographed image can accurately exhibit the highlight effect corresponding to the input polarization parameter.
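The training procedure described above can be sketched as follows, reusing the PolarizedNeRF module and the volume_render helper from the previous sketch. The batch layout, the use of a mean-squared-error photometric loss, and the per-ray loop are assumptions made to keep the example short; they are not prescribed by this application.

```python
import torch


def train_step(model, optimizer, batch):
    """One illustrative training iteration: render pixels of a high-resolution polarized
    reference image and regress them toward the true pixel colors (photometric loss)."""
    xyz = batch["xyz"]                    # [R, S, 3] sample positions along R rays
    deltas = batch["deltas"]              # [R, S]    distances between adjacent samples
    view_dir = batch["view_dir"]          # [R, S, 2] direction parameters (theta, phi)
    polarization = batch["polarization"]  # [R, S, 1] polarization parameter p
    gt_rgb = batch["gt_rgb"]              # [R, 3]    ground-truth pixel colors

    pred = []
    for r in range(xyz.shape[0]):         # render each ray (vectorized in practice)
        sigma, rgb = model(xyz[r], view_dir[r], polarization[r])
        pred.append(volume_render(sigma, rgb, deltas[r]))
    pred = torch.stack(pred)

    loss = torch.mean((pred - gt_rgb) ** 2)  # photometric (MSE) loss
    optimizer.zero_grad()
    loss.backward()                          # adjust NeRF parameters by gradient descent
    optimizer.step()
    return loss.item()
```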
Based on the foregoing, the image output by the NeRF model is a low resolution image, so, in order to further improve the resolution of the image output by the NeRF model, fig. 7b shows a schematic diagram of a training process of the RefSR model provided in the present application.
Referring to fig. 7b, the training data of the RefSR model may include a high-resolution reference image, a low-resolution reference image, a low-resolution reference image at a new viewing angle, and a polarization parameter. Here, the low-resolution reference image may be the output obtained by inputting the high-resolution reference image into the trained NeRF model. The low-resolution reference image at the new viewing angle may be the output obtained by inputting the high-resolution reference image, together with a direction parameter different from that of the high-resolution reference image, into the trained NeRF model. The polarization parameter may be the polarization parameter of the high-resolution reference image. It will be appreciated that both the aforementioned low-resolution reference image and the low-resolution reference image at the new viewing angle have highlight detail.
Specifically, on the one hand, the RefSR model may perform degradation modeling based on the low-resolution reference image and the low-resolution reference image at the new viewing angle, thereby learning the degradation process of the two images to obtain residual features that help to repair high-frequency information lost or distorted by the degradation process, and reordering the feature vectors based on a depth-to-space (D2S) operation to facilitate subsequent feature fusion.
On the other hand, the RefSR model may perform high-frequency modeling based on the high-resolution reference image and the low-resolution reference image at the new viewing angle. For example, the low-resolution reference image at the new viewing angle may first be up-sampled (upsampling) to preliminarily improve image quality. Further, a space-to-depth (S2D) operation may be performed on the high-resolution reference image and the up-sampled low-resolution reference image at the new viewing angle to achieve feature fusion. Further, high-frequency modeling may be performed based on an encoder-decoder architecture. Specifically, the encoder may extract high-frequency feature vectors from the feature-fused high-resolution reference image and low-resolution reference image at the new viewing angle, the polarization parameter may be concatenated into the high-frequency feature vector, and the decoder may output a high-frequency modeling result based on this feature vector.
Furthermore, the fusion module in the RefSR model can fuse the result of the high-frequency modeling output and the result of the degradation modeling output to generate a super-resolution reference image under a new view angle. It can be appreciated that the super-resolution reference image at the new view angle has high-light details corresponding to the polarization parameters.
Here, after a plurality of training iterations of the RefSR model, the super-resolution reference image at the new viewing angle output by the RefSR model may carry high-frequency information introduced from the high-resolution reference image, such as edge detail and texture detail, and may exhibit a highlight effect corresponding to the polarization parameter of the high-resolution reference image.
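The two-branch structure described for fig. 7b can be sketched as follows in PyTorch. The channel counts, the 2x scale factor, the use of pixel_unshuffle/pixel_shuffle for the space-to-depth and depth-to-space operations, the single-convolution encoder and small decoder, and the additive fusion at the end are all simplifying assumptions for illustration; the actual RefSR model in this application may differ in every one of these respects.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class RefSRSketch(nn.Module):
    """Illustrative RefSR structure: a degradation branch over the two low-resolution
    images and a high-frequency branch over the reference pair, fused additively."""

    def __init__(self, ch=32, scale=2):
        super().__init__()
        self.scale = scale
        # Degradation branch: residual features from the LR images, reordered by D2S.
        self.degrade = nn.Sequential(
            nn.Conv2d(6, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 3 * scale ** 2, 3, padding=1),
        )
        # High-frequency branch: S2D fusion of the HR reference with the upsampled LR
        # image, then a small encoder-decoder conditioned on the polarization parameter.
        self.encoder = nn.Conv2d(2 * 3 * scale ** 2, ch, 3, padding=1)
        self.decoder = nn.Sequential(
            nn.Conv2d(ch + 1, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 3 * scale ** 2, 3, padding=1),
        )

    def forward(self, lr_new_view, lr_ref, hr_ref, polarization):
        # Degradation modelling on the two low-resolution inputs.
        residual = self.degrade(torch.cat([lr_new_view, lr_ref], dim=1))
        residual = F.pixel_shuffle(residual, self.scale)            # depth-to-space (D2S)

        # High-frequency modelling: upsample, space-to-depth, encode, inject p, decode.
        up = F.interpolate(lr_new_view, scale_factor=self.scale,
                           mode="bilinear", align_corners=False)
        fused = torch.cat([F.pixel_unshuffle(hr_ref, self.scale),   # space-to-depth (S2D)
                           F.pixel_unshuffle(up, self.scale)], dim=1)
        feat = torch.relu(self.encoder(fused))
        p_map = polarization.view(-1, 1, 1, 1).expand(-1, 1, *feat.shape[-2:])
        high_freq = F.pixel_shuffle(self.decoder(torch.cat([feat, p_map], dim=1)),
                                    self.scale)

        # Fuse both branches with the upsampled image into the super-resolution output.
        return up + residual + high_freq
```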
Furthermore, based on the training process of the NeRF model and RefSR model, fig. 7c provides a schematic process diagram of generating a super-resolution captured image with highlight detail under the target view angle based on the trained NeRF model and RefSR model.
Referring to fig. 7c, in the process of adjusting the viewing angle and the highlight of the photographed image, the photographed image acquired based on the foregoing step 300 or step 500, and the target direction parameter and/or the target polarization parameter acquired based on the foregoing step 301 or step 501, may be input to the trained NeRF model. It can be appreciated that the trained NeRF model can adjust the viewing angle of the captured image based on the input direction parameter to generate the captured image at the target viewing angle, and can adjust the highlight detail of the captured image based on the input polarization parameter so that the captured image exhibits the highlight effect corresponding to that polarization parameter. Thus, after the captured image, the target direction parameter, and the target polarization parameter are input into the trained NeRF model, a low-resolution captured image at the target viewing angle with highlight detail may be generated.
It can be understood that the process of generating the low resolution captured image based on the captured image, the target direction parameter, and the target polarization parameter is substantially the same as the process of generating the low resolution reference image based on the high resolution reference image, the direction parameter of the high resolution reference image, and the polarization parameter of the high resolution reference image in fig. 7a, and will not be repeated herein.
Further, a high resolution reference image corresponding to the captured image determined based on the foregoing step 302 or step 502 may be input to the trained NeRF model. Specifically, the position parameter, the direction parameter, and the polarization parameter of the high resolution reference image may be input to the NeRF model to obtain a corresponding low resolution reference image. The specific content of the position parameter, the direction parameter and the polarization parameter of the high resolution reference image, and the specific process of generating the low resolution reference image can be referred to in the foregoing detailed description of fig. 7a, and will not be described herein.
Further, the low-resolution photographed image at the target viewing angle with highlight detail, the low-resolution reference image, the high-resolution reference image, and the target polarization parameter may be input into the trained RefSR model, so that the degradation model in the RefSR model outputs residual features according to the low-resolution reference image and the low-resolution photographed image, and the high-frequency model in the RefSR model outputs high-frequency features according to the high-resolution reference image and the low-resolution photographed image in combination with the target polarization parameter. Further, the fusion module in the RefSR model fuses the residual features and the high-frequency features to obtain the super-resolution photographed image at the target viewing angle with highlight detail. Here, the specific contents of the output residual features and high-frequency features can be found in the specific description of fig. 7b, and are not repeated herein.
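Putting the two trained models together, the inference flow of fig. 7c can be sketched as below, reusing the PolarizedNeRF, volume_render, and RefSRSketch definitions from the earlier sketches. The ray construction (sample positions and deltas per pixel), the tensor shapes, and the reshaping into image form are assumptions made only for this illustration.

```python
import torch


@torch.no_grad()
def render_low_res(nerf, rays, direction, polarization):
    """Render a flattened low-resolution image with the trained NeRF for one view
    direction and one polarization parameter. `rays` holds, per pixel, the sample
    positions and inter-sample distances; building them from the camera model is omitted."""
    pixels = []
    for xyz, deltas in rays:                               # one entry per pixel
        d = direction.expand(xyz.shape[0], -1)             # direction is a [1, 2] tensor
        p = polarization.expand(xyz.shape[0], 1)           # polarization is a [1, 1] tensor
        sigma, rgb = nerf(xyz, d, p)
        pixels.append(volume_render(sigma, rgb, deltas))
    return torch.stack(pixels)                             # [H*W, 3]


@torch.no_grad()
def super_resolve(nerf, refsr, captured_rays, ref_rays, hr_ref, hw,
                  target_direction, target_polarization,
                  ref_direction, ref_polarization):
    h, w = hw
    # NeRF: the captured image re-rendered at the target view with the target highlight.
    lr_target = render_low_res(nerf, captured_rays, target_direction, target_polarization)
    # NeRF: the high-resolution reference degraded into its low-resolution counterpart.
    lr_ref = render_low_res(nerf, ref_rays, ref_direction, ref_polarization)
    # Reshape the flattened pixels into [1, 3, H, W] images for the RefSR model.
    lr_target = lr_target.T.reshape(1, 3, h, w)
    lr_ref = lr_ref.T.reshape(1, 3, h, w)
    # RefSR: fuse residual and high-frequency features into the super-resolution output.
    return refsr(lr_target, lr_ref, hr_ref, target_polarization)
```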
Fig. 8 is a schematic structural diagram of a mobile phone 100, taking an electronic device suitable for the image processing method provided in the present application as an example of the mobile phone 100.
As shown in fig. 8, the mobile phone 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (universal serial bus, USB) interface 130, a charge management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, a sensor module 180, keys 190, a motor 191, an indicator 192, a camera 193, a display 194, a subscriber identity module (subscriber identification module, SIM) card interface 195, and the like. The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, and the like.
It should be understood that the structure illustrated in the embodiments of the present invention does not constitute a specific limitation on the mobile phone 100. In other embodiments of the present application, the mobile phone 100 may include more or fewer components than shown, or certain components may be combined, or certain components may be split, or different arrangements of components may be provided. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The processor 110 may include one or more processing units, such as: the processor 110 may include an application processor (application processor, AP), a modem processor, a graphics processor (graphics processing unit, GPU), an image signal processor (image signal processor, ISP), a controller, a video codec, a digital signal processor (digital signal processor, DSP), a baseband processor, and/or a neural network processor (neural-network processing unit, NPU), etc. Wherein the different processing units may be separate devices or may be integrated in one or more processors. The controller can generate operation control signals according to the instruction operation codes and the time sequence signals to finish the control of instruction fetching and instruction execution.
In some embodiments, the processor 110 may include one or more interfaces. The interfaces may include an integrated circuit (inter-integrated circuit, I2C) interface, a mobile industry processor interface (mobile industry processor interface, MIPI), a general-purpose input/output (GPIO) interface, and the like.
The mobile phone 100 implements display functions through a GPU, a display 194, an application processor, and the like. The GPU is a microprocessor for image processing, and is connected to the display 194 and the application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. Processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
The display screen 194 is used to display images, videos, and the like. The display 194 includes a display panel. The display panel may employ a liquid crystal display (liquid crystal display, LCD), an organic light-emitting diode (OLED), an active-matrix organic light emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-OLED, a quantum dot light-emitting diode (quantum dot light emitting diodes, QLED), or the like. In some embodiments, the cell phone 100 may include 1 or N display screens 194, N being a positive integer greater than 1.
The mobile phone 100 may implement photographing functions through an ISP, a camera 193, a video codec, a GPU, a display 194, an application processor, and the like.
The ISP is used to process data fed back by the camera 193. For example, when photographing, the shutter is opened, light is transmitted to the camera photosensitive element through the lens, the optical signal is converted into an electrical signal, and the camera photosensitive element transmits the electrical signal to the ISP for processing, so that the electrical signal is converted into an image visible to the naked eye. ISP can also optimize the noise, brightness and skin color of the image. The ISP can also optimize parameters such as exposure, color temperature and the like of a shooting scene. In some embodiments, the ISP may be provided in the camera 193.
The camera 193 is used to capture still images or video. The object generates an optical image through the lens and projects the optical image onto the photosensitive element. The photosensitive element may be a charge coupled device (charge coupled device, CCD) or a Complementary Metal Oxide Semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal, which is then transferred to the ISP to be converted into a digital image signal. The ISP outputs the digital image signal to the DSP for processing. The DSP converts the digital image signal into an image signal in a standard RGB, YUV, or the like format. In some embodiments, the cell phone 100 may include 1 or N cameras 193, N being a positive integer greater than 1.
The digital signal processor is used for processing digital signals, and can process other digital signals besides digital image signals. For example, when the handset 100 selects a frequency bin, the digital signal processor is used to fourier transform the frequency bin energy, etc.
Video codecs are used to compress or decompress digital video. The handset 100 may support one or more video codecs. In this way, the mobile phone 100 can play or record video in multiple coding formats, for example: moving picture experts group (MPEG) 1, MPEG2, MPEG3, MPEG4, etc.
The NPU is a neural-network (NN) computing processor, and can rapidly process input information by referencing a biological neural network structure, for example, referencing a transmission mode between human brain neurons, and can also continuously perform self-learning. Applications such as intelligent cognition of the mobile phone 100 can be realized through the NPU, for example: image recognition, face recognition, speech recognition, text understanding, etc.
The pressure sensor 180A is used to sense a pressure signal, and may convert the pressure signal into an electrical signal. In some embodiments, the pressure sensor 180A may be disposed on the display screen 194. The pressure sensor 180A is of various types, such as a resistive pressure sensor, an inductive pressure sensor, a capacitive pressure sensor, and the like. The capacitive pressure sensor may comprise at least two parallel plates with conductive material. The capacitance between the electrodes changes when a force is applied to the pressure sensor 180A. The handset 100 determines the strength of the pressure from the change in capacitance. When a touch operation is applied to the display 194, the mobile phone 100 detects the intensity of the touch operation according to the pressure sensor 180A. The mobile phone 100 may also calculate the position of the touch based on the detection signal of the pressure sensor 180A. In some embodiments, touch operations that act on the same touch location, but at different touch operation strengths, may correspond to different operation instructions.
The touch sensor 180K, also referred to as a "touch device". The touch sensor 180K may be disposed on the display screen 194, and the touch sensor 180K and the display screen 194 form a touch screen, which is also called a "touch screen". The touch sensor 180K is for detecting a touch operation acting thereon or thereabout. The touch sensor may communicate the detected touch operation to the application processor to determine the touch event type. Visual output related to touch operations may be provided through the display 194. In other embodiments, the touch sensor 180K may be disposed on the surface of the mobile phone 100 at a different location than the display 194.
The keys 190 include a power-on key, a volume key, etc. The keys 190 may be mechanical keys. Or may be a touch key. The handset 100 may receive key inputs, generating key signal inputs related to user settings and function control of the handset 100.
Fig. 9 is a block diagram of a software structure of a mobile phone 100, taking an electronic device applicable to the image processing method provided in the present application as an example of the mobile phone 100.
The software system of the mobile phone 100 may employ a layered architecture, an event-driven architecture, a micro-core architecture, a micro-service architecture, or a cloud architecture. In the embodiment of the invention, taking an Android system with a layered architecture as an example, the software structure of the mobile phone 100 is illustrated. The layered architecture divides the software into several layers, each with a clear role and division of labor. The layers communicate with each other through software interfaces. In some embodiments, the Android system is divided into four layers, which are, from top to bottom, an application layer, an application framework layer, the Android runtime and system libraries, and a kernel layer.
The application layer may include a series of application packages.
As shown in fig. 9, the application package may include the aforementioned image processing application, photographing application, and the like.
The application framework layer provides an application programming interface (application programming interface, API) and programming framework for application programs of the application layer. The application framework layer includes a number of predefined functions.
As shown in fig. 9, the application framework layer may include a window manager, a content provider, a view system, a phone manager, a resource manager, a notification manager, and the like.
The window manager is used for managing window programs. The window manager can acquire the size of the display screen, judge whether a status bar exists, lock the screen, intercept the screen and the like.
The content provider is used to store and retrieve data and make such data accessible to applications. The data may include video, images, audio, calls made and received, browsing history and bookmarks, phonebooks, etc.
The view system includes visual controls, such as controls to display text, controls to display pictures, and the like. The view system may be used to build applications. The display interface may be composed of one or more views. For example, a display interface including a text message notification icon may include a view displaying text and a view displaying a picture.
The telephony manager is used to provide the communication functions of the handset 100. Such as the management of call status (including on, hung-up, etc.).
The resource manager provides various resources for the application program, such as localization strings, icons, pictures, layout files, video files, and the like.
The notification manager allows the application to display notification information in the status bar, can be used to convey notification-type messages, and can automatically disappear after a short stay without user interaction. For example, the notification manager is used to notify of download completion, message alerts, and the like. The notification manager may also present notifications in the form of a chart or scroll-bar text in the status bar at the top of the system, for example notifications of applications running in the background, or notifications that appear on the screen in the form of a dialog window. For example, text information is prompted in the status bar, a prompt tone is emitted, the electronic device vibrates, or an indicator light blinks.
The Android runtime includes a core library and a virtual machine. The Android runtime is responsible for scheduling and management of the Android system.
The core library consists of two parts: one part is a function which needs to be called by java language, and the other part is a core library of android.
The application layer and the application framework layer run in a virtual machine. The virtual machine executes java files of the application program layer and the application program framework layer as binary files. The virtual machine is used for executing the functions of object life cycle management, stack management, thread management, security and exception management, garbage collection and the like.
The system library may include a plurality of functional modules. For example: surface manager (surface manager), media library (media library), three-dimensional graphics processing library (e.g., openGL ES), 2D graphics engine (e.g., SGL), etc.
The surface manager is used to manage the display subsystem and provides a fusion of 2D and 3D layers for multiple applications.
Media libraries support a variety of commonly used audio, video format playback and recording, still image files, and the like. The media library may support a variety of audio and video encoding formats, such as MPEG4, h.264, MP3, AAC, AMR, JPG, PNG, etc.
The three-dimensional graphic processing library is used for realizing three-dimensional graphic drawing, image rendering, synthesis, layer processing and the like.
The 2D graphics engine is a drawing engine for 2D drawing.
The kernel layer is a layer between hardware and software. The inner core layer at least comprises a display driver, a camera driver, an audio driver and a sensor driver.
The embodiment of the application also provides a computer program product for realizing the image processing method provided by each embodiment.
Embodiments of the mechanisms disclosed herein may be implemented in hardware, software, firmware, or a combination of these implementations. Embodiments of the present application may be implemented as computer program modules or module code executing on a programmable system including at least one processor, a storage system (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device.
Computer program modules or module code may be applied to input instructions to perform the functions described herein and generate output information. The output information may be applied to one or more output devices in a known manner. For purposes of this application, a processing system includes any system having a processor such as, for example, a digital signal processor (digital signal processor, DSP), microcontroller, application specific integrated circuit (application specific integrated circuit, ASIC), or microprocessor.
The module code may be implemented in a high level modular language or an object oriented programming language for communication with a processing system. The module code may also be implemented in assembly or machine language, if desired. Indeed, the mechanisms described in the present application are not limited in scope to any particular programming language. In either case, the language may be a compiled or interpreted language.
In some cases, the disclosed embodiments may be implemented in hardware, firmware, software, or any combination thereof. The disclosed embodiments may also be implemented as instructions carried by or stored on one or more transitory or non-transitory machine-readable (e.g., computer-readable) storage media, which may be read and executed by one or more processors. For example, the instructions may be distributed over a network or through other computer-readable media. Thus, a machine-readable medium may include any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computer), including but not limited to floppy diskettes, optical disks, magneto-optical disks, read-only memories (ROMs), random access memories (RAMs), erasable programmable read-only memories (EPROMs), electrically erasable programmable read-only memories (EEPROMs), magnetic or optical cards, flash memory, or tangible machine-readable memory used to transmit information over the internet in the form of electrical, optical, acoustical, or other propagated signals (e.g., carrier waves, infrared signals, digital signals, etc.). Thus, a machine-readable medium includes any type of machine-readable medium suitable for storing or transmitting electronic instructions or information in a form readable by a machine (e.g., a computer).
Reference in the specification to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one example implementation or technique disclosed in accordance with embodiments of the present application. The appearances of the phrase "in one embodiment" in various places in the specification are not necessarily all referring to the same embodiment.
The disclosure of the embodiments of the present application also relates to an apparatus for performing the operations described herein. The apparatus may be specially constructed for the required purposes, or it may comprise a general-purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer-readable medium, such as, but not limited to, any type of disk, including floppy disks, optical disks, CD-ROMs, and magneto-optical disks, read-only memories (ROMs), random access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards, application-specific integrated circuits (ASICs), or any type of media suitable for storing electronic instructions, each of which may be coupled to a computer system bus. Furthermore, the computers referred to in the specification may include a single processor or may employ architectures with multiple processors for increased computing power.
Additionally, the language used in the specification has been principally selected for readability and instructional purposes, and may not have been selected to delineate or circumscribe the disclosed subject matter. Accordingly, the disclosure of the embodiments of the present application is intended to be illustrative, but not limiting, of the scope of the concepts discussed herein.

Claims (20)

1. An image processing method applied to an electronic device, the method comprising:
acquiring a first image;
and inputting the first image and a first polarization parameter into a first image processing model to obtain a second image, wherein the second image comprises first highlight information corresponding to the first polarization parameter.
2. The method of claim 1, wherein the image parameters of the first image include at least a position parameter of each pixel point of the first image and a direction parameter of the first image, and
the inputting the first image and the first polarization parameter into the first image processing model to obtain the second image comprises:
inputting the image parameters of the first image and the first polarization parameter into the first image processing model to obtain pixel color parameters of each pixel point;
and generating the second image according to the pixel color parameters of the pixel points.
3. The method of claim 2, wherein the direction parameter of the first image comprises:
a first direction parameter, wherein the first direction parameter characterizes a shooting view angle of the first image; or
a second direction parameter, wherein the second direction parameter characterizes a shooting view angle different from the shooting view angle of the first image.
4. The method of claim 3, wherein after the second image is obtained, the method further comprises:
inputting the second image, the first polarization parameter, a first reference image, and a second reference image into a second image processing model to obtain a third image, wherein the first reference image comprises a high-resolution polarized image having the same shooting object and/or shooting scene as the first image, and the image parameters of the first reference image at least comprise position parameters of each pixel point of the first reference image, a direction parameter of the first reference image, and a polarization parameter of the first reference image;
and the second reference image is a low-resolution polarized image obtained by inputting the image parameters of the first reference image into the first image processing model.
5. The method of claim 4, wherein the second image processing model comprises a first model, a second model, and a third model, and
the inputting the second image, the first polarization parameter, the first reference image, and the second reference image into the second image processing model to obtain the third image comprises:
inputting the first reference image, the second image and the first polarization parameter into the first model to obtain a first feature vector;
inputting the second reference image and the second image into the second model to obtain a second feature vector;
inputting the first feature vector and the second feature vector into the third model to obtain the third image;
the third image is a super-resolution image, and the third image includes first highlight information corresponding to the first polarization parameter and first high-frequency information, wherein the first high-frequency information corresponds to the first feature vector and the second feature vector.
6. The method of claim 5, wherein the inputting the first reference image, the second image, and the first polarization parameter into the first model to obtain the first feature vector comprises:
performing a space-to-depth rearrangement operation on an up-sampling result of the second image and on the first reference image to obtain a third feature vector;
inputting the third feature vector into an encoder in the first model to obtain a fourth feature vector;
splicing the first polarization parameter into the fourth feature vector based on the first model to obtain a fifth feature vector;
and inputting the fifth feature vector into a decoder in the first model to obtain the first feature vector.
7. The method of claim 5, wherein the inputting the second reference image and the second image into the second model to obtain the second feature vector comprises:
inputting the second reference image and the second image into the second model to obtain a sixth feature vector;
and carrying out depth-to-space rearrangement operation on the sixth feature vector to obtain the second feature vector.
8. The method of claim 1, wherein the first image processing model comprises a NeRF model.
9. The method of claim 4, wherein the second image processing model comprises a RefSR model.
10. A model training method applied to an electronic device, the method comprising:
acquiring a plurality of fourth images, wherein the fourth images comprise high-resolution polarized images obtained by shooting the same shooting object from different shooting view angles, and the image parameters of the fourth images at least comprise position parameters of each pixel point of the fourth images, direction parameters of the fourth images, and polarization parameters of the fourth images;
inputting the image parameters of the fourth image into a first image processing model to obtain training pixel color parameters of each pixel point of the fourth image;
calculating a loss value through a loss function according to the training pixel color parameters and the actual color parameters of each pixel point of the fourth image;
and adjusting parameters of the first image processing model so that the loss value falls within a preset interval.
11. The method of claim 10, wherein the method further comprises:
and obtaining a fifth image based on the training pixel color parameters of each pixel point of the fourth image, wherein the fifth image is a low-resolution polarized image corresponding to the fourth image.
12. The method of claim 11, wherein the method further comprises:
inputting the fourth image, the fifth image and polarization parameters of the fourth image into a second image processing model to obtain a sixth image;
calculating a loss value through a loss function according to the sixth image and the fourth image;
and adjusting parameters of the second image processing model so that the loss value falls within a preset interval.
13. The method of claim 12, wherein the second image processing model comprises a first model, a second model, and a third model, and
the inputting the fourth image, the fifth image, and the polarization parameters of the fourth image into the second image processing model to obtain the sixth image comprises:
inputting the fourth image, the fifth image, and the polarization parameters of the fourth image into the first model to obtain a first training feature vector;
inputting the fifth image into the second model to obtain a second training feature vector;
and inputting the first training feature vector and the second training feature vector into the third model to obtain a sixth image.
14. The method of claim 13, wherein the inputting the fourth image, the fifth image, and the polarization parameters of the fourth image into the first model to obtain the first training feature vector comprises:
performing a space-to-depth rearrangement operation on an up-sampling result of the fifth image and on the fourth image to obtain a third training feature vector;
inputting the third training feature vector into an encoder in the first model to obtain a fourth training feature vector;
splicing the polarization parameters of the fourth image into the fourth training feature vector to obtain a fifth training feature vector;
and inputting the fifth training feature vector into a decoder in the first model to obtain the first training feature vector.
15. The method of claim 13, wherein the inputting the fifth image into the second model to obtain the second training feature vector comprises:
inputting the fifth image to the second model to obtain a sixth training feature vector;
and carrying out depth-to-space rearrangement operation on the sixth training feature vector to obtain the second training feature vector.
16. The method of claim 10, wherein the first image processing model comprises a NeRF model.
17. The method of claim 12, wherein the second image processing model comprises a RefSR model.
18. An electronic device, comprising: one or more processors; and one or more memories, wherein the one or more memories store one or more programs that, when executed by the one or more processors, cause the electronic device to perform the image processing method of any one of claims 1 to 9, or the model training method of any one of claims 10 to 17.
19. A computer readable medium having instructions stored thereon, which when executed on a computer cause the computer to perform the image processing method of any of claims 1 to 9 or the model training method of any of claims 10 to 17.
20. A computer program product comprising computer programs/instructions which when executed by a processor implement the image processing method of any of claims 1 to 9 or the model training method of any of claims 10 to 17.
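The method of claims 1 to 3 can be illustrated with a minimal sketch, assuming a PyTorch implementation of the first image processing model (for example, a NeRF-style network as in claim 8). The names PolarizedNeRF and render_image, the network depth, and the layer widths are illustrative assumptions and do not come from this application.

import torch
import torch.nn as nn

class PolarizedNeRF(nn.Module):
    # Maps the (position parameter, direction parameter, polarization parameter) of a
    # pixel point to a pixel color parameter, as in claims 1 and 2.
    def __init__(self, pos_dim=3, dir_dim=3, pol_dim=1, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(pos_dim + dir_dim + pol_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid(),  # RGB color in [0, 1]
        )

    def forward(self, positions, directions, polarization):
        # positions: (N, 3) position parameters of the pixel points
        # directions: (N, 3) direction parameter (the capture view angle or a new target view angle)
        # polarization: (N, 1) first polarization parameter, broadcast to every pixel point
        return self.mlp(torch.cat([positions, directions, polarization], dim=-1))

def render_image(model, positions, directions, polarization, height, width):
    # Predicts a pixel color parameter for every pixel point and assembles the second image.
    with torch.no_grad():
        colors = model(positions, directions, polarization)  # (height * width, 3)
    return colors.reshape(height, width, 3)

Under this sketch, supplying the second direction parameter of claim 3 (a shooting view angle different from that of the first image) produces the second image at the new view angle, and changing the polarization parameter changes the highlight information of the result.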
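The first-model branch of claims 6 and 14 can be sketched in the same way; the class name RefBranch, the downscale factor, and the single-convolution encoder and decoder are illustrative assumptions, with torch.nn.functional.pixel_unshuffle standing in for the space-to-depth rearrangement operation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class RefBranch(nn.Module):
    def __init__(self, in_ch=3, scale=2, pol_dim=1, feat=64):
        super().__init__()
        self.scale = scale
        # space-to-depth multiplies the channel count of each input by scale ** 2
        self.encoder = nn.Conv2d(2 * in_ch * scale ** 2, feat, 3, padding=1)
        self.decoder = nn.Conv2d(feat + pol_dim, feat, 3, padding=1)

    def forward(self, reference_hr, low_res_image, polarization):
        # polarization: tensor of shape (batch,) or (batch, 1)
        # 1. Up-sample the low-resolution image to the reference size, then apply a
        #    space-to-depth rearrangement to both inputs -> third feature vector.
        up = F.interpolate(low_res_image, size=reference_hr.shape[-2:],
                           mode='bilinear', align_corners=False)
        third = torch.cat([F.pixel_unshuffle(reference_hr, self.scale),
                           F.pixel_unshuffle(up, self.scale)], dim=1)
        # 2. Encoder -> fourth feature vector.
        fourth = self.encoder(third)
        # 3. Splice the polarization parameter onto the feature map -> fifth feature vector.
        pol_map = polarization.view(-1, 1, 1, 1).expand(-1, 1, *fourth.shape[-2:])
        fifth = torch.cat([fourth, pol_map], dim=1)
        # 4. Decoder -> first feature vector (first training feature vector during training).
        return self.decoder(fifth)

At inference time (claim 6) reference_hr is the first reference image and low_res_image is the second image; during training (claim 14) they are the fourth and fifth images, respectively.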
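The second-model branch and the fusion step of claims 5, 7, 13 and 15 might look like the following; LowResBranch and FusionHead are hypothetical names, torch.nn.functional.pixel_shuffle stands in for the depth-to-space rearrangement operation, and the interpolation inside FusionHead is an added assumption used to align the two feature maps before fusion.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LowResBranch(nn.Module):
    # Second model: encodes the low-resolution pair into a sixth feature vector, then
    # applies a depth-to-space rearrangement to obtain the second feature vector.
    def __init__(self, in_ch=3, scale=2, feat=64):
        super().__init__()
        self.scale = scale
        self.conv = nn.Conv2d(2 * in_ch, feat * scale ** 2, 3, padding=1)

    def forward(self, second_reference, second_image):
        sixth = self.conv(torch.cat([second_reference, second_image], dim=1))
        return F.pixel_shuffle(sixth, self.scale)

class FusionHead(nn.Module):
    # Third model: fuses the first and second feature vectors into the super-resolution
    # third image carrying the highlight and high-frequency information.
    def __init__(self, feat=64):
        super().__init__()
        self.out = nn.Conv2d(2 * feat, 3, 3, padding=1)

    def forward(self, first_feature, second_feature):
        # Resizing the reference-branch feature is an assumption, not a step from the claims.
        first_feature = F.interpolate(first_feature, size=second_feature.shape[-2:],
                                      mode='bilinear', align_corners=False)
        return self.out(torch.cat([first_feature, second_feature], dim=1))

This follows the two-input form of claim 7; the single-input training form of claim 15 would change only the first convolution of LowResBranch.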
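Finally, the training procedure of claims 10 to 12 reduces to a standard supervised loop; the mean-squared-error loss, the optimizer interface, and the stopping threshold below are illustrative assumptions.

import torch
import torch.nn as nn

def train_step(model, optimizer, inputs, target, loss_upper_bound=1e-3):
    # One iteration: predict the training pixel color parameters (or the sixth image),
    # compute the loss against the actual colors of the fourth image, and adjust the
    # model parameters. Returns True once the loss value falls within the preset
    # interval [0, loss_upper_bound].
    criterion = nn.MSELoss()
    optimizer.zero_grad()
    predicted = model(*inputs)
    loss = criterion(predicted, target)
    loss.backward()
    optimizer.step()
    return loss.item() <= loss_upper_bound

For the first image processing model (claim 10) the inputs are the image parameters of the fourth image and the target is its actual pixel colors; for the second image processing model (claim 12) the inputs are the fourth image, the fifth image, and the polarization parameters of the fourth image, and the target is the fourth image itself.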
CN202410148313.5A 2024-02-02 2024-02-02 Image processing method, electronic device, and computer-readable storage medium Pending CN117689545A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410148313.5A CN117689545A (en) 2024-02-02 2024-02-02 Image processing method, electronic device, and computer-readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410148313.5A CN117689545A (en) 2024-02-02 2024-02-02 Image processing method, electronic device, and computer-readable storage medium

Publications (1)

Publication Number Publication Date
CN117689545A 2024-03-12

Family

ID=90128551

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410148313.5A Pending CN117689545A (en) 2024-02-02 2024-02-02 Image processing method, electronic device, and computer-readable storage medium

Country Status (1)

Country Link
CN (1) CN117689545A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111027415A (en) * 2019-11-21 2020-04-17 杭州凌像科技有限公司 Vehicle detection method based on polarization image
CN115424115A (en) * 2022-09-11 2022-12-02 西北农林科技大学 Transparent target detection method based on polarization imaging and deep learning
CN115661320A (en) * 2022-11-28 2023-01-31 荣耀终端有限公司 Image processing method and electronic device


Similar Documents

Publication Publication Date Title
US20230217097A1 (en) Image Content Removal Method and Related Apparatus
CN113538273B (en) Image processing method and image processing apparatus
WO2021078001A1 (en) Image enhancement method and apparatus
CN113706414B (en) Training method of video optimization model and electronic equipment
CN113099146B (en) Video generation method and device and related equipment
CN115689963B (en) Image processing method and electronic equipment
CN114640783B (en) Photographing method and related equipment
CN113538227B (en) Image processing method based on semantic segmentation and related equipment
WO2023093169A1 (en) Photographing method and electronic device
CN114926351B (en) Image processing method, electronic device, and computer storage medium
CN113705665A (en) Training method of image transformation network model and electronic equipment
CN113452969B (en) Image processing method and device
WO2022057384A1 (en) Photographing method and device
CN116916151B (en) Shooting method, electronic device and storage medium
CN115359105B (en) Depth-of-field extended image generation method, device and storage medium
US20230014272A1 (en) Image processing method and apparatus
CN115580690B (en) Image processing method and electronic equipment
CN117689545A (en) Image processing method, electronic device, and computer-readable storage medium
CN115880350A (en) Image processing method, apparatus, system, and computer-readable storage medium
CN115587938A (en) Video distortion correction method and related equipment
CN116453131B (en) Document image correction method, electronic device and storage medium
CN116193243B (en) Shooting method and electronic equipment
CN114245011B (en) Image processing method, user interface and electronic equipment
CN116708996B (en) Photographing method, image optimization model training method and electronic equipment
CN117499797B (en) Image processing method and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination