CN114170084A - Image super-resolution processing method, device and equipment - Google Patents

Image super-resolution processing method, device and equipment Download PDF

Info

Publication number
CN114170084A
CN114170084A (application number CN202111485136.2A)
Authority
CN
China
Prior art keywords
network model
light field
sisr
field image
super
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111485136.2A
Other languages
Chinese (zh)
Inventor
Lu Fang (方璐)
Mengqi Ji (季梦奇)
Dingjian Jin (金鼎健)
Qionghai Dai (戴琼海)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN202111485136.2A priority Critical patent/CN114170084A/en
Publication of CN114170084A publication Critical patent/CN114170084A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformation in the plane of the image
    • G06T3/40Scaling the whole image or part thereof
    • G06T3/4053Super resolution, i.e. output image resolution higher than sensor resolution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Abstract

The invention discloses an image super-resolution processing method, device and equipment. The method comprises the following steps: acquiring an image to be processed; and performing super-resolution processing on the image to be processed through a target neural network model, wherein the target neural network model is obtained by iteratively training a SISR network model on a target image sample set, and the target image samples include light field image samples containing four-dimensional information. The technical scheme of the invention solves the problem that, because the training data is a single-picture data set, only the high/low-resolution mapping of one view angle in one scene can be learned, which is a significant limitation. It also solves the problem that methods which super-resolve using pictures from different view angles of the same scene are limited to multi-picture input and are therefore unsuitable for single-picture super-resolution, and it enhances the super-resolution capability of the network.

Description

Image super-resolution processing method, device and equipment
Technical Field
The embodiment of the invention relates to the technical field of computers, in particular to a method, a device and equipment for processing image super-resolution.
Background
With the development of image-based computer vision, improving the spatial resolution of images has become a fundamental and important research topic. Mainstream image super-resolution methods currently divide into traditional non-learning methods and learning methods. Generally speaking, a learning method learns a complex mapping function from a low dimension to a high dimension, and its super-resolution results are superior to those of traditional methods both in visual quality and in quantitative metrics. Among currently mainstream learning methods for a single picture, single image super-resolution (SISR) aims to predict and recover high-resolution information from a low-resolution picture. SISR has been studied for decades as a fundamental field of computer vision research, and its results have been widely applied in many computer vision applications, such as medical imaging, satellite imaging, and security.
Recently, research on deep learning has brought new ideas to SISR, and various deep-learning-based SISR methods have been proposed to improve the super-resolution effect. However, because of the limitation of the training data, such a learning method learns the high/low-resolution mapping of only one view angle in one scene, which is a significant limitation and caps further performance improvement. In order to make full use of the training data, some work fuses the angle information of the input pictures (angle information generally arises only when a plurality of pictures are input), thereby obtaining a better spatial super-resolution effect. However, these efforts rely on multi-picture input and do not address single-picture input, i.e., SISR.
Disclosure of Invention
The embodiment of the invention provides an image super-resolution processing method, device and equipment, which enable a network to learn better super-resolution capability through the mapping relation of four-dimensional information; meanwhile, the network architecture is also suitable for super-resolution of two-dimensional pictures, i.e., the super-resolution capability of a SISR network is enhanced by training on multi-view pictures. The method is applicable to all current learning-based SISR networks, and therefore has high portability.
In a first aspect, an embodiment of the present invention provides an image super-resolution processing method, including:
acquiring an image to be processed;
performing super-resolution processing on the image to be processed through a target neural network model, wherein the target neural network model is obtained by iteratively training a SISR network model on a target image sample set, and the target image samples include a light field image sample containing four-dimensional information.
Further, iteratively training the SISR network model through the target image sample set includes:
establishing a SISR network model;
determining K sub-view samples according to the light field image samples, wherein K is a positive integer greater than or equal to 1;
inputting the K sub-view samples into the SISR network model respectively to obtain K first predicted images;
determining a predicted light field image according to the K first predicted images;
training parameters of the SISR network model according to an objective function formed by the predicted light field image and the light field image sample, wherein the objective function comprises: a content and structure loss function;
and returning to execute the operation of inputting the K sub-view samples into the SISR network model to obtain a predicted image until a target neural network model is obtained.
Further, the content and structure loss function is:

$$\mathcal{L}_{cs} = \left\| \hat{L} - L \right\|_2^2 + a\left(1 - \mathrm{SSIM}(\hat{L}, L)\right)$$

wherein $\mathcal{L}_{cs}$ is the content and structure loss, $a$ is the weight of the structural similarity loss function SSIM, $L$ is the light field image sample, $\hat{L}$ is the predicted light field image assembled from the super-resolved sub-views $S\,l_k$, $l_k$ is the kth sub-view corresponding to the light field image sample, and $S$ is the upsampling matrix.
Further, the objective function further includes: at least one of a variance map loss function and a disparity map loss function.
Further, training the parameters of the SISR network model according to the objective function formed by the predicted light field image and the light field image samples comprises:
acquiring a first super-pixel corresponding to the light field image sample and the sub-pixels in the first super-pixel;
acquiring a second super-pixel corresponding to the predicted light field image and the sub-pixels in the second super-pixel;
training parameters of the SISR network model according to a variance map loss function formed by the first super-pixel, the sub-pixels in the first super-pixel, the second super-pixel and the sub-pixels in the second super-pixel;
wherein the variance map loss function is:

$$\mathcal{L}_{vm} = \left\| \widehat{VM} - VM \right\|_2^2$$

wherein $\widehat{VM}$ is the variance map of the predicted light field image and $VM$ is the variance map of the light field image sample.
Further, training the parameters of the SISR network model according to the objective function formed by the predicted light field image and the light field image samples comprises:
acquiring a disparity map corresponding to the predicted light field image;
and training parameters of the SISR network model according to a disparity map loss function formed by the disparity map corresponding to the predicted light field image and the disparity map corresponding to the light field image sample.
Further, the target image sample further includes: a two-dimensional image sample.
In a second aspect, an embodiment of the present invention further provides an image super-resolution processing apparatus, including:
the acquisition module is used for acquiring an image to be processed;
and the processing module is used for performing super-resolution processing on the image to be processed through a target neural network model, the target neural network model is obtained by iteratively training a SISR network model through a target image sample set, and the target image sample comprises a light field image sample containing four-dimensional information.
In a third aspect, an embodiment of the present invention further provides an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the image super-resolution processing method according to any one of the embodiments of the present invention when executing the program.
According to the method and the device, an image to be processed is acquired and super-resolution processing is performed on it through a target neural network model, the target neural network model being obtained by iteratively training a SISR network model on a target image sample set, wherein the target image samples include a light field image sample containing four-dimensional information. This solves the problem that, because the training data is a single-picture data set, only the high/low-resolution mapping of one view angle in one scene can be learned, which is a significant limitation. It also solves the problem that super-resolution using pictures from different view angles of the same scene is limited to multi-picture input and is therefore unsuitable for single-picture super-resolution. By learning the mapping relation of the four-dimensional information, the super-resolution capability of the network is enhanced; meanwhile, the network architecture is also suitable for super-resolution of two-dimensional pictures, i.e., the super-resolution capability of the SISR network is enhanced by training on multi-view pictures. The method is applicable to all current learning-based SISR networks and has high portability.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
FIG. 1 is a flowchart of a super-resolution processing method in an embodiment of the present invention;
FIG. 1a is a schematic view of a microarray light field camera configuration in an embodiment of the present invention;
FIG. 1b is a schematic diagram of an enlarged view of a four-dimensional light field in an embodiment of the present invention;
FIG. 1c is a diagram of a framework in an embodiment of the invention;
FIG. 1d is a diagram showing the result of image super-resolution processing in the embodiment of the present invention;
FIG. 2 is a schematic structural diagram of an image super-resolution processing apparatus in an embodiment of the present invention;
fig. 3 is a schematic structural diagram of an electronic device in an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures. In addition, the embodiments and features of the embodiments in the present invention may be combined with each other without conflict.
Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the operations (or steps) as a sequential process, many of the operations can be performed in parallel, concurrently or simultaneously. In addition, the order of the operations may be re-arranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, and the like.
The term "include" and variations thereof as used herein are intended to be open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment".
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present invention, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
Fig. 1 is a flowchart of an image super-resolution processing method provided by an embodiment of the present invention, where the present embodiment is applicable to the case of image super-resolution processing, and the method can be executed by an image super-resolution processing apparatus in an embodiment of the present invention, and the apparatus can be implemented by software and/or hardware, as shown in fig. 1, the method specifically includes the following steps:
and S110, acquiring an image to be processed.
The image to be processed may be a light field image or a two-dimensional image, which is not limited in this embodiment of the present invention.
And S120, performing super-resolution processing on the image to be processed through a target neural network model, wherein the target neural network model is obtained by iteratively training a SISR network model on a target image sample set, and the target image samples include a light field image sample containing four-dimensional information.
Wherein the target image sample set comprises a light field image sample containing four-dimensional information. A light field image is a special kind of multi-view picture that carries combined four-dimensional angular and spatial information. The light field image sample is acquired by a light field camera, which differs from a common monocular camera; its structure is shown in fig. 1a. The image recorded on the sensor can be converted by an algorithm into a set of sub-views with slightly different viewpoints. Each sub-view carries spatial information, while the subtle changes between viewpoints carry angular information. These images from different viewpoints are referred to as the sub-views of the light field image, and they may be viewed as views with unknown disparity relative to each other. Information that is sparsely sampled and missing in one sub-view may be captured by one or more other sub-views; this information is called supplementary information. In the embodiment of the invention, the supplementary information is fully utilized in the training process, so that a better super-resolution effect can be obtained. A sketch of how sub-views can be extracted follows.
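For illustration only (not part of the patent text), the following minimal sketch shows how the sub-views of a 4D light field can be extracted from a microlens-array (lenslet) image; the single-channel (H·U, W·V) layout and the 9×9 angular grid are assumptions chosen to match the N = 81 sub-pixels per super-pixel mentioned later in this description.

```python
import torch

def extract_sub_views(lenslet: torch.Tensor, U: int = 9, V: int = 9) -> torch.Tensor:
    """Split a (H*U, W*V) lenslet image into a (U, V, H, W) light field.

    Assumes each U x V block (super-pixel) holds one sample per viewpoint,
    so sub-view (u, v) is obtained by strided slicing. Real light field
    files additionally need demosaicing and calibration, omitted here.
    """
    HU, WV = lenslet.shape
    H, W = HU // U, WV // V
    views = torch.stack([lenslet[u::U, v::V] for u in range(U) for v in range(V)])
    return views.reshape(U, V, H, W)  # 4D light field L(u, v, x, y)

# Example: a 9x9 angular grid over a 64x64 spatial grid -> 81 sub-views.
lf = extract_sub_views(torch.rand(64 * 9, 64 * 9))
print(lf.shape)  # torch.Size([9, 9, 64, 64])
```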
Wherein the target image sample comprises: a light-field image sample containing four-dimensional information, the target image sample further comprising: a two-dimensional image sample. The embodiments of the present invention are not limited in this regard.
Specifically, the target neural network model may be trained as follows: establish a SISR network model; determine K sub-view samples according to the light field image samples, wherein K is a positive integer greater than or equal to 1; input the K sub-view samples into the SISR network model respectively to obtain K first predicted images; determine a predicted light field image according to the K first predicted images; train the parameters of the SISR network model according to an objective function formed by the predicted light field image and the light field image sample; and return to the operation of inputting the K sub-view samples into the SISR network model to obtain predicted images, until the target neural network model is obtained. The objective function may comprise only the content and structure loss function $\mathcal{L}_{cs}$; or the content and structure loss function $\mathcal{L}_{cs}$ and the variance map loss function $\mathcal{L}_{vm}$; or the content and structure loss function $\mathcal{L}_{cs}$ and the disparity map loss function $\mathcal{L}_{d}$; or all three, in which case the final loss function has the form

$$\mathcal{L} = \lambda_1 \mathcal{L}_{cs} + \lambda_2 \mathcal{L}_{vm} + \lambda_3 \mathcal{L}_{d}$$

wherein $\lambda_1$, $\lambda_2$ and $\lambda_3$ are respectively the weights of the three loss functions.
Optionally, the iteratively training the SISR network model through the target image sample set includes:
establishing a SISR network model;
determining K sub-view samples according to the light field image samples, wherein K is a positive integer greater than or equal to 1;
inputting the K sub-view samples into the SISR network model respectively to obtain K first predicted images;
determining a predicted light field image according to the K first predicted images;
training parameters of the SISR network model according to an objective function formed by the predicted light field image and the light field image sample, wherein the objective function comprises: a content and structure loss function;
and returning to execute the operation of inputting the K sub-view samples into the SISR network model to obtain a predicted image until a target neural network model is obtained.
The SISR network model may be ESPCN, VDSR, RCAN, or a SISR super-resolution network based on any other learning method.
Wherein, the light field image sample is light field data containing four-dimensional information (spatial dimension and angular dimension).
Specifically, the light field image sample is split into a plurality of sub-view samples under multiple viewing angles, and each sub-view sample is passed through the same SISR network to obtain a plurality of super-resolved first predicted images under the multiple viewing angles. These first predicted images are then combined into a four-dimensional light field image, i.e. the predicted light field image. A sketch of one such training iteration follows.
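The following is a minimal, illustrative sketch of one training iteration under this scheme (not the patent's implementation): `sisr_net` stands for any learning-based SISR backbone, bicubic downsampling is an assumed way to obtain the low-resolution sub-views, and `content_structure_loss` is the loss sketched after the next formula.

```python
import torch
import torch.nn.functional as F

def train_step(sisr_net, optimizer, lf_sample: torch.Tensor, scale: int = 2):
    """One iteration: split -> per-view SISR -> reassemble -> loss.

    lf_sample: ground-truth light field of shape (K, C, H, W), the K
    sub-views being the flattened U x V angular grid.
    """
    # Assumed degradation: bicubic downsampling of every sub-view.
    low = F.interpolate(lf_sample, scale_factor=1 / scale,
                        mode="bicubic", align_corners=False)

    # The same SISR network super-resolves each sub-view independently.
    preds = torch.stack([sisr_net(v.unsqueeze(0)).squeeze(0) for v in low])

    # The stacked predictions form the predicted 4D light field.
    loss = content_structure_loss(preds, lf_sample)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```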
Optionally, the content and structure loss function is:

$$\mathcal{L}_{cs} = \left\| \hat{L} - L \right\|_2^2 + a\left(1 - \mathrm{SSIM}(\hat{L}, L)\right)$$

wherein $\mathcal{L}_{cs}$ is the content and structure loss, $a$ is the weight of the structural similarity loss function SSIM, $L$ is the light field image sample, $\hat{L}$ is the predicted light field image assembled from the super-resolved sub-views $S\,l_k$, $l_k$ is the kth sub-view corresponding to the light field image sample, and $S$ is the upsampling matrix.
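A minimal sketch of this loss (illustrative only): the global, single-window SSIM below is a simplification — a production implementation would use a windowed SSIM such as the pytorch-msssim package — and the default weight a = 0.1 is an assumption, as the patent leaves the ratio open.

```python
import torch

def global_ssim(x: torch.Tensor, y: torch.Tensor,
                c1: float = 0.01 ** 2, c2: float = 0.03 ** 2) -> torch.Tensor:
    """Single-window SSIM over the whole tensor (simplified)."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def content_structure_loss(pred_lf: torch.Tensor, gt_lf: torch.Tensor,
                           a: float = 0.1) -> torch.Tensor:
    """L_cs = ||L_hat - L||_2^2 + a * (1 - SSIM(L_hat, L))."""
    l2 = torch.mean((pred_lf - gt_lf) ** 2)
    return l2 + a * (1.0 - global_ssim(pred_lf, gt_lf))
```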
Optionally, the objective function further includes: at least one of a variance map loss function and a disparity map loss function.
Specifically, the objective function may include the content and structure loss function $\mathcal{L}_{cs}$ and the variance map loss function $\mathcal{L}_{vm}$, in which case the final loss function may have the form

$$\mathcal{L} = \lambda_1 \mathcal{L}_{cs} + \lambda_2 \mathcal{L}_{vm}$$

wherein $\lambda_1$ is the weight of $\mathcal{L}_{cs}$ and $\lambda_2$ is the weight of $\mathcal{L}_{vm}$. The objective function may instead include the content and structure loss function $\mathcal{L}_{cs}$ and the disparity map loss function $\mathcal{L}_{d}$, in which case the final loss function may have the form

$$\mathcal{L} = \lambda_1 \mathcal{L}_{cs} + \lambda_2 \mathcal{L}_{d}$$

wherein $\lambda_1$ is the weight of $\mathcal{L}_{cs}$ and $\lambda_2$ is the weight of $\mathcal{L}_{d}$. The objective function may further include the content and structure loss function $\mathcal{L}_{cs}$, the variance map loss function $\mathcal{L}_{vm}$ and the disparity map loss function $\mathcal{L}_{d}$, in which case the final loss function may have the form

$$\mathcal{L} = \lambda_1 \mathcal{L}_{cs} + \lambda_2 \mathcal{L}_{vm} + \lambda_3 \mathcal{L}_{d}$$

wherein $\lambda_1$, $\lambda_2$ and $\lambda_3$ are respectively the weights of the three loss functions. A sketch of the three-term objective follows.
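Illustrative only — a direct transcription of the three-term objective using the helper losses sketched in this description; the weights are placeholders, since the patent does not fix their values.

```python
import torch

def total_loss(pred_lf: torch.Tensor, gt_lf: torch.Tensor,
               lams=(1.0, 0.5, 0.5)) -> torch.Tensor:
    """lam1 * L_cs + lam2 * L_vm + lam3 * L_d (weights are assumptions)."""
    return (lams[0] * content_structure_loss(pred_lf, gt_lf)
            + lams[1] * variance_map_loss(pred_lf, gt_lf)    # sketched below
            + lams[2] * disparity_map_loss(pred_lf, gt_lf))  # sketched below
```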
Optionally, the training of parameters of the SISR network model according to the objective function formed by the predicted light field image and the light field image sample includes:
acquiring a first super-pixel corresponding to the light field image sample and the sub-pixels in the first super-pixel;
acquiring a second super-pixel corresponding to the predicted light field image and the sub-pixels in the second super-pixel;
training parameters of the SISR network model according to a variance map loss function formed by the first super-pixel, the sub-pixels in the first super-pixel, the second super-pixel and the sub-pixels in the second super-pixel;
wherein the variance map loss function is:

$$\mathcal{L}_{vm} = \left\| \widehat{VM} - VM \right\|_2^2$$

wherein $\widehat{VM}$ is the variance map of the predicted light field image and $VM$ is the variance map of the light field image sample.
Optionally, the training of parameters of the SISR network model according to the objective function formed by the predicted light field image and the light field image sample includes:
acquiring a disparity map corresponding to the predicted light field image;
and training parameters of the SISR network model according to a disparity map loss function formed by the disparity map corresponding to the predicted light field image and the disparity map corresponding to the light field image sample.
Wherein the disparity map loss function may be:

$$\mathcal{L}_{d} = \left\| \hat{D} - D \right\|_2^2$$

wherein $\hat{D}$ is the disparity map corresponding to the predicted light field image and $D$ is the disparity map corresponding to the light field image sample.
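Illustrative sketch of this loss; `estimate_disparity` is a placeholder for whatever differentiable light-field disparity estimator is used (the description only says a deep neural network predicts the disparity map, without naming one).

```python
import torch

def disparity_map_loss(pred_lf: torch.Tensor, gt_lf: torch.Tensor) -> torch.Tensor:
    """L_d = ||D_hat - D||_2^2 between estimated disparity maps.

    estimate_disparity is a hypothetical pretrained network; it is kept
    frozen so that gradients only shape the SISR network's output.
    """
    d_pred = estimate_disparity(pred_lf)
    with torch.no_grad():          # reference disparity needs no gradient
        d_gt = estimate_disparity(gt_lf)
    return torch.mean((d_pred - d_gt) ** 2)
```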
Optionally, the target image sample further includes: a two-dimensional image sample.
In a specific example, for the conventional SISR method, the super-resolution model is:

$$h = S\,l + n$$

wherein $l$ is a low-resolution image, $h$ is a super-resolution image, $S$ is an upsampling matrix, and $n$ is noise. In most SISR methods, in order to find the best fitting function, the loss function is defined as

$$\mathcal{L} = \left\| S\,l - h \right\|_2^2$$

In the embodiment of the invention, in order to find a better $S$, new restriction terms are designed on this basis. After each sub-image passes through the SISR network, the super-resolved sub-images are combined into a light field image, and the embodiment of the invention provides three loss functions to achieve the expected effect.
1. Content and structure loss function $\mathcal{L}_{cs}$
2. Variance map loss function $\mathcal{L}_{vm}$
3. Disparity map loss function $\mathcal{L}_{d}$
The final loss function is of the form:

$$\mathcal{L} = \lambda_1 \mathcal{L}_{cs} + \lambda_2 \mathcal{L}_{vm} + \lambda_3 \mathcal{L}_{d}$$

wherein $\lambda_1$, $\lambda_2$ and $\lambda_3$ are respectively the weights of the three loss functions.
The content and structure loss function makes the distribution of the super-resolution result similar to that of the real result while maintaining the structure of the four-dimensional light field. The variance map loss function allows object edge positions to be well preserved. The disparity map loss function further improves the super-resolution effect in the disparity domain.
Content and structure loss function: one of the most straightforward methods is to compute the l2 loss between the super-resolved and the real reference light field images. Unlike the l2 loss function in the general SISR method, the l2 loss function of the four-dimensional light field uses not only the spatial two-dimensional information but also the angular two-dimensional information; because the loss function is sensitive to the whole of the four-dimensional information, a better spatial super-resolution effect can be obtained. In addition, a structural similarity loss function SSIM is also used, which further improves the structure of the light field. Therefore, the content and structure loss function provided by the embodiment of the present invention is expressed as follows:

$$\mathcal{L}_{cs} = \left\| \hat{L} - L \right\|_2^2 + a\left(1 - \mathrm{SSIM}(\hat{L}, L)\right)$$

wherein $\mathcal{L}_{cs}$ is the content and structure loss, $a$ is the weight of the structural similarity loss function SSIM, $L$ is the light field image sample, $\hat{L}$ is the predicted light field image assembled from the super-resolved sub-views $S\,l_k$, $l_k$ is the kth sub-view corresponding to the light field image sample, and $S$ is the upsampling matrix.
Variogram loss function: microlens whole-column images are a special image format designed specifically for light-field images, which can represent a four-dimensional light field on a two-dimensional image. One important characteristic is the super-pixel and sub-pixels, as shown in fig. 1 b. In the embodiment of the invention, a super pixel comprises a plurality of sub-pixels, and the variance of the sub-pixels contained in the super pixel is obtained for each super pixel to form a variance map, wherein the length and the width of the variance map are respectively equal to the length and the width of a sub-view. Specifically, the variance of each superpixel can be expressed as:
Figure 357300DEST_PATH_IMAGE037
Figure 743282DEST_PATH_IMAGE038
wherein, among others,
Figure 523019DEST_PATH_IMAGE007
in order to predict the light-field image variogram,
Figure 511704DEST_PATH_IMAGE008
for the light-field image sample variance map, N is the number of sub-pixels in a super-pixel, and in the embodiment of the present invention, N = 81. The VM is a variance map, and the VM is a variance map,
Figure 794918DEST_PATH_IMAGE039
is the variance of the sub-pixels in a super-pixel,
Figure 617380DEST_PATH_IMAGE040
is the coordinates of the sub-pixel(s),
Figure 494200DEST_PATH_IMAGE041
is the coordinates of the super-pixel,
Figure 161942DEST_PATH_IMAGE042
is a light field image.
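For illustration (assuming the (U, V, H, W) light field layout from the earlier sketch, so that each spatial location (x, y) owns a U x V super-pixel of sub-pixels):

```python
import torch

def variance_map(lf: torch.Tensor) -> torch.Tensor:
    """lf: light field of shape (U, V, H, W); returns VM of shape (H, W).

    The variance at each spatial position is taken over the N = U*V
    sub-pixels (N = 81 for the 9x9 angular grid of the embodiment).
    """
    flat = lf.reshape(-1, lf.shape[-2], lf.shape[-1])  # (U*V, H, W)
    return flat.var(dim=0, unbiased=False)

def variance_map_loss(pred_lf: torch.Tensor, gt_lf: torch.Tensor) -> torch.Tensor:
    """L_vm = ||VM_hat - VM||_2^2."""
    return torch.mean((variance_map(pred_lf) - variance_map(gt_lf)) ** 2)
```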
Disparity map loss function:
An accurate disparity map can be obtained by an algorithm from a light field image with good structure; conversely, the more accurate the disparity map obtained from a light field image, the better the structure of the light field image can be recovered. Therefore, in the embodiment of the invention, the disparity map of the recovered super-resolution light field is predicted by a deep neural network and compared with the disparity map generated from the real light field, so that the difference between the two disparity maps is continuously reduced:

$$\mathcal{L}_{d} = \left\| \hat{D} - D \right\|_2^2$$
Finally, in view of the limited amount of light field data and the large amount of two-dimensional picture data, mixed data is used for training, as sketched below. Specifically, a light field image sample is input to the training network with a certain probability P, and a single two-dimensional image sample is input with probability (1 - P); this also prevents overfitting. In the present embodiment, P = 0.2.
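A minimal sketch of the mixed-data sampling (illustrative; `lf_dataset` and `img_dataset` are hypothetical dataset handles with an assumed `draw()` method):

```python
import random

def sample_training_input(lf_dataset, img_dataset, p: float = 0.2):
    """With probability P draw a 4D light field sample, otherwise draw a
    single two-dimensional image; the embodiment uses P = 0.2."""
    if random.random() < p:
        return lf_dataset.draw(), "light_field"
    return img_dataset.draw(), "2d_image"
```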
In the prior art, for single-picture spatial super-resolution algorithms: various learning-based super-resolution algorithms have been proposed in recent years. These algorithms can mainly be divided into two types. One type pursues computational efficiency: the network structure is simple, but the network cannot fit the super-resolution mapping function well. The other type, on the contrary, pursues super-resolution effect: the network structure is relatively complex, and real-time super-resolution is hard to guarantee. Whatever the network, these learning-based SISR methods use a single-picture training set (such as ImageNet, BSD500, etc.) to search for a mapping relation between two-dimensional pictures; but in reality objects are three-dimensional, and the plenoptic function describing natural light has up to seven dimensions, so the SISR methods described above fundamentally cannot obtain a better mapping-fitting network from higher-dimensional information. For multi-picture spatial super-resolution algorithms: multi-view picture spatial super-resolution is a huge field, and the learning-based methods can roughly be divided into video super-resolution and multi-view super-resolution. Video super-resolution explores the relation between the temporal and spatial domains by means of the supplementary information between adjacent video frames, while the training data of multi-view super-resolution consists of objects shot simultaneously from different angles, exploring the relation between the angular and spatial domains. Within multi-view super-resolution, light field super-resolution is a research hotspot: because of the regular spatial and angular arrangement of the light field, super-resolution using the light field is easier to achieve. Although a better super-resolution effect for a scene can be achieved using a plurality of pictures, algorithms that perform spatial super-resolution from multiple pictures also require multiple pictures as input at test time, and therefore lack single-picture super-resolution capability. The embodiment of the invention aims to learn better super-resolution capability by making the network learn the mapping relation of four-dimensional information. Meanwhile, the network architecture is suitable for two-dimensional picture super-resolution, i.e., the super-resolution capability of the SISR network is enhanced by training on multi-view pictures. The method is also applicable to all current learning-based SISR networks, i.e., it has high portability.
In one specific example, as shown in fig. 1c, the training process is as follows: the light field image sample is split into a plurality of sub-view samples under multiple viewing angles, and each sub-view sample is passed through the same SISR network to obtain a plurality of super-resolved first predicted images under the multiple viewing angles. These first predicted images are then combined into a four-dimensional light field image, i.e. the predicted light field image. The parameters of the SISR network model are trained according to an objective function formed by the predicted light field image and the light field image sample until the target neural network model is obtained. In the testing stage, super-resolution processing is performed on the image to be processed through the target neural network model. In fig. 1d, the images following "LF" are the super-resolution processing results.
The embodiment of the invention provides a single-picture spatial super-resolution algorithm that uses multiple pictures as training data. Firstly, it combines SISR with multi-picture spatial super-resolution (MISR), so that even a SISR network can be trained with the MISR method to improve its effect. Because the multi-picture information acts as a prior, the SISR network framework can be flexibly replaced, and the performance of all existing SISR networks can be improved. Secondly, light field images are used as the MISR data: a light field image provides abundant spatial and angular information, and this information is arranged together regularly, without additional calibration or correction.
According to the technical scheme of this embodiment, an image to be processed is acquired and super-resolution processing is performed on it through a target neural network model, the target neural network model being obtained by iteratively training a SISR network model on a target image sample set, wherein the target image samples include a light field image sample containing four-dimensional information. This solves the problem that, because the training data is a single-picture data set, only the high/low-resolution mapping of one view angle in one scene can be learned, which is a significant limitation. It also solves the problem that super-resolution using pictures from different view angles of the same scene is limited to multi-picture input and is therefore unsuitable for single-picture super-resolution. By learning the mapping relation of the four-dimensional information, the network learns better super-resolution capability; meanwhile, the network architecture is also suitable for super-resolution of two-dimensional pictures, i.e., the super-resolution capability of the SISR network is enhanced by training on multi-view pictures. The method is applicable to all current learning-based SISR networks and has high portability.
Fig. 2 is a schematic structural diagram of an image super-resolution processing apparatus according to an embodiment of the present invention. The present embodiment may be applied to the case of image super-resolution processing, and the apparatus may be implemented in software and/or hardware, and may be integrated in any device providing an image super-resolution processing function, as shown in fig. 2, where the apparatus specifically includes: an acquisition module 210 and a processing module 220.
The acquiring module 210 is configured to acquire an image to be processed;
the processing module 220 is configured to perform super-resolution processing on the image to be processed through a target neural network model, where the target neural network model is obtained by iteratively training a SISR network model through a target image sample set, and the target image sample includes a light field image sample containing four-dimensional information.
Optionally, the processing module is specifically configured to:
establishing a SISR network model;
determining K sub-view samples according to the light field image samples, wherein K is a positive integer greater than or equal to 1;
inputting the K sub-view samples into the SISR network model respectively to obtain K first predicted images;
determining a predicted light field image according to the K first predicted images;
training parameters of the SISR network model according to an objective function formed by the predicted light field image and the light field image sample, wherein the objective function comprises: a content and structure loss function;
and returning to execute the operation of inputting the K sub-view samples into the SISR network model to obtain a predicted image until a target neural network model is obtained.
Optionally, the content and structure loss function is:

$$\mathcal{L}_{cs} = \left\| \hat{L} - L \right\|_2^2 + a\left(1 - \mathrm{SSIM}(\hat{L}, L)\right)$$

wherein $\mathcal{L}_{cs}$ is the content and structure loss, $a$ is the weight of the structural similarity loss function SSIM, $L$ is the light field image sample, $\hat{L}$ is the predicted light field image assembled from the super-resolved sub-views $S\,l_k$, $l_k$ is the kth sub-view corresponding to the light field image sample, and $S$ is the upsampling matrix.
Optionally, the objective function further includes: at least one of a variance map loss function and a disparity map loss function.
Optionally, the processing module is specifically configured to:
acquiring a first super-pixel corresponding to the light field image sample and the sub-pixels in the first super-pixel;
acquiring a second super-pixel corresponding to the predicted light field image and the sub-pixels in the second super-pixel;
training parameters of the SISR network model according to a variance map loss function formed by the first super-pixel, the sub-pixels in the first super-pixel, the second super-pixel and the sub-pixels in the second super-pixel;
wherein the variance map loss function is:

$$\mathcal{L}_{vm} = \left\| \widehat{VM} - VM \right\|_2^2$$

wherein $\widehat{VM}$ is the variance map of the predicted light field image and $VM$ is the variance map of the light field image sample.
Optionally, the processing module is specifically configured to:
acquiring a disparity map corresponding to the predicted light field image;
and training parameters of the SISR network model according to a disparity map loss function formed by the disparity map corresponding to the predicted light field image and the disparity map corresponding to the light field image sample.
Optionally, the target image sample further includes: a two-dimensional image sample.
The product can execute the method provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
According to the technical scheme of this embodiment, an image to be processed is acquired and super-resolution processing is performed on it through a target neural network model, the target neural network model being obtained by iteratively training a SISR network model on a target image sample set, wherein the target image samples include a light field image sample containing four-dimensional information. This solves the problem that, because the training data is a single-picture data set, only the high/low-resolution mapping of one view angle in one scene can be learned, which is a significant limitation. It also solves the problem that super-resolution using pictures from different view angles of the same scene is limited to multi-picture input and is therefore unsuitable for single-picture super-resolution. By learning the mapping relation of the four-dimensional information, the network learns better super-resolution capability; meanwhile, the network architecture is also suitable for super-resolution of two-dimensional pictures, i.e., the super-resolution capability of the SISR network is enhanced by training on multi-view pictures. The method is applicable to all current learning-based SISR networks and has high portability.
Fig. 3 is a schematic structural diagram of an electronic device in an embodiment of the present invention. FIG. 3 illustrates a block diagram of an exemplary electronic device 12 suitable for use in implementing embodiments of the present invention. The electronic device 12 shown in fig. 3 is only an example and should not bring any limitations to the function and scope of use of the embodiments of the present invention.
As shown in FIG. 3, electronic device 12 is embodied in the form of a general purpose computing device. The components of electronic device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 that couples various system components including the system memory 28 and the processing unit 16.
Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, an enhanced ISA bus, a Video Electronics Standards Association (VESA) local bus, and a Peripheral Component Interconnect (PCI) bus.
Electronic device 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by electronic device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
The system Memory 28 may include computer system readable media in the form of volatile Memory, such as Random Access Memory (RAM) 30 and/or cache Memory 32. The electronic device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 3, and commonly referred to as a "hard drive"). Although not shown in FIG. 3, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a Compact Disc Read-Only Memory (CD-ROM), Digital Video Disc (DVD-ROM), or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. System memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in system memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. Program modules 42 generally carry out the functions and/or methodologies of the described embodiments of the invention.
Electronic device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with electronic device 12, and/or with any devices (e.g., network card, modem, etc.) that enable electronic device 12 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 22. In the electronic device 12 of the present embodiment, the display 24 is not provided as a separate body but is embedded in the mirror surface, and when the display surface of the display 24 is not displayed, the display surface of the display 24 and the mirror surface are visually integrated. Also, the electronic device 12 may communicate with one or more networks (e.g., a Local Area Network (LAN), Wide Area Network (WAN), and/or a public Network such as the internet) via the Network adapter 20. As shown, the network adapter 20 communicates with other modules of the electronic device 12 via the bus 18. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with electronic device 12, including but not limited to: microcode, device drivers, Redundant processing units, external disk drive Arrays, disk array (RAID) systems, tape drives, and data backup storage systems, to name a few.
The processing unit 16 executes various functional applications and data processing by executing programs stored in the system memory 28, for example, implementing an image super-resolution processing method provided by an embodiment of the present invention:
acquiring an image to be processed;
performing super-resolution processing on the image to be processed through a target neural network model, wherein the target neural network model is obtained by iteratively training a SISR network model on a target image sample set, and the target image samples include a light field image sample containing four-dimensional information.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. An image super-resolution processing method is characterized by comprising the following steps:
acquiring an image to be processed;
performing super-resolution processing on the image to be processed through a target neural network model, wherein the target neural network model is obtained by iteratively training a SISR network model on a target image sample set, and the target image samples include a light field image sample containing four-dimensional information.
2. The method of claim 1, wherein iteratively training a SISR network model through a target image sample set comprises:
establishing a SISR network model;
determining K sub-view samples according to the light field image samples, wherein K is a positive integer greater than or equal to 1;
inputting the K sub-view samples into the SISR network model respectively to obtain K first predicted images;
determining a predicted light field image according to the K first predicted images;
training parameters of the SISR network model according to an objective function formed by the predicted light field image and the light field image sample, wherein the objective function comprises: a content and structure loss function;
and returning to execute the operation of inputting the K sub-view samples into the SISR network model to obtain a predicted image until a target neural network model is obtained.
3. The method of claim 2, wherein the content and structure loss function is:

$$\mathcal{L}_{cs} = \left\| \hat{L} - L \right\|_2^2 + a\left(1 - \mathrm{SSIM}(\hat{L}, L)\right)$$

wherein $\mathcal{L}_{cs}$ is the content and structure loss, $a$ is the weight of the structural similarity loss function SSIM, $L$ is the light field image sample, $\hat{L}$ is the predicted light field image assembled from the super-resolved sub-views $S\,l_k$, $l_k$ is the kth sub-view corresponding to the light field image sample, and $S$ is the upsampling matrix.
4. The method of claim 2, wherein the objective function further comprises: at least one of a variance map loss function and a disparity map loss function.
5. The method of claim 4, wherein training the parameters of the SISR network model according to the objective function formed by the predicted light field image and the light field image samples comprises:
acquiring a first super-pixel corresponding to the light field image sample and the sub-pixels in the first super-pixel;
acquiring a second super-pixel corresponding to the predicted light field image and the sub-pixels in the second super-pixel;
training parameters of the SISR network model according to a variance map loss function formed by the first super-pixel, the sub-pixels in the first super-pixel, the second super-pixel and the sub-pixels in the second super-pixel;
wherein the variance map loss function is:

$$\mathcal{L}_{vm} = \left\| \widehat{VM} - VM \right\|_2^2$$

wherein $\widehat{VM}$ is the variance map of the predicted light field image and $VM$ is the variance map of the light field image sample.
6. The method of claim 4, wherein training the parameters of the SISR network model according to the objective function formed by the predicted light field image and the light field image samples comprises:
acquiring a disparity map corresponding to the predicted light field image;
and training parameters of the SISR network model according to a disparity map loss function formed by the disparity map corresponding to the predicted light field image and the disparity map corresponding to the light field image sample.
7. The method of claim 1, wherein the target image sample further comprises: a two-dimensional image sample.
8. An image super-resolution processing apparatus, comprising:
the acquisition module is used for acquiring an image to be processed;
and the processing module is used for performing super-resolution processing on the image to be processed through a target neural network model, the target neural network model is obtained by iteratively training a SISR network model through a target image sample set, and the target image sample comprises a light field image sample containing four-dimensional information.
9. The apparatus of claim 8, wherein the processing module is specifically configured to:
establishing a SISR network model;
determining K sub-view samples according to the light field image samples, wherein K is a positive integer greater than or equal to 1;
inputting the K sub-view samples into the SISR network model respectively to obtain K first predicted images;
determining a predicted light field image according to the K first predicted images;
training parameters of the SISR network model according to an objective function formed by the predicted light field image and the light field image sample, wherein the objective function comprises: a content and structure loss function;
and returning to execute the operation of inputting the K sub-view samples into the SISR network model to obtain a predicted image until a target neural network model is obtained.
10. An electronic device, comprising:
one or more processors;
a memory for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the processors to implement the method of any of claims 1-7.
CN202111485136.2A 2021-12-07 2021-12-07 Image super-resolution processing method, device and equipment Pending CN114170084A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111485136.2A CN114170084A (en) 2021-12-07 2021-12-07 Image super-resolution processing method, device and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111485136.2A CN114170084A (en) 2021-12-07 2021-12-07 Image super-resolution processing method, device and equipment

Publications (1)

Publication Number Publication Date
CN114170084A 2022-03-11

Family

ID=80483888

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111485136.2A Pending CN114170084A (en) 2021-12-07 2021-12-07 Image super-resolution processing method, device and equipment

Country Status (1)

Country Link
CN (1) CN114170084A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116782041A (en) * 2023-05-29 2023-09-19 武汉工程大学 Image quality improvement method and system based on liquid crystal microlens array
CN116782041B (en) * 2023-05-29 2024-01-30 武汉工程大学 Image quality improvement method and system based on liquid crystal microlens array

Similar Documents

Publication Publication Date Title
US11694305B2 (en) System and method for deep learning image super resolution
US10853916B2 (en) Convolution deconvolution neural network method and system
US9824486B2 (en) High resolution free-view interpolation of planar structure
WO2019011249A1 (en) Method, apparatus, and device for determining pose of object in image, and storage medium
CN104599258B (en) A kind of image split-joint method based on anisotropic character descriptor
US8619098B2 (en) Methods and apparatuses for generating co-salient thumbnails for digital images
US20210027526A1 (en) Lighting estimation
CN110637461B (en) Compact optical flow handling in computer vision systems
Kang et al. Two-view underwater 3D reconstruction for cameras with unknown poses under flat refractive interfaces
WO2021258579A1 (en) Image splicing method and apparatus, computer device, and storage medium
An et al. TR-MISR: Multiimage super-resolution based on feature fusion with transformers
Li et al. Symmetrical feature propagation network for hyperspectral image super-resolution
CN108876716B (en) Super-resolution reconstruction method and device
US20200034664A1 (en) Network Architecture for Generating a Labeled Overhead Image
WO2021017589A1 (en) Image fusion method based on gradient domain mapping
CN111861888A (en) Image processing method, image processing device, electronic equipment and storage medium
CN114170084A (en) Image super-resolution processing method, device and equipment
CN117115200A (en) Hierarchical data organization for compact optical streaming
WO2023279920A1 (en) Microscope-based super-resolution method and apparatus, device and medium
CN116912467A (en) Image stitching method, device, equipment and storage medium
Schlosser et al. Biologically inspired hexagonal deep learning for hexagonal image generation
CN115601820A (en) Face fake image detection method, device, terminal and storage medium
Li et al. Superresolution Image Reconstruction: Selective milestones and open problems
CN113537026A (en) Primitive detection method, device, equipment and medium in building plan
CN115482285A (en) Image alignment method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination