CN109447919B - Light field super-resolution reconstruction method combining multi-view angle and semantic texture features

Info

Publication number
CN109447919B (application CN201811328290.7A)
Authority
CN
China
Prior art keywords: light field, resolution, field image, image, super
Legal status: Active
Application number
CN201811328290.7A
Other languages
Chinese (zh)
Other versions
CN109447919A (en)
Inventor
张汝民
蔡卫彤
张付停
陈建文
Current Assignee
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN201811328290.7A priority Critical patent/CN109447919B/en
Publication of CN109447919A publication Critical patent/CN109447919A/en
Application granted
Publication of CN109447919B publication Critical patent/CN109447919B/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10052 Images from lightfield camera
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a light field super-resolution reconstruction method combining multi-view angle and semantic texture features, which comprises the following steps: step 1, inputting a low-resolution light field image; taking into account both the image quality after upsampled reconstruction and the spatial geometric relations among the views, upsampling it with single-frame and multi-frame super-resolution techniques respectively, fusing the two results, and outputting the upsampled light field image; step 2, converting the upsampled light field image into a set of two-dimensional epipolar plane images, optimizing its spatial structure with a three-layer fully convolutional neural network, and outputting the structure-optimized light field image; and step 3, performing semantic and texture optimization and correction on the structure-optimized light field image, and outputting the reconstructed super-resolution light field image. The method remedies the defect that images reconstructed by light field super-resolution fail to follow the multi-view imaging rules of spatial geometry, preserves the original semantic and texture features of each sub-image, and is robust with a good reconstruction effect.

Description

Light field super-resolution reconstruction method combining multi-view angle and semantic texture features
Technical Field
The invention belongs to the technical field of computer vision and light field imaging, and particularly relates to a light field super-resolution reconstruction method combining multi-view angles and semantic texture features.
Background
Compared with traditional imaging, light field imaging captures extended light information by recording both the spatial and the angular distribution of light, and the required images are computed through data-processing operations such as transformation and integration. A light field image therefore carries four-dimensional information, two spatial dimensions and two angular dimensions, enabling functions such as refocusing after shooting, aperture control after shooting, and depth estimation from a single shot. Early light field cameras were mainly camera arrays or mask-based light field cameras, which required relatively expensive equipment. In recent years, light field cameras based on microlens arrays (MLAs) have provided an economical and efficient way to acquire a light field: a microlens array inserted between the main lens and the sensor samples the light field, each microlens receiving the light transmitted by the main lens, which the sensor then records. However, because a single sensor is shared to capture both spatial and angular information, a microlens-array light field camera trades spatial resolution against angular resolution, which limits the improvement of both.
At present, scholars at home and abroad mainly improve the spatial resolution of light field images by adapting ideas from single-frame image super-resolution, realizing super-resolution either with traditional sparse-representation methods or with the recently dominant convolutional-neural-network methods, specially transformed to suit the characteristics of light field images. However, many light field super-resolution methods do not properly account for the geometric view relations between sub-images; that is, after super-resolution reconstruction the original geometric relations between sub-images are not well preserved. Meanwhile, existing methods that add structure optimization to the light field image fail to maintain the continuity of semantics and texture of a single sub-image before and after reconstruction. These problems remain to be solved.
Disclosure of Invention
The invention aims to solve two problems of existing light field super-resolution reconstruction methods: first, the geometric view relations between sub-images are not well considered, so the original geometric relations between the sub-images of the reconstructed light field are poorly preserved; second, existing methods that add structure optimization to the light field image cannot maintain the continuity of semantics and texture of a single sub-image before and after reconstruction, and thus cannot achieve high-quality imaging. To this end, the invention provides a light field super-resolution reconstruction method combining multi-view and semantic texture features.
The technical scheme adopted by the invention is as follows:
the light field super-resolution reconstruction method combining the multi-view angle and the semantic texture features comprises the following steps:
step 1, inputting a light field image with low resolution, giving consideration to the image quality after upsampling reconstruction and the space geometric relation among all visual angles, respectively utilizing single-frame and multi-frame super-resolution technologies to upsample the input light field image to reconstruct an image with an expected magnification, fusing the images after upsampling by utilizing the single-frame and multi-frame super-resolution technologies, and outputting the upsampled light field image of a multi-visual angle sub-image set;
2, converting the up-sampled light field image into a light field image represented by a two-dimensional polar line plane graph set, training the light field image by utilizing a three-layer full convolution neural network, further optimizing and correcting a spatial structure of the light field image represented by the two-dimensional polar line plane graph set, converting the light field image into a set of multi-view subgraphs, and outputting the light field image with the optimized spatial structure;
and 3, optimizing and correcting the semantics and the texture of the light field image after the spatial structure is optimized, and outputting the light field image after the optimization and the correction, namely the super-resolution light field image after the reconstruction is completed.
Further, the step 1 comprises the following steps:
step 1.1, carrying out image alignment registration on an input light field image;
step 1.2, upsampling the aligned and registered light field image with a single-frame super-resolution technique and, separately, with a multi-frame super-resolution technique, reconstructing images at the expected magnification; in the multi-frame technique, a pseudo-video-sequence generation method for the light field sub-image set is applied to satisfy the input condition of multi-frame super-resolution;
and step 1.3, fusing the images reconstructed by the two up-sampling methods in the step 1.2, and outputting the up-sampled light field image.
Further, in step 1.3, the image up-sampled by the single-frame super-resolution technology and the image up-sampled by the multi-frame super-resolution technology are fused by using an adaptive weighted fusion method.
Further, the image fusion adopts the following formula:
I_Up^(k)[i, j] = ω1[i, j] · I_SI_Up^(k)[i, j] + ω2[i, j] · I_V_Up^(k)[i, j]
wherein I_SI_Up^(k) represents the image upsampled by the single-frame super-resolution technique, I_V_Up^(k) represents the image upsampled by the multi-frame super-resolution technique, and ω1 and ω2 are matrices of the same size as I_SI_Up^(k) and I_V_Up^(k), satisfying ω1[i, j] + ω2[i, j] = 1 at every pixel position [i, j]; ω1 and ω2 are constructed adaptively and dynamically, with the PSNR and SSIM indexes as the weighting criteria.
Further, in step 2, a three-layer fully convolutional neural network is adopted to train on the light field image represented as a set of two-dimensional epipolar plane images and to further optimize and correct its spatial structure; the i-th convolutional layer of the network has n_i convolution kernels of size f_i × f_i, and each convolutional layer is followed by a ReLU activation layer. The specific process is as follows:
a training stage: the low-resolution light field image upsampled by bicubic interpolation serves as the label, the upsampled light field image obtained in step 1 serves as the input, and the three-layer fully convolutional network model is trained with the mean square error as the loss function, so that the network extracts more accurate spatial-structure features of the low-resolution light field image;
spatial structure optimization and correction: each two-dimensional epipolar plane image of the EPI-represented light field image is input into the trained three-layer fully convolutional network, the spatial structure of the light field image is optimized according to the extracted spatial-structure features of the low-resolution light field image, and the structure-optimized light field image is output.
Further, in the step 3, a three-layer full convolution neural network is adopted to perform semantic and texture optimization and correction on the light field image with the optimized spatial structure.
Further, the step 3 comprises the following steps:
step 3.1, inputting each sub-image of the structure-optimized light field image into a three-layer fully convolutional neural network, whose i-th convolutional layer has n_i convolution kernels of size f_i × f_i, each convolutional layer followed by a ReLU activation layer;
and step 3.2, outputting the light field image after semantic texture optimization and correction, and restoring the one-dimensional angular coordinate to two-dimensional coordinates to obtain the reconstructed super-resolution light field image.
In summary, due to the adoption of the technical scheme, the invention has the beneficial effects that:
1. The invention first uses a combination of single-frame and multi-frame super-resolution techniques to upsample each light field sub-image to the expected magnification while considering both the image quality after upsampled reconstruction and the spatial geometric relations among the views. It then uses spatial-feature representations such as the two-dimensional epipolar plane image, together with the constraints of multi-view geometry and the inherent rules of light field imaging, to optimize the pixel-level correspondence between sub-images and correct the light field image as a whole. At the same time, the continuity of semantics and texture of each single sub-image before and after reconstruction is considered: after the structure optimization and correction, a semantic and texture optimization operation is applied to every sub-image, and the super-resolved light field image is finally output. The method can thus reconstruct the light field image in accordance with the view relations among the sub-images, remedies the defect that images reconstructed by light field super-resolution fail to follow the multi-view imaging rules of spatial geometry, preserves the original semantic and texture features of each sub-image, and is robust with a good reconstruction effect;
2. The invention designs an adaptive weighted fusion method; generating the upsampled image by adaptive fusion also improves the performance indexes of the whole super-resolution method;
3. The invention uses a three-layer fully convolutional neural network to optimize the spatial structure of the upsampled light field image, and another three-layer fully convolutional network to apply the semantic texture optimization to every sub-image of the structure-optimized light field image. This largely eliminates artifacts that may appear in a single sub-image, such as ghosting and blurred boundaries between different depths of field, keeps the semantic and texture features of the input low-resolution sub-image in each single-view sub-image, and yields a better reconstruction;
4. The invention creatively applies a pseudo-video-sequence generation method for the light field sub-image set to the super-resolution pipeline, satisfying the multi-frame super-resolution input condition, realizing upsampled reconstruction of the light field image while respecting the spatial geometric relations among the views, and helping to maintain the original geometric relations among the sub-images of the reconstructed light field.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
FIG. 1 is an overall flow chart of the method of the present invention;
FIG. 2 is a process flow diagram of step 1 of the process of the present invention;
FIG. 3 is a process flow diagram of step 2 of the method of the present invention;
FIG. 4 is a method flow diagram of step 3 of the method of the present invention;
FIG. 5 is a block flow diagram of the method of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the detailed description and specific examples, while indicating the preferred embodiment of the invention, are intended for purposes of illustration only and are not intended to limit the scope of the invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
It is noted that relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
A light field super-resolution reconstruction method combining multi-view and semantic texture features is shown in a flow chart of the method in figure 1, and the method comprises the following steps:
step 1, inputting a low-resolution light field image; upsampling it with single-frame and multi-frame super-resolution techniques respectively, taking into account both the image quality after upsampled reconstruction and the spatial geometric relations among the views, reconstructing images at the expected magnification, fusing the two upsampled results, and outputting the upsampled light field image as a set of multi-view sub-images;
step 2, converting the upsampled light field image into its representation as a set of two-dimensional epipolar plane images (EPIs), training a three-layer fully convolutional neural network on this representation to further optimize and correct the spatial structure, converting the result back into a set of multi-view sub-images, and outputting the structure-optimized light field image;
and step 3, optimizing and correcting the semantics and texture of the structure-optimized light field image, and outputting the corrected result, i.e. the reconstructed super-resolution light field image.
The light field image can be described by a four-dimensional structure L(x, y, u, v), where x, y ∈ (1, 2, …, N) are the spatial-resolution coordinates, i.e. the pixel coordinates within a single sub-image, and u, v ∈ (1, 2, …, M) are the angular-resolution coordinates, i.e. the horizontal and vertical indices in the array of light field views. For convenience the two angular dimensions are taken to be of equal size, so M² is the number of views, i.e. the number of sub-images.
Given the input low-resolution light field image L_l(x, y, u, v), viewed in units of sub-images, the sub-image at horizontal view position i and vertical view position j can be written I_l^(i,j)(x, y). Converting the angular coordinates (u, v) into a linear index k ∈ (1, 2, …, M²) by k = (v − 1)·M + u, the same sub-image can be written I_l^(k)(x, y); the input is therefore the set of sub-images {I_l^(k)(x, y) | k = 1, 2, …, M²}.
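To make this indexing concrete, the following minimal sketch unpacks a 4D light field L[x, y, u, v] into its M² sub-images using k = (v − 1)·M + u. The patent prescribes no implementation; the array layout, function name, and use of NumPy are assumptions made for illustration.

```python
import numpy as np

def lightfield_to_subimages(L):
    """Unpack a 4D light field L[x, y, u, v] into its M*M sub-images,
    ordered by the linear view index k = (v - 1) * M + u (1-based)."""
    N, _, M, _ = L.shape                 # spatial N x N, angular M x M
    subimages = []
    for v in range(1, M + 1):            # vertical view coordinate
        for u in range(1, M + 1):        # horizontal view coordinate
            # k = (v - 1) * M + u is simply the append order here.
            subimages.append(L[:, :, u - 1, v - 1])  # sub-image I_l^(k)(x, y)
    return subimages

# Example: a 3 x 3 array of 64 x 64 views -> 9 sub-images.
L = np.random.rand(64, 64, 3, 3)
assert len(lightfield_to_subimages(L)) == 9
```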
Further, the step 1 comprises the following steps:
and 1.1, carrying out image alignment registration on the input light field image. Matching the images under different shooting conditions, modeling and analyzing the deflection relation between other images and a reference image to obtain the external parameters of the camera, and further correcting each sub-image to obtain Il_Re
And step 1.2, respectively performing upsampling on the aligned and registered light field image by using a single-frame super-resolution technology and performing upsampling by using a multi-frame super-resolution technology to reconstruct an image with an expected magnification.
Upsampling with the single-frame super-resolution technique: each sub-image I_l_Re^(k) of the aligned and registered light field image I_l_Re is taken as input and upsampled independently to obtain I_SI_Up^(k); the result is output as I_SI_Up.
Upsampling with the multi-frame super-resolution technique: multi-frame super-resolution takes several neighboring frames with spatio-temporal continuity as input and outputs a single upsampled picture. A pseudo-video-sequence generation method for the light field sub-image set is applied to satisfy this input condition, realizing upsampled reconstruction of the light field image while respecting the spatial geometric relations among the views. First, the pseudo-video-sequence generation module selects, from the spatially continuous sub-image set I_l_Re, the sub-image with the highest similarity to the sub-images of the other views as the key frame, based on the parallax magnitude; the remaining sub-images are sorted by their average per-pixel difference from this frame, and the sub-image set is rearranged and combined into a series of parallax-continuous pseudo video sequences that serve as input to the multi-frame super-resolution module.
In the multi-frame super-resolution technique, motion compensation is applied to the known view shifts of the input pseudo video sequence, and the extra detail contributed by sub-pixel shifts is exploited to upsample each sub-image of the sequence; the upsampled frames are then relabeled back into sub-image order and output as I_V_Up.
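As a minimal sketch of this rearrangement (the patent fixes neither the similarity measure beyond an average per-pixel difference nor the data layout, so the choices below are assumptions): the key frame is taken as the sub-image closest on average to all other views, and the remaining sub-images are ordered by their mean absolute difference from it.

```python
import numpy as np

def make_pseudo_video_sequence(subimages):
    """Rearrange a light field sub-image set into one parallax-continuous
    pseudo video sequence (sketch). The key frame is the sub-image with
    the smallest mean absolute difference to all other views; the rest
    are ordered by their mean per-pixel difference from that key frame,
    so neighboring 'frames' differ by the smallest parallax steps."""
    stack = np.stack(subimages)                       # (K, H, W)
    # Similarity of each view to all others (self-difference is zero).
    centrality = np.array([np.mean(np.abs(stack - stack[i]))
                           for i in range(len(stack))])
    key = int(np.argmin(centrality))                  # most central view
    order = np.argsort([np.mean(np.abs(s - stack[key])) for s in stack])
    return [subimages[i] for i in order]              # key frame comes first
```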
And step 1.3, fusing the images obtained by the two up-sampling methods in the step 1.2, and outputting the up-sampled light field image.
And fusing the image up-sampled by the single-frame super-resolution technology and the image up-sampled by the multi-frame super-resolution technology by using an adaptive weighted fusion method.
Further, the image fusion adopts the following formula:
I_Up^(k)[i, j] = ω1[i, j] · I_SI_Up^(k)[i, j] + ω2[i, j] · I_V_Up^(k)[i, j]
wherein I_SI_Up^(k) represents the image upsampled by the single-frame super-resolution technique, I_V_Up^(k) represents the image upsampled by the multi-frame super-resolution technique, and ω1 and ω2 are matrices of the same size as I_SI_Up^(k) and I_V_Up^(k), satisfying ω1[i, j] + ω2[i, j] = 1 at every pixel position [i, j]; ω1 and ω2 are constructed adaptively and dynamically, with the PSNR and SSIM indexes as the weighting criteria.
Finally, the upsampled light field image is output as I_Up = {I_Up^(k)(x, y)}, where x, y ∈ (1, 2, …, αN) and α is the upsampling factor.
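A minimal sketch of the adaptive weighted fusion follows. The patent states only that PSNR and SSIM serve as the weighting criteria; the proxy reference (e.g. the bicubic-upsampled input), the product combination of the two indexes, and the constant per-image weights are assumptions made for illustration.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def fuse_adaptive(img_single, img_multi, reference):
    """Adaptive weighted fusion of the two upsampling branches (sketch).

    Each branch is scored against a proxy reference by PSNR * SSIM; the
    scores are normalized into weights with w1 + w2 = 1 and applied
    pixel-wise. Constant per-image weights are used here, whereas the
    patent's omega_1 and omega_2 are full weight matrices."""
    def score(img):
        psnr = peak_signal_noise_ratio(reference, img, data_range=1.0)
        ssim = structural_similarity(reference, img, data_range=1.0)
        return psnr * ssim       # one plausible way to combine the indexes

    s1, s2 = score(img_single), score(img_multi)
    w1 = np.full_like(img_single, s1 / (s1 + s2))   # omega_1
    w2 = 1.0 - w1                                   # omega_2, w1 + w2 = 1
    return w1 * img_single + w2 * img_multi         # fused sub-image I_Up^(k)
```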
The method flow chart of step 1 is shown in fig. 2.
This step uses single-frame and multi-frame super-resolution techniques to upsample the input light field image and improve its spatial resolution. Image alignment and calibration is performed first; the pipeline then splits into two channels, one using the single-frame super-resolution technique and the other the multi-frame technique. In the first channel, each sub-image of the light field image is reconstructed by a single-frame super-resolution method. In the second channel, in order to exploit the geometric view relations between sub-images, the sub-image set is first rearranged and combined by the pseudo-video-sequence generation module into a series of image sequences with parallax continuity; by motion-compensating the known view motion and exploiting the detail contributed by sub-pixel shifts, a multi-frame super-resolution method super-resolves all sub-images, i.e. the light field image. The outputs of the two channels are then fused by adaptive weighting to obtain the final upsampled light field image.
Further, step 2 specifically comprises:
Step 2.1, converting the light field image I_Up output by the upsampling of step 1 into its representation I_Up_trans as a set of two-dimensional epipolar plane images (EPIs).
Step 2.2, training a three-layer fully convolutional neural network on the EPI-represented light field image, further optimizing and correcting the spatial structure, and outputting the optimized disparity map and the EPI-represented light field image I_Refine1_trans = f_1(I_Up_trans).
The i-th convolutional layer of the three-layer fully convolutional network has n_i convolution kernels of size f_i × f_i, and each convolutional layer is followed by a ReLU activation layer. The specific process of step 2.2 is as follows:
a training stage: the low-resolution light field image upsampled by bicubic interpolation serves as the label, the upsampled light field image obtained in step 1 serves as the input, and the three-layer fully convolutional network model is trained with the mean square error as the loss function, so that the network extracts more accurate spatial-structure features of the low-resolution light field image;
spatial structure optimization and correction: each two-dimensional epipolar plane image of the EPI-represented light field image is input into the trained three-layer fully convolutional network, the spatial structure of the light field image is optimized according to the extracted spatial-structure features of the low-resolution light field image, and the structure-optimized light field image is output.
Step 2.3, the structure-optimized light field image obtained in step 2.2 is converted back into the sub-image-set representation and output as I_Refine1 = {I_Refine1^(k)}.
The method flow chart of step 2 is shown in fig. 3. As described above, one way of representing a light field image is as projections of the same scene from multiple views, i.e. a set of multi-view sub-images. Super-resolution under this representation requires consistency and continuity of the spatial structure between sub-images: the super-resolved light field image must have a spatial structure consistent with the input low-resolution sub-image set. Step 2 therefore further constrains the spatial structure of the upsampled light field image. Since the sub-image-set representation can only display the single image of each view individually, it cannot describe the spatial characteristics between views well; the characteristic spatial-structure connections between sub-images are therefore first described at the image level by two-dimensional epipolar plane images (EPIs). Using the extracted spatial-structure features of the low-resolution light field image as the standard, the upsampled light field image reconstructed in step 1 is further optimized and corrected, converted back into the sub-image-set representation, and output as the structure-optimized light field image.
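The following sketch shows how a horizontal EPI can be sliced from the 4D light field and what a three-layer fully convolutional network with per-layer ReLU looks like, together with the MSE training objective of the training stage. The patent leaves the kernel sizes f_i and kernel counts n_i unspecified, so the SRCNN-style values (9, 1, 5) and (64, 32, 1) below are illustrative assumptions, as is the use of PyTorch.

```python
import torch
import torch.nn as nn

def horizontal_epi(L, y, v):
    """Slice one horizontal epipolar plane image E(x, u) = L[x, y, u, v]
    from a light field tensor of shape (N, N, M, M)."""
    return L[:, y, :, v]                 # shape (N, M): spatial x vs angular u

class ThreeLayerFCN(nn.Module):
    """Three-layer fully convolutional network: the i-th layer has n_i
    kernels of size f_i x f_i, each followed by a ReLU layer. Padding
    keeps the EPI size unchanged so input and output align."""
    def __init__(self, f=(9, 1, 5), n=(64, 32, 1)):
        super().__init__()
        layers, in_ch = [], 1
        for fi, ni in zip(f, n):
            layers += [nn.Conv2d(in_ch, ni, fi, padding=fi // 2), nn.ReLU()]
            in_ch = ni
        self.net = nn.Sequential(*layers)

    def forward(self, epi):              # epi: (batch, 1, N, M)
        return self.net(epi)             # structure-corrected EPI

# Training-stage sketch: mean square error against the bicubic label.
model = ThreeLayerFCN()
loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```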
Further, in the step 3, a three-layer full convolution neural network is adopted to perform semantic and texture optimization and correction on the light field image with the optimized spatial structure.
Further, the step 3 comprises the following steps:
Step 3.1, each sub-image I_Refine1^(k) of the structure-optimized light field image I_Refine1 is input into a three-layer fully convolutional neural network, whose i-th convolutional layer has n_i convolution kernels of size f_i × f_i, each convolutional layer followed by a ReLU activation layer;
Step 3.2, the light field image after semantic texture optimization and correction, output by the neural network, can be expressed as I_Refine2 = f_2(I_Refine1); restoring the one-dimensional angular coordinate to two-dimensional coordinates yields the reconstructed super-resolution light field image L_SR(x, y, u, v) = F_SR(L_l(x, y, u, v)).
The method flow chart of step 3 is shown in fig. 4. As a basic requirement of the super-resolution method, the reconstructed image must not only preserve the spatial geometric continuity between views but also retain, in the sub-image of each single view, the semantic and texture features of the input low-resolution sub-image. By the analysis above, the light field image I_Refine1 output by the spatial structure optimization module is consistent in spatial structure, but because of the fine adjustments to that structure, the semantic and texture information within each single sub-image still contains errors and noise and mismatches the features of the low-resolution sub-image: for example, ghosting and blurred boundaries between different depths of field may appear in a single sub-image of I_Refine1. Step 3 applies a semantic texture optimization operation to each sub-image with a convolutional neural network and outputs the optimized and corrected light field image I_Refine2.
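A sketch of this last stage follows, reusing the ThreeLayerFCN above as the second network; the helper name and data layout are assumptions. Each structure-optimized sub-image is run through the network, and the linear view index k is restored to the two-dimensional angles (u, v):

```python
import numpy as np
import torch

def refine_subimages(subimages, model, M):
    """Semantic/texture refinement (sketch): run each structure-optimized
    sub-image through the second three-layer FCN, then invert the linear
    view index via u = (k - 1) % M + 1, v = (k - 1) // M + 1, the inverse
    of k = (v - 1) * M + u."""
    refined = {}
    with torch.no_grad():
        for k, sub in enumerate(subimages, start=1):
            x = torch.from_numpy(np.asarray(sub)).float()[None, None]
            out = model(x)[0, 0].numpy()     # refined sub-image
            u = (k - 1) % M + 1              # horizontal view coordinate
            v = (k - 1) // M + 1             # vertical view coordinate
            refined[(u, v)] = out            # entry of L_SR(x, y, u, v)
    return refined
```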
The steps of the method can be applied as a whole system comprising three modules, corresponding to the three steps of the method: an upsampling reconstruction module, a spatial structure optimization module, and a semantic texture optimization module. The schematic block flow diagram is shown in fig. 5.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (7)

1. A light field super-resolution reconstruction method combining multi-view and semantic texture features, characterized by comprising the following steps:
step 1, inputting a low-resolution light field image; taking into account both the image quality after upsampled reconstruction and the spatial geometric relations among the views, upsampling the input light field image with single-frame and multi-frame super-resolution techniques respectively to reconstruct images at the expected magnification, fusing the two upsampled results, and outputting the upsampled light field image as a set of multi-view sub-images;
step 2, converting the upsampled light field image into a light field image represented as a set of two-dimensional epipolar plane images, further optimizing the spatial structure of this representation with a three-layer fully convolutional neural network, converting the result back into a set of multi-view sub-images, and outputting the structure-optimized light field image;
and step 3, optimizing and correcting the semantics and texture of the structure-optimized light field image, and outputting the corrected result, i.e. the reconstructed super-resolution light field image.
2. The light field super-resolution reconstruction method combining multi-view and semantic texture features according to claim 1, characterized in that: the step 1 comprises the following steps:
step 1.1, carrying out image alignment registration on an input light field image;
step 1.2, upsampling the aligned and registered light field image with a single-frame super-resolution technique and, separately, with a multi-frame super-resolution technique, reconstructing images at the expected magnification; in the multi-frame technique, a pseudo-video-sequence generation method for the light field sub-image set is applied to satisfy the input condition of multi-frame super-resolution;
and step 1.3, fusing the images reconstructed by the two up-sampling methods in the step 1.2, and outputting the up-sampled light field image.
3. The light field super-resolution reconstruction method combining multi-view and semantic texture features according to claim 1 or 2, characterized in that: in the step 1, the image sampled by the single-frame super-resolution technology and the image sampled by the multi-frame super-resolution technology are fused by using a self-adaptive weighted fusion method.
4. The light field super-resolution reconstruction method combining multi-view and semantic texture features according to any one of claims 1-3, characterized in that the image fusion adopts the following formula:
I_Up^(k)[i, j] = ω1[i, j] · I_SI_Up^(k)[i, j] + ω2[i, j] · I_V_Up^(k)[i, j]
wherein I_SI_Up^(k) represents the image upsampled by the single-frame super-resolution technique, I_V_Up^(k) represents the image upsampled by the multi-frame super-resolution technique, and ω1 and ω2 are matrices of the same size as I_SI_Up^(k) and I_V_Up^(k), satisfying ω1[i, j] + ω2[i, j] = 1; ω1 and ω2 are constructed adaptively and dynamically, with the PSNR and SSIM indexes as the weighting criteria.
5. The light field super-resolution reconstruction method combining multi-view and semantic texture features according to claim 1, characterized in that in step 2 a three-layer fully convolutional neural network is adopted to train on the light field image represented as a set of two-dimensional epipolar plane images and to further optimize and correct the spatial structure, the i-th convolutional layer of the network having n_i convolution kernels of size f_i × f_i, each convolutional layer followed by a ReLU activation layer, the specific process being as follows:
a training stage: the low-resolution light field image upsampled by bicubic interpolation serves as the label, the upsampled light field image obtained in step 1 serves as the input, and the three-layer fully convolutional network model is trained with the mean square error as the loss function, so that the network extracts more accurate spatial-structure features of the low-resolution light field image;
spatial structure optimization and correction: each two-dimensional epipolar plane image of the EPI-represented light field image is input into the trained three-layer fully convolutional network, the spatial structure of the light field image is optimized according to the extracted spatial-structure features of the low-resolution light field image, and the structure-optimized light field image is output.
6. The light field super-resolution reconstruction method combining multi-view and semantic texture features according to claim 1, characterized in that in step 3 a three-layer fully convolutional neural network is adopted to perform semantic and texture optimization and correction on the structure-optimized light field image.
7. The light field super resolution reconstruction method combining multi-view and semantic texture features according to claim 1 or 4, characterized in that: the step 3 comprises the following steps:
step 3.1, inputting each sub-image of the structure-optimized light field image into a three-layer fully convolutional neural network, whose i-th convolutional layer has n_i convolution kernels of size f_i × f_i, each convolutional layer followed by a ReLU activation layer;
and step 3.2, outputting the light field image after semantic texture optimization and correction, and restoring the one-dimensional angular coordinate to two-dimensional coordinates to obtain the reconstructed super-resolution light field image.
CN201811328290.7A 2018-11-08 2018-11-08 Light field super-resolution reconstruction method combining multi-view angle and semantic texture features Active CN109447919B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811328290.7A CN109447919B (en) 2018-11-08 2018-11-08 Light field super-resolution reconstruction method combining multi-view angle and semantic texture features

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811328290.7A CN109447919B (en) 2018-11-08 2018-11-08 Light field super-resolution reconstruction method combining multi-view angle and semantic texture features

Publications (2)

Publication Number Publication Date
CN109447919A CN109447919A (en) 2019-03-08
CN109447919B true CN109447919B (en) 2022-05-06

Family

ID=65552558

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811328290.7A Active CN109447919B (en) 2018-11-08 2018-11-08 Light field super-resolution reconstruction method combining multi-view angle and semantic texture features

Country Status (1)

Country Link
CN (1) CN109447919B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110533594B (en) * 2019-08-30 2023-04-07 Oppo广东移动通信有限公司 Model training method, image reconstruction method, storage medium and related device
CN112750076B (en) * 2020-04-13 2022-11-15 奕目(上海)科技有限公司 Light field multi-view image super-resolution reconstruction method based on deep learning
WO2022016350A1 (en) * 2020-07-21 2022-01-27 Oppo广东移动通信有限公司 Light field image processing method, light field image encoder and decoder, and storage medium
CN112102165B (en) * 2020-08-18 2022-12-06 北京航空航天大学 Light field image angular domain super-resolution system and method based on zero sample learning
US11543654B2 (en) * 2020-09-16 2023-01-03 Aac Optics Solutions Pte. Ltd. Lens module and system for producing image having lens module
CN112767246B (en) * 2021-01-07 2023-05-26 北京航空航天大学 Multi-multiplying power spatial super-resolution method and device for light field image
CN112785502B (en) * 2021-01-25 2024-04-16 江南大学 Light field image super-resolution method of hybrid camera based on texture migration
CN115423946B (en) 2022-11-02 2023-04-07 清华大学 Large scene elastic semantic representation and self-supervision light field reconstruction method and device
CN116721222B (en) * 2023-08-10 2023-10-31 清华大学 Large-scale light field semantic driving intelligent characterization and real-time reconstruction method
CN118154430A (en) * 2024-05-10 2024-06-07 清华大学 Space-time-angle fusion dynamic light field intelligent imaging method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108074218A (en) * 2017-12-29 2018-05-25 清华大学 Image super-resolution method and device based on optical field acquisition device
CN108615221A (en) * 2018-04-10 2018-10-02 清华大学 Light field angle super-resolution rate method and device based on the two-dimentional epipolar plane figure of shearing

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011237997A (en) * 2010-05-10 2011-11-24 Sony Corp Image processing device, and image processing method and program
US10366480B2 (en) * 2016-07-01 2019-07-30 Analytical Mechanics Associates, Inc. Super-resolution systems and methods

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108074218A (en) * 2017-12-29 2018-05-25 清华大学 Image super-resolution method and device based on optical field acquisition device
CN108615221A (en) * 2018-04-10 2018-10-02 清华大学 Light field angle super-resolution rate method and device based on the two-dimentional epipolar plane figure of shearing

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Image Super-Resolution Algorithm Based on Deep Deconvolutional Neural Networks; Peng Yali et al.; Journal of Software (软件学报); 2017-12-04 (No. 04); full text *

Also Published As

Publication number Publication date
CN109447919A (en) 2019-03-08

Similar Documents

Publication Publication Date Title
CN109447919B (en) Light field super-resolution reconstruction method combining multi-view angle and semantic texture features
CN111311490B (en) Video super-resolution reconstruction method based on multi-frame fusion optical flow
CN113362223B (en) Image super-resolution reconstruction method based on attention mechanism and two-channel network
CN108074218B (en) Image super-resolution method and device based on light field acquisition device
CN114092330B (en) Light-weight multi-scale infrared image super-resolution reconstruction method
CN108921781B (en) Depth-based optical field splicing method
CN110070489A (en) Binocular image super-resolution method based on parallax attention mechanism
CN112465955A (en) Dynamic human body three-dimensional reconstruction and visual angle synthesis method
CN106023230B (en) A kind of dense matching method of suitable deformation pattern
CN113077505B (en) Monocular depth estimation network optimization method based on contrast learning
CN112785502B (en) Light field image super-resolution method of hybrid camera based on texture migration
CN115115516B (en) Real world video super-resolution construction method based on Raw domain
CN112950475A (en) Light field super-resolution reconstruction method based on residual learning and spatial transformation network
CN111476714B (en) Cross-scale image splicing method and device based on PSV neural network
CN116486074A (en) Medical image segmentation method based on local and global context information coding
CN114757862B (en) Image enhancement progressive fusion method for infrared light field device
CN116957931A (en) Method for improving image quality of camera image based on nerve radiation field
Wu et al. Depth mapping of integral images through viewpoint image extraction with a hybrid disparity analysis algorithm
CN114359041A (en) Light field image space super-resolution reconstruction method
CN112862675A (en) Video enhancement method and system for space-time super-resolution
CN116523757A (en) Light field image super-resolution model based on generation countermeasure network and training method thereof
CN116503245A (en) Multi-scale fusion and super-resolution reconstruction method for coal rock digital core sequence pictures
CN114998405A (en) Digital human body model construction method based on image drive
CN111586316A (en) Method for generating stereoscopic element image array based on spherical camera array
CN110769242A (en) Full-automatic 2D video to 3D video conversion method based on space-time information modeling

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant