CN113592716B - Light field image space domain super-resolution method, system, terminal and storage medium - Google Patents

Light field image space domain super-resolution method, system, terminal and storage medium

Info

Publication number
CN113592716B
Authority
CN
China
Prior art keywords
resolution
image
viewpoint
alignment
super
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110906481.2A
Other languages
Chinese (zh)
Other versions
CN113592716A (en)
Inventor
安平
陈欣
陈亦雷
黄新彭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Shanghai for Science and Technology
Original Assignee
University of Shanghai for Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Shanghai for Science and Technology
Priority to CN202110906481.2A
Publication of CN113592716A
Application granted
Publication of CN113592716B
Legal status: Active

Classifications

    • G06T 3/4053 Super resolution, i.e. output image resolution higher than sensor resolution
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N 3/08 Learning methods (neural networks, G06N 3/02)
    • G06T 7/11 Region-based segmentation
    • G06T 7/70 Determining position or orientation of objects or cameras
    • G06T 2207/20016 Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform
    • G06T 2207/20021 Dividing image into blocks, subimages or windows
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]

Abstract

The invention discloses a light field image spatial domain super-resolution method, system, terminal and storage medium, comprising the following steps: dividing the viewpoint images into a plurality of groups according to their angular positions, and correspondingly setting a plurality of viewpoint image alignment schemes; taking the viewpoint image at each angular position in turn as the central viewpoint image to be super-resolved, and using a parallax extraction method based on frequency domain image pyramid decomposition to parallax-align the viewpoint images within a preset alignment range around it to the position of the central viewpoint image; cropping, downsampling and aligning the viewpoint images to be trained, and training to obtain light field image super-resolution models; and inputting the aligned viewpoint images to be tested into the light field image super-resolution model, and predicting the viewpoint image after spatial domain super-resolution. By combining phase-based parallax alignment of the viewpoint images with deep-learning-based light field image super-resolution, the invention obtains a higher-quality super-resolution result.

Description

Light field image space domain super-resolution method, system, terminal and storage medium
Technical Field
The invention relates to the technical field of image processing, in particular to a light field image spatial domain super-resolution method, system, terminal and storage medium.
Background
Light field imaging (Light Field Imaging) is currently one of the most widely used techniques for capturing the 3D appearance of a scene. Unlike an ordinary two-dimensional image, which captures only the spatial information of the light rays and records their 2D projection, a light field image records the distribution of light rays in the whole space. Acquiring light field images depends on a camera array or a purpose-built light field camera. A simple light field camera can be made by retrofitting a conventional camera with a microlens array (Array of Microlenses) placed between the main lens and the original photosensitive element: the microlens array refracts the light passing through the main lens once more before projecting it onto the sensor, so that the light field camera records both the incident position of each ray (from the position of the microlens that received it) and its incident direction. Light field imaging is therefore widely applicable to depth estimation (Depth Estimation), image refocusing (Refocus), three-dimensional reconstruction (3D Reconstruction) and other fields, and is of great research significance.
The image directly obtained by a light field camera is called the light field raw image. It is composed of many small microlens images; rearranging the pixels of the microlens images according to the positions of the microlenses yields viewpoint images from different viewing angles. The number of pixels in each microlens image determines the number of viewpoint images, and the number of microlens images determines the number of pixels in each viewpoint image. The size of a light field image (the resolution of the light field) can therefore be denoted [A, B, C, D], where A, B is the spatial resolution, i.e. each viewpoint image has size A × B, and C, D is the angular resolution, i.e. the light field raw image contains C × D viewpoint images. A light field records the angle and position information of the light rays simultaneously, achieving a high-dimensional image representation; but because of the limitations of light field acquisition devices, spatial and angular resolution must be traded off under limited imaging conditions, and raising the angular resolution comes at the expense of the spatial resolution. The spatial resolution of a light field image is consequently much lower than that of a two-dimensional image from a conventional imaging device, and the angular resolution is also quite limited, typically only 9 × 9 or 14 × 14; meanwhile, the edge viewpoint images of a real light field are generally unusable because of the microlens limitations. To improve the quality of the light field for downstream applications, it is therefore necessary to raise its resolution in the angular or spatial domain.
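The rearrangement from the raw image to the viewpoint images is a pure pixel shuffle. A minimal NumPy sketch is given below; the microlens layout it assumes (microlens images tiled on a regular C × D pixel grid, no demosaicing or calibration) is a simplification of real light field raw data:

```python
import numpy as np

def raw_to_viewpoints(raw: np.ndarray, C: int, D: int) -> np.ndarray:
    """Rearrange a light field raw image into its C x D viewpoint images.

    Assumes `raw` has shape (A*C, B*D) and that microlens (a, b) occupies
    the C x D pixel block starting at (a*C, b*D); pixel (u, v) inside each
    microlens then belongs to the viewpoint image at angular position (u, v).
    """
    AC, BD = raw.shape
    A, B = AC // C, BD // D
    # (A, C, B, D) -> (C, D, A, B): gather pixel (u, v) of every microlens
    lf = raw.reshape(A, C, B, D).transpose(1, 3, 0, 2)
    return lf  # lf[u, v] is the A x B viewpoint image at position (u, v)

# A synthetic light field of size [64, 64, 7, 7]:
raw = np.random.rand(64 * 7, 64 * 7)
views = raw_to_viewpoints(raw, C=7, D=7)
assert views.shape == (7, 7, 64, 64)
```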
Light field super-resolution can be divided into spatial domain super-resolution (increasing the resolution of each viewpoint image) and angular domain super-resolution (increasing the number of viewpoint images). For insufficient spatial resolution, the most direct approach is to apply two-dimensional image super-resolution to each viewpoint image in turn. Most existing deep-learning methods design different network structures to extract different features for spatial domain super-resolution of the viewpoint images, but ignore the parallax relation between the viewpoint images; this causes discontinuity between the contents of adjacent viewpoint images and directly affects subsequent light field applications. At the same time, such a network has to learn not only the pixel differences between viewpoint images at different angular positions but also the differences of the angular positions themselves. How to take the characteristics of the light field itself into account, instead of treating the viewpoint images as independent individuals, is therefore a problem that still needs to be solved when performing deep-learning-based spatial domain super-resolution on light field images.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a light field image spatial domain super-resolution method, system, terminal and storage medium, which exploit the parallax relation between the light field viewpoint images to perform phase-based parallax alignment of the viewpoint images and combine it with a deep-learning-based light field image super-resolution method, thereby obtaining a higher-quality super-resolution result.
In order to solve the technical problems, the invention is realized by the following technical scheme:
according to a first aspect of the present invention, there is provided a light field image spatial domain super resolution method comprising:
s11: dividing viewpoint images into a plurality of groups according to the angle positions, and correspondingly setting alignment schemes with different alignment ranges of a plurality of viewpoint images;
s12: the viewpoint images at different angle positions are taken as central viewpoint images to be super-resolved, parallax alignment is carried out on the viewpoint images to be aligned in a preset alignment range around the viewpoint images to be aligned to the central viewpoint image positions by utilizing a parallax extraction method based on frequency domain image pyramid decomposition according to the alignment scheme corresponding to the S11, and a group of aligned viewpoint images are obtained;
s13: cutting a viewpoint image to be trained into an image block with a preset size, downsampling, aligning according to the S12, and outputting the image block to a deep learning network for training to obtain a light field image super-resolution model corresponding to different alignment schemes;
s14: and aligning the viewpoint images to be tested according to the S12, inputting the aligned viewpoint images into the light field image super-resolution model corresponding to the alignment scheme of the aligned viewpoint images, which is obtained in the S13, and predicting to obtain the viewpoint images after spatial domain super-resolution.
Preferably, the step S11 specifically includes:
dividing 7×7 light field viewpoint images into 6 groups according to angular positions, and correspondingly setting 6 viewpoint image alignment schemes;
wherein the alignment ranges of the 6 alignment schemes include: 3×3, 5×5 and 7×7.
Preferably, the step S12 specifically includes:
when performing the alignment operation, taking the viewpoint image at each angular position as the central viewpoint image to be super-resolved, and, according to the alignment scheme set in S11, performing parallax alignment on the viewpoint images to be aligned in four directions within the preset alignment range around the central viewpoint image; viewpoint images that are missing because the preset alignment range extends beyond the boundary are replaced by all-zero matrices;
wherein the four directions are 0 degrees, 45 degrees, 90 degrees and 135 degrees respectively;
the parallax alignment specifically includes: calculating the phase difference between the reference viewpoint image and the viewpoint image to be aligned with a parallax extraction method based on frequency domain image pyramid decomposition, taking the reference viewpoint image as the "1" phase and the viewpoint image to be aligned as the "0" phase; the phase difference is taken as the parallax information and added to the original phase of the viewpoint image to be aligned.
Preferably, the step S13 specifically includes:
S131: cropping the high-resolution viewpoint images to be trained into high-resolution image blocks of the same size;
S132: downsampling the image blocks of the high-resolution viewpoint images to be trained obtained in S131 into image blocks of low-resolution viewpoint images to be trained by bicubic interpolation;
S133: aligning the image blocks of the low-resolution viewpoint images to be trained obtained in S132 according to S12, obtaining a group of aligned low-resolution viewpoint image blocks to be trained;
S134: inputting the group of aligned low-resolution viewpoint image blocks to be trained obtained in S133, together with the image blocks of the non-downsampled high-resolution central viewpoint image to be trained, into a deep learning network for training, obtaining light field image super-resolution models corresponding to the 6 different alignment schemes.
Preferably, the step S14 specifically includes:
S141: downsampling the high-resolution viewpoint images to be tested into low-resolution viewpoint images to be tested;
S142: aligning the low-resolution viewpoint images to be tested obtained in S141 according to S12, obtaining a group of aligned low-resolution viewpoint images to be tested;
S143: inputting the group of aligned low-resolution viewpoint images to be tested obtained in S142 into the light field image super-resolution model obtained in S13 that corresponds to the alignment scheme, and predicting the viewpoint image after spatial domain super-resolution.
According to a second aspect of the present invention, there is also provided a light field image spatial domain super-resolution system, comprising: an alignment scheme setting module, a viewpoint image alignment module, a viewpoint image model training module and a viewpoint image super-resolution module; wherein:
the alignment scheme setting module is used for: dividing the viewpoint images into a plurality of groups according to their angular positions, and correspondingly setting a plurality of viewpoint image alignment schemes with different alignment ranges;
the viewpoint image alignment module is used for: taking the viewpoint image at each angular position as the central viewpoint image to be super-resolved, and, according to the alignment scheme set by the alignment scheme setting module, using a parallax extraction method based on frequency domain image pyramid decomposition to parallax-align the viewpoint images to be aligned within the preset alignment range around it to the position of the central viewpoint image, obtaining a group of aligned viewpoint images;
the viewpoint image model training module is used for: first cropping the viewpoint images to be trained into image blocks of a preset size, then downsampling them, aligning them with the viewpoint image alignment module, and feeding them to a deep learning network for training, obtaining light field image super-resolution models corresponding to the different alignment schemes;
the viewpoint image super-resolution module is used for: inputting the viewpoint images to be tested, aligned by the viewpoint image alignment module, into the light field image super-resolution model obtained by the viewpoint image model training module that corresponds to their alignment scheme, and predicting the viewpoint image after spatial domain super-resolution.
Preferably, the alignment scheme setting module is configured to divide the 7×7 light field viewpoint images into 6 groups according to angular positions, and correspondingly set 6 viewpoint image alignment schemes;
wherein the alignment ranges of the 6 alignment schemes include: 3×3, 5×5 and 7×7.
Preferably, the viewpoint image alignment module is configured to:
take the viewpoint image at each angular position as the central viewpoint image to be super-resolved, and, according to the alignment scheme set by the alignment scheme setting module, perform parallax alignment on the viewpoint images to be aligned in four directions within the preset alignment range around the central viewpoint image; viewpoint images that are missing because the preset alignment range extends beyond the boundary are replaced by all-zero matrices;
wherein the four directions are 0 degrees, 45 degrees, 90 degrees and 135 degrees respectively;
the parallax alignment specifically includes: calculating the phase difference between the reference viewpoint image and the viewpoint image to be aligned with a parallax extraction method based on frequency domain image pyramid decomposition, taking the reference viewpoint image as the "1" phase and the viewpoint image to be aligned as the "0" phase; the phase difference is taken as the parallax information and added to the original phase of the viewpoint image to be aligned.
Preferably, the viewpoint image model training module includes: a to-be-trained cropping module, a to-be-trained downsampling module, a to-be-trained alignment module and a training module; wherein:
the to-be-trained cropping module is used for cropping the high-resolution viewpoint images to be trained into high-resolution image blocks of the same size;
the to-be-trained downsampling module is used for downsampling the image blocks of the high-resolution viewpoint images to be trained obtained by the to-be-trained cropping module into image blocks of low-resolution viewpoint images to be trained by bicubic interpolation;
the to-be-trained alignment module is used for aligning the image blocks of the low-resolution viewpoint images to be trained obtained by the to-be-trained downsampling module with the viewpoint image alignment module, obtaining a group of aligned low-resolution viewpoint image blocks to be trained;
the training module is used for inputting the group of aligned low-resolution viewpoint image blocks to be trained obtained by the to-be-trained alignment module, together with the image blocks of the non-downsampled high-resolution central viewpoint image to be trained, into the deep learning network for training, obtaining light field image super-resolution models corresponding to the 6 different alignment schemes.
Preferably, the viewpoint image super-resolution module includes: a to-be-tested downsampling module, a to-be-tested alignment module and a super-resolution module; wherein:
the to-be-tested downsampling module is used for downsampling the high-resolution viewpoint images to be tested into low-resolution viewpoint images to be tested;
the to-be-tested alignment module is used for aligning the low-resolution viewpoint images to be tested obtained by the to-be-tested downsampling module with the viewpoint image alignment module, obtaining a group of aligned low-resolution viewpoint images to be tested;
the super-resolution module is used for inputting the group of aligned low-resolution viewpoint images to be tested obtained by the to-be-tested alignment module into the light field image super-resolution model obtained by the viewpoint image model training module that corresponds to the alignment scheme, and predicting the viewpoint image after spatial domain super-resolution.
According to a third aspect of the present invention, there is also provided a terminal, comprising: a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor, when executing the computer program, implements the above light field image spatial domain super-resolution method.
According to a fourth aspect of the present invention, there is also provided a computer readable storage medium having stored thereon a computer program which, when executed, is adapted to carry out the above-described light field image spatial domain super resolution method.
Compared with the prior art, the invention has the following advantages:
(1) The light field image spatial domain super-resolution method, system, terminal and storage medium provided by the invention exploit the parallax relation between the light field viewpoint images to perform phase-based parallax alignment of the viewpoint images, and combine it with deep-learning-based light field image super-resolution, obtaining a higher-quality super-resolution result;
(2) Compared with conventional parallax alignment methods, the invention needs no auxiliary depth map, saving considerable computing resources; moreover, while a depth map obtained by computation or deep learning has limited accuracy, the alignment method based on frequency domain image pyramid parallax extraction can recover richer sub-pixel information;
(3) The light field image spatial domain super-resolution method, system, terminal and storage medium provided by the invention adopt an align-then-super-resolve strategy: the viewpoint images are aligned to the same angular position before being input into the network for learning, which eliminates the parallax between the original viewpoint images and makes it easier for the network to learn from and model the input viewpoint images.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions of the prior art, the drawings which are used in the description of the embodiments or the prior art will be briefly described, it being obvious that the drawings in the description below are only some embodiments of the invention, and that other drawings can be obtained according to these drawings without inventive faculty for a person skilled in the art.
FIG. 1 is a flow chart of a spatial domain super-resolution method of a light field image according to an embodiment of the invention;
FIG. 2 is a schematic diagram of different alignment schemes in S11 according to a preferred embodiment of the invention;
FIG. 3 is an alignment diagram of the 7×7 alignment range in S12 according to a preferred embodiment of the present invention;
FIG. 4 is a schematic diagram of the deep learning network in S13 according to a preferred embodiment of the present invention;
fig. 5 is a schematic diagram of a spatial domain super-resolution system of a light field image according to an embodiment of the present invention.
Reference numerals illustrate:
1-an alignment scheme setting module,
a 2-viewpoint image alignment module,
a 3-viewpoint image model training module,
4-super resolution module of view image.
Detailed Description
The following describes in detail the examples of the present invention, which are implemented on the premise of the technical solution of the present invention, and detailed embodiments and specific operation procedures are given, but the scope of protection of the present invention is not limited to the following examples.
Fig. 1 is a flowchart of a spatial domain super-resolution method of a light field image according to an embodiment of the invention.
Referring to fig. 1, the light field image spatial domain super-resolution method of this embodiment includes:
S11: dividing the viewpoint images into a plurality of groups according to their angular positions, and correspondingly setting a plurality of viewpoint image alignment schemes with different alignment ranges;
S12: taking the viewpoint image at each angular position as the central viewpoint image to be super-resolved, and, according to the alignment scheme set in S11, using a parallax extraction method based on frequency domain image pyramid decomposition to parallax-align the viewpoint images to be aligned within the preset alignment range around it to the position of the central viewpoint image, obtaining a group of aligned viewpoint images;
S13: cropping the viewpoint images to be trained into image blocks of a preset size, downsampling them, aligning them according to S12, and feeding them to a deep learning network for training, obtaining light field image super-resolution models corresponding to the different alignment schemes;
S14: aligning the viewpoint images to be tested according to S12, inputting them into the light field image super-resolution model obtained in S13 that corresponds to their alignment scheme, and predicting the viewpoint image after spatial domain super-resolution.
In a preferred embodiment, S11 specifically includes:
dividing the 7×7 light field viewpoint images into the 6 groups shown in fig. 2 according to their angular positions, and correspondingly setting 6 viewpoint image alignment schemes; viewpoint images drawn with the same texture in fig. 2 form one group and share one alignment scheme;
wherein the alignment ranges of the 6 alignment schemes include: 3×3, 5×5 and 7×7. Of the 6 alignment schemes in this embodiment, four have an alignment range of 3×3, one has an alignment range of 5×5, and one has an alignment range of 7×7.
In the preferred embodiment, S12 decomposes the reference viewpoint image and the viewpoint image to be aligned with a quarter-octave pyramid, obtaining their subbands in different frequency domain intervals. Taking the reference viewpoint image as the "1" phase and the viewpoint image to be aligned as the "0" phase, the phase difference between the two images is calculated on each subband; the phase difference on each frequency band can be regarded as the global "motion information" between the two images, that is, the parallax information. Superimposing this parallax onto the original phase of the viewpoint image to be aligned therefore achieves parallax alignment, aligning it to the angular position of the reference viewpoint image.
The principle is briefly explained below, taking a one-dimensional intensity profile f as an example. Under global motion with displacement δ(t) over time, the intensity at each coordinate of the one-dimensional image can be expressed as f(x + δ(t)). Fourier series decomposition of this intensity gives

$$f(x+\delta(t)) = \sum_{\omega=-\infty}^{+\infty} A_\omega\, e^{i\omega(x+\delta(t))} \qquad (1)$$

where $A_\omega$ is the amplitude and ω the frequency. Each frequency value ω corresponds to one frequency domain subband, described as:
$$S_\omega(x,t) = A_\omega\, e^{i\omega(x+\delta(t))} \qquad (2)$$

As equation (2) shows, the phase $\omega(x+\delta(t))$ of the subband $S_\omega$ contains the motion information δ(t), so the motion information can be controlled by manipulating the phase.
Thus, to amplify the motion information, the phase is filtered with a temporal filter (assuming the filter removes only the DC component), isolating $B_\omega(x,t) = \omega\,\delta(t)$; this is multiplied by an amplification factor α and superimposed on the original phase of the subband $S_\omega$, giving

$$\hat{S}_\omega(x,t) = S_\omega(x,t)\, e^{i\alpha B_\omega(x,t)} = A_\omega\, e^{i\omega(x+(1+\alpha)\delta(t))} \qquad (3)$$
by the above operation, we obtain f (x+ (1+α) δ (t)) after motion information amplification. Therefore, for a two-dimensional image, the effect of amplifying local motion can be achieved by controlling pyramid decomposition to perform phase amplification. For light field images, however, there is also "motion information" between the viewpoint images, and such "motion information" is global, i.e., parallax information. We can conclude that: the phase difference between the light field viewpoint images implies parallax information between the light field viewpoint images.
When only two viewpoint images are processed, analysing the temporal phase change reduces to considering the phase difference of the corresponding frequency bands between the two images; since there is no notion of time, δ(t) reduces to δ. For viewpoint images A and B, taking A as the "1" phase and B as the "0" phase, the phase difference on each corresponding band is calculated and superimposed on the original phase of that band of B, thereby parallax-aligning B to the angular position of A.
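For two viewpoint images the operation can be sketched with a single global FFT band. This is a deliberate simplification for illustration: the patent computes the phase difference per subband of a quarter-octave pyramid, which captures spatially varying parallax, whereas one global band models only a single shift:

```python
import numpy as np

def phase_align(ref: np.ndarray, src: np.ndarray) -> np.ndarray:
    """Align `src` (the "0" phase) to `ref` (the "1" phase) by phase shifting.

    Single-band illustration of the alignment in S12: the per-frequency
    phase difference between the two images is taken as the parallax
    information and added to the original phase of `src`.
    """
    F_ref, F_src = np.fft.fft2(ref), np.fft.fft2(src)
    dphi = np.angle(F_ref) - np.angle(F_src)  # parallax information
    # superimpose the difference onto the original phase of `src`
    F_out = np.abs(F_src) * np.exp(1j * (np.angle(F_src) + dphi))
    return np.real(np.fft.ifft2(F_out))
```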
When the alignment operation is performed, the viewpoint image at each angular position is taken as the central viewpoint image to be super-resolved, and the viewpoint images to be aligned in four directions within the preset alignment range around it are parallax-aligned according to the alignment scheme set in S11; viewpoint images that are missing because the preset alignment range extends beyond the boundary are replaced by all-zero matrices. Fig. 3 shows, for the 7×7 alignment range, the viewpoint images to be aligned in the four directions (the black squares other than the central one) and the central viewpoint image to be super-resolved (the central black square).
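A sketch of gathering the views to be aligned around one centre view follows; the direction vectors and the output ordering are assumptions of this sketch, since the patent fixes them only through fig. 2 and fig. 3:

```python
import numpy as np

def gather_directional_views(views, cu, cv, R):
    """Collect the views that S12 aligns around a centre view.

    `views` has shape (C, D, H, W); (cu, cv) is the angular position of the
    centre view to be super-resolved; R in {3, 5, 7} is the alignment range.
    Views are taken along the 0/45/90/135 degree lines through the centre,
    and positions falling outside the grid are filled with all-zero matrices.
    """
    C, D, H, W = views.shape
    r = R // 2
    directions = [(0, 1), (-1, 1), (-1, 0), (-1, -1)]  # 0, 45, 90, 135 deg
    out = []
    for du, dv in directions:
        line = []
        for k in range(-r, r + 1):
            if k == 0:
                continue  # the centre view itself is handled separately
            u, v = cu + k * du, cv + k * dv
            line.append(views[u, v] if 0 <= u < C and 0 <= v < D
                        else np.zeros((H, W)))  # out-of-boundary view
        out.append(np.stack(line))
    return out  # four stacks of R - 1 views each
```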
In a preferred embodiment, S13 specifically includes:
S131: cropping the high-resolution viewpoint images to be trained into high-resolution image blocks of the same size (64×64);
in one embodiment, the training dataset (100 real light field images) provided by Kalantari et al. is used as the training data, i.e. the high-resolution viewpoint images to be trained;
S132: downsampling the image blocks of the high-resolution viewpoint images to be trained obtained in S131 into image blocks of low-resolution viewpoint images to be trained by bicubic interpolation;
S133: aligning the image blocks of the low-resolution viewpoint images to be trained obtained in S132 according to S12, obtaining a group of aligned low-resolution viewpoint image blocks to be trained;
S134: inputting the group of aligned low-resolution viewpoint image blocks to be trained obtained in S133, together with the image blocks of the non-downsampled high-resolution central viewpoint image to be trained, into a deep learning network for training, obtaining light field image super-resolution models corresponding to the 6 different alignment schemes.
In one embodiment, the deep learning network structure, shown in fig. 4, is divided into three parts: local feature extraction, global feature extraction and upsampling. First, four residual networks with the same four-layer structure extract local features from the low-resolution viewpoint image blocks to be trained in the four directions; the local features are concatenated into a global feature, from which a further residual network extracts deep features. The features extracted from the low-resolution central viewpoint image block are then added to the deep features as a global residual, and the result is fed into the upsampling module to obtain the output. The output is fitted to the image blocks of the non-downsampled central viewpoint image with an L1 loss function, finally yielding the light field image super-resolution models corresponding to the 6 different alignment schemes.
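The patch preparation of S131-S132 that feeds this network reduces to a tiling loop plus bicubic resizing, for instance as below (the ×2 scale factor is an assumption; the embodiment does not fix one):

```python
import cv2
import numpy as np

def make_training_pairs(hr_view: np.ndarray, patch: int = 64, scale: int = 2):
    """Cut one HR viewpoint image into equal 64 x 64 blocks (S131) and
    bicubic-downsample each block into its LR counterpart (S132)."""
    H, W = hr_view.shape[:2]
    pairs = []
    for y in range(0, H - patch + 1, patch):
        for x in range(0, W - patch + 1, patch):
            hr = hr_view[y:y + patch, x:x + patch]
            lr = cv2.resize(hr, (patch // scale, patch // scale),
                            interpolation=cv2.INTER_CUBIC)
            pairs.append((lr, hr))
    return pairs
```

A rough PyTorch sketch of the three-part network itself is given next. The embodiment fixes the overall layout (four identically structured four-layer residual branches, concatenation into a global feature, deeper residual extraction, the centre-view features added as a global residual, an upsampling module and the L1 loss); the channel width, the exact residual block and the PixelShuffle upsampler are our assumptions:

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1))

    def forward(self, x):
        return x + self.body(x)

class LFSRNet(nn.Module):
    def __init__(self, views_per_dir, ch=64, scale=2):
        super().__init__()
        # local feature extraction: one 4-block residual branch per direction
        self.local = nn.ModuleList([
            nn.Sequential(nn.Conv2d(views_per_dir, ch, 3, padding=1),
                          *[ResBlock(ch) for _ in range(4)])
            for _ in range(4)])
        self.center = nn.Conv2d(1, ch, 3, padding=1)  # centre-view features
        self.fuse = nn.Conv2d(4 * ch, ch, 1)          # concat -> global feature
        self.deep = nn.Sequential(*[ResBlock(ch) for _ in range(4)])
        self.up = nn.Sequential(                      # upsampling module
            nn.Conv2d(ch, ch * scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),
            nn.Conv2d(ch, 1, 3, padding=1))

    def forward(self, dirs, center):
        # dirs: 4 tensors (B, views_per_dir, h, w); center: (B, 1, h, w)
        g = self.fuse(torch.cat([f(d) for f, d in zip(self.local, dirs)], 1))
        g = self.deep(g) + self.center(center)        # global residual
        return self.up(g)

# One training step fits the output to the HR centre block with an L1 loss:
# loss = nn.L1Loss()(model(dirs, lr_center), hr_center)
```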
In a preferred embodiment, S14 specifically includes:
S141: downsampling the high-resolution viewpoint images to be tested into low-resolution viewpoint images to be tested;
in one embodiment, the test dataset (30 real light field images) provided by Kalantari et al. is used as the test data, i.e. the high-resolution viewpoint images to be tested;
S142: aligning the low-resolution viewpoint images to be tested obtained in S141 according to S12, obtaining a group of aligned low-resolution viewpoint images to be tested;
S143: inputting the group of aligned low-resolution viewpoint images to be tested obtained in S142 into the light field image super-resolution model obtained in S13 that corresponds to the alignment scheme, and predicting the viewpoint image after spatial domain super-resolution.
Fig. 5 is a schematic diagram of a spatial domain super-resolution system of a light field image according to an embodiment of the invention.
Referring to fig. 5, the light field image spatial domain super-resolution system of this embodiment includes: an alignment scheme setting module 1, a viewpoint image alignment module 2, a viewpoint image model training module 3 and a viewpoint image super-resolution module 4; wherein:
the alignment scheme setting module 1 is used for: dividing the viewpoint images into a plurality of groups according to their angular positions, and correspondingly setting a plurality of viewpoint image alignment schemes with different alignment ranges;
the viewpoint image alignment module 2 is used for: taking the viewpoint image at each angular position as the central viewpoint image to be super-resolved, and, according to the alignment scheme set by the alignment scheme setting module 1, using a parallax extraction method based on frequency domain image pyramid decomposition to parallax-align the viewpoint images to be aligned within the preset alignment range around it to the position of the central viewpoint image, obtaining a group of aligned viewpoint images;
the viewpoint image model training module 3 is used for: first cropping the viewpoint images to be trained into image blocks of a preset size, then downsampling them, aligning them with the viewpoint image alignment module 2, and feeding them to a deep learning network for training, obtaining light field image super-resolution models corresponding to the different alignment schemes;
the viewpoint image super-resolution module 4 is used for: inputting the viewpoint images to be tested, aligned by the viewpoint image alignment module 2, into the light field image super-resolution model obtained by the viewpoint image model training module 3 that corresponds to their alignment scheme, and predicting the viewpoint image after spatial domain super-resolution.
In a preferred embodiment, the alignment scheme setting module is configured to divide the 7×7 light field viewpoint images into 6 groups according to angular positions, and correspondingly set 6 viewpoint image alignment schemes;
wherein the alignment ranges of the 6 alignment schemes include: 3×3, 5×5 and 7×7.
In a preferred embodiment, the viewpoint image alignment module is configured to:
take the viewpoint image at each angular position as the central viewpoint image to be super-resolved when performing the alignment operation, and, according to the alignment scheme set by the alignment scheme setting module, perform parallax alignment on the viewpoint images to be aligned in four directions within the preset alignment range around the central viewpoint image; viewpoint images that are missing because the preset alignment range extends beyond the boundary are replaced by all-zero matrices;
wherein the four directions are 0 degrees, 45 degrees, 90 degrees and 135 degrees respectively;
the parallax alignment specifically includes: calculating the phase difference between the reference viewpoint image and the viewpoint image to be aligned with a parallax extraction method based on frequency domain image pyramid decomposition, taking the reference viewpoint image as the "1" phase and the viewpoint image to be aligned as the "0" phase; the phase difference is taken as the parallax information and added to the original phase of the viewpoint image to be aligned.
In a preferred embodiment, the viewpoint image model training module includes: a to-be-trained cropping module, a to-be-trained downsampling module, a to-be-trained alignment module and a training module; wherein:
the to-be-trained cropping module is used for cropping the high-resolution viewpoint images to be trained into high-resolution image blocks of the same size;
the to-be-trained downsampling module is used for downsampling the image blocks of the high-resolution viewpoint images to be trained obtained by the to-be-trained cropping module into image blocks of low-resolution viewpoint images to be trained by bicubic interpolation;
the to-be-trained alignment module is used for aligning the image blocks of the low-resolution viewpoint images to be trained obtained by the to-be-trained downsampling module with the viewpoint image alignment module, obtaining a group of aligned low-resolution viewpoint image blocks to be trained;
the training module is used for inputting the group of aligned low-resolution viewpoint image blocks to be trained obtained by the to-be-trained alignment module, together with the image blocks of the non-downsampled high-resolution central viewpoint image to be trained, into the deep learning network for training, obtaining light field image super-resolution models corresponding to the 6 different alignment schemes.
In a preferred embodiment, the viewpoint image super-resolution module includes: a to-be-tested downsampling module, a to-be-tested alignment module and a super-resolution module; wherein:
the to-be-tested downsampling module is used for downsampling the high-resolution viewpoint images to be tested into low-resolution viewpoint images to be tested;
the to-be-tested alignment module is used for aligning the low-resolution viewpoint images to be tested obtained by the to-be-tested downsampling module with the viewpoint image alignment module, obtaining a group of aligned low-resolution viewpoint images to be tested;
the super-resolution module is used for inputting the group of aligned low-resolution viewpoint images to be tested obtained by the to-be-tested alignment module into the light field image super-resolution model obtained by the viewpoint image model training module that corresponds to the alignment scheme, and predicting the viewpoint image after spatial domain super-resolution.
The light field image spatial domain super-resolution method and system provided by the embodiments of the invention exploit the parallax relation between the light field viewpoint images to perform phase-based parallax alignment of the viewpoint images, and combine it with deep-learning-based light field image super-resolution, thereby obtaining a higher-quality super-resolution result.
Experiments were performed on the test dataset (30 real light field images) provided by Kalantari et al., mentioned above, to evaluate the proposed method. The experimental environment was a Win10 system with the MATLAB R2016b platform, PyTorch 1.3.0 and Python 3.7. The objective evaluation indexes are PSNR (Peak Signal to Noise Ratio) and SSIM (Structural Similarity); the higher the PSNR and SSIM (0-1) values, the better the performance of the method.
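Both indexes are available off the shelf; a minimal evaluation helper, assuming grayscale images normalized to [0, 1], could be:

```python
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(sr, hr):
    """PSNR / SSIM of a super-resolved view against its ground truth."""
    psnr = peak_signal_noise_ratio(hr, sr, data_range=1.0)
    ssim = structural_similarity(hr, sr, data_range=1.0)
    return psnr, ssim
```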
Table 1 compares the objective indexes of the proposed algorithm with those of the same algorithm without the viewpoint image alignment step and of bicubic interpolation upsampling. As the table shows, the proposed algorithm performs better on both PSNR and SSIM; adding the viewpoint image alignment step effectively improves the super-resolution quality of the light field image.
Table 1 objective evaluation index comparison
The experiment shows that the spatial domain super-resolution method and the spatial domain super-resolution system for the light field image can effectively improve the super-resolution quality of the light field image.
Based on the above embodiments, in another embodiment, the present invention further provides a terminal, including a memory, a processor, and a computer program stored on the memory and capable of running on the processor, where the processor is configured to implement the light field image spatial domain super resolution method in any of the above embodiments when the processor executes the program.
Based on the above embodiments, in another embodiment, the present invention further provides a computer readable storage medium having stored thereon a computer program which, when executed, is configured to implement the light field image spatial domain super resolution method in any of the above embodiments.
It should be noted that, the steps in the method provided by the present invention may be implemented by using corresponding modules, devices, units, etc. in the system, and those skilled in the art may refer to a technical solution of the method to implement the composition of the system, that is, the embodiment in the method may be understood as a preferred example of constructing the system, which is not described herein.
Those skilled in the art will appreciate that the invention provides a system and its individual devices that can be implemented entirely by logic programming of method steps, in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers, etc., in addition to the system and its individual devices being implemented in pure computer readable program code. Therefore, the system and various devices thereof provided by the present invention may be considered as a hardware component, and the devices included therein for implementing various functions may also be considered as structures within the hardware component; means for achieving the various functions may also be considered as being either a software module that implements the method or a structure within a hardware component.
The embodiments disclosed herein were chosen and described in detail in order to best explain the principles of the invention and the practical application, and to thereby not limit the invention. Any modifications or variations within the scope of the description that would be apparent to a person skilled in the art are intended to be included within the scope of the invention.

Claims (10)

1. A light field image spatial domain super-resolution method, comprising:
S11: dividing the viewpoint images into a plurality of groups according to their angular positions, and correspondingly setting a plurality of viewpoint image alignment schemes with different alignment ranges;
S12: taking the viewpoint image at each angular position as the central viewpoint image to be super-resolved, and, according to the alignment scheme set in S11, using a parallax extraction method based on frequency domain image pyramid decomposition to parallax-align the viewpoint images to be aligned within the preset alignment range around it to the position of the central viewpoint image, obtaining a group of aligned viewpoint images;
S13: cropping the viewpoint images to be trained into image blocks of a preset size, downsampling them, aligning them according to S12, and feeding them to a deep learning network for training, obtaining light field image super-resolution models corresponding to the different alignment schemes;
S14: aligning the viewpoint images to be tested according to S12, inputting them into the light field image super-resolution model obtained in S13 that corresponds to their alignment scheme, and predicting the viewpoint image after spatial domain super-resolution;
wherein the step S12 specifically includes:
taking the viewpoint image at each angular position as the central viewpoint image to be super-resolved, and performing parallax alignment on the viewpoint images in four directions within the preset alignment range around it according to the alignment scheme set in S11; viewpoint images that are missing because the preset alignment range extends beyond the boundary are replaced by all-zero matrices;
wherein the four directions are 0 degrees, 45 degrees, 90 degrees and 135 degrees respectively;
the parallax alignment specifically includes: calculating the phase difference between the reference viewpoint image and the viewpoint image to be aligned with a parallax extraction method based on frequency domain image pyramid decomposition, taking the reference viewpoint image as the "1" phase and the viewpoint image to be aligned as the "0" phase; the phase difference is taken as the parallax information and added to the original phase of the viewpoint image to be aligned.
2. The light field image spatial domain super resolution method according to claim 1, wherein S11 specifically comprises: dividing 7×7 light field viewpoint images into 6 groups according to angular positions, and correspondingly setting 6 viewpoint image alignment schemes;
wherein the alignment ranges of the 6 viewpoint image alignment schemes include: 3×3, 5×5 and 7×7.
3. The light field image spatial domain super-resolution method according to claim 1, wherein S13 specifically comprises:
S131: cropping the high-resolution viewpoint images to be trained into high-resolution image blocks of the same size;
S132: downsampling the image blocks of the high-resolution viewpoint images to be trained obtained in S131 into image blocks of low-resolution viewpoint images to be trained by bicubic interpolation;
S133: aligning the image blocks of the low-resolution viewpoint images to be trained obtained in S132 according to S12, obtaining a group of aligned low-resolution viewpoint image blocks to be trained;
S134: inputting the group of aligned low-resolution viewpoint image blocks to be trained obtained in S133, together with the image blocks of the non-downsampled high-resolution central viewpoint image to be trained, into a deep learning network for training, obtaining light field image super-resolution models corresponding to 6 different alignment schemes.
4. The light field image spatial domain super-resolution method according to claim 1, wherein S14 specifically comprises:
S141: downsampling the high-resolution viewpoint images to be tested into low-resolution viewpoint images to be tested;
S142: aligning the low-resolution viewpoint images to be tested obtained in S141 according to S12, obtaining a group of aligned low-resolution viewpoint images to be tested;
S143: inputting the group of aligned low-resolution viewpoint images to be tested obtained in S142 into the light field image super-resolution model obtained in S13 that corresponds to the alignment scheme, and predicting the viewpoint image after spatial domain super-resolution.
5. A light field image spatial domain super-resolution system, comprising: an alignment scheme setting module, a viewpoint image alignment module, a viewpoint image model training module and a viewpoint image super-resolution module; wherein:
the alignment scheme setting module is used for: dividing the viewpoint images into a plurality of groups according to their angular positions, and correspondingly setting a plurality of viewpoint image alignment schemes with different alignment ranges;
the viewpoint image alignment module is used for: taking the viewpoint image at each angular position as the central viewpoint image to be super-resolved, and, according to the alignment scheme set by the alignment scheme setting module, using a parallax extraction method based on frequency domain image pyramid decomposition to parallax-align the viewpoint images to be aligned within the preset alignment range around it to the position of the central viewpoint image, obtaining a group of aligned viewpoint images;
the viewpoint image model training module is used for: first cropping the viewpoint images to be trained into image blocks of a preset size, then downsampling them, aligning them with the viewpoint image alignment module, and feeding them to a deep learning network for training, obtaining light field image super-resolution models corresponding to the different alignment schemes;
the viewpoint image super-resolution module is used for: inputting the viewpoint images to be tested, aligned by the viewpoint image alignment module, into the light field image super-resolution model obtained by the viewpoint image model training module that corresponds to their alignment scheme, and predicting the viewpoint image after spatial domain super-resolution;
wherein the viewpoint image alignment module is specifically used for:
taking the viewpoint image at each angular position as the central viewpoint image to be super-resolved, and, according to the alignment scheme set by the alignment scheme setting module, performing parallax alignment on the viewpoint images to be aligned in four directions within the preset alignment range around the central viewpoint image; viewpoint images that are missing because the preset alignment range extends beyond the boundary are replaced by all-zero matrices;
wherein the four directions are 0 degrees, 45 degrees, 90 degrees and 135 degrees respectively;
the parallax alignment specifically includes: calculating the phase difference between the reference viewpoint image and the viewpoint image to be aligned with a parallax extraction method based on frequency domain image pyramid decomposition, taking the reference viewpoint image as the "1" phase and the viewpoint image to be aligned as the "0" phase; the phase difference is taken as the parallax information and added to the original phase of the viewpoint image to be aligned.
6. The spatial domain super-resolution system of a light field image according to claim 5, wherein the alignment scheme setting module is configured to divide the 7 x 7 light field viewpoint images into 6 groups according to angular positions, and correspondingly set 6 viewpoint image alignment schemes;
wherein the alignment ranges of the 6 viewpoint image alignment schemes include: 3×3, 5×5 and 7×7.
7. The light field image spatial domain super-resolution system according to claim 5, wherein the viewpoint image model training module comprises: a to-be-trained cropping module, a to-be-trained downsampling module, a to-be-trained alignment module and a training module; wherein:
the to-be-trained cropping module is used for: cropping the high-resolution viewpoint images to be trained into high-resolution image blocks of the same size;
the to-be-trained downsampling module is used for: downsampling the image blocks of the high-resolution viewpoint images to be trained obtained by the to-be-trained cropping module into image blocks of low-resolution viewpoint images to be trained by bicubic interpolation;
the to-be-trained alignment module is used for: aligning the image blocks of the low-resolution viewpoint images to be trained obtained by the to-be-trained downsampling module with the viewpoint image alignment module, obtaining a group of aligned low-resolution viewpoint image blocks to be trained;
the training module is used for: inputting the group of aligned low-resolution viewpoint image blocks to be trained obtained by the to-be-trained alignment module, together with the image blocks of the non-downsampled high-resolution central viewpoint image to be trained, into a deep learning network for training, obtaining light field image super-resolution models corresponding to 6 different alignment schemes.
8. The light field image spatial domain super-resolution system according to claim 5, wherein the viewpoint image super-resolution module comprises: a to-be-tested downsampling module, a to-be-tested alignment module and a super-resolution module; wherein:
the to-be-tested downsampling module is used for downsampling the high-resolution viewpoint images to be tested into low-resolution viewpoint images to be tested;
the to-be-tested alignment module is used for aligning the low-resolution viewpoint images to be tested obtained by the to-be-tested downsampling module with the viewpoint image alignment module, obtaining a group of aligned low-resolution viewpoint images to be tested;
the super-resolution module is used for inputting the group of aligned low-resolution viewpoint images to be tested obtained by the to-be-tested alignment module into the light field image super-resolution model obtained by the viewpoint image model training module that corresponds to the alignment scheme, and predicting the viewpoint image after spatial domain super-resolution.
9. A terminal, comprising: memory, a processor and a computer program stored on the memory and executable on the processor for implementing the light field image spatial domain super resolution method according to any one of claims 1 to 4 when said computer program is executed.
10. A computer readable storage medium having stored thereon a computer program, which when executed is adapted to implement the light field image spatial domain super resolution method of any of claims 1 to 4.
CN202110906481.2A 2021-08-09 2021-08-09 Light field image space domain super-resolution method, system, terminal and storage medium Active CN113592716B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110906481.2A CN113592716B (en) 2021-08-09 2021-08-09 Light field image space domain super-resolution method, system, terminal and storage medium


Publications (2)

Publication Number Publication Date
CN113592716A CN113592716A (en) 2021-11-02
CN113592716B (en) 2023-08-01

Family

ID=78256198

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110906481.2A Active CN113592716B (en) 2021-08-09 2021-08-09 Light field image space domain super-resolution method, system, terminal and storage medium

Country Status (1)

Country Link
CN (1) CN113592716B (en)


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106464853B (en) * 2014-05-21 2019-07-16 索尼公司 Image processing equipment and method

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108074218A (en) * 2017-12-29 2018-05-25 清华大学 Image super-resolution method and device based on optical field acquisition device
CN112102165A (en) * 2020-08-18 2020-12-18 北京航空航天大学 Light field image angular domain super-resolution system and method based on zero sample learning
CN112070675A (en) * 2020-09-07 2020-12-11 武汉工程大学 Regularization light field super-resolution method based on graph and light field microscopic device
CN112785502A (en) * 2021-01-25 2021-05-11 江南大学 Light field image super-resolution method of hybrid camera based on texture migration

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Light Field Spatial Super-resolution via Deep Combinatorial Geometry Embedding and Structural Consistency Regularization; Jing Jin et al.; IEEE; full text *
Light field image super-resolution method fusing multi-scale features; Zhao Yuanyuan et al.; Opto-Electronic Engineering; Vol. 47, No. 12; full text *

Also Published As

Publication number Publication date
CN113592716A (en) 2021-11-02

Similar Documents

Publication Publication Date Title
Lee et al. Local texture estimator for implicit representation function
CN108475415B (en) Method and system for image processing
Rajan et al. Simultaneous estimation of super-resolved scene and depth map from low resolution defocused observations
CN111105352A (en) Super-resolution image reconstruction method, system, computer device and storage medium
Guo et al. Deep spatial-angular regularization for light field imaging, denoising, and super-resolution
CN111626927B (en) Binocular image super-resolution method, system and device adopting parallax constraint
KR102188035B1 (en) Learning method and apparatus for improved resolution of satellite images
Zhang et al. Micro-lens image stack upsampling for densely-sampled light field reconstruction
CN115937794B (en) Small target object detection method and device, electronic equipment and storage medium
Zhou et al. AIF-LFNet: All-in-focus light field super-resolution method considering the depth-varying defocus
Chen et al. Deep light field super-resolution using frequency domain analysis and semantic prior
JP3699921B2 (en) Image reconstruction method and image reconstruction apparatus
Deng et al. Multiple frame splicing and degradation learning for hyperspectral imagery super-resolution
CN115147271A (en) Multi-view information attention interaction network for light field super-resolution
Huang et al. Light-field reconstruction and depth estimation from focal stack images using convolutional neural networks
CN110335228B (en) Method, device and system for determining image parallax
CN114359041A (en) Light field image space super-resolution reconstruction method
CN113592716B (en) Light field image space domain super-resolution method, system, terminal and storage medium
CN117115200A (en) Hierarchical data organization for compact optical streaming
CN107392986A (en) A kind of image depth rendering intent based on gaussian pyramid and anisotropic filtering
Schirrmacher et al. Sr 2: Super-resolution with structure-aware reconstruction
Burt A pyramid-based front-end processor for dynamic vision applications
CN115220211A (en) Microscopic imaging system and method based on deep learning and light field imaging
CN101742088A (en) Non-local mean space domain time varying video filtering method
Fang et al. Light field reconstruction with a hybrid sparse regularization-pseudo 4DCNN framework

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant