CN108074218B - Image super-resolution method and device based on light field acquisition device - Google Patents

Image super-resolution method and device based on light field acquisition device

Info

Publication number
CN108074218B
CN108074218B (Application CN201711474559.8A)
Authority
CN
China
Prior art keywords
resolution
image
super
light field
acquisition device
Prior art date
Legal status
Active
Application number
CN201711474559.8A
Other languages
Chinese (zh)
Other versions
CN108074218A (en)
Inventor
刘烨斌
王玉旺
戴琼海
Current Assignee
Tsinghua University
Original Assignee
Tsinghua University
Priority date
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN201711474559.8A
Publication of CN108074218A
Application granted
Publication of CN108074218B

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 — Geometric image transformation in the plane of the image
    • G06T3/40 — Scaling the whole image or part thereof
    • G06T3/4053 — Super resolution, i.e. output image resolution higher than sensor resolution
    • G06T3/4076 — Super resolution, i.e. output image resolution higher than sensor resolution by iteratively correcting the provisional high resolution image using the original low-resolution image
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 — Image analysis
    • G06T7/50 — Depth or shape recovery
    • G06T7/55 — Depth or shape recovery from multiple images
    • G06T7/557 — Depth or shape recovery from multiple images from light fields, e.g. from plenoptic cameras
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 — Indexing scheme for image analysis or image enhancement
    • G06T2207/10 — Image acquisition modality
    • G06T2207/10052 — Images from lightfield camera
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 — Indexing scheme for image analysis or image enhancement
    • G06T2207/20 — Special algorithmic details
    • G06T2207/20021 — Dividing image into blocks, subimages or windows
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 — Indexing scheme for image analysis or image enhancement
    • G06T2207/20 — Special algorithmic details
    • G06T2207/20024 — Filtering details
    • G06T2207/20032 — Median filtering
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 — Indexing scheme for image analysis or image enhancement
    • G06T2207/20 — Special algorithmic details
    • G06T2207/20081 — Training; Learning
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 — Indexing scheme for image analysis or image enhancement
    • G06T2207/20 — Special algorithmic details
    • G06T2207/20084 — Artificial neural networks [ANN]
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 — Indexing scheme for image analysis or image enhancement
    • G06T2207/20 — Special algorithmic details
    • G06T2207/20228 — Disparity calculation for image-based rendering

Abstract

The invention discloses an image super-resolution method and device based on a light field acquisition device. The acquisition device comprises a plurality of USB cameras and a high-resolution camera forming a light field array with 3 × 3 viewing angles, with the side-view USB cameras regularly arranged in a square pattern around the central camera. The method comprises the following steps: acquiring light field images to obtain a plurality of side-view low-resolution images and a middle-view high-resolution image; performing super-resolution on the acquired side-view low-resolution images using dictionary learning and deep learning; and obtaining the depth information of the scene from the parallax between light field images at different viewing angles. The method predicts and restores the high-frequency content of the input images from single-view and multi-view information and computes scene depth from the multi-view high-resolution images, thereby reducing manufacturing cost while preserving spatial and angular resolution.

Description

Image super-resolution method and device based on light field acquisition device
Technical Field
The invention relates to the technical field of computer vision, in particular to an image super-resolution method and device based on a light field acquisition device.
Background
Light field acquisition and reconstruction is an important problem in computer vision, and three-dimensional reconstruction from light fields has clear advantages over traditional methods: it depends little on hardware resources and can run in real time on a PC, and it is broadly applicable, since the complexity of the scene has little effect on the computational complexity. By contrast, although a three-dimensional scanner can perform high-precision reconstruction, its practical use is limited by expensive equipment and a narrow range of applications. Light field techniques are widely used in lighting engineering, light field rendering, relighting, refocused photography, synthetic aperture imaging, 3D display, security monitoring, and other applications.
In the related art, light field acquisition devices fall mainly into two categories. The first uses camera arrays, most commonly spherical or planar/linear arrays, and typically requires tens or hundreds of cameras placed at appropriate positions in the scene to capture it simultaneously. The second uses lens arrays to capture photos of the scene at different depths of field in a single shot, enabling refocusing over any range of the scene; such light field cameras are already on the market and in commercial use. Both kinds of device usually place high demands on the spatial resolution of the camera, so the development of light field cameras is constrained by high hardware cost.
One of the core problems in light field three-dimensional reconstruction is improving the resolution of the light field images. Since the resolution of the acquired images directly affects the computation of scene depth, super-resolving them improves both depth estimation and the accuracy of three-dimensional reconstruction. High-resolution image information allows the scene to be modelled in three dimensions, on top of which very useful applications such as virtual imaging of the scene from any viewpoint under any illumination, image segmentation, and three-dimensional display become possible. Traditional light field super-resolution algorithms are mainly based on multi-view dictionary learning: they establish correspondences by extracting dictionary information from images at different viewpoints and resolutions, and super-resolve the low-resolution images by a weighted combination of high-resolution image blocks. Because a light field acquisition device is used to capture a large angular range, images at different viewing angles often exhibit large parallax; dictionary learning alone cannot super-resolve effectively in this situation, producing ghosting, blur, and similar artifacts that degrade the information carried by the light field image and in turn the accuracy of subsequent scene reconstruction. This problem needs to be solved.
Disclosure of Invention
The present invention is directed to solving, at least to some extent, one of the technical problems in the related art.
Therefore, one object of the invention is to provide an image super-resolution method based on a light field acquisition device, which can greatly reduce manufacturing cost, preserve spatial and angular resolution, and enable refocusing of the light field camera.
Another object of the invention is to provide an image super-resolution device based on the light field acquisition device.
To achieve the above object, an embodiment of one aspect of the present invention provides an image super-resolution method based on a light field acquisition device, where the acquisition device comprises a plurality of USB cameras and a high-resolution camera forming a light field array with 3 × 3 viewing angles, with the side-view USB cameras regularly arranged in a square pattern around the central camera. The method comprises the following steps: acquiring light field images through the light field acquisition device to obtain a plurality of side-view low-resolution images and a middle-view high-resolution image; performing super-resolution on the acquired side-view low-resolution images using dictionary learning and deep learning; and obtaining the depth information of the scene from the parallax between light field images at different viewing angles.
According to the image super-resolution method based on the light field acquisition device of the embodiment of the invention, the high-frequency content of the input images is predicted and restored from single-view and multi-view information, and the scene depth is then computed from the multi-view high-resolution images. The method can be used for scene reconstruction and large-scene monitoring; it greatly reduces manufacturing cost, preserves spatial and angular resolution, and enables refocusing of the light field camera.
In addition, the image super-resolution method based on the light field acquisition device according to the above embodiment of the present invention may further have the following additional technical features:
further, in an embodiment of the present invention, the super-resolution of the acquired plurality of side-view low-resolution images by using dictionary learning and deep learning further includes: the intermediate view angle high-resolution image is subjected to down sampling, the super-resolution of the view angle is carried out through a trained convolutional neural network by utilizing the deep learning method, and a residual image is obtained by subtracting the image before the down sampling so as to reflect the residual of the super-resolution of the neural network; extracting dictionary information from the multiple side-view low-resolution images and the middle high-resolution image by using image partitioning, and making information correspondence between low-resolution image blocks and high-resolution image blocks; and performing initial super-resolution on each low-resolution image of the side view through the convolutional neural network, converting the residual image into the side view by utilizing the information corresponding relation, and adding the result to a super-resolution result obtained preliminarily through the neural network to obtain a final super-resolution result.
Further, in an embodiment of the present invention, the residual image block corresponding to each position is obtained by the following formula:

$$S_{ERR}^{j} = \sum_{k=1}^{9} \omega_{k}\, e_{R,j}^{k}$$

where $S_{ERR}^{j}$ is the residual block used to reconstruct the high-resolution image block at the side view, $k$ indexes the 9 most similar middle high-resolution image blocks selected, $\omega_{k}$ is the weight of the $k$-th block, and $e_{R,j}^{k}$ is the residual feature of the $k$-th middle high-resolution block used to form the $j$-th side-view high-resolution block; $R$ denotes the middle high-resolution image and $j$ is the index of the side-view image block.
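As a concrete illustration of the weighted reconstruction above, the sketch below selects the nearest dictionary features under the L2 norm and forms the residual block as their weighted sum. The function names and the flat-list patch representation are illustrative assumptions, not the patent's implementation.

```python
import math

def nearest_neighbors(feature, dictionary, k=9):
    """Indices of the k dictionary features closest to `feature` (L2 norm),
    as used to pick the 9 most similar middle-view blocks."""
    order = sorted(range(len(dictionary)),
                   key=lambda i: math.dist(feature, dictionary[i]))
    return order[:k]

def residual_block(weights, residual_candidates):
    """Weighted sum S_ERR^j = sum_k w_k * e_{R,j}^k over flattened patches."""
    block = [0.0] * len(residual_candidates[0])
    for w, cand in zip(weights, residual_candidates):
        for i, v in enumerate(cand):
            block[i] += w * v
    return block
```

In practice the weights would be normalised over the 9 neighbours; the formula in the text leaves their derivation unspecified.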
Further, in an embodiment of the present invention, the final super-resolution result is:

$$S_{HR} = f_{CNN}(S_{LR}) + S_{ERR}$$

where $S_{HR}$ is the reconstructed super-resolution image block, $HR$ denotes super-resolution, $f_{CNN}(S_{LR})$ is the super-resolution of the low-resolution image block through the neural network, $S_{ERR}$ is the residual image obtained by subtraction from the image before down-sampling, and $LR$ denotes low resolution.
Further, in an embodiment of the present invention, obtaining the depth information of the scene from the parallax between the light field images at different viewing angles further includes: computing the edge confidence of each light field image to obtain an edge confidence mask; computing the disparity values of the pixels marked as confident edges according to the edge confidence mask; filtering the initial disparity map with joint bilateral median filtering to obtain the disparity values of pixels in non-edge regions and of pixels whose disparity confidence is below a preset threshold; and generating a disparity map from the disparity value of each pixel.
To achieve the above object, an embodiment of another aspect of the present invention provides an image super-resolution device based on a light field acquisition device, where the acquisition device comprises a plurality of USB cameras and a high-resolution camera forming a light field array with 3 × 3 viewing angles, with the side-view USB cameras regularly arranged in a square pattern around the central camera. The device includes: an acquisition module configured to acquire light field images through the light field acquisition device to obtain a plurality of side-view low-resolution images and a middle-view high-resolution image; a super-resolution module configured to super-resolve the acquired side-view low-resolution images using dictionary learning and deep learning; and an obtaining module configured to obtain the depth information of the scene from the parallax between the light field images at different viewing angles.
The image super-resolution device based on the light field acquisition device of the embodiment of the invention predicts and restores the high-frequency content of the input images from single-view and multi-view information and then computes the scene depth from the multi-view high-resolution images. It can be used for scene reconstruction and large-scene monitoring; it greatly reduces manufacturing cost, preserves spatial and angular resolution, and enables refocusing of the light field camera.
In addition, the image super-resolution device based on the light field acquisition device according to the above embodiment of the present invention may further have the following additional technical features:
further, in an embodiment of the present invention, the super-resolution module further includes: the computing unit is used for performing down-sampling on the intermediate view angle high-resolution image, performing super-resolution on the view angle through a trained convolutional neural network by using the deep learning method, and subtracting the image before down-sampling to obtain a residual image so as to reflect the residual of the super-resolution of the neural network; the extraction unit is used for extracting dictionary information from the plurality of side-view low-resolution images and the middle high-resolution image by using image partitioning, and performing information correspondence between low-resolution image blocks and high-resolution image blocks; and the first acquisition unit is used for performing initial super-resolution on the low-resolution image of each side view angle through the convolutional neural network, converting the residual image into the side view angle by using the information corresponding relation, and adding the residual image and the preliminarily obtained super-resolution result of the neural network to acquire a final super-resolution result.
Further, in an embodiment of the present invention, the residual image block corresponding to each position is obtained by the following formula:

$$S_{ERR}^{j} = \sum_{k=1}^{9} \omega_{k}\, e_{R,j}^{k}$$

where $S_{ERR}^{j}$ is the residual block used to reconstruct the high-resolution image block at the side view, $k$ indexes the 9 most similar middle high-resolution image blocks selected, $\omega_{k}$ is the weight of the $k$-th block, and $e_{R,j}^{k}$ is the residual feature of the $k$-th middle high-resolution block used to form the $j$-th side-view high-resolution block; $R$ denotes the middle high-resolution image and $j$ is the index of the side-view image block.
Further, in an embodiment of the present invention, the final super-resolution result is:

$$S_{HR} = f_{CNN}(S_{LR}) + S_{ERR}$$

where $S_{HR}$ is the reconstructed super-resolution image block, $HR$ denotes super-resolution, $f_{CNN}(S_{LR})$ is the super-resolution of the low-resolution image block through the neural network, $S_{ERR}$ is the residual image obtained by subtraction from the image before down-sampling, and $LR$ denotes low resolution.
Further, in an embodiment of the present invention, the obtaining module further includes: a second acquisition unit configured to compute the edge confidence of each light field image to obtain an edge confidence mask; a third acquisition unit configured to compute the disparity values of the pixels marked as confident edges according to the edge confidence mask; a fourth acquisition unit configured to filter the initial disparity map with joint bilateral median filtering to obtain the disparity values of pixels in non-edge regions and of pixels whose disparity confidence is below a preset threshold; and a generating unit configured to generate a disparity map from the disparity value of each pixel.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 is a schematic structural diagram of a light field acquisition device according to an embodiment of the present invention;
FIG. 2 is a flow chart of an image super-resolution method based on a light field acquisition device according to an embodiment of the present invention;
fig. 3 is a flowchart of a three-dimensional reconstruction method of a super-resolution light field acquisition device according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of an image super-resolution device based on a light field acquisition device according to an embodiment of the invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, where like or similar reference numerals refer to the same or similar elements or to elements having the same or similar functions throughout. The embodiments described below with reference to the drawings are illustrative, intended to explain the invention, and are not to be construed as limiting it.
Before the image super-resolution method and device based on the light field acquisition device of the embodiment of the present invention are introduced, the light field acquisition device of the embodiment of the present invention is first introduced.
The light field acquisition device shown in fig. 1 includes a plurality of USB cameras and a high-resolution camera forming a light field acquisition device with 3 × 3 viewing angles, with the side-view USB cameras regularly arranged in a square pattern around the central camera.
Specifically, the light field acquisition device may include: 8 non-professional low-cost USB cameras, 1 high-resolution camera, and 1 custom aluminium frame. The 8 low-cost side-view cameras are regularly arranged in a square around the central camera, with a spacing of 60 mm between adjacent cameras on the same edge, so that the 9 cameras form a sparse light field acquisition device with 3 × 3 viewing angles. For the high-resolution camera, a Canon 600D single-lens reflex camera can be adopted in the embodiment of the invention.
Before images are acquired with the light field acquisition device of the embodiment of the invention, the focus of each camera is set and the intrinsic parameters of the whole system are calibrated. Each side-view image is then projected onto a reference plane parallel to the middle-view image, and the images are rectified into a light field image according to the calibration result. All the side-view images thus obtained are distributed on a 3 × 3 grid with uniform spacing.
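Rectification maps each side view onto a plane parallel to the middle view. The sketch below uses a pure integer-pixel translation as a minimal stand-in for the calibration-based homography warp (the function name and fill value are illustrative assumptions):

```python
def shift_image(img, dx, dy, fill=0):
    """Translate a 2-D list-of-lists image by (dx, dy) pixels, padding with
    `fill`; a degenerate special case of the calibration-based warp that
    aligns each side view with the middle view."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            sr, sc = r - dy, c - dx  # source pixel before the shift
            if 0 <= sr < h and 0 <= sc < w:
                out[r][c] = img[sr][sc]
    return out
```

A real implementation would apply the full planar homography estimated during calibration rather than a translation.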
In the subsequent image-processing flow, the overlapping regions of the images acquired by all the cameras need to be cropped. After cropping, the resolution of the middle-view image acquired by the high-resolution camera is 2944 × 1808, and the resolution of each side-view image is 368 × 226, i.e. 1/8 of the middle-view resolution in each dimension.
In summary, the light field acquisition device of the embodiment of the present invention is composed of 8 non-professional USB cameras and 1 high-resolution camera and realizes multi-view acquisition of a scene. After rectification, it can capture a three-dimensional sparse light field with relatively high angular resolution, with fast and efficient acquisition.
The image super-resolution method and device based on the light field acquisition device according to the embodiments of the present invention will be described below with reference to the accompanying drawings, and first, the image super-resolution method based on the light field acquisition device according to the embodiments of the present invention will be described with reference to the accompanying drawings.
Fig. 2 is a flowchart of an image super-resolution method based on a light field acquisition device according to an embodiment of the present invention.
As shown in fig. 2, the image super-resolution method based on the light field acquisition device comprises the following steps:
in step S201, a light field image is acquired by a light field acquisition device to acquire a plurality of side-view low resolution images and a plurality of middle-view high resolution images.
It can be understood that, as shown in fig. 3, the embodiment of the present invention takes 1 side-view low-resolution image S and the middle-view high-resolution image R as input, and the procedure is applied in turn to all 8 side views. The light field images are acquired through the light field acquisition device to obtain the side-view low-resolution images and the middle-view high-resolution image.
In step S202, the acquired plurality of side-view low-resolution images are super-resolved using dictionary learning and deep learning.
Further, in an embodiment of the present invention, super-resolving the acquired side-view low-resolution images using dictionary learning and deep learning further includes: down-sampling the middle-view high-resolution image, super-resolving it back through a trained convolutional neural network, and subtracting the result from the image before down-sampling to obtain a residual image that reflects the residual of the network's super-resolution; extracting dictionary information from the side-view low-resolution images and the middle high-resolution image using image blocking, and establishing correspondences between low-resolution and high-resolution image blocks; and performing an initial super-resolution of each side-view low-resolution image through the convolutional neural network, transferring the residual image to the side view using the established correspondences, and adding it to the preliminary network result to obtain the final super-resolution result.
It can be understood that, in the embodiment of the present invention, dictionary information can be extracted from the low-resolution images of the 8 surrounding viewing angles and the middle high-resolution image using image blocking, and correspondences are established between low-resolution and high-resolution image blocks.
Specifically, the dictionary information of the middle-view high-resolution image $R$ is $D_R = \{f_{R,1}, \ldots, f_{R,N}\}$, where $f_{R,i}$ ($i = 1, 2, \ldots, N$) are the first- and second-order gradient results of the image blocks extracted from $R$. Similarly, dictionary information can be obtained in the same way from the residual image $R_{ERR}$ at the corresponding positions, denoted $\{e_{R,1}, \ldots, e_{R,N}\}$. For each image block position $j$ in a side-view low-resolution image, the first- and second-order gradients are computed, and the 9 nearest neighbours in $D_R$ under the $L_2$ norm are found, denoted $\{f_{R,j}^{1}, \ldots, f_{R,j}^{9}\}$; the corresponding 9 features in the residual dictionary are denoted $\{e_{R,j}^{1}, \ldots, e_{R,j}^{9}\}$.
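A minimal sketch of the dictionary features described above: concatenated first- and second-order gradients of an image block, reduced to one dimension for brevity (the patent computes them over 2-D blocks; the function names are illustrative assumptions).

```python
def gradient_features(patch):
    """Concatenated first- and second-order finite-difference gradients of a
    1-D patch, serving as the dictionary feature vector f_{R,i}."""
    first = [patch[i + 1] - patch[i] for i in range(len(patch) - 1)]
    second = [first[i + 1] - first[i] for i in range(len(first) - 1)]
    return first + second

def build_dictionary(patches):
    """D_R: one feature vector per extracted block."""
    return [gradient_features(p) for p in patches]
```

The residual dictionary $\{e_{R,i}\}$ would be built the same way from blocks of the residual image $R_{ERR}$.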
Further, in an embodiment of the present invention, the residual image block corresponding to each position is obtained by the following formula:

$$S_{ERR}^{j} = \sum_{k=1}^{9} \omega_{k}\, e_{R,j}^{k}$$

where $S_{ERR}^{j}$ is the residual block used to reconstruct the high-resolution image block at the side view, $k$ indexes the 9 most similar middle high-resolution image blocks selected, $\omega_{k}$ is the weight of the $k$-th block, and $e_{R,j}^{k}$ is the residual feature of the $k$-th middle high-resolution block used to form the $j$-th side-view high-resolution block; $R$ denotes the middle high-resolution image and $j$ is the index of the side-view image block.
Further, in one embodiment of the present invention, the final super-resolution result is:

$$S_{HR} = f_{CNN}(S_{LR}) + S_{ERR}$$

where $S_{HR}$ is the reconstructed super-resolution image block, $HR$ denotes super-resolution, $f_{CNN}(S_{LR})$ is the super-resolution of the low-resolution image block through the neural network, $S_{ERR}$ is the residual image obtained by subtraction from the image before down-sampling, and $LR$ denotes low resolution.
It can be understood that the embodiment of the invention performs an initial super-resolution of each side-view low-resolution image through the convolutional neural network, transfers the residual image to the side view using the established correspondences, and adds it to the network's result to obtain the final super-resolution result.
In particular, from the residual candidate dictionary $\{e_{R,j}^{1}, \ldots, e_{R,j}^{9}\}$, the residual image block corresponding to each position $j$ in $S$ is obtained by the weighted average $S_{ERR}^{j} = \sum_{k=1}^{9} \omega_{k}\, e_{R,j}^{k}$. In this way, a residual image corresponding to $S$ is estimated by dictionary learning and denoted $S_{ERR}$. Then, this residual image is added to the convolutional-neural-network super-resolution of the original side-view low-resolution image to obtain the final super-resolved light field image, i.e. $S_{HR} = f_{CNN}(S_{LR}) + S_{ERR}$.
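The final combination can be sketched as follows, with nearest-neighbour upsampling standing in for the trained network $f_{CNN}$ (a deliberate simplification; the real network's output would differ, and the function names are assumptions):

```python
def upsample_nn(img, scale):
    """Nearest-neighbour upsampling of a 2-D list-of-lists image; a crude
    stand-in for f_CNN, which the patent implements with a trained CNN."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(scale)]  # repeat columns
        out.extend(list(wide) for _ in range(scale))   # repeat rows
    return out

def super_resolve(side_lr, residual, scale):
    """S_HR = f_CNN(S_LR) + S_ERR: add the dictionary-estimated residual to
    the network's preliminary super-resolution result."""
    up = upsample_nn(side_lr, scale)
    return [[u + e for u, e in zip(ur, er)] for ur, er in zip(up, residual)]
```

With the resolutions given earlier, `scale` would be 8, taking a 368 × 226 side view up to the 2944 × 1808 middle-view resolution.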
In step S203, depth information of a scene is obtained according to parallax between a plurality of light field images of different viewing angles.
Further, in an embodiment of the present invention, obtaining the depth information of the scene from the parallax between the light field images at different viewing angles further includes: computing the edge confidence of each light field image to obtain an edge confidence mask; computing the disparity values of the pixels marked as confident edges according to the edge confidence mask; filtering the initial disparity map with joint bilateral median filtering to obtain the disparity values of pixels in non-edge regions and of pixels whose disparity confidence is below a preset threshold; and generating a disparity map from the disparity value of each pixel.
It can be understood that the light field depth information computation of the embodiment of the present invention specifically includes the following steps:
Step 1: compute the edge confidence of each image to obtain an edge confidence mask;
Step 2: compute the disparity values of the pixels marked as confident edges according to the edge confidence mask;
Step 3: filter the initial disparity map with joint bilateral median filtering;
Step 4: compute the disparity values of pixels in non-edge regions and of pixels whose disparity confidence is below a preset threshold;
Step 5: generate a disparity map from the disparity value of each pixel.
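The steps above can be sketched on a single scanline: an edge mask from a gradient threshold, and a fill step that is a crude stand-in for the joint bilateral median filtering of steps 3-4. The threshold, function names, and the global-median fill are illustrative assumptions, not the patent's method.

```python
import statistics

def edge_mask(row, thresh=1.0):
    """Mark pixels whose horizontal gradient magnitude exceeds `thresh` as
    confident edges (step 1, reduced to one scanline)."""
    mask = [False] * len(row)
    for i in range(1, len(row)):
        if abs(row[i] - row[i - 1]) > thresh:
            mask[i] = True
    return mask

def fill_disparity(disp, mask):
    """Keep disparities at confident edges and fill the remaining pixels with
    the median of the confident values -- a crude stand-in for the joint
    bilateral median filtering over a local neighbourhood."""
    confident = [d for d, m in zip(disp, mask) if m]
    fill = statistics.median(confident) if confident else 0.0
    return [d if m else fill for d, m in zip(disp, mask)]
```

A faithful implementation would weight the median by spatial and photometric distance to each neighbour, as joint bilateral median filtering requires.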
According to the image super-resolution method based on the light field acquisition device provided by the embodiment of the invention, the high-frequency content of the input images is predicted and restored from single-view and multi-view information, and the scene depth is then computed from the multi-view high-resolution images. The method can be used for scene reconstruction and large-scene monitoring; it greatly reduces manufacturing cost, preserves spatial and angular resolution, and enables refocusing of the light field camera.
Next, an image super-resolution device based on a light field acquisition device according to an embodiment of the present invention will be described with reference to the drawings.
Fig. 4 is a schematic structural diagram of an image super-resolution device based on a light field acquisition device according to an embodiment of the present invention.
As shown in fig. 4, the image super-resolution device 10 based on the light field acquisition device comprises: an acquisition module 100, a super-resolution module 200 and an obtaining module 300.
The acquisition module 100 is configured to acquire a light field image through a light field acquisition device to acquire a plurality of side-view low-resolution images and a plurality of middle-view high-resolution images. The super-resolution module 200 is configured to perform super-resolution on the acquired plurality of side-view low-resolution images by using dictionary learning and deep learning. The obtaining module 300 is configured to obtain depth information of a scene according to a parallax between a plurality of light field images at different viewing angles. The device 10 of the embodiment of the invention can predict and restore the high-frequency part of the input image through single-view and multi-view information, further calculate the depth of the scene by utilizing the information of the multi-view high-resolution image, reduce the manufacturing cost, ensure the accuracy of the spatial and angular resolutions and realize the refocusing of the light field camera.
Further, in an embodiment of the present invention, the super-resolution module further includes: a computing unit configured to down-sample the intermediate-view high-resolution image, super-resolve it through a trained convolutional neural network using the deep learning method, and subtract the result from the image before down-sampling to obtain a residual image reflecting the residual of the neural network super-resolution; an extraction unit configured to extract dictionary information from the plurality of side-view low-resolution images and the intermediate high-resolution image by image blocking, and to establish correspondences between low-resolution image blocks and high-resolution image blocks; and a first acquisition unit configured to perform initial super-resolution on the low-resolution image of each side view through the convolutional neural network, transfer the residual image to that side view using the established correspondences, and add it to the preliminary neural network super-resolution result to obtain the final super-resolution result.
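The three units above can be sketched as follows. This is an illustrative NumPy sketch under stated assumptions, not the patented implementation: nearest-neighbour upsampling stands in for the trained convolutional neural network, blocks are non-overlapping, and the block size, box-filter downsampling model, and inverse-distance similarity weights are hypothetical choices.

```python
import numpy as np

def downsample(img, f=2):
    """Box-average downsampling by factor f (illustrative acquisition model)."""
    h, w = img.shape
    return img[:h - h % f, :w - w % f].reshape(h // f, f, w // f, f).mean(axis=(1, 3))

def f_cnn(img, f=2):
    """Placeholder for the trained CNN super-resolver: nearest-neighbour upsampling."""
    return np.repeat(np.repeat(img, f, axis=0), f, axis=1)

def extract_blocks(img, bs=8):
    """Non-overlapping bs x bs blocks keyed by their top-left coordinates."""
    h, w = img.shape
    return {(y, x): img[y:y + bs, x:x + bs]
            for y in range(0, h - bs + 1, bs)
            for x in range(0, w - bs + 1, bs)}

def transfer_residual(side_sr, central_sr_blocks, central_res_blocks, k=9, bs=8):
    """Dictionary step: for each block of the CNN-super-resolved side view,
    find the k most similar central-view blocks and add the weighted sum of
    their residuals -- the high-frequency transfer that yields the final
    super-resolution result per block."""
    out = side_sr.copy()
    keys = list(central_sr_blocks)
    for (y, x), blk in extract_blocks(side_sr, bs).items():
        dists = np.array([np.sum((central_sr_blocks[key] - blk) ** 2) for key in keys])
        nearest = np.argsort(dists)[:k]
        w = 1.0 / (dists[nearest] + 1e-8)   # similarity weights (omega_k)
        w /= w.sum()
        res = sum(wk * central_res_blocks[keys[i]] for wk, i in zip(w, nearest))
        out[y:y + bs, x:x + bs] += res
    return out

# Computing unit: residual of the central (middle-view) high-resolution image.
rng = np.random.default_rng(1)
central_hr = rng.random((32, 32))
central_sr = f_cnn(downsample(central_hr))
residual = central_hr - central_sr
# Extraction + first acquisition units: super-resolve a side view and
# transfer the central-view residual to it via block correspondences.
side_lr = rng.random((16, 16))
side_sr = transfer_residual(f_cnn(side_lr),
                            extract_blocks(central_sr),
                            extract_blocks(residual))
```

A real system would replace `f_cnn` with the trained network and learn the block dictionary jointly; the sketch only demonstrates the data flow between the three units.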
Further, in an embodiment of the present invention, the residual image block corresponding to each position is obtained by the following formula:

$$S_{ERR}^{j} = \sum_{k=1}^{9} \omega_k\, S_{ERR}^{R_k}$$

wherein $S_{ERR}^{j}$ is the residual block for the $j$-th high-resolution image block reconstructed at the side view angle, $k$ indexes the selected 9 most similar intermediate high-resolution image blocks, $\omega_k$ is the weight of the $k$-th block, $S_{ERR}^{R_k}$ is the $k$-th intermediate high-resolution residual block used to form the $j$-th side-view high-resolution image block, $R$ denotes a value on the intermediate high-resolution image, and $j$ is the index of the side-view image block.
Further, in one embodiment of the present invention, the final super-resolution result is:

$$S^{HR} = f_{CNN}(S^{LR}) + S_{ERR}$$

wherein $S^{HR}$ is the reconstructed super-resolution image block, $HR$ denotes high resolution, $f_{CNN}(S^{LR})$ is the image block super-resolved by the neural network, $S_{ERR}$ is the residual image obtained by subtraction from the image before down-sampling, and $LR$ denotes low resolution.
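The decomposition above can be checked numerically. A minimal NumPy demonstration with illustrative block values only: because the residual is defined as the reference block minus the network output, adding it back reproduces the reference exactly, which is the identity the residual-transfer step relies on at the central view.

```python
import numpy as np

rng = np.random.default_rng(0)
s_hr_true = rng.random((8, 8))     # reference high-resolution block (central view)
f_cnn_out = rng.random((8, 8))     # stand-in for the CNN's super-resolved block
s_err = s_hr_true - f_cnn_out      # residual image block S_ERR
s_hr = f_cnn_out + s_err           # final super-resolution result S_HR
assert np.allclose(s_hr, s_hr_true)
```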
Further, in an embodiment of the present invention, the obtaining module further includes: a second acquisition unit configured to acquire the edge confidence of each light field image to obtain an edge confidence mask; a third acquisition unit configured to acquire the disparity value of each pixel marked as a confident edge according to the edge confidence mask; a fourth acquisition unit configured to filter the initial disparity map with a joint bilateral median filter to obtain the disparity values of pixels in non-edge areas and of pixels whose disparity confidence is below a preset threshold; and a generating unit configured to generate a disparity map from the disparity value of each pixel.
It should be noted that the foregoing explanation on the embodiment of the image super-resolution method based on the light field acquisition device also applies to the image super-resolution device based on the light field acquisition device of this embodiment, and details are not repeated here.
The image super-resolution device based on the light field acquisition device provided by the embodiment of the present invention predicts and restores the high-frequency part of the input image from single-view and multi-view information, and further computes the depth of the scene from the multi-view high-resolution images. The device can therefore be used for scene reconstruction and large-scene monitoring; it reduces manufacturing cost, preserves both spatial and angular resolution, and enables refocusing as in a light field camera.
In the description of the present invention, it is to be understood that the terms "central," "longitudinal," "lateral," "length," "width," "thickness," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," "clockwise," "counterclockwise," "axial," "radial," "circumferential," and the like are used in the orientations and positional relationships indicated in the drawings for convenience in describing the invention and to simplify the description, and are not intended to indicate or imply that the referenced devices or elements must have a particular orientation, be constructed and operated in a particular orientation, and are therefore not to be considered limiting of the invention.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
In the present invention, unless otherwise expressly stated or limited, the terms "mounted," "connected," "secured," and the like are to be construed broadly and can, for example, be fixedly connected, detachably connected, or integrally formed; can be mechanically or electrically connected; they may be directly connected or indirectly connected through intervening media, or they may be connected internally or in any other suitable relationship, unless expressly stated otherwise. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to specific situations.
In the present invention, unless otherwise expressly stated or limited, a first feature being "on" or "under" a second feature may mean that the first and second features are in direct contact, or in indirect contact through an intermediary. Also, a first feature "on," "over," or "above" a second feature may be directly or obliquely above the second feature, or may simply indicate that the first feature is at a higher level than the second feature. A first feature "under," "below," or "beneath" a second feature may be directly or obliquely below the second feature, or may simply mean that the first feature is at a lower level than the second feature.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (8)

1. An image super-resolution method based on a light field acquisition device, wherein the light field acquisition device comprises a plurality of USB cameras and a camera, forming a light field acquisition device with a 3 x 3 viewing angle, the USB cameras at the side viewing angles being regularly arranged in a square spatial pattern surrounding the camera, wherein the method comprises the following steps:
acquiring a light field image through the light field acquisition device to acquire a plurality of side-view low-resolution images and a plurality of middle-view high-resolution images;
performing super-resolution on the acquired side-view low-resolution images by utilizing dictionary learning and deep learning; and
obtaining depth information of a scene according to parallax between a plurality of light field images with different viewing angles;
wherein the super-resolution of the collected low-resolution images of the plurality of side viewing angles by using dictionary learning and deep learning further comprises: the intermediate view angle high-resolution image is subjected to down sampling, the super-resolution of the view angle is carried out through a trained convolutional neural network by utilizing the deep learning method, and a residual image is obtained by subtracting the image before the down sampling so as to reflect the residual of the super-resolution of the neural network; extracting dictionary information from the multiple side-view low-resolution images and the middle high-resolution image by using image partitioning, and making information correspondence between low-resolution image blocks and high-resolution image blocks; and performing initial super-resolution on each low-resolution image of the side view through the convolutional neural network, converting the residual image into the side view by utilizing the information corresponding relation, and adding the result to a super-resolution result obtained preliminarily through the neural network to obtain a final super-resolution result.
2. The image super-resolution method based on the light field acquisition device according to claim 1, wherein the residual image block corresponding to each position is obtained by the following formula:

$$S_{ERR}^{j} = \sum_{k=1}^{9} \omega_k\, S_{ERR}^{R_k}$$

wherein $S_{ERR}^{j}$ is the residual block for the $j$-th high-resolution image block reconstructed at the side view angle, $k$ indexes the selected 9 most similar intermediate high-resolution image blocks, $\omega_k$ is the weight of the $k$-th block, $S_{ERR}^{R_k}$ is the $k$-th intermediate high-resolution residual block used to form the $j$-th side-view high-resolution image block, $R$ denotes a value on the intermediate high-resolution image, and $j$ is the index of the side-view image block.
3. The image super-resolution method based on the light field acquisition device according to claim 2, wherein the final super-resolution result is:

$$S^{HR} = f_{CNN}(S^{LR}) + S_{ERR}$$

wherein $S^{HR}$ is the reconstructed super-resolution image block, $HR$ denotes high resolution, $f_{CNN}(S^{LR})$ is the image block super-resolved by the neural network, $S_{ERR}$ is the residual image obtained by subtraction from the image before down-sampling, and $LR$ denotes low resolution.
4. The image super-resolution method based on the light field acquisition device according to any one of claims 1 to 3, wherein the obtaining of the depth information of the scene according to the parallax between the plurality of light field images at different viewing angles further comprises:
acquiring the edge confidence of each light field image to obtain an edge confidence mask;
obtaining the parallax value of the pixel point marked as the confidence edge according to the edge confidence mask;
filtering the initial parallax image by combining bilateral median filtering to obtain the parallax values of the pixel points in the non-edge area and the pixel points with the parallax confidence coefficient smaller than a preset threshold value; and
and generating a disparity map according to the disparity value of each pixel point.
5. An image super-resolution device based on a light field acquisition device, wherein the light field acquisition device comprises a plurality of USB cameras and a camera to form a light field acquisition device with a 3 x 3 visual angle, and the plurality of USB cameras at the side visual angle are regularly arranged in a square space form and surround the camera, wherein the device comprises:
the acquisition module is used for acquiring the light field image through the light field acquisition device so as to acquire a plurality of side view angle low-resolution images and a plurality of middle view angle high-resolution images;
the super-resolution module is used for performing super-resolution on the acquired side-view low-resolution images by utilizing dictionary learning and deep learning; and
the acquisition module is used for acquiring the depth information of a scene according to the parallax among the light field images of different visual angles;
wherein, the super-resolution module further comprises: the computing unit is used for performing down-sampling on the intermediate view angle high-resolution image, performing super-resolution on the view angle through a trained convolutional neural network by using the deep learning method, and subtracting the image before down-sampling to obtain a residual image so as to reflect the residual of the super-resolution of the neural network; the extraction unit is used for extracting dictionary information from the plurality of side-view low-resolution images and the middle high-resolution image by using image partitioning, and performing information correspondence between low-resolution image blocks and high-resolution image blocks; and the first acquisition unit is used for performing initial super-resolution on the low-resolution image of each side view angle through the convolutional neural network, converting the residual image into the side view angle by using the information corresponding relation, and adding the residual image and the preliminarily obtained super-resolution result of the neural network to acquire a final super-resolution result.
6. The light field acquisition device-based image super-resolution device according to claim 5, wherein the residual image block corresponding to each position is obtained by the following formula:

$$S_{ERR}^{j} = \sum_{k=1}^{9} \omega_k\, S_{ERR}^{R_k}$$

wherein $S_{ERR}^{j}$ is the residual block for the $j$-th high-resolution image block reconstructed at the side view angle, $k$ indexes the selected 9 most similar intermediate high-resolution image blocks, $\omega_k$ is the weight of the $k$-th block, $S_{ERR}^{R_k}$ is the $k$-th intermediate high-resolution residual block used to form the $j$-th side-view high-resolution image block, $R$ denotes a value on the intermediate high-resolution image, and $j$ is the index of the side-view image block.
7. The light field acquisition device-based image super-resolution device according to claim 6, wherein the final super-resolution result is:

$$S^{HR} = f_{CNN}(S^{LR}) + S_{ERR}$$

wherein $S^{HR}$ is the reconstructed super-resolution image block, $HR$ denotes high resolution, $f_{CNN}(S^{LR})$ is the image block super-resolved by the neural network, $S_{ERR}$ is the residual image obtained by subtraction from the image before down-sampling, and $LR$ denotes low resolution.
8. The light field acquisition device-based image super-resolution device according to any one of claims 5 to 7, wherein the acquisition module further comprises:
the second acquisition unit is used for acquiring the edge confidence of each light field image to obtain an edge confidence mask;
the third acquisition unit is used for acquiring the parallax value of the pixel point marked as the confidence edge according to the edge confidence mask;
the fourth obtaining unit is used for filtering the initial disparity map by combining bilateral median filtering to obtain disparity values of pixel points in a non-edge area and pixel points with disparity confidences smaller than a preset threshold; and
and the generating unit is used for generating a disparity map according to the disparity value of each pixel point.
CN201711474559.8A 2017-12-29 2017-12-29 Image super-resolution method and device based on light field acquisition device Active CN108074218B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711474559.8A CN108074218B (en) 2017-12-29 2017-12-29 Image super-resolution method and device based on light field acquisition device


Publications (2)

Publication Number Publication Date
CN108074218A CN108074218A (en) 2018-05-25
CN108074218B true CN108074218B (en) 2021-02-23

Family

ID=62156067


Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11967045B2 (en) 2018-09-07 2024-04-23 Samsung Electronics Co., Ltd Image processing device and method

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110555800A (en) * 2018-05-30 2019-12-10 北京三星通信技术研究有限公司 image processing apparatus and method
CN108876907A (en) * 2018-05-31 2018-11-23 大连理工大学 A kind of active three-dimensional rebuilding method of object-oriented object
WO2020037622A1 (en) * 2018-08-23 2020-02-27 深圳配天智能技术研究院有限公司 Acquisition method and acquisition device for super-resolution image, and image sensor
CN109447919B (en) * 2018-11-08 2022-05-06 电子科技大学 Light field super-resolution reconstruction method combining multi-view angle and semantic texture features
CN109302600B (en) * 2018-12-06 2023-03-21 成都工业学院 Three-dimensional scene shooting device
US10868945B2 (en) * 2019-04-08 2020-12-15 Omnivision Technologies, Inc. Light-field camera and method using wafer-level integration process
CN111818274A (en) * 2019-04-09 2020-10-23 深圳市视觉动力科技有限公司 Optical unmanned aerial vehicle monitoring method and system based on three-dimensional light field technology
CN111080774B (en) * 2019-12-16 2020-09-15 首都师范大学 Method and system for reconstructing light field by applying depth sampling
CN111818298B (en) * 2020-06-08 2021-10-22 北京航空航天大学 High-definition video monitoring system and method based on light field
CN111815513B (en) * 2020-06-09 2023-06-23 四川虹美智能科技有限公司 Infrared image acquisition method and device
CN111815515B (en) * 2020-07-01 2024-02-09 成都智学易数字科技有限公司 Object three-dimensional drawing method based on medical education
CN112102165B (en) * 2020-08-18 2022-12-06 北京航空航天大学 Light field image angular domain super-resolution system and method based on zero sample learning
CN112037136B (en) * 2020-09-18 2023-12-26 中国科学院国家天文台南京天文光学技术研究所 Super-resolution imaging method based on aperture modulation
CN113220251B (en) * 2021-05-18 2024-04-09 北京达佳互联信息技术有限公司 Object display method, device, electronic equipment and storage medium
CN113538307B (en) * 2021-06-21 2023-06-20 陕西师范大学 Synthetic aperture imaging method based on multi-view super-resolution depth network
CN113592716B (en) * 2021-08-09 2023-08-01 上海大学 Light field image space domain super-resolution method, system, terminal and storage medium
CN115187454A (en) * 2022-05-30 2022-10-14 元潼(北京)技术有限公司 Multi-view image super-resolution reconstruction method and device based on meta-imaging

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103838568A (en) * 2012-11-26 2014-06-04 诺基亚公司 Method, apparatus and computer program product for generating super-resolved images
CN104050662A (en) * 2014-05-30 2014-09-17 清华大学深圳研究生院 Method for directly obtaining depth image through light field camera one-time imaging
CN105023275A (en) * 2015-07-14 2015-11-04 清华大学 Super-resolution light field acquisition device and three-dimensional reconstruction method thereof
CN105551050A (en) * 2015-12-29 2016-05-04 深圳市未来媒体技术研究院 Optical field based image depth estimation method


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Relighting with 4D Incident Light Fields; Vincent Masselus; ACM Digital Library; 2003-12-31; pp. 613-620 *
Light Field Imaging Technology and Its Applications in Computer Vision; Zhang Chi et al.; Journal of Image and Graphics; 2016-05-31; Vol. 21, No. 3; Section 3.2 *
Super-Resolution Reconstruction of Multi-View Full Light Field Images; Li Mengna; China Masters' Theses Full-text Database, Information Science and Technology; 2016-05-15 (No. 05); Section 2.3, Fig. 2.7 *
Research on Light Field Imaging and Depth Estimation Methods Based on Camera Arrays; Xiao Zhaolin; China Doctoral Dissertations Full-text Database, Information Science and Technology; 2015-07-15 (No. 07); Section 4.2 *



Similar Documents

Publication Publication Date Title
CN108074218B (en) Image super-resolution method and device based on light field acquisition device
CN106846463B (en) Microscopic image three-dimensional reconstruction method and system based on deep learning neural network
CN108921781B (en) Depth-based optical field splicing method
CN111343367B (en) Billion-pixel virtual reality video acquisition device, system and method
JP2017182796A (en) Method and device for processing lightfield data
CN109118544B (en) Synthetic aperture imaging method based on perspective transformation
EP3035285B1 (en) Method and apparatus for generating an adapted slice image from a focal stack
CN111028281B (en) Depth information calculation method and device based on light field binocular system
CN115937288A (en) Three-dimensional scene model construction method for transformer substation
CN110880162A (en) Snapshot spectrum depth combined imaging method and system based on deep learning
CN115035235A (en) Three-dimensional reconstruction method and device
CN113436130B (en) Intelligent sensing system and device for unstructured light field
CN116958437A (en) Multi-view reconstruction method and system integrating attention mechanism
Chowdhury et al. Fixed-Lens camera setup and calibrated image registration for multifocus multiview 3D reconstruction
CN109218706B (en) Method for generating stereoscopic vision image from single image
CN108615221B (en) Light field angle super-resolution method and device based on shearing two-dimensional polar line plan
CN112767246B (en) Multi-multiplying power spatial super-resolution method and device for light field image
CN116563807A (en) Model training method and device, electronic equipment and storage medium
Sun et al. Blind calibration for focused plenoptic cameras
CN113506213B (en) Light field image visual angle super-resolution method and device adapting to large parallax range
CN111598997B (en) Global computing imaging method based on focusing stack single data subset architecture
CN112634139B (en) Optical field super-resolution imaging method, device and equipment
CN116939186B (en) Processing method and device for automatic associative covering parallax naked eye space calculation
Shoujiang et al. Microlens Light Field Imaging Method Based on Bionic Vision and 3-3 Dimensional Information Transforming
Zhao et al. The Accurate Estimation of Disparity Maps from Cross-Scale Reference-Based Light Field

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant