CN105427243A

CN105427243A - Video super-resolution reconstruction method based on adaptive interpolation kernel learning

Info

Publication number: CN105427243A
Application number: CN201510725909.8A
Authority: CN
Inventors: 胡晰远; 马斌斌; 彭思龙
Original assignee: Institute of Automation of Chinese Academy of Science
Current assignee: Suzhou Zhongke Whole Elephant Intelligent Technology Co Ltd
Priority date: 2015-10-29
Filing date: 2015-10-29
Publication date: 2016-03-23

Abstract

The invention discloses a video super-resolution reconstruction method based on adaptive interpolation kernel learning. The method comprises: obtaining an interpolation kernel dictionary of high-resolution image blocks and a corresponding dual matrix from a video image training set, wherein the video image training set comprises high- and low-resolution image blocks; obtaining, from the interpolation kernel dictionary, the interpolation kernel of the image block structure corresponding to each atom in the dictionary; and constructing, from the dual matrix and the interpolation kernels corresponding to the atoms, an interpolation kernel for each image small block of the image to be processed, and using that kernel to perform interpolation amplification on the image to be processed. The method better preserves edge and texture information in a video image and effectively reduces distortions such as aliasing, jagged edges, and ringing.

Description

Video super-resolution reconstruction method based on adaptive interpolation kernel learning

Technical Field

The embodiment of the invention relates to the technical field of video image processing, in particular to a video super-resolution reconstruction method based on adaptive interpolation kernel learning.

Background

Digital image super-resolution reconstruction technology^[1,2] has shown important application value and broad application prospects in many practical fields, including satellite remote sensing, night-vision infrared imaging, video surveillance, and military target analysis and tracking. Of these, video surveillance is the application most closely related to people's daily life. Current domestic security systems widely adopt PAL-standard video signals with a resolution of 352 × 288, but this resolution often cannot meet practical requirements.

Current video super-resolution reconstruction methods can be divided into two types^[2,3,4]. The first exploits the correlation between video frames: motion vectors between several adjacent frames are calculated to obtain complementary information, from which a single high-resolution frame is reconstructed. The second processes each frame of the video directly: every frame is amplified with a single-image super-resolution reconstruction algorithm to obtain the super-resolved video. Although the first type can give better results, accurate motion vectors with sub-pixel precision are hard to calculate and the computational complexity is very high, so real-time video amplification cannot be achieved. It is therefore preferable to process the frames of the video one by one with a single-image super-resolution reconstruction algorithm in order to achieve real-time super-resolution reconstruction of the video.

Methods for performing super-resolution reconstruction on a single image can be roughly divided into two categories^[5]: algorithms based on constrained reconstruction and methods based on machine learning. Algorithms based on constrained reconstruction^[6,7,8] regard super-resolution reconstruction as the inverse problem of the image degradation model; the solution space of the inverse problem is restricted by adding prior information and constraint conditions, so that a high-resolution output image is reconstructed from a single input image. However, because the information in the input image is very limited, the reconstructed image depends heavily on the added prior information and constraints and exhibits a certain amount of distortion. To overcome these problems, methods based on machine learning^[9,10] have been proposed and are widely used. Learning-based super-resolution reconstruction algorithms typically comprise two steps: (1) establishing a training image set of corresponding high and low resolutions in order to learn the relationship between high- and low-resolution image blocks; (2) given an input low-resolution image, using a prediction algorithm to add the learned high-frequency information to the low-resolution image so as to obtain an output image with higher resolution. Compared with constrained reconstruction, learning-based super-resolution algorithms generally obtain better results, but their computational complexity is higher, so real-time video amplification is difficult to achieve with them.

At present, with GPU acceleration, some of the literature applies methods based on constrained reconstruction to video in order to achieve real-time amplification and reconstruction, for example the artifact-free real-time upscaling algorithm^[11] and the edge-orientation-guided kernel estimation amplification method^[12]. However, no mature technology or literature exists for real-time video amplification using a learning-based super-resolution reconstruction algorithm. In such an algorithm, the input image is first divided into mutually overlapping small blocks, and a nonlinear optimization problem must then be solved as each small block is amplified one by one, so the requirement of real-time video super-resolution reconstruction is difficult to meet even with GPU acceleration.

The references cited herein are as follows, and the following are incorporated by reference:

[1] Zhuo Li, Wang Suyu, Li Xiaoguang. Super-Resolution Restoration of Images/Videos. Posts & Telecom Press, 2011.

[2] A. Bovik. Handbook of Image and Video Processing, Vol. 2 (English edition). Publishing House of Electronics Industry, 2006.

[3] Wang Yong, Zheng Hui, Hu Dewen. Overview of super-resolution enhancement techniques for video. Application Research of Computers, 2005, 22(1), 4-7.

[4] how to take the sea, Wuyuyuyi, Chen as Longhe, Qinling shark. Review of video super-resolution reconstruction technology. Information and Electronic Engineering, 2011, 9(1), 1-6.

[5] J. Tian, K. K. Ma. A survey on super-resolution imaging. Signal, Image and Video Processing, 2011, 5(3), 329-342.

[6] Wan Xuelin, Wen Wei, Peng Silong. Image super-resolution based on a wavelet-domain local Gaussian model. Journal of Image and Graphics (Series A), 2004, 9(8), 941-946.

[7] Han Hua, Wang Hongjian, Peng Silong. Single-image super-resolution algorithm based on local structural similarity. Journal of Computer-Aided Design & Computer Graphics, 2005, 17(5), 941-.

[8] X. Zhang, J. Jiang, S. Peng. Commutability of blur and affine warping in super-resolution with application to joint estimation of triple-coupled variables. IEEE Transactions on Image Processing, 2012, 21(4), 1796-1808.

[9] Zhang Xuesong, Jiang Jing, Peng Silong. An adaptive manifold learning method for face image super-resolution. Journal of Computer-Aided Design & Computer Graphics, 2008, 20(7), 856-.

[10] P. Wang, X. Hu, B. Xuan, J. Mu, S. Peng. Super resolution reconstruction via multiple frames joint learning. IEEE International Conference on Multimedia and Signal Processing, 2011, 357-361.

[11] A. Giachetti, N. Asuni. Real-time artifact-free image upscaling. IEEE Transactions on Image Processing, 2011, 20(10), 2760-2768.

[12] W. Kang, J. Jeon, E. Lee, et al. Real-time super-resolution for digital zooming using finite kernel-based edge orientation estimation and truncated image restoration. IEEE International Conference on Image Processing, 2013, 1311-1315.

The existing learning-based image super-resolution reconstruction method has great computational complexity during reconstruction, so that the requirements of real-time super-resolution reconstruction can hardly be met.

In view of the above, the present invention is particularly proposed.

Disclosure of Invention

The embodiment of the invention provides a video super-resolution reconstruction method based on self-adaptive interpolation kernel learning, which at least partially solves the technical problem of how to realize real-time super-resolution reconstruction of videos.

In order to achieve the above object, according to one aspect of the present invention, the following technical solutions are provided:

a video super-resolution reconstruction method based on adaptive interpolation kernel learning at least comprises the following steps:

obtaining, according to a video image training set, an interpolation kernel dictionary of a high-resolution image block and the dual matrix corresponding to the interpolation kernel dictionary, wherein the video image training set comprises high- and low-resolution image blocks;

acquiring an interpolation kernel of an image block structure corresponding to each atom in the interpolation kernel dictionary of the high-resolution image block according to the interpolation kernel dictionary of the high-resolution image block;

and constructing an interpolation kernel of the image small block of the image to be processed according to the dual matrix and the interpolation kernel of the image block structure corresponding to each atom, performing interpolation amplification on the image small block of the image to be processed by utilizing the interpolation kernel of the image small block of the image to be processed, and splicing the amplified image small blocks.

In an embodiment, the obtaining of the interpolation kernel dictionary of the high-resolution image block and the dual matrix corresponding to the interpolation kernel dictionary specifically includes:

adopting a dual learning algorithm to solve the following formula:

\{\hat{D}_h, \hat{C}, \hat{Z}\} = \arg\min_{D_h, C, Z} \|X_h - D_h Z\|_F^2 + \eta \|Z - C X_l\|_F^2 + \lambda \|Z\|_1,

wherein,

X_h represents a set of high-resolution image blocks;

X_l represents the set of low-resolution image blocks corresponding to the set of high-resolution image blocks;

Z represents the representation coefficients corresponding to the high-resolution image small blocks;

η represents a weight coefficient;

λ represents a weight coefficient;

D_h represents the interpolation kernel dictionary of the high-resolution image blocks;

C represents the dual matrix of D_h.

In an embodiment, the obtaining, according to the interpolation kernel dictionary of the high-resolution image block, the interpolation kernel of the image block structure corresponding to each atom in the interpolation kernel dictionary of the high-resolution image block specifically includes:

clustering the image small blocks in the image training set by using the interpolation kernel dictionary of the high-resolution image block to obtain low-resolution image small block subclasses and the high-resolution image small block subclasses corresponding to them;

and training on the pairs of high-resolution and low-resolution image small blocks in each image small block subclass by utilizing a least squares algorithm to obtain the interpolation kernel of the image block structure corresponding to each atom in the interpolation kernel dictionary of the high-resolution image block.

In an embodiment, the constructing an interpolation kernel of an image patch of the image to be processed according to the dual matrix and the interpolation kernel of the image patch structure corresponding to each atom specifically includes:

the following formula is calculated:

K = \sum_{i=1}^{N} \omega_i K_i;

wherein K is the interpolation kernel of the image small block to be processed; K_i is the interpolation kernel of the image block structure corresponding to the i-th atom in the interpolation kernel dictionary of the high-resolution image block; and ω_i is a weight coefficient;

the following formula is calculated to obtain ω_i:

\omega_i = \frac{|\alpha_i|}{\sum_{j=1}^{N} |\alpha_j|},

wherein α_i represents the representation coefficient of the image small block corresponding to the i-th atom;

the following formula is calculated to obtain α:

\alpha = C x_l,

wherein C is the dual matrix and x_l is the low-resolution image small block to be processed.

In one embodiment, the method further comprises:

and carrying out speed-up processing on the video by utilizing the correlation between the images of the adjacent frames in the video.

In an embodiment, the speeding up the video by using the correlation between the images of the adjacent frames in the video specifically includes:

decoding the video to obtain information of each frame of image and whether the frame of image is a key frame image;

if the current frame image is a key frame image, calculating a weight coefficient of each interpolation kernel in the interpolation kernel dictionary of the high-resolution image block by using the interpolation kernel dictionary of the high-resolution image block and the dual matrix thereof, constructing an interpolation kernel suitable for small image blocks in the key frame image, and performing super-resolution reconstruction;

if the current frame image is a non-key frame image, the similarity between the image small block in the current frame image and the image small block at the corresponding position in the previous frame image is compared, and whether the interpolation kernel of the image small block in the current frame image is recalculated or the cached interpolation kernel of the image small block in the previous frame image is used is determined to perform super-resolution reconstruction on the image small block in the current frame image.

Compared with the prior art, the technical scheme at least has the following beneficial effects:

the embodiment of the invention provides a video super-resolution reconstruction method based on self-adaptive interpolation kernel learning, which is characterized in that an interpolation kernel dictionary and a corresponding dual matrix thereof are obtained from a training image set on the basis of a dual learning theory, and the non-linear optimization problem in the traditional learning-based super-resolution reconstruction algorithm is converted into a linear optimization problem by introducing the dual matrix, so that the time complexity of the algorithm is reduced, and the calculation time is reduced; and moreover, the strong correlation between adjacent frames in the video is utilized, the available information obtained in the previous frame is repeatedly utilized to guide the calculation of the current frame, the algorithm execution speed is further improved, and the real-time super-resolution reconstruction work of the video is realized. The video super-resolution reconstruction algorithm based on the self-adaptive interpolation kernel learning at least has the following advantages:

1. By separating the training process from the reconstruction process, the number of samples in the training set can be increased, which improves the reconstruction quality achieved with the interpolation kernel dictionary and its dual matrix without affecting the calculation speed during super-resolution reconstruction.

2. In the reconstruction process, the expression coefficient of the image block is calculated by introducing the dual matrix, the original nonlinear optimization problem based on learning reconstruction is converted into a linear problem, and the operation speed of the algorithm is greatly improved.

3. Aiming at image blocks with different structures, such as strong edges, textures and the like, an interpolation kernel function suitable for the image blocks is constructed by carrying out weighted average on the interpolation kernel functions in an interpolation kernel dictionary, so that edge and texture information in the image can be effectively maintained, and distortion phenomena such as aliasing, ringing, sawtooth, blocking effect and the like existing in the conventional interpolation amplification are inhibited.

4. For video in CIF format (resolution 352 × 288 pixels, 25 frames per second), real-time 3× super-resolution reconstruction can be achieved after GPU acceleration, improving the visual quality of the video and the recognizability of targets in the video.
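To make the real-time figure in item 4 concrete, the following minimal sketch works out the per-frame time budget and output pixel rate implied by the stated CIF parameters and 3× scale. The variable names are illustrative only and do not come from the patent.

```python
# Back-of-the-envelope budget for real-time 3x upscaling of CIF video.
# All names here are illustrative, not taken from the patent text.
WIDTH, HEIGHT, FPS, SCALE = 352, 288, 25, 3

out_width, out_height = WIDTH * SCALE, HEIGHT * SCALE
frame_budget_ms = 1000 / FPS                      # time available per frame
out_pixels_per_frame = out_width * out_height
out_pixels_per_second = out_pixels_per_frame * FPS

print(out_width, out_height)       # 1056 864
print(frame_budget_ms)             # 40.0
print(out_pixels_per_second)       # 22809600
```

In other words, each frame must be fully reconstructed in about 40 ms, which is why the per-block cost of the reconstruction step matters so much.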

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and are not exhaustive, and it is obvious for a person skilled in the art that other drawings can be obtained according to the drawings without creative efforts.

Fig. 1 is a flowchart illustrating a video super-resolution reconstruction method based on adaptive interpolation kernel learning according to an exemplary embodiment;

Fig. 2 is a flowchart illustrating a video super-resolution reconstruction method based on adaptive interpolation kernel learning according to another exemplary embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to specific embodiments and the accompanying drawings.

It should be noted that in the following description, numerous specific details are set forth in order to provide an understanding. It may be evident, however, that the subject invention may be practiced without these specific details.

It should be noted that, without being explicitly defined or conflicting, the embodiments and technical features thereof in the present application may be combined with each other to form a technical solution.

Various exemplary embodiments of the present invention are described in detail below with reference to the accompanying drawings. The flowcharts in the figures illustrate the architecture, functionality, and operation of possible implementations of methods according to various embodiments of the present invention. It should be noted that each block in the flowchart may represent a module, a program segment, or a portion of code, which may include one or more executable instructions for implementing the logical function specified in the various embodiments. It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the flowchart illustrations, and combinations of blocks in the flowchart illustrations, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or by combinations of special purpose hardware and computer instructions.

The core idea of the embodiment of the invention builds on learning-based super-resolution reconstruction. In the training process, an interpolation kernel dictionary of high-resolution image blocks and its corresponding dual matrix are trained from a training image library of corresponding high- and low-resolution images. In the super-resolution reconstruction process, each input video frame is divided into a number of image small blocks, and the representation coefficient of each image small block is obtained using the dual matrix; these coefficients are then used as the weight coefficients of the interpolation kernels to construct an interpolation kernel suited to each image small block, and the image is interpolated and amplified with this adaptive interpolation kernel to obtain the super-resolution reconstructed image. Finally, exploiting the strong correlation between adjacent frames in the video, corresponding image blocks in adjacent frames are compared, and for image blocks with high similarity the interpolation kernel calculated for the previous frame is reused.

The embodiment of the invention realizes the real-time super-resolution reconstruction of the input low-resolution video by using the computer to obtain the output video with high resolution. Fig. 1 is a flowchart illustrating a video super-resolution reconstruction method based on adaptive interpolation kernel learning according to an exemplary embodiment. As shown in fig. 1, the embodiment of the present invention includes two parts, a training process and a reconstruction process.

In the training process, a training image library of corresponding high- and low-resolution images is first established. Step T1 (dual learning): a high-resolution dictionary and its dual matrix are obtained using a dual learning algorithm. Step T2 (acquiring the interpolation kernel dictionary): the image blocks in the training image library are clustered using the high-resolution dictionary obtained in step T1, and the high-resolution and low-resolution image blocks in each class are used for training to obtain the interpolation kernel corresponding to that class.

In the reconstruction process, a video is first input. Step R1: the input video file is decoded to acquire each frame image and a flag indicating whether the frame is a key frame. Step R2: the current frame image is divided into overlapping image small blocks, and super-resolution reconstruction is performed block by block. Step R3 (constructing the adaptive interpolation kernel and super-resolution amplification): if the current frame is a key frame, the representation coefficient of each image small block is obtained using the high-resolution interpolation kernel dictionary and its dual matrix obtained in the training process, the representation coefficient is used as a weight to construct an interpolation kernel suited to the image small block, and interpolation amplification is performed; if the current frame is a non-key frame, the similarity between the current image small block and the image small block at the corresponding position in the previous frame is compared to decide whether to recalculate an interpolation kernel suited to the current block or to reuse the interpolation kernel already obtained in the previous frame for super-resolution reconstruction of the current block. Step V1: the interpolation kernel used by the current image small block is cached in preparation for processing subsequent frames. After all the image small blocks in the current frame have been processed, they are spliced into a high-resolution image, and the amplified video is output.

Fig. 2 is a flowchart illustrating a video super-resolution reconstruction method based on adaptive interpolation kernel learning according to another exemplary embodiment. As shown in fig. 2, the method includes steps 1 to 3.

Step 1: acquiring an interpolation kernel dictionary of the high-resolution image block and its corresponding dual matrix according to a video image training set, wherein the video image training set comprises high-resolution image blocks and low-resolution image blocks.

In this step, the video image training set may be represented as P = {X_h, X_l}, where

X_h = \{x_h^{(1)}, x_h^{(2)}, \ldots, x_h^{(n)}\}

and

X_l = \{x_l^{(1)}, x_l^{(2)}, \ldots, x_l^{(n)}\}

respectively represent the set of high-resolution image blocks and the corresponding set of low-resolution image blocks. The traditional dictionary-learning-based super-resolution reconstruction algorithm obtains the dictionaries D_h and D_l corresponding to the high- and low-resolution image block sets by solving the following minimization problem:

\{\hat{D}_h, \hat{D}_l, \hat{Z}\} = \arg\min_{D_h, D_l, Z} \|X_h - D_h Z\|_F^2 + \|X_l - D_l Z\|_F^2 + \lambda \|Z\|_1 \qquad (1)

where Z contains the coefficients that represent the input high-resolution and low-resolution image blocks over the dictionaries D_h and D_l, and λ is a manually set weight parameter that adjusts the sparsity of the representation coefficients Z. Although the dictionaries obtained by this conventional learning method can effectively represent the different structures and textures in an image, the following nonlinear optimization problem must still be solved during reconstruction:

\hat{\alpha} = \arg\min_{\alpha} \|X_l - D_l \alpha\|_F^2 + \lambda \|\alpha\|_1 \qquad (2)

Only by solving this problem can a good representation coefficient of the low-resolution image block be obtained. Here X_l is an input low-resolution image small block, D_l is the low-resolution dictionary obtained by solving equation 1, α is the representation coefficient obtained when D_l is used to represent X_l, and λ is a weight parameter that adjusts the sparsity of α. As equation 2 shows, this is a nonlinear optimization problem, so using equation 2 to calculate the representation coefficient α of every input image small block X_l has very high time complexity.

The embodiment of the invention adopts a dual learning algorithm, preferably a sparsity-constrained dual learning algorithm. By learning a high-resolution dictionary and its dual matrix, the nonlinear optimization problem in equation 2 is converted into a linear problem that has an analytic solution.

In an alternative embodiment, the embodiment of the present invention adopts a dual learning algorithm, and changes the training process in equation 1 into solving the optimization problem as shown below:

\{\hat{D}_h, \hat{C}, \hat{Z}\} = \arg\min_{D_h, C, Z} \|X_h - D_h Z\|_F^2 + \eta \|Z - C X_l\|_F^2 + \lambda \|Z\|_1 \qquad (3)

where X_h and X_l are the input sets of high-resolution and low-resolution image small blocks; D_h is the interpolation kernel dictionary of the high-resolution image blocks; Z is the representation coefficient corresponding to the high-resolution image small blocks; η and λ are manually set weight coefficients that adjust the sparsity of the dual matrix and of the representation coefficients; and C is the dual matrix of D_h. The role of the dual matrix C is that, when it is applied to a low-resolution image block, the resulting representation coefficients can be used directly with the high-resolution dictionary D_h to closely approximate the high-resolution image block corresponding to that low-resolution image block, which avoids solving the nonlinear optimization problem in equation 2.
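The difference in cost between equation 2 and the dual-matrix approach can be sketched as follows. This is only an illustration under stated assumptions: the patent does not say how equation 2 is solved, so ISTA (iterative soft thresholding) is used here as one common solver, and `D_l` and `C` are random stand-ins rather than trained matrices.

```python
import numpy as np

def ista_sparse_code(x_l, D_l, lam, n_iter=200):
    """Iteratively minimise ||x_l - D_l a||_2^2 + lam * ||a||_1 (cf. equation 2).

    ISTA is an assumed solver; the patent does not name one.
    """
    L = 2.0 * np.linalg.norm(D_l, 2) ** 2        # Lipschitz constant of the gradient
    a = np.zeros(D_l.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * D_l.T @ (D_l @ a - x_l)     # gradient of the quadratic term
        z = a - grad / L
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)   # soft threshold
    return a

def dual_code(x_l, C):
    """Representation coefficients from one matrix-vector product, alpha = C x_l."""
    return C @ x_l

rng = np.random.default_rng(0)
d_l, n_atoms = 25, 64                            # e.g. 5x5 patches, 64 atoms (illustrative)
D_l = rng.standard_normal((d_l, n_atoms))        # stand-in low-resolution dictionary
C = rng.standard_normal((n_atoms, d_l))          # stand-in, NOT a trained dual matrix
x_l = rng.standard_normal(d_l)

alpha_iter = ista_sparse_code(x_l, D_l, lam=0.1)  # hundreds of matrix products
alpha_dual = dual_code(x_l, C)                    # a single matrix product
```

The iterative solver needs many dictionary multiplications per block, whereas the dual matrix needs exactly one, which is the source of the speed-up the text describes.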

The specific iterative method for solving for D_h and C in equation 3 is shown in equations 4 and 5:

D_h^{n+1} = \pi\left(D_h^{n} + \frac{1}{\sigma_d} (X_h - D_h^{n} Z) Z^{T}\right) \qquad (4)

C^{n+1} = \pi\left(C^{n} + \frac{1}{\sigma_c} (Z - C^{n} X_l) X_l^{T}\right) \qquad (5)

where n denotes the n-th iteration; 1/σ_d and 1/σ_c are the iteration step sizes; and π(x) is the projection function (its defining formula is not reproduced in this text).
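The two updates can be sketched as projected gradient steps. Several points here are assumptions, not statements of the patent: the projection π is not defined in the available text, so a column-norm projection common in dictionary learning is used as a stand-in; the C update uses the low-resolution set X_l so that it matches the term ‖Z − C X_l‖ in equation 3; the step sizes are set from spectral norms; and Z is held fixed because its update rule is not given.

```python
import numpy as np

def project_columns(M):
    """ASSUMED projection pi: shrink any column with norm > 1 back to unit norm."""
    norms = np.maximum(np.linalg.norm(M, axis=0, keepdims=True), 1.0)
    return M / norms

def update_D_h(D_h, X_h, Z, sigma_d):
    """One step in the style of equation 4."""
    return project_columns(D_h + (X_h - D_h @ Z) @ Z.T / sigma_d)

def update_C(C, X_l, Z, sigma_c):
    """One step in the style of equation 5, written with X_l to match equation 3."""
    return project_columns(C + (Z - C @ X_l) @ X_l.T / sigma_c)

rng = np.random.default_rng(0)
d_h, d_l, n_atoms, n = 81, 9, 16, 200            # illustrative sizes
X_h = rng.standard_normal((d_h, n))
X_l = rng.standard_normal((d_l, n))
Z = rng.standard_normal((n_atoms, n))            # held fixed in this sketch
D_h = project_columns(rng.standard_normal((d_h, n_atoms)))
C = project_columns(rng.standard_normal((n_atoms, d_l)))

sigma_d = np.linalg.norm(Z, 2) ** 2              # assumed step-size choice
sigma_c = np.linalg.norm(X_l, 2) ** 2            # assumed step-size choice

res_before = np.linalg.norm(X_h - D_h @ Z)
for _ in range(20):
    D_h = update_D_h(D_h, X_h, Z, sigma_d)
    C = update_C(C, X_l, Z, sigma_c)
res_after = np.linalg.norm(X_h - D_h @ Z)
```

In a full training loop these steps would alternate with an update of Z; that part is omitted here because the text does not specify it.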

Step 2: acquiring the interpolation kernel of the image block structure corresponding to each atom in the interpolation kernel dictionary of the high-resolution image block according to the interpolation kernel dictionary of the high-resolution image block.

In practical application, once the interpolation kernel dictionary D_h of the high-resolution image block has been obtained according to equation 3, the interpolation kernel K_i of the image block structure corresponding to each atom in the dictionary can be obtained, i.e. the interpolation kernel corresponding to each image small block subclass.

In an optional embodiment, the obtaining an interpolation kernel of an image block structure corresponding to each atom in the interpolation kernel dictionary of the high-resolution image block may specifically include:

The image small blocks in the image training set are clustered using the interpolation kernel dictionary of the high-resolution image block to obtain low-resolution image small block subclasses and their corresponding high-resolution image small block subclasses; the pairs of high-resolution and low-resolution image small blocks in each subclass are then used for training with a least squares algorithm to obtain the interpolation kernel of the image block structure corresponding to each atom in the interpolation kernel dictionary of the high-resolution image block.

Preferably, the specific steps are as follows:

Consider the i-th atom in the dictionary. The distance from every high-resolution image small block in the image training set to this atom is calculated, and all high-resolution image small blocks whose distance is smaller than a certain threshold, together with their corresponding low-resolution image small blocks, are grouped into one class, denoted P_i, whose high- and low-resolution block sets are X_{h_i} and X_{l_i}. The interpolation kernel K_i of the image block structure corresponding to the i-th atom can then be obtained by solving the following least squares problem:

\hat{K}_i = \arg\min_{K_i} \|X_{h_i} - K_i X_{l_i}\|_F^2 \qquad (6)

Equation 6 can be solved by the following iteration:

K_i^{n+1} = K_i^{n} + \frac{1}{\sigma_i} (X_{h_i} - K_i^{n} X_{l_i}) X_{l_i}^{T} \qquad (7)

where n denotes the n-th iteration and 1/σ_i is the step size of each iteration.

According to equation 7, once the interpolation kernel K_i of each subclass P_i has been calculated, all the interpolation kernels are combined to form an interpolation kernel dictionary, denoted D_k = {K_1, K_2, …, K_M}.
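The clustering and per-class kernel fit can be sketched as follows. The Euclidean distance, the threshold choice, and the step size σ_i are assumptions made for illustration; the patent only says "distance", "a certain threshold", and gives the iteration of equation 7. A closed-form least-squares solve is shown next to the iteration for comparison.

```python
import numpy as np

def cluster_for_atom(X_h, X_l, atom, threshold):
    """Select block pairs whose high-res block lies within `threshold` of the atom.

    Euclidean distance is an assumption; the text does not name the metric.
    """
    dist = np.linalg.norm(X_h - atom[:, None], axis=0)
    mask = dist < threshold
    return X_h[:, mask], X_l[:, mask]

def learn_kernel(X_h_i, X_l_i):
    """Closed-form least-squares solution of min_K ||X_h_i - K X_l_i||_F^2."""
    K_T, *_ = np.linalg.lstsq(X_l_i.T, X_h_i.T, rcond=None)
    return K_T.T

def learn_kernel_iterative(X_h_i, X_l_i, n_iter=2000):
    """Gradient iteration in the style of equation 7."""
    sigma_i = np.linalg.norm(X_l_i, 2) ** 2      # assumed step-size choice
    K = np.zeros((X_h_i.shape[0], X_l_i.shape[0]))
    for _ in range(n_iter):
        K = K + (X_h_i - K @ X_l_i) @ X_l_i.T / sigma_i
    return K

rng = np.random.default_rng(0)
d_h, d_l, n = 81, 9, 300                         # illustrative sizes
K_true = rng.standard_normal((d_h, d_l))         # synthetic ground-truth kernel
X_l = rng.standard_normal((d_l, n))
X_h = K_true @ X_l                               # noise-free synthetic pairs

atom = X_h.mean(axis=1)                          # stand-in for a dictionary atom
threshold = np.median(np.linalg.norm(X_h - atom[:, None], axis=0))
X_h_i, X_l_i = cluster_for_atom(X_h, X_l, atom, threshold)

K_ls = learn_kernel(X_h_i, X_l_i)
K_it = learn_kernel_iterative(X_h_i, X_l_i)
```

On this noise-free synthetic data both routes recover the same kernel; with real training blocks the two would agree only up to the convergence of the iteration.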

Step 3: constructing an interpolation kernel of the image small block of the image to be processed according to the dual matrix and the interpolation kernel of the image block structure corresponding to each atom, performing interpolation amplification on the image small block of the image to be processed by utilizing the interpolation kernel of the image small block of the image to be processed, and splicing the amplified image small blocks.

It should be noted that, for components having different structures in the input image, an interpolation kernel suitable for the structure should be used to interpolate and enlarge the components. Therefore, for an input single frame image, it is first required to divide it into small blocks, each of which has a size of 5 × 5, 7 × 7, or 9 × 9 pixels (the specific size is determined by the size of atoms in the interpolation kernel dictionary set in the training process); at the same time, in order to avoid discontinuous distortion between adjacent tiles, the image tiles should overlap each other.
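The block division and later splicing can be sketched as below. The stride and the averaging of overlapping pixels are assumptions for illustration; the text only requires that the blocks overlap and that the amplified blocks be spliced together.

```python
import numpy as np

def extract_patches(img, p, stride):
    """Cut a 2-D image into overlapping p x p blocks; returns blocks and positions."""
    H, W = img.shape
    positions = [(r, c)
                 for r in range(0, H - p + 1, stride)
                 for c in range(0, W - p + 1, stride)]
    patches = [img[r:r + p, c:c + p] for r, c in positions]
    return patches, positions

def stitch_patches(patches, positions, out_shape, scale=1):
    """Splice blocks back together, averaging where they overlap (assumed rule)."""
    acc = np.zeros(out_shape)
    weight = np.zeros(out_shape)
    for patch, (r, c) in zip(patches, positions):
        R, Cc = r * scale, c * scale             # block origin in the output image
        h, w = patch.shape
        acc[R:R + h, Cc:Cc + w] += patch
        weight[R:R + h, Cc:Cc + w] += 1.0
    return acc / np.maximum(weight, 1.0)

img = np.arange(121, dtype=float).reshape(11, 11)
patches, positions = extract_patches(img, p=5, stride=2)
restored = stitch_patches(patches, positions, img.shape)   # scale=1 round trip
```

With `scale=3`, each entry of `patches` would be the amplified (3p × 3p) block and `out_shape` the enlarged frame size; the round trip at `scale=1` is only a sanity check that the blocks tile the image.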

In an alternative embodiment, the input low-resolution image small block is denoted x_l, and the corresponding high-resolution image small block obtained after super-resolution reconstruction, denoted \hat{x}_h, is calculated as \hat{x}_h = K x_l, where K is calculated as follows:

K = \sum_{i=1}^{N} \omega_i K_i \qquad (8)

where K_i is the i-th interpolation kernel in the interpolation kernel dictionary D_k obtained above (see equation 6); ω_i is a weight coefficient satisfying ω_i > 0 and \sum_{i=1}^{N} \omega_i = 1, and its value reflects the similarity between the input low-resolution image small block x_l and the i-th class of image structure. It follows that K is a weighted sum of all the interpolation kernels that is adapted to the input low-resolution image small block x_l; different low-resolution image small blocks yield different interpolation kernels K, so K is adaptive. The weight coefficient ω_i in equation 8 is calculated as follows:

\omega_{i} = \frac{|\alpha_{i}|}{\sum_{i=1}^{N} |\alpha_{i}|} \qquad (9)

where α_i is the i-th entry of the representation coefficient α obtained by applying the dual matrix C to the low-resolution image patch x₁, i.e., α = Cx₁, where α = [α₁, α₂, …, α_N].
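Equations 8 and 9 can be illustrated with a minimal Python sketch. It assumes that each interpolation kernel K_i can be represented as a matrix acting on a flattened low-resolution patch. This matrix form, the function name `adaptive_kernel`, the array shapes, and the uniform-weight fallback for an all-zero α are illustrative assumptions and are not taken from the embodiment.

```python
import numpy as np

def adaptive_kernel(x_l, C, kernels, eps=1e-12):
    """Build a patch-specific interpolation kernel K (Equations 8 and 9).

    x_l     : flattened low-resolution patch, shape (d_l,)
    C       : dual matrix, shape (N, d_l)
    kernels : stacked dictionary kernels K_i, shape (N, d_h, d_l)
              (matrix form of K_i is an assumption of this sketch)
    Returns (K, weights).
    """
    alpha = C @ x_l                    # representation coefficients, alpha = C x_l
    abs_alpha = np.abs(alpha)
    total = abs_alpha.sum()
    if total < eps:
        # Degenerate case not covered by the text (assumption):
        # fall back to uniform weights to avoid division by zero.
        weights = np.full(len(alpha), 1.0 / len(alpha))
    else:
        weights = abs_alpha / total    # Equation 9
    # Equation 8: weighted sum of the dictionary kernels.
    K = np.tensordot(weights, kernels, axes=1)
    return K, weights
```

Under the same assumption, the enlarged patch would be obtained as `K @ x_l` and reshaped to the high-resolution patch size. For example, α = [2, −1, 1] gives weights [0.5, 0.25, 0.25], which sum to 1.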

Once an interpolation kernel K suited to each image patch in the input image has been constructed by the method of Equation 8 and the patch has been enlarged, the enlarged high-resolution image patches are stitched back into a high-resolution image, which completes the super-resolution reconstruction of the single-frame image. As can be seen from the calculations in Equation 8 and Equation 9, super-resolution reconstruction according to the embodiment of the invention has only linear time complexity, so the method is highly efficient.
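The stitching step can be sketched as follows. The embodiment does not state how pixels covered by several overlapping patches are combined; averaging the overlapping values is an assumption of this sketch, as are the function name `stitch_patches` and the convention that each patch is given with its top-left corner in the output image.

```python
import numpy as np

def stitch_patches(patches, out_shape):
    """Blend overlapping high-resolution patches into one image.

    patches   : iterable of (row, col, patch), where (row, col) is the
                top-left corner of the patch in the OUTPUT image
    out_shape : (height, width) of the output image
    Overlapping pixels are averaged (an assumption; the text only says
    that the enlarged patches are stitched back together).
    """
    acc = np.zeros(out_shape, dtype=np.float64)     # sum of pixel values
    count = np.zeros(out_shape, dtype=np.float64)   # patches covering each pixel
    for r, c, patch in patches:
        ph, pw = patch.shape
        acc[r:r + ph, c:c + pw] += patch
        count[r:r + ph, c:c + pw] += 1.0
    if np.any(count == 0):
        raise ValueError("some output pixels are not covered by any patch")
    return acc / count
```

Each patch is visited once and contributes a fixed number of additions per pixel, so the cost of this step grows linearly with the number of patches.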

After acceleration with CUDA (Compute Unified Device Architecture) on a GPU (graphics processing unit), the method of steps 1 and 2 in the embodiment of the present invention can essentially meet the requirement for real-time super-resolution reconstruction of, for example, CIF-format video. However, to keep video playback as smooth as possible and avoid stuttering, the embodiment of the invention further optimizes the video processing to improve the performance of the algorithm.

In an optional embodiment, the method may further include: speeding up the processing of the video by exploiting the correlation between adjacent frame images in the video.

The method specifically comprises the following steps:

The video is decoded to obtain each frame image together with information on whether that frame image is a key frame image. If the current frame image is a key frame image, the weight coefficient of each interpolation kernel in the interpolation kernel dictionary of the high-resolution image block is calculated using that dictionary and its dual matrix, an interpolation kernel suited to each image patch in the key frame image is constructed, and super-resolution reconstruction is performed. If the current frame image is a non-key frame image, the similarity between each image patch in the current frame image and the image patch at the corresponding position in the previous frame image is compared to decide whether to recalculate the interpolation kernel of the image patch in the current frame image or to use the cached interpolation kernel of the image patch in the previous frame image for super-resolution reconstruction of the image patch in the current frame image.

It should be noted that, in this embodiment, the similarity between an image patch in the current frame image and the image patch at the corresponding position in the previous frame image can be measured by the Euclidean distance.

Preferably, the speed-up processing is divided into two cases, which are handled separately:

(a) For a key frame (I-frame) image in the video, the method of step 2 in the embodiment of the invention is used to construct an interpolation kernel suited to each image patch in the key frame image and to perform super-resolution reconstruction;

(b) For non-key frame (B-frame and P-frame) images in the video, the image is likewise divided into image patches. For each image patch, however, the Euclidean distance d between it and the image patch at the corresponding position in the previous frame image is calculated first. If d is smaller than a given threshold, the image patch in the current frame is considered very close to the image patch at the corresponding position in the previous frame; the interpolation kernel of the image patch is therefore not recalculated, and the interpolation kernel used in the previous frame image is applied directly to perform super-resolution reconstruction of the image patch in the current frame. If d is greater than the given threshold, the method of step 2 in the embodiment of the present invention is repeated to construct an interpolation kernel suited to the image patch and perform super-resolution reconstruction on the patch, and the interpolation kernel is cached for use with the next frame image.
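The two cases (a) and (b) can be sketched in Python as below. This is a hedged illustration only: the function name `kernels_for_frame`, the dictionary-based cache keyed by patch position, the `build_kernel` callback, and the choice to refresh the cached patch when a kernel is reused are all assumptions. The embodiment also does not say what happens when d equals the threshold; the sketch recalculates the kernel in that case.

```python
import numpy as np

def kernels_for_frame(patches, is_key_frame, cache, build_kernel, threshold):
    """Choose an interpolation kernel for every patch of one frame.

    patches      : dict mapping patch position -> flattened low-res patch
    is_key_frame : True for I-frames, False for B- and P-frames
    cache        : dict mapping position -> (previous patch, kernel);
                   updated in place (hypothetical structure)
    build_kernel : callable that constructs a kernel from a patch,
                   e.g. by Equation 8
    threshold    : Euclidean-distance threshold for reusing a kernel
    Returns (kernels, reused): kernels by position and the reuse count.
    """
    kernels, reused = {}, 0
    for pos, patch in patches.items():
        entry = cache.get(pos)
        if not is_key_frame and entry is not None:
            prev_patch, prev_kernel = entry
            d = np.linalg.norm(patch - prev_patch)   # Euclidean distance
            if d < threshold:
                # Case (b), close match: reuse the previous frame's kernel.
                kernels[pos] = prev_kernel
                # Assumption: store the current patch so the next frame
                # is compared against its immediate predecessor.
                cache[pos] = (patch, prev_kernel)
                reused += 1
                continue
        # Case (a), or case (b) with d >= threshold: rebuild and cache.
        kernel = build_kernel(patch)
        kernels[pos] = kernel
        cache[pos] = (patch, kernel)
    return kernels, reused
```

The reuse count returned by the sketch gives a simple way to observe how much kernel construction is skipped on non-key frames for a given threshold.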

The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are only exemplary embodiments of the present invention, and are not intended to limit the present invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A video super-resolution reconstruction method based on adaptive interpolation kernel learning is characterized by comprising the following steps:

acquiring an interpolation kernel dictionary of a high-resolution image block and a corresponding dual matrix thereof according to a video image training set, wherein the video image training set comprises the high-resolution image block and a low-resolution image block;

2. The method for reconstructing video super resolution based on adaptive interpolation kernel learning according to claim 1, wherein the obtaining of the interpolation kernel dictionary of the high-resolution image block and the dual matrix corresponding thereto specifically comprises:

adopting a dual learning algorithm to solve the following formula:

\{\hat{D}_{h}, \hat{C}, \hat{Z}\} = \arg\min_{D_{h}, C, Z} \| X_{h} - D_{h} Z \|_{F}^{2} + \eta \| Z - C X_{1} \|_{F}^{2} + \lambda \| Z \|_{1},

wherein,

X_{h} = \{x_{h}^{(1)}, x_{h}^{(2)}, \ldots, x_{h}^{(n)}\}

representing a set of high resolution image blocks;

η represents a weight coefficient;

λ represents a weight coefficient;

C represents the dual matrix of D_h.

3. The method for reconstructing super-resolution video based on adaptive interpolation kernel learning according to claim 1, wherein the obtaining an interpolation kernel of an image block structure corresponding to each atom in the interpolation kernel dictionary of the high-resolution image block according to the interpolation kernel dictionary of the high-resolution image block specifically comprises:

4. The method for reconstructing super-resolution video based on adaptive interpolation kernel learning according to claim 1, wherein the constructing an interpolation kernel of an image patch of the image to be processed according to the dual matrix and the interpolation kernel of the image patch structure corresponding to each atom specifically includes:

the following formula is calculated:

K = \sum_{i=1}^{N} \omega_{i} K_{i};

5. The method for reconstructing the video super resolution based on the adaptive interpolation kernel learning as claimed in claim 1, wherein the method further comprises:

6. The method for reconstructing super-resolution video based on adaptive interpolation kernel learning according to claim 5, wherein the accelerating processing is performed on the video by using correlation between images of adjacent frames in the video, and specifically comprises: