CN102521799B

CN102521799B - Construction method of structural sparse dictionary for video image recovery enhancement

Info

Publication number: CN102521799B
Application number: CN2011103715055A
Authority: CN
Inventors: 袁梓瑾
Original assignee: Sichuan Hongwei Technology Co Ltd
Current assignee: Sichuan Hongwei Technology Co Ltd
Priority date: 2011-11-21
Filing date: 2011-11-21
Publication date: 2013-12-04
Anticipated expiration: 2031-11-21
Also published as: CN102521799A

Abstract

The invention discloses a method for constructing a structured sparse dictionary for video image restoration and enhancement. The method comprises the following steps: applying the DCT (Discrete Cosine Transform) to image patches selected from a natural image library and from fitted (synthetic) images with sharp edges, thereby mapping the patches into the frequency domain; performing a first-level clustering based on the general frequency-domain features of the patches; performing a second-level clustering on each cluster based on its high-frequency information; and finally extracting the first m principal components of each resulting second-level cluster to obtain its sparse sub-dictionary subD_{i\_j}, all the sparse sub-dictionaries subD_{i\_j} together forming the final structured sparse dictionary. The resulting two-level structured sparse dictionary differs from a traditional lengthy, inefficient linear over-complete dictionary: it allows the sparse representation of an input image or video signal to be solved quickly and effectively, and the sparse coefficient vectors obtained by collaborative hierarchical sparse modeling of any image or video signal are accurate and effective and offer a considerable degree of noise robustness.

Description

A method for constructing a structured sparse dictionary for video image restoration and enhancement

Technical field

The invention belongs to the technical field of video image enhancement and, more specifically, relates to a method for constructing a structured sparse dictionary used for video image restoration and enhancement in image/video enhancement processing.

Background technology

Because of inherent defects or limitations of the image/video acquisition system, the digital image data that a camera system captures from a real scene is the result of several image-quality degradation effects. In other words, for various reasons there is an obvious gap in visual quality between the captured digital image and the real scene. The three most typical causes of degradation are the blurring introduced by the camera's point spread function (PSF), the down-sampling imposed by the resolution limit of the camera's CMOS or CCD sensor, and the aliasing effect produced when atmospheric and camera-system noise are superimposed.

In a typical digital television system, video encoded with any of the various video compression techniques contains blocking artifacts of varying severity. Traditional analog broadcast television signals, besides being limited to PAL/NTSC resolution, also suffer noise and quality degradation from the errors introduced when the analog signal is decoded and converted to a digital signal and when it is de-interlaced. This is why the picture quality seen on today's television sets is often unsatisfactory. The pleasing picture obtained when playing a high-definition source such as a Blu-ray disc shows just how severely ordinary video is degraded.

A vast amount of traditional non-HD video sources exists, broadcast television signals are still mainly in the PAL standard, and the volume of low-quality video shot with low-cost mobile imaging devices keeps growing. All of this indicates a great demand for new video image quality enhancement techniques.

Compressed sensing theory holds that, when certain conditions are met, a signal corrupted by various degradations can be reconstructed accurately to a certain extent. Specifically, in the image processing field, the model is as follows:

I_{LR} = U I_{HR} + w \qquad (1)

Here I_{HR} is the uncorrupted ideal high-definition image signal, I_{LR} is the low-definition image signal observed after the various degradations, w is an additive noise variable, and U is a linear degradation operator, which can be a blurring operator, a down-sampling operator, an additive corruption operator, and so on. The image enhancement task is to recover the unknown ideal high-definition image I_{HR} from the known low-definition image I_{LR}. To obtain a visually pleasing approximation of I_{HR}, more prior knowledge about natural images must be incorporated to constrain the recovery of I_{HR}. This process is generally modeled as follows:

\tilde{a} = \underset{a}{\arg\min} \, \|a\|_{1} \quad \text{subject to} \quad \|I_{LR} - U D a\|_{2}^{2} \leq \epsilon \qquad (2)

I_{HR} = D \tilde{a} \qquad (3)

The natural-image prior used here is the sparsity of the image signal in a particular domain: the high-definition image I_{HR} can be reasonably represented in the transform domain D by a sparse coefficient vector \tilde{a}, where ε is a preset threshold and most elements of \tilde{a} are close to zero. In general, the transform domain D is an over-complete sparse dictionary made up of basis elements. There are two traditional ways of obtaining a sparse dictionary: one is to build it from the predefined transform bases of well-known transform domains, such as Fourier bases or wavelet bases; the other is to learn the over-complete dictionary from a large amount of training data. In short, the structure of the sparse dictionary determines the sparsity of the coefficient vector \tilde{a} in formula (3) as well as the convergence speed and stability of the convex optimization problem in formula (2), and therefore ultimately affects how well the high-definition image I_{HR} is recovered.
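As a rough illustration of formulas (1) to (3), the sketch below builds a 1-D signal that is sparse in a dictionary D, degrades it with a down-sampling operator U plus additive noise, and estimates the coefficients with iterative soft-thresholding (ISTA). ISTA minimises the Lagrangian form 0.5·||I_LR − U D a||² + λ·||a||₁ rather than the constrained form (2); that substitution, the random dictionary, the signal sizes and the value of λ are all assumptions made for illustration and are not taken from the patent.

```python
import numpy as np

def soft_threshold(v, t):
    """Element-wise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(A, y, a, lam):
    """Lagrangian objective 0.5 * ||y - A a||_2^2 + lam * ||a||_1."""
    return 0.5 * np.sum((y - A @ a) ** 2) + lam * np.sum(np.abs(a))

def ista(A, y, lam, n_iter=300):
    """Minimise the Lagrangian objective with fixed step 1/L, L = ||A||_2^2."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    a = np.zeros(A.shape[1])
    for _ in range(n_iter):
        a = soft_threshold(a + step * (A.T @ (y - A @ a)), step * lam)
    return a

rng = np.random.default_rng(0)
n_hr, n_atoms, n_nonzero = 64, 128, 4        # illustrative sizes (assumed)
D = rng.standard_normal((n_hr, n_atoms))     # random stand-in dictionary (assumed)
D /= np.linalg.norm(D, axis=0)               # unit-norm atoms
a_true = np.zeros(n_atoms)
a_true[rng.choice(n_atoms, n_nonzero, replace=False)] = 1.0
I_hr = D @ a_true                            # formula (3)

U = np.eye(n_hr)[::2]                        # keeps every second sample (down-sampling)
w = 0.01 * rng.standard_normal(U.shape[0])   # additive noise
I_lr = U @ I_hr + w                          # formula (1)

lam = 0.05                                   # assumed penalty weight
a_hat = ista(U @ D, I_lr, lam)
I_hr_hat = D @ a_hat                         # estimate of the high-definition signal
```

With the step size fixed at 1/L the objective never increases from one iteration to the next, which is why no line search is needed in this sketch.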

Prior-art sparse dictionaries and their defects

Traditional over-complete sparse dictionary methods implicitly assume that the dictionary elements are unrelated and mutually independent, and therefore that the positions of the non-zero coefficients in the sparse coefficient vector \tilde{a} are uniformly and randomly distributed. On the other hand, for the representation of the high-definition image I_{HR} to be complete, the number N of elements in the sparse dictionary (the transform domain D) must be very large relative to the number M of non-zero values in \tilde{a}; that is, the condition N >> M must hold. Therefore, when solving the optimization problem of formula (2), the theoretical size of the solution space (its degrees of freedom) is:

\binom{N}{M}

This leads to slow convergence and a large amount of computation, and it weakens the accuracy and stability of the final solution.
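To give a feel for how quickly this binomial count grows, the short sketch below evaluates it for one pair of values. The values of N and M are illustrative assumptions, not figures from the patent.

```python
import math

def support_count(n_atoms, n_nonzero):
    """Number of ways to place n_nonzero non-zero coefficients among n_atoms."""
    return math.comb(n_atoms, n_nonzero)

N = 1024   # dictionary size (assumed for illustration)
M = 8      # number of non-zero coefficients (assumed for illustration)
n_supports = support_count(N, M)
n_digits = len(str(n_supports))   # rough magnitude of the search space
```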

One existing method obtains a sparse dictionary with a preliminary structure. Specifically, T angles are taken at uniform intervals over the directions from 0 to π, and one PCA basis is computed for each angle; the T PCA bases plus one DCT basis then together form the structured sparse dictionary, as shown in Fig. 1. During solving, each PCA basis is treated as an independent dictionary for sparse signal recovery, and the basis with the smallest sum of the precision-error term and the sparsity-constraint term is selected as the transform-domain dictionary used to recover the signal. The defect of this construction is that the initial directional PCA bases must first be computed from synthetic images of black-and-white line edges at the T angles, after which an EM (expectation maximization) algorithm is used to update the contents of the PCA bases iteratively.
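The selection rule described above (choose the basis whose error term plus sparsity term is smallest) can be sketched as follows. The cost 0.5·||x − B c||² + λ·||c||₁, the weight λ and the two toy bases are assumptions for illustration; for a basis with orthonormal columns, the minimising coefficients have the closed form of soft-thresholding Bᵀx, which keeps the sketch short.

```python
import numpy as np

def soft_threshold(v, t):
    """Element-wise soft-thresholding."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def select_basis(x, bases, lam):
    """Return (index, coefficients, cost) of the basis with the smallest cost."""
    best = None
    for idx, B in enumerate(bases):
        c = soft_threshold(B.T @ x, lam)   # exact minimiser for orthonormal columns
        cost = 0.5 * np.sum((x - B @ c) ** 2) + lam * np.sum(np.abs(c))
        if best is None or cost < best[2]:
            best = (idx, c, cost)
    return best

rng = np.random.default_rng(0)
# Two toy orthonormal bases (assumed): the identity and a random rotation.
Q, _ = np.linalg.qr(rng.standard_normal((16, 16)))
bases = [np.eye(16), Q]
x = 2.0 * Q[:, 3]                          # a signal that is 1-sparse in the second basis
idx, coeffs, cost = select_basis(x, bases, lam=0.1)
```

Here the second basis wins because the signal needs only one atom there, while it is spread over many coordinates in the identity basis.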

The advantage of a structured sparse dictionary is obvious. First, the size of the solution space (the degrees of freedom) is greatly reduced relative to

\binom{N}{M}

If every PCA basis has the same number of elements, the degrees of freedom when solving over this structured dictionary fall to

\binom{N}{M / (T + 1)} \cdot (T + 1)

Second, a different PCA basis, that is, a different sub-dictionary, can be selected for each low-definition image I_{LR}. This improves the match between the low-definition image I_{LR} and the structured sparse dictionary and thus yields a more accurate sparsity constraint.

Imposing structural constraints on a sparse dictionary is valuable in two respects: on the one hand, a more robust sparse representation is obtained; on the other hand, in signal interpretation, the set of active dictionary atoms reveals certain physical attributes of the signal.

However, using PCA bases computed from synthetic images to represent local patches of natural images has an inherent defect. Every deviation of a natural image patch from the regular synthetic images can only be modeled through the variance parameters of a Gaussian model, which greatly reduces the accuracy of the sparse representation.

Summary of the invention

The object of the invention is to overcome the deficiencies of the prior art by providing a method for constructing a structured sparse dictionary for video image restoration and enhancement that makes the sparse representation more effective and accurate.

To achieve the above object, the method of the invention for constructing a structured sparse dictionary for video image restoration and enhancement is characterized in that it comprises the following steps:

(1) selecting a number of images from a natural image library and from fitted images with sharp edges, and generating from them the set S of all image patches of size \sqrt{n} \times \sqrt{n}, where n is the number of pixels in an image patch;

(2) applying the DCT to the image patches in the set S, the resulting DCT coefficients forming the DCT coefficient set S_{dct};

(3) clustering the patch DCT coefficients in S_{dct} around K_1 centres (first-level clustering), so that each patch's DCT coefficients are assigned to one of the K_1 clusters \{S_{dct\_1}, S_{dct\_2}, S_{dct\_3}, \cdots, S_{dct\_K_1}\};

(4) structuring each cluster further: extracting the high-frequency components of the patch DCT coefficients in cluster S_{dct\_i} (1≤i≤K_1) to obtain the cluster of high-frequency components S_{dct\_i}^{h};

(5) clustering the high-frequency components in cluster S_{dct\_i}^{h} (1≤i≤K_1) around K_{2\_i} centres, thereby dividing it into K_{2\_i} second-level clusters \{S_{dct\_i\_1}^{h}, S_{dct\_i\_2}^{h}, S_{dct\_i\_3}^{h}, \cdots, S_{dct\_i\_K_{2\_i}}^{h}\};

(6) performing principal component decomposition on each second-level cluster S_{dct\_i\_j}^{h} (1≤i≤K_1, 1≤j≤K_{2\_i}) and extracting its first m principal components to form the sparse sub-dictionary subD_{i\_j} (1≤i≤K_1, 1≤j≤K_{2\_i}) of that cluster; all the sparse sub-dictionaries subD_{i\_j} together form the final structured sparse dictionary.

The object of the invention is achieved as follows:

In the method of the invention for constructing a structured sparse dictionary for video image restoration and enhancement, the DCT is applied to image patches selected from a natural image library and from fitted images with sharp edges, mapping them into the frequency domain. A first-level clustering is performed on the general frequency-domain features of the patches, and each cluster is then clustered again on the basis of its high-frequency information. Finally, the first m principal components are extracted from each resulting second-level cluster to give the sparse sub-dictionary subD_{i\_j} of that cluster, and all the sparse sub-dictionaries subD_{i\_j} together form the final structured sparse dictionary. The two-level structured sparse dictionary built in this way differs from a traditional lengthy and inefficient linear over-complete dictionary.

The invention has the following beneficial effects:

1. With the two-level hierarchically structured sparse dictionary of the invention, the sparse representation of an input image or video signal can be solved quickly and effectively, avoiding the lengthy computation that solving over a single traditional over-complete dictionary requires. The reason is that the two-level hierarchy provided by the invention effectively removes the information redundancy between the atoms of the sparse dictionary.

2. On the basis of the two-level hierarchically structured sparse dictionary of the invention, the sparse coefficient vector \tilde{a} obtained by collaborative hierarchical sparse modeling of any image or video signal is accurate and effective and has a considerable degree of noise robustness.

Brief description of the drawings

Fig. 1 shows the composition of the prior-art sparse dictionary with preliminary structure;

Fig. 2 is a detailed flowchart of the method of the invention for constructing a structured sparse dictionary for video image restoration and enhancement;

Fig. 3 is a schematic diagram of the hierarchically structured sparse dictionary;

Fig. 4 is a schematic diagram of the composition of a sparse sub-dictionary.

Embodiment

The specific embodiments of the invention are described below with reference to the accompanying drawings so that those skilled in the art can better understand the invention. Note in particular that, in the following description, detailed descriptions of known functions and designs are omitted where they would obscure the main content of the invention.

Fig. 2 is a detailed flowchart of the method of the invention for constructing a structured sparse dictionary for video image restoration and enhancement.

In this embodiment, as shown in Fig. 2, the invention is implemented as follows:

(1) A number of images are taken from the provided natural image library 201 and from the provided library 202 of fitted images with sharp edges. The ratio between the two can be chosen as appropriate, for example 3:1. From all the selected images, image patches of height and width \sqrt{n} \times \sqrt{n} are extracted, where n generally lies in the interval [25, 100].

In Fig. 2, the natural image library 201 can be collected from publicly available high-quality natural images; in general, a number of images on the order of 100 or fewer is sufficient. The images of the fitted image library 202 in Fig. 2 are black-and-white boundary images whose pixel values are 0 or 255 and whose boundary lines lie at 18 equally spaced angles over the directions from 0 to π. In step 203, the extracted \sqrt{n} \times \sqrt{n} image patches form the image patch set S.
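The fitted-image part of this step can be sketched as follows: black-and-white edge images are generated at 18 equally spaced angles and cut into patches of √n × √n pixels. The image size, the sampling stride, the choice of the half-open interval [0, π) and the helper names `edge_image` and `extract_patches` are assumptions for illustration; patches from a natural image library would go through the same `extract_patches` helper.

```python
import numpy as np

def edge_image(size, angle):
    """Black/white step-edge image (values 0 or 255) whose boundary line
    passes through the image centre in the direction `angle` (radians)."""
    ys, xs = np.mgrid[0:size, 0:size] - (size - 1) / 2.0
    # The sign tells on which side of the boundary line each pixel lies.
    side = xs * np.sin(angle) - ys * np.cos(angle)
    return np.where(side >= 0, 255, 0).astype(np.uint8)

def extract_patches(image, side, stride):
    """Collect side x side patches on a regular grid, one flattened patch per row."""
    h, w = image.shape
    patches = [image[r:r + side, c:c + side].ravel()
               for r in range(0, h - side + 1, stride)
               for c in range(0, w - side + 1, stride)]
    return np.asarray(patches, dtype=np.float64)

n = 64                                   # pixels per patch, inside [25, 100]
side = int(np.sqrt(n))                   # each patch is sqrt(n) x sqrt(n)
angles = np.arange(18) * np.pi / 18      # 18 equally spaced directions in [0, pi) (assumed)
fitted = [edge_image(32, a) for a in angles]   # 32 x 32 image size is assumed
S_fitted = np.vstack([extract_patches(img, side, stride=4) for img in fitted])
```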

(2) Step 204 preprocesses the image patch set S by projecting it into the frequency domain, which provides the basis for clustering the patch samples by features such as texture and edges. Specifically, the DCT is applied to the image patches in S, and the resulting DCT coefficients form the DCT coefficient set S_{dct}, that is:

s_{dct} = DCT(s), \quad s \in S, \; s_{dct} \in S_{dct} \qquad (4)

In formula (4), s denotes an image patch and s_{dct} denotes its DCT coefficients. Note that the DCT can be replaced by other transforms; it is only necessary that the image patch set S be projected effectively into the frequency domain.
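Formula (4) can be sketched with a separable 2-D DCT built from an orthonormal DCT-II matrix. The orthonormal scaling, the random stand-in patches and the helper names are assumptions for illustration; the text only requires some projection into the frequency domain.

```python
import numpy as np

def dct_matrix(size):
    """Orthonormal DCT-II matrix C of shape (size, size), so that C @ C.T = I."""
    idx = np.arange(size)
    C = np.cos(np.pi * (2 * idx[None, :] + 1) * idx[:, None] / (2 * size))
    C *= np.sqrt(2.0 / size)
    C[0, :] /= np.sqrt(2.0)
    return C

def dct2_patches(S, side):
    """Apply the 2-D DCT to each flattened side x side patch (one patch per row)."""
    C = dct_matrix(side)
    patches = S.reshape(-1, side, side)
    coeffs = C @ patches @ C.T           # matmul broadcasts over the patch axis
    return coeffs.reshape(len(S), -1)

rng = np.random.default_rng(0)
S = rng.uniform(0, 255, size=(10, 64))   # stand-in for the patch set S (assumed data)
S_dct = dct2_patches(S, side=8)
```

Because the transform is orthonormal, each patch keeps its energy, so distances between patches are the same before and after the projection.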

In step 205 of Fig. 2, the image patches are clustered according to salient frequency-domain features such as texture and edges. Specifically, the patch DCT coefficients in S_{dct} are clustered around K_1 centres, dividing S_{dct} into K_1 clusters. Note that many clustering methods can be used here; a classical and effective one is K-means clustering, that is:

\{S_{dct\_1}, S_{dct\_2}, S_{dct\_3}, \cdots, S_{dct\_K_1}\} = kmean(S_{dct}) \qquad (5)

That is, the DCT coefficient set S_{dct}, expressed in the frequency domain, is partitioned by K-means into K_1 first-level clusters, namely

S_{dct} = S_{dct\_1} + S_{dct\_2} + S_{dct\_3} + \cdots + S_{dct\_K_1}

According to this first-level clustering, the image patch set S expressed in the spatial domain has a uniquely corresponding partition

S = S_{1} + S_{2} + S_{3} + \cdots + S_{K_1}
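The first-level partition of formula (5) can be sketched with a plain Lloyd-style K-means. The random initialisation, the fixed iteration count and the stand-in data are assumptions for illustration. Returning index sets rather than copies of the data makes the matching spatial-domain partition of S immediate, since the same indices select the patches S_i.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd k-means. Returns (labels, centres)."""
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Squared distance of every sample to every centre.
        d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for c in range(k):
            members = X[labels == c]
            if len(members):             # keep the old centre if a cluster empties
                centres[c] = members.mean(axis=0)
    return labels, centres

def first_level_clusters(S_dct, K1):
    """Partition the row indices of S_dct into K1 index sets, as in formula (5)."""
    labels, _ = kmeans(S_dct, K1)
    return [np.flatnonzero(labels == i) for i in range(K1)]

rng = np.random.default_rng(1)
S_dct = rng.standard_normal((300, 64))   # stand-in DCT coefficients (assumed data)
clusters = first_level_clusters(S_dct, K1=4)
S_dct_1 = S_dct[clusters[0]]             # coefficients in the first cluster
```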

In step 206 of Fig. 2, each cluster is structured further. The high-frequency components must be extracted from every patch's DCT coefficients in each first-level cluster; specifically, the high-frequency components of each DCT coefficient vector in cluster S_{dct\_i} (1≤i≤K_1) are extracted and denoted S_{dct\_i}^{h}, that is:

S_{dct\_i}^{h} = highFrequence(S_{dct\_i}), \quad 1 \leq i \leq K_1 \qquad (6)

Note that the high-frequency extraction in formula (6) can be implemented in several ways. One is to extract the high-frequency components directly from the DCT coefficient representation. Another is to apply the classical Laplacian operator to the spatial-domain representation, before the DCT projection into the frequency domain, that is:

S_{i}^{h} = Laplacian(S_{i}), \quad 1 \leq i \leq K_1 \qquad (7)

S_{i}^{h} in formula (7) corresponds to the cluster S_{dct\_i}^{h}.
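Both variants can be sketched briefly. The text does not say which DCT coefficients count as high frequency, so the rule used here (keep coefficients whose frequency indices satisfy u + v ≥ cutoff), the cutoff value, the 4-neighbour Laplacian stencil and the edge-replicating border are all assumptions for illustration.

```python
import numpy as np

def high_frequency_dct(S_dct_i, side, cutoff):
    """Formula (6) sketch: zero every DCT coefficient with u + v below cutoff."""
    u, v = np.meshgrid(np.arange(side), np.arange(side), indexing="ij")
    mask = (u + v >= cutoff).ravel()
    return S_dct_i * mask

def laplacian_patches(S_i, side):
    """Formula (7) sketch: 4-neighbour discrete Laplacian of each spatial patch,
    with the border handled by edge replication."""
    p = S_i.reshape(-1, side, side)
    padded = np.pad(p, ((0, 0), (1, 1), (1, 1)), mode="edge")
    lap = (padded[:, :-2, 1:-1] + padded[:, 2:, 1:-1]
           + padded[:, 1:-1, :-2] + padded[:, 1:-1, 2:] - 4.0 * p)
    return lap.reshape(len(S_i), -1)

rng = np.random.default_rng(0)
S_dct_i = rng.standard_normal((20, 64))  # stand-in first-level cluster (assumed data)
S_dct_i_h = high_frequency_dct(S_dct_i, side=8, cutoff=3)
```

The Laplacian variant returns spatial-domain data, so it would still need the DCT projection before it can be compared with the DCT-domain variant.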

As shown in Fig. 2, in step 207, after the high-frequency extraction of step 206, each cluster S_{dct\_i}^{h} (1≤i≤K_1) is clustered around K_{2\_i} centres, dividing it into K_{2\_i} second-level clusters, that is:

\{S_{dct\_i\_1}^{h}, S_{dct\_i\_2}^{h}, S_{dct\_i\_3}^{h}, \cdots, S_{dct\_i\_K_{2\_i}}^{h}\} = kmean(S_{dct\_i}^{h}) \qquad (8)

S_{dct\_i}^{h} = S_{dct\_i\_1}^{h} + S_{dct\_i\_2}^{h} + S_{dct\_i\_3}^{h} + \cdots + S_{dct\_i\_K_{2\_i}}^{h} \qquad (9)

In steps 208 and 209 of Fig. 2, the corresponding sparse sub-dictionary subD_{i\_j} is constructed for each second-level cluster S_{dct\_i\_j}^{h} (1≤i≤K_1, 1≤j≤K_{2\_i}). That is, principal component decomposition is performed on each second-level cluster S_{dct\_i\_j}^{h}, and its first m principal components V_1 to V_m are extracted to form the sparse sub-dictionary subD_{i\_j} of that cluster. One way to carry out the principal component decomposition is the classical PCA method, that is:

subD_{i\_j} = [V_{1}, V_{2}, \cdots, V_{m}] = PCA(S_{dct\_i\_j}^{h}) \qquad (10)

If PCA is used for the principal component decomposition, the first m most significant eigenvectors are extracted and arranged in order as the atoms of the sub-dictionary, together forming the sparse sub-dictionary subD_{i\_j}, as shown schematically in Fig. 4.
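Formula (10) can be sketched with a singular value decomposition, which yields the principal directions already sorted by decreasing variance. Subtracting the cluster mean before the decomposition is the usual PCA convention and is assumed here, as are the stand-in data and the helper name `pca_subdictionary`.

```python
import numpy as np

def pca_subdictionary(cluster, m):
    """Return the first m principal directions of `cluster` (rows = samples)
    as the columns V_1 ... V_m of a sub-dictionary."""
    centred = cluster - cluster.mean(axis=0)
    # Rows of Vt are principal directions, sorted by decreasing singular value.
    _, _, Vt = np.linalg.svd(centred, full_matrices=False)
    return Vt[:m].T

rng = np.random.default_rng(0)
# Stand-in second-level cluster with decreasing variance per coordinate (assumed data).
cluster = rng.standard_normal((200, 64)) * np.linspace(3.0, 0.1, 64)
subD = pca_subdictionary(cluster, m=8)   # shape (64, 8), orthonormal columns
```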

In step 210, once the sparse sub-dictionary subD_{i\_j} of every second-level cluster has been obtained through formula (10), the full set of sub-dictionaries forms the two-level structured sparse dictionary shown schematically in Fig. 3.

In other words, this sparse dictionary has two retrieval levels with indices i and j, defined over 1≤i≤K_1 and 1≤j≤K_{2\_i}, and each sparse sub-dictionary subD_{i\_j} in turn consists of m principal components arranged in order as atoms.
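One possible in-memory layout for the two-level index is a mapping from (i, j) to the matrix of m atoms. The sketch below chains the steps from DCT coefficients to that mapping. It simplifies the text in several assumed ways: every first-level cluster uses the same number K2 of second-level centres (the text allows K_{2_i} to vary with i), clusters with too few samples are skipped, and the high-frequency rule and stand-in data are illustrative only.

```python
import numpy as np

def kmeans_labels(X, k, n_iter=30, seed=0):
    """Plain Lloyd k-means returning only the cluster label of each row."""
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centres[c] = X[labels == c].mean(axis=0)
    return labels

def build_dictionary(S_dct, high_freq, K1, K2, m):
    """Two-level dictionary {(i, j): matrix whose m columns are PCA atoms}.
    `high_freq` maps a cluster of DCT coefficients to its high-frequency part."""
    dictionary = {}
    level1 = kmeans_labels(S_dct, K1)
    for i in range(K1):
        S_h = high_freq(S_dct[level1 == i])
        if len(S_h) < K2:
            continue                      # too few samples to split further
        level2 = kmeans_labels(S_h, K2)
        for j in range(K2):
            members = S_h[level2 == j]
            if len(members) < m:
                continue                  # too few samples for m components
            centred = members - members.mean(axis=0)
            _, _, Vt = np.linalg.svd(centred, full_matrices=False)
            dictionary[(i + 1, j + 1)] = Vt[:m].T   # 1-based (i, j) as in the text
    return dictionary

rng = np.random.default_rng(0)
S_dct = rng.standard_normal((600, 64))    # stand-in DCT coefficients (assumed data)
u, v = np.meshgrid(np.arange(8), np.arange(8), indexing="ij")
mask = (u + v >= 3).ravel()               # assumed high-frequency rule
K1, K2, m = 3, 2, 4
dictionary = build_dictionary(S_dct, lambda X: X * mask, K1, K2, m)
```

A lookup such as `dictionary.get((i, j))` then retrieves the sub-dictionary for first-level index i and second-level index j, or `None` if that cluster was skipped.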

Note that the construction method of the invention can be carried out entirely offline. The result is a two-level structured sparse dictionary that is ready for the subsequent online processing of input images and video.

Once the structured sparse dictionary has been obtained, collaborative hierarchical sparse modeling is performed on it for any image or video signal in order to obtain its sparse coefficient vector \tilde{a}.

Although illustrative embodiments of the invention have been described above so that those skilled in the art can understand the invention, it should be clear that the invention is not limited to the scope of these embodiments. To those of ordinary skill in the art, variations are apparent as long as they fall within the spirit and scope of the invention as defined and determined by the appended claims, and all innovations that make use of the inventive concept are protected.

Claims

1. A method for constructing a structured sparse dictionary for video image restoration and enhancement, characterized in that it comprises the following steps:

(1) selecting a number of images from a natural image library and from fitted images with sharp edges, and generating from them the set S of all image patches of size \sqrt{n} \times \sqrt{n}, where n is the number of pixels in an image patch;

(4) structuring each cluster further: extracting the high-frequency components of the patch DCT coefficients in cluster S_{dct\_i}, 1≤i≤K_1, to obtain the cluster of high-frequency components S_{dct\_i}^{h};

(5) clustering the high-frequency components in cluster S_{dct\_i}^{h} around K_{2\_i} centres, thereby dividing it into K_{2\_i} second-level clusters

\{S_{dct\_i\_1}^{h}, S_{dct\_i\_2}^{h}, S_{dct\_i\_3}^{h}, \cdots, S_{dct\_i\_K_{2\_i}}^{h}\};

(6) performing principal component decomposition on each second-level cluster S_{dct\_i\_j}^{h}, 1≤i≤K_1, 1≤j≤K_{2\_i}, and extracting its first m principal components to form the sparse sub-dictionary subD_{i\_j}, 1≤i≤K_1, 1≤j≤K_{2\_i}, of that cluster; all the sparse sub-dictionaries subD_{i\_j} together form the final structured sparse dictionary;

wherein the first-level clustering is performed on the basis of salient frequency-domain features such as texture and edges.