CN104966276A - Image video scene content conformal mapping sparse representation method - Google Patents

Image video scene content conformal mapping sparse representation method

Info

Publication number
CN104966276A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510337089.5A
Other languages
Chinese (zh)
Other versions
CN104966276B (en)
Inventor
陈小武
李健伟
邹冬青
赵沁平
高博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University
Priority to CN201510337089.5A
Publication of CN104966276A
Application granted
Publication of CN104966276B
Legal status: Active


Abstract

The invention provides a conformal mapping sparse representation method for image and video scene content. The method comprises the following steps: 1) inputting an original image or video and sampling it in a feature space; 2) computing the K nearest neighbors of each sample, building a locally complete adjacency graph, and computing the distances between adjacent samples; 3) combining the conformal mapping rule with sparse representation to learn a dictionary with the conformal property; and 4) reconstructing the original image or video with the dictionary. By introducing the conformal mapping rule, the method preserves the angle information between adjacent samples as far as possible and obtains a dictionary with stronger representation capability. At the same time, conformal mapping drives adjacent samples to be reconstructed from similar dictionary atoms, which makes the dictionary more concise and compact. The method has broad application prospects in image processing, computer vision, and augmented reality.

Description

A conformal mapping sparse representation method for image and video scene content
Technical field
The present invention relates to the fields of image processing, computer vision, and augmented reality, and specifically to a conformal mapping sparse representation method for image and video scene content.
Background art
In recent years, sparse representation and dictionary learning have attracted considerable attention as a research focus and have been widely used in image processing and computer vision, for example in image super-resolution, image denoising, classification, and color editing. Sparse representation reconstructs a signal as a linear combination of atoms from an overcomplete dictionary and limits the number of atoms used so that the representation is sparse.
Many researchers currently work on sparse representation methods, in which the dictionary plays a central role. Michal Aharon et al. proposed the K-SVD dictionary learning method in 2006 and applied it to image processing. Honglak Lee et al. proposed a fast sparse coding method in 2006 that speeds up the solution. Mairal et al. proposed an online dictionary learning method based on stochastic approximation in 2009 that handles large data sets effectively. These methods concentrate on how the sparse representation is solved, on running efficiency, and on the reconstruction ability of the dictionary, but they depend on a large number of training samples. In addition, the number of dictionary atoms must be set manually and cannot adapt automatically, so the resulting dictionary is redundant. Other sparse representation methods have made progress on the compactness and expressiveness of the dictionary. For example, Qiu et al. proposed an action-attribute dictionary learning method based on maximum mutual information in 2011, and Siyahjani et al. proposed a context-aware dictionary in 2013 for recognizing and localizing objects in images. These dictionary learning methods increase the separation between classes but do not consider local relations in the data space or contextual information, which limits the expressive power of the dictionary. Some studies show that preserving the local structural relations among data points improves fidelity when the data are reconstructed and avoids distortion.
Sparse representation is increasingly applied in image processing and computer vision. For example, Elad et al. used K-SVD for image denoising; Yang et al. proposed in 2010 a method that learns a high-resolution and a low-resolution dictionary simultaneously through sparse representation and applied it to image super-resolution; Chen et al. proposed in 2014 to use sparse representation for edit propagation, which can process very high-resolution images and video while greatly reducing memory use. Sparse representation can also be applied to face recognition, image restoration, image classification, and similar tasks. In all of these applications, producing results with higher visual fidelity remains a focus of sparse representation research.
Summary of the invention
To overcome the above deficiencies of the prior art, the present invention proposes a conformal mapping sparse representation method for image and video scene content. By introducing conformal mapping, the method preserves the local angle information between adjacent samples as far as possible and obtains a dictionary with stronger expressive power. At the same time, conformal mapping drives adjacent samples to be reconstructed from similar dictionary atoms, which makes the dictionary more concise and compact. As a result, reconstructions after image editing better preserve the original local structure, which improves the visual quality and realism of the generated result.
To achieve this objective, the present invention adopts the following technical solution.
The conformal mapping sparse representation method for image and video scene content of the present invention comprises the following steps:
Step 1: input an original image or video and sample it in a feature space;
Step 2: in the feature space, compute the K nearest neighbors of each sample, build a locally complete adjacency graph, and then compute the distances between adjacent samples;
Step 3: combine the conformal mapping rule with the sparse representation method to learn a dictionary with the conformal property;
Step 4: for the specific application, reconstruct the original image or video with this dictionary to obtain the result.
The "locally complete adjacency graph" of Step 2 means that, within the set formed by a sample and its K nearest neighbors, every pair of samples is connected.
The "conformal mapping rule" of Step 3 is a manifold learning technique and is described as follows. Given a mapping $g: M \to N$ from a feature space $M$ to another feature space $N$, let $(x_i, x_j, x_k)$ be adjacent sample points in $M$ that form a triangle, and let $(\alpha_i, \alpha_j, \alpha_k)$ be the images of these sample points in $N$. The conformal mapping rule requires:
$$\min \sum_{j,k \in N_i} \left( \|x_j - x_k\|_2 - s_i \|\alpha_j - \alpha_k\|_2 \right)^2,$$
where $N_i$ denotes the set of K nearest neighbors of sample $x_i$ and $s_i$ denotes the scale change of the mapping.
Combining the conformal mapping rule with the sparse representation algorithm to learn a dictionary with the conformal property, as in Step 3, yields the following energy function:
$$\min_{D,\alpha,S} \sum_i \|x_i - D\alpha_i\|_2^2 + \lambda_1 \sum_i \|\alpha_i\|_1 + \lambda_2 \sum_i \sum_{j,k \in N_i} \left( \|x_j - x_k\|_2 - s_i \|\alpha_j - \alpha_k\|_2 \right)^2,$$
where $x$ is the input sample feature, $D$ is the feature dictionary, $\alpha$ is the reconstruction coefficient, and $\lambda_1$, $\lambda_2$ are weight coefficients. The energy function is minimized by an iterative algorithm, which finally yields the dictionary $D$ with the conformal property.
The method can be applied to video and image editing applications such as image super-resolution, video and image color editing, and image denoising.
Compared with the prior art, the present invention has the following beneficial features:
1. Building on sparse representation, introducing the conformal mapping rule preserves the local angle information between adjacent samples as far as possible and yields a dictionary with stronger expressive power; conformal mapping also drives adjacent samples to be reconstructed from similar dictionary atoms, which makes the dictionary more concise and compact.
2. Thanks to the more concise and more expressive dictionary, reconstructions after image editing better preserve the original local structure, which improves the visual quality and realism of the generated result.
3. The proposed method can be applied effectively in many fields, including image super-resolution, video and image color editing, and image denoising.
Brief description of the drawings
Fig. 1 is a flow chart of the method of the present invention;
Fig. 2 is a schematic diagram of the principle of the present invention;
Fig. 3 is a flow chart of the overall dictionary learning algorithm of the present invention.
The symbols in the figures are as follows:
D: the dictionary learned in a particular feature space;
A: the reconstruction coefficients;
S: the scale coefficients;
$x_i$, $x_j$, $x_k$: input sample points, that is, sample features of the image or video;
$\alpha_i$, $\alpha_j$, $\alpha_k$: the sample points mapped to the other space, that is, the sparse reconstruction coefficients.
Detailed description of the embodiments
To make the objectives, technical solution, and advantages of the present invention clearer, the method is explained in detail below with reference to the accompanying drawings. The specific examples described here only illustrate the present invention and are not intended to limit it.
The present invention proposes a conformal mapping sparse representation method for image and video scene content. By introducing the conformal mapping rule, the method preserves the local angle information between adjacent samples as far as possible and obtains a more concise and more expressive dictionary. When video or image editing is carried out with a dictionary generated by this method, the reconstruction better preserves the original local structure, which improves the visual quality and realism of the generated result. The method is applied to three typical applications: image super-resolution, video and image color editing, and image denoising.
The flow of the method is shown in Fig. 1, and the embodiment is as follows:
Step 1: input an original image or video and sample it in a feature space.
The input image or video is sampled to obtain the input sample set X. The feature space is chosen according to the application. For image super-resolution, the image is converted from the RGB color space to the YCbCr color space and the luminance channel Y is sampled at patch level. For color editing, RGB color features are sampled at pixel level. For image denoising, grayscale or RGB color features are sampled at patch level.
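The patch-level luminance sampling described for super-resolution could look roughly like the sketch below. The BT.601 luma weights, the regular sampling grid, and the names `rgb_to_luma` and `sample_luma_patches` are assumptions made for illustration; the text does not specify the conversion matrix, the patch size, or the stride.

```python
def rgb_to_luma(pixel):
    """BT.601 luma (the Y of YCbCr) for one (R, G, B) pixel with values in 0..255."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def sample_luma_patches(image, patch_size, stride):
    """Return flattened patch_size x patch_size luma patches taken on a regular grid.

    `image` is a list of rows, each row a list of (R, G, B) tuples.
    """
    height, width = len(image), len(image[0])
    luma = [[rgb_to_luma(p) for p in row] for row in image]
    patches = []
    for top in range(0, height - patch_size + 1, stride):
        for left in range(0, width - patch_size + 1, stride):
            patch = [luma[top + dy][left + dx]
                     for dy in range(patch_size)
                     for dx in range(patch_size)]
            patches.append(patch)
    return patches


# A 4x4 image that is uniformly mid-gray yields four 2x2 patches.
image = [[(128, 128, 128)] * 4 for _ in range(4)]
X = sample_luma_patches(image, patch_size=2, stride=2)
print(len(X), len(X[0]))
```

Each flattened patch plays the role of one sample $x_i$ in the sample set X used by the later steps.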
Step 2: in the feature space, compute the K nearest neighbors of each sample, build a locally complete adjacency graph, and then compute the distances between adjacent samples.
First, the K nearest neighbors of each sample $x_i$ are found in the feature space with a k-d tree, using the Euclidean distance. Within the set formed by $x_i$ and its K nearest neighbors, every pair of samples is connected to form the locally complete adjacency graph, and the Euclidean distance between each pair of connected samples is computed in the feature space.
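As a minimal sketch of this step, the code below builds the locally complete graph for one sample. To stay self-contained it uses a brute-force neighbor search instead of the k-d tree named in the text, and the function names are hypothetical.

```python
import math
from itertools import combinations


def euclidean(a, b):
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def k_nearest(X, i, k):
    """Indices of the k samples closest to X[i] (brute force, excluding i itself)."""
    others = [j for j in range(len(X)) if j != i]
    others.sort(key=lambda j: euclidean(X[i], X[j]))
    return others[:k]


def local_complete_graph(X, i, k):
    """Edges and edge lengths of the complete graph on X[i] and its k neighbors."""
    members = sorted([i] + k_nearest(X, i, k))
    return {(j, m): euclidean(X[j], X[m]) for j, m in combinations(members, 2)}


X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
edges = local_complete_graph(X, i=0, k=2)
print(edges)
```

For large sample sets a k-d tree, as the text specifies, replaces the linear scan in `k_nearest`; the graph construction itself is unchanged.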
Step 3: combine the conformal mapping rule with the sparse representation method to learn a dictionary with the conformal property.
Given the input sample set $X = [x_1, x_2, \ldots, x_n]$, the sparse representation method yields an overcomplete dictionary $D$ and reconstruction coefficients $\alpha$:
$$\min_{D,\alpha} \sum_i \|x_i - D\alpha_i\|_2^2 + \lambda \sum_i \|\alpha_i\|_1.$$
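For a fixed dictionary, the baseline objective above separates over samples, and each coefficient vector can be found with a standard proximal-gradient method. The sketch below uses ISTA (iterative soft-thresholding), which is a common generic solver and not necessarily the one used by the invention; the step-size bound and the function names are assumptions.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |v|."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(x, D, lam, iterations=200):
    """Approximately minimise ||x - D a||_2^2 + lam * ||a||_1 over a.

    D is a list of rows (signal_dim x n_atoms).
    """
    n_atoms = len(D[0])
    # 2 * ||D||_F^2 upper-bounds the Lipschitz constant of the gradient.
    step = 1.0 / (2.0 * sum(d * d for row in D for d in row))
    a = [0.0] * n_atoms
    for _ in range(iterations):
        residual = [sum(row[j] * a[j] for j in range(n_atoms)) - xi
                    for row, xi in zip(D, x)]
        grad = [2.0 * sum(D[r][j] * residual[r] for r in range(len(D)))
                for j in range(n_atoms)]
        a = [soft_threshold(a[j] - step * grad[j], step * lam)
             for j in range(n_atoms)]
    return a


# Two orthonormal atoms; the signal lies mostly along the first one.
D = [[1.0, 0.0],
     [0.0, 1.0]]
alpha = ista(x=[3.0, 0.1], D=D, lam=1.0)
print(alpha)
```

With an orthonormal dictionary the minimizer is the soft-thresholded signal, so the small second component is driven exactly to zero while the first shrinks by `lam / 2`.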
To improve the performance of the sparse representation method, the present invention introduces the local structure information of the input data by adding a conformal term $f(\alpha)$ to the above formula.
Conformal mapping has been shown to improve results in the field of manifold learning. The method is as follows. Given a mapping $g: M \to N$ from a feature space $M$ to another feature space $N$, let $(x_i, x_j, x_k)$ be adjacent sample points in $M$ that form a triangle, and let $(\alpha_i, \alpha_j, \alpha_k)$ be the images of these sample points in $N$, as shown in Fig. 2. The conformal mapping rule requires:
$$\min \sum_{j,k \in N_i} \left( \|x_j - x_k\|_2 - s_i \|\alpha_j - \alpha_k\|_2 \right)^2,$$
where $N_i$ denotes the set of K nearest neighbors of sample $x_i$ and $s_i$ denotes the scale change after the mapping.
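The conformal term can be evaluated directly from pairwise distances, as the sketch below illustrates for a single neighborhood. Counting each unordered pair $(j, k)$ once is an assumption, since the formula does not say whether pairs are ordered, and the function names are hypothetical.

```python
import math
from itertools import combinations


def distance(a, b):
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def conformal_residual(X, A, neighborhood, scale):
    """Sum over neighbor pairs of (||x_j - x_k|| - scale * ||a_j - a_k||)^2."""
    return sum(
        (distance(X[j], X[k]) - scale * distance(A[j], A[k])) ** 2
        for j, k in combinations(neighborhood, 2)
    )


# A right triangle and a copy of it shrunk by a factor of two.
X = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
A = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.5)]
print(conformal_residual(X, A, [0, 1, 2], scale=2.0))  # 0.0: similar triangles
print(conformal_residual(X, A, [0, 1, 2], scale=1.0))  # 12.5: wrong scale
```

The residual vanishes exactly when the mapped triangle is a scaled copy of the original one, which is the sense in which the term preserves angles between adjacent samples.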
Then the conformal mapping rule is combined with the sparse representation algorithm to obtain the following energy function:
$$\min_{D,\alpha,S} \sum_i \|x_i - D\alpha_i\|_2^2 + \lambda_1 \sum_i \|\alpha_i\|_1 + \lambda_2 \sum_i \sum_{j,k \in N_i} \left( \|x_j - x_k\|_2 - s_i \|\alpha_j - \alpha_k\|_2 \right)^2,$$
where $x$ is the input sample feature, $D$ is the feature dictionary, $\alpha$ is the reconstruction coefficient, and $\lambda_1$, $\lambda_2$ are weight coefficients. The energy function is minimized by an iterative algorithm, which finally yields the dictionary $D$ with the conformal property.
The above formula has three unknowns $(D, \alpha, S)$, where $D$ is the dictionary to be found, $\alpha$ is the sparse reconstruction coefficient, and $S$ is the scale change. The present invention therefore decomposes the problem into three subproblems: sparse coding, dictionary update, and scale update. Each subproblem optimizes one variable while the other two are held fixed. The three steps are iterated in a loop until the optimal solution is obtained.
First, $D$ and $S$ are initialized with random values. In the sparse coding stage, $D$ and $S$ are fixed and the coefficients $\alpha$ are solved from:
$$J(A) = \arg\min_{\alpha} \sum_i \|x_i - D\alpha_i\|_2^2 + \lambda_1 \sum_i \|\alpha_i\|_1 + \lambda_2 \sum_i \sum_{j,k \in N_i} \left( \|x_j - x_k\|_2 - s_i \|\alpha_j - \alpha_k\|_2 \right)^2.$$
The present invention solves this formula with an iterative projection method.
Then, in the dictionary update stage, $\alpha$ and $S$ are fixed and $D$ is solved from:
$$J(D) = \arg\min_{D} \sum_i \|x_i - D\alpha_i\|_2^2.$$
Each atom $d_j$ of the dictionary is required to be a unit vector. This is a quadratic programming problem, and the atoms of the dictionary can be updated one at a time.
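One way to update the atoms one at a time is block coordinate descent: refit a single atom to the residual left by all other atoms, then rescale it to unit length. The sketch below shows that step under the assumption that this is an acceptable stand-in for the per-atom update, whose exact form the text does not give; `update_atom` and its data layout are hypothetical.

```python
import math


def update_atom(X, D, A, j):
    """Refit atom j of D to the residual left by the other atoms, then normalise.

    X: list of samples (each a list of length dim)
    D: list of atoms (each a list of length dim), i.e. the dictionary columns
    A: list of coefficient vectors; A[i][j] weights atom j for sample i
    """
    dim = len(D[j])
    new_atom = [0.0] * dim
    for x, a in zip(X, A):
        if a[j] == 0.0:
            continue  # this sample does not use atom j
        for r in range(dim):
            others = sum(a[l] * D[l][r] for l in range(len(D)) if l != j)
            new_atom[r] += a[j] * (x[r] - others)
    norm = math.sqrt(sum(v * v for v in new_atom))
    if norm > 0.0:
        D[j] = [v / norm for v in new_atom]  # enforce the unit-norm constraint
    return D[j]


X = [[2.0, 0.0], [0.0, 3.0]]
D = [[0.6, 0.8], [0.0, 1.0]]   # atom 0 starts out misaligned
A = [[2.0, 0.0], [0.0, 3.0]]
print(update_atom(X, D, A, j=0))
```

With the other atoms and all coefficients fixed, the normalized coefficient-weighted residual is the unit vector that minimizes the reconstruction error for atom `j`, so sweeping over all atoms never increases $J(D)$.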
Finally, in the scale update stage, $D$ and $\alpha$ are fixed and $S$ is solved from:
$$J(S) = \arg\min_{S} \sum_i \sum_{j,k \in N_i} \left( \|x_j - x_k\|_2 - s_i \|\alpha_j - \alpha_k\|_2 \right)^2.$$
Each $s_i$ in the above formula is independent of the others, so each can be solved separately by least squares:
$$s_i = \frac{\sum_{j,k \in N_i} \|x_j - x_k\|_2 \cdot \|\alpha_j - \alpha_k\|_2}{\sum_{j,k \in N_i} \left( \|\alpha_j - \alpha_k\|_2 \right)^2}.$$
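The closed form follows from setting the derivative of the quadratic in $s_i$ to zero. A small sketch of the update for one neighborhood is given below; counting each unordered pair once and falling back to a scale of 1.0 when all coefficient vectors coincide are assumptions added for illustration.

```python
import math
from itertools import combinations


def distance(a, b):
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def update_scale(X, A, neighborhood):
    """Closed-form least-squares scale s_i for one neighborhood."""
    numerator = 0.0
    denominator = 0.0
    for j, k in combinations(neighborhood, 2):
        dx = distance(X[j], X[k])
        da = distance(A[j], A[k])
        numerator += dx * da
        denominator += da * da
    # If all coefficient vectors coincide the scale is undetermined; keep 1.0.
    return numerator / denominator if denominator > 0.0 else 1.0


X = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
A = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.5)]
print(update_scale(X, A, [0, 1, 2]))  # 2.0
```

Because the ratio of sums is unchanged when every pair is counted twice, the result is the same whether the sum runs over ordered or unordered pairs.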
These three processes are iterated until the optimal solution is obtained. The algorithm flow chart is shown in Fig. 3.
Step 4: reconstruct the original image or video with this dictionary to obtain the result.
The present invention uses three different applications to verify the performance of the method: image super-resolution, video and image color editing, and image denoising.
Image super-resolution reconstructs a high-resolution image from a low-resolution image. First, a library of corresponding high-resolution and low-resolution image pairs is built, and the above dictionary learning method is used to learn two dictionaries from the library simultaneously. When a low-resolution image is input, it is reconstructed with the low-resolution dictionary to obtain the coefficients, and the coefficients are then combined with the high-resolution dictionary to reconstruct the corresponding high-resolution image.
Video and image color editing changes the color information of a video or image interactively. After the video or image is input, its color dictionary is learned first. When the user marks a color on an object in the image with a brush, the corresponding color in the dictionary is changed to the marked color, and this change propagates to the whole video or image, giving the final color editing result.
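The propagation idea can be illustrated with a toy example: if every pixel is stored as coefficients over a small color dictionary, recoloring one atom changes every pixel that uses it. The two-color dictionary, the coefficients, and the `reconstruct` helper below are hypothetical, and the sketch omits how the atom matching the user's stroke is selected.

```python
def reconstruct(D, A):
    """Rebuild each pixel as the coefficient-weighted sum of dictionary colours."""
    return [
        tuple(sum(a[j] * D[j][c] for j in range(len(D))) for c in range(3))
        for a in A
    ]


# Hypothetical two-colour dictionary (red, blue) and per-pixel coefficients.
D = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
A = [[1.0, 0.0],   # pure red pixel
     [0.5, 0.5],   # half red, half blue
     [0.0, 1.0]]   # pure blue pixel

before = reconstruct(D, A)
D[0] = (0.0, 1.0, 0.0)   # the user recolours the red atom to green
after = reconstruct(D, A)
print(before)
print(after)
```

Only the dictionary entry is edited; the per-pixel coefficients stay fixed, so the blue pixel is untouched and the mixed pixel changes in proportion to its red coefficient.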
Image denoising filters Gaussian noise out of an image. Given a noisy image, image blocks of size 8×8 are collected first and used as data to learn a dictionary. The image is then reconstructed with the matching pursuit method to obtain the denoised image.
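A minimal sketch of the reconstruction step, assuming plain matching pursuit over unit-norm atoms, is shown below. The four-element patches and the two-atom dictionary are toy stand-ins for the 8×8 blocks and learned dictionary of the text, and the fixed number of selected atoms is an assumed stopping rule.

```python
def matching_pursuit(x, atoms, n_nonzero):
    """Greedy sparse code of x over unit-norm atoms (plain matching pursuit)."""
    residual = list(x)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        scores = [sum(r * d for r, d in zip(residual, atom)) for atom in atoms]
        best = max(range(len(atoms)), key=lambda j: abs(scores[j]))
        coeffs[best] += scores[best]
        residual = [r - scores[best] * d for r, d in zip(residual, atoms[best])]
    return coeffs


def reconstruct(coeffs, atoms):
    return [sum(c * atom[r] for c, atom in zip(coeffs, atoms))
            for r in range(len(atoms[0]))]


# Hypothetical toy dictionary: a flat atom and an edge atom (both unit norm).
atoms = [[0.5, 0.5, 0.5, 0.5],
         [0.5, 0.5, -0.5, -0.5]]
noisy = [4.1, 3.9, 4.0, 4.0]   # a flat patch of value 4 plus small noise
coeffs = matching_pursuit(noisy, atoms, n_nonzero=1)
print(reconstruct(coeffs, atoms))
```

Keeping only the strongest atom discards the component of the patch that the dictionary does not explain sparsely, which is where the noise ends up in this example.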
The dictionary obtained with the present invention has good representation and reconstruction ability and is also more concise, as a comparison with conventional methods shows. For example, the conventional dictionary learning method K-SVD yields a dictionary of size 256, whereas the present invention reduces the size to 205 with stronger expressive power. The expressive power of a dictionary can be measured by the correlation coefficient among its atoms: the smaller the coefficient, the stronger the expressive power. The dictionary obtained by the conventional sparse representation method has a correlation coefficient of 0.8817; after the present invention introduces conformal mapping, the correlation coefficient drops to 0.8477, which shows that the dictionary learned by the present invention has stronger expressive power.
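The text does not state how the correlation coefficient of a dictionary is computed. One common measure of this kind is mutual coherence, the largest absolute cosine similarity between two distinct atoms; the sketch below computes it purely as an illustrative assumption and may not match the statistic behind the reported figures.

```python
import math
from itertools import combinations


def mutual_coherence(atoms):
    """Largest absolute cosine similarity between two distinct atoms."""
    def cosine(a, b):
        dot = sum(u * v for u, v in zip(a, b))
        norm_a = math.sqrt(sum(u * u for u in a))
        norm_b = math.sqrt(sum(v * v for v in b))
        return dot / (norm_a * norm_b)
    return max(abs(cosine(a, b)) for a, b in combinations(atoms, 2))


orthogonal = [[1.0, 0.0], [0.0, 1.0]]
redundant = [[1.0, 0.0], [0.6, 0.8]]
print(mutual_coherence(orthogonal), mutual_coherence(redundant))
```

Under this measure a lower value means the atoms overlap less, which is consistent with the statement that a smaller coefficient indicates a more expressive dictionary.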
The above is only a basic description of the present invention. Any equivalent transformation made according to the technical solution of the present invention shall fall within the scope of protection of the present invention.

Claims (6)

1. A conformal mapping sparse representation method for image and video scene content, characterized by comprising the following steps:
(1) inputting an original image or video and sampling it in a feature space;
(2) in the feature space, computing the K nearest neighbors of each sample, building a locally complete adjacency graph, and then computing the distances between adjacent samples;
(3) combining the conformal mapping rule with the sparse representation method to learn a dictionary with the conformal property;
(4) reconstructing the original image or video with this dictionary to obtain the result.
2. The conformal mapping sparse representation method for image and video scene content according to claim 1, characterized in that the locally complete adjacency graph of step (2) means that, within the set formed by a sample and its K nearest neighbors, every pair of samples is connected.
3. The conformal mapping sparse representation method for image and video scene content according to claim 1, characterized in that the conformal mapping rule of step (3) is a manifold learning technique and is described as follows: given a mapping $g: M \to N$ from a feature space $M$ to another feature space $N$, $(x_i, x_j, x_k)$ are adjacent sample points in $M$ that form a triangle, and $(\alpha_i, \alpha_j, \alpha_k)$ are the images of these sample points in $N$; the conformal mapping rule requires:
$$\min \sum_{j,k \in N_i} \left( \|x_j - x_k\|_2 - s_i \|\alpha_j - \alpha_k\|_2 \right)^2,$$
where $N_i$ denotes the set of K nearest neighbors of sample $x_i$ and $s_i$ denotes the scale change of the mapping.
4. The conformal mapping sparse representation method for image and video scene content according to claim 1, characterized in that, in step (3), the conformal mapping rule is combined with the sparse representation algorithm to learn a dictionary with the conformal property, which yields the following energy function:
$$\min_{D,\alpha,S} \sum_i \|x_i - D\alpha_i\|_2^2 + \lambda_1 \sum_i \|\alpha_i\|_1 + \lambda_2 \sum_i \sum_{j,k \in N_i} \left( \|x_j - x_k\|_2 - s_i \|\alpha_j - \alpha_k\|_2 \right)^2,$$
where $x$ is the input sample feature, $D$ is the feature dictionary, $\alpha$ is the reconstruction coefficient, and $\lambda_1$, $\lambda_2$ are weight coefficients; the energy function is minimized by an iterative algorithm, which finally yields the dictionary $D$ with the conformal property.
5. The conformal mapping sparse representation method for image and video scene content according to claim 1, characterized in that the method is applied to video and image editing applications, including image super-resolution, video and image color editing, and image denoising.
6. The conformal mapping sparse representation method for image and video scene content according to claim 1, characterized in that introducing the conformal mapping rule preserves the local angle information between adjacent samples as far as possible and yields a dictionary with stronger expressive power; at the same time, conformal mapping drives adjacent samples to be reconstructed from similar dictionary atoms, which makes the dictionary more concise and compact.
CN201510337089.5A 2015-06-17 2015-06-17 A kind of conformal projection sparse expression method of image/video scene content Active CN104966276B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510337089.5A CN104966276B (en) 2015-06-17 2015-06-17 A kind of conformal projection sparse expression method of image/video scene content

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510337089.5A CN104966276B (en) 2015-06-17 2015-06-17 A kind of conformal projection sparse expression method of image/video scene content

Publications (2)

Publication Number Publication Date
CN104966276A (en) 2015-10-07
CN104966276B (en) 2017-10-20

Family

ID=54220307

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510337089.5A Active CN104966276B (en) 2015-06-17 2015-06-17 A kind of conformal projection sparse expression method of image/video scene content

Country Status (1)

Country Link
CN (1) CN104966276B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106548476A (en) * 2016-11-25 2017-03-29 天津工业大学 Using medical image statistics pulmonary three-dimensional feature Method On Shape

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100033551A1 (en) * 2008-08-08 2010-02-11 Adobe Systems Incorporated Content-Aware Wide-Angle Images
CN103049340A (en) * 2012-10-26 2013-04-17 中山大学 Image super-resolution reconstruction method of visual vocabularies and based on texture context constraint
CN104268593A (en) * 2014-09-22 2015-01-07 华东交通大学 Multiple-sparse-representation face recognition method for solving small sample size problem

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
SIVARAM N. RAMADAS等: ""Application of conformal map theory for design of 2-D ultrasonic array structure for ndt imaging application: a feasibility study"", 《IEEE TRANSACTIONS ON ULTRASONICS, FERROELECTRICS, AND FREQUENCY CONTROL》 *
高文娟: ""稀疏流形建模及其在人脸识别中的应用"", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106548476A (en) * 2016-11-25 2017-03-29 天津工业大学 Using medical image statistics pulmonary three-dimensional feature Method On Shape
CN106548476B (en) * 2016-11-25 2019-11-12 天津工业大学 Lung's three-dimensional feature Method On Shape is counted using medical image

Also Published As

Publication number Publication date
CN104966276B (en) 2017-10-20

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant