Summary of the invention
Aspects and advantages of the present invention are set forth in part in the following description, are in part apparent from the description, or may be learned by practice of the present invention.
To overcome the problems of existing multi-feature extraction and fusion techniques, such as slow matching speed, the accumulation of each feature's unfavorable factors, and a high false-detection rate, the present invention provides a multi-feature extraction and fusion method and system for images. The advantages of each feature are made full use of through a cascade that matches progressively from coarse to fine, so that the similar regions of the target image and the image to be matched are determined by stepwise refinement rather than by direct matching between two whole images, achieving fast, efficient and accurate matching and saving labor and material resources.
The technical solution adopted by the present invention to solve the above technical problem is as follows:
According to one aspect of the present invention, a multi-feature extraction and fusion method for images is provided, comprising the following steps:
S1. Divide the image to be matched into a plurality of image regions, extract color features from the image regions, coarsely match them against the target image to determine color similarity, and select the most similar image region in the image to be matched as the matching region; if the color similarity of the matching region exceeds a set color similarity threshold, proceed to the next step;
S2. Extract auxiliary features from the matching region and match them against the target image to determine auxiliary feature similarity, the auxiliary features comprising at least one of texture features and shape features;
S3. On the basis of the color similarity and the auxiliary feature similarity, perform a fusion judgment of the color and texture features and decide whether a confidence requirement is met; if it is met, output a comprehensive confidence: if the texture feature similarity is greater than a set texture feature similarity threshold, the comprehensive confidence is determined by, or based on, the texture feature similarity; otherwise the proportion of the color similarity in the comprehensive confidence is increased;
Wherein the coarse matching in step S1 comprises the steps of:
A1. performing color space conversion and color layering on the image to be matched;
The color layering comprises: dividing hue H into 8 parts and saturation S and value V into 3 parts each, and quantizing the color space in layers according to the color space and the subjective human perception of color, so that the color space is divided into 72 colors; the quantized components are combined into a single index, L = 9H + 3S + V (H ∈ {0, …, 7}; S, V ∈ {0, 1, 2});
A2. determining the color similarity by color-based template matching with color histograms: for each divided image region, according to the color regions obtained by segmentation, the similarity between the sample color region and the image region of the image to be matched is calculated by the absolute-value distance method.
Let the two color regions be I and Q, and let the image be divided by the concentric-rectangle division method into n concentric rectangles. Based on the 72-bin HSV histograms obtained by the layering, the distance D_i of the corresponding parts is:
D_i = Σ_{j=1}^{72} |h_I(j) − h_Q(j)|
wherein h_I(j) and h_Q(j) are the values of color regions I and Q in the j-th histogram bin, and the similarity ranking is preserved;
The auxiliary feature matching in step S2 comprises the steps of:
B1. converting the color image to a grayscale image; for an image with N gray levels, the co-occurrence matrix is an N × N matrix, namely M = [m_hk], wherein the value of the element m_hk at position (h, k) is the number of occurrences of pixel pairs in which one pixel has gray level h and the other, at a given displacement from it, has gray level k.
The four feature quantities extracted from the texture co-occurrence matrix are:
Contrast: CON = Σ_h Σ_k (h − k)² · m_hk
Energy: ASM = Σ_h Σ_k m_hk²
Entropy: ENT = −Σ_h Σ_k m_hk · log m_hk
Correlation: COR = (Σ_h Σ_k h · k · m_hk − μ_x μ_y) / (σ_x σ_y)
wherein m_x(h) = Σ_k m_hk is the sum of the elements of each row of the matrix M, m_y(k) = Σ_h m_hk is the sum of the elements of each column, and μ_x, μ_y, σ_x, σ_y are the means and standard deviations of m_x and m_y respectively;
B2. generating the difference-of-Gaussian scale space (DoG scale space) by convolving difference-of-Gaussian kernels of different scales with the image:
D(x,y,σ) = (G(x,y,kσ) − G(x,y,σ)) * I(x,y)
wherein G(x, y, σ) is a variable-scale Gaussian function, (x, y) are the spatial coordinates, and σ is the scale coordinate;
B3. comparing each sample point with all of its neighbors to see whether it is larger or smaller than its neighbors in the image domain and the scale domain: the middle detection point is compared with its 8 neighbors at the same scale and the 9 × 2 corresponding points at the adjacent scales above and below, 26 points in all, to ensure that extreme points are detected in both scale space and the two-dimensional image space;
B4. determining the position and scale of each keypoint precisely by fitting a three-dimensional quadratic function, and removing low-contrast keypoints and unstable edge-response points, so as to enhance matching stability and improve noise resistance;
B5. assigning a direction parameter to each keypoint using the gradient direction distribution of the pixels in its neighborhood, so that the operator is rotation invariant:
m(x, y) = √[(L(x+1, y) − L(x−1, y))² + (L(x, y+1) − L(x, y−1))²]
θ(x, y) = arctan[(L(x, y+1) − L(x, y−1)) / (L(x+1, y) − L(x−1, y))]
The above two formulas give the magnitude and direction of the gradient at (x, y), wherein the scale used for L is the scale at which each keypoint lies;
B6. rotating the coordinate axes to the direction of the keypoint, to ensure rotation invariance;
B7. taking an 8 × 8 window centered on the keypoint, the central point being the position of the current keypoint and each small cell representing one pixel of the scale space in the keypoint's neighborhood, then computing an 8-direction gradient orientation histogram on each 4 × 4 block and obtaining the accumulated value of each gradient direction to form one seed point;
B8. describing each keypoint with 4 × 4 = 16 seed points, producing 128 values per keypoint and finally forming a 128-dimensional SIFT feature vector;
B9. adopting the Euclidean distance between keypoint feature vectors as the similarity measure for keypoints in the two images: a keypoint in the sample image is taken, and the two keypoints in the image to be matched with the smallest Euclidean distances are found; if the nearest distance divided by the second-nearest distance is less than a preset ratio threshold, the pair of matching points is accepted.
According to one embodiment of the present invention, in step S2 the auxiliary features are texture features, the texture features comprising one or more of the following: texture features of the gray-level co-occurrence matrix, and rotation- and scale-invariant texture features.
According to another aspect of the present invention, a multi-feature extraction and fusion system for images is provided, comprising a matching module, the matching module comprising:
a color matching module for dividing the image to be matched into a plurality of image regions, extracting color features from the image regions, coarsely matching them against the target image to determine color similarity, and selecting the most similar image region in the image to be matched as the matching region; if the color similarity of the matching region exceeds a set color similarity threshold, processing passes to the auxiliary feature matching module;
an auxiliary feature matching module for extracting auxiliary features from the matching region and matching them against the target image to determine auxiliary feature similarity, the auxiliary features comprising at least one of texture features and shape features;
a comprehensive judgment module for performing, on the basis of the color similarity and the auxiliary feature similarity, a fusion judgment of the color and texture features and deciding whether a confidence requirement is met; if it is met, outputting a comprehensive confidence: if the texture feature similarity is greater than a set texture feature similarity threshold, the comprehensive confidence is determined by, or based on, the texture feature similarity; otherwise the proportion of the color similarity in the comprehensive confidence is increased;
When performing the coarse matching, the color matching module performs color space conversion and color layering on the image to be matched; the color layering comprises: dividing hue H into 8 parts and saturation S and value V into 3 parts each, and quantizing the color space in layers according to the color space and the subjective human perception of color, so that the color space is divided into 72 colors, the quantized components being combined into a single index, L = 9H + 3S + V (H ∈ {0, …, 7}; S, V ∈ {0, 1, 2});
The color matching module also determines the color similarity by color-based template matching with color histograms: for each divided image region, according to the color regions obtained by segmentation, the similarity between the sample color region and the image region of the image to be matched is calculated by the absolute-value distance method.
Let the two color regions be I and Q, and let the image be divided by the concentric-rectangle division method into n concentric rectangles. Based on the 72-bin HSV histograms obtained by the layering, the distance D_i of the corresponding parts is:
D_i = Σ_{j=1}^{72} |h_I(j) − h_Q(j)|
wherein h_I(j) and h_Q(j) are the values of color regions I and Q in the j-th histogram bin, and the similarity ranking is preserved;
The auxiliary feature matching module performs auxiliary feature matching by executing the steps of:
B1. converting the color image to a grayscale image; for an image with N gray levels, the co-occurrence matrix is an N × N matrix, namely M = [m_hk], wherein the value of the element m_hk at position (h, k) is the number of occurrences of pixel pairs in which one pixel has gray level h and the other, at a given displacement from it, has gray level k.
The four feature quantities extracted from the texture co-occurrence matrix are:
Contrast: CON = Σ_h Σ_k (h − k)² · m_hk
Energy: ASM = Σ_h Σ_k m_hk²
Entropy: ENT = −Σ_h Σ_k m_hk · log m_hk
Correlation: COR = (Σ_h Σ_k h · k · m_hk − μ_x μ_y) / (σ_x σ_y)
wherein m_x(h) = Σ_k m_hk is the sum of the elements of each row of the matrix M, m_y(k) = Σ_h m_hk is the sum of the elements of each column, and μ_x, μ_y, σ_x, σ_y are the means and standard deviations of m_x and m_y respectively;
B2. generating the difference-of-Gaussian scale space (DoG scale space) by convolving difference-of-Gaussian kernels of different scales with the image:
D(x,y,σ) = (G(x,y,kσ) − G(x,y,σ)) * I(x,y)
wherein G(x, y, σ) is a variable-scale Gaussian function, (x, y) are the spatial coordinates, and σ is the scale coordinate;
B3. comparing each sample point with all of its neighbors to see whether it is larger or smaller than its neighbors in the image domain and the scale domain: the middle detection point is compared with its 8 neighbors at the same scale and the 9 × 2 corresponding points at the adjacent scales above and below, 26 points in all, to ensure that extreme points are detected in both scale space and the two-dimensional image space;
B4. determining the position and scale of each keypoint precisely by fitting a three-dimensional quadratic function, and removing low-contrast keypoints and unstable edge-response points, so as to enhance matching stability and improve noise resistance;
B5. assigning a direction parameter to each keypoint using the gradient direction distribution of the pixels in its neighborhood, so that the operator is rotation invariant:
m(x, y) = √[(L(x+1, y) − L(x−1, y))² + (L(x, y+1) − L(x, y−1))²]
θ(x, y) = arctan[(L(x, y+1) − L(x, y−1)) / (L(x+1, y) − L(x−1, y))]
The above two formulas give the magnitude and direction of the gradient at (x, y), wherein the scale used for L is the scale at which each keypoint lies;
B6. rotating the coordinate axes to the direction of the keypoint, to ensure rotation invariance;
B7. taking an 8 × 8 window centered on the keypoint, the central point being the position of the current keypoint and each small cell representing one pixel of the scale space in the keypoint's neighborhood, then computing an 8-direction gradient orientation histogram on each 4 × 4 block and obtaining the accumulated value of each gradient direction to form one seed point;
B8. describing each keypoint with 4 × 4 = 16 seed points, producing 128 values per keypoint and finally forming a 128-dimensional SIFT feature vector;
B9. adopting the Euclidean distance between keypoint feature vectors as the similarity measure for keypoints in the two images: a keypoint in the sample image is taken, and the two keypoints in the image to be matched with the smallest Euclidean distances are found; if the nearest distance divided by the second-nearest distance is less than a preset ratio threshold, the pair of matching points is accepted.
Of course, the matching module may also be configured to perform one or more other steps of the multi-feature extraction and fusion method described above. The multi-feature extraction and fusion method of the present invention is mainly used in image matching and in image and video retrieval. It overcomes the accumulation of each feature's unfavorable factors found in existing multi-feature extraction and fusion methods: the advantages of each feature are made full use of through a cascade that matches progressively from coarse to fine, and the cascaded matching determines the similar regions of the target image and the image to be matched by stepwise refinement rather than by direct comparison between two whole images, achieving fast, efficient and accurate matching and saving labor and material resources.
Specifically, relative to the prior art, the present invention brings the following beneficial effects:
1. Through the cascade, the present invention matches progressively from coarse to fine and can reduce the detection area, so that the matching speed is greatly improved.
2. With the present invention, shape features (such as edge features) can be replaced by texture features as the auxiliary features, using the more robust SIFT (or SURF) algorithm, which detects well under scale and rotation changes, so that accuracy is greatly improved.
3. The cascade of the present invention combines color and auxiliary features in series and in parallel, and the final confidence is taken according to whether the two similarities are greater than the set thresholds; this not only greatly improves matching speed but also avoids the error caused in the prior art by directly taking a weighted sum of features, such as color, edge and texture, that are not directly comparable.
4. When applied to the retrieval of an intended target in an image (such as a vehicle, pedestrian or animal), the present invention first obtains the approximate region of the target from the image by color-based template matching, and then matches the target region exactly with more accurate auxiliary features (texture features, such as rotation- and scale-invariant features, may be adopted), thereby obtaining an accurate similarity for the target.
By reading the description, those of ordinary skill in the art will better understand the features and aspects of these and other embodiments.
Embodiment
The present invention provides a multi-feature extraction and fusion system for images. The system may be a host or a special-purpose device, a network system, or a software system installed in a host or special-purpose device; the key is that it comprises a matching module. As shown in Figure 3, the matching module comprises:
a color matching module for extracting color features from the image to be matched and performing color matching with the target image to determine color similarity; if the color similarity exceeds the set color similarity threshold, processing passes to the auxiliary feature matching module;
an auxiliary feature matching module for extracting auxiliary features from the image to be matched and matching them against the target image to determine auxiliary feature similarity, the auxiliary features comprising at least one of texture features and shape features;
a comprehensive judgment module for performing a comprehensive judgment on the basis of the color similarity and the auxiliary feature similarity, and obtaining the comprehensive similarity between the image to be matched and the target image.
According to an embodiment of the invention, the color matching module is configured to perform color-based template matching between the image to be matched and the target image to determine the color similarity. Preferably, the color matching module is also configured to use the color-based template matching to determine the matching region of the image to be matched that is most similar to the target image, and the auxiliary feature matching module is configured to extract auxiliary features from that matching region and match them against the auxiliary features of the target image to determine the auxiliary feature similarity.
In the following embodiments texture features are used as the auxiliary features for description; of course, those of ordinary skill in the art may also adopt shape features (such as edge features), or both texture features and shape features, as the auxiliary features, all of which fall within the scope of the present invention.
As shown in Figures 1 and 2, the color histogram and other color features of the image to be matched (the original image) are obtained by color layering and used for first-level coarse matching. The most similar target region obtained by the coarse matching serves as the target region for the next-level fine matching, so most regions whose color features differ greatly from the target are excluded; at the same time, because the condition is not strict, missed targets do not cause cumulative errors. Fine matching with auxiliary features (texture features in this example) is performed on the basis of the coarse matching: the fine matching is a further confirmation, not a direct negation of the coarse matching. On the basis of the coarse and fine matching, a comprehensive judgment is made from the similarities obtained by each, yielding the comprehensive confidence between the image to be matched and the target image. The specific steps are as follows:
1. Color space conversion and color layering;
2. First-level coarse matching (using the color-based template matching technique): on the basis of the color layering, coarse matching is performed with color histograms and other features to determine the probable region (matching region) most similar to the intended target; if the color similarity exceeds the set color similarity threshold, proceed to the next step; otherwise, end the matching of the current image to be matched.
3. Second-level fine matching: texture features are extracted from the matching region of the image to be matched and matched against the texture features of the target image to determine texture feature similarity;
4. Third-level comprehensive judgment: the coarse and fine matching results are combined according to certain rules to give the comprehensive confidence.
The flowcharts of the method are shown in Figures 1 and 2. In the specific embodiment shown in Figure 1, the processing flow of the matching module comprises the following steps:
101. Obtain the original image (i.e., the image to be matched);
102. Perform color space conversion according to the input parameters;
103. Perform color layering;
104. Perform color matching (i.e., coarse matching) with the target image;
105. Determine the most similar matching region in the image to be matched;
106. Extract texture features from the matching region;
107. Perform texture feature matching (i.e., fine matching) with the target image;
108. Perform the fusion judgment of color and texture features;
109. Decide whether the confidence requirement is met; if so, output the comprehensive confidence, otherwise exit.
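Steps 101 to 109 can be sketched in code as follows. The similarity functions are supplied by the caller, and the function names, thresholds and fusion weights here are illustrative assumptions, not values fixed by the present embodiment:

```python
def cascade_match(regions, color_sim, texture_sim,
                  color_thresh=0.6, texture_thresh=0.7):
    """Coarse color match over all candidate regions (steps 104-105),
    then a fine texture match on the best region only (step 107);
    returns (region_index, comprehensive_confidence) or None."""
    # Steps 104-105: score every region by color similarity, keep the best
    scores = [color_sim(r) for r in regions]
    best = max(range(len(regions)), key=lambda i: scores[i])
    if scores[best] <= color_thresh:
        return None                 # step 109: confidence requirement not met
    # Step 107: fine matching on the winning region only
    t = texture_sim(regions[best])
    # Step 108: fusion - texture dominates when it is reliable, otherwise
    # color similarity gets a larger share (weights are illustrative)
    conf = t if t > texture_thresh else 0.6 * scores[best] + 0.4 * t
    return best, conf
```

The cascade evaluates the cheap color feature everywhere and the expensive texture feature only once, which is where the speed gain of the coarse-to-fine design comes from.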
In the specific embodiment shown in Figure 2, the processing flow of the coarse matching comprises the following steps:
201. Input the layered image;
202. Divide the image according to the input parameters and the size of the target image;
203. Take out one image region;
204. Compute the HSV (hue, saturation, value) color histogram;
205. Perform HSV histogram matching;
206. Sort and save the similarities;
207. Determine whether this is the last image region;
208. If so, take out the most similar matching region; otherwise return to step 203 and repeat the matching for the next image region.
Several steps of the present embodiment are described in detail below in turn:
First step: color space conversion and color layering
1) Color space conversion
Since the color layering is performed in the HSV (hue, saturation, value) color space, the image is first converted from the RGB (red, green, blue) color space to the HSV color space.
2) Color layering
Color layering maps the color space into a certain subset, thereby improving image matching speed. A typical image color system has nearly 2^24 colors, while the colors the human eye can really distinguish are limited; therefore, when performing image processing, the color space needs to be layered. The dimension of the layering is very important: the higher the layering dimension, the higher the matching precision, but the matching speed declines accordingly.
Color layering is divided into equal-interval and non-equal-interval color layering. If the dimension of equal-interval layering is too low, precision declines greatly; if it is too high, the computation becomes complex. Through analysis and experiment, the present embodiment selects non-equal-interval color layering, with the following steps:
According to human perception, hue H is divided into 8 parts and saturation S and value V into 3 parts each, and the color space is quantized in layers according to the color space and the subjective human perception of color; the quantized components are combined into a single index, for example L = 9H + 3S + V (H ∈ {0, …, 7}; S, V ∈ {0, 1, 2}). According to the above method, the color space is divided into 72 colors.
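The non-equal-interval layering above can be sketched as follows. The embodiment fixes only the 8/3/3 split; the exact bin boundaries below are a common choice assumed for illustration:

```python
def quantize_hsv(h, s, v):
    """Map an HSV pixel (h in degrees [0, 360), s and v in [0, 1]) to one
    of 72 colors: 8 hue bins x 3 saturation bins x 3 value bins.
    The boundary values are an assumption, not taken from the text."""
    if h >= 316 or h < 20:   H = 0
    elif h < 40:   H = 1
    elif h < 75:   H = 2
    elif h < 155:  H = 3
    elif h < 190:  H = 4
    elif h < 270:  H = 5
    elif h < 295:  H = 6
    else:          H = 7
    S = 0 if s < 0.2 else (1 if s < 0.7 else 2)
    V = 0 if v < 0.2 else (1 if v < 0.7 else 2)
    return 9 * H + 3 * S + V   # single index in 0..71
```

Counting a quantized index per pixel directly yields the 72-bin histogram used by the coarse matching below.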
Second step: first-level coarse matching: on the basis of the color layering, coarse matching is performed with color histograms to determine the probable region most similar to the target
1) Image region division
For the matching of an intended target, in order to better match the target in the image to be matched against the sample target, the first level divides the image to be matched, block by block, into regions of roughly the same size as the sample target. The horizontal and vertical movement steps can be set according to the required matching precision: for higher precision, set the step smaller; for higher speed, set it larger.
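The block-by-block division with configurable horizontal and vertical steps can be sketched as follows (the function and parameter names are illustrative):

```python
def sliding_regions(img_w, img_h, tmpl_w, tmpl_h, step_x, step_y):
    """Enumerate top-left corners of candidate regions of the sample
    target's size (tmpl_w x tmpl_h), moved across the image by the
    given horizontal and vertical steps."""
    return [(x, y)
            for y in range(0, img_h - tmpl_h + 1, step_y)
            for x in range(0, img_w - tmpl_w + 1, step_x)]
```

Smaller steps produce more candidate regions (higher precision, lower speed), larger steps fewer, matching the trade-off described above.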
2) Color template and feature matching
For each divided image region, according to the color regions obtained by segmentation, the similarity between the sample color region and the region to be matched is calculated; the absolute-value distance method is adopted here.
Let the two color regions be I and Q, and let the image be divided by the concentric-rectangle division method into n concentric rectangles. Based on the 72-bin HSV histograms obtained by the layering above, the distance D_i of the corresponding parts is:
D_i = Σ_{j=1}^{72} |h_I(j) − h_Q(j)|
where h_I(j) and h_Q(j) are the values of color regions I and Q in the j-th histogram bin. The results are sorted, and the most similar region is selected as the matching region.
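The absolute-value distance over the 72-bin histograms can be sketched as:

```python
def histogram_distance(h_i, h_q):
    """Absolute-value (L1) distance between two 72-bin HSV histograms,
    D = sum over j of |h_I(j) - h_Q(j)|; smaller means more similar."""
    assert len(h_i) == len(h_q) == 72
    return sum(abs(a - b) for a, b in zip(h_i, h_q))
```

With the concentric-rectangle division, this distance is computed per rectangle (D_i) and the ranking over all candidate regions is kept.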
Third step: fine matching: auxiliary features (texture features in this example) are extracted for exact matching
The texture features may comprise one or more of the following: texture features of the gray-level co-occurrence matrix, and rotation- and scale-invariant texture features (such as SIFT features).
1) Texture features of the gray-level co-occurrence matrix
First the color image is converted to a grayscale image. For an image with N gray levels, the co-occurrence matrix is an N × N matrix, namely M = [m_hk], where the value of the element m_hk at position (h, k) is the number of occurrences of pixel pairs in which one pixel has gray level h and the other, at a given displacement from it, has gray level k.
The four feature quantities extracted from the texture co-occurrence matrix are:
Contrast: CON = Σ_h Σ_k (h − k)² · m_hk
Energy: ASM = Σ_h Σ_k m_hk²
Entropy: ENT = −Σ_h Σ_k m_hk · log m_hk
Correlation: COR = (Σ_h Σ_k h · k · m_hk − μ_x μ_y) / (σ_x σ_y)
where m_x(h) = Σ_k m_hk is the sum of the elements of each row of the matrix M, m_y(k) = Σ_h m_hk is the sum of the elements of each column, and μ_x, μ_y, σ_x, σ_y are the means and standard deviations of m_x and m_y respectively.
The concrete steps in the present embodiment are as follows:
a. Divide the gray levels of the image into 64 gray scales;
b. Construct gray-level co-occurrence matrices in four directions: M(1,0), M(0,1), M(1,1), M(1,−1);
c. Compute the four texture feature quantities on each co-occurrence matrix respectively, and use the mean and standard deviation of each feature quantity (μ_CON, σ_CON, μ_ASM, σ_ASM, μ_ENT, σ_ENT, μ_COR, σ_COR) as the eight components of the texture feature.
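The construction of a co-occurrence matrix for one direction and the computation of the four feature quantities can be sketched as follows (pure Python for clarity; a real implementation would use an array library):

```python
import math

def glcm(gray, dh, dk, levels=64):
    """Gray-level co-occurrence matrix for displacement (dh, dk) on a
    2-D list of gray levels in [0, levels), normalized to sum to 1."""
    rows, cols = len(gray), len(gray[0])
    m = [[0.0] * levels for _ in range(levels)]
    total = 0
    for y in range(rows):
        for x in range(cols):
            y2, x2 = y + dk, x + dh
            if 0 <= y2 < rows and 0 <= x2 < cols:
                m[gray[y][x]][gray[y2][x2]] += 1
                total += 1
    for h in range(levels):
        for k in range(levels):
            m[h][k] /= total
    return m

def glcm_features(m):
    """Contrast, energy (ASM), entropy and correlation of a normalized GLCM."""
    n = len(m)
    con = sum((h - k) ** 2 * m[h][k] for h in range(n) for k in range(n))
    asm = sum(m[h][k] ** 2 for h in range(n) for k in range(n))
    ent = -sum(m[h][k] * math.log(m[h][k]) for h in range(n)
               for k in range(n) if m[h][k] > 0)
    mx = [sum(m[h][k] for k in range(n)) for h in range(n)]  # row sums
    my = [sum(m[h][k] for h in range(n)) for k in range(n)]  # column sums
    ux = sum(h * mx[h] for h in range(n))
    uy = sum(k * my[k] for k in range(n))
    sx = math.sqrt(sum((h - ux) ** 2 * mx[h] for h in range(n)))
    sy = math.sqrt(sum((k - uy) ** 2 * my[k] for k in range(n)))
    cor = (sum(h * k * m[h][k] for h in range(n) for k in range(n))
           - ux * uy) / (sx * sy) if sx > 0 and sy > 0 else 0.0
    return con, asm, ent, cor
```

Running this over the four directions M(1,0), M(0,1), M(1,1), M(1,−1) and taking the mean and standard deviation of each quantity yields the eight-component texture feature described above.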
2) SIFT (scale-invariant feature transform) features
The SIFT algorithm is an algorithm for extracting local features: it finds extreme points in scale space and extracts invariants of position, scale and rotation.
Its main detection steps are as follows:
a) Detect scale-space extreme points;
b) Locate the extreme points precisely;
c) Assign a direction parameter to each keypoint;
d) Generate the keypoint descriptors.
● Generation of the scale space
The purpose of scale-space theory is to simulate the multi-scale features of image data, and the Gaussian convolution kernel is the only linear kernel that realizes scale change, so the scale space of a two-dimensional image is defined as:
L(x,y,σ) = G(x,y,σ) * I(x,y) (12)
where G(x,y,σ) is a variable-scale Gaussian function,
G(x,y,σ) = (1/(2πσ²)) · e^(−(x²+y²)/(2σ²)) (13)
(x, y) are the spatial coordinates, and σ is the scale coordinate.
To detect stable keypoints effectively in scale space, the difference-of-Gaussian scale space (DoG scale space) is proposed, generated by convolving difference-of-Gaussian kernels of different scales with the image:
D(x,y,σ) = (G(x,y,kσ) − G(x,y,σ)) * I(x,y) = L(x,y,kσ) − L(x,y,σ) (14)
Structure of the image pyramid: the pyramid has O octaves in all, each octave has S layers, the image of the next octave is obtained by downsampling the image of the previous octave, and O and S are set by the user.
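A difference-of-Gaussians along the lines of equation (14) can be sketched in one dimension as follows; the 1-D reduction and the clamped-edge convolution are simplifications for illustration:

```python
import math

def gaussian_kernel(sigma, radius=None):
    """Sampled 1-D Gaussian, normalized to sum to 1."""
    if radius is None:
        radius = max(1, int(3 * sigma))
    k = [math.exp(-(i * i) / (2 * sigma * sigma))
         for i in range(-radius, radius + 1)]
    s = sum(k)
    return [v / s for v in k]

def smooth(signal, sigma):
    """Convolve a 1-D signal with a Gaussian (edges clamped)."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    n = len(signal)
    return [sum(k[j + r] * signal[min(max(i + j, 0), n - 1)]
                for j in range(-r, r + 1)) for i in range(n)]

def dog(signal, sigma, k=2 ** (1 / 3)):
    """Difference of Gaussians D = L(k*sigma) - L(sigma), as in eq. (14)."""
    lo, hi = smooth(signal, sigma), smooth(signal, k * sigma)
    return [a - b for a, b in zip(hi, lo)]
```

A flat signal gives a near-zero DoG response while an intensity step produces a response localized at the edge, which is why DoG extrema mark candidate keypoints.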
● Spatial extreme point detection
To find the extreme points of the scale space, each sample point is compared with all of its neighbors to see whether it is larger or smaller than its neighbors in the image domain and the scale domain. The middle detection point is compared with its 8 neighbors at the same scale and the 9 × 2 corresponding points at the adjacent scales above and below, 26 points in all, to ensure that extreme points are detected in both scale space and the two-dimensional image space.
● Parameters to be determined when building the scale space
σ - the scale-space coordinate
O - the octave coordinate
S - the sub-level coordinate
The relation of σ to o and s is:
σ(o, s) = σ_0 · 2^(o + s/S), o ∈ o_min + [0, …, O−1], s ∈ [0, …, S−1]
where σ_0 is the base scale level.
The spatial coordinate x is a function of the octave o: if x_0 is the spatial coordinate of octave 0, then x = 2^o · x_0, o ∈ Z, x_0 ∈ [0, …, N_0 − 1] × [0, …, M_0 − 1].
If (M_0, N_0) is the resolution of the base octave o = 0, the resolutions of the other octaves are obtained by halving per octave: N_o = N_0 / 2^o, M_o = M_0 / 2^o.
The following parameters are generally used:
σ_n = 0.5, σ_0 = 1.6 · 2^(1/S), o_min = −1, S = 3
At octave o = −1, the image is expanded to twice its size by bilinear interpolation (for the expanded image, σ_n = 1).
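The scale schedule σ(o, s) = σ_0 · 2^(o + s/S) and the per-octave resolutions can be sketched as follows; the rounding in the resolution formula is an assumption:

```python
def sigma_of(o, s, sigma0=1.6 * 2 ** (1 / 3), S=3):
    """Scale of layer s in octave o: sigma(o, s) = sigma0 * 2**(o + s/S)."""
    return sigma0 * 2 ** (o + s / S)

def octave_resolution(n0, m0, o):
    """Resolution of octave o from the base resolution (N0, M0):
    halved per octave, doubled for the expanded octave o = -1."""
    return int(n0 * 2 ** -o), int(m0 * 2 ** -o)
```

Note that moving up one full octave (o + 1, or s + S layers) exactly doubles the scale, which is what makes per-octave downsampling consistent.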
● Precise determination of extreme point positions
The position and scale of each keypoint are determined precisely (to sub-pixel precision) by fitting a three-dimensional quadratic function, and low-contrast keypoints and unstable edge-response points are removed (the DoG operator produces strong edge responses), so as to enhance matching stability and improve noise resistance.
Removal of edge responses
A poorly defined difference-of-Gaussian extreme point has a large principal curvature across the edge and a small principal curvature in the direction perpendicular to the edge. The principal curvatures are obtained from a 2 × 2 Hessian matrix H:
H = [ D_xx  D_xy ; D_xy  D_yy ]
The derivatives are estimated from differences of adjacent sample points.
The principal curvatures of D are proportional to the eigenvalues of H. Let α be the largest eigenvalue and β the smallest; then
Tr(H) = D_xx + D_yy = α + β (17)
Det(H) = D_xx · D_yy − (D_xy)² = αβ (18)
Let α = rβ; then
Tr(H)² / Det(H) = (α + β)² / (αβ) = (r + 1)² / r
The value of (r + 1)²/r is smallest when the two eigenvalues are equal and increases with r. Therefore, to check whether the ratio of principal curvatures is below a certain threshold r, it is only necessary to check whether
Tr(H)² / Det(H) < (r + 1)² / r
Generally r = 10 is taken.
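The curvature-ratio check of equations (17) and (18) can be sketched as:

```python
def is_edge_like(dxx, dyy, dxy, r=10.0):
    """Return True when a keypoint should be rejected as an edge
    response: keep only Tr(H)^2 / Det(H) < (r + 1)^2 / r."""
    tr = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:            # curvatures of opposite sign: always discard
        return True
    return tr * tr / det >= (r + 1) ** 2 / r
```

With equal curvatures the ratio is at its minimum of 4, well under (10+1)²/10 = 12.1, so blob-like points survive while elongated edge responses are discarded.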
● Keypoint direction assignment
A direction parameter is assigned to each keypoint using the gradient direction distribution of the pixels in its neighborhood, so that the operator is rotation invariant:
m(x, y) = √[(L(x+1, y) − L(x−1, y))² + (L(x, y+1) − L(x, y−1))²]
θ(x, y) = arctan[(L(x, y+1) − L(x, y−1)) / (L(x+1, y) − L(x−1, y))]
The above two formulas give the magnitude and direction of the gradient at (x, y), where the scale used for L is the scale at which each keypoint lies.
In the actual computation, we sample in a neighborhood window centered on the keypoint and count the gradient directions of the neighborhood pixels with a histogram. The gradient histogram ranges from 0 to 360 degrees, with one bin per 10 degrees, 36 bins in all. The histogram peak represents the main direction of the neighborhood gradients at the keypoint and is taken as the direction of the keypoint.
When another peak in the gradient orientation histogram has at least 80% of the energy of the main peak, that direction is regarded as an auxiliary direction of the keypoint. A keypoint may thus be assigned multiple directions (one main direction and more than one auxiliary direction), which enhances the robustness of matching.
At this point the keypoints of the image have been fully detected, and each keypoint has three pieces of information: position, scale and direction. A SIFT feature region can thus be determined.
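The 36-bin orientation histogram with the 80% auxiliary-peak rule can be sketched as follows; the input is a list of (magnitude, angle-in-degrees) pairs, and the simple nearest-bin accumulation omits the interpolation a full implementation would use:

```python
def keypoint_directions(gradients):
    """Accumulate gradient magnitudes into 36 ten-degree bins and return
    the directions (in degrees) of the main peak and of every auxiliary
    peak reaching 80% of the main peak's energy."""
    hist = [0.0] * 36
    for mag, ang in gradients:
        hist[int(ang % 360) // 10] += mag
    peak = max(hist)
    return [b * 10 for b, v in enumerate(hist) if v > 0 and v >= 0.8 * peak]
```

A keypoint whose histogram has two strong peaks is duplicated once per returned direction, which is the source of the matching robustness noted above.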
● unique point descriptor generates
First be the direction of key point by X-axis rotate, to guarantee rotational invariance.
Next centered by key point, get the window of 8 × 8, central authorities' stain is the position of current key point, each little lattice represent a pixel of key point neighborhood place metric space, then on the fritter of every 4 × 4, calculate the gradient orientation histogram in 8 directions, draw the accumulated value of each gradient direction, a Seed Points can be formed.
In actual computation process, in order to strengthen the robustness of coupling, to each key point use 4 × 4 totally 16 Seed Points describe, just can produce 128 data for a key point like this, the SIFT feature namely finally forming 128 dimensions are vectorial.Now SIFT feature vector has eliminated the impact of the geometry deformation such as dimensional variation, rotation factor, then continues the length normalization method of proper vector, then can remove the impact of illumination variation further.
After the SIFT feature vectors of the two images have been generated, the Euclidean distance between key-point feature vectors is adopted in the next step as the similarity measure for key points in the two images. For a given key point in the sample image, the two key points in the matching image with the smallest Euclidean distances are found; among these two key points, if the nearest distance divided by the second-nearest distance is less than a given ratio threshold, the pair of points is accepted as a match. Lowering this ratio threshold reduces the number of SIFT matches, but the matches are more stable.
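The nearest-neighbor ratio test described above can be sketched as follows (a brute-force illustration under the stated Euclidean-distance criterion; the function name and the default ratio value are assumptions):

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.8):
    """For each descriptor in desc_a, find the two nearest descriptors in
    desc_b by Euclidean distance and accept the pair only if
    nearest < ratio * second_nearest (the ratio test)."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)
        j1, j2 = np.argsort(dists)[:2]           # nearest and second nearest
        if dists[j1] < ratio * dists[j2]:
            matches.append((i, int(j1)))
    return matches
```

A production system would typically replace the brute-force search with a k-d tree or approximate nearest-neighbor index, but the acceptance rule is the same.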
3) Comprehensive features
Image matching with any single feature has its own advantages; to improve the accuracy of matching, the present invention combines the color feature with the texture feature to construct a composite feature for image matching.
Because the physical meanings of the color and texture features are not identical and the two are not directly comparable, the features must first be normalized; the combined distance is computed as follows:
D = w1·d1 + w2·d2 (23)
wherein d1 and d2 respectively represent the distance between the color feature quantities and the distance between the texture feature quantities of the two images, and w1 and w2 are the weights of the feature quantities (0 ≤ w1 ≤ 1, and w1 + w2 = 1).
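Formula (23) can be sketched as follows, assuming the two distances have already been normalized to a common scale (the function name and the default weight are illustrative assumptions):

```python
def fused_distance(d_color, d_texture, w_color=0.5):
    """Combine the normalized color distance d1 and texture distance d2
    into D = w1*d1 + w2*d2 with w1 + w2 = 1, as in formula (23)."""
    if not 0.0 <= w_color <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return w_color * d_color + (1.0 - w_color) * d_texture
```

With w_color = 1 the combined distance degenerates to the pure color distance, and with w_color = 0 to the pure texture distance.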
The fourth step, synthetic determination: the coarse matching and the fine matching are combined according to certain rules into a synthetic determination, yielding a comprehensive degree of confidence.
The rules of the synthetic determination are as follows:
a) If both the coarse matching and the fine matching have high similarity, the target is certainly a similar target, and the comprehensive confidence is computed mainly on the basis of the texture feature (e.g. SIFT feature) similarity;
b) If the similarity of the coarse matching is moderate but the similarity of the fine matching is high, then, owing to the stability of the SIFT features, the comprehensive confidence is still computed mainly on the basis of the texture feature similarity, but the proportion of the color similarity is increased;
c) If the similarity of the coarse matching is high but the similarity of the fine matching is moderate, the proportions of the two similarities in the comprehensive confidence are close to each other and are determined according to the actual conditions;
d) If both similarities are very low, the target is not considered a similar target.
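Rules a) through d) can be sketched as a decision function. The concrete thresholds and weight values below are assumptions chosen for illustration; the text leaves them to be determined according to actual conditions:

```python
def comprehensive_confidence(color_sim, texture_sim, high=0.8, low=0.3):
    """Illustrative sketch of synthetic-determination rules a)-d).
    color_sim is the coarse (color) similarity, texture_sim the fine
    (texture, e.g. SIFT) similarity. Returns (is_similar, confidence).
    Thresholds `high`/`low` and the weights are assumed values."""
    if texture_sim >= high and color_sim >= high:
        # a) both high: decide mainly by the texture similarity
        return True, 0.8 * texture_sim + 0.2 * color_sim
    if texture_sim >= high:
        # b) fine matching high, coarse moderate: still texture-based,
        #    but the proportion of the color similarity is increased
        return True, 0.6 * texture_sim + 0.4 * color_sim
    if color_sim >= high and texture_sim > low:
        # c) coarse high, fine moderate: roughly equal proportions
        return True, 0.5 * texture_sim + 0.5 * color_sim
    if color_sim <= low and texture_sim <= low:
        # d) both very low: not a similar target
        return False, 0.0
    # intermediate cases: equal weighting, not asserted similar
    return False, 0.5 * texture_sim + 0.5 * color_sim
```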
In the present embodiment, the method is realized mainly through a cascaded matching scheme, progressing step by step from coarse to fine, and comprises three parts: first-level coarse matching, second-level fine matching, and third-level synthetic determination. The coarse matching determines, by the method of template matching, the approximate region that may contain the target according to the color histogram after color quantization; the fine matching mainly makes a further determination, on the basis of the coarse matching, by extracting texture features; the synthetic determination derives a comprehensive similarity (or comprehensive confidence) from the independent similarities obtained by the color coarse matching and the texture fine matching.
Compared with the prior art, the method of the invention overcomes several shortcomings of existing multi-feature extraction and fusion methods: by means of the cascade it makes full use of the advantages of each feature, matches step by step from coarse to fine, and can accurately determine the similar regions in the target image and the image to be matched rather than directly comparing the two images as wholes, thereby achieving fast, efficient, and accurate matching while saving manpower and material resources.
The present invention can be applied not only to the retrieval of still images but also to video retrieval. Those of ordinary skill in the art should appreciate that, when the invention is applied to video retrieval, the most similar matching region in the image to be matched may be obtained not only by the color-based template matching method described above but also by other techniques, for example by obtaining a moving target region through background subtraction.
The preferred embodiments of the present invention have been described above with reference to the accompanying drawings; those skilled in the art may realize the present invention in many variant forms without departing from the scope and spirit of the present invention. For example, a feature illustrated or described as part of one embodiment can be used in another embodiment to yield a further embodiment. The foregoing are merely preferred feasible embodiments of the present invention and do not thereby limit the scope of the rights of the present invention; all equivalent changes made using the contents of the description and drawings of the present invention are encompassed within the scope of the rights of the present invention.