Summary of the Invention
Existing pedestrian detection methods for still images suffer from a low detection rate and a high false-alarm rate, and cannot retrieve pedestrians by their combined features. To solve this technical problem, the present invention provides a pedestrian retrieval method based on multiple models and fuzzy color that extracts the foreground from still images efficiently, achieves a high detection rate and a low false-alarm rate, and supports retrieval by combined features.
To achieve the above technical purpose, the technical solution of the present invention is as follows.
A pedestrian retrieval method based on multiple models and fuzzy color comprises the following steps:
Step 1: Take the pedestrian detection results as the objects to be retrieved and extract their foreground. First, the upper body of the human body is taken as the retrieval-sensitive region, which delimits the foreground range and yields a preliminary foreground; Canny edges are then computed on the preliminary foreground to obtain the final foreground;
Step 2: Using the final foreground obtained in Step 1 together with the position information given by the bounding boxes in the input pedestrian detection results, compute the improved CEDD feature and the fuzzy color feature of each pedestrian, and store the two features in the improved CEDD feature library and the fuzzy color feature library, respectively. Here the position information denotes the position of the pedestrian within one frame;
Step 3: Retrieve the objects stored in the feature database according to the given search feature. If a pedestrian image and its foreground feature are given, compute the improved CEDD feature of the given pedestrian, compare it with every record in the improved CEDD feature library to obtain a feature distance, and sort the records by feature distance to obtain the retrieval result. If a color feature is given, compute the fuzzy color feature of the given color, compare it with every record in the fuzzy color feature library to obtain a feature distance, and sort the records by feature distance to obtain the retrieval result.
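The ranking in Step 3 can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `rank_records` and `l1_distance` are hypothetical, and the L1 metric is only a placeholder for whichever feature distance is actually used.

```python
from typing import Callable, List, Sequence, Tuple

Feature = Sequence[float]


def rank_records(query: Feature,
                 records: List[Tuple[str, Feature]],
                 distance: Callable[[Feature, Feature], float]
                 ) -> List[Tuple[str, float]]:
    """Return (record_id, distance) pairs, most similar record first."""
    scored = [(rec_id, distance(query, feat)) for rec_id, feat in records]
    # A smaller feature distance means a more similar record: sort ascending.
    return sorted(scored, key=lambda pair: pair[1])


def l1_distance(a: Feature, b: Feature) -> float:
    """Placeholder metric for the demo only (an assumption, not from the text)."""
    return sum(abs(x - y) for x, y in zip(a, b))


library = [("p1", [0.0, 5.0]), ("p2", [1.0, 1.0]), ("p3", [4.0, 0.0])]
ranking = rank_records([1.0, 0.0], library, l1_distance)
```

Passing the distance as a parameter lets the same ranking routine serve both the improved CEDD library and the fuzzy color library.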
In the described pedestrian retrieval method based on multiple models and fuzzy color, the pedestrian detection results input in Step 1 are obtained by detection with a DPM model.
In the described pedestrian retrieval method based on multiple models and fuzzy color, the step of obtaining the final foreground in Step 1 includes:
First, the part boxes in the pedestrian detection result are sorted from top to bottom by their vertical coordinate in the image, and a preset number of parts are selected in order as the upper body. Each part box forming the upper body is then converted into a foreground mask, and the gaps between the parts are filled to form the DPM preliminary foreground;
Canny edges are computed on the DPM preliminary foreground to obtain an edge map, and the DPM preliminary foreground region of the edge map is then scanned row by row. In each row, the search proceeds from the left and right boundary points of the preliminary foreground toward the middle of the body for pixels that form Canny edges; if an edge pixel is found within a certain neighborhood, it becomes the new left or right boundary point of that row. When all rows have been processed, the DPM final foreground is obtained.
In the described pedestrian retrieval method based on multiple models and fuzzy color, the pedestrian detection results input in Step 1 are obtained by detection with an ICF model.
In the described pedestrian retrieval method based on multiple models and fuzzy color, the step of obtaining the final foreground in Step 1 includes:
First, the upper-body range is delimited within the pedestrian box of the detection result. A horizontal sliding window is then taken whose height equals the upper-body range and whose width is a preset number of pixels, and it is moved from one end of the width direction to the other within the delimited upper-body range. A vertical sliding window is likewise taken whose width equals the upper-body range and whose height is a preset number of pixels, and it is moved from one end of the height direction to the other within the delimited upper-body range;
While the horizontal sliding window moves, the number of edge pixels inside it is counted, forming a two-dimensional curve with the image x coordinate as the x value and the number of edge points at each x coordinate as the y value. The highest peak is then found in the left half and in the right half of the image, and the two peaks serve as the left and right boundaries. Likewise, the number of edge pixels in the vertical sliding window is counted, forming a two-dimensional curve with the image y coordinate as the x value and the number of edge points at each y coordinate as the y value. The highest peak is then found in the top half and in the bottom half of the image, and the two peaks serve as the top and bottom boundaries. This yields the ICF preliminary foreground;
Canny edges are computed on the ICF preliminary foreground to obtain an edge map, and the ICF preliminary foreground region of the edge map is then scanned row by row. In each row, the search proceeds from the left and right boundary points of the preliminary foreground toward the middle of the body for pixels that form Canny edges; if an edge pixel is found within a certain neighborhood, it becomes the new left or right boundary point of that row. When all rows have been processed, the ICF final foreground is obtained.
In the described pedestrian retrieval method based on multiple models and fuzzy color, the step of computing the improved CEDD feature of a pedestrian in Step 2 and Step 3 includes:
First, the pedestrian detection box and the final foreground are input, and the pedestrian detection box is divided evenly into 64 cells. Each cell is checked: if all pixels in the cell lie in the final foreground, the cell is marked as a valid cell and its CEDD feature is computed; otherwise no computation is performed. Finally, the CEDD features of the valid cells are accumulated to give the improved CEDD feature of the foreground region.
In the described pedestrian retrieval method based on multiple models and fuzzy color, the step of computing the fuzzy color feature of a pedestrian in Step 2 and Step 3 includes:
First, the pedestrian detection box and the final foreground are input. A fuzzy color algorithm is then used to compute a histogram of 10 fuzzy color components over the foreground region, and the mean brightness of the foreground region is computed; together these form an 11-dimensional fuzzy color feature vector. The mean brightness is computed by converting the RGB color values to HSV color values and averaging the V values over the foreground region. The fuzzy color algorithm referred to here is the first step of the CEDD algorithm, and therefore also a step in computing the improved CEDD feature; it is a known method.
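The assembly of the 11-dimensional vector can be sketched as follows. This is an illustration under stated assumptions: the 10-bin fuzzy color histogram is taken as an input because it comes from the known CEDD fuzzy color step, V is scaled to the range 0 to 1, and the function names are hypothetical.

```python
import colorsys


def mean_brightness(pixels, mask):
    """Average HSV V value over the foreground.

    pixels: rows of (r, g, b) tuples with channels in 0..255.
    mask:   rows of 0/1 flags of the same size, 1 marking foreground.
    """
    total, count = 0.0, 0
    for pixel_row, mask_row in zip(pixels, mask):
        for (r, g, b), is_foreground in zip(pixel_row, mask_row):
            if is_foreground:
                # colorsys works on floats in 0..1; V is the third component.
                _, _, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
                total += v
                count += 1
    return total / count if count else 0.0


def fuzzy_color_feature(histogram10, pixels, mask):
    """Append the mean brightness to a precomputed 10-bin fuzzy histogram."""
    if len(histogram10) != 10:
        raise ValueError("expected 10 fuzzy color components")
    return list(histogram10) + [mean_brightness(pixels, mask)]


pixels = [[(255, 0, 0), (0, 0, 0)],
          [(0, 0, 0), (0, 0, 255)]]
mask = [[1, 0],
        [1, 0]]
feature = fuzzy_color_feature([0.0] * 10, pixels, mask)
```

In the demo only the left column is foreground, so the brightness is the mean of one fully bright red pixel and one black pixel.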
In the described pedestrian retrieval method based on multiple models and fuzzy color, in Step 3, the step of obtaining the feature distance between the improved CEDD feature computed for the given pedestrian and foreground and each record in the improved CEDD feature library includes:
The improved CEDD feature computed for the given pedestrian and foreground is compared with one improved CEDD feature stored in the feature library in Step 2. A feature distance distance1 is first computed with the Tanimoto method; if it has not reached the preset maximum value M, distance1 is output as the final result distance. M may be any positive number and is typically 100;
If distance1 reaches the maximum value M, distance2 is computed, and distance = distance2 + M is output as the feature distance.
In the described pedestrian retrieval method based on multiple models and fuzzy color, distance2 is computed as follows:
For each of the input features t1 and t2, the elements are sorted by value in descending order, and the dimension indices of the top 3 element values are taken to form a 3-dimensional vector; this gives two 3-dimensional vectors. The absolute differences of the corresponding elements of the two vectors are computed, weighted according to sort rank, and summed; the sum is distance2.
In the described pedestrian retrieval method based on multiple models and fuzzy color, in Step 3, the step of obtaining the feature distance between the given color feature and each record in the fuzzy color feature library includes:
First, the fuzzy color feature of the given color is computed from the given color;
The fuzzy color feature is then compared one by one with each record in the fuzzy color feature library. A fuzzy color feature is treated as a feature vector whose dimension is the number of colors; each element represents one color, and the element value is the color value. The k color elements whose color values exceed a preset threshold and are the largest are selected, where k is greater than 0 and no more than 4, and the base distance d1 is initialized to k. These k color elements are then checked one by one: using the dimension index of each color element, the corresponding color element value is read from one record of the fuzzy color feature library, and if that value also exceeds the preset threshold, the distance value is decreased by 1. After all k color components have been checked, a distance value h is obtained, and the base distance is d1 = h/k. One record here means one feature vector;
If the given color is black, gray, or white, the difference between the brightness components of the 2 fuzzy color features is computed, where the brightness is the mean V value of the HSV representation, and the final distance is distance = d1 + brightness difference. Otherwise, the sum of the absolute differences of the color values of the top 3 main color components of the fuzzy histogram is computed and added to the base distance, so the final distance is d1 + (difference of the 1st main component color value of the histogram) + (difference of the 2nd main component color value of the histogram) + (difference of the 3rd main component color value of the histogram).
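The fuzzy color distance described above can be sketched as follows. Several points are assumptions rather than statements of the source: features are 11-element lists (10 color bins followed by mean brightness), the "top 3 main components" are taken from the query and compared at the same indices in the record, and the name `fuzzy_color_distance` and its parameters are hypothetical.

```python
def fuzzy_color_distance(query, record, threshold, achromatic,
                         max_k=4, top_n=3):
    """Distance between two 11-dim fuzzy color features.

    query, record: 10 color bins followed by the mean brightness.
    achromatic:    True when the given color is black, gray, or white.
    """
    q_colors, r_colors = query[:10], record[:10]

    # The k strongest query bins above the threshold (0 < k <= max_k).
    strong = sorted((i for i, v in enumerate(q_colors) if v > threshold),
                    key=lambda i: q_colors[i], reverse=True)[:max_k]
    k = len(strong)
    if k == 0:
        raise ValueError("the query needs at least one bin above the threshold")

    # Start at k and subtract 1 for every bin the record also has.
    h = k
    for i in strong:
        if r_colors[i] > threshold:
            h -= 1
    d1 = h / k

    if achromatic:
        return d1 + abs(query[10] - record[10])

    # Assumption: main components are the query's largest bins.
    main = sorted(range(10), key=lambda i: q_colors[i], reverse=True)[:top_n]
    return d1 + sum(abs(q_colors[i] - r_colors[i]) for i in main)


query = [0.6, 0.3, 0.1, 0, 0, 0, 0, 0, 0, 0, 0.5]
record = [0.5, 0.0, 0.4, 0, 0, 0, 0, 0, 0, 0, 0.7]
chromatic_distance = fuzzy_color_distance(query, record, 0.2, achromatic=False)
achromatic_distance = fuzzy_color_distance(query, record, 0.2, achromatic=True)
```

In the demo the query has two bins above the threshold and the record matches one of them, so d1 = 1/2 in both calls.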
The technical effects of the invention are: (1) it can be used on still images; (2) the foreground analysis algorithm is efficient and accurate and adapts to various pedestrian detection models; (3) using the "generalized upper body" as the retrieval-sensitive region retains the color and texture features of most of the body while excluding body parts that contribute little to retrieval and are hard to analyze; (4) a single unified feature represents texture and color together, so both feature computation and feature-distance computation are efficient; (5) the overall retrieval accuracy is higher.
Detailed Description of the Embodiments
The abbreviations used in the present invention are:
HOG: Histograms of Oriented Gradients;
FHOG: Felzenszwalb's HOG;
DPM: Deformable Part Model; open-source software;
ICF: Integral Channel Features; open-source software;
FCTH: Fuzzy Color and Texture Histogram; open-source software;
CEDD: Color and Edge Directivity Descriptor; open-source software;
ROI: Region of Interest;
CBIR: Content-Based Image Retrieval;
lire: Lucene Image REtrieval; an open-source CBIR engine that integrates a variety of image features;
Descriptions of the foreground extraction referred to in the present invention can be found in the following published documents:
[a] Level-Set Person Segmentation and Tracking with Multi-Region Appearance Models and Top-Down Shape Information, Esther Horbert, Konstantinos Rematas, Bastian Leibe, 2011;
[b] Semantic Segmentation with Second-Order Pooling, João Carreira, 2012;
Descriptions of the feature computation referred to in the present invention can be found in the following published documents:
[a] FCTH: Fuzzy Color and Texture Histogram, A Low Level Feature for Accurate Image Retrieval, Savvas A. Chatzichristofis and Yiannis S. Boutalis, 2008;
[b] CEDD: Color and Edge Directivity Descriptor. A Compact Descriptor for Image Indexing and Retrieval, Savvas A. Chatzichristofis and Yiannis S. Boutalis, 2008;
[c] Image retrieval based on fuzzy color histogram processing, K. Konstantinidis, 2004;
Descriptions of the retrieval and matching referred to in the present invention can be found in the following published documents:
[a] Part-based Clothing Segmentation for Person Retrieval, Michael Weber, 2011;
[b] Person Re-identification Using Spatial Covariance Regions of Human Body Parts, Bak, 2004;
[c] Person Reidentification Using Spatiotemporal Appearance, Niloofar Gheissari, 2006;
The method of the invention comprises three main steps:
(1) extraction of the "retrieval-sensitive region"; (2) feature computation and storage in the feature database; (3) retrieval of similar pedestrians according to the given information.
The input to the retrieval system described here can be either of 2 kinds of pedestrian detection result: (1) the detection result of a DPM model, which contains the pedestrian's outer box and a small box for each body part; (2) the detection result of an ICF model or another pedestrian detection model, which contains only the pedestrian's outer box.
In the extraction stage of the "retrieval-sensitive region" (that is, the foreground), a different extraction algorithm is used for each kind of detection result. The retrieval-sensitive region is the "generalized upper body", from the shoulders to the thighs, excluding the head.
For each analyzed "retrieval-sensitive region", 2 kinds of features are computed: (1) the ROI-based improved CEDD, a single feature covering both color and texture; (2) fuzzy color. Both features are stored in the feature database. The feature library holds one record for each detected pedestrian, and each record contains the following information: the number of the image the pedestrian object belongs to, the position of the pedestrian object in that image, the foreground mask of the pedestrian object, the improved CEDD feature of the pedestrian object, and the fuzzy color feature of the pedestrian object.
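One possible layout for such a record is sketched below. The class name, field names, and the (x, y, width, height) box convention are assumptions for illustration; the text only lists which pieces of information a record holds.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PedestrianRecord:
    """One feature-library entry per detected pedestrian."""
    image_id: int                      # number of the image the pedestrian is in
    box: Tuple[int, int, int, int]     # assumed layout: x, y, width, height
    foreground_mask: List[List[int]]   # binary mask, 1 = foreground
    improved_cedd: List[float]         # improved CEDD feature (144 dimensions)
    fuzzy_color: List[float]           # 10 color bins plus mean brightness


record = PedestrianRecord(
    image_id=7,
    box=(40, 12, 32, 96),
    foreground_mask=[[0, 1], [1, 1]],
    improved_cedd=[0.0] * 144,
    fuzzy_color=[0.0] * 11,
)
```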
After the feature library has been created, the query in the retrieval stage can take 2 forms: (1) a given pedestrian whose "retrieval-sensitive region" is annotated manually or automatically, where the automatic method can use pedestrian detection and foreground analysis; (2) no given pedestrian, only a given color. In case (1), the improved CEDD feature is computed from the "retrieval-sensitive region" of the given pedestrian and compared with the CEDD feature of each object in the feature library, and the objects are sorted by similarity to give the main retrieval result; the fuzzy color feature is then computed and compared with the fuzzy color feature of each object in the feature library, and the objects are sorted by similarity to give the auxiliary retrieval result. In case (2), the fuzzy color feature is computed from the given color and compared with the fuzzy color feature of each object in the feature library, and the objects are sorted by similarity.
Once the pedestrian foreground has been extracted, a sub-region of it should be selected that gives the highest search accuracy with the simplest search method.
Some methods compute features over the whole body, with unsatisfactory results. The reasons are that extracting the foreground of the complete body is difficult, especially in crowds, where the foreground analysis error for the legs is large, and that the head features of the vast majority of people are similar.
Some methods divide the body into 2 parts, the upper body and the lower body. This works well for fairly simple clothing and colors but gives poor search accuracy in complex cases, because for some outfits the upper and lower body are hard to tell apart, for example shorts, one-piece dresses, and overcoats, and because clothing sometimes has large blocks of different colors. Fig. 8 shows a case that is hard to analyze: a garment around the waist could be regarded as either upper body or lower body, and the decision is difficult. The "skirt + trousers" outfit worn in winter raises a similar problem. After the division into 2 parts, how to form the retrieval result also becomes a problem, because there may be 3 kinds of result: [a] upper body only; [b] lower body only; [c] upper and lower body. This increases the complexity of the application.
The sensitive region selected here is the "generalized upper body", from the shoulders to the thighs, excluding the head. Foreground analysis of this sub-region is more accurate because it avoids the lower legs and feet, which easily cause foreground analysis errors. It also excludes the head, which has little discriminative power, and it avoids having to determine the lower body, so complex outfits are handled better.
The retrieval-sensitive region is also called the foreground. The foreground is the region of an image that is of value to the user or the application, and any region that is not foreground is background; both images and videos can have a foreground and a background. For pedestrian detection, the foreground can be the box containing the pedestrian. For pedestrian retrieval, the foreground can be refined to the image region occupied by the pedestrian. For the sake of retrieval sensitivity and accuracy, a part of that foreground can be chosen, namely the "generalized upper body". Foreground extraction hereinafter always refers to the analysis of the retrieval-sensitive region. For an image whose foreground has been extracted, an associated mask image (also called a mask or ROI) can be generated. The mask image marks foreground and background and has the same size as the input image. It is usually a binary image in which the foreground is 1 and the background is 0; in some cases it can instead be a color image, in which the foreground is one color (such as red or blue) and the background is another color (such as black).
The foreground extraction process based on the DPM model proposed here comprises 3 main steps:
1. The foreground masks of the DPM human-body parts are annotated manually;
2. Each human-body part in the pedestrian detection result is replaced with the part mask of the DPM model, and the region outside the human-body parts is all set to background, which gives the preliminary foreground;
3. The preliminary foreground is optimized with an edge optimization algorithm to eliminate the erroneous portions of the foreground (portions that should be background but were misjudged as foreground).
The DPM model contains 8 parts representing 8 positions of the human body, and the positions of these parts can vary within a certain range.
The detection result of the DPM model contains 9 boxes: 1 outer box and 8 small part boxes, as shown in Fig. 4. The 8 parts of the detection result correspond one-to-one with the 8 parts of the DPM model. Every part in the DPM model has FHOG features, and the FHOG features reflect the contour of the body part, so the foreground and background in each "small part box" can be annotated by hand along the part contour using general knowledge of human body structure. Take the small box of the left shoulder part in Fig. 4 as an example. First, the approximate range of the shoulder contour is estimated from general knowledge of human body structure and the contour lines of the model. The FHOG feature vectors with larger brightness values within this range are then selected and connected to form the contour. For the left shoulder, the left side of the contour is labeled background and the right side is labeled foreground.
The detection result of the DPM model contains 8 parts in total. During foreground analysis, these 8 parts are first sorted from top to bottom by vertical coordinate (y axis), and the 2nd to 5th parts are then selected to form the "generalized upper body"; the 1st part (the part at the highest position) is taken to be the head. The numbering of the 2nd to 5th parts is shown in Fig. 4.
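The part selection can be sketched as follows. Boxes are assumed to be (x, y, width, height) tuples with y measured downward from the top of the image, and sorting by the top edge of each box is an assumption (the text only says the parts are sorted by vertical coordinate); `select_upper_body_parts` is a hypothetical name.

```python
def select_upper_body_parts(part_boxes, first=2, last=5):
    """Sort part boxes top to bottom and keep parts first..last (1-based)."""
    ordered = sorted(part_boxes, key=lambda box: box[1])
    # Part 1 (the topmost) is treated as the head and skipped.
    return ordered[first - 1:last]


# Eight part boxes in arbitrary order; only the y values matter for the demo.
parts = [(10, 80, 20, 20), (12, 0, 16, 16), (0, 20, 20, 20), (22, 22, 20, 20),
         (5, 45, 30, 20), (8, 60, 26, 20), (6, 100, 14, 20), (24, 102, 14, 20)]
upper_body = select_upper_body_parts(parts)
```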
The image containing the small part boxes is then converted into a mask image whose foreground is the "generalized upper body", as follows. First, the region of the image outside the 2nd to 5th parts is marked as background, which can by convention be black. The small boxes of the 2nd to 5th parts are then replaced with the foreground masks of the parts, which merges parts 2 to 5 into the basic foreground. The gaps between the parts are then filled to form the preliminary foreground. The computation is shown in Fig. 5.
Because the preliminary foreground is composed of the part masks of the DPM model, it deviates somewhat from the actual pedestrian foreground, and further processing is needed to eliminate the erroneous portions of the preliminary foreground (actual background misjudged as foreground).
The preliminary foreground is optimized with an "edge contraction algorithm". The computation is shown in Fig. 6 and proceeds as follows. First, the Canny edges within the preliminary foreground are computed to obtain an edge map, and every row of the preliminary foreground region of the edge map is scanned. In each row, the left and right boundaries of the preliminary foreground are obtained first; the blue horizontal line in Fig. 6 represents one scanned row of pixels, and the left and right boundaries are those of the foreground region (the green region). Then, starting from the left and right boundary points and moving toward the middle of the body, pixels forming Canny edges are searched for. If a Canny edge pixel is found within a certain neighborhood of a boundary point, the new left or right boundary point is moved to that Canny edge pixel. For the left boundary point, the neighborhood is the boundary point and a region to its right; for the right boundary point, the neighborhood is the boundary point and a region to its left.
In this way, the erroneous foreground near the left and right boundaries of every row is removed, giving the optimized final foreground. The method is simple, efficient, and accurate.
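The row scan of the edge contraction algorithm can be sketched as follows. The sketch assumes the Canny edge map has already been computed elsewhere and is passed in as a binary grid; the neighborhood size `reach` and the choice of the first edge pixel met when scanning inward are assumptions, and `contract_to_edges` is a hypothetical name.

```python
def contract_to_edges(mask, edges, reach=5):
    """Shrink each row of a foreground mask to nearby edge pixels.

    mask, edges: equally sized lists of rows holding 0/1 values.
    reach:       how many pixels inward from a boundary to search.
    """
    contracted = []
    for mask_row, edge_row in zip(mask, edges):
        columns = [x for x, value in enumerate(mask_row) if value]
        if not columns:
            contracted.append([0] * len(mask_row))
            continue
        left, right = columns[0], columns[-1]
        new_left, new_right = left, right
        # Scan rightward from the left boundary point.
        for x in range(left, min(left + reach, right) + 1):
            if edge_row[x]:
                new_left = x
                break
        # Scan leftward from the right boundary point, never crossing new_left.
        for x in range(right, max(right - reach, new_left) - 1, -1):
            if edge_row[x]:
                new_right = x
                break
        contracted.append([1 if value and new_left <= x <= new_right else 0
                           for x, value in enumerate(mask_row)])
    return contracted


mask = [[0, 1, 1, 1, 1, 1, 1, 0],
        [0, 1, 1, 1, 1, 1, 1, 0],
        [0, 0, 0, 0, 0, 0, 0, 0]]
edges = [[0, 0, 1, 0, 0, 1, 0, 0],
         [0, 0, 0, 0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0, 0, 0, 0]]
final_mask = contract_to_edges(mask, edges, reach=2)
```

Rows without an edge pixel inside the neighborhood keep their original boundaries, which matches the "if found" condition in the text.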
Besides detection with the DPM model, there are many other pedestrian detection methods, such as ICF. These methods yield only the pedestrian's outer box and cannot give the position of each body part, so a foreground extraction algorithm different from that for DPM is needed. A foreground extraction method based on the statistics of edge points is therefore proposed, with the following flow:
1. Compute the Canny edge map of the image inside the pedestrian box;
2. Find the left and right boundaries of the "generalized upper body";
In Fig. 7 and Fig. 8, on the y coordinate axis the top of the box corresponds to y = 0 and the bottom of the box corresponds to the maximum y value (the box height). The range of the upper body within the box is estimated first. This is a predefined value, expressed as the ratio of the y coordinate in the pedestrian box to the maximum y value (the box height), and is typically 30% to 70%, as shown in Fig. 8. A sliding window whose height equals the predefined range and whose width is 3 pixels is then defined within this y range and moved from left to right; this sliding window is the green box in "c. determine left and right boundaries" in Fig. 8;
In Fig. 8, the x coordinate is an integer, 0 corresponds to the left edge of the pedestrian box, and the maximum x value is the number of pixels in one row of the pedestrian box. The sliding window moves with a step of 1 pixel, and the number of edge pixels inside the window (the white pixels in "b. edge map" in Fig. 8) is counted, so that every x coordinate has 1 corresponding statistic. Once the sliding window has moved from the left end of the pedestrian box to the right end, an array of two-dimensional data elements is obtained: {(x1, statistic 1), (x2, statistic 2), (x3, statistic 3), ...}. This array can be drawn as a curve, as shown in Fig. 7. The curve is then split into a left half and a right half; the vertical blue line in Fig. 7 marks the midpoint of the x range (the horizontal midpoint of the pedestrian box). The highest peak is found in the left half-curve and in the right half-curve, and the x coordinates of these 2 peaks are taken as the left and right boundaries of the pedestrian.
The multiple curves in Fig. 7 are the result of segmented statistics in the vertical direction (y axis). For example, the predefined upper-body range is divided into 4 segments, each 1/4 of the height of the predefined range. This gives 4 curves from the segment windows and 1 curve from the sliding window covering the whole height range. These 5 curves are then summed to give the final detection curve, with the aim of more stable detection. The y-axis positions of the sliding windows for curves 1 to 3 are marked in Fig. 7.
3. Find the horizontal lines of the shoulders and the waist;
The ranges of the shoulders and the waist are both preset values, 10% to 30% and 40% to 70% of the vertical (y) extent of the pedestrian box, respectively.
A sliding-window method similar to that in step 2 is used, with the direction of movement changed to top to bottom. The left and right edges of the sliding window are the boundaries found in step 2, and its height is 3 pixels. This gives an array of two-dimensional data elements: {(y1, statistic 1), (y2, statistic 2), (y3, statistic 3), ...}, which is drawn as a curve, and the y coordinates of the horizontal lines of the shoulders and the waist are determined from the peaks of the curve;
4. Obtain the extent of the whole "generalized upper body", which forms a preliminary foreground represented by a rectangular region, shown as the blue hatched region in Fig. 8;
5. Optimize the extent from step 4 with the edge-based foreground optimization described above to obtain the final result, similar to Fig. 6.
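Steps 2 to 4 of this flow can be sketched as follows. The sketch assumes the Canny edge map of the pedestrian box is already available as a binary grid, centers the sliding window on each coordinate, and omits the segmented curves of Fig. 7; the percentage ranges are the preset values quoted above, and all function names are hypothetical. The box is assumed to be large enough that every range contains at least one row or column.

```python
def windowed(counts, win):
    """Sum counts over a window of `win` entries centered on each index."""
    half = win // 2
    return [sum(counts[max(0, i - half):i + half + 1])
            for i in range(len(counts))]


def argmax_in(values, start, stop):
    """Index of the largest value in values[start:stop]; first wins ties."""
    return max(range(start, stop), key=lambda i: values[i])


def upper_body_box(edges, body=(30, 70), shoulder=(10, 30),
                   waist=(40, 70), win=3):
    """Return (left, top, right, bottom) of the generalized upper body.

    edges: rows of 0/1 edge flags covering the pedestrian box.
    Ranges are integer percentages of the box height.
    """
    height, width = len(edges), len(edges[0])
    top, bottom = height * body[0] // 100, height * body[1] // 100

    # Step 2: edge counts per column inside the upper-body rows.
    per_col = [sum(edges[y][x] for y in range(top, bottom))
               for x in range(width)]
    col_curve = windowed(per_col, win)
    mid = width // 2
    left = argmax_in(col_curve, 0, mid)
    right = argmax_in(col_curve, mid, width)

    # Step 3: edge counts per row between the left and right boundaries.
    per_row = [sum(edges[y][left:right + 1]) for y in range(height)]
    row_curve = windowed(per_row, win)
    shoulder_y = argmax_in(row_curve, height * shoulder[0] // 100,
                           height * shoulder[1] // 100)
    waist_y = argmax_in(row_curve, height * waist[0] // 100,
                        height * waist[1] // 100)

    # Step 4: the rectangle is the preliminary foreground.
    return left, shoulder_y, right, waist_y


# Toy edge map: verticals at x=3 and x=8, horizontals at y=3 and y=10.
H, W = 20, 12
edges = [[0] * W for _ in range(H)]
for y in range(3, 15):
    edges[y][3] = edges[y][8] = 1
for x in range(3, 9):
    edges[3][x] = edges[10][x] = 1
box = upper_body_box(edges, win=1)
```

The demo uses a 1-pixel window so that the toy rectangle is recovered exactly; with the 3-pixel window from the text, interior edges can shift a peak by a pixel.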
Many kinds of color and texture features exist. Features representing color include RGB histograms; features representing texture include wavelets and Gabor; features representing color and texture together include mpeg-7-color-layout, CEDD, and FCTH.
When a pedestrian is given, a feature is needed that captures both color and texture and also offers high retrieval efficiency. According to tests and the published papers, CEDD meets this requirement.
CEDD stands for Color and Edge Directivity Descriptor. It is a 144-dimensional vector that encodes the color at edges and therefore reflects texture and color at the same time. Basic CEDD is open-source software. Its principle comes from the paper "CEDD: Color and Edge Directivity Descriptor. A Compact Descriptor for Image Indexing and Retrieval, Savvas A. Chatzichristofis and Yiannis S. Boutalis, 2008", and the program comes from "http://chatzichristofis.info/Page_id=15".
Basic CEDD works on a rectangular region, whereas the pedestrian foreground described above is an irregular region, so the basic CEDD algorithm needs to be improved.
The basic CEDD algorithm provides a recommended method for computing the feature distance (which expresses similarity). For some features this method reaches its maximum value, which makes sorting impossible, so it also needs to be improved.
In addition, the principle and algorithm of mpeg-7-color-layout can be found at:
http://en.wikipedia.org/wiki/Color_Layout_Descriptor;
The algorithms for wavelet texture and Gabor texture can be found at: http://www.semanticmetadata.net/lire/;
According to the papers and tests, the retrieval performance of FCTH and CEDD differs little, so only CEDD is considered here and FCTH is not;
Similarity is expressed below as a feature distance: the smaller the feature distance between 2 image regions, the more similar they are; the larger the feature distance, the more they differ.
The basic CEDD algorithm works on the whole image, that is, a rectangular region; it is modified here to support an ROI.
Basic CEDD divides a rectangular region into 64 small cells, computes the feature of each cell, and accumulates these features to obtain the overall feature. To support an ROI, it is changed to compute only the features of the cells that lie inside the ROI. The ROI here can be the foreground region.
The principles of the basic and improved algorithms are shown in Fig. 9; the grid in the schematic is illustrative only and does not show all 64 cells. The improved algorithm proceeds as follows:
(1) The input is the pedestrian box and the "generalized upper body" foreground, as shown in Fig. 6;
(2) The pedestrian box is divided into 64 cells in the same way as in basic CEDD;
(3) Each cell is checked; if all pixels in the cell belong to the foreground, the cell is a valid cell and its CEDD feature is computed;
(4) The CEDD features of the valid cells are accumulated to obtain the CEDD feature of the foreground region.
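Steps (1) to (4) can be sketched as follows. The per-cell CEDD computation itself is not reproduced: it is passed in as the callable `cell_feature`, a stand-in for the basic CEDD routine. Laying out the 64 cells as an 8 x 8 grid is an assumption, and `improved_cedd` is a hypothetical name.

```python
def improved_cedd(mask, cell_feature, grid=8):
    """Accumulate per-cell features over cells that are entirely foreground.

    mask:         rows of 0/1 flags covering the pedestrian box.
    cell_feature: callable (x0, y0, x1, y1) -> list of floats for that cell.
    Returns None when no cell lies entirely inside the foreground.
    """
    height, width = len(mask), len(mask[0])
    total = None
    for gy in range(grid):
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        for gx in range(grid):
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            if y1 <= y0 or x1 <= x0:
                continue  # box smaller than the grid: empty cell
            # A cell is valid only if every pixel in it is foreground.
            if all(mask[y][x] for y in range(y0, y1) for x in range(x0, x1)):
                feat = cell_feature(x0, y0, x1, y1)
                if total is None:
                    total = list(feat)
                else:
                    total = [a + b for a, b in zip(total, feat)]
    return total


# 16 x 16 box; foreground covers rows 4..11 and columns 4..12.
mask = [[1 if 4 <= y <= 11 and 4 <= x <= 12 else 0 for x in range(16)]
        for y in range(16)]
# Dummy per-cell feature: [1, cell area], so the sum counts the valid cells.
feature = improved_cedd(
    mask, lambda x0, y0, x1, y1: [1.0, float((x1 - x0) * (y1 - y0))])
```

Column 12 covers only half of its cell, so that cell is skipped and 16 cells of 4 pixels each contribute.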
Basic CEDD computes similarity (which varies inversely with feature distance) using the Tanimoto method, i.e.:
$$T_{ij} = \frac{x_i^{T} x_j}{x_i^{T} x_i + x_j^{T} x_j - x_i^{T} x_j}$$
where $x_i$ and $x_j$ in the formula are 2 CEDD features and $T_{ij}$ lies in the range [0, 1]. The feature distance is expressed as distance = M - M*Tij, where M is the maximum value of the feature distance.
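The Tanimoto-based distance can be sketched as follows. Returning 0 when both features are all zero is an assumption made to avoid a division by zero, and `tanimoto_distance` is a hypothetical name; M defaults to 100, the typical value given earlier.

```python
def tanimoto_distance(xi, xj, max_distance=100.0):
    """distance = M - M * T, with T the Tanimoto coefficient of xi and xj."""
    dot = sum(a * b for a, b in zip(xi, xj))
    norm_i = sum(a * a for a in xi)
    norm_j = sum(b * b for b in xj)
    denominator = norm_i + norm_j - dot
    if denominator == 0:
        # Assumption: two all-zero features are treated as identical.
        return 0.0
    return max_distance - max_distance * dot / denominator


same = tanimoto_distance([1, 2, 3], [1, 2, 3])
query_to_b1 = tanimoto_distance([1, 0, 0], [0, 2, 0])
query_to_b2 = tanimoto_distance([1, 0, 0], [0, 0, 3])
```

Features with no overlapping non-zero dimension have a zero inner product, so their distance is always the maximum M.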
For some images, several feature distances may all equal the maximum value. For example, if the feature distances between the image query and the images b1, b2, b3 are all the maximum value M, they cannot be sorted by similarity. This happens because many situations make Tij equal to 0. Suppose the feature of image query is (1,0,0) and the features of images b1 and b2 are (0,2,0) and (0,0,3). The inner products of (1,0,0) with (0,2,0) and with (0,0,3) are both 0, so Tij is 0 and the feature distance takes its maximum value. The features here are 3-dimensional vectors used only to illustrate the principle. Although the distances from query to b1 and to b2 are both the maximum, a distance can in fact still be assessed between the individual dimensions of the CEDD feature. Continuing the 3-dimensional example, if the 3 dimensions represent the colors (red, purple, blue), the distance between query and b1 can be considered smaller than the distance between query and b2, since red and purple are more alike while red and blue differ more.
To address this, the feature distance calculation is improved, as shown in Fig. 10. The improved algorithm is:
(1) The input is 2 CEDD features;
(2) The feature distance distance1 is first calculated with the Tanimoto method; if it is below the maximum M, it is output as the final result distance;
(3) If distance1 reaches the maximum M, distance2 is calculated and distance = distance2 + M is output as the feature distance.
distance2 is calculated as follows. The CEDD feature is regarded as a histogram in which each dimension represents the color under a certain condition, and the value of each dimension represents the number of pixels whose color meets this condition. For the input features t1 and t2, the 3 dimensions with the largest values are found in each, and the weighted sum of the differences between the indices of these dimensions is calculated directly. For example, let t1 be (10,20,0,70,30) and t2 be (0,30,50,100,10); the CEDD features here are 5-dimensional and serve only to illustrate the calculation. The indices of the dimensions of t1 and t2 in descending order of value are {4,5,2,1,3} and {4,3,2,5,1}. The first 3 indices are {4,5,2} and {4,3,2}. The absolute differences of corresponding indices are then summed with weights set according to rank, and the result is taken as distance2, i.e. distance2 = |4-4| + |5-3|*0.5 + |2-2|*0.25 = 1.
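The improved distance, including distance2, can be sketched as follows. The weights (1, 0.5, 0.25) come from the worked example; M = 100, the tie-breaking rule (lower index first when values are equal), and all function names are assumptions introduced for illustration.

```python
from typing import List, Sequence


def top_dims(feature: Sequence[float], k: int = 3) -> List[int]:
    """1-based indices of the k largest values, in descending order of value."""
    # Assumption: ties are broken in favour of the lower index.
    order = sorted(range(len(feature)), key=lambda i: (-feature[i], i))
    return [i + 1 for i in order[:k]]


def distance2(t1: Sequence[float], t2: Sequence[float],
              weights: Sequence[float] = (1.0, 0.5, 0.25)) -> float:
    """Weighted sum of absolute differences between top-ranked dimension indices."""
    r1 = top_dims(t1, len(weights))
    r2 = top_dims(t2, len(weights))
    return sum(w * abs(a - b) for w, a, b in zip(weights, r1, r2))


def improved_distance(t1: Sequence[float], t2: Sequence[float], M: float = 100.0) -> float:
    """Tanimoto distance, extended by distance2 when it saturates at M."""
    dot = sum(a * b for a, b in zip(t1, t2))
    denom = sum(a * a for a in t1) + sum(b * b for b in t2) - dot
    tij = dot / denom if denom else 1.0  # assumption for two all-zero features
    distance1 = M - M * tij
    if distance1 < M:
        return distance1
    return distance2(t1, t2) + M
```

With the 3-dimensional features from the earlier example, the query (1,0,0) now gets a smaller distance to (0,2,0) than to (0,0,3), so the two results can be ranked even though both Tanimoto distances equal M.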
CEDD features contain both color and texture, and are suited to retrieval when a specific pedestrian is given. When no specific pedestrian is provided and only some fuzzy information is available, such as "red jacket", as shown in Fig. 2, a retrieval feature is desired that has lower precision but broad coverage. The given fuzzy color can be picked from a palette similar to the Windows color palette.
Here a fuzzy color histogram is chosen to represent the feature, and texture is not used, because texture is relatively complex and narrows the coverage. The fuzzy color calculation is already contained in the CEDD feature: colors can be divided into 10 or 24 fuzzy colors (called bin colors) to form a fuzzy color histogram. Bin colors are common and easy to understand, for example black, gray, red, green, and blue. The fuzzy color feature contained in CEDD is hereinafter referred to as the fuzzy color feature.
Tests showed that when the feature distance is calculated from the fuzzy color feature with the above "improved feature distance calculation method", or from features such as mpeg-7-colorlayout or an ordinary rgb histogram, the accuracy is unsatisfactory, especially for monochrome and gray cases; the main problem is that the coverage is narrow. An example regarding coverage can be seen in Fig. 12.
Here, when the given information is a fuzzy color, the fuzzy color feature is used without considering texture, and a new feature calculation method is proposed that achieves wider coverage. Fig. 12 compares similarity rankings by "fuzzy color feature distance" and by CEDD feature distance. In the figure, feature distances are ordered from small to large: an upper row is smaller than the row below it, and within each row the left is smaller than the right. It can be seen that both methods retrieve the target; CEDD retrieval is more accurate, while "fuzzy color similarity" has wider coverage.
The fuzzy color feature (also called the fuzzy color histogram) of this method extends the fuzzy colors of CEDD: it contains the 10 fuzzy colors in CEDD (e.g. red, purple, blue, black, gray, white, green) and adds 1 more, the average luminance, forming an 11-dimensional fuzzy color feature.
The feature distance of the fuzzy color feature is calculated as follows; see Fig. 11.
1. The fuzzy color feature (also called the fuzzy color histogram) is calculated from the given color;
2. The base distance d1 is calculated according to whether the indices of the main colors of the fuzzy histograms overlap;
First, the 4 main color components whose color values exceed a certain threshold are found in the given color histogram (excluding the luminance component of the 11 dimensions), and d1 is preset to 4. Each color component is then checked in turn: using its index, the corresponding color value is taken from the color histogram of the feature in the feature library, and if that value also exceeds the threshold, the distance is reduced by 1. After the 4 color components have been checked, the base distance d1 is at most 4 and at least 0. For example, suppose the given color and some color histogram in the feature library are (10,20,50,40,30) and (10,20,0,30,0); these are 5-dimensional data used only to illustrate the principle. For the given color histogram, if the threshold is 10 and the 4 color components with the largest values are selected, the resulting component indices are {3,4,5,2}, with color values (50,40,30,20). In the feature from the feature library, the color values at these 4 indices are (0,30,0,20). Comparing these two 4-dimensional color vectors, only the 2nd and 4th color values are greater than 0 in both, so the base distance d1 = 4-2 = 2.
3. If the given color is black, gray, or white, the difference between the luminance components of the 2 fuzzy color features is calculated, where luminance is the average v value (the value component of hsv) computed from the hsv representation. The final distance is: Distance = d1 + luminance difference;
4. If the given color is not black, gray, or white, the sum of the absolute differences of the color values of the first 3 main color components of the fuzzy histograms is calculated and added to the base distance, i.e. the final distance is: d1 + (difference in color value of the 1st main component of the histogram) + (difference of the 2nd main component) + (difference of the 3rd main component).
Here the main components are determined from the given color. In the example in step 2, the given color histogram is sorted by color value, and the indices and color values of the top 3 components are {3,4,5} and (50,40,30). The color values at these indices in the color feature from the feature library are (0,30,0), so the color difference = |50-0| + |40-30| + |30-0| = 90, and the final distance is: Distance = d1 + 90.
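Steps 2 to 4 can be sketched as below. This is an illustration under several assumptions: the color bins and the luminance are passed separately, the threshold of 10 is taken from the worked example, ties are broken by lower index, the luminance scale is left to the caller, and all function names are hypothetical.

```python
from typing import List, Sequence


def ranked_bins(hist: Sequence[float], k: int) -> List[int]:
    """0-based indices of the k largest bins, in descending order of value."""
    return sorted(range(len(hist)), key=lambda i: (-hist[i], i))[:k]


def base_distance(query: Sequence[float], stored: Sequence[float],
                  threshold: float = 10, k: int = 4) -> int:
    """Step 2: start at k and subtract 1 per main color shared by both histograms."""
    d1 = k
    for i in ranked_bins(query, k):
        if query[i] > threshold and stored[i] > threshold:
            d1 -= 1
    return d1


def fuzzy_color_distance(query: Sequence[float], stored: Sequence[float],
                         query_lum: float, stored_lum: float,
                         achromatic: bool, threshold: float = 10) -> float:
    """Steps 3 and 4: add a luminance or color-value difference to d1.

    `achromatic` is True when the given color is black, gray, or white.
    """
    d1 = base_distance(query, stored, threshold)
    if achromatic:
        return d1 + abs(query_lum - stored_lum)
    top3 = ranked_bins(query, 3)
    return d1 + sum(abs(query[i] - stored[i]) for i in top3)
```

For the histograms of the worked example, `base_distance` gives 2 and the chromatic branch adds the color difference of 90.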
Relevant comparative experimental data are given below:
Among current open-source CBIR engines, LIRE performs best and provides many features and comparison methods, but it does not support ROI; it is used here as one comparison method;
For the comparison with ROI (i.e. foreground mask), programs were written following mainstream methods and compared with the method herein, including: mpeg-7-colorlayout, ordinary rgb histogram, and gabor texture;
The test set comes from screenshots of actual surveillance video and photographs of various scenes, about 5000 images in total, containing about 25000 detected pedestrians. 2000 pedestrians who appear repeatedly in different scenes or pictures were selected as the given retrieval objects.
Retrieval accuracy is calculated in a fairly simple way: accuracy is calculated only for given pedestrians, and the case where only a color is given is not considered.
For a given pedestrian, retrieval is considered successful if the given target appears among the 30 results with the smallest feature distance.
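The success criterion can be sketched as a top-30 hit test. The data layout (a mapping from record id to feature distance, plus the set of record ids showing the given pedestrian) and the function names are assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Set, Tuple


def is_hit(distances: Dict[str, float], targets: Set[str], k: int = 30) -> bool:
    """True if any target record is among the k smallest feature distances."""
    top_k = sorted(distances, key=distances.get)[:k]
    return any(record in targets for record in top_k)


def retrieval_accuracy(queries: Iterable[Tuple[Dict[str, float], Set[str]]],
                       k: int = 30) -> float:
    """Fraction of queries whose target appears in the top k results."""
    results: List[bool] = [is_hit(d, t, k) for d, t in queries]
    return sum(results) / len(results) if results else 0.0
```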
In addition, according to the literature, FCTH and CEDD come from the same family of open-source software and have almost identical retrieval performance, so FCTH is not included in the comparison tests.
Using the enclosing bounding box directly, with the following features in LIRE: CEDD, mpeg-7-color-layout, ordinary color histogram, gabor texture:
(1) CEDD has the highest retrieval accuracy, about 70%;
(2) mpeg-7-color-layout has the second highest accuracy, about 60%;
(3) ordinary rgb histogram, about 50%;
(4) gabor texture, about 40%;
Using the method of the present invention:
(1) CEDD with ROI support; accuracy about 95%;
(2) fuzzy color histogram with fuzzy color similarity; accuracy about 75%;
(3) mpeg-7-color-layout; accuracy about 70%;
(4) ordinary rgb histogram, about 60%;
(5) gabor texture, about 50%.