CN101187984A

CN101187984A - An image detection method and device

Info

Publication number: CN101187984A
Application number: CNA2007101788290A
Authority: CN
Inventors: 邓亚峰; 黄英; 王浩; 邱嵩; 霍晓芳; 温小勇; 俞青; 邓中翰
Original assignee: Vimicro Corp
Current assignee: Beijing Vimicro Artificial Intelligence Chip Technology Co ltd
Priority date: 2007-12-05
Filing date: 2007-12-05
Publication date: 2008-05-28
Anticipated expiration: 2027-12-05
Also published as: CN100561505C

Abstract

The invention discloses an image detection method and device, providing an image detection technique that is better suited to hardware implementation, computationally simpler, and uses less memory. The image detection method comprises: computing the integral image and the squared integral image of an input image, wherein the integral image and/or the squared integral image is computed in top-to-bottom, left-to-right order using, for each row of the input image, the sum of the brightness of all pixels in that row up to the current pixel; obtaining the microstructure feature values required for the input image from the integral image and the squared integral image according to an object detector obtained by training in advance; and determining the object region of the input image according to the microstructure feature values. The invention simplifies the image detection computation and reduces memory usage, making it suitable for hardware implementation and reducing product cost.

Description

An image detection method and device

Technical field

The present invention relates to the technical field of image processing, and in particular to an image detection method and device.

Background art

In computer vision and image processing, face information obtained from images or video has important uses in fields such as human-computer interaction, security, and entertainment. Face detection, the technology of automatically obtaining the number, size, and position of faces in an image, has therefore received a great deal of attention. In recent years, with the development of computer vision and pattern recognition, face detection has advanced rapidly and is gradually maturing.

Viola et al. proposed a face detection technique based on microstructure features (Haar-like features) and a cascaded adaptive boosting (AdaBoost) classifier. Its performance is comparable to methods based on support vector machines (SVM) and neural networks, but it is much faster than those methods and can essentially run in real time. The method attracted researchers' attention as soon as it was proposed; many improvements have since been put forward, and it has been applied in many industrial products.

The face detection method proposed by Viola is fast for two main reasons. First, microstructure feature values are computed from an integral image, so the feature values of the input image can be computed quickly. Second, the cascaded AdaBoost algorithm first uses layers with a small amount of computation to reject most of the easily excluded distractors, and then uses layers with a large amount of computation to handle the few remaining candidates. The microstructure features used in this method are shown in Fig. 1; each microstructure feature value is defined as the difference between the sum of the pixel brightness (i.e. pixel gray values) in the gray rectangular region and the sum of the pixel brightness in the white rectangular region.

To compute microstructure feature values quickly, Viola proposed the integral image shown in Fig. 2. The value of the integral image at point (x, y) is defined as the sum of all pixel gray values in the gray rectangular region above and to the left of that point, that is:

II(x, y) = \sum_{0 \leq x' \leq x,\ 0 \leq y' \leq y} I(x', y')

where II(x, y) denotes the value of the integral image at point (x, y), and I(x', y') denotes the gray value of the input image pixel at point (x', y'). Viola obtains the integral image in a single scan of the image, starting from the upper-left corner, using the following recurrence:

s(x, y) = s(x, y-1) + I(x, y)

II(x, y) = II(x-1, y) + s(x, y)

where s(x, y) denotes the sum of the gray values of all pixels in column x up to and including row y, with s(x, -1) = 0 and II(-1, y) = 0 by definition.

With the integral image, the sum of the pixel gray values in any rectangular region can be obtained quickly. Let sum(r) denote the sum of the pixel gray values in rectangular region r. As shown in Fig. 3, where A, B, C, and D each denote a shaded rectangular region and points 1, 2, 3, and 4 are the lower-right vertices of regions A, B, C, and D respectively, the definition of the integral image gives the sum of the pixel gray values in any rectangular region D as:

sum(D) = II(4) - II(2) - II(3) + II(1)
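The recurrence and the four-corner lookup described above can be sketched as follows. This is a minimal illustration in Python, not the implementation of the cited method: the function names `integral_image` and `rect_sum`, and the convention that the image is a list of rows indexed as `img[y][x]`, are assumptions made for this example.

```python
def integral_image(img):
    """Integral image via the column-sum recurrence; img is a list of rows, img[y][x]."""
    h, w = len(img), len(img[0])
    s = [[0] * w for _ in range(h)]   # s[y][x]: sum of column x from row 0 down to row y
    ii = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # s(x, y) = s(x, y-1) + I(x, y), with s(x, -1) = 0
            s[y][x] = (s[y - 1][x] if y > 0 else 0) + img[y][x]
            # II(x, y) = II(x-1, y) + s(x, y), with II(-1, y) = 0
            ii[y][x] = (ii[y][x - 1] if x > 0 else 0) + s[y][x]
    return ii


def rect_sum(ii, left, top, right, bottom):
    """Sum of pixels in the inclusive rectangle [left..right] x [top..bottom]."""
    def at(x, y):
        # Integral image values outside the image (index -1) are taken as 0.
        return ii[y][x] if x >= 0 and y >= 0 else 0
    # sum(D) = II(4) - II(2) - II(3) + II(1)
    return (at(right, bottom) - at(right, top - 1)
            - at(left - 1, bottom) + at(left - 1, top - 1))
```

Note that this variant keeps a full W × H array `s`, which is the memory cost the embodiment later sets out to avoid.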

To eliminate the influence of conditions such as illumination, Viola further normalizes the above microstructure feature values by the image brightness variance. Viola defines the image brightness variance as:

\sigma^{2} = \frac{1}{N} \sum_{i, j} (I(i, j) - m)^{2}

where

m = \frac{1}{N} \sum_{i, j} I(i, j)

is the mean brightness, I(i, j) is the brightness value at point (i, j), and N is the number of pixels in the input image. Expanding the square, the image brightness variance can be computed as:

\sigma^{2} = \frac{1}{N} \sum_{i, j} I(i, j)^{2} - m^{2}

The normalized microstructure feature value is then defined as g_j = f_j / σ, where f_j is the microstructure feature value defined above, i.e. the difference between the sum of the pixel brightness in the gray rectangular region and the sum of the pixel brightness in the white rectangular region.
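As a small sketch of this normalization, the window's pixel sum and squared pixel sum (both obtainable from integral images) are enough to compute σ. The function names are hypothetical, and returning 0 for a perfectly flat window (σ = 0) is an assumption of this example, since the text does not say how that case is handled.

```python
import math


def window_sigma(sum_i, sum_sq, n):
    """Brightness standard deviation of a window from its pixel sum and squared-pixel sum."""
    mean = sum_i / n
    var = sum_sq / n - mean * mean   # variance = mean of squares minus square of mean
    return math.sqrt(max(var, 0.0))  # clamp tiny negative values caused by rounding


def normalized_feature(f_j, sum_i, sum_sq, n):
    """g_j = f_j / sigma; a flat window returns 0.0 (assumption, not from the text)."""
    sigma = window_sigma(sum_i, sum_sq, n)
    return f_j / sigma if sigma > 0 else 0.0
```

For a window with pixels 1, 2, 3, 4 the inputs are `sum_i = 10`, `sum_sq = 30`, `n = 4`, giving a variance of 1.25.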

Viola constructs the simplest tree classifier from each microstructure feature to serve as a weak classifier, as follows:

where x is an input image of fixed size, g_j(x) denotes the j-th microstructure feature value of that image, θ_j is the decision threshold for the j-th microstructure feature, and p_j takes the value 1 or -1. When p_j is 1, the decision uses the greater-than sign; when p_j is -1, the decision uses the less-than sign. h_j(x) denotes the decision output of the j-th weak classifier. Each weak classifier therefore needs only one threshold comparison to make its decision.
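A single-threshold decision of this kind can be sketched in one line. The function name is hypothetical, and the output values (1 when the comparison holds, 0 otherwise) are an assumption of this example; the text only states which comparison sign each polarity selects.

```python
def weak_classify(g, theta, p):
    """Single-threshold weak classifier: p = 1 tests g > theta, p = -1 tests g < theta."""
    # Multiplying both sides by p flips the comparison when p is -1.
    return 1 if p * g > p * theta else 0
```

Folding the polarity into the comparison means both cases cost exactly one threshold comparison.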

The structure of the cascaded AdaBoost classifier proposed by Viola is shown in Fig. 4. Every candidate window is first judged by the first-layer classifier. If it passes the first layer, the second-layer classifier continues the judgment; otherwise the window is rejected immediately. Each subsequent layer is processed in the same way, and the rectangular regions that pass all layers are taken as the final face regions.
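The early-rejection behavior of the cascade can be sketched as below. Representing each layer as a callable that returns True when the window passes is an assumption made for illustration; the internals of each strong classifier are not modeled here.

```python
def cascade_accepts(window, layers):
    """Return True only if the window passes every layer, evaluated in order."""
    for layer in layers:
        if not layer(window):
            return False  # rejected immediately; later layers are never evaluated
    return True
```

Because most windows fail an early, cheap layer, the expensive later layers run only on the few remaining candidates.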

Furthermore, to detect faces of different sizes at different positions, Viola uses feature scaling. First, the width and height of the face detector model are set to MW and MH respectively (Viola uses MW = 24, MH = 24), and face samples and non-face samples cropped and scaled to this size are used to train the cascaded AdaBoost face detection model. Given a scaling ratio SR, feature scaling yields a series of classifiers at different scales whose width and height are ROUND(MW*SR^s) and ROUND(MH*SR^s) respectively, where s is an integer greater than 0 and ROUND() rounds the value in parentheses to the nearest integer. To detect faces of different sizes, a single integral image is computed for the input image, and the face detectors at each of the above scales then search it exhaustively, thereby detecting faces of different sizes at different positions. Every candidate rectangle that passes the cascaded detector is added to the face detection queue and recorded.
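The series of detector sizes can be enumerated as in the following sketch. The function names and the stopping rule (stop once the detector no longer fits in the image) are assumptions of this example. Following the text, s starts at 1. ROUND is implemented as round-half-up because Python's built-in `round` rounds halves to the nearest even integer.

```python
import math


def round_half_up(v):
    """ROUND(): round to the nearest integer, with halves rounded up."""
    return int(math.floor(v + 0.5))


def detector_sizes(mw, mh, sr, img_w, img_h):
    """List (width, height) of the scaled detectors that still fit inside the image."""
    if sr <= 1:
        raise ValueError("scaling ratio SR must be greater than 1")
    sizes = []
    s = 1  # the text defines s as an integer greater than 0
    while True:
        w = round_half_up(mw * sr ** s)
        h = round_half_up(mh * sr ** s)
        if w > img_w or h > img_h:
            break
        sizes.append((w, h))
        s += 1
    return sizes
```

For example, with MW = MH = 24 and a hypothetical SR = 1.25 on a 64×48 image, the sizes are 30, 38, and 47 pixels square.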

Because a single face may produce multiple detection results as scale and position vary, face detection algorithms generally include a post-processing step that fuses the detection results so that each face position yields only one detection. Merging can also combine some false detections, which lowers the false detection rate. In the processing above, the candidate face positions that pass the face detector (called candidate face boxes) are added to the face detection queue; overlapping candidate face boxes then need to be merged as follows.

Each candidate face box corresponds to a rectangle. For any two candidate face boxes, first compute the area of the overlap between the two rectangles; then compute the ratio of the overlap area to the average area of the two rectangles (called the overlap degree). The overlap degree is compared with a threshold: if it is greater than the threshold, the two candidate face boxes are considered to overlap and to be the same face; otherwise they are considered not to overlap. All candidate face boxes that overlap with a given candidate face box are merged with it. Specifically, the left-border abscissas, right-border abscissas, top-border ordinates, and bottom-border ordinates of all the rectangles are averaged separately to give the left-border abscissa, right-border abscissa, top-border ordinate, and bottom-border ordinate of the final merged rectangle.
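The overlap degree and the border-averaging merge can be sketched as follows. Boxes are represented as `(left, top, right, bottom)` tuples, which is an assumption of this example, as are the function names.

```python
def area(box):
    """Area of a (left, top, right, bottom) box; empty or inverted boxes have area 0."""
    left, top, right, bottom = box
    return max(0, right - left) * max(0, bottom - top)


def overlap_degree(a, b):
    """Overlap area divided by the average area of the two rectangles."""
    inter = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    mean_area = (area(a) + area(b)) / 2
    return area(inter) / mean_area if mean_area > 0 else 0.0


def merge_boxes(boxes):
    """Average each border coordinate over all boxes in the group."""
    n = len(boxes)
    return tuple(sum(b[k] for b in boxes) / n for k in range(4))
```

Two 10×10 boxes offset by 5 pixels horizontally have an overlap degree of 0.5, so they would be merged under any threshold below 0.5.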

Although the face detection method proposed by Viola has many advantages, it was not designed for hardware implementation. Portable devices such as cameras, digital cameras, and digital video cameras need the face detection algorithm to be implemented in a chip, so that face information is available to functions such as face-based automatic exposure, autofocus, and automatic white balance. The chief concern in hardware design is chip area, which directly determines the cost of the chip. The aspects of an algorithm most closely tied to chip area are the memory it occupies and the complexity of its computation. From a cost standpoint, therefore, the face detection method proposed by Viola is not well suited to hardware implementation.

Summary of the invention

Embodiments of the invention provide an image detection method and device, so as to offer an image detection technique that is better suited to hardware implementation, algorithmically simpler, and uses less memory.

The image detection method provided by an embodiment of the invention comprises:

computing the integral image and the squared integral image of an input image, wherein the integral image and/or the squared integral image is computed in top-to-bottom, left-to-right order using, for each row of the input image, the sum of the brightness of all pixels in that row up to the current pixel;

obtaining, according to an object detector obtained by training in advance, the required microstructure feature values of the input image from the integral image and the squared integral image, and determining the object region of the input image according to the microstructure feature values.

The image detection device provided by an embodiment of the invention comprises:

an integral image computing unit, configured to compute the integral image and the squared integral image of an input image, wherein the integral image and/or the squared integral image is computed in top-to-bottom, left-to-right order using, for each row of the input image, the sum of the brightness of all pixels in that row up to the current pixel;

an object region determining unit, configured to obtain, according to an object detector obtained by training in advance, the required microstructure feature values of the input image from the integral image and the squared integral image, and to determine the object region of the input image according to the microstructure feature values.

In embodiments of the invention, the integral image and the squared integral image of the input image are computed, wherein the integral image and/or the squared integral image is computed in top-to-bottom, left-to-right order using, for each row of the input image, the sum of the brightness of all pixels in that row up to the current pixel; according to an object detector obtained by training in advance, the required microstructure feature values of the input image are obtained from the integral image and the squared integral image, and the object region of the input image is determined according to the microstructure feature values. This technical solution simplifies the algorithm for detecting objects in an image and reduces memory usage, making it better suited to hardware implementation and lowering product cost.

Description of drawings

Fig. 1 is a schematic diagram of the microstructure features used by the face detection technique proposed by Viola et al. in the prior art;

Fig. 2 is a schematic diagram of the integral image in the prior art;

Fig. 3 is a schematic diagram of using the integral image to obtain the sum of pixel gray values in an arbitrary rectangle in the prior art, in which points 1, 2, 3, and 4 are the lower-right vertices of regions A, B, C, and D respectively;

Fig. 4 is a schematic structural diagram of the cascaded face detector in the prior art;

Fig. 5 is a schematic flowchart of the image detection method provided by an embodiment of the invention;

Fig. 6 is a schematic diagram of the microstructure features provided by an embodiment of the invention.

Detailed description of the embodiments

Embodiments of the invention address the characteristics of chip design and, with the aim of reducing memory usage and simplifying the algorithm's computation, propose an image detection method and device suited to hardware implementation.

The embodiments take detection of face regions in an image as an example, and improve on the prior-art algorithm for detecting face regions in several respects: the definition of the weak classifier, the way the integral image is computed, the way the squared integral image is computed, the scale of the face detector, and the construction of the weak classifier.

Within object detection, face detection is one subfield; other applications such as car detection and pedestrian detection are similar to face detection, and all are two-class classification techniques in pattern recognition. The solution proposed by the embodiments is therefore not limited to detecting face regions in an image; it can also be applied, as needed, to detect the regions occupied by other types of objects. For example, it can detect the regions where cars are located in an image, or the regions where human bodies or animals are located, and so on.

Specific implementations of the embodiments of the invention are described in detail below with reference to the accompanying drawings.

Referring to Fig. 5, an image detection method provided by an embodiment of the invention comprises:

S501: Obtain a face detector by training in advance. In defining the weak classifier, the microstructure feature value is defined in advance as the difference between the brightness sums of the same number of pixels in two rectangular regions. The scale of the face detector model is a power of 2, i.e. the width and the height of the face detector are both powers of 2.

S502: Compute the integral image and the squared integral image of the input image, wherein the integral image and/or the squared integral image is computed in top-to-bottom, left-to-right order using, for each row of the input image, the sum of the brightness of all pixels in that row up to the current pixel; and obtain the microstructure feature values of the input image from the integral image and the squared integral image.

S503: Using the face detector at a given scale, traverse the image horizontally and vertically with a given step to obtain all possible rectangular region positions at that scale.

S504: Based on the microstructure feature values of the input image, use the pre-trained face detector to judge whether each rectangular region is a candidate face region. If so, go to step S505; otherwise, go to step S506.

S505: Add the candidate face region to the face region queue.

S506: Discard the rectangular region.

S507: Judge whether detection with the face detectors of all scales is complete. If so, go to step S508; otherwise, return to step S503 and detect with the face detector of the next scale, until the face detectors of all scales have finished.

S508: Determine the face regions from the face region queue.
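The traversal in step S503 can be sketched as a generator of window positions. The function name and parameter names are hypothetical; the step size is left as a parameter because the text only says "a given step".

```python
def window_positions(img_w, img_h, win_w, win_h, step):
    """Yield the (left, top) corner of every window of one scale that fits in the image."""
    for top in range(0, img_h - win_h + 1, step):
        for left in range(0, img_w - win_w + 1, step):
            yield left, top
```

Each yielded position would then be passed to the detector in step S504; a window larger than the image yields no positions.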

Step S501 is described in detail below.

Referring to Fig. 4, each classifier layer in the face detector is a strong classifier, and each strong classifier is composed of multiple weak classifiers.

As to the definition of the weak classifier, the embodiment of the invention uses the difference between brightness sums that correspond to regions of equal area, rather than the difference between the brightness sums of the gray and white regions shown in Fig. 1 as proposed by Viola. Specifically, if the area ratio of region 1 to region 2 is RA, the embodiment defines the microstructure feature value as the difference between the pixel brightness sum of region 1 divided by RA and the pixel brightness sum of region 2.

Under Viola's definition, where the microstructure feature is the difference between the pixel brightness sums of the gray and white regions, features whose gray and white regions differ in area (for example the lower-left feature type in Fig. 1) give brightness sums for the two regions that differ greatly. The resulting feature value lies far from 0 and needs more bits to represent, which is unfavorable for hardware implementation. With the method provided by the embodiment, the brightness sums being subtracted are accumulated over the same number of pixels, so the difference is centered on zero and deviates from zero over a smaller range. Fewer bits are needed to represent the difference, which favors hardware implementation. Moreover, in subsequent detection the microstructure features must be scaled and the pixel coordinates rounded, so the area ratio of the two regions being subtracted can change. Viola's method can therefore introduce a larger error, whereas the method provided by the embodiment reduces the error.

Preferably, the embodiment of the invention proposes several classes of microstructure feature shapes, shown in Fig. 6, that differ from the microstructure features of Fig. 1 used by Viola. In the figure, the gray region is the basic part, and the area of the white region is a power-of-2 multiple of the area of the gray region, for example 1, 2, or 4 times; it may of course be another power of 2 greater than 4. Specifically, for shapes of classes a and b, the white region is the whole rectangular region containing the gray region, the centers of the gray and white regions coincide, and the area of the white region is a power-of-2 multiple of the area of the gray region. For shapes of classes c, d, e, and f, the white region is the whole rectangular region containing the gray region, the gray region lies in one corner of the white region (upper left, lower left, upper right, or lower right), and the area of the white region is a power-of-2 multiple of the area of the gray region. For shapes of class g, the gray and white regions have the same top and bottom border ordinates but are separated horizontally by some distance (which may be any number greater than 0) and do not overlap, and the area of the white region is a power-of-2 multiple of the area of the gray region. For shapes of class h, the gray and white regions have the same left and right border abscissas but are separated vertically by some distance (which may be any number greater than 0) and do not overlap, and the area of the white region is a power-of-2 multiple of the area of the gray region.

Suppose that in the above microstructure features the area of the white rectangular region is 2^BS times the area of the gray rectangular region, i.e. the exponent of 2 is BS. Then, under the improved definition above, the microstructure feature value proposed by the embodiment can be defined as the difference between the sum of all pixel brightness in the white region, shifted right by BS bits, and the pixel brightness sum of the gray region. The advantage of this design is that division is converted into a shift operation, which simplifies the algorithm and uses less memory.

More preferably, BS is fixed for each feature class. For the microstructure features of classes a and b in Fig. 6, BS = 1, i.e. the area of the white region is 2 times the area of the gray region; for the microstructure features of classes c, d, e, and f, BS = 2, i.e. the area of the white region is 4 times the area of the gray region; for the microstructure features of classes g and h, BS = 0, i.e. the white and gray regions have the same area. Accordingly, for classes a and b the feature value is the difference between the pixel brightness sum of the whole white rectangular region shifted right by one bit and the pixel brightness sum of the gray region; for classes c, d, e, and f it is the difference between the brightness sum of the whole white region shifted right by two bits and the brightness sum of the gray region; for classes g and h it is the difference between the brightness sum of the whole white region and the pixel brightness sum of the gray region.
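The shift-based feature value can be sketched as below. The names `BS_BY_CLASS` and `microstructure_feature` are hypothetical, and the sign convention (shifted white sum minus gray sum) is an assumption of this example; the text only states that the value is the difference between the two quantities.

```python
# BS per feature class in Fig. 6: white area = 2**BS times the gray area.
BS_BY_CLASS = {"a": 1, "b": 1, "c": 2, "d": 2, "e": 2, "f": 2, "g": 0, "h": 0}


def microstructure_feature(white_sum, gray_sum, bs):
    """(white_sum >> bs) - gray_sum: a right shift replaces division by the area ratio."""
    return (white_sum >> bs) - gray_sum
```

On a uniform image of brightness 10, a class c feature with a 2×2 gray region (sum 40) inside a 4×4 white region (sum 160) evaluates to (160 >> 2) - 40 = 0, so the value stays centered on zero regardless of the area ratio.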

The above only defines the shapes of the various classes of microstructure features and how their values are computed. During training, each class of microstructure feature must be enumerated exhaustively over different positions and sizes within the normalized face detector size, yielding thousands of concrete microstructure features. All the microstructure features so obtained are pooled as the candidate weak features from which the AdaBoost procedure proposed by Viola selects. To further limit the number of candidate weak features, the size range of the above rectangles and the position offset step can be restricted.

As to the construction of the weak classifier, to improve its classification ability the embodiment of the invention constructs the weak classifier by comparison against two thresholds. Each weak classifier consists of two thresholds, θ_j^1 and θ_j^2 with

\theta_{j}^{1} < \theta_{j}^{2}

and a polarity sign p_j whose value is 1 or -1.

When p_j is 1, the classifier is defined as:

When p_j is -1, the classifier is defined as:

where x is an image of fixed size, g_j(x) denotes the j-th microstructure feature value of the image, and h_j(x) denotes the decision output of the j-th weak classifier.

This construction of the weak classifier is more general than the one proposed by Viola. When p_j is 1 and θ_j^2 is positive infinity, the dual-threshold classifier reduces to the p_j = 1 case of Viola's single-threshold classifier; when p_j is 1 and θ_j^1 is negative infinity, it reduces to the p_j = -1 case. In other words, the dual-threshold form proposed by the invention includes the single-threshold form proposed by Viola as a special case.
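One reading of the dual-threshold classifier that is consistent with the two limiting cases above is sketched here. The classifier formulas themselves are not reproduced in this text, so the details are assumptions: p = 1 is taken to accept values strictly between the two thresholds, p = -1 to accept values outside them, and the outputs are taken to be 1 and 0.

```python
from math import inf


def dual_threshold_classify(g, theta1, theta2, p):
    """Dual-threshold weak classifier (assumed form); requires theta1 < theta2."""
    inside = theta1 < g < theta2
    if p == 1:
        return 1 if inside else 0   # accept feature values between the thresholds
    return 0 if inside else 1       # p == -1: accept feature values outside them
```

With `theta2 = inf` and `p = 1` the test becomes `g > theta1`, and with `theta1 = -inf` and `p = 1` it becomes `g < theta2`, matching the two single-threshold cases.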

In one implementation, in the weak classifier training step of Viola's algorithm (train a classifier h_j which is restricted to using a single feature), Viola's single-threshold weak classifier construction is replaced by the dual-threshold weak classifier construction proposed by the embodiment of the invention. That is, for the current microstructure feature g_j(x), p_j, θ_j^1, and θ_j^2 are selected so that the weak classifier formed from this feature has the minimum weighted error rate over all samples.

This increases the possible forms of candidate weak classifiers, making it possible to select weak classifiers with stronger classification ability and thereby improving the performance of the strong classifiers and of the final cascaded classifier.
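The selection of p_j, θ_j^1, and θ_j^2 for one feature can be illustrated with a brute-force search. This is only a sketch under stated assumptions, not the training procedure of the embodiment: the function name is hypothetical, labels are taken as 1 (face) or 0 (non-face), candidate thresholds are taken as midpoints between distinct feature values plus ±infinity, and the classifier form (p = 1 accepts values strictly between the thresholds, p = -1 accepts values outside) is assumed.

```python
from math import inf


def train_dual_threshold(values, labels, weights):
    """Return (error, p, theta1, theta2) with the minimum weighted error for one feature."""
    pts = sorted(set(values))
    # Candidate thresholds: midpoints between neighboring distinct values, plus +-inf.
    cands = [-inf] + [(a + b) / 2 for a, b in zip(pts, pts[1:])] + [inf]
    total = sum(weights)
    best = None
    for i, t1 in enumerate(cands):
        for t2 in cands[i + 1:]:
            # Weighted error when "inside (t1, t2)" is predicted as class 1 (p = 1).
            err_inside = sum(
                w for v, y, w in zip(values, labels, weights)
                if (1 if t1 < v < t2 else 0) != y
            )
            # p = -1 makes the opposite prediction, so its error is the complement.
            for p, err in ((1, err_inside), (-1, total - err_inside)):
                if best is None or err < best[0]:
                    best = (err, p, t1, t2)
    return best
```

For feature values 1 to 5 where only the middle three are faces, no single threshold separates the classes, but the interval (1.5, 4.5) with p = 1 does. A practical implementation would sort the samples once and use prefix sums instead of this cubic-time search.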

Step S502 is described in detail below.

As to the computation of the integral image, Viola's method needs extra memory to hold s(x, y), and therefore occupies memory for W × H values of s(x, y). The way of computing the integral image and the squared integral image proposed in step S502 of the method of the embodiment saves further memory.

For example, let rs(x, y) denote the sum of the brightness of all pixels in row y up to and including the current pixel (x, y), i.e.

rs(x, y) = \sum_{x' \leq x} I(x', y)

The integral image is computed iteratively with the following formulas:

rs(x, y) = rs(x-1, y) + I(x, y)

II(x, y) = II(x, y-1) + rs(x, y)

The prior art computes the integral image recursively from the sum of the brightness of all pixels in each column up to the current pixel (i.e. s(x, y)), whereas the embodiment of the invention computes it recursively from the sum of the brightness of all pixels in each row up to and including the current pixel (i.e. rs(x, y)). Because the embodiment computes the integral image recursively in top-to-bottom, left-to-right order, the prior art must store s(x, y) for every position, while the method of the invention only needs to keep rs(x, y) for the current pixel and does not need to store rs(x, y) for other pixels, which saves a great deal of memory.

The specific implementation is as follows:

For any y = 0, 1, 2, ..., H-1 and x = 0, 1, 2, ..., W-1, set II(-1, y) = 0 and II(x, -1) = 0;

Process all rows of the image in the order y = 0, 1, 2, ..., H-1 as follows:

Set rs = 0, meaning that the sum of the pixels of the current row is initially 0;

Process all pixels in row y of the image in the order x = 0, 1, 2, ..., W-1 as follows:

Let rs = rs + I(x, y);

The integral image at the current pixel (x, y) is then II(x, y) = II(x, y-1) + rs;

After the integral image of row y has been computed, compute the integral image of row y+1.

After all rows of the image have been processed, the computation of the integral image is complete.

As can be seen, the embodiment of the invention only needs to store the data for a single rs(x, y). For applications with stricter memory requirements, such as chip design, the method provided by the embodiment is therefore more advantageous.
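The procedure above translates almost line for line into the following sketch. The function name and the `img[y][x]` list-of-rows layout are assumptions of this example; the only auxiliary state besides the output is the single scalar `rs`.

```python
def integral_image_rowsum(img):
    """Integral image using one running row sum instead of a W x H auxiliary array."""
    h, w = len(img), len(img[0])
    ii = [[0] * w for _ in range(h)]
    for y in range(h):            # rows, top to bottom
        rs = 0                    # running sum of the current row, reset per row
        for x in range(w):        # pixels, left to right
            rs += img[y][x]       # rs = rs + I(x, y)
            above = ii[y - 1][x] if y > 0 else 0   # II(x, -1) = 0
            ii[y][x] = above + rs                  # II(x, y) = II(x, y-1) + rs
    return ii
```

The result is identical to the column-sum recurrence of the prior art; only the intermediate storage differs.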

The squared integral image is computed in the same way as the integral image. Specifically, the value of the squared integral image at point (x, y) is defined as

SqInteg(x, y) = \sum_{0 \leq i \leq x,\ 0 \leq j \leq y} I(i, j) * I(i, j)

so that the squared integral image at point (x, y) is SqInteg(x, y). Let sqrs(x, y) denote the sum of the squares of the brightness of all pixels in row y up to and including the current pixel (x, y), i.e.

sqrs(x, y) = \sum_{x' \leq x} I(x', y) * I(x', y)

The squared integral image is computed with the following formulas:

sqrs(x, y) = sqrs(x-1, y) + I(x, y) * I(x, y)

SqInteg(x, y) = SqInteg(x, y-1) + sqrs(x, y)

The specific implementation is as follows:

For any y = 0, 1, 2, ..., H-1 and x = 0, 1, 2, ..., W-1, set SqInteg(-1, y) = 0 and SqInteg(x, -1) = 0;

Process all rows of the image in the order y = 0, 1, 2, ..., H-1 as follows:

Set sqrs = 0, meaning that the sum of the squared pixels of the current row is initially 0;

Process all pixels in row y of the image in the order x = 0, 1, 2, ..., W-1 as follows:

Let sqrs = sqrs + I(x, y) * I(x, y);

The squared integral image at the current pixel (x, y) is then SqInteg(x, y) = SqInteg(x, y-1) + sqrs;

After the squared integral image of row y has been computed, compute the squared integral image of row y+1.

After all rows have been processed, the computation of the squared integral image is complete.
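Since both images are built by the same scan, they can be produced together. The sketch below does so in a single pass; combining the two scans this way is a choice made for this example rather than something the text prescribes, and the function name is hypothetical.

```python
def integral_and_squared(img):
    """Compute the integral image and the squared integral image in one scan."""
    h, w = len(img), len(img[0])
    ii = [[0] * w for _ in range(h)]
    sq = [[0] * w for _ in range(h)]
    for y in range(h):
        rs = 0      # running sum of brightness in the current row
        sqrs = 0    # running sum of squared brightness in the current row
        for x in range(w):
            v = img[y][x]
            rs += v
            sqrs += v * v
            ii[y][x] = (ii[y - 1][x] if y > 0 else 0) + rs
            sq[y][x] = (sq[y - 1][x] if y > 0 else 0) + sqrs
    return ii, sq
```

Only two scalars, `rs` and `sqrs`, are kept beyond the two output images.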

One way to detect faces at different scales is feature scaling. For a relatively complex face detector, however, the final classifier may contain a large number of weak features, on the order of several thousand, and each weak feature must record the coordinates of two rectangles and its thresholds, so the memory demand is large. In chip design, storing this data in on-chip RAM greatly increases the cost of the chip, whereas off-chip ROM and flash are comparatively cheap.

Preferably, the embodiment of the invention stores the weak classifiers of all scales in external ROM or flash. During processing, the classifier of one scale is first read in from off-chip and used for processing at that scale; when that is finished, the classifier of the next scale is read in, and so on. Only the detector of one scale needs to be stored on chip, which reduces the on-chip storage used while maintaining processing speed, thereby saving chip cost.

The face detector trained by Viola has a scale of 24x24. For ease of computation, the embodiment of the invention uses powers of 2 for the fixed size (both width and height) of the face detector obtained by training in advance. When the normalization coefficient is computed with the formulas

\sigma^{2} = \frac{1}{N} \sum_{i, j} I(i, j)^{2} - m^{2}

and

m = \frac{1}{N} \sum_{i, j} I(i, j)

a division by N is required, where N is the total number of pixels in an image of this fixed size. By taking both the width and the height of the face detector to be powers of 2, the embodiment converts this division into a shift operation, which greatly reduces the amount of computation. The height and width may be the same or different.
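When the width and height are powers of 2, N = 2^k and each division by N becomes a right shift by k bits. The following sketch shows this with integer arithmetic; the function name is hypothetical, and truncating the mean before squaring is a simplification of this example that loses some precision compared with exact division.

```python
def mean_and_variance_shift(sum_i, sum_sq, log2_n):
    """Mean and variance of a window with N = 2**log2_n pixels, using shifts only."""
    mean = sum_i >> log2_n                      # sum / N as a right shift
    var = (sum_sq >> log2_n) - mean * mean      # mean of squares minus squared mean
    return mean, var
```

For a 32×32 detector, N = 1024 and `log2_n = 10`; a 16×32 detector would use `log2_n = 9`.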

In face detection, because face detectors of different scales search the whole image, multiple candidate face boxes are usually detected near the position of the same real face, so a merging operation is needed. The prior-art approach is to first record the positions of all rectangular regions that pass the face detector and, after all scales have been processed, merge adjacent overlapping face region positions. This requires storing all candidate face boxes, which demands a relatively large amount of memory.

Preferably, in step S505 of the method of the embodiment, when a face rectangle is detected and added to the face region queue, the merging of face region boxes is carried out automatically at that time, saving the memory needed to store the face queue.

The embodiment of the invention initializes the face region queue as empty. When step S505 adds a candidate face rectangular region to the face region queue, the procedure specifically comprises:

Determine whether the face region queue is empty. If it is, add the candidate face region frame judged in step S504 (hereinafter the added face frame) directly to the face region queue. Otherwise, determine whether this candidate face region frame is similar to a face region frame already stored in the face region queue (hereinafter a recorded face frame). If it is similar to some recorded face frame, merge the two; otherwise, add the added face frame to the face region queue as a new entry.

Preferably, one way to determine whether an added face frame and a recorded face frame are similar is to regard them as similar when the two rectangular frames are close in size and overlap in position.

For example, suppose the candidate face frame to be added is denoted R(i, j, TWidth_n, THeight_n), where i is the abscissa of its left edge, j is the ordinate of its top edge, TWidth_n is its width, and THeight_n is its height. Suppose the m-th recorded face frame in the face region queue is R_m(l, t, wd, ht), where l is the abscissa of its left edge, t is the ordinate of its top edge, wd is its width, and ht is its height. One way to determine whether the two are close in size is as follows:

If

ENLARGE0 \leq \frac{wd}{TWidth_{n}} \leq ENLARGE1

is satisfied, the two are considered close in size; otherwise they are not. Here ENLARGE0 and ENLARGE1 are, respectively, the lower and upper bounds of the interval of width ratios regarded as close.

One way to determine whether the positions of the two overlap is as follows:

Let l_i = max(i, l), t_i = max(j, t), r_i = min(i + TWidth_n, l + wd), and b_i = min(j + THeight_n, t + ht). The overlap area is then area_i = (r_i - l_i) * (b_i - t_i), and the area of the rectangular region of the m-th recorded face frame in the face region queue is area_m^r = wd * ht. If

\frac{area_{i}}{area_{m}^{r}} \geq ENLARGE2

is satisfied, the two are considered to overlap in position; otherwise they are not. Here ENLARGE2 is a threshold, min denotes taking the minimum, and max denotes taking the maximum.
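The two tests can be sketched together as below. This is only an illustration under stated assumptions: the numeric values given to ENLARGE0, ENLARGE1 and ENLARGE2 are invented for the example, and the explicit check for disjoint frames is an added safeguard that the text does not spell out (without it, two negative side lengths would multiply to a positive area).

```python
# Illustrative thresholds only; the description gives no numeric values here.
ENLARGE0 = 0.8   # assumed lower bound on the width ratio
ENLARGE1 = 1.25  # assumed upper bound on the width ratio
ENLARGE2 = 0.6   # assumed minimum ratio of overlap area to recorded-frame area


def sizes_close(cand, rec):
    """cand = (i, j, TWidth_n, THeight_n); rec = (l, t, wd, ht)."""
    return ENLARGE0 <= rec[2] / cand[2] <= ENLARGE1


def positions_overlap(cand, rec):
    i, j, t_width, t_height = cand
    l, t, wd, ht = rec
    l_i, t_i = max(i, l), max(j, t)
    r_i, b_i = min(i + t_width, l + wd), min(j + t_height, t + ht)
    if r_i <= l_i or b_i <= t_i:
        return False  # added safeguard: the frames do not intersect at all
    area_i = (r_i - l_i) * (b_i - t_i)
    return area_i / (wd * ht) >= ENLARGE2


def is_similar(cand, rec):
    """Frames are similar when they are close in size and overlap in position."""
    return sizes_close(cand, rec) and positions_overlap(cand, rec)
```

With these assumed thresholds, `is_similar((10, 10, 32, 32), (12, 12, 32, 32))` holds because the width ratio is 1 and the 30x30 overlap covers about 88% of the recorded frame.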

The merge operation combines the information of the added face frame with that of the similar recorded face frame in the face region queue, that is, the two rectangular regions that are close in size and overlap in position, and the result becomes the new recorded face frame. One feasible way of merging is to average, respectively, the left-edge abscissas, top-edge ordinates, widths, and heights of the added face frame and the recorded face frame, and to use the averages as the left-edge abscissa, top-edge ordinate, width, and height of the new recorded face frame.
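Adding a frame to the queue with merging on the fly might look like the sketch below. Several details are assumptions made for illustration: the queue is a Python list of [frame, confidence] entries, a new entry starts with confidence 1, the average is rounded down to whole pixels, and the similarity test is passed in as a function (`toy_similar` is a hypothetical stand-in for the size and overlap tests).

```python
def merge_frames(cand, rec):
    """Average left, top, width and height of two frames (rounded down)."""
    return tuple((a + b) // 2 for a, b in zip(cand, rec))


def add_to_queue(queue, cand, is_similar):
    """Merge cand into the first similar recorded frame, else append it."""
    for entry in queue:
        if is_similar(cand, entry[0]):
            entry[0] = merge_frames(cand, entry[0])
            entry[1] += 1  # one more added frame merged into this entry
            return
    queue.append([cand, 1])  # also covers the empty-queue case


def toy_similar(a, b):
    """Hypothetical similarity test: same width, corners within 4 pixels."""
    return a[2] == b[2] and abs(a[0] - b[0]) <= 4 and abs(a[1] - b[1]) <= 4


queue = []
add_to_queue(queue, (10, 10, 32, 32), toy_similar)
add_to_queue(queue, (12, 14, 32, 32), toy_similar)    # merged into the first
add_to_queue(queue, (100, 100, 32, 32), toy_similar)  # becomes a new entry
```

Because each candidate is either merged or appended as soon as it is found, the queue never holds more than one entry per distinct face position, which is the memory saving the description aims for.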

After the above merging is complete, preferably, determining the face regions from the face region queue in step S508 specifically comprises: determining whether any two recorded face frames in the face region queue have a positional containment relationship (that is, one rectangular frame lies inside the other). If a containment relationship exists, the rectangular frame with the lower confidence is deleted; if the confidences are equal, the rectangular frame with the smaller area is deleted. The face regions that remain in the face region queue after the above merging and deletion are taken as the final detected face regions.

One method of determining whether two recorded face frames have a containment relationship is as follows:

Suppose the two recorded face frames are R_m(l, t, wd, ht) and R_m'(l', t', wd', ht'). The left-edge abscissa, top-edge ordinate, right-edge abscissa, and bottom-edge ordinate of their overlapping rectangle are, respectively: l_i = max(l, l'), t_i = max(t, t'), r_i = min(l + wd, l' + wd'), b_i = min(t + ht, t' + ht').

If l_i == l, t_i == t, r_i == (l + wd), and b_i == (t + ht), or if l_i == l', t_i == t', r_i == (l' + wd'), and b_i == (t' + ht'), the two are considered to have a containment relationship.

The confidence of a face frame may be defined as the number of added face frames that were merged into that face frame during the merging process.
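The containment test and the deletion rule can be sketched as follows. The data layout is an assumption for illustration: each queue entry is a (frame, confidence) pair with frame = (l, t, wd, ht), and when two identical frames have equal confidence the later one is dropped, a tie-break the text does not specify.

```python
def has_containment(a, b):
    """True if one of the frames (l, t, wd, ht) lies inside the other."""
    l, t, wd, ht = a
    l2, t2, wd2, ht2 = b
    overlap = (max(l, l2), max(t, t2),
               min(l + wd, l2 + wd2), min(t + ht, t2 + ht2))
    return overlap in ((l, t, l + wd, t + ht), (l2, t2, l2 + wd2, t2 + ht2))


def prune_contained(queue):
    """Drop the lower-confidence (or, on a tie, smaller) frame of each nested pair."""
    removed = set()
    for x in range(len(queue)):
        for y in range(x + 1, len(queue)):
            if x in removed or y in removed:
                continue
            (fa, ca), (fb, cb) = queue[x], queue[y]
            if not has_containment(fa, fb):
                continue
            if ca != cb:
                removed.add(x if ca < cb else y)
            else:
                removed.add(x if fa[2] * fa[3] < fb[2] * fb[3] else y)
    return [entry for k, entry in enumerate(queue) if k not in removed]
```

Comparing the overlap rectangle with each frame's own corners, as the description does, avoids a separate "is inside" routine: the overlap equals a frame exactly when that frame is the contained one.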

An integral image calculation unit, configured to compute the integral image and the squared integral image of the input image, wherein the integral image and/or the squared integral image are computed in top-to-bottom, left-to-right order, using for each row of the input image the sum of the brightness of all pixels from the start of that row up to the current pixel.
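One reading of this computation order is the familiar running-row-sum recurrence, sketched below. It is an interpretation under assumptions, not the unit's actual circuit: the image is a list of rows of integer brightness values, and the function name is chosen for illustration.

```python
def integral_images(image):
    """Return (ii, sq) for an image given as a list of rows of integers.

    ii[y][x] is the sum of all pixels in rows 0..y and columns 0..x;
    sq[y][x] is the same sum taken over squared pixel values.
    """
    height, width = len(image), len(image[0])
    ii = [[0] * width for _ in range(height)]
    sq = [[0] * width for _ in range(height)]
    for y in range(height):        # top to bottom
        row_sum = row_sq = 0       # sums from the start of this row
        for x in range(width):     # left to right
            row_sum += image[y][x]
            row_sq += image[y][x] ** 2
            ii[y][x] = row_sum + (ii[y - 1][x] if y > 0 else 0)
            sq[y][x] = row_sq + (sq[y - 1][x] if y > 0 else 0)
    return ii, sq


ii, sq = integral_images([[1, 2], [3, 4]])
```

Each output value needs only the running sum of the current row and the value directly above it, so a single pass over the image with one row of history suffices.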

In summary, starting from the goals of reducing memory usage and simplifying the computation of the algorithm, the present invention proposes an image detection algorithm that is better suited to hardware implementation. It makes improvements in several respects, namely the definition of the weak classifier, the way the integral image is computed, the way the squared integral image is computed, the scale of the object detector, and the separate handling of the weak classifiers. These improvements simplify the algorithm for detecting objects in an image and reduce memory usage, and thereby achieve the goal of lowering product cost.

Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these changes and modifications fall within the scope of the claims of the present invention and their technical equivalents, the present invention is intended to include them as well.

Claims

1. An image detection method, characterized in that the method comprises:

2. The method according to claim 1, characterized in that the microstructural feature value is the difference between the brightness sums of equal numbers of pixels in two rectangular regions.

3. The method according to claim 2, characterized in that the ratio of the areas of the two rectangular regions is a power of 2.

4. The method according to claim 2 or 3, characterized in that the positional relationship between the two rectangular regions is:

the center point of one rectangular region coincides with the center point of the other rectangular region; or

two sides of one rectangular region coincide with two sides of the other rectangular region; or

the two rectangular regions are separated by a certain distance in the horizontal direction and/or the vertical direction.

5. The method according to claim 1, characterized in that the width and the height of the object detector model are powers of 2.

6. The method according to claim 1, characterized in that the object detector is composed of a plurality of weak classifiers, each weak classifier comprising a polarity, a first threshold, and a second threshold; the polarity is 1 or -1, and the first threshold is less than the second threshold;

if the polarity of the weak classifier is 1 and the microstructural feature value is greater than the first threshold and less than the second threshold, the microstructural feature value passes the judgment of the weak classifier; otherwise, the microstructural feature value does not pass the judgment of the weak classifier;

if the polarity of the weak classifier is -1 and the microstructural feature value is less than the first threshold or greater than the second threshold, the microstructural feature value passes the judgment of the weak classifier; otherwise, the microstructural feature value does not pass the judgment of the weak classifier.
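As an illustration only, and not as part of the claim, the two-threshold judgment can be sketched as a small function. The function and parameter names are assumptions; the claim itself fixes only the polarity values and the ordering of the thresholds.

```python
def weak_classifier_passes(feature, polarity, first_threshold, second_threshold):
    """Return True if the feature value passes the two-threshold weak classifier."""
    if polarity not in (1, -1):
        raise ValueError("polarity must be 1 or -1")
    if not first_threshold < second_threshold:
        raise ValueError("first threshold must be less than second threshold")
    if polarity == 1:
        # Pass only strictly inside the interval between the thresholds.
        return first_threshold < feature < second_threshold
    # Polarity -1: pass only strictly outside the interval.
    return feature < first_threshold or feature > second_threshold
```

Under both polarities a feature value exactly equal to either threshold fails, since every comparison in the claim is strict.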

7. The method according to claim 1, characterized in that the step of determining the object region of the input image comprises:

scaling the object detector in advance to obtain object detectors of a plurality of different scales;

using the object detector of each scale in turn, with a certain step size and according to the microstructural feature values of the input image, judging in turn the rectangular regions at all candidate positions in the input image, and, when a rectangular region is a candidate object region, adding the candidate object region to an object region queue;

determining the object region of the input image from the object region queue.

8. The method according to claim 7, characterized in that the object detector models of all scales are stored in advance in an off-chip memory;

the object detector of each scale is read in turn from the off-chip memory into an on-chip memory and used to judge the rectangles at all candidate positions for that scale.

9. The method according to claim 7, characterized in that the step of adding the candidate object region to the object region queue comprises:

according to the size and position of the candidate object region to be added and the sizes and positions of the candidate object regions already added to the object region queue, determining whether the candidate object region to be added is close to an already added candidate object region; if so, merging the close candidate object regions and taking the number of merged candidate object regions as the confidence of the merged candidate object region; otherwise, adding the candidate object region to be added to the object region queue.

10. The method according to claim 9, characterized in that the step of determining the object region of the input image from the object region queue comprises:

when one candidate object region in the object region queue is contained in another candidate object region, deleting the candidate object region with the lower confidence; when the confidences are equal, deleting the candidate object region with the smaller area;

determining the candidate object regions that remain in the object region queue after the merging and deletion as the object regions of the input image.

11. An image detection device, characterized in that the device comprises: