CN101406390B - Method and apparatus for detecting part of human body and human, and method and apparatus for detecting objects - Google Patents


Info

Publication number
CN101406390B
CN101406390B (publication) · CN2007101639084A / CN200710163908A (application)
Authority
CN
China
Prior art keywords
human body
image
difference image
head
people
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2007101639084A
Other languages
Chinese (zh)
Other versions
CN101406390A (en)
Inventor
陈茂林
郑文植
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Samsung Telecommunications Technology Research Co Ltd
Samsung Electronics Co Ltd
Original Assignee
Beijing Samsung Telecommunications Technology Research Co Ltd
Samsung Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Samsung Telecommunications Technology Research Co Ltd, Samsung Electronics Co Ltd filed Critical Beijing Samsung Telecommunications Technology Research Co Ltd
Priority to CN2007101639084A priority Critical patent/CN101406390B/en
Priority to KR1020080011390A priority patent/KR101441333B1/en
Priority to US12/285,694 priority patent/US8447100B2/en
Publication of CN101406390A publication Critical patent/CN101406390A/en
Application granted granted Critical
Publication of CN101406390B publication Critical patent/CN101406390B/en
Priority to US13/867,464 priority patent/US9400935B2/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/20: Image preprocessing
    • G06V 10/255: Detecting or recognising potential candidate objects based on visual cues, e.g. shapes
    • G06V 10/40: Extraction of image or video features
    • G06V 40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/103: Static body considered as a whole, e.g. static pedestrian or occupant recognition
    • G06V 40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/161: Detection; Localisation; Normalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a method and apparatus for detecting human body parts and humans using difference images as feature images. The method comprises the following steps: calculating the difference image of an image to be detected; and detecting a first body part by using a first-body-part model corresponding to the first body part, based on the calculated difference image of the image to be detected, wherein the first-body-part model is obtained by learning from feature sets extracted from the difference images of positive and negative samples of the first body part.

Description

Method and apparatus for detecting human body parts and humans, and object detection method and apparatus
Technical field
The present invention relates to an object detection method and apparatus, and more particularly to a method and apparatus for detecting human body parts and humans, and to an object detection method and apparatus, which use difference images as feature images.
Background art
Object detection is essential to automatic video analysis technologies (for example, content-based video or image retrieval, video surveillance, video compression, and driver assistance). In the real world, human detection is one of the most challenging detection categories. Human detection can be applied in three situations:
The first situation is determining whether a person is present in the field of view. For example, in driver assistance, when a pedestrian on the road approaches the vehicle, the system warns the driver. This can be implemented as an embedded intelligent device integrated with an imaging device.
The second situation is coherent tracking of the same person with a still camera or a PTZ camera. The former can collect a person's motion trajectory, which is suitable for intelligent behavior analysis. The latter can adjust its attitude to follow a moving person and keep the person at the center of the image in order to record the person's details or behavior. This can be implemented as a smart camera or PTZ camera connected over the Internet to a storage device or display device.
The third situation is when a robot wishes to follow a person, or tries to maintain gaze during person-to-person interaction. The robot detects the person's position in its camera image and takes a corresponding action, for example moving, following, gazing, or adjusting its posture. This can be implemented as an embedded device integrated into the robot's functional units.
Clothing of all kinds and patterns gives both the local and the overall appearance of people great variability; consequently only a few local regions can reliably characterize the class they belong to, and this calls for a feature set that remains robust and discriminative in cluttered backgrounds and under different lighting conditions. In addition, a person's global shape undergoes wide-range deformation caused by the many possible articulations and by occluding accessories, or by multiple people occluding one another in the same image region; this calls for algorithms that can tolerate a minority of disturbances and infer the correct result from the overall evidence.
Various attempts have been made to overcome these problems. Examples include: multi-view human head detection (disclosed in "Multi-view human head detection in static images", Machine Vision Applications 2005, by M. Chen et al., hereinafter R1); detection using motion and appearance information (disclosed at the International Conference on Computer Vision 2003, by V. Paul et al., hereinafter R2); detection of people using histograms of gradients (disclosed at the International Conference on Computer Vision and Pattern Recognition, by Q. Zhu et al., hereinafter R3); and detection of people using the statistical distribution of edge orientations (disclosed in US Patent 20060147108A1, hereinafter R4).
Current human detection methods use either a global model (for example, a whole-body appearance or silhouette detector) or a set of local-feature or part detectors. The former extracts global features of the person and builds a global model based on appearance or silhouette [R1]. The latter decomposes the human body into several parts (for example, head, torso, legs, and arms) [R2, R3, R4]; human detection is then expressed through part detection, and the research content is reduced to the models corresponding to the body parts. Model learning methods generally include SVM, AdaBoost, and other auxiliary methods.
It is known that face detection has made great progress in recent years and can achieve a very high detection rate and a low false-alarm rate in real-time processing. For practical applications, however, much work remains to be done on human detection. First, a human detector must adapt to changes in people's appearance due to clothing patterns and different lighting conditions, and should be built on robust features that capture the characteristic shapes across the various deformations of human appearance; finally, it should require a small amount of computation and run in real time.
Summary of the invention
Exemplary embodiments of the present invention overcome the above disadvantages and other disadvantages not described above. However, the present invention is not required to overcome the disadvantages described above, and an exemplary embodiment of the present invention may not overcome any of the problems described above.
According to an aspect of the present invention, a method of detecting a body part/person in an image is provided, the method comprising: calculating the difference image of an image to be detected; and detecting a first body part by using a first-body-part model corresponding to the first body part, based on the calculated difference image of the image to be detected, wherein the first-body-part model is obtained by learning from feature sets extracted from the difference images of positive and negative samples of the first body part.
According to another aspect of the present invention, an apparatus for detecting a body part/person in an image is provided, the apparatus comprising: an image processor which calculates the difference image of an image; a training DB which stores positive and negative samples of the body part; a sub-window processor which extracts feature sets from the difference images, calculated by the image processor, of the positive and negative samples stored in the training DB; and a first-body-part classifier which detects the first body part corresponding to the classifier by using a first-body-part model, based on the difference image of the image to be detected calculated by the image processor, wherein the first-body-part model is obtained by the sub-window processor learning from the feature sets extracted from the difference images of the positive and negative samples of the first body part stored in the training DB.
According to another aspect of the present invention, a method of detecting body parts/persons in an image is provided, the method comprising: (a) calculating the difference image of an image to be detected; (b) based on the calculated difference image, detecting one of a plurality of different body parts by using the corresponding one of a plurality of body-part models in one-to-one correspondence with the parts, wherein each of the body-part models is obtained by learning from feature sets extracted from the difference images of positive and negative samples of the corresponding body part; (c) repeating step (b) for another of the plurality of body parts, different from the part in step (b), detecting that other body part with its corresponding model, and removing false alarms from the detected body parts according to human geometry based on the detected other body part; and (d) based on the result of step (c), finally determining the detected body parts and determining the detected person according to human geometry.
According to another aspect of the present invention, an apparatus for detecting body parts/persons in an image is provided, the apparatus comprising: a plurality of body-part detectors, in one-to-one correspondence with a plurality of different body parts, each detecting its corresponding part; and a determiner which removes false alarms according to human geometry, based on the body parts detected by the plurality of detectors, to determine the body parts and persons in the image to be detected; wherein each of the body-part detectors comprises: an image processor which calculates the difference image of an image; a training DB which stores positive and negative samples of the body part; a sub-window processor which extracts feature sets from the difference images, calculated by the image processor, of the positive and negative samples stored in the training DB; and a body-part classifier which detects the body part corresponding to the classifier by using a body-part model, based on the difference image of the image to be detected calculated by the image processor, wherein the body-part model is obtained by the sub-window processor learning from the feature sets extracted from the difference images of the positive and negative samples of the body part stored in the training DB.
According to another aspect of the present invention, an imaging device is provided, comprising: an imaging unit which captures an image of an object; a detection unit which detects the region of the captured object in the image by using an object model, based on the difference image of the captured image, wherein the object model is obtained by learning from feature sets extracted from the difference images of positive and negative samples of the captured object; an attitude-parameter calculation unit which, according to the detected region of the object in the image, calculates parameters for adjusting the attitude of the imaging device so as to place the object in the central region of the image; a control unit which receives the attitude-adjustment parameters from the attitude-parameter calculation unit and adjusts the attitude of the imaging device; a storage unit which stores the captured image of the object; and a display unit which displays the captured image of the object.
According to another aspect of the present invention, a method of detecting an object in an image is provided, the method comprising: calculating the difference image of an image to be detected; and detecting the object by using an object model, based on the calculated difference image of the image to be detected, wherein the object model is obtained by learning from feature sets extracted from the difference images of positive and negative samples of the object.
According to another aspect of the present invention, an apparatus for detecting an object in an image is provided, the apparatus comprising: an image processor which calculates the difference image of an image; a training DB which stores positive and negative samples of the object; a sub-window processor which extracts feature sets from the difference images, calculated by the image processor, of the positive and negative samples stored in the training DB; and an object classifier which detects the object by using an object model, based on the calculated difference image of the image to be detected, wherein the object model is obtained by learning from the feature sets extracted from the difference images of the positive and negative samples of the object.
Brief description of the drawings
These and/or other aspects, features and advantages of the present invention will become apparent and more readily appreciated from the following description of exemplary embodiments, taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a block diagram of a head detection apparatus according to an exemplary embodiment of the present invention;
Fig. 2 is a diagram illustrating the method by which the image processor 110 calculates difference images according to an exemplary embodiment of the present invention;
Fig. 3 illustrates some typical negative samples whose contour shapes are similar to those of positive head samples;
Fig. 4 illustrates the difference images of objects with linear structure;
Fig. 5 illustrates an example of the difference images of a person's head calculated according to the method shown in Fig. 2;
Fig. 6 illustrates the sub-windows used in the feature extraction method according to an exemplary embodiment of the present invention;
Fig. 7 illustrates the division of the views of a person's head according to an exemplary embodiment of the present invention;
Fig. 8 illustrates a pyramid detector for detecting a person's head according to an exemplary embodiment of the present invention;
Fig. 9 is a flowchart of the detection performed by the pyramid detector shown in Fig. 8 according to an exemplary embodiment of the present invention;
Fig. 10 is a block diagram of a detector having multiple body-part detectors according to an exemplary embodiment of the present invention;
Fig. 11 is a detailed block diagram of one exemplary detector of the multi-part detector of Fig. 10 according to an exemplary embodiment of the present invention;
Fig. 12 is a block diagram of a detector having multiple body-part detectors according to another exemplary embodiment of the present invention;
Fig. 13 is a block diagram of an imaging device according to an exemplary embodiment of the present invention;
Fig. 14 is a block diagram of the detection unit of Fig. 13 according to an exemplary embodiment of the present invention.
Detailed description
Hereinafter, the present invention will be described more fully with reference to the accompanying drawings, in which exemplary embodiments of the invention are shown; like reference numerals refer to like elements throughout the drawings. The embodiments are described below in order to explain the present invention.
Fig. 1 is a block diagram of a head detection apparatus according to an exemplary embodiment of the present invention.
Referring to Fig. 1, the head detection apparatus 100 comprises an image processor 110, a training DB 120, a sub-window processor 130 and a head classifier 140.
The image processor 110 calculates the difference images of an image. The training DB 120 stores positive and negative samples of people's heads. The sub-window processor 130 extracts feature sets from the difference images, calculated by the image processor 110, of the positive and negative head samples stored in the training DB 120. The head classifier 140 detects head regions in the input image to be detected, based on its difference images calculated by the image processor 110, using a head model obtained by learning from the extracted feature sets. The operation of the image processor 110 will now be described in detail with reference to Fig. 2.
Fig. 2 is a diagram illustrating the method by which the image processor 110 calculates difference images according to an exemplary embodiment of the present invention. The image processor 110 calculates four difference images: dx, dy, du and dv.
Referring to Fig. 2, each pixel value of the difference image dx is the horizontal pixel difference within an N*N neighborhood of the original image. In the dx part of Fig. 2, if N equals 3, the sum of the gray-rectangle pixel values minus the sum of the gray-circle pixel values gives the value of the center pixel of dx. Each pixel value of the difference image dy is the vertical pixel difference within an N*N neighborhood; in the dy part of Fig. 2, with N equal to 3, the sum of the gray-rectangle pixel values minus the sum of the gray-circle pixel values gives the value of the center pixel of dy. Likewise, each pixel value of du is the pixel difference along the right-left diagonal of the N*N neighborhood, and each pixel value of dv is the pixel difference along the left-right diagonal, both computed in the same way from the gray rectangles and circles shown in Fig. 2. In this way, each pixel of a difference image represents the average gray-level change of the neighborhood pixels along the target direction.
Meanwhile, difference images can be calculated at different scales. For example, in dx, dy, du and dv of Fig. 2, the sum of the black-rectangle pixel values minus the sum of the black-circle pixel values gives the center-pixel value for a 5*5 neighborhood; in general the neighborhood can be expressed as (2n+1)*(2n+1), n = 1, 2, .... Multi-scale computation behaves like successive sub-sampling of the image: when n equals 2, for example, the difference is computed at every other pixel. For coarse-resolution images, the difference images are calculated at a larger scale (that is, a larger neighborhood) to reduce the influence of background noise during feature extraction. Meanwhile, for high-resolution images, a smaller scale (that is, a smaller neighborhood) can be used to capture local detail.
Suppose the image width is w pixels (0...w-1) and the height is h pixels (0...h-1). The difference images are computed over widths 1 to w-2 and heights 1 to h-2, and pixel values that fall outside the image border are taken to be 0. For example, when computing the horizontal pixel difference of a 5*5 neighborhood for the pixel at width 1 and height 1, only part of the neighborhood lies within the image, and the two pixels that fall outside the image are given the value 0. Before the difference images are calculated, the source gray-level image is sub-sampled at a coarse scale; the feature images are then calculated as introduced above.
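The directional differences and border-zeroing described above can be sketched in NumPy. This is a minimal sketch under a stated assumption: each output pixel is taken as a single-pixel difference at offset n along the given direction, whereas the patent's figures sum small groups of neighbors (the gray rectangles and circles), so the exact grouping may differ.

```python
import numpy as np

def difference_images(img: np.ndarray, n: int = 1) -> dict:
    """Compute four directional difference images dx, dy, du, dv.

    Each output pixel is the difference between the two neighbors lying
    n pixels from the center of a (2n+1)x(2n+1) neighborhood, along the
    horizontal (dx), vertical (dy) and two diagonal (du, dv) directions.
    Pixels whose neighborhood extends past the border are set to 0, as
    the text prescribes.  (Single-pixel differences are an assumption;
    the patent sums small neighbor groups.)
    """
    img = img.astype(np.int32)
    h, w = img.shape
    out = {k: np.zeros((h, w), dtype=np.int32) for k in ("dx", "dy", "du", "dv")}
    # horizontal: right neighbor minus left neighbor
    out["dx"][n:h - n, n:w - n] = img[n:h - n, 2 * n:w] - img[n:h - n, 0:w - 2 * n]
    # vertical: lower neighbor minus upper neighbor
    out["dy"][n:h - n, n:w - n] = img[2 * n:h, n:w - n] - img[0:h - 2 * n, n:w - n]
    # right-left diagonal: upper-right minus lower-left
    out["du"][n:h - n, n:w - n] = img[0:h - 2 * n, 2 * n:w] - img[2 * n:h, 0:w - 2 * n]
    # left-right diagonal: lower-right minus upper-left
    out["dv"][n:h - n, n:w - n] = img[2 * n:h, 2 * n:w] - img[0:h - 2 * n, 0:w - 2 * n]
    return out
```

Passing n = 2 gives the 5*5-neighborhood variant; the border band of width n stays zero, matching the rule that out-of-image pixels count as 0.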
Fig. 3 illustrates some typical negative samples whose contour shapes are similar to those of positive head samples. Although these negative samples consist of roughly rectilinear lines, they look like distorted shapes of people's heads, which challenges the discriminative ability of a classifier. Indeed, the false alarms that commonly appear in detection results mainly involve objects whose shapes resemble the shape of a person's head. On the other hand, this shows the importance of feature transformation and extraction. The present invention provides a better method for reducing this difficulty.
Fig. 4 illustrates the difference images of objects with linear structure.
Referring to Fig. 4, a straight-line object in an image is decomposed into the different difference images. This decomposition leaves better material for feature extraction: the difference image dx preserves the horizontal changes of the image, dy preserves the vertical changes, and du and dv preserve the diagonal changes. Fig. 5 illustrates an example of the difference images of a person's head calculated according to the method shown in Fig. 2. Comparing with the difference images in Fig. 4, we find that although the lines are decomposed, the contour of the head is well preserved.
Fig. 6 illustrates the sub-windows used by the sub-window processor 130 in the feature extraction method according to an exemplary embodiment of the present invention. Single-window features are created by sliding a single window over the training image and varying its width and height at every ratio the image scale allows. Double-window features are created by sliding two windows over the training image, varying their widths and heights at the scales the image allows, with both windows changed by the same magnification factor at the same time, so that hidden patterns over a wide range of scales are captured. The two windows can also move relative to each other in the horizontal and vertical directions, to capture a wide range of hidden patterns. Triple-window features are created by sliding three connected windows over the training samples; the three windows keep identical widths and heights while the scale changes, in order to capture patterns over a wide range. The second window can move relative to the first and third windows, to capture convex and concave contours. There are two kinds of triple-window features: one with the three windows laid out horizontally, the other with them laid out vertically. In the horizontal layout, the second window can move horizontally relative to the first and third windows; in the vertical layout, the second window can move vertically relative to the first and third windows.
Suppose f is the extracted feature, G is a difference image, and w is a feature window. OP1 is an operator used for feature extraction; it comprises the two simple operators '+' and '-': OP1 = {+, -}. OP2 is a second operator used in the present invention; besides the two simple operators '+' and '-', it also comprises the dominance operator domin: OP2 = {+, -, domin}.
For feature extraction on a single difference image, features can be calculated through equations (1), (2) and (3), corresponding to the single, double and triple feature windows respectively, where a is one of the four difference images:
f_1^a = \sum_{(i,j) \in w_1} G^a(i,j)    (1)
f_2^a = OP_1\Big( \sum_{(i,j) \in w_1} G^a(i,j), \sum_{(i,j) \in w_2} G^a(i,j) \Big)    (2)
f_3^a = OP_1\Big( \sum_{(i,j) \in w_1} G^a(i,j), \sum_{(i,j) \in w_2} G^a(i,j), \sum_{(i,j) \in w_3} G^a(i,j) \Big)    (3)
For feature extraction on two overlapping difference images, features can be calculated through equations (4), (5) and (6), corresponding to the single, double and triple feature windows respectively, where a and b are any two of the four difference images:
f_1^{ab} = OP_2(f_1^a, f_1^b)    (4)
f_2^{ab} = OP_2(f_2^a, f_2^b)    (5)
f_3^{ab} = OP_2(f_3^a, f_3^b)    (6)
For feature extraction on three difference images, features can be calculated through equations (7), (8) and (9), corresponding to the single, double and triple feature windows respectively, where a, b and c are any three of the four difference images:
f_1^{abc} = OP_2(f_1^a, f_1^b, f_1^c)    (7)
f_2^{abc} = OP_2(f_2^a, f_2^b, f_2^c)    (8)
f_3^{abc} = OP_2(f_3^a, f_3^b, f_3^c)    (9)
For feature extraction on four difference images, features can be calculated through equations (10), (11) and (12), corresponding to the single, double and triple feature windows respectively, where a, b, c and d are the four difference images:
f_1^{abcd} = OP_2(f_1^a, f_1^b, f_1^c, f_1^d)    (10)
f_2^{abcd} = OP_2(f_2^a, f_2^b, f_2^c, f_2^d)    (11)
f_3^{abcd} = OP_2(f_3^a, f_3^b, f_3^c, f_3^d)    (12)
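The window-sum features of equations (1) to (3) can be sketched directly. This is an illustrative Python sketch, not the patent's implementation: the names window_sum, f1, f2, f3 and the (x, y, width, height) tuple format for windows are assumptions, and f3 applies the three-input form of OP1 from equation (15), i.e. (a+b+c) or (2b-a-c).

```python
import numpy as np

def window_sum(G, x, y, w, h):
    """Sum of the difference image G over the w x h window at top-left (x, y)."""
    return float(G[y:y + h, x:x + w].sum())

def f1(G, win):
    """Single-window feature, eq. (1): the plain window sum."""
    return window_sum(G, *win)

def f2(G, win1, win2, op="-"):
    """Double-window feature, eq. (2): OP1 applied to two window sums."""
    s1, s2 = window_sum(G, *win1), window_sum(G, *win2)
    return s1 + s2 if op == "+" else s1 - s2

def f3(G, win1, win2, win3, op="-"):
    """Triple-window feature, eq. (3); three-input OP1 per eq. (15):
    (a+b+c) or (2b-a-c), with b the middle window."""
    a, b, c = (window_sum(G, *w) for w in (win1, win2, win3))
    return a + b + c if op == "+" else 2 * b - a - c
```

The 2b-a-c form responds strongly when the middle window differs from its two flanking windows, which is how the convex and concave contour patterns of Fig. 6 are picked up.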
As shown in equations (13) to (17), operator OP1 comprises the addition and subtraction operators, while operator OP2 additionally comprises the dominance operator:
OP_1(a,b) = (a+b) or (a-b)    (13)
OP_2(a,b) = (a+b) or (a-b) or a/(a+b) or b/(a+b)    (14)
OP_1(a,b,c) = (a+b+c) or (2b-a-c)    (15)
OP_2(a,b,c) = (a+b+c) or (2b-a-c) or (a or b or c)/(a+b+c)    (16)
OP_2(a,b,c,d) = (a+b+c+d) or (3a-b-c-d) or (3b-a-c-d) or (3c-a-b-d) or (3d-a-b-c) or (a, b, c or d)/(a+b+c+d)    (17)
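The OP2 variants of equations (14) and (17) can be sketched as follows. The 'domin' operator is read here as the share of one window sum in the total, matching the fraction forms above; the function names and the guard against a zero denominator are added assumptions, not part of the patent text.

```python
def op2_pair(a, b, kind):
    """OP2 for two inputs, eq. (14): sum, difference, or dominance ratio."""
    if kind == "+":
        return a + b
    if kind == "-":
        return a - b
    if kind in ("domin_a", "domin_b"):
        total = a + b
        if total == 0:           # guard (assumption; eq. (14) leaves this open)
            return 0.0
        return (a if kind == "domin_a" else b) / total
    raise ValueError(kind)

def op2_quad(a, b, c, d, kind):
    """OP2 for four inputs, eq. (17)."""
    total = a + b + c + d
    table = {
        "+": total,
        "3a": 3 * a - b - c - d,
        "3b": 3 * b - a - c - d,
        "3c": 3 * c - a - b - d,
        "3d": 3 * d - a - b - c,
    }
    if kind in table:
        return table[kind]
    if kind.startswith("domin_"):  # e.g. "domin_c": share of one input in the total
        x = {"a": a, "b": b, "c": c, "d": d}[kind[-1]]
        return x / total if total != 0 else 0.0
    raise ValueError(kind)
```

The dominance ratios normalize a directional response by the total response across difference images, so a feature can express "mostly horizontal change" rather than an absolute magnitude.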
The use of sub-windows in the feature extraction method shown in Fig. 6 is merely exemplary and not intended to be limiting; the present invention can be implemented with other numbers and types of sub-windows.
The sub-window processor 130 can extract features by scanning a single difference image, two difference images, three difference images or four difference images. The feature set can be the features extracted from one of these configurations, or any combination of the features extracted from single, double, triple or quadruple difference images. In addition, to obtain more features, the difference images can be calculated at different scales, and feature sets can be extracted from the difference images calculated at those scales.
For the extracted feature set, a statistical learning method (for example, AdaBoost or SVM) is used to select the features with discriminative ability and produce the final classification model. In pattern recognition technology, a classification model is usually implemented as a classifier that uses the model.
In an exemplary embodiment of the present invention, positive and negative samples of people's heads are prepared, and the image processor 110 calculates the difference images of the positive and negative samples. Based on those difference images, the sub-window processor 130 uses the feature extraction method described above to create a feature set with a large number of features; a statistical method is then used to learn a head classification model, and the head classifier 140 uses that model. In the same way, models and classifiers for other body parts can be obtained. For example, for the torso, positive and negative torso samples are prepared; the sub-window processor 130 extracts a feature set from the difference images of the positive and negative torso samples using the feature extraction method introduced above; and a statistical learning method (for example, AdaBoost or SVM) selects the discriminative features and produces the final torso model and torso classifier.
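The statistical learning step can be illustrated with a toy AdaBoost over decision stumps. This is a didactic sketch only, not the patent's implementation: here X would hold the sub-window features extracted from the difference images, y the sample labels, and real training would use far more samples, features and rounds.

```python
import numpy as np

def adaboost_stumps(X, y, rounds=3):
    """Tiny AdaBoost with threshold stumps over a feature matrix.

    X: (n_samples, n_features) feature values; y: +1 (positive sample)
    or -1 (negative sample).  Each round picks the (feature, threshold,
    polarity) stump with the lowest weighted error - i.e. the most
    discriminative feature - then reweights the samples, mirroring the
    feature-selection role of the learning step described in the text.
    """
    n, m = X.shape
    w = np.full(n, 1.0 / n)
    model = []
    for _ in range(rounds):
        best = None
        for j in range(m):
            for t in np.unique(X[:, j]):
                for pol in (1, -1):
                    pred = np.where(pol * (X[:, j] - t) >= 0, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, t, pol, pred)
        err, j, t, pol, pred = best
        err = min(max(err, 1e-10), 1 - 1e-10)   # avoid log(0)
        alpha = 0.5 * np.log((1 - err) / err)   # stump weight
        w *= np.exp(-alpha * y * pred)          # upweight misclassified samples
        w /= w.sum()
        model.append((alpha, j, t, pol))
    return model

def adaboost_predict(model, X):
    score = np.zeros(len(X))
    for alpha, j, t, pol in model:
        score += alpha * np.where(pol * (X[:, j] - t) >= 0, 1, -1)
    return np.where(score >= 0, 1, -1)
```

The selected (feature, threshold) pairs play the role of the "features with discriminative ability" above; a cascade of such boosted classifiers is the usual way a real-time detector is assembled.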
Will be understood by those skilled in the art that; Head detection equipment 100 shown in Figure 1 only is exemplary; (for example can the head grader 140 in the head detection equipment 100 be replaced with other people grader of body region; Trunk grader, human leg portion grader, human body arm grader and whole human body grader), form the checkout equipment of other people body region, in image, to detect said other people body region.
The head classifier 140 of FIG. 1 may also be a head detector having a plurality of head classifiers, for example, the pyramid detector 800 shown in FIG. 8.
FIG. 7 illustrates the division of views of a person's head according to an exemplary embodiment of the present invention. FIG. 8 illustrates a pyramid detector for detecting a person's head according to an exemplary embodiment of the present invention.
Referring to FIG. 7, as the angle of the camera toward the person's head changes, the head view is classified into eight divisions: front, left half, left, left-rear half, rear, right-rear half, right, and right half. The eight discrete views cover all views through 360 degrees around the person's head. Each view covers a range of viewing angles rather than a single viewpoint. For example, if the front view corresponds to 0 degrees and the left view to 90 degrees, then the front view actually covers [-22.5, +22.5] degrees, the left-half view covers [22.5, 67.5] degrees, and the left view covers [67.5, 112.5] degrees.
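The view partition can be expressed as a small helper (our illustration; the view labels follow FIG. 7 and the 0-degrees-front convention comes from the example above):

```python
def head_view(angle_deg):
    # 0 deg = front, 90 deg = left, 180 deg = rear, 270 deg = right;
    # each of the eight discrete views spans a 45-degree range.
    views = ["front", "left half", "left", "left-rear half",
             "rear", "right-rear half", "right", "right half"]
    return views[int(((angle_deg + 22.5) % 360) // 45)]
```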
A first head model is obtained by learning the feature set extracted from the difference images of the positive and negative samples of the front, left-half, left, left-rear-half, rear, right-rear-half, right, and right-half views of the person's head, and a first head classifier uses the first head model.
A second head model is obtained by learning the feature set extracted from the difference images of the positive and negative samples of the front and rear views of the person's head, and a second head classifier uses the second head model.
A third head model is obtained by learning the feature set extracted from the difference images of the positive and negative samples of the left and right views of the person's head, and a third head classifier uses the third head model.
A fourth head model is obtained by learning the feature set extracted from the difference images of the positive and negative samples of the left-half, left-rear-half, right-rear-half, and right-half views of the person's head, and a fourth head classifier uses the fourth head model.
In producing the above four classifiers, the feature set is extracted using the same method as is used to extract the feature set in producing the classifier 140 of FIG. 1.
The first, second, third, and fourth head classifiers correspond, respectively, to the A, F, P, and HP classifiers in FIG. 8.
Referring to FIG. 8, during detection, the image processor 110 first computes four difference images of the input image, that is, the horizontal, vertical, left-right diagonal, and right-left diagonal difference images; the pyramid detector 800 then detects over the image, searching and testing every possible scale and position in the image. A region that passes the evaluation of the pyramid detector 800 is accepted as a head candidate; otherwise, it is removed as a false alarm.
Specifically, classifier A first searches and tests every possible scale and position of a head in the image, producing the head regions detected by classifier A. The head regions detected by classifier A are then further evaluated by classifiers F, P, and HP. If a sample is accepted by one of classifiers F, P, and HP, the sample is confirmed as a head candidate for the head view corresponding to that classifier. Classifiers F, P, and HP evaluate the samples accepted by classifier A one by one, until all samples accepted by classifier A have been evaluated.
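A schematic of this coarse-to-fine evaluation (ours, with classifiers reduced to boolean callables purely for illustration):

```python
def pyramid_detect(windows, clf_all_views, view_clfs):
    # Stage 1: the all-view classifier A filters every candidate window.
    # Stage 2: survivors are tested one by one against the view-specific
    # classifiers (F, P, HP); the first accepting classifier fixes the view.
    candidates = []
    for win in windows:
        if not clf_all_views(win):
            continue                      # rejected outright: false alarm
        for view, clf in view_clfs.items():
            if clf(win):
                candidates.append((win, view))
                break
    return candidates

# Toy run: windows are integers, classifiers are simple predicates.
cands = pyramid_detect([-1, 2, 3, 5],
                       lambda w: w > 0,
                       {"F": lambda w: w % 2 == 0,
                        "P": lambda w: w % 3 == 0})
```

A window rejected by all view-specific classifiers (here, 5) is discarded as a false alarm even though the coarse stage accepted it.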
FIG. 9 is a flowchart of detection by the pyramid detector 800 of FIG. 8 according to an exemplary embodiment of the present invention.
Referring to FIG. 9, in the training process, positive and negative samples of the front, left-half, left, left-rear-half, rear, right-rear-half, right, and right-half views of the person's head are prepared.
As stated above, a first head model is obtained by learning the feature set extracted from the difference images of the positive and negative samples of the front, left-half, left, left-rear-half, rear, right-rear-half, right, and right-half views of the person's head; a second head model is obtained by learning the feature set extracted from the difference images of the positive and negative samples of the front and rear views; a third head model is obtained by learning the feature set extracted from the difference images of the positive and negative samples of the left and right views; and a fourth head model is obtained by learning the feature set extracted from the difference images of the positive and negative samples of the left-half, left-rear-half, right-rear-half, and right-half views. Classifiers A, F, P, and HP of the pyramid detector 800 use the first, second, third, and fourth head models, respectively.
In the detection process, four difference images of the input image are first computed, that is, the horizontal, vertical, left-right diagonal, and right-left diagonal difference images. The pyramid detector 800 then detects over the input image and outputs the detected heads as the detection result.
In another exemplary embodiment of the present invention, a classifier (detector) for at least one human body part other than the head (for example, the torso, the legs, or the arms) is used, according to human geometry, to further remove head false alarms from the head regions determined by the pyramid detector 800 of FIG. 8, thereby improving the precision of head detection. In addition, the head detector according to an exemplary embodiment of the present invention can serve as a person detector, because once a person's head is detected, the presence of a person can be confirmed.
FIG. 10 is a block diagram of a detector 1000 having multiple human-body-part detectors according to an exemplary embodiment of the present invention.
Referring to FIG. 10, the detector 1000 includes three part detectors, one of which serves as the primary detector. The detector 1000 detects, in the input image, the body part corresponding to the primary detector. In this exemplary embodiment, part detector I is used as the primary detector.
The operation of the detector 1000 will now be described. First, the image to be detected is input to part detector I, which detects the regions in the input image corresponding to body part I. Next, part detector II detects the regions corresponding to body part II. At this point, according to human-geometry constraints and based on the body-part-II regions detected by part detector II, part of the body-part-I candidates output by part detector I are removed as false alarms (part of the body-part-II candidates are likewise removed as false alarms), while the body-part-II regions corresponding to the remaining body-part-I candidates are retained. Part detector N then detects the regions corresponding to body part N. Again according to human-geometry constraints and based on the body-part-N regions detected by part detector N, part of the body-part-I candidates that passed the evaluation by part detector II are removed as false alarms (part of the body-part-II and body-part-N candidates are likewise removed), and the body-part-II and body-part-N regions corresponding to the remaining body-part-I candidates are obtained. In this way, using detectors for multiple parts to detect body part I gives more accurate results than using only the detector for body part I. Moreover, by using detectors for multiple parts in this way, the person and the other body parts can be detected in the input image according to human geometry.
As stated above, part detector II serves as a validator of the candidates from part detector I. In this case, part detector II scans for body part II in the neighborhood of the candidates from part detector I, where the neighborhood is determined according to human-geometry constraints. For example, if part detector I is a head detector and part detector II is a torso detector, then, based on a head candidate from part detector I, the position and size of the torso in the scanning space (that is, the image to be detected) can be roughly determined from the relative-position and size constraints between a person's head and torso, which can be obtained by statistical analysis of the positive training samples.
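The head-to-torso constraint can be sketched as follows; the size and offset ratios below are purely illustrative placeholders for the statistics the patent derives from positive training samples:

```python
def torso_scan_region(head_box, width_ratio=2.0, height_ratio=2.5, slack=0.5):
    # head_box: head candidate as (x, y, w, h), top-left corner plus size.
    # Returns the region of the input image the torso detector should scan.
    x, y, w, h = head_box
    tw, th = width_ratio * w, height_ratio * h   # expected torso size
    tx = x + w / 2 - tw / 2                      # torso centred under the head
    ty = y + h                                   # torso starts below the head
    # Enlarge by a slack margin to tolerate pose variation.
    return (tx - slack * w, ty - slack * h,
            tw + 2 * slack * w, th + 2 * slack * h)
```

Restricting the torso detector to this region is what makes part detector II a cheap validator rather than a second full-image scan.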
In the present invention, the number of part detectors in the detector 1000 is not limited to three; the detector 1000 can be realized with fewer or more than three part detectors. The part detectors may be head detectors, torso detectors, leg detectors, arm detectors, whole-body detectors, and the like.
FIG. 11 is a detailed block diagram of an exemplary detector 1100, an instance of the detector 1000 having multiple human-body-part detectors of FIG. 10, according to an exemplary embodiment of the present invention.
Referring to FIG. 11, the detector 1100 includes a determiner 1110, a head detector 1120, a leg detector 1130, and a torso detector 1140. In this embodiment, the head detector 1120 is the primary detector.
The head detector 1120 may be the head detection apparatus 100 shown in FIG. 1. The leg detector 1130 and the torso detector 1140 have the same structure as the head detection apparatus of FIG. 1.
Each of the head detector 1120, the leg detector 1130, and the torso detector 1140 includes an image processor (not shown), a training DB (not shown), and a sub-window processor (not shown) having the same functions as the image processor 110, the training DB 120, and the sub-window processor 130 of FIG. 1, so a detailed description of them is omitted. In addition, the head detector 1120 includes a head classifier obtained from a head model learned from a feature set, where the sub-window processor of the head detector 1120 extracts the feature set from the difference images of the positive and negative head samples stored in the training DB of the head detector 1120, and the image processor of the head detector 1120 computes those difference images. Likewise, the leg detector 1130 includes a leg classifier obtained from a leg model learned from a feature set extracted from the difference images of the positive and negative leg samples stored in its training DB and computed by its image processor, and the torso detector 1140 includes a torso classifier obtained from a torso model learned from a feature set extracted from the difference images of the positive and negative torso samples stored in its training DB and computed by its image processor.
First, the head detector 1120, the leg detector 1130, and the torso detector 1140 each detect over the image to output head candidates, leg candidates, and torso candidates, respectively.
Based on the head candidates from the head detector 1120 and the torso candidates from the torso detector 1140, the determiner 1110 removes false alarms from the detected head candidates according to human geometry (for example, the relative-position and size constraints between a person's head and torso). Then, based on the head candidates filtered using the torso candidates and on the leg candidates, the determiner 1110 further removes false alarms among the head candidates according to human geometry, so that the person's head, torso, and legs, and the person, can be detected.
In another exemplary embodiment of the present invention, the head detector 1120 may also be the pyramid detector 800 shown in FIG. 8.
In yet another exemplary embodiment, the head detector 1120, the leg detector 1130, and the torso detector 1140 do not each include their own image processor, training DB, and sub-window processor; instead, they share a single image processor, training DB, and sub-window processor.
FIG. 12 is a block diagram of a detector having multiple human-body-part detectors according to another exemplary embodiment of the present invention.
Referring to FIG. 12, the detector 1200 includes N part detectors I through N (N being a natural number). The detector 1200 also includes an image processor (not shown), a training DB (not shown), and a sub-window processor (not shown) having the same functions as the image processor 110, the training DB 120, and the sub-window processor 130 of FIG. 1, so a detailed description of them is omitted. Each of the part detectors I through N may include its own image processor, training DB, and sub-window processor, or the part detectors may share a single image processor, training DB, and sub-window processor.
The detector 1200 further includes a determiner (not shown) that, according to human geometry and based on the body parts detected by part detectors I through N, removes false alarms from the detected body parts to confirm the body parts and the person in the image to be detected.
Detector I comprises m1 classifiers S11, S12, S13, ..., S1m1 for body part I; detector II comprises m2 classifiers S21, S22, S23, ..., S2m2 for body part II; ...; detector n (n = 1, 2, 3, ..., N, denoting the n-th of the N part detectors) comprises mn classifiers Sn1, Sn2, Sn3, ..., Snmn for body part n; ...; detector N comprises mN classifiers SN1, SN2, SN3, ..., SNmN for body part N (where m1, m2, ..., mN are natural numbers). The classifiers are trained in the same way as the classifiers for the various parts described with reference to FIG. 11, so the description of their training is omitted. As shown in FIG. 12, within each of the N part detectors of the detector 1200, the classifiers are arranged and used in ascending order of the amount of computation they require. Since a classifier's amount of computation essentially corresponds to the number of features it uses, the classifiers are arranged in ascending order of their feature counts. The operation of the detector 1200 will now be described in detail with reference to FIG. 12.
After the image to be detected is input to the detector 1200, the image is first passed through several front-end classifiers of detector I (that is, S11 and S12), yielding candidates for body part I (that is, operation reaches point A). The image is then passed through several front-end classifiers of detector II (that is, S21 and S22), yielding candidates for body part II. The determiner then, according to human geometry and based on the candidates for body parts I and II, removes false alarms from the body-part-I candidates (that is, operation reaches point B). In this way, the front-end classifiers of the remaining detectors III through N are used in turn, and further false alarms are removed by the determiner (that is, operation reaches point F). Operation then reaches point K: the classifiers of detector I are used again for detection, further false alarms are removed by the determiner, and detectors II through N are then used in turn; detectors I through N are used repeatedly until all classifiers in detectors I through N have been used. The principle of this detection scheme is that by first using the front-end classifiers of each detector (the classifiers with fewer features) in cooperation, most false alarms can be removed with little computation, and the classifiers with more features are then used progressively, which greatly improves detection speed. For the same false alarm, if only detector I is used, classifiers S11, S12, and S13, comprising 50 features in total, may all be needed to remove it; whereas if classifiers S11 and S12 of detector I and classifier S21 of detector II are used, only 5 features may suffice. Switching from one detector to the next occurs automatically according to a predetermined threshold on the number of features in the classifiers. Although the threshold can be chosen in different ways, the principle is the same: switch between detectors so that false alarms are removed at the front end with fewer features.
In one exemplary embodiment of the present invention, all classifiers in the detector 1200 are used in ascending order of the number of features they have. That is, when the detector 1200 performs detection, the part detector to which a classifier belongs is not considered; instead, each classifier in the detector 1200 is used in ascending order of feature count, the classifier with the fewest features first.
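This global ordering can be sketched as follows (our illustration; the feature counts echo the 50-versus-5 example above):

```python
def global_classifier_order(detectors):
    # detectors: part name -> list of (classifier_name, n_features).
    # Flatten the classifiers of all part detectors and run them in
    # ascending order of feature count, regardless of owning detector.
    stages = [(n_feats, part, name)
              for part, clfs in detectors.items()
              for name, n_feats in clfs]
    stages.sort()
    return [(part, name) for _, part, name in stages]

order = global_classifier_order({
    "I":  [("S11", 2), ("S12", 10), ("S13", 38)],   # 50 features in total
    "II": [("S21", 5), ("S22", 20)],
})
```

Here S21 of detector II runs before S12 and S13 of detector I, so many false alarms are rejected before the expensive 38-feature stage is ever evaluated.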
The present invention is not limited to detecting human body parts and people; it can be applied to detecting any object with a definite shape (for example, animals, plants, buildings, natural scenes, and articles of daily life and production).
FIG. 13 is a block diagram of an imaging device 1300 with an object detection function according to an exemplary embodiment of the present invention.
Referring to FIG. 13, the imaging device 1300 includes an imaging unit 1301, an object detecting unit 1302, a parameter calculation unit 1303, a control unit 1304, a storage unit 1305, a display unit 1306, and a marking unit 1307. The imaging device may be any of a PTZ (pan, tilt, zoom) camera, a static surveillance camera, a DC (digital camera), a camera phone, a DV (digital video camera), and a PVR (personal video recorder).
The imaging unit 1301 is a hardware unit, for example a CCD or CMOS device, used to sense and produce natural images.
For tracking a moving object, there are two ways to provide the position and size of the moving object's region. The first is an automatic method that uses the embedded object detection function to extract the size and position of the region of the object of interest. The second is a manual method in which the user or operator marks the region of the object of interest on the displayed image (for example, on a touch screen). With the automatic method, using the object detection method of the present invention, objects can be detected automatically. The marking unit 1307 provides a marking function so that the user or operator can manually mark objects of interest on the image with a pen or a finger.
The object detecting unit 1302 can receive image data from the imaging unit 1301, and can also receive the size and position information of the region of interest marked by the user, for example in the form of a rough label. The object detecting unit 1302 detects the precise region where the object is located, and the parameter calculation unit 1303 calculates, from the region provided by the object detecting unit 1302, the parameters for adjusting the attitude of the imaging device. Note that the marking unit 1307 is optional when the first (automatic) method of providing the object's position and size is used. When multiple tracking objects are available, for example when several moving objects are being tracked, the user can change the tracking object automatically selected by the imaging device. The object detecting unit 1302 is described in detail below with reference to FIG. 14.
FIG. 14 is a block diagram of the detecting unit of FIG. 13 according to an exemplary embodiment of the present invention.
Referring to FIG. 14, the object detecting unit 1302 includes an image processor 1410, a training DB 1420, a sub-window processor 1430, an object classifier 1440, and an output unit 1450.
The image processor 1410 computes the difference images of the image output by the imaging unit 1301 of FIG. 13 using the method described with reference to FIG. 2. The training DB 1420 stores positive and negative samples of various objects (for example, various living things, plants, buildings, natural scenes, and articles of daily life and production). The sub-window processor 1430 extracts a feature set from the difference images, computed by the image processor 1410, of the positive and negative object samples stored in the training DB 1420, using the feature extraction method described with reference to FIG. 6. Based on the difference images of the input image to be detected computed by the image processor 1410, the object classifier 1440 detects the object region in the image using an object model obtained by learning from the extracted feature set. The object model can be obtained by the same method as the learning method for the head model described above. The output unit 1450 outputs the region of the image occupied by the object, as provided by the object classifier 1440 and/or the marking unit 1307.
The object detecting unit 1302 may also omit the training DB 1420 and the sub-window processor 1430, with the classification models (that is, object models) of various predetermined objects preset in the object classifier 1440. The object classifier 1440 then configures the object classification model according to the object type the user wishes to detect.
The control unit 1304 can adjust the attitude of the imaging device; the attitude is controlled by the pan, tilt, zoom, and selective-focus-region operations of a PTZ camera, or by the zoom and focusing operations of a static surveillance camera, DC, DV, or PVR. The control unit 1304 receives the parameters for adjusting the imaging device's attitude from the parameter calculation unit 1303. The object detecting unit 1302 can provide the position and size information of the object at a new time point or in new frame data. The control unit 1304 adjusts the attitude of the imaging device according to the parameters: it centers the object in the image through pan/tilt operations, selects the region of the object of interest through the selective-focus-region operation, and focuses on the region of the object of interest through zoom operations, so as to obtain high-quality details of the moving object. In the selective-focus-region operation, the control unit can select the new region where the object is located as the basis for focusing, so that that region is brought into focus. In addition, when the control unit directs the imaging device to select a focus region, the imaging device, besides using the image center region as the default focus region, can dynamically select the new image region where the object is located as the focus region and, according to the image data of the focus region, dynamically adjust the zoom factor, focal length, pan, or tilt parameters of the imaging device, thereby obtaining a better image.
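A minimal sketch of the pan/tilt side of the parameter calculation (ours, not the patent's; the degrees-per-pixel factor stands in for device calibration the patent does not specify):

```python
def pan_tilt_adjust(obj_box, frame_size, deg_per_px=(0.05, 0.05)):
    # obj_box: detected object as (x, y, w, h); frame_size: (width, height).
    # Returns the (pan, tilt) angles that re-centre the object in the frame.
    x, y, w, h = obj_box
    fw, fh = frame_size
    off_x = (x + w / 2) - fw / 2       # horizontal offset of object centre
    off_y = (y + h / 2) - fh / 2       # vertical offset of object centre
    return (off_x * deg_per_px[0], off_y * deg_per_px[1])
```

With each new detection, the control unit would apply the returned angles so the tracked object stays centered while zoom and focus are adjusted separately.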
For an electronic product held in the user's hands, such as a DC, DV, or PVR, the user can manually adjust its attitude to center the object of interest in the image, and the control unit of the exemplary embodiment can dynamically adjust the zoom factor and focusing parameters of the imaging device according to the detection result and the parameters provided by the parameter calculation unit.
The storage unit 1305 stores images or video, and the display unit 1306 displays the live image or video to the user.
The detecting unit 1302 according to an exemplary embodiment of the present invention may also be implemented as software connected to an embedded system that has the imaging device and the control unit, in order to adjust the attitude parameters of the imaging device. Such an embedded imaging device system can receive video as input and send commands to the control unit of the imaging device to adjust the imaging device's attitude, lens focus region, and so on.
The present invention has the following effects and advantages:
(1) Less computation. Computing the difference image requires only subtractions between neighboring pixels, with no division or arctangent operations.
(2) Full representation. The source image is divided into multiple sub-images at multiple scales, representing the source image fully without quantization.
(3) Strong discriminative power. Experiments show that the new features reduce complexity and improve efficiency.
(4) Multi-view, multi-part detectors can be freely combined into a person detector according to the user's situation.
(5) Multi-scale feature extraction reduces the influence of background noise. At coarse scales, cluttered backgrounds can be suppressed; at fine scales, local details can be obtained.
Although the present invention has been particularly shown and described with reference to exemplary embodiments thereof, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the present invention as defined by the claims.

Claims (46)

1. A method of detecting a person in an image, the method comprising:
computing a difference image of the image to be detected, wherein each pixel in the difference image represents the average gray-level change of the pixels in a neighborhood along a target direction;
detecting a first body part, based on the computed difference image of the image to be detected, using a first-body-part model corresponding to the first body part, wherein the first-body-part model is obtained by learning a feature set extracted from the difference images of positive samples and the difference images of negative samples of the first body part.
2. The method of claim 1, further comprising:
detecting at least one second body part different from the first body part, based on the computed difference image of the image to be detected, using a second-body-part model corresponding to the second body part, wherein the second-body-part model is obtained by learning a feature set extracted from the difference images of positive samples and the difference images of negative samples of the second body part;
removing false alarms from the detected first body part, based on the detected second body part, according to the spatial-position and size constraints of different body parts learned from human geometry.
3. The method of claim 1 or claim 2, wherein the difference image includes difference images computed at at least one scale along the horizontal direction, the vertical direction, the left-right diagonal, and the right-left diagonal of the image.
4. The method of claim 3, wherein extracting the feature set from the difference images comprises extracting the feature set using a single window or multiple windows on at least one of the difference images along the horizontal direction, the vertical direction, the left-right diagonal, and the right-left diagonal.
5. The method of claim 4, wherein the first body part is one of a person's head, torso, legs, arms, and whole body.
6. The method of claim 5, wherein the first body part is a person's head.
7. The method of claim 6, wherein the positive and negative samples of the first body part comprise positive and negative samples of the front, left, left-half, left-rear-half, rear, right-rear-half, right, and right-half views of a person's head.
8. The method of claim 7, wherein the first-body-part model comprises:
a first head model for detecting the head, obtained by learning a feature set extracted from the difference images of the positive and negative samples of the front, left, left-half, left-rear-half, rear, right-rear-half, right, and right-half views of a person's head;
a second head model for detecting the front and rear views of a person's head, obtained by learning a feature set extracted from the difference images of the positive and negative samples of the front and rear views of a person's head;
a third head model for detecting the left and right views of a person's head, obtained by learning a feature set extracted from the difference images of the positive and negative samples of the left and right views of a person's head;
a fourth head model for detecting the left-rear-half, left-half, right-rear-half, and right-half views of a person's head, obtained by learning a feature set extracted from the difference images of the positive and negative samples of the left-rear-half, left-half, right-rear-half, and right-half views of a person's head.
9. The method of claim 8, wherein detecting the first body part based on the computed difference image of the image to be detected, using the first-body-part model corresponding to the first body part, comprises:
detecting the head in the image to be detected using the first head model, based on the computed difference image of the image to be detected;
evaluating the heads detected by the first head model using the second, third, or fourth head model, based on the computed difference image of the image to be detected, to remove, respectively, front- and rear-view false alarms, left- and right-view false alarms, or left-rear-half-, left-half-, right-rear-half-, and right-half-view false alarms of a person's head.
10. The method of claim 2, wherein the second body part comprises at least one of a torso, legs, arms, and a whole body.
11. An apparatus for detecting a person in an image, the apparatus comprising:
an image processor which computes a difference image of an image, wherein each pixel of the difference image represents the average gray-level change of the pixels within an adjacent region along a target direction;
a training DB which stores positive samples and negative samples of human-body parts;
a sub-window processor which extracts a feature set from the difference images, computed by the image processor, of the positive samples and negative samples stored in the training DB;
a first-body-part classifier which detects the first body part corresponding to the classifier using a first-body-part model, based on the computed difference image of the image to be detected, wherein the first-body-part model is obtained by the sub-window processor through learning from a feature set extracted from the difference images of the positive samples and negative samples of the first body part stored in the training DB.
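The difference image recited in the claims above can be sketched as follows. This is a minimal illustration, not the patented implementation: the neighborhood half-width `k`, the wrap-around border handling, and the absolute-difference form are all assumptions, since the claims only require that each pixel represent the average gray-level change within an adjacent region along a target direction.

```python
import numpy as np

def difference_image(gray, direction="horizontal", k=2):
    """Each output pixel approximates the average gray-level change of the
    pixels in an adjacent region along the target direction: here, the
    absolute difference between the mean of the k pixels ahead and the
    mean of the k pixels behind (borders wrap for simplicity)."""
    g = gray.astype(np.float64)
    steps = {
        "horizontal": (0, 1),    # left-right
        "vertical": (1, 0),      # up-down
        "lr_diagonal": (1, 1),   # left-right diagonal
        "rl_diagonal": (1, -1),  # right-left diagonal
    }
    dy, dx = steps[direction]
    ahead = np.zeros_like(g)
    behind = np.zeros_like(g)
    for i in range(1, k + 1):
        ahead += np.roll(g, shift=(-i * dy, -i * dx), axis=(0, 1))
        behind += np.roll(g, shift=(i * dy, i * dx), axis=(0, 1))
    return np.abs(ahead - behind) / k

img = np.tile(np.linspace(0, 255, 8), (8, 1))  # horizontal intensity ramp
d = difference_image(img, "horizontal", k=1)
```

In the claimed scheme the four directional difference images would be computed at one or more scales and fed to the sub-window feature extractor.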
12. The apparatus as claimed in claim 11, further comprising:
at least one second-body-part classifier which detects a second body part using a second-body-part model, based on the computed difference image of the image to be detected, wherein the second-body-part model is obtained by the sub-window processor through learning from a feature set extracted from the difference images of the positive samples and negative samples of the second body part stored in the training DB, and wherein each of the at least one second-body-part classifier corresponds one-to-one to one of at least one second body part different from the first body part;
a determiner which removes false alarms from the detected first body part based on the detected second body part, according to spatial-position and size constraints among different body parts known from human-body geometry.
13. The apparatus as claimed in claim 11 or 12, wherein the image processor computes difference images of the image at at least one scale in the horizontal direction, the vertical direction, the left-right diagonal direction, and the right-left diagonal direction.
14. The apparatus as claimed in claim 13, wherein the sub-window processor extracts the feature set using single windows or multiple windows on at least one of the difference images in the horizontal, vertical, left-right diagonal, and right-left diagonal directions.
15. The apparatus as claimed in claim 14, wherein the first body part is one of a person's head, a human torso, human legs, human arms, and a whole human body.
16. The apparatus as claimed in claim 15, wherein the first body part is a person's head.
17. The apparatus as claimed in claim 16, wherein the positive samples and negative samples of the first body part comprise positive samples and negative samples of the front side, left side, left half, rear-left half, rear side, rear-right half, right side, and right half of a person's head.
18. The apparatus as claimed in claim 17, wherein the first-body-part classifier comprises:
a first head classifier which detects a head, based on the difference image, using a first head model obtained by learning from a feature set extracted from the difference images of the positive samples and negative samples of the front side, left side, left half, rear-left half, rear side, rear-right half, right side, and right half of a person's head;
a second head classifier which verifies the front side and the rear side of the head, based on the difference image, using a second head model obtained by learning from a feature set extracted from the difference images of the positive samples and negative samples of the front side and the rear side of the person's head;
a third head classifier which verifies the left side and the right side of the head, based on the difference image, using a third head model obtained by learning from a feature set extracted from the difference images of the positive samples and negative samples of the left side and the right side of the person's head;
a fourth head classifier which detects the rear-left half, left half, rear-right half, and right half of the head, based on the difference image, using a fourth head model obtained by learning from a feature set extracted from the difference images of the positive samples and negative samples of the rear-left half, left half, rear-right half, and right half of the person's head.
19. The apparatus as claimed in claim 18, wherein the head detected through the first head classifier is verified using the second, third, and fourth head classifiers, so as to remove false alarms for the front side and rear side of the person's head, for the left side and right side of the person's head, and for the rear-left half, left half, rear-right half, and right half of the person's head, respectively.
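The two-stage head detection of claims 18 and 19 — a coarse all-view classifier followed by view-specific verifiers — reduces, in control-flow terms, to the sketch below. The classifiers here are stand-in threshold functions, and the accept-if-any-verifier-agrees rule is an assumption; the claims fix only the structure, not the decision rule.

```python
def detect_heads(windows, first_clf, view_clfs):
    """first_clf: coarse all-view head classifier (loose, fast).
    view_clfs: view-specific verifiers (e.g. front/rear, left/right,
    half-view models). A candidate window survives only if the coarse
    classifier fires AND at least one view verifier confirms it."""
    candidates = [w for w in windows if first_clf(w)]
    return [w for w in candidates if any(clf(w) for clf in view_clfs)]

# toy demonstration with scalar "windows" and threshold classifiers
windows = [0.2, 0.55, 0.8, 0.95]
coarse = lambda w: w > 0.5                       # many hits, some false alarms
verifiers = [lambda w: w > 0.9,                  # e.g. front/rear verifier
             lambda w: 0.7 < w <= 0.9]           # e.g. left/right verifier
result = detect_heads(windows, coarse, verifiers)  # 0.55 is rejected as a false alarm
```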
20. The apparatus as claimed in claim 12, wherein the second body part comprises at least one of a human torso, human legs, human arms, and a whole human body.
21. A method of detecting a person in an image, the method comprising:
(a) computing a difference image of an image to be detected, wherein each pixel of the difference image represents the average gray-level change of the pixels within an adjacent region along a target direction;
(b) detecting, based on the computed difference image of the image to be detected, a body part corresponding to one of a plurality of body-part models that correspond one-to-one to a plurality of different body parts, wherein each of the plurality of body-part models is obtained by learning from a feature set extracted from the difference images of the positive samples and negative samples of the body part corresponding to that model;
(c) repeating step (b) for another body part of the plurality of different body parts, different from the body part corresponding to the model used in step (b), using the body-part model corresponding to the other body part, and removing false alarms from the detected body parts based on the detected other body part, according to human-body geometry;
(d) finally determining the detected body parts based on the result of step (c), and determining the detected person according to human-body geometry.
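Steps (a) through (d) above amount to detecting each body part independently and then cross-validating the candidates geometrically. A minimal sketch follows, in which the "human geometry" constraint is reduced to a single assumed rule (the head sits above the torso); the patent leaves the actual spatial and size constraints open.

```python
def detect_person(parts_detections, geometry_ok):
    """parts_detections: {part_name: [candidate boxes]}.
    geometry_ok(part_a, box_a, part_b, box_b) -> bool encodes the
    spatial/size constraints from human-body geometry. A candidate is
    kept only if it is consistent with some candidate of every other
    part that produced detections."""
    kept = {}
    for part, boxes in parts_detections.items():
        kept[part] = [
            b for b in boxes
            if all(
                any(geometry_ok(part, b, other, ob) for ob in obs)
                for other, obs in parts_detections.items()
                if other != part and obs
            )
        ]
    return kept

# toy constraint: boxes are (x, y, w, h); the head must lie above the torso
def head_above_torso(pa, a, pb, b):
    if {pa, pb} != {"head", "torso"}:
        return True                     # no constraint for other pairs
    head, torso = (a, b) if pa == "head" else (b, a)
    return head[1] < torso[1]

dets = {"head": [(10, 5, 8, 8), (40, 60, 8, 8)], "torso": [(8, 15, 14, 25)]}
result = detect_person(dets, head_above_torso)
# the head candidate at y=60 lies below the torso and is removed as a false alarm
```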
22. The method as claimed in claim 21, wherein the difference image comprises difference images of the image computed at at least one scale in the horizontal direction, the vertical direction, the left-right diagonal direction, and the right-left diagonal direction.
23. The method as claimed in claim 22, wherein the step of extracting the feature set from the difference image comprises extracting the feature set using single windows or multiple windows on at least one of the difference images in the horizontal, vertical, left-right diagonal, and right-left diagonal directions.
24. The method as claimed in claim 23, wherein each of the plurality of different body parts corresponds to a plurality of body-part models.
25. The method as claimed in claim 24, wherein step (b) comprises: for one of the plurality of different body parts, removing false alarms using at least one of the plurality of body-part models corresponding to that body part, in ascending order of the number of features each model has.
26. The method as claimed in claim 25, wherein step (c) comprises: repeating step (b) for the plurality of different body parts in a predetermined order until the models corresponding to all of the plurality of different body parts have been used.
27. The method as claimed in claim 24, wherein steps (b) and (c) further comprise: using all body-part models in ascending order of the number of features each model has, without considering the detection order of the body parts to which the models correspond.
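The ordering in claims 25 through 27 is a cascade heuristic: a model with fewer features is cheaper to evaluate, so running it first rejects most false alarms before the expensive models are consulted. A sketch, with the feature count as an explicit attribute (the cost model and the threshold classifiers are assumptions for illustration):

```python
def cascade(candidates, models):
    """Apply body-part models in ascending order of feature count: models
    with fewer features run first and discard most candidates cheaply;
    only the survivors reach the models with many features."""
    for model in sorted(models, key=lambda m: m["n_features"]):
        candidates = [c for c in candidates if model["accept"](c)]
    return candidates

models = [
    {"n_features": 200, "accept": lambda s: s > 0.8},  # expensive, strict
    {"n_features": 20,  "accept": lambda s: s > 0.3},  # cheap, loose
]
survivors = cascade([0.1, 0.5, 0.9], models)  # only 0.9 passes both stages
```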
28. The method as claimed in claim 26 or 27, wherein the plurality of different body parts comprise a person's head, a human torso, human legs, human arms, and a whole human body.
29. An apparatus for detecting a person in an image, the apparatus comprising:
a plurality of body-part detectors corresponding one-to-one to a plurality of different body parts, each detecting its corresponding body part;
a determiner which removes false alarms according to human-body geometry, based on the body parts detected by the plurality of body-part detectors, to determine the body parts and the person in the image to be detected;
wherein each of the plurality of body-part detectors comprises:
an image processor which computes a difference image of an image, wherein each pixel of the difference image represents the average gray-level change of the pixels within an adjacent region along a target direction;
a training DB which stores positive samples and negative samples of the body part;
a sub-window processor which extracts a feature set from the difference images, computed by the image processor, of the positive samples and negative samples stored in the training DB;
a body-part classifier which detects the body part corresponding to the classifier using a body-part model, based on the computed difference image of the image to be detected, wherein the body-part model is obtained by the sub-window processor through learning from a feature set extracted from the difference images of the positive samples and negative samples of the body part stored in the training DB.
30. The apparatus as claimed in claim 29, wherein the image processor computes difference images of the image at at least one scale in the horizontal direction, the vertical direction, the left-right diagonal direction, and the right-left diagonal direction.
31. The apparatus as claimed in claim 30, wherein the sub-window processor extracts the feature set using single windows or multiple windows on at least one of the difference images in the horizontal, vertical, left-right diagonal, and right-left diagonal directions.
32. The apparatus as claimed in claim 31, wherein each of the plurality of body-part detectors has a plurality of body-part classifiers.
33. The apparatus as claimed in claim 31, wherein detection is performed in a predetermined order of the plurality of different body parts, using at least one body-part classifier in the corresponding body-part detector, and is repeated in the predetermined order until all body-part classifiers in the apparatus have been used, wherein after each body part has been detected the determiner removes false alarms based on the detection result, and wherein, when a body-part detector is used, its at least one body-part classifier is used in ascending order of the number of features of the models used by the classifiers in that detector.
34. The apparatus as claimed in claim 32, wherein all body-part classifiers are used for detection in ascending order of the number of features of the models they use, without considering the detection order of the body parts to which the classifiers correspond, and wherein after each body part has been detected the determiner removes false alarms based on the detection result.
35. The apparatus as claimed in claim 33 or 34, wherein the plurality of different body parts comprise a person's head, a human torso, human legs, human arms, and a whole human body.
36. An imaging device, comprising:
an imaging unit which captures an image of an object;
a detecting unit which detects the region of the captured object in the image using an object model, based on a difference image of the captured image, wherein the object model is obtained by learning from a feature set extracted from the difference images of the positive samples and negative samples of the captured object, and wherein each pixel of the difference image represents the average gray-level change of the pixels within an adjacent region along a target direction;
an attitude-parameter computing unit which computes and generates parameters for adjusting the attitude of the imaging device according to the detected region of the captured object in the image, so as to place the object in the central region of the image;
a control unit which receives the attitude-adjustment parameters from the attitude-parameter computing unit and adjusts the attitude of the imaging device;
a storage unit which stores the captured image of the object;
a display unit which displays the captured image of the object.
37. The imaging device as claimed in claim 36, further comprising:
a marking unit which provides the detecting unit with an object region manually marked on the image by a user.
38. The imaging device as claimed in claim 37, wherein the control unit adjusts at least one of the panning, tilting, zooming, and focus-region-selection operations of the imaging device according to the parameters for adjusting the attitude of the imaging device.
39. The imaging device as claimed in claim 38, wherein, in the focus-region-selection operation, the control unit controls the imaging device to select the new region where the object is located as the basis for focusing, so that the new region is brought into focus.
40. The imaging device as claimed in claim 39, wherein, when the control unit controls the imaging device to select a focus region, the imaging device selects the region at the image center as the default focus region, or dynamically selects the new region where the object is located as the focus region, and dynamically adjusts the zoom factor, focal length, panning, or tilting parameters of the imaging device according to the image data of the focus region.
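Claims 36 through 40 close the loop from detection to camera attitude: the attitude-parameter computing unit turns the offset between the detected region and the image center into pan/tilt commands. A sketch with a simple proportional control law; the gain `deg_per_px` and the sign conventions are assumptions, since the claims do not specify the control law.

```python
def attitude_params(box, image_size, deg_per_px=0.05):
    """box = (x, y, w, h) of the detected object region; returns (pan, tilt)
    in degrees that move the object's center toward the image center."""
    x, y, w, h = box
    img_w, img_h = image_size
    dx = (x + w / 2) - img_w / 2  # +: object right of center -> pan right
    dy = (y + h / 2) - img_h / 2  # +: object below center   -> tilt down
    return dx * deg_per_px, dy * deg_per_px

# object at center (520, 120) in a 640x480 frame, image center (320, 240):
# the camera should pan right and tilt up
pan, tilt = attitude_params((500, 100, 40, 40), (640, 480))
```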
41. A method of detecting an object in an image, the method comprising:
computing a difference image of an image to be detected, wherein each pixel of the difference image represents the average gray-level change of the pixels within an adjacent region along a target direction;
detecting the object using an object model, based on the computed difference image of the image to be detected, wherein the object model is obtained by learning from a feature set extracted from the difference images of the positive samples and negative samples of the object.
42. The method as claimed in claim 41, wherein the difference image comprises difference images of the image computed at at least one scale in the horizontal direction, the vertical direction, the left-right diagonal direction, and the right-left diagonal direction.
43. The method as claimed in claim 42, wherein the step of extracting the feature set from the difference image comprises extracting the feature set using single windows or multiple windows on at least one of the difference images in the horizontal, vertical, left-right diagonal, and right-left diagonal directions.
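The single-window and multi-window features of claims 42 and 43 are rectangle statistics over the directional difference images, in the spirit of Haar-like features. The sketch below uses an integral image for constant-time window sums; the two-window contrast form is an assumption, since the claims do not fix the exact feature shape.

```python
import numpy as np

def integral(img):
    """Summed-area table: ii[y, x] = sum of img[:y+1, :x+1]."""
    return np.cumsum(np.cumsum(img, axis=0), axis=1)

def window_sum(ii, x, y, w, h):
    """Sum of img[y:y+h, x:x+w] from the integral image ii in O(1)."""
    s = ii[y + h - 1, x + w - 1]
    if x > 0: s -= ii[y + h - 1, x - 1]
    if y > 0: s -= ii[y - 1, x + w - 1]
    if x > 0 and y > 0: s += ii[y - 1, x - 1]
    return s

def two_window_feature(diff_img, win_a, win_b):
    """Multi-window feature: contrast between two rectangles (x, y, w, h)
    of a difference image -- one possible form of the claimed features."""
    ii = integral(diff_img)
    return window_sum(ii, *win_a) - window_sum(ii, *win_b)

d = np.arange(16, dtype=float).reshape(4, 4)        # toy difference image
f = two_window_feature(d, (0, 0, 2, 2), (2, 2, 2, 2))
```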
44. An apparatus for detecting an object in an image, the apparatus comprising:
an image processor which computes a difference image of an image, wherein each pixel of the difference image represents the average gray-level change of the pixels within an adjacent region along a target direction;
a training DB which stores positive samples and negative samples of the object;
a sub-window processor which extracts a feature set from the difference images, computed by the image processor, of the positive samples and negative samples of the object stored in the training DB;
an object classifier which detects the object using an object model, based on the computed difference image of the image to be detected, wherein the object model is obtained by learning from a feature set extracted from the difference images of the positive samples and negative samples of the object.
45. The apparatus as claimed in claim 44, wherein the image processor computes difference images of the image at at least one scale in the horizontal direction, the vertical direction, the left-right diagonal direction, and the right-left diagonal direction.
46. The apparatus as claimed in claim 45, wherein the sub-window processor extracts the feature set using single windows or multiple windows on at least one of the difference images in the horizontal, vertical, left-right diagonal, and right-left diagonal directions.
CN2007101639084A 2007-10-10 2007-10-10 Method and apparatus for detecting part of human body and human, and method and apparatus for detecting objects Expired - Fee Related CN101406390B (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN2007101639084A CN101406390B (en) 2007-10-10 2007-10-10 Method and apparatus for detecting part of human body and human, and method and apparatus for detecting objects
KR1020080011390A KR101441333B1 (en) 2007-10-10 2008-02-04 Detecting Apparatus of Human Component AND Method of the same
US12/285,694 US8447100B2 (en) 2007-10-10 2008-10-10 Detecting apparatus of human component and method thereof
US13/867,464 US9400935B2 (en) 2007-10-10 2013-04-22 Detecting apparatus of human component and method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2007101639084A CN101406390B (en) 2007-10-10 2007-10-10 Method and apparatus for detecting part of human body and human, and method and apparatus for detecting objects

Publications (2)

Publication Number Publication Date
CN101406390A CN101406390A (en) 2009-04-15
CN101406390B true CN101406390B (en) 2012-07-18

Family

ID=40569763

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2007101639084A Expired - Fee Related CN101406390B (en) 2007-10-10 2007-10-10 Method and apparatus for detecting part of human body and human, and method and apparatus for detecting objects

Country Status (2)

Country Link
KR (1) KR101441333B1 (en)
CN (1) CN101406390B (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101607569B1 (en) * 2009-07-14 2016-03-30 엘지이노텍 주식회사 Apparatus for detecting person and method thereof
CN102147917A (en) * 2010-02-08 2011-08-10 三星电子株式会社 Method for detecting rod-shaped target part
CN102592143B (en) * 2012-01-09 2013-10-23 清华大学 Method for detecting phone holding violation of driver in driving
JP2013164834A (en) * 2012-01-13 2013-08-22 Sony Corp Image processing device, method thereof, and program
US9443136B2 (en) 2013-04-12 2016-09-13 Samsung Electronics Co., Ltd. Apparatus and method for detecting body parts from user image
CN104573612B * 2013-10-16 2019-10-22 北京三星通信技术研究有限公司 Device and method for estimating the postures of multiple overlapping human objects in a depth image
JP6372388B2 (en) * 2014-06-23 2018-08-15 株式会社デンソー Driver inoperability detection device
CN104268598B (en) * 2014-09-26 2017-05-03 东南大学 Human leg detection method based on two-dimensional scanning lasers
CN105303523A (en) * 2014-12-01 2016-02-03 维沃移动通信有限公司 Image processing method and mobile terminal
CN104573669B (en) * 2015-01-27 2018-09-04 中国科学院自动化研究所 Image object detection method
CN105893926A (en) * 2015-12-15 2016-08-24 乐视致新电子科技(天津)有限公司 Hand identification method, system and device
AU2017290128B2 (en) * 2016-06-29 2022-08-11 Vision Quest Industries Incorporated Dba Vq Orthocare Measurement and ordering system for orthotic devices
CN108229418B (en) * 2018-01-19 2021-04-02 北京市商汤科技开发有限公司 Human body key point detection method and apparatus, electronic device, storage medium, and program
CN110414541B (en) 2018-04-26 2022-09-09 京东方科技集团股份有限公司 Method, apparatus, and computer-readable storage medium for identifying an object
CN110770739A (en) * 2018-10-31 2020-02-07 深圳市大疆创新科技有限公司 Control method, device and control equipment based on image recognition
CN109948432A (en) * 2019-01-29 2019-06-28 江苏裕兰信息科技有限公司 A kind of pedestrian detection method
US11361589B2 (en) 2020-04-01 2022-06-14 Sensetime International Pte. Ltd. Image recognition method, apparatus, and storage medium
SG10202003027QA (en) * 2020-04-01 2020-10-29 Sensetime Int Pte Ltd Image recognition method, apparatus, and storage medium
CN114677633B (en) * 2022-05-26 2022-12-02 之江实验室 Multi-component feature fusion-based pedestrian detection multi-target tracking system and method

Citations (3)

Publication number Priority date Publication date Assignee Title
CN1731418A (en) * 2005-08-19 2006-02-08 清华大学 Method of robust accurate eye positioning in complicated background image
CN1731417A (en) * 2005-08-19 2006-02-08 清华大学 Method of robust human face detection in complicated background image
CN1906634A (en) * 2003-11-19 2007-01-31 西门子共同研究公司 System and method for detecting and matching anatomical structures using appearance and shape

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
KR100695136B1 (en) * 2005-01-04 2007-03-14 삼성전자주식회사 Face detection method and apparatus in image

Patent Citations (3)

Publication number Priority date Publication date Assignee Title
CN1906634A (en) * 2003-11-19 2007-01-31 西门子共同研究公司 System and method for detecting and matching anatomical structures using appearance and shape
CN1731418A (en) * 2005-08-19 2006-02-08 清华大学 Method of robust accurate eye positioning in complicated background image
CN1731417A (en) * 2005-08-19 2006-02-08 清华大学 Method of robust human face detection in complicated background image

Also Published As

Publication number Publication date
KR101441333B1 (en) 2014-09-18
KR20090037275A (en) 2009-04-15
CN101406390A (en) 2009-04-15

Similar Documents

Publication Publication Date Title
CN101406390B (en) Method and apparatus for detecting part of human body and human, and method and apparatus for detecting objects
CN109376637B (en) People counting system based on video monitoring image processing
US9400935B2 (en) Detecting apparatus of human component and method thereof
CN103824070B Rapid pedestrian detection method based on computer vision
Marin et al. Learning appearance in virtual scenarios for pedestrian detection
CN104378582B Intelligent video analysis system and method based on Pan/Tilt/Zoom camera cruising
DE69935437T2 (en) VISUAL DEVICE
CN105787478B Face turning recognition method based on neural network and sensitivity parameters
CN102214309B (en) Special human body recognition method based on head and shoulder model
CN105930822A (en) Human face snapshot method and system
JP5227629B2 (en) Object detection method, object detection apparatus, and object detection program
CN105046206B Pedestrian detection method and device based on motion prior information in video
CN107784291A (en) target detection tracking method and device based on infrared video
GB2431717A (en) Scene analysis
CN106650619A (en) Human action recognition method
CN110929593A Real-time saliency pedestrian detection method based on detail discrimination
KR101753097B1 (en) Vehicle detection method, data base for the vehicle detection, providing method of data base for the vehicle detection
CN103353941B (en) Natural marker registration method based on viewpoint classification
CN103065163B Fast target detection and recognition system and method based on static images
CN112613359B (en) Construction method of neural network for detecting abnormal behaviors of personnel
JP5027030B2 (en) Object detection method, object detection apparatus, and object detection program
CN105912126A (en) Method for adaptively adjusting gain, mapped to interface, of gesture movement
CN101398896B (en) Device and method for extracting color characteristic with strong discernment for image forming apparatus
CN114140745A (en) Method, system, device and medium for detecting personnel attributes of construction site
Kurita et al. Scale and rotation invariant recognition method using higher-order local autocorrelation features of log-polar image

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20120718
