CN102663363B

CN102663363B - Human detection method in single frame image

Info

Publication number: CN102663363B
Application number: CN201210101292.9A
Authority: CN
Inventors: 胡幸福; 彭先蓉; 徐勇
Original assignee: Institute of Optics and Electronics of CAS
Current assignee: Institute of Optics and Electronics of CAS
Priority date: 2012-04-09
Filing date: 2012-04-09
Publication date: 2015-01-07
Anticipated expiration: 2032-04-09
Also published as: CN102663363A

Abstract

The invention provides a method for detecting humans in a single-frame image. The method introduces a new feature for human detection, derived from the Haar-like feature and called the Multi-Block feature. The Multi-Block feature is a rectangular feature: for a sample image of 24×36 pixels, the feature template is divided into 12 rectangular blocks of equal size. Six of the blocks are selected as the white region and the other six as the black region, and the feature value is defined as the sum of the white-region pixels minus the sum of the black-region pixels. In a 24×36-pixel sample image, the Multi-Block feature yields more than ten million weak features. Experimental results show that the Multi-Block feature retains the fast detection speed and high detection rate of the Haar-like feature while overcoming its excessively high false-alarm rate in human detection.

Description

A human body detection method in a single-frame image

Technical field

The present invention belongs to the technical field of human detection and is proposed for detecting human bodies in a single-frame image. The Multi-Block feature of the invention can be used not only for human detection but also for detecting other targets.

Background technology

Human detection has important practical value in fields such as driver assistance systems, human motion capture, pornographic image filtering, and virtual video. Body shape varies in complex ways, and people may wear clothing of any color and style, so detecting humans in a still image is a very difficult task. To detect a human target in an image, a feature describing the target must first be chosen. The features currently used for human detection are mainly the Haar-like feature and the HOG feature. The HOG (histogram of oriented gradients) feature represents edges and can therefore describe local shape information. However, it is slow when used for human detection: it involves a large number of floating-point operations and cannot meet real-time requirements. The Haar-like feature is a rectangular feature that was originally used for face detection. It can describe some simple image structures and suits targets with a pronounced structure, such as faces. The human body is non-rigid, so the Haar-like feature describes it only coarsely. Although the feature is fast, its false-alarm rate in human detection is relatively high.

Because of these problems, a new method for human detection is needed, one whose feature keeps the speed advantage of Haar-like detection while describing more complex image structures, thereby reducing the false-alarm rate and improving system stability.

After studying the strengths and weaknesses of the Haar-like feature in depth, a new feature for human detection, called the "Multi-Block feature", is proposed on that basis. The new feature builds on the existing Haar-like feature and the integral image, so both are described first.

The Haar-like feature is a rectangular feature resembling the Haar wavelet. It expresses simple image structures such as edges and line segments well, but it can only describe structures in particular orientations (horizontal, vertical, diagonal). Fig. 1 shows several Haar-like rectangular feature templates: two kinds of two-rectangle features (describing edges), two kinds of three-rectangle features (describing line segments), and one four-rectangle feature (describing diagonals). All rectangular blocks within a template have the same size. The feature value of a Haar-like feature is defined as the pixel sum of the white region minus the pixel sum of the black region. To compute this value quickly, the concept of the "integral image" is introduced.

The "integral image" is an image representation proposed by Viola et al. for rapid feature evaluation, introduced so that these features can be computed quickly at multiple scales. The integral image is generated with a small number of operations per pixel. Once it has been computed, the value of any Haar-like feature at any scale and any position can be obtained in constant time. The value of the integral image at (x, y) is the sum of all pixel values of the original image above and to the left of (x, y), inclusive:

ii(x, y) = \sum_{x' \le x,\ y' \le y} i(x', y')    (1)

where ii(x, y) is the value of the integral image at the point (x, y), and i(x', y') is the gray value of the pixel (x', y') in the original image.

The computation above can be carried out with the following recurrences:

s(x, y) = s(x, y-1) + i(x, y)    (2)

ii(x, y) = ii(x-1, y) + s(x, y)    (3)

where s(x, y) is the cumulative row sum, with s(x, -1) = 0 and ii(-1, y) = 0.

With the integral image, the pixel sum of any rectangular region in the image can be computed very quickly, using only three additions or subtractions.
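The integral image and the three-operation rectangle sum can be sketched as follows. This is a minimal illustration in pure Python, not the patent's implementation. The function names `integral_image` and `rect_sum` are hypothetical, and the extra zero row and column are a convenience assumed here so that the boundary values need no special cases.

```python
def integral_image(img):
    """Build a zero-padded integral image from a 2D list of gray values.

    ii[y + 1][x + 1] holds the sum of img[y'][x'] for all y' <= y, x' <= x.
    Row 0 and column 0 are zero padding (an implementation choice).
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for x in range(w):
        s = 0  # cumulative sum s(x, y) for the current x
        for y in range(h):
            s += img[y][x]                       # s(x, y) = s(x, y-1) + i(x, y)
            ii[y + 1][x + 1] = ii[y + 1][x] + s  # ii(x, y) = ii(x-1, y) + s(x, y)
    return ii


def rect_sum(ii, x, y, w, h):
    """Pixel sum of the w-by-h rectangle with top-left pixel (x, y).

    Uses four table lookups combined with three additions/subtractions.
    """
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]
```

For example, `rect_sum(integral_image(img), 0, 0, 24, 36)` would return the pixel sum of an entire 24×36 sample image.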

Summary of the invention

Technical problem solved by the invention: to address the excessively high false-alarm rate of the Haar-like feature in human detection, a method for detecting humans in a single-frame image is proposed on the basis of the Haar-like feature and the integral image. The method uses a newly proposed feature for describing the human body, called the "Multi-Block feature". Experiments show that, when used for human detection, the Multi-Block feature keeps the fast detection speed and high detection rate of the Haar-like feature while reducing the false-alarm rate by an order of magnitude.

Technical solution of the invention: the Haar-like feature is fast because its structure is simple (it consists of only a few rectangular blocks) and its feature value is cheap to compute. However, it can only describe simple image structures and describes the human body coarsely. To let the feature describe more complex image structures, the feature template was first divided, modeled on the human body, into 18 rectangular blocks, with 9 blocks chosen as the white region and the other 9 as the black region; the feature value is defined as the white-region pixel sum minus the black-region pixel sum. Experiments showed that although the 18-block feature reduces the false-alarm rate in human detection, it also lowers the detection rate, and detection is too slow. The detection rate is low because the sample image is simply partitioned into 18 rectangular regions with no change of scale, so the feature can only describe the overall structure of the human body and cannot effectively describe its local structure. For these reasons, in a sample image of 24×36 pixels, the feature template is instead divided into 12 rectangular blocks, 3 in the row direction and 4 in the column direction. This feature is called the Multi-Block feature, and it is constructed as follows:

(1) From the 12 rectangular blocks, any 6 are chosen as the white region and the remaining 6 as the black region, giving 924 feature templates in total;

(2) The feature value of a template is the pixel sum of the white region minus the pixel sum of the black region; equivalently, it is twice the pixel sum of the white region minus the pixel sum of the whole template;

(3) The feature template may appear at any position in the sample image, provided its size is a multiple of 3 pixels in the row direction and a multiple of 4 pixels in the column direction; for example, the template may be 4×3 pixels, 8×3 pixels, 4×6 pixels, and so on. Each feature template therefore yields 13,860 weak features;

(4) In a sample image of 24×36 pixels, the 924 Multi-Block feature templates together yield more than ten million weak features (924 × 13,860 = 12,806,640).
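The counts in steps (1) to (4) can be checked with a short calculation. The sketch below assumes that the template width is a multiple of 4 pixels and its height a multiple of 3 pixels (three rows of four blocks, matching the example sizes 4×3, 8×3 and 4×6 read as width × height); this is the reading that reproduces the stated 13,860 placements. The helper name `placements` is hypothetical.

```python
from math import comb

SAMPLE_W, SAMPLE_H = 24, 36  # sample image size given in the text
COLS, ROWS = 4, 3            # assumed block layout: 3 rows of 4 blocks


def placements(sample_len, step):
    """Count (size, offset) pairs along one axis.

    Sizes are step, 2*step, ... up to sample_len; a window of a given size
    can start at sample_len - size + 1 different offsets.
    """
    return sum(sample_len - size + 1
               for size in range(step, sample_len + 1, step))


n_templates = comb(12, 6)  # ways to pick 6 white blocks out of 12
n_windows = placements(SAMPLE_W, COLS) * placements(SAMPLE_H, ROWS)
n_weak = n_templates * n_windows

print(n_templates, n_windows, n_weak)  # 924 13860 12806640
```

Under the opposite reading (width a multiple of 3, height a multiple of 4) the same calculation gives 14,076 placements per template, which does not match the figure in the text.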

Unlike the Haar-like feature, the Multi-Block feature stores its parameters in two files, corresponding to the two feature sets obtained during feature extraction. One file stores the 924 Multi-Block feature templates, that is, the numbers of the white blocks among the 12 rectangular blocks, 6 parameters in total. The other file stores the position and size parameters of the feature template in the sample image. The parameters of a Multi-Block feature therefore comprise the weak-classifier threshold, the bias, the position and size of the feature template in the sample image, and the numbers of the white rectangular blocks in the template. Table 1 gives the parameter training results of the Multi-Block feature.

Table 1: Parameter training results of the Multi-Block feature

In Table 1, x and y denote the ordinate and abscissa of the rectangular feature template in the sample image, respectively, so 1 ≤ x ≤ 36 and 1 ≤ y ≤ 24; x_scale and y_scale denote the size of the template in the x and y directions, respectively, that is, the feature template size.

Compared with the prior art, the invention has the following advantages: the Multi-Block feature can describe more complex image structures, provides many kinds of weak features for training, and offers fast detection and a low false-alarm rate in human detection.
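As an illustration of how these parameters could be turned into a feature value, the sketch below evaluates one Multi-Block feature from its white-block numbers, position and size, using the form "twice the white sum minus the whole-template sum". It is a sketch under stated assumptions rather than the patented implementation: the function names are hypothetical, the blocks are assumed to be numbered row by row (1-4, 5-8, 9-12), and positions are given as 0-based (column, row) offsets instead of the 1-based convention of Table 1.

```python
def integral_image(img):
    """Zero-padded integral image of a 2D list of gray values."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row = 0
        for x in range(w):
            row += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row
    return ii


def rect_sum(ii, x, y, w, h):
    """Pixel sum of the w-by-h rectangle whose top-left pixel is (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def multi_block_value(ii, white, x, y, width, height, cols=4, rows=3):
    """Feature value = 2 * (white-region sum) - (whole-template sum).

    white  : the 6 white block numbers, 1..12, numbered row by row (assumed)
    x, y   : 0-based column and row of the template's top-left pixel
    width  : template width, a multiple of cols
    height : template height, a multiple of rows
    """
    bw, bh = width // cols, height // rows  # size of one block
    white_sum = 0
    for n in white:
        r, c = divmod(n - 1, cols)          # block row and column
        white_sum += rect_sum(ii, x + c * bw, y + r * bh, bw, bh)
    return 2 * white_sum - rect_sum(ii, x, y, width, height)
```

Because the black region is the whole template minus the white region, this gives the same value as summing the white and black blocks separately, while touching only 7 rectangles instead of 12.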

Description of the drawings

Fig. 1 shows several Haar-like features;

Fig. 2 shows one Multi-Block feature template;

Fig. 3 compares the human detection results of the Haar-like feature and the Multi-Block feature;

Fig. 4 is a schematic diagram of the training of the Multi-Block feature.

Embodiment

Specific embodiments of the invention are described below.

As the construction process shows, the Multi-Block feature yields more than ten million weak features in a sample image of 24×36 pixels. Training directly on all of them raises two problems: training is very time-consuming, and computer memory is insufficient. To solve both problems, two random numbers are generated during training. The training process for the Multi-Block feature is as follows:

(1) A random number is generated to select a Multi-Block feature template, that is, one of the 924 templates; its main parameters are the numbers of the 6 white blocks among the 12 rectangular regions. The 12 rectangular regions are numbered from top to bottom and from left to right (1-4, 5-8, 9-12);

(2) A second random number is generated to select the position and size of the feature template in the sample image;

(3) Steps (1) and (2) are repeated to select weak features, whose main parameters are the numbers of the white blocks in the template and the position and size of the template in the sample image, until the number of weak features reaches the preset feature-set size;

(4) The AdaBoost algorithm is trained on the weak feature set obtained in the first three steps: the weak feature with the best classification performance is selected and added to the strong classifier as a linearly weighted term, until the false-alarm rate and detection rate of the strong classifier meet the requirements;

(5) Each time a stage of the strong classifier is trained, the two random numbers are used again to generate the weak feature set for training;

(6) Strong classifiers continue to be trained until the false-alarm rate falls below the set value.
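Steps (1) to (3) can be sketched as a random draw from the template set and the placement set. This is an illustrative sketch, not the patent's code: the names `TEMPLATES`, `WINDOWS` and `sample_weak_features` are hypothetical, the placement rule (width a multiple of 4, height a multiple of 3) is the reading that yields 13,860 placements, and discarding duplicate draws is an added assumption that the text does not specify.

```python
import random
from itertools import combinations

SAMPLE_W, SAMPLE_H = 24, 36

# All 924 ways to choose the 6 white blocks among blocks 1..12.
TEMPLATES = list(combinations(range(1, 13), 6))

# All (x, y, width, height) placements of the template in the sample image.
WINDOWS = [
    (x, y, w, h)
    for w in range(4, SAMPLE_W + 1, 4)
    for h in range(3, SAMPLE_H + 1, 3)
    for x in range(SAMPLE_W - w + 1)
    for y in range(SAMPLE_H - h + 1)
]


def sample_weak_features(n, seed=None):
    """Draw n distinct weak features using two random numbers per draw."""
    rng = random.Random(seed)
    chosen = set()
    while len(chosen) < n:
        t = rng.randrange(len(TEMPLATES))  # first random number: template
        p = rng.randrange(len(WINDOWS))    # second: position and size
        chosen.add((t, p))                 # duplicates are dropped (assumption)
    return [(TEMPLATES[t], WINDOWS[p]) for t, p in sorted(chosen)]
```

A fresh call such as `sample_weak_features(5000)` would then be made for each stage of the cascade, in line with step (5); the set size 5000 is only a placeholder.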

Fig. 4 shows a schematic diagram of the training of the Multi-Block feature. In the experiments, training produced a cascade classifier with 16 stages. The trained cascade classifier is then used to detect humans in the input image; the details are not repeated here. The Multi-Block feature and the Haar-like feature are compared below in three respects: stability, diversity, and real-time performance.

Feature stability means that the feature achieves a high detection rate and a low false-alarm rate in human detection. Table 2 compares the Haar-like feature and the Multi-Block feature in terms of detection time, false-alarm rate, and the smallest detectable human target. As Table 2 shows, the smallest human target the Multi-Block feature can detect is the sample image size (24×36 pixels), whereas the smallest the Haar-like feature can detect is 36×54 pixels, and the Multi-Block feature reduces the false-alarm rate in human detection by an order of magnitude relative to the Haar-like feature. The detection time of the Multi-Block feature is twice that of the Haar-like feature, but overall the two are comparable, and the more complex the scene background, the smaller the difference; this is discussed in detail below. Fig. 3 compares the human detection results of the Haar-like feature with those of the Multi-Block feature. Figures 3(a) and 3(b) were captured with an infrared-illumination camera, and the other two pictures are from the INRIA database.

Table 2: Feature comparison

Feature	False alarm rate	Detection time	Minimum detectable target
Haar-like feature	5×10^-4	80 ms	36×54 pixels
Multi-Block feature	2×10^-5	160 ms	24×36 pixels

Feature diversity means that the Multi-Block feature yields more than ten million weak features in a 24×36-pixel sample image and can describe more complex image structures, whereas the five Haar-like rectangular feature templates shown in Fig. 1 yield 360,000 weak features in a sample image of the same size. For a 24×36-pixel sample image, the Multi-Block feature consists of 12 rectangular blocks; any 6 are chosen as the white region and the remaining 6 as the black region, giving 924 Multi-Block feature templates in total. Each template may appear at any position in the 24×36-pixel sample image, with a size that is a multiple of 3 pixels in the row direction and a multiple of 4 pixels in the column direction, which gives 13,860 weak features per template. Combining the two yields more than ten million weak features (weak classifiers). Fig. 2 shows one Multi-Block feature template. The collected positive and negative samples and a machine learning algorithm are then used to select the weak features with the best classification performance from the pool of weak classifiers. Because the Multi-Block feature consists of 12 rectangular blocks whose white and black regions can be chosen freely, it offers many more kinds of weak classifiers to choose from and can therefore describe more complex image structures.

Real-time performance means that the feature value is cheap to compute, so that the real-time requirements of the system can be met in human detection. The Multi-Block feature value is computed in a similar way to the Haar-like feature value: it is the pixel sum of the 6 white rectangular blocks minus the pixel sum of the 6 black rectangular blocks. The computation is very fast, which underpins the real-time performance of the final detector. Take the four-rectangle Haar-like feature: the pixel sum of each rectangular block needs 3 additions or subtractions, so 4 blocks need 12, and the feature value of a four-rectangle Haar-like feature takes 15 additions or subtractions in total. The Multi-Block feature consists of 12 rectangular blocks, 6 white and 6 black, and its feature value can be rewritten as twice the white-region pixel sum minus the pixel sum of the whole template. The pixel sums of the 6 white blocks need 18 additions or subtractions and the pixel sum of the whole template needs 3, so the Multi-Block feature value takes 28 additions or subtractions and one multiplication.

In addition, when both features are used for human detection, the more complex the background, the smaller the difference in detection time. Because the Multi-Block feature can describe more complex image structures, its rejection capability in human detection is stronger than that of the Haar-like feature. With a complex background, the first few strong classifiers of a cascade trained on Multi-Block features reject most non-human windows. The Haar-like feature can only describe relatively simple image structures, so the first few stages of its cascade reject fewer non-human windows, leaving more windows for the subsequent strong classifiers to process.
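The early-rejection argument can be illustrated with a minimal cascade sketch. The weak-classifier form (feature id, threshold, polarity, weight), the per-stage threshold and all function names are assumptions in the style of a standard boosted cascade; they are not parameters taken from the patent. The point of the sketch is only that a window rejected by an early stage never reaches the later ones.

```python
def stage_score(stage, value_of):
    """Sum the weights of the weak classifiers that vote 'human'.

    stage["weak"] is a list of (feature_id, threshold, polarity, alpha)
    tuples; value_of(feature_id) returns that feature's value for the
    current window. The decision rule below is a hypothetical choice.
    """
    score = 0.0
    for feature_id, threshold, polarity, alpha in stage["weak"]:
        if polarity * value_of(feature_id) < polarity * threshold:
            score += alpha
    return score


def cascade_accepts(stages, value_of):
    """Return (accepted, stages_evaluated) for one detection window.

    Evaluation stops at the first stage whose score is below its threshold,
    so windows rejected early cost very little.
    """
    for n, stage in enumerate(stages, start=1):
        if stage_score(stage, value_of) < stage["threshold"]:
            return False, n
    return True, len(stages)
```

With this structure, the total detection time depends mainly on how many windows survive the first few stages, which is why stronger early rejection narrows the gap between a more expensive feature and a cheaper one.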

Claims

1. A method for detecting a human body in a single-frame image, characterized in that the Multi-Block feature is used to detect the human body. Specifically, a classifier is first trained offline on collected positive and negative samples, and the trained classifier is then used to detect the input image. Taking the Multi-Block feature as the object of study and considering the two factors of detection speed and sample image size, the Multi-Block feature is defined by dividing a rectangle into 12 rectangular blocks of equal size, of which 6 are the white region and 6 are the black region; the feature value is defined as the pixel sum of the white region minus the pixel sum of the black region;

The training process for the Multi-Block feature is as follows:

(1) A random number is generated to select a Multi-Block feature template, that is, one of the 924 templates; its parameters are the numbers of the 6 white blocks among the 12 rectangular regions. The 12 rectangular regions are numbered from top to bottom and from left to right as 1-4, 5-8, 9-12;

(3) Steps (1) and (2) are repeated to select weak features, whose parameters comprise the numbers of the white blocks in the Multi-Block feature template and the position and size of the template in the sample image, until the number of weak features reaches the preset feature-set size;

2. The method for detecting a human body in a single-frame image according to claim 1, characterized in that the 12 rectangular blocks are arranged with 3 blocks in the row direction and 4 blocks in the column direction, giving 924 Multi-Block feature templates.

3. The method for detecting a human body in a single-frame image according to claim 1, characterized in that each Multi-Block feature template may appear at any position in the sample image, with a size that is a multiple of 3 pixels in the row direction and a multiple of 4 pixels in the column direction; for example, the size of the Multi-Block feature is 4×3 pixels, 8×3 pixels or 4×6 pixels.

4. The method for detecting a human body in a single-frame image according to claim 1, characterized in that each Multi-Block feature template yields 13,860 weak features in a sample image of 24×36 pixels, so that the 924 feature templates yield more than ten million weak features.

5. The method for detecting a human body in a single-frame image according to claim 1, characterized in that two random numbers are used to choose a number of weak features from the more than ten million weak features as the training feature set.