CN101477626A - Method for detecting human head and shoulder in video of complicated scene - Google Patents

Publication number
CN101477626A
Authority
CN
China
Prior art keywords
window
gradient
vector
picture
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CNA200910077108XA
Other languages
Chinese (zh)
Other versions
CN101477626B (en)
Inventor
孙立峰
丁锡锋
徐辉
崔鹏
杨士强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2009-01-16
Filing date
2009-01-16
Publication date
2009-07-08
Application filed by Tsinghua University
Priority to CN200910077108XA
Publication of CN101477626A
Application granted
Publication of CN101477626B
Legal status: Active
Anticipated expiration

Abstract

The invention relates to a method for detecting the head and shoulders of a human body in video of complex scenes, and belongs to the technical field of computer information mining. The method comprises the following steps: manually marking head-shoulder pictures, background pictures and pictures of other body parts in the frames of a video as positive and negative samples, and mirroring the pictures; extracting the gradient vectors of the positive and negative samples and training a first classifier; selecting the head-shoulder pictures as the new positive samples and the pictures of other body parts as the new negative samples; extracting the gradient vectors of the new positive and negative samples and training a second classifier; determining the position and size of a window to be detected in a frame of the video to be detected, and extracting its gradient vector; classifying the gradient vector with the first classifier, and ending detection of the window if the result is negative; classifying the gradient vector with the second classifier if the first result is positive, and determining that the window contains a head and shoulders if the second result is positive; and changing the position and size of the window to detect new windows. The method can improve both accuracy and detection speed.

Description

A method for detecting human head and shoulders in video of complex scenes
Technical field
The invention belongs to the technical field of computer information mining, and in particular relates to a method for detecting the head and shoulders of a human body in pictures of complex scenes, especially the head and shoulders of pedestrians in frames of real-world surveillance video.
Background art
In recent years, detecting human bodies in video has been a popular research direction in computer video analysis. Among the various human detection methods, detecting a human body by detecting individual body parts is an important auxiliary approach. Of these body parts, the head-shoulder region is a highly salient feature. Human bodies in video are often partially occluded, which makes detection difficult, yet in such cases the head and shoulders still have a high probability of being detected, so head-shoulder detection is a useful aid to human detection. In addition, in video event detection, actions near a person's head and shoulders often carry implicit event information, such as waving or making a phone call. Head-shoulder detection against complex backgrounds is therefore of great significance.
Head-shoulder detection is a form of object detection. Object detection methods fall into two classes: the first extracts or segments the background and takes the separated foreground objects as the detection result; the second searches for the object directly in the image. In video, background extraction can only be applied to a stationary camera, and it has great difficulty detecting motionless objects in the scene, which limits its range of application. Direct search in the image is therefore generally adopted now. These methods generally use a classifier to classify candidates according to object features. Object features are the characteristic information that the object itself contains, such as the color histogram, texture and gradient of the object region. After the features are extracted, the classifier determines the class of the object from them. The classifier most widely used internationally at present is the support vector machine (hereinafter SVM), but a single-stage SVM classifier classifies only once and its accuracy is often not high.
Summary of the invention
The purpose of the invention is to overcome the shortcomings of the prior art by proposing a method for detecting human head and shoulders in video of complex scenes, using the gradient orientation histogram as the feature that describes the object. Using a two-stage SVM as the classifier can improve accuracy while also improving detection speed.
The invention takes a number of head-shoulder pictures and background pictures as the positive and negative sample sets and trains an SVM as the first-stage classifier. It then takes head-shoulder pictures and pictures of non-head-shoulder body regions as positive and negative samples and trains an SVM as the second-stage classifier. Together these form a two-stage cascade classifier. A detection region is passed through the two SVM stages in turn, and the outcome is taken as the final result.
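As a rough illustration of how the two training sets differ, the following Python sketch pairs samples with labels for each stage. The function name `build_training_sets` and the +1/-1 label convention are assumptions made for this example; the patent does not prescribe them.

```python
def build_training_sets(head_shoulder, background, other_body):
    """Return (stage1, stage2) as lists of (sample, label) pairs.

    Stage 1: head-shoulder (+1) versus background (-1).
    Stage 2: head-shoulder (+1) versus other body parts (-1).
    Samples may be pictures or already-extracted gradient vectors.
    """
    positives = [(s, +1) for s in head_shoulder]
    stage1 = positives + [(s, -1) for s in background]
    stage2 = positives + [(s, -1) for s in other_body]
    return stage1, stage2
```

Both stages share the same positive samples; only the negative set changes, which is what lets the second stage specialise in rejecting body regions that the first stage lets through.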
The SVM used in the invention is LibSVM, a classifier currently popular internationally; it is used without modification.
The method for detecting human head and shoulders in video of complex scenes proposed by the invention mainly comprises the following steps:
(1) Select one video from a class of videos to be detected. From the frames of this video, manually mark a number (at least 1000) of head-shoulder pictures, a number (at least 1000) of background pictures and a number (at least 1000) of pictures of other body parts, where the side length of each picture is required to be at least 1 centimetre. Take the head-shoulder pictures as positive sample pictures and the background pictures as negative sample pictures;
(2) Mirror the positive and negative sample pictures left to right to increase the number of samples;
(3) Extract the gradient orientation histogram of each positive and negative sample picture and convert it into vector form, as the gradient vector of that sample picture;
(4) Train the first-stage support vector machine (SVM) with the gradient vectors extracted from the positive and negative samples, generating a first-stage classification model;
(5) Take the head-shoulder pictures as the new positive samples, and replace the background pictures with the pictures of other body parts as the new negative samples;
(6) Extract the gradient orientation histogram of each new positive and negative sample picture and convert it into a 1-by-N vector, where N is a positive integer, as the gradient vector of the new sample;
(7) Train the second-stage support vector machine (SVM) with the gradient vectors extracted from the new positive and negative samples, generating a second-stage classification model;
(8) Read in a video to be detected and extract one frame image from it;
(9) Determine the position and size of a window to be detected on this frame, extract the gradient orientation histogram of the window by the method of step (3), and obtain the gradient vector of the window;
(10) Classify this gradient vector with the first-stage classifier. If the result is negative (the window does not contain a head-shoulder image), end detection of this window and go to step (11). If the first-stage result is positive (the first-stage classifier judges that the window contains a head-shoulder image), classify the gradient vector with the second-stage classifier. If the second-stage result is negative, go to step (11); if it is positive, the window is confirmed to contain a head and shoulders, and the window coordinates are saved as the detection result for this window;
(11) Change the position and size of the window, extract the gradient orientation histogram of the new window by the method of step (3) to obtain its gradient vector, and return to step (10) to classify it, finally obtaining a detection result for every window.
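Steps (2) and (10) can be sketched as follows. This is a minimal illustration rather than the patented implementation: images are assumed to be lists of pixel rows, and the two classifiers are assumed to be any callables that return True for a positive result (for example, thin wrappers around trained SVM models). All names are hypothetical.

```python
from typing import Callable, List, Sequence

Image = List[List[float]]                        # rows of brightness values
Classifier = Callable[[Sequence[float]], bool]   # True means "positive"


def mirror_left_right(image: Image) -> Image:
    """Step (2): flip a sample picture horizontally."""
    return [list(reversed(row)) for row in image]


def augment(samples: List[Image]) -> List[Image]:
    """Return the original samples followed by their mirror images."""
    return samples + [mirror_left_right(s) for s in samples]


def cascade_classify(vector: Sequence[float],
                     stage1: Classifier,
                     stage2: Classifier) -> bool:
    """Step (10): run the second stage only if the first stage accepts."""
    if not stage1(vector):
        return False          # rejected early; stage 2 is never evaluated
    return stage2(vector)
```

Because most windows in a frame are background, the early return means the second classifier is evaluated for only a fraction of the windows, which is where the speed benefit of a cascade comes from.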
Step (3) of the above method specifically comprises the following steps:
(31) Treat each sample picture as a window and divide the window into M x N blocks, with 30%-50% overlap between blocks, where M and N are positive integers;
(32) Divide each block equally into a plurality of cells;
(33) Compute the gradient direction and magnitude of each pixel in each cell;
(34) Accumulate the gradients of the pixels in each cell into a histogram by direction, and convert the histogram into vector form, as the gradient vector of the cell;
(35) Concatenate the gradient vectors of the cells into one long vector, as the gradient vector of the block;
(36) Concatenate the gradient vectors of the blocks in the window into one long vector, as the gradient vector of the window;
(37) Normalize the gradient vector of the window, as the gradient vector of the sample picture.
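Steps (31) to (37) can be sketched in plain Python as below. Several details are assumptions made for the sake of a runnable example and are not stated in the text: 2 x 2 cells per block, 9 unsigned orientation bins over [0, pi), border pixels clamped when taking differences, and an L2 norm for step (37). The function names are hypothetical.

```python
import math


def gradient(img, x, y):
    """Central-difference gradient at (x, y); borders are clamped (assumption)."""
    h, w = len(img), len(img[0])

    def px(xx, yy):
        return img[min(max(yy, 0), h - 1)][min(max(xx, 0), w - 1)]

    dx = px(x + 1, y) - px(x - 1, y)
    dy = px(x, y + 1) - px(x, y - 1)
    # Magnitude, and unsigned orientation folded into [0, pi).
    return math.hypot(dx, dy), math.atan2(dy, dx) % math.pi


def cell_histogram(img, x0, y0, cw, ch, bins):
    """Step (34): accumulate gradient magnitudes into orientation bins."""
    hist = [0.0] * bins
    for y in range(y0, y0 + ch):
        for x in range(x0, x0 + cw):
            mag, ang = gradient(img, x, y)
            b = min(int(ang / (math.pi / bins)), bins - 1)
            hist[b] += mag
    return hist


def block_layout(length, count, overlap):
    """Step (31): block size and start offsets along one axis."""
    size = int(length / (1 + (count - 1) * (1 - overlap)))
    starts = [int(round(i * size * (1 - overlap))) for i in range(count)]
    return size, starts


def hog_descriptor(img, blocks=(3, 3), overlap=0.5, cells=2, bins=9):
    """Steps (31)-(37): window -> blocks -> cells -> normalized vector."""
    h, w = len(img), len(img[0])
    bw, xs = block_layout(w, blocks[0], overlap)
    bh, ys = block_layout(h, blocks[1], overlap)
    cw, ch = bw // cells, bh // cells
    vec = []
    for by in ys:
        for bx in xs:
            for cy in range(cells):          # steps (32), (35)
                for cx in range(cells):
                    vec.extend(cell_histogram(
                        img, bx + cx * cw, by + cy * ch, cw, ch, bins))
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0   # step (37)
    return [v / norm for v in vec]
```

With the default arguments, a 16 x 16 window yields 3 x 3 blocks of 8 x 8 pixels at a stride of 4 pixels, and the resulting vector has 9 x 4 x 9 = 324 entries.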
Step (11) of the above method, changing the position and size of the window and classifying the window, specifically comprises the following steps:
(111) Move the coordinates of the window while keeping 30%-80% overlap, and classify the window;
(112) Change the size of the window (the range of sizes is determined by the size of the head and shoulders in the video), move the window position in turn, and classify the window.
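The window scan of steps (111) and (112) can be sketched as a generator. Square windows, a single overlap fraction shared by both axes, and the name `sliding_windows` are assumptions of this example; the text only fixes the 30%-80% overlap range and leaves the set of sizes to be chosen from the head-shoulder sizes seen in the video.

```python
def sliding_windows(frame_w, frame_h, sizes, overlap=0.5):
    """Yield (x, y, size) for every window position at every size.

    overlap is the fraction shared by neighbouring windows,
    expected to lie between 0.3 and 0.8.
    """
    for size in sizes:                                   # step (112)
        step = max(1, round(size * (1.0 - overlap)))
        for y in range(0, frame_h - size + 1, step):     # step (111)
            for x in range(0, frame_w - size + 1, step):
                yield (x, y, size)
```

For example, `list(sliding_windows(8, 4, [4], 0.5))` visits the three windows whose left edges are at x = 0, 2 and 4. A larger overlap gives a denser scan at the cost of more windows to classify.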
Features and effects of the invention:
The method proposed by the invention is used to detect the head and shoulders of people in real-world surveillance video. The gradient orientation histogram is chosen as the feature representation of the head and shoulders. Gradient orientation histograms have been used in object detection in recent years; representing the head-shoulder image with them preserves the edge orientation features of the object, and experiments have shown that they are a robust feature that can improve detection performance. In addition, two support vector machines trained on different samples are combined into a cascade classifier. During detection, the first-stage classifier removes regions that clearly do not contain a head and shoulders, and only windows that pass the first stage are examined by the second-stage classifier. Using the two-stage classifier in this way can improve accuracy while also improving detection speed. Detecting people's heads and shoulders in video can be used for human tracking and event detection, and is of great significance for automated surveillance.
Embodiment
The method for detecting human head and shoulders in video of complex scenes proposed by the invention is described in detail below with reference to an embodiment:
Detecting human head and shoulders in video is, in essence, still detection on pictures. This embodiment uses direct search on the picture and specifically comprises the following steps:
(1) Select one video from a class of videos to be detected. From the frames of this video, manually mark a number (at least 1000) of head-shoulder pictures, a number (at least 1000) of background pictures and a number (at least 1000) of pictures of other body parts, where the side length of each picture is required to be at least 1 centimetre. Take the head-shoulder pictures as positive sample pictures and the background pictures as negative sample pictures;
(2) Mirror the positive and negative sample pictures left to right to increase the number of samples;
(3) Extract the gradient orientation histogram of each positive and negative sample picture and convert it into vector form, as the gradient vector of that sample picture. This specifically comprises:
(31) Treat each picture as a window and divide the window into 3 x 3 small blocks, with 50% overlap between the blocks;
(32) Divide each small block equally into four small cells;
(33) Compute the gradient direction and magnitude of each pixel in each cell. In this embodiment the gradient is computed with the template [-1, 0, 1]. The direction is:
$$\mathrm{grad}(x, y) = \arctan \frac{I(x, y+1) - I(x, y-1)}{I(x+1, y) - I(x-1, y)}$$
The magnitude is:
$$\mathrm{value}(x, y) = \sqrt{[I(x+1, y) - I(x-1, y)]^2 + [I(x, y+1) - I(x, y-1)]^2}$$
where grad(x, y) is the gradient direction of the pixel, I(x, y) is the brightness value of the pixel, and value(x, y) is the gradient magnitude of the pixel;
(34) Accumulate the gradient values of the pixels in each cell into a histogram according to gradient direction, and store it in vector form as the gradient vector of the cell;
(35) Concatenate the gradient vectors of the cells, so that each small block is represented by a normalized vector of 4 x 9 = 36 dimensions, as the gradient vector of the block;
(36) Concatenate the gradient vectors of the blocks, so that the gradient orientation histogram of each window is represented by a vector of 9 x 36 = 324 dimensions, as the gradient vector of the window;
(4) Train the first-stage support vector machine (LibSVM) with the gradient vectors extracted from the positive and negative samples, generating a first-stage classification model;
(5) Take the head-shoulder pictures as the new positive samples, and replace the background pictures with the pictures of other body parts as the new negative samples;
(6) Extract the gradient orientation histogram of each new positive and negative sample picture and convert it into a 1-by-324 vector, as the gradient vector of the new sample;
(7) Train the second-stage support vector machine (LibSVM) with the gradient vectors extracted from the new positive and negative samples, generating a second-stage classification model;
(8) Call the OpenCV library to read in a video to be detected and parse out a frame of the video;
(9) Determine the position and size of a window to be detected on this frame, extract the gradient orientation histogram of the window by the method of steps (31) to (37), and obtain the gradient vector of the window;
(10) Classify this gradient vector with the first-stage LibSVM classifier. If the result is negative (the window does not contain a head-shoulder image), end detection of this window and go to step (11). If the first-stage result is positive (the first-stage classifier judges that the window contains a head-shoulder image), classify the gradient vector with the second-stage LibSVM classifier. If the second-stage result is negative, go to step (11); if it is positive, the window is confirmed to contain a head and shoulders, and the window coordinates are saved as the detection result for this window;
(11) Move the coordinates of the window while keeping 30%-80% overlap, and classify the window to be detected;
(12) Change the size of the window (the range of sizes is determined by the size of the head and shoulders in the video), move the window position in turn, and classify the window, finally obtaining a detection result for every window.
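The per-pixel gradient of step (33) and the vector sizes of steps (35) and (36) can be checked with a short Python sketch. The row-major indexing `I[y][x]`, the handling of a zero horizontal difference (where the arctangent of the ratio is undefined), and the function names are assumptions of this example, not details given in the embodiment.

```python
import math


def pixel_gradient(I, x, y):
    """Gradient direction and magnitude at an interior pixel (x, y).

    I is a grayscale image stored as rows, so I[y][x] is the brightness.
    """
    dx = I[y][x + 1] - I[y][x - 1]      # horizontal difference
    dy = I[y + 1][x] - I[y - 1][x]      # vertical difference
    if dx != 0:
        direction = math.atan(dy / dx)
    else:
        # Assumption: treat a purely vertical gradient as pi/2, and a
        # zero gradient as direction 0 (its magnitude is 0 anyway).
        direction = math.pi / 2 if dy != 0 else 0.0
    value = math.sqrt(dx * dx + dy * dy)
    return direction, value


def feature_length(blocks_x, blocks_y, cells_per_block, bins):
    """Length of the window vector after concatenating all blocks."""
    return blocks_x * blocks_y * cells_per_block * bins
```

With 3 x 3 blocks, four cells per block and the nine orientation bins implied by the 4 x 9 = 36 figure, `feature_length(3, 3, 4, 9)` gives 324, matching the 1-by-324 vector used to train both stages.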

Claims (3)

1. A method for detecting human head and shoulders in video of complex scenes, characterized in that it mainly comprises the following steps:
(1) Select one video from a class of videos to be detected. From the frames of this video, manually mark a number of head-shoulder pictures, a number of background pictures and a number of pictures of other body parts, where the side length of each picture is required to be at least 1 centimetre. Take the head-shoulder pictures as positive sample pictures and the background pictures as negative sample pictures;
(2) Mirror the positive and negative sample pictures left to right to increase the number of samples;
(3) Extract the gradient orientation histogram of each positive and negative sample picture and convert it into vector form, as the gradient vector of that sample picture;
(4) Train the first-stage support vector machine with the gradient vectors extracted from the positive and negative samples, generating a first-stage classification model;
(5) Take the head-shoulder pictures as the new positive samples, and replace the background pictures with the pictures of other body parts as the new negative samples;
(6) Extract the gradient orientation histogram of each new positive and negative sample picture and convert it into a 1-by-N vector, where N is a positive integer, as the gradient vector of the new sample;
(7) Train the second-stage support vector machine with the gradient vectors extracted from the new positive and negative samples, generating a second-stage classification model;
(8) Read in a video to be detected and extract one frame image from it;
(9) Determine the position and size of a window to be detected on this frame, extract the gradient orientation histogram of the window by the method of step (3), and obtain the gradient vector of the window;
(10) Classify this gradient vector with the first-stage classifier. If the result is negative, end detection of this window and go to step (11). If the first-stage result is positive, classify the gradient vector with the second-stage classifier. If the second-stage result is negative, go to step (11); if it is positive, the window is confirmed to contain a head and shoulders, and the window coordinates are saved as the detection result for this window;
(11) Change the position and size of the window, extract the gradient orientation histogram of the new window by the method of step (3) to obtain its gradient vector, and return to step (10) to classify it, finally obtaining a detection result for every window.
2. The method according to claim 1, characterized in that step (3) of the method specifically comprises the following steps:
(31) Treat each sample picture as a window and divide the window into M x N blocks, with 30%-50% overlap between blocks, where M and N are positive integers;
(32) Divide each block equally into a plurality of cells;
(33) Compute the gradient direction and magnitude of each pixel in each cell;
(34) Accumulate the gradients of the pixels in each cell into a histogram by direction, and convert the histogram into vector form, as the gradient vector of the cell;
(35) Concatenate the gradient vectors of the cells into one long vector, as the gradient vector of the block;
(36) Concatenate the gradient vectors of the blocks in the window into one long vector, as the gradient vector of the window;
(37) Normalize the gradient vector of the window, as the gradient vector of the sample picture.
3. The method according to claim 1, characterized in that step (11) of the method, changing the position and size of the window and classifying the window, specifically comprises the following steps:
(111) Move the coordinates of the window while keeping 30%-80% overlap, and classify the window;
(112) Change the size of the window, move the window position in turn, and classify the window.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN200910077108XA CN101477626B (en) 2009-01-16 2009-01-16 Method for detecting human head and shoulder in video of complicated scene

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN200910077108XA CN101477626B (en) 2009-01-16 2009-01-16 Method for detecting human head and shoulder in video of complicated scene

Publications (2)

Publication Number Publication Date
CN101477626A true CN101477626A (en) 2009-07-08
CN101477626B CN101477626B (en) 2010-08-25

Family

ID=40838335

Family Applications (1)

Application Number Title Priority Date Filing Date
CN200910077108XA Active CN101477626B (en) 2009-01-16 2009-01-16 Method for detecting human head and shoulder in video of complicated scene

Country Status (1)

Country Link
CN (1) CN101477626B (en)

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102243706A (en) * 2011-08-18 2011-11-16 杭州海康威视软件有限公司 Target classification method and system based on target edge direction
CN102567998A (en) * 2012-01-06 2012-07-11 西安理工大学 Head-shoulder sequence image segmentation method based on double-pattern matching and edge thinning
CN102831442A (en) * 2011-06-13 2012-12-19 索尼公司 Abnormal behavior detection method and equipment and method and equipment for generating abnormal behavior detection equipment
CN103021059A (en) * 2012-12-12 2013-04-03 天津大学 Video-monitoring-based public transport passenger flow counting method
CN103310194A (en) * 2013-06-07 2013-09-18 太原理工大学 Method for detecting head and shoulders of pedestrian in video based on overhead pixel gradient direction
CN103632146A (en) * 2013-12-05 2014-03-12 南京理工大学 Head-shoulder distance based human body detection method
CN103971135A (en) * 2014-05-05 2014-08-06 中国民航大学 Human body target detection method based on head and shoulder depth information features
CN104021605A (en) * 2014-04-16 2014-09-03 湖州朗讯信息科技有限公司 Real-time statistics system and method for public transport passenger flow
CN104268593A (en) * 2014-09-22 2015-01-07 华东交通大学 Multiple-sparse-representation face recognition method for solving small sample size problem
CN104881675A (en) * 2015-05-04 2015-09-02 北京奇艺世纪科技有限公司 Video scene identification method and apparatus
CN105809183A (en) * 2014-12-31 2016-07-27 深圳中兴力维技术有限公司 Video-based human head tracking method and device thereof
CN107103279A (en) * 2017-03-09 2017-08-29 广东顺德中山大学卡内基梅隆大学国际联合研究院 A kind of passenger flow counting method under vertical angle of view based on deep learning
CN108021925A (en) * 2016-10-31 2018-05-11 北京君正集成电路股份有限公司 A kind of detection method and equipment
CN108388883A (en) * 2018-03-16 2018-08-10 广西师范大学 A kind of video demographic method based on HOG+SVM

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9628755B2 (en) * 2010-10-14 2017-04-18 Microsoft Technology Licensing, Llc Automatically tracking user movement in a video chat application
CN102982598B (en) * 2012-11-14 2015-05-20 三峡大学 Video people counting method and system based on single camera scene configuration

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102831442A (en) * 2011-06-13 2012-12-19 索尼公司 Abnormal behavior detection method and equipment and method and equipment for generating abnormal behavior detection equipment
CN102243706B (en) * 2011-08-18 2014-12-10 杭州海康威视数字技术股份有限公司 Target classification method and system based on target edge direction
CN102243706A (en) * 2011-08-18 2011-11-16 杭州海康威视软件有限公司 Target classification method and system based on target edge direction
CN102567998A (en) * 2012-01-06 2012-07-11 西安理工大学 Head-shoulder sequence image segmentation method based on double-pattern matching and edge thinning
CN103021059A (en) * 2012-12-12 2013-04-03 天津大学 Video-monitoring-based public transport passenger flow counting method
CN103310194A (en) * 2013-06-07 2013-09-18 太原理工大学 Method for detecting head and shoulders of pedestrian in video based on overhead pixel gradient direction
CN103310194B (en) * 2013-06-07 2016-05-25 太原理工大学 Pedestrian based on crown pixel gradient direction in a video shoulder detection method
CN103632146A (en) * 2013-12-05 2014-03-12 南京理工大学 Head-shoulder distance based human body detection method
CN103632146B (en) * 2013-12-05 2017-01-04 南京理工大学 A kind of based on head and shoulder away from human body detecting method
CN104021605A (en) * 2014-04-16 2014-09-03 湖州朗讯信息科技有限公司 Real-time statistics system and method for public transport passenger flow
CN103971135A (en) * 2014-05-05 2014-08-06 中国民航大学 Human body target detection method based on head and shoulder depth information features
CN104268593B (en) * 2014-09-22 2017-10-17 华东交通大学 The face identification method of many rarefaction representations under a kind of Small Sample Size
CN104268593A (en) * 2014-09-22 2015-01-07 华东交通大学 Multiple-sparse-representation face recognition method for solving small sample size problem
CN105809183A (en) * 2014-12-31 2016-07-27 深圳中兴力维技术有限公司 Video-based human head tracking method and device thereof
CN104881675A (en) * 2015-05-04 2015-09-02 北京奇艺世纪科技有限公司 Video scene identification method and apparatus
CN108021925A (en) * 2016-10-31 2018-05-11 北京君正集成电路股份有限公司 A kind of detection method and equipment
CN107103279A (en) * 2017-03-09 2017-08-29 广东顺德中山大学卡内基梅隆大学国际联合研究院 A kind of passenger flow counting method under vertical angle of view based on deep learning
CN107103279B (en) * 2017-03-09 2020-06-05 广东顺德中山大学卡内基梅隆大学国际联合研究院 Passenger flow counting method based on deep learning under vertical visual angle
CN108388883A (en) * 2018-03-16 2018-08-10 广西师范大学 A kind of video demographic method based on HOG+SVM

Also Published As

Publication number Publication date
CN101477626B (en) 2010-08-25

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant