CN105138979A - Method for detecting the head of moving human body based on stereo visual sense - Google Patents

Info

Publication number
CN105138979A
Authority
CN
China
Prior art keywords
head
image
gray
sigma
video camera
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510512540.2A
Other languages
Chinese (zh)
Inventor
孙爱娟
顾国华
周玉蛟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Science and Technology
Original Assignee
Nanjing University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Science and Technology
Priority to CN201510512540.2A
Publication of CN105138979A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/64 Three-dimensional objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G06V40/166 Detection; Localisation; Normalisation using acquisition arrangements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a method for detecting the head of a moving human body based on stereo vision. The method comprises the following steps: building a hardware platform in which two calibrated cameras of the same model are mounted in parallel directly above the target scene to be captured, one on the left and one on the right at the same height; computing the disparity (parallax) between the two binocular stereo images with a window-based stereo matching algorithm; obtaining the distance from the cameras to the target scene by disparity-based triangulation, thereby producing the raw depth image of the target scene; and segmenting head targets in the raw depth image according to the gray-level and geometric characteristics of the human head, thereby identifying human heads.

Description

Moving human head detection method based on stereoscopic vision
Technical field
The present invention relates to the detection and tracking of moving targets, and in particular to a method for detecting the head of a moving human body based on stereoscopic vision.
Background
With the rapid improvement of computer storage and computing performance, computers are increasingly applied to advanced tasks such as scene reconstruction, target recognition, and human-computer interaction. This has broadened the scale and research directions of computer applications and has promoted the rapid development of related disciplines. Computer vision, currently an active research field, essentially uses cameras in place of human eyes and computers in place of the human brain to recognize and track targets, and performs the corresponding image analysis and processing to generate images suitable for instrument inspection or human observation.
Recognizing moving human targets is a prerequisite for tracking people in video and for understanding and describing their behavior. Human target recognition based on two-dimensional image processing is a relatively new technique that has made considerable progress in recent years. However, because it processes visible-light images, it places high demands on illumination, so both its recognition accuracy and its speed are easily affected by lighting conditions.
Summary of the invention
The object of the present invention is to provide a method for detecting the head of a moving human body based on stereoscopic vision, comprising the following steps:
Step S101, building the hardware platform: two calibrated cameras of the same model are placed in parallel, one on the left and one on the right at the same height, directly above the target scene to be captured;
Step S102, computing the disparity between the binocular stereo images with a window-based stereo matching algorithm;
Step S103, obtaining the distance from the cameras to the target scene by disparity-based triangulation, thereby obtaining the raw depth image of the target scene;
Step S104, segmenting head targets in the raw depth image according to the gray-level and geometric features of the human head, thereby identifying human heads.
In the above method, the stereo matching algorithm comprises:
Step S1021, the two cameras each capture a background image;
Step S1022, the image captured by each camera is differenced with that camera's background image, yielding two foreground images;
Step S1023, taking one of the foreground images as the reference, a feature point is selected in the reference foreground image and a window of size m*m is set up centered on that feature point;
Step S1024, a window of size m*m is set up in the other foreground image and moved one pixel at a time, and the sum of gray-level differences between the two windows is computed for each given disparity;
Step S1025, the disparity at which the sum of gray-level differences is smallest is taken as the disparity of that pixel.
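Steps S1021 and S1022 amount to background subtraction. The following is a minimal sketch, assuming grayscale images stored as nested lists of integers; the function name `foreground`, the noise tolerance `tol`, and the choice to keep the original gray value of foreground pixels are illustrative assumptions that the text does not specify.

```python
def foreground(image, background, tol=10):
    """Keep pixels that differ from the background by more than tol.

    Pixels within the tolerance are treated as background and set to 0.
    `tol` is a hypothetical parameter, not a value given in the text.
    """
    return [
        [pix if abs(pix - bg) > tol else 0 for pix, bg in zip(img_row, bg_row)]
        for img_row, bg_row in zip(image, background)
    ]


# Tiny example: one bright object column in front of a flat background.
background = [[50, 50, 50], [50, 50, 50]]
frame = [[52, 120, 50], [49, 130, 51]]
fg = foreground(frame, background)
```

The same function would be applied once per camera, giving the two foreground images that the matching steps operate on.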
In the above method, the human head recognition comprises:
Step S1041, computing a histogram of the depth image and selecting the region where the local maximum lies as the target region;
Step S1042, selecting a threshold for segmenting the image; within the target region, pixels not lower than the threshold form suspected head regions, and pixels lower than the threshold form non-head regions;
Step S1043, since the average gray value of a head is the largest in the target region, filtering out part of the non-target regions using the average gray level and the gray-level variance;
Step S1044, for the suspected head regions remaining after filtering, determining the human heads according to the geometric features of the head.
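Step S1041 can be illustrated with a small histogram sketch. It assumes an 8-bit depth image stored as nested lists, treats gray level 0 as background, and defines a local maximum as a bin larger than its left neighbour and not smaller than its right neighbour; none of these details is fixed by the text, and the function names are hypothetical.

```python
def gray_histogram(depth, levels=256):
    """Count how many pixels take each gray level."""
    hist = [0] * levels
    for row in depth:
        for value in row:
            hist[value] += 1
    return hist


def local_maxima(hist, skip_background=True):
    """Return gray levels whose bin is a local maximum of the histogram."""
    padded = [0] + hist + [0]  # pad so the end bins have two neighbours
    peaks = []
    for level in range(len(hist)):
        left, mid, right = padded[level], padded[level + 1], padded[level + 2]
        if mid > left and mid >= right:
            peaks.append(level)
    if skip_background and peaks and peaks[0] == 0:
        peaks = peaks[1:]  # assumption: level 0 is empty background
    return peaks


depth = [[0, 0, 30, 30], [0, 28, 30, 30], [10, 10, 10, 0]]
hist = gray_histogram(depth)
peaks = local_maxima(hist)
strongest = max(peaks, key=lambda level: hist[level])
```

In this toy image the strongest peak is gray level 30, so the pixels around that level would be taken as the target region.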
Compared with the prior art, the present invention has the following advantages: (1) it adopts a stereoscopic-vision approach, generating a depth image of the target scene that conveys the three-dimensional information of the target human body, and thereby overcomes the sensitivity of two-dimensional images to illumination; (2) it shoots from the top down, so that even in a crowd some space remains between heads, which effectively avoids recognition errors caused by occlusion and overlap in a flow of people.
The present invention is further described below with reference to the accompanying drawings.
Brief description of the drawings
Fig. 1 is a flow chart of the method of the present invention.
Fig. 2 shows the optimal placement of the two cameras of the present invention.
Fig. 3 is a schematic diagram of the stereo matching algorithm.
Embodiment
With reference to Fig. 1, a method for detecting the head of a moving human body based on stereoscopic vision comprises the following steps:
Step S101, building the hardware platform: two calibrated cameras of the same model are placed in parallel, one on the left and one on the right at the same height, directly above the target scene to be captured;
Step S102, computing the disparity between the binocular stereo images with a window-based stereo matching algorithm;
Step S103, obtaining the distance from the cameras to the target scene by disparity-based triangulation, thereby obtaining the raw depth image of the target scene;
Step S104, segmenting head targets in the raw depth image according to the gray-level and geometric features of the human head, thereby identifying human heads.
The calibration in step S101 includes calibrating the intrinsic parameter matrix and the extrinsic parameter matrix of each camera, and is carried out as follows:
Step S1011, the two MTV-1881EX-3 cameras used in the experiment are calibrated with the MATLAB calibration toolbox. A mapping matrix H is obtained for each image, based on formula (1):
$$ s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \begin{bmatrix} r_1 & r_2 & r_3 & t \end{bmatrix} \begin{bmatrix} X \\ Y \\ 0 \\ 1 \end{bmatrix} = K \begin{bmatrix} r_1 & r_2 & t \end{bmatrix} \begin{bmatrix} X \\ Y \\ 1 \end{bmatrix} \tag{1} $$
The calibration template plane is assumed to lie in the plane Z=0 of the world coordinate system. Here s is an unknown scale factor, $[u\ v\ 1]^T$ is the homogeneous coordinate vector of the projection onto the image plane of a point on the template plane, K is the camera intrinsic matrix, $[r_1\ r_2\ r_3]$ is the rotation matrix of the camera coordinate system relative to the world coordinate system, t is the translation vector of the camera coordinate system relative to the world coordinate system, and $[X\ Y\ 1]^T$ is the homogeneous coordinate vector of a point on the template. Rearranging formula (1) yields a 3*3 homography matrix H:
$$ H = [h_1, h_2, h_3] = \lambda K [r_1, r_2, t] \tag{2} $$
where λ is the coefficient produced by the rearrangement. From formula (2):
$$ r_1 = \frac{1}{\lambda} K^{-1} h_1, \qquad r_2 = \frac{1}{\lambda} K^{-1} h_2 \tag{3} $$
The rotation matrix has the properties $r_1^T r_2 = 0$ and $\lVert r_1 \rVert = \lVert r_2 \rVert = 1$.
From formulas (2) and (3) and the properties of the rotation matrix, two basic constraints on the camera intrinsic matrix K are obtained:
$$ h_1^T K^{-T} K^{-1} h_2 = 0 \tag{4} $$
$$ h_1^T K^{-T} K^{-1} h_1 = h_2^T K^{-T} K^{-1} h_2 \tag{5} $$
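The two constraints follow by substituting formula (3) into the rotation-matrix properties. The intermediate step, which the text leaves implicit, can be written out as follows (the common factor $1/\lambda^2$ cancels in both lines):

```latex
\begin{aligned}
r_1^T r_2 = 0
  &\;\Rightarrow\; \frac{1}{\lambda^2}\, h_1^T K^{-T} K^{-1} h_2 = 0
  \;\Rightarrow\; h_1^T K^{-T} K^{-1} h_2 = 0, \\
\lVert r_1 \rVert^2 = \lVert r_2 \rVert^2
  &\;\Rightarrow\; \frac{1}{\lambda^2}\, h_1^T K^{-T} K^{-1} h_1
   = \frac{1}{\lambda^2}\, h_2^T K^{-T} K^{-1} h_2
  \;\Rightarrow\; h_1^T K^{-T} K^{-1} h_1 = h_2^T K^{-T} K^{-1} h_2 .
\end{aligned}
```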
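K can be computed from (4) and (5). The rotation matrix R and the translation vector t of each image relative to the plane template are then computed from K and the mapping matrix H:
$$ r_1 = \frac{1}{\lambda} K^{-1} h_1 \tag{6} $$
$$ r_2 = \frac{1}{\lambda} K^{-1} h_2 \tag{7} $$
$$ r_3 = r_1 \times r_2 \tag{8} $$
$$ t = \frac{1}{\lambda} K^{-1} h_3 \tag{9} $$
where λ is the coefficient of formula (2); it is fixed by the unit-norm property $\lVert r_1 \rVert = 1$, which gives $|\lambda| = \lVert K^{-1} h_1 \rVert$.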
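Formulas (6) to (9) can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the matrices are nested lists, the scale is taken as the positive value of the norm of K^-1 h1, and the numeric K and H below are made-up test values rather than calibration results from the experiment. The helper names are hypothetical.

```python
import math


def inv3(m):
    """Inverse of a 3x3 matrix (nested lists) via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [
        [e * i - f * h, c * h - b * i, b * f - c * e],
        [f * g - d * i, a * i - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]
    return [[x / det for x in row] for row in adj]


def matvec(m, v):
    return [sum(m[r][k] * v[k] for k in range(3)) for r in range(3)]


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def pose_from_homography(K, H):
    """Recover r1, r2, r3 and t from K and H = lambda * K [r1 r2 t]."""
    K_inv = inv3(K)
    h1, h2, h3 = ([H[r][c] for r in range(3)] for c in range(3))
    k1 = matvec(K_inv, h1)
    lam = math.sqrt(sum(x * x for x in k1))  # assumes lambda > 0
    r1 = [x / lam for x in k1]
    r2 = [x / lam for x in matvec(K_inv, h2)]
    r3 = cross(r1, r2)
    t = [x / lam for x in matvec(K_inv, h3)]
    return r1, r2, r3, t


# Hypothetical values: identity rotation, t = (0.1, 0.2, 2.5), lambda = 2.
K = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]
H = [[1600.0, 0.0, 1760.0], [0.0, 1600.0, 1520.0], [0.0, 0.0, 5.0]]
r1, r2, r3, t = pose_from_homography(K, H)
```

With noisy real data the recovered r1, r2, r3 are generally not exactly orthonormal, so a practical implementation would usually re-orthogonalize them; that step is outside what the text describes.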
Step S1012, the two calibrated cameras are placed above the scene to be captured at the same height, one on the left and one on the right. The distance between the left and right cameras is adjusted according to the height of the field of view, so as to widen the field of view as far as possible while keeping the captured images sharp. In the experiments for this work, the cameras were mounted at a height of 2.5 meters and the distance between the cameras was 0.8 meters (as shown in Fig. 2).
With reference to Fig. 3, in step S102 the stereo matching algorithm comprises:
Step S1021, the two cameras each capture a background image;
Step S1022, the image captured by each camera is differenced with that camera's background image, yielding two foreground images;
Step S1023, taking one of the foreground images as the reference, a feature point is selected in the reference foreground image and a window of size m*m, for example 5*5 pixels, is set up centered on that feature point;
Step S1024, a window of size m*m is set up in the other foreground image and moved one pixel at a time, and the sum of gray-level differences between the two windows at a given disparity is computed according to formula (10):
$$ \sum_{p=-m/2}^{m/2} \sum_{q=-m/2}^{m/2} \left| I_{right}[x+p][y+q] - I_{left}[x+p+d][y+q] \right| \tag{10} $$
where m is the window size in pixels, $I_{left}$ and $I_{right}$ are the gray values of the left and right image pixels respectively, p and q are the pixel offsets over which the window is traversed, and d is the candidate disparity value.
Step S1025, the disparity at which the sum of gray-level differences is smallest is taken as the disparity of that pixel; that is, the value of d that minimizes formula (10) is the disparity.
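The window matching of steps S1023 to S1025 is a sum-of-absolute-differences (SAD) search. A minimal sketch follows, assuming images stored as nested lists indexed `[x][y]` exactly as in formula (10), an odd window size m, and a hypothetical search bound `max_d` that the text does not give; the function names are illustrative.

```python
def sad(right, left, x, y, d, m):
    """Formula (10): SAD between the m*m window at (x, y) in the right
    image and the window shifted by disparity d in the left image."""
    half = m // 2
    total = 0
    for p in range(-half, half + 1):
        for q in range(-half, half + 1):
            total += abs(right[x + p][y + q] - left[x + p + d][y + q])
    return total


def best_disparity(right, left, x, y, m, max_d):
    """Return the d in [0, max_d] that minimizes formula (10)."""
    half = m // 2
    # Only keep disparities whose shifted window stays inside the image.
    candidates = [d for d in range(max_d + 1) if x + half + d < len(left)]
    return min(candidates, key=lambda d: sad(right, left, x, y, d, m))


# Synthetic check: the left image is the right image shifted by 3 along x.
width, height = 12, 5
right = [[(7 * x * x + 13 * y) % 251 for y in range(height)] for x in range(width)]
left = [[right[x - 3][y] if x >= 3 else 0 for y in range(height)] for x in range(width)]
d_best = best_disparity(right, left, x=4, y=2, m=3, max_d=5)
```

Repeating the search for every feature point of the reference foreground image gives the disparity map used in the next step.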
In step S103, given the camera focal length f and the distance B between the two cameras, the depth information Z of the scene is computed according to formula (11):
$$ Z = \frac{f \cdot d}{d - B} \tag{11} $$
The image of the target scene formed from the depth information is called the depth image of the scene.
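For reference, the sketch below converts a disparity into a depth using the standard relation for parallel, rectified cameras, Z = f * B / d, with f and d in pixels and B in meters. This textbook form is an assumption of the sketch and is not claimed to be identical to formula (11); the focal length value is hypothetical, and only the 0.8 m camera spacing comes from the text.

```python
def depth_from_disparity(d, f_pixels, baseline_m):
    """Standard rectified-stereo triangulation: Z = f * B / d.

    Returns None when the disparity is not positive (no valid match,
    or a point at infinity).
    """
    if d <= 0:
        return None
    return f_pixels * baseline_m / d


def depth_image(disparity, f_pixels, baseline_m):
    """Apply the conversion to every pixel of a disparity map."""
    return [[depth_from_disparity(d, f_pixels, baseline_m) for d in row]
            for row in disparity]


# Hypothetical focal length of 800 pixels; B = 0.8 m as in the experiment.
z = depth_from_disparity(d=256, f_pixels=800.0, baseline_m=0.8)
depth_map = depth_image([[256, 0], [128, 256]], 800.0, 0.8)
```

To obtain a gray-level depth image in which heads are brightest, the depth values would still have to be mapped to gray levels so that points nearer the cameras get larger values; the text does not describe that mapping.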
In step S104, the human head recognition comprises:
Step S1041, computing a histogram of the depth image and selecting the region where the local maximum lies as the target region;
Step S1042, selecting a threshold for segmenting the image (the threshold range is [25, 30]); within the target region, pixels not lower than the threshold form suspected head regions, and pixels lower than the threshold form non-head regions. According to statistics, the pixels of the target region are concentrated in the head and shoulders. With gray level t as the threshold separating the head and shoulder regions, pixels with gray level higher than t form the head region and pixels with gray level lower than t form the non-head region. The entropies of the non-head region and the head region are then computed as:
$$ H_B = -\sum_i \frac{p_i}{p_t} \lg \frac{p_i}{p_t} $$
$$ H_O = -\sum_i \frac{p_i}{1 - p_t} \lg \frac{p_i}{1 - p_t} $$
$$ H_t = -\sum_i p_i \lg p_i $$
$$ H_L = -\sum_i p_i \lg p_i $$
where $p_i$ is the proportion of pixels in the image whose gray value is i, t is the threshold for segmenting the image, $H_B$ is the one-dimensional gray-level entropy of the non-head region, and $H_O$ is the one-dimensional gray-level entropy of the head region. The gray level t at which the sum of the two entropy functions reaches its maximum is taken as the threshold for segmenting the image.
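The threshold selection of step S1042 is a maximum-entropy search over gray levels. The sketch below is one possible reading, with the summation ranges filled in as assumptions because the extracted formulas do not show them: p_t is taken as the cumulative proportion of gray levels below t, H_B sums over levels below t, and H_O sums over levels at or above t. The function name is hypothetical.

```python
import math


def max_entropy_threshold(hist):
    """Return the gray level t that maximizes H_B + H_O.

    Assumed ranges: non-head class = levels < t, head class = levels >= t.
    """
    total = sum(hist)
    p = [count / total for count in hist]
    best_t, best_h = None, -math.inf
    for t in range(1, len(p)):
        p_t = sum(p[:t])  # assumed meaning of p_t: mass of the non-head class
        if p_t <= 0.0 or p_t >= 1.0:
            continue  # one class is empty, entropy undefined
        h_b = -sum((pi / p_t) * math.log10(pi / p_t) for pi in p[:t] if pi > 0)
        h_o = -sum((pi / (1 - p_t)) * math.log10(pi / (1 - p_t))
                   for pi in p[t:] if pi > 0)
        if h_b + h_o > best_h:
            best_t, best_h = t, h_b + h_o
    return best_t


# Toy 8-level histogram with a shoulder cluster (levels 0-2) and a
# head cluster (levels 5-7) separated by empty bins.
toy_hist = [10, 10, 10, 0, 0, 20, 20, 20]
t_star = max_entropy_threshold(toy_hist)
```

For this toy histogram any threshold in the empty gap between the two clusters gives the same maximal entropy sum, and the search returns one of them.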
Step S1043, since the average gray value of a head is the largest in the target region, part of the non-target regions is filtered out using the average gray level and the gray-level variance. Repeated simulation shows that, under the field-of-view conditions of this experiment, the width-to-height ratio w/h of a head in pixels lies in the range [0.65, 1.5], where w is the width of the head in pixels and h is the height of the head in pixels. Specifically: a threshold is selected, and a suspected head region whose gray-level variance is greater than the set threshold is filtered out; when the ratio of the number of columns to the number of rows (the width-to-height ratio) of a suspected head region falls within [0.65, 1.5], the region is a human head region.
$$ \bar{g} = \frac{\sum_{i=0}^{M-1} \sum_{j=0}^{N-1} f(i, j)}{M * N} $$
$$ var = \frac{\sum_{i=0}^{M-1} \sum_{j=0}^{N-1} \left( f(i, j) - \bar{g} \right)^2}{M * N} $$
where M and N are the numbers of rows and columns of the suspected head region, f(i, j) is the frequency of occurrence of the gray-level feature pair (i, j), $\bar{g}$ is the average gray level of the suspected head region of size M*N, and var is the gray-level variance.
Step S1044, for the suspected head regions remaining after filtering, the human heads are determined according to the geometric features of the head.
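The variance filter and the width-to-height test can be sketched together. The sketch assumes that each suspected head region is a rectangular patch stored as a nested list and reads f(i, j) as the gray value of the pixel at row i, column j; the variance limit `var_max` is a hypothetical parameter, since the text gives no numeric value for it. Only the ratio range [0.65, 1.5] comes from the text.

```python
def region_stats(region):
    """Mean gray level and gray-level variance of an M*N region."""
    m, n = len(region), len(region[0])
    mean = sum(sum(row) for row in region) / (m * n)
    var = sum((v - mean) ** 2 for row in region for v in row) / (m * n)
    return mean, var


def is_head(region, var_max, ratio_range=(0.65, 1.5)):
    """Keep a region if its variance is small enough and its
    columns-to-rows ratio lies in the range given in the text."""
    m, n = len(region), len(region[0])  # rows, columns
    _, var = region_stats(region)
    ratio = n / m
    return var <= var_max and ratio_range[0] <= ratio <= ratio_range[1]


compact = [[200, 202], [198, 200]]   # flat gray levels, ratio 1.0
elongated = [[200] * 8]              # ratio 8.0, too wide for a head
noisy = [[0, 255], [255, 0]]         # large variance
mean_compact, var_compact = region_stats(compact)
results = [is_head(r, var_max=25.0) for r in (compact, elongated, noisy)]
```

Checking the variance before the ratio mirrors the order of steps S1043 and S1044, although for a simple accept-or-reject decision the order of the two tests does not matter.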

Claims (8)

1. A method for detecting the head of a moving human body based on stereoscopic vision, characterized by comprising:
building the hardware platform: two calibrated cameras of the same model are placed in parallel, one on the left and one on the right at the same height, directly above the target scene to be captured;
computing the disparity between the binocular stereo images with a window-based stereo matching algorithm;
obtaining the distance from the cameras to the target scene by disparity-based triangulation, thereby obtaining the raw depth image of the target scene;
segmenting head targets in the raw depth image according to the gray-level and geometric features of the human head, thereby identifying human heads.
2. The method for detecting the head of a moving human body based on stereoscopic vision according to claim 1, characterized in that the calibration includes calibrating the intrinsic parameter matrix and the extrinsic parameter matrix of each camera.
3. The method for detecting the head of a moving human body based on stereoscopic vision according to claim 1, characterized in that the stereo matching algorithm comprises:
Step 1, the two cameras each capture a background image;
Step 2, the image captured by each camera is differenced with that camera's background image, yielding two foreground images;
Step 3, taking one of the foreground images as the reference, a feature point is selected in the reference foreground image and a window of size m*m is set up centered on that feature point;
Step 4, a window of size m*m is set up in the other foreground image and moved one pixel at a time, and the sum of gray-level differences between the two windows is computed for each given disparity;
Step 5, the disparity at which the sum of gray-level differences is smallest is taken as the disparity of that pixel.
4. The method for detecting the head of a moving human body based on stereoscopic vision according to claim 1, characterized in that the human head recognition comprises:
computing a histogram of the depth image and selecting the region where the local maximum lies as the target region;
selecting a threshold for segmenting the image; within the target region, pixels not lower than the threshold form suspected head regions, and pixels lower than the threshold form non-head regions;
since the average gray value of a head is the largest in the target region, filtering out part of the non-target regions using the average gray level and the gray-level variance;
for the suspected head regions remaining after filtering, determining the human heads according to the geometric features of the head.
5. The method for detecting the head of a moving human body based on stereoscopic vision according to claim 4, characterized in that the threshold for segmenting the image is obtained by the following formulas:
$$ H_B = -\sum_i \frac{p_i}{p_t} \lg \frac{p_i}{p_t} $$
$$ H_O = -\sum_i \frac{p_i}{1 - p_t} \lg \frac{p_i}{1 - p_t} $$
$$ H_t = -\sum_i p_i \lg p_i $$
$$ H_L = -\sum_i p_i \lg p_i $$
where $p_i$ is the proportion of pixels in the image whose gray value is i, t is the threshold for segmenting the image, $H_B$ is the one-dimensional gray-level entropy of the non-head region, and $H_O$ is the one-dimensional gray-level entropy of the head region. The gray level t at which the sum of the two entropy functions reaches its maximum is taken as the threshold for segmenting the image.
6. The method for detecting the head of a moving human body based on stereoscopic vision according to claim 4, characterized in that filtering out part of the non-target regions is specifically: a threshold is selected, and a suspected head region whose gray-level variance is greater than the set threshold is filtered out,
$$ \bar{g} = \frac{\sum_{i=0}^{M-1} \sum_{j=0}^{N-1} f(i, j)}{M * N} $$
$$ var = \frac{\sum_{i=0}^{M-1} \sum_{j=0}^{N-1} \left( f(i, j) - \bar{g} \right)^2}{M * N} $$
where M and N are the numbers of rows and columns of the suspected head region, f(i, j) is the frequency of occurrence of the gray-level feature pair (i, j), $\bar{g}$ is the average gray level of the suspected head region of size M*N, and var is the gray-level variance.
7. The method for detecting the head of a moving human body based on stereoscopic vision according to claim 4, characterized in that when the ratio of the number of columns to the number of rows of a suspected head region falls within [0.65, 1.5], the region is a human head region.
8. The method for detecting the head of a moving human body based on stereoscopic vision according to claim 1, characterized in that the cameras are mounted at a height of 2.5 meters and the distance between the cameras is 0.8 meters.
CN201510512540.2A 2015-08-19 2015-08-19 Method for detecting the head of moving human body based on stereo visual sense Pending CN105138979A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510512540.2A CN105138979A (en) 2015-08-19 2015-08-19 Method for detecting the head of moving human body based on stereo visual sense

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510512540.2A CN105138979A (en) 2015-08-19 2015-08-19 Method for detecting the head of moving human body based on stereo visual sense

Publications (1)

Publication Number Publication Date
CN105138979A true CN105138979A (en) 2015-12-09

Family

ID=54724323

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510512540.2A Pending CN105138979A (en) 2015-08-19 2015-08-19 Method for detecting the head of moving human body based on stereo visual sense

Country Status (1)

Country Link
CN (1) CN105138979A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018119668A1 (en) * 2016-12-27 2018-07-05 深圳大学 Method and system for recognizing head of pedestrian
CN109064686A (en) * 2018-08-17 2018-12-21 浙江捷尚视觉科技股份有限公司 A kind of ATM trailing detection method based on human body segmentation
CN110706262A (en) * 2019-10-09 2020-01-17 上海思岚科技有限公司 Image processing method, device, equipment and storage medium
CN111667563A (en) * 2020-06-19 2020-09-15 北京字节跳动网络技术有限公司 Image processing method, device, equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101393497A (en) * 2008-10-30 2009-03-25 上海交通大学 Multi-point touch method based on binocular stereo vision
CN103927747A (en) * 2014-04-03 2014-07-16 北京航空航天大学 Face matching space registration method based on human face biological characteristics
CN104658272A (en) * 2015-03-18 2015-05-27 哈尔滨工程大学 Street traffic volume statistics and speed measurement method based on binocular stereo vision

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
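于海滨: "Human detection and tracking based on head feature extraction and its applications", China Doctoral Dissertations Full-text Database *
尹章芹: "Design of a three-dimensional automatic passenger flow counting system", China Master's Theses Full-text Database *
徐新凤: "Design and implementation of a pedestrian recognition and tracking system based on binocular vision", China Master's Theses Full-text Database *
王小鹏 et al.: "Human eye localization based on maximum entropy segmentation and a skin color model", Computer Engineering *
聂鹏鹏: "Three-dimensional face reconstruction and recognition based on binocular stereo vision", China Master's Theses Full-text Database *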

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018119668A1 (en) * 2016-12-27 2018-07-05 深圳大学 Method and system for recognizing head of pedestrian
JP2019505866A (en) * 2016-12-27 2019-02-28 シェンチェン ユニバーシティー Passerby head identification method and system
CN109064686A (en) * 2018-08-17 2018-12-21 浙江捷尚视觉科技股份有限公司 A kind of ATM trailing detection method based on human body segmentation
CN110706262A (en) * 2019-10-09 2020-01-17 上海思岚科技有限公司 Image processing method, device, equipment and storage medium
CN110706262B (en) * 2019-10-09 2023-06-02 上海思岚科技有限公司 Image processing method, device, equipment and storage medium
CN111667563A (en) * 2020-06-19 2020-09-15 北京字节跳动网络技术有限公司 Image processing method, device, equipment and storage medium
CN111667563B (en) * 2020-06-19 2023-04-07 抖音视界有限公司 Image processing method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
US10198623B2 (en) Three-dimensional facial recognition method and system
CN101443817B (en) Method and device for determining correspondence, preferably for the three-dimensional reconstruction of a scene
CN104574375B (en) Image significance detection method combining color and depth information
Zhuang et al. Rolling-shutter-aware differential sfm and image rectification
CN103530599A (en) Method and system for distinguishing real face and picture face
US20130136302A1 (en) Apparatus and method for calculating three dimensional (3d) positions of feature points
CN106952274B (en) Pedestrian detection and distance measuring method based on stereoscopic vision
CN104317391A (en) Stereoscopic vision-based three-dimensional palm posture recognition interactive method and system
CN103839277A (en) Mobile augmented reality registration method of outdoor wide-range natural scene
CN105335955A (en) Object detection method and object detection apparatus
CN111160291B (en) Human eye detection method based on depth information and CNN
US20140340486A1 (en) Image processing system, image processing method, and image processing program
KR100560464B1 (en) Multi-view display system with viewpoint adaptation
CN106503605A (en) Human body target recognition methods based on stereovision technique
CN104065954B (en) A kind of disparity range method for quick of high definition three-dimensional video-frequency
CN103248906A (en) Method and system for acquiring depth map of binocular stereo video sequence
CN105138979A (en) Method for detecting the head of moving human body based on stereo visual sense
CN105809664B (en) Method and device for generating three-dimensional image
CN106295657A (en) A kind of method extracting human height's feature during video data structure
CN104243970A (en) 3D drawn image objective quality evaluation method based on stereoscopic vision attention mechanism and structural similarity
JP6285686B2 (en) Parallax image generation device
CN110120012A (en) The video-splicing method that sync key frame based on binocular camera extracts
CN116778094B (en) Building deformation monitoring method and device based on optimal viewing angle shooting
CN103310482A (en) Three-dimensional reconstruction method and system
CN108090930A (en) Barrier vision detection system and method based on binocular solid camera

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20151209
