CN102521582B - Human upper body detection and splitting method applied to low-contrast video - Google Patents


Info

Publication number
CN102521582B
Authority
CN
China
Prior art keywords
human body
area
upper half
foreground
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN2011104465964A
Other languages
Chinese (zh)
Other versions
CN102521582A (en)
Inventor
谢迪
童若锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN2011104465964A
Publication of CN102521582A
Application granted
Publication of CN102521582B
Legal status: Active
Anticipated expiration

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to a method for detecting and segmenting the human upper body in low-contrast video. The method consists mainly of two processes. In the first process, connected regions representing foreground objects are extracted from the current frame by background subtraction and morphological operations; for each foreground region, a shape feature based on a polar-coordinate two-dimensional histogram is extracted and fed to a pre-trained support vector machine classifier, which outputs a class label corresponding to either the upper-body class or the non-upper-body class. In the second process, when a region previously identified as a human region is misclassified as a non-human region, the region is characterized by an energy function and the inaccurate contour is corrected by minimizing that energy function; finally, the background frame is updated on the basis of the accurate foreground human contour. The method can process low-contrast, low-resolution video in real time, and both the detection accuracy and the segmentation result meet application requirements.

Description

A method for detecting and segmenting the human upper body in low-contrast video
Technical field:
The present invention relates to the technical field of video processing, in particular to the detection and extraction of human upper-body regions, and specifically to a method for detecting and segmenting the human upper body in low-contrast video.
Background art:
Automatic detection and segmentation of human regions in video are two key steps in many surveillance applications. Human detection methods usually find foreground objects in the video and label them as human or non-human regions based on shape, color, and other features. Background subtraction is a common preprocessing technique for extracting foreground regions. Another class of methods is based on machine learning and uses many new features suited to machine learning, of which gradient-based features are the most representative. Although these methods do not require background subtraction as preprocessing, they come at a high computational cost, which limits their use in real-time systems. Video segmentation methods are likewise based on background subtraction and additionally integrate probabilistic frameworks such as Bayesian theory and Markov chain Monte Carlo models.
Many of these methods depend on a reasonably good background subtraction result, so they fail once the ambient lighting changes. Although some improved background subtraction algorithms address this problem, a foreground object that stays motionless in front of the camera for a long time is gradually absorbed into the background. In addition, the cameras in many surveillance systems have low-quality CCD chips, so the captured video has low contrast, which makes it even harder for existing methods to handle.
Summary of the invention:
(1) Foreground extraction: the first frame is designated as the background frame and converted from the RGB color space to the Lab color space; every subsequent input frame is converted in the same way. Foreground object regions are extracted from the converted frame and the background frame by background subtraction. For each extracted region, morphological dilation and erosion are applied to filter out noise and holes, and finally a breadth-first connected-region search labels the foreground and background regions to generate the foreground region mask;
(2) Shape feature extraction: the contour of the foreground region is first extracted by a contour detection algorithm and sampled. A polar coordinate system is then set up with the region centroid as its origin, and each sampled contour point is mapped onto a two-dimensional plane, so that all sample points together form a two-dimensional histogram. Finally, the histogram is normalized and flattened to obtain a high-dimensional vector;
(3) Training the upper-body model with a support vector machine: the vectors obtained in the previous step are used as samples, and a nonlinear support vector machine with a radial basis function kernel is applied to all training samples with K-fold cross-validation, finally producing a nonlinear decision hyperplane that serves as the classifier separating upper-body regions from non-upper-body regions;
(4) Classification with the SVM-based upper-body model: the vector obtained as in step (2) is fed to the classifier trained in step (3), and the class label is output after the classifier's decision mapping;
(5) Energy function minimization: for a foreground region that was initially considered a human region, if the classifier labels it as a non-human region during processing, the contour curve is modeled with an energy function, initialized with the correct contour curve from the previous frame, and solved with the Euler-Lagrange method.
The method of the present invention consists mainly of two processes. In the first, connected regions representing foreground objects are extracted from the current frame by background subtraction and morphological operations; then, for each foreground region, its shape feature based on a polar-coordinate two-dimensional histogram is extracted and fed to a pre-trained SVM-based classifier, which outputs a class label corresponding to the upper-body class or the non-upper-body class. In the second, when a region previously identified as human is misclassified as non-human, the invention characterizes the region with an energy function and corrects the erroneous contour through an energy minimization process. Finally, the background frame is updated on the basis of the correct foreground human contour. The invention can process low-contrast, low-resolution video in real time, and both the detection accuracy and the segmentation result meet application requirements.
Description of drawings:
Fig. 1 is a flowchart of the present invention.
Detailed description of embodiments:
Each part of the flowchart in Fig. 1 is described in detail below:
1. Foreground extraction
The first frame is designated as the background frame and converted from the RGB color space to the Lab color space. Every subsequent input frame is converted in the same way, and foreground object regions are extracted from the converted frame and the background frame by background subtraction: the absolute value of the pixel-wise difference between the two frames is computed, and a pixel whose value exceeds a given threshold is considered a foreground pixel, otherwise a background pixel. For each extracted region, morphological dilation and erosion are applied to filter out noise and holes, and finally a breadth-first connected-region search labels the foreground and background regions to generate the foreground region mask.
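A minimal sketch of the thresholding and labeling described above, assuming a single-channel frame stored as nested lists (for example one Lab channel). The function names `foreground_mask` and `label_regions` and the choice of 4-connectivity are illustrative assumptions, not taken from the patent; the color conversion and the morphological filtering are omitted.

```python
from collections import deque


def foreground_mask(frame, background, threshold):
    """Mark a pixel as foreground (1) when |frame - background| exceeds threshold."""
    return [
        [1 if abs(f - b) > threshold else 0 for f, b in zip(frow, brow)]
        for frow, brow in zip(frame, background)
    ]


def label_regions(mask):
    """Label 4-connected foreground regions with breadth-first search.

    Returns (labels, count); background pixels keep label 0.
    """
    height, width = len(mask), len(mask[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if mask[y][x] == 0 or labels[y][x] != 0:
                continue
            count += 1
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count
```

In a real pipeline the mask would be cleaned by dilation and erosion before labeling, as the text describes.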
2. Shape feature extraction
Compared with existing local gradient histogram features, the feature proposed by the invention describes the shape of the human upper body better and is therefore more discriminative, while also having lower computational complexity.
A person's silhouette, especially the upper-body silhouette, can be regarded as a star-convex set. If a set S contains a point $x_0$ such that the line segment from $x_0$ to any point of S lies entirely within S, then S is called a star domain or star-convex set. The shape feature of the invention is designed on this basis.
For a given foreground region, the invention finds the centroid of the region by breadth-first search and then finds the boundary contour of the same region with a boundary-following algorithm. The contour is then sampled counterclockwise at equal angular intervals: taking the centroid of the foreground region as the origin of a polar coordinate system, each sample point on the contour can be expressed as polar coordinates $(\theta_i, r_i)$, $i = 1, 2, \dots, N$, where $r_i$ is the Euclidean distance from the region centroid to the point, $\theta_i$ is the polar angle of the point, and N is the total number of sample points. These polar coordinates are then projected onto a two-dimensional plane whose x-axis represents $\theta$ and whose y-axis represents $r$; the two dimensions are quantized separately into m and n bins. When a polar coordinate $(\theta_i, r_i)$ satisfies the following condition:
$$\theta_k \le \theta_i \le \theta_{k+1}, \quad r_l \le r_i \le r_{l+1}, \quad k = 0, \dots, m-1, \quad l = 0, \dots, n-1$$
the value of the corresponding cell $(k, l)$ is incremented. Once all points have been traversed in this way, a two-dimensional histogram with a specific pattern is formed, and this pattern characterizes the particular shape of the corresponding contour. Finally, the cell values of the histogram are unrolled row by row and normalized to obtain an $(m \times n)$-dimensional vector $f$. The shape feature obtained in this way is clearly independent of the position and size of the object.
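The histogram construction can be sketched as follows. This is an illustrative sketch under several assumptions that the patent does not spell out: radii are divided by the largest radius to obtain size independence, bins are half-open so a point on a bin edge is counted once, the vector is flattened with the angle index varying slowest, and normalization divides by the number of points. The function name `polar_histogram` is hypothetical.

```python
import math


def polar_histogram(contour, centroid, m, n):
    """Build an m*n shape vector from contour points given as (x, y) pairs."""
    cx, cy = centroid
    polar = []
    for x, y in contour:
        r = math.hypot(x - cx, y - cy)
        theta = math.atan2(y - cy, x - cx) % (2 * math.pi)  # angle in [0, 2*pi)
        polar.append((theta, r))

    # Assumption: scale radii by the maximum radius so the feature ignores size.
    r_max = max(r for _, r in polar) or 1.0

    hist = [[0] * n for _ in range(m)]
    for theta, r in polar:
        k = min(int(theta / (2 * math.pi) * m), m - 1)  # angle bin
        l = min(int(r / r_max * n), n - 1)              # radius bin
        hist[k][l] += 1

    total = len(polar)
    # Flatten (angle index slowest) and normalize so the entries sum to 1.
    return [hist[k][l] / total for k in range(m) for l in range(n)]
```

Because only offsets from the centroid and ratios of radii enter the computation, translating or uniformly scaling the contour leaves the vector unchanged in this sketch.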
3. Upper-body model training and detection with a support vector machine
In the training stage, a large number of upper-body and non-upper-body images are collected, and the foreground regions are marked manually so that the foreground shape features can be extracted. The set of high-dimensional vectors corresponding to these shape features forms the sample set used for training. The invention uses a support vector machine as the training algorithm, with a Gaussian radial basis function as its kernel:
$$K(x_i, x_j) = \exp\left(-\gamma \|x_i - x_j\|^2\right)$$
where $x_i$ and $x_j$ are feature vectors and $\gamma$ is a normalization constant.
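As a small worked example of the kernel above, the following sketch evaluates $K(x_i, x_j)$ for plain Python sequences. The function name `rbf_kernel` is an illustrative assumption.

```python
import math


def rbf_kernel(x_i, x_j, gamma):
    """Gaussian RBF kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-gamma * sq_dist)
```

Identical vectors give a kernel value of 1, and the value decays toward 0 as the vectors move apart, at a rate set by $\gamma$.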
To obtain the best-performing classifier, the invention uses K-fold cross-validation to determine the two parameters $\gamma$ and $C$ of the support vector machine classifier. All data are divided into K subsets; one subset is held out as validation data and the other K-1 subsets are used for training. This process is repeated K times, each time with a different subset serving as validation data, and the results are averaged at the end. The parameter combination found in this way to give the best classification performance is $\gamma = 0.25$ and $C = 2.0$, at which the classification accuracy is about 98%.
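The K-fold procedure can be sketched as below. The helper names `k_fold_indices` and `cross_validate`, the interleaved way of assigning samples to folds, and the `train_and_score` callback are assumptions made for illustration; the patent does not say how the subsets are formed or which SVM implementation is used.

```python
def k_fold_indices(num_samples, k):
    """Yield (training_indices, validation_indices) for each of the k folds."""
    # Assumption: samples are assigned to folds in an interleaved fashion.
    folds = [list(range(i, num_samples, k)) for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation


def cross_validate(samples, labels, k, train_and_score):
    """Average the validation score returned by train_and_score over k folds.

    train_and_score(train_x, train_y, val_x, val_y) is a caller-supplied
    function that trains a model and returns its validation score.
    """
    scores = []
    for train_idx, val_idx in k_fold_indices(len(samples), k):
        scores.append(train_and_score(
            [samples[i] for i in train_idx], [labels[i] for i in train_idx],
            [samples[i] for i in val_idx], [labels[i] for i in val_idx],
        ))
    return sum(scores) / len(scores)
```

To select $\gamma$ and $C$, one would call `cross_validate` once per candidate pair and keep the pair with the highest average score.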
In the detection stage, for each frame in which a foreground region exists, the contour shape feature of the region is extracted in the same way and fed to the pre-trained classifier, which outputs a Boolean value indicating whether the current region is an upper-body region.
4. Energy function minimization
If the support vector machine classifier fails to recognize an upper-body region as the human class, causing a classification error, the invention performs an energy minimization on the erroneous foreground contour to eliminate the error introduced when ambient light changes make the contour expand, thereby ensuring the correctness of the foreground contour region. For a complete closed contour, the invention characterizes it with an energy function $E_c(s)$:
$$E_c(s) = \oint \left( E_{int}(s) + \eta(s) E_{ext}(s) \right) ds$$
where $E_{int}(s)$ is the internal potential energy of the contour and $E_{ext}(s)$ provides the image-based external constraint. $\eta(s)$ is the weight of each sample point, defined as:
$$\eta(s_i) = \frac{\|\nabla I(x(s_i), y(s_i))\|_2}{\sum_{j=1}^{N} \|\nabla I(x(s_j), y(s_j))\|_2}$$
where $\nabla I$ denotes the image gradient and N is the total number of sample points. The goal of the optimization is to find the curve function $v(s) = (x(s), y(s))$ that minimizes the energy functional $E_c(s)$. The invention uses the Euler-Lagrange method to convert the functional into a partial differential equation problem, discretizes it, and finally obtains a linear system $Ax = b$, where $A$ is a matrix whose nonzero elements lie on exactly five diagonals. This linear system can be solved by Cholesky decomposition.
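The weights $\eta(s_i)$ can be illustrated with the sketch below. It assumes that the norm is the Euclidean norm of the gradient, that the gradient is approximated by central differences at interior pixels, and that the weights fall back to a uniform value when every gradient is zero; none of these details is stated in the patent, and the function names are hypothetical.

```python
import math


def gradient_magnitude(image, x, y):
    """Central-difference gradient magnitude at an interior pixel (x, y)."""
    gx = (image[y][x + 1] - image[y][x - 1]) / 2.0
    gy = (image[y + 1][x] - image[y - 1][x]) / 2.0
    return math.hypot(gx, gy)


def contour_weights(image, points):
    """Normalize gradient magnitudes at the contour points so they sum to 1."""
    magnitudes = [gradient_magnitude(image, x, y) for x, y in points]
    total = sum(magnitudes)
    if total == 0:
        # Assumption: use uniform weights when the image is flat everywhere.
        return [1.0 / len(points)] * len(points)
    return [mag / total for mag in magnitudes]
```

Points lying on strong edges receive larger weights, so the external term pulls harder on the contour there.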
5. Background region update
Once the correct foreground region has been obtained, the invention updates the background region by linear interpolation:
$$I_B(x, y) = \alpha I_B^{(t)}(x, y) + (1 - \alpha) I_B^{*}(x, y)$$
where $I_B(x, y)$ is the updated background-frame pixel value at position $(x, y)$, $I_B^{(t)}(x, y)$ is the pixel value at the same position before the update, and $I_B^{*}(x, y)$ is the corresponding pixel value in the current frame that belongs to the background region. For the foreground region, the pixel value at the corresponding position, [Figure GDA00002917651400045], is simply copied.
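A minimal sketch of the update rule, assuming single-channel frames stored as nested lists and a binary foreground mask. The source gives the foreground rule only as an image, so the choice here to keep the previous background value under foreground pixels is an assumption, as is the function name `update_background`.

```python
def update_background(background, frame, foreground_mask, alpha):
    """Blend background pixels toward the current frame with weight alpha."""
    updated = []
    for brow, frow, mrow in zip(background, frame, foreground_mask):
        updated.append([
            # Assumption: foreground pixels keep the previous background value.
            b if is_foreground else alpha * b + (1 - alpha) * f
            for b, f, is_foreground in zip(brow, frow, mrow)
        ])
    return updated
```

A value of `alpha` close to 1 makes the background adapt slowly, which keeps a briefly stationary person from being absorbed too quickly.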
It should be understood that the above embodiment merely illustrates the invention and does not limit it; any invention-creation that does not go beyond the spirit of the invention falls within its scope of protection.

Claims (5)

1. A method for detecting and segmenting the human upper body in low-contrast video, characterized in that the method comprises the following steps:
(1) Foreground extraction: the first frame is designated as the background frame and converted from the RGB color space to the Lab color space; every subsequent input frame is converted in the same way. Foreground object regions are extracted from the converted frame and the background frame by background subtraction. For each extracted region, morphological dilation and erosion are applied to filter out noise and holes, and finally a breadth-first connected-region search labels the foreground and background regions to generate the foreground region mask;
(2) Shape feature extraction: the contour of the foreground region is first extracted by a contour detection algorithm and sampled. A polar coordinate system is then set up with the region centroid as its origin, and each sampled contour point is mapped onto a two-dimensional plane, so that all sample points together form a two-dimensional histogram. Finally, the histogram is normalized and flattened to obtain a high-dimensional vector;
(3) Training the upper-body model with a support vector machine: the vectors obtained in the previous step are used as samples, and a nonlinear support vector machine with a radial basis function kernel is applied to all training samples with K-fold cross-validation, finally producing a nonlinear decision hyperplane that serves as the classifier separating upper-body regions from non-upper-body regions;
(4) Classification with the SVM-based upper-body model: the vector obtained as in step (2) is fed to the classifier trained in step (3), and the class label is output after the classifier's decision mapping;
(5) Energy function minimization: for a foreground region that was initially considered a human region, if the classifier labels it as a non-human region during processing, the contour curve is modeled with an energy function, initialized with the correct contour curve from the previous frame, and solved with the Euler-Lagrange method, and the background region is updated with the final result.
2. The method for detecting and segmenting the human upper body in low-contrast video according to claim 1, characterized in that the foreground object regions in step (1) are extracted by background subtraction as follows: the absolute value of the pixel-wise difference between the two frames is computed; a pixel whose value exceeds a given threshold is considered a foreground pixel, and otherwise a background pixel.
3. The method for detecting and segmenting the human upper body in low-contrast video according to claim 1, characterized in that step (2) proceeds as follows:
For a given foreground region, the centroid of the region is found by breadth-first search, and the boundary contour of the same region is then found with a boundary-following algorithm. The contour is then sampled counterclockwise at equal angular intervals; with the centroid of the foreground region taken as the origin of a polar coordinate system, each sample point on the contour can be expressed as polar coordinates $(\theta_i, r_i)$, $i = 1, 2, \dots, N$, where $r_i$ is the Euclidean distance from the region centroid to the point, $\theta_i$ is the polar angle of the point, and N is the total number of sample points. These polar coordinates are then projected onto a two-dimensional plane whose x-axis represents $\theta$ and whose y-axis represents $r$; the two dimensions are quantized separately into m and n bins. When a polar coordinate $(\theta_i, r_i)$ satisfies the following condition:
$$\theta_k \le \theta_i \le \theta_{k+1}, \quad r_l \le r_i \le r_{l+1}, \quad k = 0, \dots, m-1, \quad l = 0, \dots, n-1$$
the value of the corresponding cell $(k, l)$ is incremented. Once all points have been traversed in this way, a two-dimensional histogram with a specific pattern is formed, and this pattern characterizes the particular shape of the corresponding contour. Finally, the cell values of the histogram are unrolled row by row and normalized to obtain an $(m \times n)$-dimensional vector $f$.
4. The method for detecting and segmenting the human upper body in low-contrast video according to claim 1, characterized in that step (3) proceeds as follows:
In the training stage, a large number of upper-body and non-upper-body images are collected, and the foreground regions are marked manually so that the foreground shape features can be extracted. The set of high-dimensional vectors corresponding to these shape features forms the sample set used for training. A support vector machine is used as the training algorithm, with a Gaussian radial basis function as its kernel:
$$K(x_i, x_j) = \exp\left(-\gamma \|x_i - x_j\|^2\right)$$
where $x_i$ and $x_j$ are feature vectors and $\gamma$ is a normalization constant;
K-fold cross-validation is used to determine the two parameters $\gamma$ and $C$ of the support vector machine classifier: all data are divided into K subsets; one subset is held out as validation data and the other K-1 subsets are used for training. This process is repeated K times, each time with a different subset serving as validation data, and the results are averaged at the end. In the detection stage, for each frame in which a foreground region exists, the contour shape feature of the region is extracted in the same way and fed to the pre-trained classifier, which outputs a Boolean value indicating whether the current region is an upper-body region.
5. The method for detecting and segmenting the human upper body in low-contrast video according to claim 1, characterized in that step (5) proceeds as follows:
A complete closed contour is characterized by an energy functional $E_c(s)$:
$$E_c(s) = \oint \left( E_{int}(s) + \eta(s) E_{ext}(s) \right) ds$$
where $E_{int}(s)$ is the internal potential energy of the contour, $E_{ext}(s)$ provides the image-based external constraint, and $\eta(s)$ is the weight of each sample point, defined as:
$$\eta(s_i) = \frac{\|\nabla I(x(s_i), y(s_i))\|_2}{\sum_{j=1}^{N} \|\nabla I(x(s_j), y(s_j))\|_2}$$
where $\nabla I$ denotes the image gradient and N is the total number of sample points;
The Euler-Lagrange method is used to convert the functional into a partial differential equation problem, which is then discretized to obtain a linear system $Ax = b$, where $A$ is a matrix whose nonzero elements lie on exactly five diagonals; this linear system can be solved by Cholesky decomposition;
Once the correct foreground region has been obtained, the background region is updated by linear interpolation:
$$I_B(x, y) = \alpha I_B^{(t)}(x, y) + (1 - \alpha) I_B^{*}(x, y)$$
where $I_B(x, y)$ is the updated background-frame pixel value at position $(x, y)$, $I_B^{(t)}(x, y)$ is the pixel value at the same position before the update, and $I_B^{*}(x, y)$ is the corresponding pixel value in the current frame that belongs to the background region. For the foreground region, the pixel value at the corresponding position, [Figure FDA00002917651300035], is simply copied.
CN2011104465964A 2011-12-28 2011-12-28 Human upper body detection and splitting method applied to low-contrast video Active CN102521582B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2011104465964A CN102521582B (en) 2011-12-28 2011-12-28 Human upper body detection and splitting method applied to low-contrast video

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2011104465964A CN102521582B (en) 2011-12-28 2011-12-28 Human upper body detection and splitting method applied to low-contrast video

Publications (2)

Publication Number Publication Date
CN102521582A CN102521582A (en) 2012-06-27
CN102521582B CN102521582B (en) 2013-09-25

Family

ID=46292493

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2011104465964A Active CN102521582B (en) 2011-12-28 2011-12-28 Human upper body detection and splitting method applied to low-contrast video

Country Status (1)

Country Link
CN (1) CN102521582B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10326928B2 (en) 2016-01-08 2019-06-18 Samsung Electronics Co., Ltd. Image processing apparatus for determining whether section of target area matches section of person area and control method thereof

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104683765B (en) * 2015-02-04 2019-04-12 上海依图网络科技有限公司 A kind of video concentration method based on detecting moving object
CN108804992B (en) * 2017-05-08 2022-08-26 电子科技大学 Crowd counting method based on deep learning
CN107707975A (en) * 2017-09-20 2018-02-16 天津大学 Video intelligent clipping method based on monitor supervision platform
CN113379930B (en) * 2021-05-25 2023-03-24 广州紫为云科技有限公司 Immersive interaction method and device through human body graph and storage medium
CN114998390B (en) * 2022-08-02 2022-10-21 环球数科集团有限公司 Visual analysis system and method for embedded intelligent camera

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101131728A (en) * 2007-09-29 2008-02-27 东华大学 Face shape matching method based on Shape Context
CN101996307A (en) * 2009-08-10 2011-03-30 上海理视微电子有限公司 Intelligent video human body identification method
CN101834982B (en) * 2010-05-28 2012-04-25 上海交通大学 Hierarchical screening method of violent videos based on multiplex mode


Also Published As

Publication number Publication date
CN102521582A (en) 2012-06-27


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant