CN104573659B

CN104573659B - A kind of driver based on svm takes phone-monitoring method

Info

Publication number: CN104573659B
Application number: CN201510013139.4A
Authority: CN
Inventors: 张卡; 何佳; 曹昌龙; 尼秀明
Original assignee: ANHUI QINGXIN INTERNET INFORMATION TECHNOLOGY Co Ltd
Current assignee: ANHUI QINGXIN INTERNET INFORMATION TECHNOLOGY Co Ltd
Priority date: 2015-01-09
Filing date: 2015-01-09
Publication date: 2018-01-09
Anticipated expiration: 2035-01-09
Also published as: CN104573659A

Abstract

The step of the present invention relates to a kind of driver based on svm to take phone-monitoring method, and the monitoring method includes following order：Establish Face datection grader and hand images detect grader as the hand of training positive sample during taking phone；The driving condition image of collection driver in real time；Suitable image region is selected to detect and take the effective detection region that phone detects as hand exercise；According to effective detection area image, real-time judge driver whether there is the warming-up exercise for taking phone；According to effective detection area image, the time that the hand of driver is rested on beside ear is monitored in detail, and judge whether driver takes phone according to the length of residence time；The real-time video that driver is taken to phone is sent to remote server, and receives the order of remote server transmission.

Description

A kind of driver based on svm takes phone-monitoring method

Technical field

The present invention relates to safe driving technical field, and in particular to a kind of driver based on svm takes phone-monitoring side Method.

Background technology

With the rapid growth of car ownership, people enjoy the facility of traffic and it is quick while, also along with All kinds of traffic accidents take place frequently, and cause huge personnel and economic loss.Cause the factor of traffic accident a lot, driver drives It is one of important inducement that phone is taken in way.Due to that can not obtain the driving behavior video of driver in real time, some passenger traffics and The supervision department of Shipping enterprises can only can not carry out monitoring in advance and in advance using foundation of the deduction as division responsibility afterwards It is anti-.Therefore, monitor driver in real time takes phone behavior, feeds back to supervision department of transport enterprise in time and is prevented, right In avoiding major traffic accidents, play a part of can not be substituted.

SVMs (svm) is a kind of supervised learning algorithm, and it is solving small sample, the knowledge of non-linear and high dimensional pattern Many distinctive advantages are shown in not, the VC dimensions that this method is built upon Statistical Learning Theory are theoretical former with Structural risk minization On the basis of reason, according to limited sample information in the complexity (the study precision i.e. to specific training sample) of model and study Ability (i.e. without error identify arbitrary sample ability) between seek best compromise, to obtain best Generalization Ability (or Claim generalization ability).

At present, the monitoring of phone behavior is taken for driver, conventional technical method has following several：

(1) be monitored based on mobile phone signal, such method by driver's cabin place a mobile phone signal detector, According to the different degrees of of signal fluctuation, judge whether to take phone behavior.Such method can reach on goods stock Monitoring takes the effect of phone.But on passenger stock, more interference be present, such as the mobile phone signal interference of passenger on car Deng serious missing inspection and flase drop being present, can not realize that comprehensive monitoring driver takes phone behavior in real time.

(2) it is monitored based on video image, such method by monitoring whether driver's both hands are placed on steering wheel in real time On, once there is certain hand departure direction disk, that is, it is considered as to take phone.Serious flase drop be present in such method, because There is the custom of a hand steered steering wheel in many drivers, therefore, this method is in actual environment using having the problem of larger.

The content of the invention

It is an object of the invention to provide a kind of driver based on svm to take phone-monitoring method, and the monitoring method is Based on video image processing technology, judge to take phone behavior by monitoring the state of driver's ear region hand in real time, With the features such as monitoring degree of accuracy height, missing inspection flase drop is less, and speed is fast, and cost is low.

The technical scheme is that：

The step of a kind of driver based on svm takes phone-monitoring method, and the monitoring method includes following order：

(1) establish Face datection grader and hand images detect classification as the hand of training positive sample during taking phone Device；

(2) the driving condition image of driver is gathered in real time；

(3) suitable image region is selected to detect and take the effective detection region that phone detects as hand exercise；

(4) warming-up exercise for taking phone whether there is according to effective detection area image, real-time judge driver；If It is then to perform step (5)；Step (2) is performed if it is not, then returning；

(5) time rested on according to effective detection area image, the detailed hand for monitoring driver beside ear, and according to The time length that the hand of driver is rested on beside ear judges whether driver takes phone；If so, then perform step (6)；Step (2) is performed if it is not, then returning；

(6) real-time video that driver is taken to phone is sent to remote server, and receives remote server transmission Order.

In step (3), the described suitable image region of selection detects and taken phone detection as hand exercise Effective detection region, the step of specifically including following order：

(31) harr features and adaboost sorting algorithms are used, detects face location；

(32) according to five, the three front yard layout rule of face, the position that coarse positioning is left and right two；

(33) it is accurately positioned out the position of eyes；

(34) suitable effective image subregion is selected.

Described according to effective detection area image in step (4), real-time judge driver, which whether there is, takes phone Warming-up exercise, the step of specifically including following order：

(41) uniform point sampling is carried out respectively in left and right effective detection region；

(42) accurate tracking of sampled point is carried out；

(43) movable information of correct tracking sampling point is obtained；

(44) statistical nature in effective detection region is obtained, the statistical nature of described detection zone includes left effective detection The mean motion intensity avem in region_l, right effective detection region mean motion intensity avem_r, left effective detection region motion Range R_lWith the motion range R in right effective detection region_r；

(45) judge whether to lift action of the hand close to ear.

It is described according to effective detection area image in step (5), monitor that the hand of driver is rested on beside ear in detail Time, and the time length rested on according to the hand of driver beside ear judges whether driver takes phone；Tool Body includes the step of following order：

(51) using effective detection area image as present image, the Gradient Features of present image are calculated, and to gradient side It is corrected to angle, the scope of gradient direction is limited to [0 π]；

(52) one a width of W of construction, a height of H hough transform window, W and H value are respectively equal to the instruction of hand trainer Practice the wide high level of sample, and subwindow division is carried out to hough transform window；

(53) hough transform window is slided with a fixed step size in effective detection region, judges that hough transform window covers Region whether there is hand images；

(54) judge whether hough transform window is slided at the ending of present image；If so, then perform step (55)；If No, then hough transform window sliding to next position, performs step (53) again；

(55) a width of W of present image is set_image, a height of H_image, change of scale is carried out to present image, carries out yardstick change A width of σ W of image after changing_image, a height of σ H_image, the image after progress change of scale is set to present image；Wherein, 0 ＜ σ ＜ 1；

(56) judge whether the size of present image is less than minimum threshold；If the size of present image is less than minimum threshold, Then perform step (57)；If the size of present image is more than minimum threshold, returns and perform step (51)；

(57) number of the candidate target in the list of candidate region and position, the position of hand in comprehensive descision present frame Put；

(58) judge to accumulate whether frame number reaches unit frame number；If so, then perform step (59)；Performed if it is not, then returning Step (51)；

(59) in statistical unit frame number, the frame number proportion of hand images be present；And according to the frame number institute of hand images Accounting example judges whether driver is in and takes telephone state.

In step (34), the suitable effective image subregion of described selection, specifically realized using below equation：

Wherein, rect_left、rect_rightSub-rectangular areas near the left and right ear of selection is represented respectively, and rect represents inspection The face location rectangular area measured, point_l、point_rThe left hand edge point of left eye and the right hand edge point of right eye are represented respectively.

In step (43), the movable information of the described correct tracking sampling point of acquisition, specifically realized using below equation：

Wherein, M (i) represents the motion amplitude of ith sample point, and θ (i) represents the direction of motion of ith sample point, xpoint_iRepresent coordinate of the ith sample point on present frame, ypoint_iCoordinate of the ith sample point on previous frame is represented, Dx represents amount of exercise of some sampled point in x directions, and Dy represents amount of exercise of some sampled point in y directions.

In step (44), the statistical nature in described acquisition effective detection region, specifically realized using below equation：

Wherein, sum_lRepresent that there is the motion amplitude of the sampled point substantially moved, N in left detection zone_mlRepresent left detection There is the number of the sampled point substantially moved, N in region_lRepresent that left detection zone is interior with obvious motion and close to ear The number of sampled point；

sum_rRepresent that there is the motion amplitude of the sampled point substantially moved, N in right detection zone_mrRepresent in right detection zone Number with the sampled point substantially moved, N_rRepresent there is obvious motion and to the close sampled point of ear in right detection zone Number；

M_l(i) motion amplitude of i-th of correct tracking sampling point in left detection zone, θ are represented_l(i) left detection zone is represented The direction of motion of interior i-th of correct tracking sampling point；

M_r(i) motion amplitude of i-th of correct tracking sampling point in right detection zone, θ are represented_r(i) right detection zone is represented The direction of motion of interior i-th of correct tracking sampling point.

In step (45), described judges whether to lift action of the hand close to ear, is specifically realized using below equation：

Wherein, exist=1 represents action of the lift hand close to ear be present, and exist=0 represents to be not present lift hand close to ear Piece action, s_lRepresent the action with the presence or absence of lift hand close to ear in the effective detection region of left side；s_rRepresent effective on right side Action in detection zone with the presence or absence of lift hand close to ear；T_mlWhen representing to lift hand close to ear in the effective detection region of left side Motion strength threshold；T_θlMotion range threshold value when representing to lift hand close to ear in the effective detection region of left side；T_mrRepresent Motion strength threshold when lifting hand close to ear in the effective detection region of right side；T_θrRepresent that hand is lifted in the effective detection region of right side to be leaned on Motion range threshold value during nearly ear；T_mlbRepresent in the effective detection region of left side, being averaged when the lift hand of standard is close to ear Exercise intensity；T_θlbRepresent in the effective detection region of left side, the mean motion range when lift hand of standard is close to ear；T_mrbTable Show in the effective detection region of right side, the mean motion intensity when lift hand of standard is close to ear；T_θrbExpression is effectively examined on right side Survey in region, the mean motion range when lift hand of standard is close to ear.

In step (51), the Gradient Features of described calculating present image, and gradient direction angle is corrected, specifically Realized using below equation：

Wherein, M (x, y), θ (x, y) represent the gradient magnitude and gradient direction at pixel (x, y) place respectively, and f (x, y) is represented The gray value at pixel (x, y) place.Gx represents the partial derivative in pixel (x, y) place x directions, and Gy represents the inclined of pixel (x, y) place y directions Derivative.

It is described to slide hough transform window in effective detection region with a fixed step size in step (53), judge square The region of shape detection window covering whether there is hand images；Specifically realized using the step of following order：

(531) histogram of gradients of each subwindow overlay area is counted；

(532) characteristic vector of hough transform window overlay area is obtained；

(533) characteristic vector of hough transform window overlay area is normalized；

(534) characteristic vector of hough transform window overlay area is sent into hand svm graders, predicts its affiliated class Not；If hand svm graders predict that this feature vector belongs to hand images, hough transform window position is added into candidate In zone list.

Phone-monitoring method is taken compared to other drivers, the present invention uses video image processing technology, is supervised by triggering Control pattern monitors the hand existence of driver's ear region in real time, and judgement takes phone behavior, has monitoring degree of accuracy height, The features such as missing inspection flase drop is less, and speed is fast, and cost is low.

Brief description of the drawings

Fig. 1 is flow chart of the method for the present invention；

Fig. 2 is trigger module workflow diagram；

Fig. 3 is monitoring module workflow diagram；

Fig. 4 is training positive sample master drawing；

Fig. 5 is the effective detection zone schematic diagram of positioning；

Fig. 6 is point sampling design sketch；

Fig. 7 is the subwindow division schematic diagram of hough transform window；

Fig. 8 is hough transform window sliding schematic diagram；

Fig. 9 is hand candidate test position design sketch, wherein, rectangle frame 1 is effective detection region, and rectangle frame 2 is candidate Detection zone；

Figure 10 is the final test position design sketch of hand, wherein, rectangle frame 1 is effective detection region, and rectangle frame 2 is final Test position.

Embodiment

The present invention is further illustrated below in conjunction with the accompanying drawings.

As shown in figure 1, in embodiment, the driver of the present invention based on svm takes phone-monitoring system and included initially Change module, acquisition module, locating module, trigger module, monitoring module and communication module.The driver based on svm takes electricity The specific implementation step for talking about monitoring system is as follows：

S1, perform initialization module.

The function of initial module is loading and the necessary grader file of training system, to comprise the following steps that：

S11, the existing Face datection grader file of loading.

S12, as shown in figure 4, collecting hand images when taking phone is used as positive sample, examined based on the theoretical training hands of svm Survey grader file.

S2, perform acquisition module.

The function of acquisition module is to gather the head image of the driving condition image, mainly driver of driver in real time.

S3, perform locating module.

The function of locating module is to select suitable image region to be detected as hand exercise and take what phone detected Effective coverage, mainly select the regional area near left and right ear.The module can greatly lift detection speed, remove More interference regions, as shown in Figure 5.The module comprises the following steps that：

S31, using harr features and adaboost sorting algorithms, detect face location.

S32, five, the three front yard layout rule based on face, the position that coarse positioning is left and right two.

S33, the position for being accurately positioned out eyes.

S34, suitable effective image subregion selected using formula (1) and formula (2).

S4, judge trigger module open and-shut mode, if trigger module is in opening, step S5 is performed, if trigger mode Block is closed, then performs step S7.

S5, perform trigger module.

The function of trigger module is that real-time judge driver whether there is the warming-up exercise for taking phone, in particular to drive Action of the person of sailing with the presence or absence of lift hand close to ear.If it does, explanation driver is possible to preparation and takes phone, now exit Trigger module, return and open monitoring module signal；If there is no the action, system may proceed to carry out real-time judge, under wait Once take the appearance of phone warming-up exercise.As shown in Fig. 2 the module comprises the following steps that：

S51, carry out uniform point sampling, effect such as Fig. 6 respectively in left and right effective detection region.

S52, the accurate tracking for carrying out sampled point.Specific track algorithm is referring to document：Forward-Backward Error: Automatic Detection of Tracking Failures, Zdenek Kalal, Krystian Mikolajczyk, Jiri Matas, Pattern Recognition (ICPR), 2010 20th International Conference on.

S53, the movable information for obtaining using formula (3) and formula (4) correct tracking sampling point；

S54, the statistical nature for obtaining using formula (5) and formula (6) effective detection region.Described effective detection region Statistical nature include the mean motion intensity avem in left side effective detection region_l, the right effective detection region mean motion Intensity avem_r, left side effective detection region motion range R_lWith the motion range R in the right effective detection region_r。

Wherein, sum_lRepresent that there is the motion amplitude of the sampled point substantially moved, N in left detection zone_mlRepresent left detection There is the number of the sampled point substantially moved, N in region_lRepresent that left detection zone is interior with obvious motion and close to ear The number of sampled point；sum_rRepresent that there is the motion amplitude of the sampled point substantially moved, N in right detection zone_mrRepresent right detection There is the number of the sampled point substantially moved, N in region_rRepresent that right detection zone is interior with obvious motion and close to ear The number of sampled point；M_l(i) motion amplitude of i-th of correct tracking sampling point in left detection zone, θ are represented_l(i) left inspection is represented Survey the direction of motion of i-th of correct tracking sampling point in region；M_r(i) i-th of correct tracking sampling in right detection zone is represented The motion amplitude of point, θ_r(i) direction of motion of i-th of correct tracking sampling point in right detection zone is represented.

S55, using formula (7)-(10), judge whether to lift action of the hand close to ear.If in the presence of exiting triggering Module；If it is not, then continue executing with trigger module.

Wherein, exist=1 represents action of the lift hand close to ear be present, and exist=0 represents to be not present lift hand close to ear Piece action.s_lRepresent the action with the presence or absence of lift hand close to ear in the effective detection region of left side；s_rRepresent effective on right side Action in detection zone with the presence or absence of lift hand close to ear；T_mlWhen representing to lift hand close to ear in the effective detection region of left side Motion strength threshold；T_θlMotion range threshold value when representing to lift hand close to ear in the effective detection region of left side；T_mrRepresent Motion strength threshold when lifting hand close to ear in the effective detection region of right side；T_θrRepresent that hand is lifted in the effective detection region of right side to be leaned on Motion range threshold value during nearly ear；T_mlbRepresent in the effective detection region of left side, being averaged when the lift hand of standard is close to ear Exercise intensity；T_θlbRepresent in the effective detection region of left side, the mean motion range when lift hand of standard is close to ear；T_mrbTable Show in the effective detection region of right side, the mean motion intensity when lift hand of standard is close to ear；T_θrbExpression is effectively examined on right side Survey in region, the mean motion range when lift hand of standard is close to ear.

S6, judge whether to trigger monitoring module.If in the presence of the warming-up exercise for taking phone, monitoring module will be triggered out Open, carry out in-depth monitoring, simultaneously close off trigger module.If in the absence of the warming-up exercise for taking phone, it is returned directly to gather mould Block, the triggering carried out next time judge.

S7, perform monitoring module.

The function of monitoring module is to monitor the time that the hand of driver is rested on beside ear in detail.If the hand of driver The time long enough rested on beside ear, illustrate that driver takes phone, then return to open communication module by signal.If It is no, then illustrate that this triggering unlatching belongs to erroneous judgement.As shown in figure 3, the module comprises the following steps that：

S71, using effective detection area image as present image, using formula (11) and formula (12), calculate current The Gradient Features of image, and gradient direction angle is corrected using formula (13), the scope of gradient direction is limited to [0 π]。

Wherein, M (x, y), θ (x, y) represent the gradient magnitude and gradient direction at pixel (x, y) place, and f (x, y) represents pixel The gray value at (x, y) place.

The one wide high respectively W and H hough transform window of S72, construction, W and H value are equal to hand svm graders The wide high level of training sample.As shown in fig. 7, subwindow division is carried out to hough transform window.Partitioning standards are：Driver does not connect During phone, due to the interference beside ear without hand, now the contouring head in effective detection region is generally vertical, because This, the narrow rectangular area in selection longitudinal direction is rational as subwindow.

S73, as shown in figure 8, hough transform window is slided with a fixed step size in effective detection region, and judge rectangle The region of detection window covering whether there is hand images, comprise the following steps that：

S731, the histogram of gradients hist [bin] for counting using formula (14) each subwindow overlay area.

Wherein, floor (x) represents to choose the maximum integer no more than x, and bin span is 1~10.

S732, the histogram of gradients of all subwindow overlay areas of series connection, obtain the feature of detection window overlay area to Amount.

S733, the characteristic vector normalization by hough transform window overlay area；

S734, judge whether detection window overlay area belongs to hand images, effect such as Fig. 9.In Fig. 9, square frame 1 represents left Right effective detection region, square frame 2 represent that hough transform window overlay area belongs to hand images.Specific method is by detection window The characteristic vector of overlay area is sent into hand svm graders, predicts its generic.If belonging to hand images, rectangle is examined Window position is surveyed to be put into the list of candidate region.

S74, judge whether detection window is slided at the ending of present image.If not reaching image ending, rectangle inspection Window sliding is surveyed to next position, and performs step S73 again；If having reached image ending, step S75 is performed.

S75, present image size is set as W_image、H_image, change of scale is carried out to present image, then schemed after change of scale As size is changed into σ W_image、σH_image, and the image after change of scale is designated as present image.Wherein 0 ＜ σ ＜ 1.

S76, judge whether present image size is less than minimum threshold.If being less than thresholding, illustrate that present image is too small, Through that there can not possibly be hand, step S77 is performed；If being more than thresholding, illustrate in present image it is possible to hand be present, perform Step S71 continues detection process.

S77, the number of candidate target in the list of candidate region and position, the position of hand in comprehensive descision present frame Put, effect such as Figure 10.In Figure 10, square frame 1 represents left and right effective detection region, and square frame 2 represents hough transform window overlay area Belong to hand images.

S78, judge to accumulate whether frame number reaches unit frame number.If reaching, step S79 is performed；If it is not, then perform step S71 continues detection process.

In S79, statistical unit frame number, the frame number proportion of hand images be present.If the ratio is more than certain threshold value, Illustrate that current driver's are in and take telephone state.

S710, trigger module is opened, close monitoring module.

S8, perform communication module.

The function of communication module is that, when driver, which is in, takes telephone state, the module sends to remote server and driven The person of sailing takes the real-time video of phone, and now supervision department of transport enterprise can timely be handled by the video；If Need to converse with driver, remote command can also be received by the module.

Embodiment described above is only that the preferred embodiment of the present invention is described, not to the model of the present invention Enclose and be defined, on the premise of design spirit of the present invention is not departed from, technical side of the those of ordinary skill in the art to the present invention The various modifications and improvement that case is made, it all should fall into the protection domain of claims of the present invention determination.

Claims

1. a kind of driver based on svm takes phone-monitoring method, it is characterised in that：The monitoring method includes following order Step：

(1) establish Face datection grader and hand images detect grader as the hand of training positive sample during taking phone；

(2) the driving condition image of driver is gathered in real time；

(4) warming-up exercise for taking phone whether there is according to effective detection area image, real-time judge driver；If so, then Perform step (5)；Step (2) is performed if it is not, then returning；

(5) according to effective detection area image, the time that the hand of driver is rested on beside ear is monitored in detail, and according to driving The time length that the hand of member is rested on beside ear judges whether driver takes phone；If so, then perform step (6)； Step (2) is performed if it is not, then returning；

(6) real-time video that driver is taken to phone is sent to remote server, and receives the order of remote server transmission；

It is described that the preparation for taking phone whether there is according to effective detection area image, real-time judge driver in step (4) Action, the step of specifically including following order：

(42) accurate tracking of sampled point is carried out；

(43) movable information of correct tracking sampling point is obtained；

(44) statistical nature in effective detection region is obtained, the statistical nature of described detection zone includes left effective detection region Mean motion intensity avem_l, right effective detection region mean motion intensity avem_r, left effective detection region motion range R_lWith the motion range R in right effective detection region_r；

(45) judge whether to lift action of the hand close to ear；

<mfenced open = "{" close = ""> <mtable> <mtr> <mtd> <mi>M</mi> <mo>(</mo> <mi>i</mi> <mo>)</mo> <mo>=</mo> <mi>s</mi> <mi>q</mi> <mi>r</mi> <mi>t</mi> <mo>(</mo> <msup> <msub> <mi>D</mi> <mi>x</mi> </msub> <mn>2</mn> </msup> <mo>+</mo> <msup> <msub> <mi>D</mi> <mi>y</mi> </msub> <mn>2</mn> </msup> <mo>)</mo> </mtd> </mtr> <mtr> <mtd> <mi>&theta;</mi> <mo>(</mo> <mi>i</mi> <mo>)</mo> <mo>=</mo> <mi>a</mi> <mi>r</mi> <mi>c</mi> <mi>t</mi> <mi>a</mi> <mi>n</mi> <mo>(</mo> <mfrac> <msub> <mi>D</mi> <mi>y</mi> </msub> <msub> <mi>D</mi> <mi>x</mi> </msub> </mfrac> <mo>)</mo> </mtd> </mtr> </mtable> </mfenced>

Wherein, M (i) represents the motion amplitude of ith sample point, and θ (i) represents the direction of motion of ith sample point, xpoint_iTable Show coordinate of the ith sample point on present frame, ypoint_iCoordinate of the ith sample point on previous frame is represented, Dx represents certain For individual sampled point in the amount of exercise in x directions, Dy represents amount of exercise of some sampled point in y directions；

<mfenced open = "{" close = ""> <mtable> <mtr> <mtd> <mrow> <msub> <mi>sum</mi> <mi>l</mi> </msub> <mo>=</mo> <msub> <mi>sum</mi> <mi>l</mi> </msub> <mo>+</mo> <msub> <mi>M</mi> <mi>l</mi> </msub> <mrow> <mo>(</mo> <mi>i</mi> <mo>)</mo> </mrow> </mrow> </mtd> <mtd> <mrow> <msub> <mi>M</mi> <mi>l</mi> </msub> <mrow> <mo>(</mo> <mi>i</mi> <mo>)</mo> </mrow> <mo>&GreaterEqual;</mo> <mn>2</mn> </mrow> </mtd> </mtr> <mtr> <mtd> <mrow> <msub> <mi>N</mi> <mrow> <mi>m</mi> <mi>l</mi> </mrow> </msub> <mo>=</mo> <msub> <mi>N</mi> <mrow> <mi>m</mi> <mn>1</mn> </mrow> </msub> <mo>+</mo> <mn>1</mn> </mrow> </mtd> <mtd> <mrow> <msub> <mi>M</mi> <mi>l</mi> </msub> <mrow> <mo>(</mo> <mi>i</mi> <mo>)</mo> </mrow> <mo>&GreaterEqual;</mo> <mn>2</mn> </mrow> </mtd> </mtr> <mtr> <mtd> <mrow> <msub> <mi>N</mi> <mi>l</mi> </msub> <mo>=</mo> <msub> <mi>N</mi> <mi>l</mi> </msub> <mo>+</mo> <mn>1</mn> </mrow> </mtd> <mtd> <mrow> <msub> <mi>M</mi> <mi>l</mi> </msub> <mrow> <mo>(</mo> <mi>i</mi> <mo>)</mo> </mrow> <mo>&GreaterEqual;</mo> <mn>2</mn> <mi>a</mi> <mi>n</mi> <mi>d</mi> <mfrac> <mi>&pi;</mi> <mn>4</mn> </mfrac> <mo><</mo> <msub> <mi>&theta;</mi> <mi>l</mi> </msub> <mrow> <mo>(</mo> <mi>i</mi> <mo>)</mo> </mrow> <mo>&le;</mo> <mfrac> <mi>&pi;</mi> <mn>2</mn> </mfrac> </mrow> </mtd> </mtr> <mtr> <mtd> <mrow> <msub> <mi>sum</mi> <mi>r</mi> </msub> <mo>=</mo> <msub> <mi>sum</mi> <mi>r</mi> </msub> <mo>+</mo> <msub> <mi>M</mi> <mi>r</mi> </msub> <mrow> <mo>(</mo> <mi>i</mi> <mo>)</mo> </mrow> </mrow> </mtd> <mtd> <mrow> <msub> <mi>M</mi> <mi>r</mi> </msub> <mrow> <mo>(</mo> <mi>i</mi> <mo>)</mo> </mrow> <mo>&GreaterEqual;</mo> <mn>2</mn> </mrow> </mtd> </mtr> <mtr> <mtd> <mrow> <msub> <mi>N</mi> <mrow> <mi>m</mi> <mi>r</mi> </mrow> </msub> <mo>=</mo> <msub> <mi>N</mi> <mrow> <mi>m</mi> <mi>r</mi> </mrow> </msub> <mo>+</mo> <mn>1</mn> </mrow> </mtd> <mtd> <mrow> <msub> <mi>M</mi> <mi>r</mi> </msub> <mrow> <mo>(</mo> <mi>i</mi> <mo>)</mo> </mrow> <mo>&GreaterEqual;</mo> <mn>2</mn> </mrow> </mtd> </mtr> <mtr> <mtd> <mrow> <msub> <mi>N</mi> <mi>r</mi> </msub> <mo>=</mo> <msub> <mi>N</mi> <mi>r</mi> </msub> <mo>+</mo> <mn>1</mn> </mrow> </mtd> <mtd> <mrow> <msub> <mi>M</mi> <mi>r</mi> </msub> <mrow> <mo>(</mo> <mi>i</mi> <mo>)</mo> </mrow> <mo>&GreaterEqual;</mo> <mn>2</mn> <mi>a</mi> <mi>n</mi> <mi>d</mi> <mfrac> <mi>&pi;</mi> <mn>2</mn> </mfrac> <mo><</mo> <msub> <mi>&theta;</mi> <mi>r</mi> </msub> <mrow> <mo>(</mo> <mi>i</mi> <mo>)</mo> </mrow> <mo>&le;</mo> <mfrac> <mrow> <mn>3</mn> <mo>*</mo> <mi>&pi;</mi> </mrow> <mn>4</mn> </mfrac> </mrow> </mtd> </mtr> </mtable> </mfenced>

Wherein, sum_lRepresent that there is the motion amplitude of the sampled point substantially moved, N in left detection zone_mlRepresent left detection zone The interior number with the sampled point substantially moved, N_lRepresent there is obvious motion and to the close sampling of ear in left detection zone The number of point；

sum_rRepresent that there is the motion amplitude of the sampled point substantially moved, N in right detection zone_mrRepresent have in right detection zone The number of the sampled point substantially moved, N_rRepresent there is obvious motion and to the number of the close sampled point of ear in right detection zone Mesh；

M_l(i) motion amplitude of i-th of correct tracking sampling point in left detection zone, θ are represented_l(i) the is represented in left detection zone The direction of motion of i correct tracking sampling points；

M_r(i) motion amplitude of i-th of correct tracking sampling point in right detection zone, θ are represented_r(i) the is represented in right detection zone The direction of motion of i correct tracking sampling points；

<mrow> <mi>e</mi> <mi>x</mi> <mi>i</mi> <mi>s</mi> <mi>t</mi> <mo>=</mo> <mfenced open = "{" close = ""> <mtable> <mtr> <mtd> <mn>1</mn> </mtd> <mtd> <mrow> <msub> <mi>s</mi> <mi>l</mi> </msub> <mo>+</mo> <msub> <mi>s</mi> <mi>r</mi> </msub> <mo>=</mo> <mn>1</mn> </mrow> </mtd> </mtr> <mtr> <mtd> <mn>0</mn> </mtd> <mtd> <mrow> <msub> <mi>s</mi> <mi>l</mi> </msub> <mo>+</mo> <msub> <mi>s</mi> <mi>r</mi> </msub> <mo>&NotEqual;</mo> <mn>1</mn> </mrow> </mtd> </mtr> </mtable> </mfenced> </mrow>

<mrow> <msub> <mi>s</mi> <mi>l</mi> </msub> <mo>=</mo> <mfenced open = "{" close = ""> <mtable> <mtr> <mtd> <mn>1</mn> </mtd> <mtd> <mrow> <msub> <mi>avem</mi> <mi>l</mi> </msub> <mo>&GreaterEqual;</mo> <msub> <mi>T</mi> <mrow> <mi>m</mi> <mi>l</mi> </mrow> </msub> <mi>a</mi> <mi>n</mi> <mi>d</mi> <mi> </mi> <msub> <mi>R</mi> <mi>l</mi> </msub> <mo>&GreaterEqual;</mo> <msub> <mi>T</mi> <mrow> <mi>&theta;</mi> <mi>l</mi> </mrow> </msub> </mrow> </mtd> </mtr> <mtr> <mtd> <mn>0</mn> </mtd> <mtd> <mrow> <msub> <mi>avem</mi> <mi>l</mi> </msub> <mo><</mo> <msub> <mi>T</mi> <mrow> <mi>m</mi> <mi>l</mi> </mrow> </msub> <mi>o</mi> <mi>r</mi> <mi> </mi> <msub> <mi>R</mi> <mi>l</mi> </msub> <mo><</mo> <msub> <mi>T</mi> <mrow> <mi>&theta;</mi> <mi>l</mi> </mrow> </msub> </mrow> </mtd> </mtr> </mtable> </mfenced> </mrow>

<mrow> <msub> <mi>s</mi> <mi>r</mi> </msub> <mo>=</mo> <mfenced open = "{" close = ""> <mtable> <mtr> <mtd> <mn>1</mn> </mtd> <mtd> <mrow> <msub> <mi>avem</mi> <mi>r</mi> </msub> <mo>&GreaterEqual;</mo> <msub> <mi>T</mi> <mrow> <mi>m</mi> <mi>r</mi> </mrow> </msub> <mi>a</mi> <mi>n</mi> <mi>d</mi> <mi> </mi> <msub> <mi>R</mi> <mi>r</mi> </msub> <mo>&GreaterEqual;</mo> <msub> <mi>T</mi> <mrow> <mi>&theta;</mi> <mi>r</mi> </mrow> </msub> </mrow> </mtd> </mtr> <mtr> <mtd> <mn>0</mn> </mtd> <mtd> <mrow> <msub> <mi>avem</mi> <mi>r</mi> </msub> <mo><</mo> <msub> <mi>T</mi> <mrow> <mi>m</mi> <mi>r</mi> </mrow> </msub> <mi>o</mi> <mi>r</mi> <mi> </mi> <msub> <mi>R</mi> <mi>r</mi> </msub> <mo><</mo> <msub> <mi>T</mi> <mrow> <mi>&theta;</mi> <mi>r</mi> </mrow> </msub> </mrow> </mtd> </mtr> </mtable> </mfenced> </mrow>

<mfenced open = "{" close = ""> <mtable> <mtr> <mtd> <msub> <mi>T</mi> <mrow> <mi>m</mi> <mi>l</mi> </mrow> </msub> <mo>=</mo> <msub> <mi>T</mi> <mrow> <mi>m</mi> <mi>l</mi> <mi>b</mi> </mrow> </msub> <mo>*</mo> <mn>0.7</mn> </mtd> </mtr> <mtr> <mtd> <msub> <mi>T</mi> <mrow> <mi>&theta;</mi> <mi>l</mi> </mrow> </msub> <mo>=</mo> <msub> <mi>T</mi> <mrow> <mi>&theta;</mi> <mi>l</mi> <mi>b</mi> </mrow> </msub> <mo>*</mo> <mn>0.7</mn> </mtd> </mtr> <mtr> <mtd> <msub> <mi>T</mi> <mrow> <mi>m</mi> <mi>r</mi> </mrow> </msub> <mo>=</mo> <msub> <mi>T</mi> <mrow> <mi>m</mi> <mi>r</mi> <mi>b</mi> </mrow> </msub> <mo>*</mo> <mn>0.7</mn> </mtd> </mtr> <mtr> <mtd> <msub> <mi>T</mi> <mrow> <mi>&theta;</mi> <mi>r</mi> </mrow> </msub> <mo>=</mo> <msub> <mi>T</mi> <mrow> <mi>&theta;</mi> <mi>r</mi> <mi>b</mi> </mrow> </msub> <mo>*</mo> <mn>0.7</mn> </mtd> </mtr> </mtable> </mfenced>

Wherein, exist=1 represents action of the lift hand close to ear be present, and exist=0 represents to be not present lift hand close to ear Action；s_lRepresent the action with the presence or absence of lift hand close to ear in the effective detection region of left side；s_rRepresent in right side effective detection Action in region with the presence or absence of lift hand close to ear；

T_mlMotion strength threshold when representing to lift hand close to ear in the effective detection region of left side；T_θlRepresent in left side effective detection Motion range threshold value when lifting hand close to ear in region；T_mrRepresent to lift hand in the effective detection region of right side close to ear luck Fatigue resistance threshold value；T_θrMotion range threshold value when representing to lift hand close to ear in the effective detection region of right side；

T_mlbRepresent in the effective detection region of left side, the mean motion intensity when lift hand of standard is close to ear；T_θlbRepresent In the effective detection region of left side, the mean motion range when lift hand of standard is close to ear；T_mrbRepresent in right side effective detection area In domain, the mean motion intensity when lift hand of standard is close to ear；T_θrbRepresent in the effective detection region of right side, the lift of standard Mean motion range when hand is close to ear.

2. a kind of driver based on svm according to claim 1 takes phone-monitoring method, it is characterised in that：Step (3) in, the described suitable image region of selection detects and taken the effective detection region of phone detection as hand exercise, The step of specifically including following order：

(33) it is accurately positioned out the position of eyes；

(34) suitable effective image subregion is selected.

3. a kind of driver based on svm according to claim 1 takes phone-monitoring method, it is characterised in that：Step (5) it is described according to effective detection area image in, the time that the hand of driver is rested on beside ear, and root are monitored in detail The time length rested on according to the hand of driver beside ear judges whether driver takes phone；Specifically include following suitable The step of sequence：

(51) using effective detection area image as present image, the Gradient Features of present image are calculated, and to gradient direction angle Degree is corrected, and the scope of gradient direction is limited to [0 π]；

(52) one a width of W of construction, a height of H hough transform window, W and H value are respectively equal to the training sample of hand trainer This wide high level, and subwindow division is carried out to hough transform window；

(53) hough transform window is slided with a fixed step size in effective detection region, judges the area of hough transform window covering Domain whether there is hand images；

(54) judge whether hough transform window is slided at the ending of present image；If so, then perform step (55)；If it is not, Then hough transform window sliding performs step (53) again to next position；

(55) a width of W of present image is set_image, a height of H_image, change of scale is carried out to present image, after carrying out change of scale A width of σ W of image_image, a height of σ H_image, the image after progress change of scale is set to present image；Wherein, 0 ＜ σ ＜ 1；

(56) judge whether the size of present image is less than minimum threshold；If the size of present image is less than minimum threshold, hold Row step (57)；If the size of present image is more than minimum threshold, returns and perform step (51)；

(57) number of the candidate target in the list of candidate region and position, the position of hand in comprehensive descision present frame；

(58) judge to accumulate whether frame number reaches unit frame number；If so, then perform step (59)；Step is performed if it is not, then returning (51)；

(59) in statistical unit frame number, the frame number proportion of hand images be present；And according to the frame number institute accounting of hand images Example judges whether driver is in and takes telephone state.

4. a kind of driver based on svm according to claim 2 takes phone-monitoring method, it is characterised in that：Step (34) in, the suitable effective image subregion of described selection, specifically realized using below equation：

Wherein, rect_left、rect_rightSub-rectangular areas near the left and right ear of selection is represented respectively, and rect represents to detect Face location rectangular area, point_l、point_rThe left hand edge point of left eye and the right hand edge point of right eye are represented respectively.

5. a kind of driver based on svm according to claim 3 takes phone-monitoring method, it is characterised in that：Step (51) in, the Gradient Features of described calculating present image, and gradient direction angle is corrected, specifically using below equation Realize：

<mfenced open = "{" close = ""> <mtable> <mtr> <mtd> <mi>M</mi> <mo>(</mo> <mi>x</mi> <mo>,</mo> <mi>y</mi> <mo>)</mo> <mo>=</mo> <msqrt> <mrow> <msup> <msub> <mi>G</mi> <mi>x</mi> </msub> <mn>2</mn> </msup> <mo>+</mo> <msup> <msub> <mi>G</mi> <mi>y</mi> </msub> <mn>2</mn> </msup> </mrow> </msqrt> </mtd> </mtr> <mtr> <mtd> <mrow> <mi>&theta;</mi> <mrow> <mo>(</mo> <mi>x</mi> <mo>,</mo> <mi>y</mi> <mo>)</mo> </mrow> <mo>=</mo> <mi>a</mi> <mi>r</mi> <mi>c</mi> <mi>t</mi> <mi>a</mi> <mi>n</mi> <mrow> <mo>(</mo> <mfrac> <msub> <mi>G</mi> <mi>y</mi> </msub> <msub> <mi>G</mi> <mi>x</mi> </msub> </mfrac> <mo>)</mo> </mrow> </mrow> </mtd> </mtr> </mtable> </mfenced>

<mrow> <mi>&theta;</mi> <mrow> <mo>(</mo> <mi>x</mi> <mo>,</mo> <mi>y</mi> <mo>)</mo> </mrow> <mo>=</mo> <mfenced open = "{" close = ""> <mtable> <mtr> <mtd> <mrow> <mi>&theta;</mi> <mrow> <mo>(</mo> <mi>x</mi> <mo>,</mo> <mi>y</mi> <mo>)</mo> </mrow> <mo>+</mo> <mi>&pi;</mi> </mrow> </mtd> <mtd> <mrow> <mi>&theta;</mi> <mrow> <mo>(</mo> <mi>x</mi> <mo>,</mo> <mi>y</mi> <mo>)</mo> </mrow> <mo><</mo> <mn>0</mn> </mrow> </mtd> </mtr> <mtr> <mtd> <mrow> <mi>&theta;</mi> <mrow> <mo>(</mo> <mi>x</mi> <mo>,</mo> <mi>y</mi> <mo>)</mo> </mrow> </mrow> </mtd> <mtd> <mrow> <mi>&theta;</mi> <mrow> <mo>(</mo> <mi>x</mi> <mo>,</mo> <mi>y</mi> <mo>)</mo> </mrow> <mo>&GreaterEqual;</mo> <mn>0</mn> </mrow> </mtd> </mtr> </mtable> </mfenced> </mrow>

Wherein, M (x, y), θ (x, y) represent the gradient magnitude and gradient direction at pixel (x, y) place respectively, and f (x, y) represents pixel The gray value at (x, y) place；Gx represents the partial derivative in pixel (x, y) place x directions, and Gy represents the local derviation in pixel (x, y) place y directions Number.

6. a kind of driver based on svm according to claim 3 takes phone-monitoring method, it is characterised in that：Step (53) it is described to slide hough transform window in effective detection region with a fixed step size in, judge that hough transform window covers The region of lid whether there is hand images；Specifically realized using the step of following order：

(531) histogram of gradients of each subwindow overlay area is counted；

(534) characteristic vector of hough transform window overlay area is sent into hand svm graders, predicts its generic；If Hand svm graders predict that this feature vector belongs to hand images, then hough transform window position are added into candidate region In list.