CN104573659A

CN104573659A - Driver call-making and call-answering monitoring method based on SVM

Info

Publication number: CN104573659A
Application number: CN201510013139.4A
Authority: CN
Inventors: 张卡; 何佳; 曹昌龙; 尼秀明
Original assignee: ANHUI QINGXIN INTERNET INFORMATION TECHNOLOGY Co Ltd
Current assignee: ANHUI QINGXIN INTERNET INFORMATION TECHNOLOGY Co Ltd
Priority date: 2015-01-09
Filing date: 2015-01-09
Publication date: 2015-04-29
Anticipated expiration: 2035-01-09
Also published as: CN104573659B

Abstract

The invention relates to an SVM-based method for monitoring whether a driver is making or answering a phone call. The monitoring method comprises the following sequential steps: a face detection classifier and a hand detection classifier are established, the latter trained with hand images captured during call making and call answering as positive samples; driving-state images of the driver are collected in real time; suitable image subregions are selected as the effective detection regions for hand-motion detection and call detection; whether the driver is preparing to make or answer a call is judged in real time from the images in the effective detection regions; the time the driver's hand stays beside the ear is then monitored closely in the effective detection regions, and whether the driver is making or answering a call is judged from the duration of the stay; finally, a real-time video of the driver making or answering the call is sent to a remote server, and commands sent by the remote server are received.

Description

An SVM-based method for monitoring a driver making or answering phone calls

Technical field

The present invention relates to the technical field of safe driving, and in particular to an SVM-based method for monitoring a driver making or answering phone calls.

Background technology

With the rapid growth of automobile ownership, people enjoy convenient and efficient transportation, but traffic accidents of all kinds have also become frequent, causing heavy casualties and economic losses. Many factors cause traffic accidents, and drivers making or answering phone calls while driving is one of the major ones. Because video of the driver's behavior cannot be obtained in real time, the supervision departments of some passenger and freight transport enterprises can only rely on after-the-fact inference to assign responsibility and cannot monitor or prevent such behavior in advance. Monitoring drivers' phone-call behavior in real time and reporting it promptly to the enterprise's supervision department for prevention therefore plays an irreplaceable role in avoiding major traffic accidents.

The support vector machine (SVM) is a supervised learning algorithm that shows distinctive advantages in small-sample, nonlinear and high-dimensional pattern recognition. The method is built on the VC-dimension theory of statistical learning theory and on the principle of structural risk minimization: given limited sample information, it seeks the best compromise between model complexity (the learning accuracy on the specific training samples) and learning ability (the ability to identify arbitrary samples without error), so as to obtain the best generalization ability.

At present, the following technical methods are commonly used to monitor drivers' phone-call behavior:

(1) Monitoring based on mobile phone signals. Methods of this type place a mobile phone signal detector in the cab and judge from the degree of signal fluctuation whether a phone call is taking place. They can monitor phone calls effectively on freight vehicles. On passenger vehicles, however, there is more interference, such as the mobile phone signals of the passengers, which leads to serious missed and false detections, so the driver's phone-call behavior cannot be monitored comprehensively in real time.

(2) Monitoring based on video images. Methods of this type monitor in real time whether both of the driver's hands are on the steering wheel; as soon as one hand leaves the steering wheel, the driver is considered to be on a phone call. They suffer from serious false detections, because many drivers habitually hold the steering wheel with one hand, and therefore have significant problems in real environments.

Summary of the invention

The object of the present invention is to provide an SVM-based method for monitoring a driver making or answering phone calls. The monitoring method is based on video image processing and judges phone-call behavior by monitoring in real time the state of the hand in the region around the driver's ears. It features high monitoring accuracy, few missed and false detections, high speed and low cost.

The technical solution of the present invention is as follows:

An SVM-based method for monitoring a driver making or answering phone calls, comprising the following sequential steps:

(1) establishing a face detection classifier and a hand detection classifier, the latter trained with hand images captured during phone calls as positive samples;

(2) collecting driving-state images of the driver in real time;

(3) selecting suitable image subregions as the effective detection regions for hand-motion detection and phone-call detection;

(4) judging in real time, from the effective detection region images, whether the driver performs a preparatory action for a phone call; if so, performing step (5); if not, returning to step (2);

(5) monitoring closely, from the effective detection region images, how long the driver's hand stays beside the ear, and judging from that duration whether the driver is on a phone call; if so, performing step (6); if not, returning to step (2);

(6) sending a real-time video of the driver's phone call to a remote server, and receiving commands sent by the remote server.

In step (3), selecting suitable image subregions as the effective detection regions for hand-motion detection and phone-call detection specifically comprises the following sequential steps:

(31) detecting the face position using Haar features and the AdaBoost classification algorithm;

(32) coarsely locating the left and right eyes according to the "three courts, five eyes" proportion rule of the face;

(33) precisely locating the eyes;

(34) selecting suitable effective image subregions.

In step (4), judging in real time, from the effective detection region images, whether the driver performs a preparatory action for a phone call specifically comprises the following sequential steps:

(41) performing uniform point sampling in the left and right effective detection regions respectively;

(42) tracking the sampled points accurately;

(43) obtaining the motion information of the correctly tracked sampled points;

(44) obtaining the statistical features of the effective detection regions, which comprise the mean motion intensity avem_l of the left effective detection region, the mean motion intensity avem_r of the right effective detection region, the motion range R_l of the left effective detection region and the motion range R_r of the right effective detection region;

(45) judging whether there is a hand-raising action toward the ear.

In step (5), monitoring closely, from the effective detection region images, how long the driver's hand stays beside the ear, and judging from that duration whether the driver is on a phone call, specifically comprises the following sequential steps:

(51) taking the effective detection region image as the current image, computing the gradient features of the current image, and correcting the gradient direction angle so that the gradient direction is limited to the range [0, π];

(52) constructing a rectangular detection window of width W and height H, where W and H equal the width and height of the training samples of the hand classifier, and dividing the rectangular detection window into subwindows;

(53) sliding the rectangular detection window through the effective detection region with a fixed step, and judging whether the region covered by the window contains a hand image;

(54) judging whether the rectangular detection window has slid to the end of the current image; if so, performing step (55); if not, sliding the window to the next position and performing step (53) again;

(55) letting the width of the current image be W_image and its height be H_image, rescaling the current image so that the rescaled image has width σW_image and height σH_image, and taking the rescaled image as the current image, where 0 < σ < 1;

(56) judging whether the size of the current image is below the minimum threshold; if it is below the minimum threshold, performing step (57); if it is above the minimum threshold, returning to step (51);

(57) determining the position of the hand in the current frame from the number and positions of the candidate targets in the candidate region list;

(58) judging whether the accumulated number of frames has reached the unit number of frames; if so, performing step (59); if not, returning to step (51);

(59) computing the proportion of frames containing a hand image within the unit number of frames, and judging from this proportion whether the driver is on a phone call.

In step (34), the suitable effective image subregions are selected using the following formulas:

\begin{cases} rect_{left}.x = point_{l}.x - rect.width \times 0.6 \\ rect_{left}.y = point_{l}.y - rect.height \times 0.16 \\ rect_{left}.width = rect.width \times 0.6 \\ rect_{left}.height = rect.height \times 1.16 \end{cases}

\begin{cases} rect_{right}.x = point_{r}.x \\ rect_{right}.y = point_{r}.y - rect.height \times 0.16 \\ rect_{right}.width = rect.width \times 0.6 \\ rect_{right}.height = rect.height \times 1.16 \end{cases}

Here rect_left and rect_right denote the selected sub-rectangular regions near the left and right ears respectively, rect denotes the detected face rectangle, and point_l and point_r denote the left edge point of the left eye and the right edge point of the right eye respectively.

In step (43), the motion information of the correctly tracked sampled points is obtained using the following formulas:

\begin{cases} M(i) = \sqrt{D_{x}^{2} + D_{y}^{2}} \\ \theta(i) = \arctan\left(\frac{D_{y}}{D_{x}}\right) \end{cases}

\begin{cases} D_{x} = xpoint_{i}.x - ypoint_{i}.x \\ D_{y} = xpoint_{i}.y - ypoint_{i}.y \end{cases}

Here M(i) denotes the motion amplitude of the i-th sampled point, θ(i) denotes its direction of motion, xpoint_i denotes the coordinates of the i-th sampled point in the current frame, and ypoint_i denotes its coordinates in the previous frame.

In step (44), the statistical features of the effective detection regions are obtained using the following formulas:

\begin{cases} avem_{l} = \frac{sum_{l}}{N_{ml}} \\ R_{l} = \frac{N_{l}}{N_{ml}} \\ avem_{r} = \frac{sum_{r}}{N_{mr}} \\ R_{r} = \frac{N_{r}}{N_{mr}} \end{cases}

\begin{cases} sum_{l} = sum_{l} + M_{l}(i), & M_{l}(i) \geq 2 \\ N_{ml} = N_{ml} + 1, & M_{l}(i) \geq 2 \\ N_{l} = N_{l} + 1, & M_{l}(i) \geq 2 \text{ and } \frac{\pi}{4} < \theta_{l}(i) \leq \frac{\pi}{2} \\ sum_{r} = sum_{r} + M_{r}(i), & M_{r}(i) \geq 2 \\ N_{mr} = N_{mr} + 1, & M_{r}(i) \geq 2 \\ N_{r} = N_{r} + 1, & M_{r}(i) \geq 2 \text{ and } \frac{\pi}{2} < \theta_{r}(i) \leq \frac{3\pi}{4} \end{cases}

Here quantities with subscript l describe the left effective detection region and quantities with subscript r describe the right effective detection region. sum_l and sum_r denote the accumulated motion amplitude of the sampled points with significant motion in the left and right detection regions respectively; N_ml and N_mr denote the number of sampled points with significant motion in the left and right detection regions respectively; N_l and N_r denote the number of sampled points in the left and right detection regions respectively that have significant motion and move toward the ear.

In step (45), whether there is a hand-raising action toward the ear is judged using the following formulas:

exist = \begin{cases} 1, & s_{l} + s_{r} = 1 \\ 0, & s_{l} + s_{r} \neq 1 \end{cases}

s_{l} = \begin{cases} 1, & avem_{l} \geq T_{ml} \text{ and } R_{l} \geq T_{\theta l} \\ 0, & avem_{l} < T_{ml} \text{ or } R_{l} < T_{\theta l} \end{cases}

s_{r} = \begin{cases} 1, & avem_{r} \geq T_{mr} \text{ and } R_{r} \geq T_{\theta r} \\ 0, & avem_{r} < T_{mr} \text{ or } R_{r} < T_{\theta r} \end{cases}

\begin{cases} T_{ml} = T_{mlb} \times 0.7 \\ T_{\theta l} = T_{\theta lb} \times 0.7 \\ T_{mr} = T_{mrb} \times 0.7 \\ T_{\theta r} = T_{\theta rb} \times 0.7 \end{cases}

Here exist = 1 indicates that there is a hand-raising action toward the ear and exist = 0 indicates that there is none. T_mlb, T_θlb, T_mrb and T_θrb denote the mean motion intensity and motion range of a standard hand-raising action toward the ear in the left and right effective detection regions respectively.

In step (51), the gradient features of the current image are computed and the gradient direction angle is corrected using the following formulas:

\begin{cases} M(x, y) = \sqrt{G_{x}^{2} + G_{y}^{2}} \\ \theta(x, y) = \arctan\left(\frac{G_{y}}{G_{x}}\right) \end{cases}

\begin{cases} G_{x} = f(x-1, y) + f(x+1, y) - 2 f(x, y) \\ G_{y} = f(x, y-1) + f(x, y+1) - 2 f(x, y) \end{cases}

\theta(x, y) = \begin{cases} \theta(x, y) + \pi, & \theta(x, y) < 0 \\ \theta(x, y), & \theta(x, y) \geq 0 \end{cases}

Here M(x, y) and θ(x, y) denote the gradient magnitude and direction at pixel (x, y) respectively, and f(x, y) denotes the gray value at pixel (x, y).

In step (53), sliding the rectangular detection window through the effective detection region with a fixed step and judging whether the region covered by the window contains a hand image is carried out in the following sequential steps:

(531) accumulating the gradient histogram of the region covered by each subwindow;

(532) obtaining the feature vector of the region covered by the rectangular detection window;

(533) normalizing the feature vector of the region covered by the rectangular detection window;

(534) feeding the feature vector of the region covered by the rectangular detection window into the hand SVM classifier to predict its class; if the hand SVM classifier predicts that the feature vector belongs to a hand image, adding the position of the rectangular detection window to the candidate region list.

Compared with other methods for monitoring drivers' phone calls, the present invention uses video image processing and a trigger-then-monitor mode to monitor in real time whether a hand is present in the region around the driver's ears, and judges phone-call behavior accordingly. It features high monitoring accuracy, few missed and false detections, high speed and low cost.

Description of the drawings

Fig. 1 is a flow chart of the method of the present invention;

Fig. 2 is a workflow diagram of the trigger module;

Fig. 3 is a workflow diagram of the monitoring module;

Fig. 4 shows examples of positive training samples;

Fig. 5 is a schematic diagram of locating the effective detection regions;

Fig. 6 shows the point sampling result;

Fig. 7 is a schematic diagram of the subwindow division of the rectangular detection window;

Fig. 8 is a schematic diagram of the sliding of the rectangular detection window;

Fig. 9 shows the candidate hand detection positions, where rectangle 1 is the effective detection region and rectangle 2 is a candidate detection region;

Fig. 10 shows the final hand detection position, where rectangle 1 is the effective detection region and rectangle 2 is the final detection position.

Embodiment

The present invention is further described below with reference to the accompanying drawings.

As shown in Fig. 1, in this embodiment the SVM-based driver phone-call monitoring system of the present invention comprises an initialization module, an acquisition module, a locating module, a trigger module, a monitoring module and a communication module. The system is implemented in the following steps:

S1: Execute the initialization module.

The initialization module loads and trains the classifier files the system requires. The specific steps are as follows:

S11: Load the existing face detection classifier file.

S12: As shown in Fig. 4, collect hand images captured during phone calls as positive samples, and train the hand detection classifier file based on SVM theory.

S2: Execute the acquisition module.

The acquisition module collects driving-state images of the driver in real time, mainly images of the driver's head.

S3: Execute the locating module.

The locating module selects suitable image subregions, mainly the local regions near the left and right ears, as the effective regions for hand-motion detection and phone-call detection. As shown in Fig. 5, this greatly increases detection speed and removes most interfering regions. The specific steps are as follows:

S31: Detect the face position using Haar features and the AdaBoost classification algorithm.

S32: Coarsely locate the left and right eyes based on the "three courts, five eyes" proportion rule of the face.

S33: Precisely locate the eyes.

S34: Select suitable effective image subregions using formula (1) and formula (2).

\begin{cases} rect_{left}.x = point_{l}.x - rect.width \times 0.6 \\ rect_{left}.y = point_{l}.y - rect.height \times 0.16 \\ rect_{left}.width = rect.width \times 0.6 \\ rect_{left}.height = rect.height \times 1.16 \end{cases} \qquad (1)

\begin{cases} rect_{right}.x = point_{r}.x \\ rect_{right}.y = point_{r}.y - rect.height \times 0.16 \\ rect_{right}.width = rect.width \times 0.6 \\ rect_{right}.height = rect.height \times 1.16 \end{cases} \qquad (2)
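Formulas (1) and (2) can be illustrated with a minimal Python sketch. The `Rect` and `Point` types and the function name `ear_regions` are hypothetical names introduced here for illustration; only the coefficients 0.6, 0.16 and 1.16 come from the formulas.

```python
from typing import NamedTuple, Tuple


class Rect(NamedTuple):
    x: float
    y: float
    width: float
    height: float


class Point(NamedTuple):
    x: float
    y: float


def ear_regions(face: Rect, point_l: Point, point_r: Point) -> Tuple[Rect, Rect]:
    """Return (rect_left, rect_right) following formulas (1) and (2).

    face    : detected face rectangle (rect in the formulas)
    point_l : left edge point of the left eye
    point_r : right edge point of the right eye
    """
    width = face.width * 0.6        # both regions are 0.6 face widths wide
    height = face.height * 1.16     # and 1.16 face heights tall
    top_offset = face.height * 0.16
    rect_left = Rect(point_l.x - width, point_l.y - top_offset, width, height)
    rect_right = Rect(point_r.x, point_r.y - top_offset, width, height)
    return rect_left, rect_right
```

For a 200 x 250 face rectangle with eye edge points at (130, 120) and (270, 120), the left region starts at about (10, 80), the right region at about (270, 80), and both measure about 120 x 290 pixels.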

S4: Check whether the trigger module is on. If the trigger module is on, perform step S5; if it is off, perform step S7.

S5: Execute the trigger module.

The trigger module judges in real time whether the driver performs a preparatory action for a phone call, that is, whether the driver raises a hand toward the ear. If so, the driver may be preparing to make or answer a call; the trigger module then exits and returns a signal to turn on the monitoring module. If there is no such action, the system continues the real-time judgment and waits for the next preparatory action. As shown in Fig. 2, the specific steps are as follows:

S51: Perform uniform point sampling in the left and right effective detection regions respectively; the result is shown in Fig. 6.

S52: Track the sampled points accurately. For the tracking algorithm, see: Zdenek Kalal, Krystian Mikolajczyk, Jiri Matas, "Forward-Backward Error: Automatic Detection of Tracking Failures", 2010 20th International Conference on Pattern Recognition (ICPR).
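The forward-backward idea behind the cited tracker can be sketched as follows: a point is tracked from the previous frame to the current frame and then back again, and the distance between the starting point and the back-tracked point serves as its error. The sketch assumes the back-tracked positions are already available from some point tracker, and it keeps the points whose error does not exceed the median, which is one common rejection rule and an assumption here rather than something the patent specifies. The function name `filter_by_fb_error` is hypothetical.

```python
import math
from statistics import median


def filter_by_fb_error(points, back_points):
    """Return the indices of points considered correctly tracked.

    points      : original (x, y) positions in the previous frame
    back_points : positions obtained by tracking forward, then backward
    """
    # Forward-backward error: distance between start and back-tracked point.
    errors = [math.dist(p, b) for p, b in zip(points, back_points)]
    if not errors:
        return []
    limit = median(errors)  # assumed rejection rule: keep errors <= median
    return [i for i, e in enumerate(errors) if e <= limit]
```

Only the points returned here would be passed on to the motion computation in the next step.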

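S53: Obtain the motion information of the correctly tracked sampled points using formula (3) and formula (4).

\begin{cases} M(i) = \sqrt{D_{x}^{2} + D_{y}^{2}} \\ \theta(i) = \arctan\left(\frac{D_{y}}{D_{x}}\right) \end{cases} \qquad (3)

\begin{cases} D_{x} = xpoint_{i}.x - ypoint_{i}.x \\ D_{y} = xpoint_{i}.y - ypoint_{i}.y \end{cases} \qquad (4)

Here M(i) denotes the motion amplitude of the i-th sampled point, θ(i) denotes its direction of motion, xpoint_i denotes the coordinates of the i-th sampled point in the current frame, and ypoint_i denotes its coordinates in the previous frame.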
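A minimal Python sketch of formulas (3) and (4) follows. One deliberate deviation is labeled as an assumption: the sketch uses `math.atan2(dy, dx)` instead of the plain arctangent of the quotient, so that a zero horizontal displacement does not cause a division by zero and directions beyond π/2 can be represented. The function name `point_motion` is hypothetical.

```python
import math


def point_motion(curr, prev):
    """Return (M, theta) for one sampled point.

    curr : (x, y) of the point in the current frame  (xpoint_i)
    prev : (x, y) of the point in the previous frame (ypoint_i)
    """
    dx = curr[0] - prev[0]          # D_x in formula (4)
    dy = curr[1] - prev[1]          # D_y in formula (4)
    magnitude = math.hypot(dx, dy)  # sqrt(D_x^2 + D_y^2), formula (3)
    # Assumption: atan2 replaces arctan(D_y / D_x) to cover all quadrants.
    theta = math.atan2(dy, dx)
    return magnitude, theta
```

The sign convention of the vertical axis (image rows usually grow downward) is not stated in the text, so any direction thresholds have to be chosen consistently with the coordinate system actually used.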

S54: Obtain the statistical features of the effective detection regions using formula (5) and formula (6). These statistical features comprise the mean motion intensity avem_l of the left effective detection region, the mean motion intensity avem_r of the right effective detection region, the motion range R_l of the left effective detection region and the motion range R_r of the right effective detection region.

\begin{cases} avem_{l} = \frac{sum_{l}}{N_{ml}} \\ R_{l} = \frac{N_{l}}{N_{ml}} \\ avem_{r} = \frac{sum_{r}}{N_{mr}} \\ R_{r} = \frac{N_{r}}{N_{mr}} \end{cases} \qquad (5)

\begin{cases} sum_{l} = sum_{l} + M_{l}(i), & M_{l}(i) \geq 2 \\ N_{ml} = N_{ml} + 1, & M_{l}(i) \geq 2 \\ N_{l} = N_{l} + 1, & M_{l}(i) \geq 2 \text{ and } \frac{\pi}{4} < \theta_{l}(i) \leq \frac{\pi}{2} \\ sum_{r} = sum_{r} + M_{r}(i), & M_{r}(i) \geq 2 \\ N_{mr} = N_{mr} + 1, & M_{r}(i) \geq 2 \\ N_{r} = N_{r} + 1, & M_{r}(i) \geq 2 \text{ and } \frac{\pi}{2} < \theta_{r}(i) \leq \frac{3\pi}{4} \end{cases} \qquad (6)

Here quantities with subscript l describe the left effective detection region and quantities with subscript r describe the right effective detection region. sum_l and sum_r denote the accumulated motion amplitude of the sampled points with significant motion in the left and right effective detection regions respectively; N_ml and N_mr denote the number of sampled points with significant motion in the left and right effective detection regions respectively; N_l and N_r denote the number of sampled points in the left and right effective detection regions respectively that have significant motion and move toward the ear.
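The accumulation in formulas (5) and (6) can be sketched for one region as follows. The function name `region_stats` is hypothetical, and returning zeros when no point moves significantly is an assumption added to avoid a division by zero, which the formulas do not address.

```python
import math


def region_stats(motions, theta_low, theta_high, min_magnitude=2.0):
    """Return (avem, R) for one effective detection region.

    motions    : iterable of (M, theta) pairs for the tracked points
    theta_low  : exclusive lower bound of the "toward the ear" direction
    theta_high : inclusive upper bound of the "toward the ear" direction
    """
    total = 0.0     # sum in formula (6)
    n_moving = 0    # N_m: points with significant motion
    n_toward = 0    # N: moving points heading toward the ear
    for magnitude, theta in motions:
        if magnitude >= min_magnitude:
            total += magnitude
            n_moving += 1
            if theta_low < theta <= theta_high:
                n_toward += 1
    if n_moving == 0:   # assumption: no significant motion gives zero features
        return 0.0, 0.0
    return total / n_moving, n_toward / n_moving


# Left region uses (pi/4, pi/2]; the right region would use (pi/2, 3*pi/4].
left = region_stats([(3.0, math.pi / 3), (1.0, math.pi / 3), (5.0, 0.1)],
                    math.pi / 4, math.pi / 2)
```

In this example two of the three points move by at least 2 pixels, so the mean motion intensity is 4.0, and one of those two moves in the ear direction, so the motion range is 0.5.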

S55: Judge whether there is a hand-raising action toward the ear using formulas (7) to (10). If there is, exit the trigger module; if not, continue executing the trigger module.

exist = \begin{cases} 1, & s_{l} + s_{r} = 1 \\ 0, & s_{l} + s_{r} \neq 1 \end{cases} \qquad (7)

s_{l} = \begin{cases} 1, & avem_{l} \geq T_{ml} \text{ and } R_{l} \geq T_{\theta l} \\ 0, & avem_{l} < T_{ml} \text{ or } R_{l} < T_{\theta l} \end{cases} \qquad (8)

s_{r} = \begin{cases} 1, & avem_{r} \geq T_{mr} \text{ and } R_{r} \geq T_{\theta r} \\ 0, & avem_{r} < T_{mr} \text{ or } R_{r} < T_{\theta r} \end{cases} \qquad (9)

\begin{cases} T_{ml} = T_{mlb} \times 0.7 \\ T_{\theta l} = T_{\theta lb} \times 0.7 \\ T_{mr} = T_{mrb} \times 0.7 \\ T_{\theta r} = T_{\theta rb} \times 0.7 \end{cases} \qquad (10)

Here exist = 1 indicates that there is a hand-raising action toward the ear and exist = 0 indicates that there is none. T_mlb, T_θlb, T_mrb and T_θrb denote the mean motion intensity and motion range of a standard hand-raising action toward the ear in the left and right detection regions respectively.
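The decision rule of formulas (7) to (10) can be sketched as two small functions. The function names and the baseline values used in the example are hypothetical; the 0.7 factor and the requirement that exactly one side fires come from the formulas.

```python
def side_flag(avem, motion_range, base_intensity, base_range, factor=0.7):
    """Formulas (8)-(10): 1 if both features reach 70% of the baselines."""
    t_m = base_intensity * factor    # T_m = T_mb * 0.7
    t_theta = base_range * factor    # T_theta = T_theta_b * 0.7
    return 1 if (avem >= t_m and motion_range >= t_theta) else 0


def raise_hand_detected(s_l, s_r):
    """Formula (7): exist = 1 only when exactly one side shows the action."""
    return 1 if s_l + s_r == 1 else 0


# Hypothetical baselines: intensity 5.0, range 0.6 for both sides.
s_l = side_flag(4.0, 0.5, 5.0, 0.6)   # thresholds 3.5 and 0.42 -> 1
s_r = side_flag(3.0, 0.5, 5.0, 0.6)   # intensity below 3.5      -> 0
exist = raise_hand_detected(s_l, s_r)
```

Requiring `s_l + s_r == 1` means that simultaneous motion on both sides does not trigger the monitoring module, which is consistent with a single hand being raised to one ear.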

S6: Decide whether to trigger the monitoring module. If a preparatory action for a phone call exists, the monitoring module is turned on for in-depth monitoring and the trigger module is turned off at the same time. If no preparatory action exists, return directly to the acquisition module for the next trigger judgment.

S7: Execute the monitoring module.

The monitoring module monitors closely how long the driver's hand stays beside the ear. If the hand stays beside the ear long enough, the driver is on a phone call, and the module returns a signal to turn on the communication module. Otherwise, this trigger was a false alarm. As shown in Fig. 3, the specific steps are as follows:

S71: Take the effective detection region image as the current image, compute the gradient features of the current image using formula (11) and formula (12), and correct the gradient direction angle using formula (13) so that the gradient direction is limited to the range [0, π].

\begin{cases} M(x, y) = \sqrt{G_{x}^{2} + G_{y}^{2}} \\ \theta(x, y) = \arctan\left(\frac{G_{y}}{G_{x}}\right) \end{cases} \qquad (11)

\begin{cases} G_{x} = f(x-1, y) + f(x+1, y) - 2 f(x, y) \\ G_{y} = f(x, y-1) + f(x, y+1) - 2 f(x, y) \end{cases} \qquad (12)

\theta(x, y) = \begin{cases} \theta(x, y) + \pi, & \theta(x, y) < 0 \\ \theta(x, y), & \theta(x, y) \geq 0 \end{cases} \qquad (13)

Here M(x, y) and θ(x, y) denote the gradient magnitude and direction at pixel (x, y) respectively, and f(x, y) denotes the gray value at pixel (x, y).
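Formulas (11) to (13) can be sketched for a single interior pixel as follows. The image is assumed to be a list of rows indexed as `f[y][x]`, and the handling of a zero horizontal component (direction set to π/2, or 0 when both components vanish) is an assumption, since the formulas leave that case undefined. The function name `gradient_at` is hypothetical.

```python
import math


def gradient_at(f, x, y):
    """Return (M, theta) at interior pixel (x, y) of gray image f[y][x]."""
    gx = f[y][x - 1] + f[y][x + 1] - 2 * f[y][x]   # formula (12)
    gy = f[y - 1][x] + f[y + 1][x] - 2 * f[y][x]
    magnitude = math.hypot(gx, gy)                 # formula (11)
    if gx == 0:
        # Assumption: vertical direction for gx == 0, zero for a flat pixel.
        theta = math.pi / 2 if gy != 0 else 0.0
    else:
        theta = math.atan(gy / gx)                 # in (-pi/2, pi/2)
    if theta < 0:                                  # formula (13)
        theta += math.pi
    return magnitude, theta
```

After the correction every direction lies between 0 and π, so opposite gradient directions fall into the same orientation class.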

S72: Construct a rectangular detection window of width W and height H, where W and H equal the width and height of the training samples of the hand SVM classifier. As shown in Fig. 7, divide the rectangular detection window into subwindows. The rationale for the division is as follows: when the driver is not on a call, there is no hand beside the ear, so the head contour within the effective detection region is roughly vertical; it is therefore reasonable to choose narrow vertical rectangles as subwindows.

S73: As shown in Fig. 8, slide the rectangular detection window through the effective detection region with a fixed step and judge whether the region covered by the window contains a hand image. The specific steps are as follows:

S731: Accumulate the gradient histogram hist[bin] of the region covered by each subwindow using formula (14).

\begin{cases} hist[bin] = hist[bin] + M(x, y) \\ bin = \mathrm{floor}\left(\frac{\theta(x, y) \times 10}{\pi}\right) + 1 \end{cases} \qquad (14)

Here floor(x) denotes the largest integer not greater than x, and bin ranges from 1 to 10.
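Formula (14) amounts to a magnitude-weighted orientation histogram with ten bins over the range 0 to π. In the sketch that follows, the function name `subwindow_histogram` is hypothetical, and clamping the bin index to 10 is an assumption that guards against an angle of exactly π.

```python
import math


def subwindow_histogram(gradients, n_bins=10):
    """Accumulate hist[bin] over (M, theta) pairs of one subwindow."""
    hist = [0.0] * n_bins
    for magnitude, theta in gradients:
        # bin = floor(theta * 10 / pi) + 1, a 1-based index as in formula (14)
        bin_index = int(math.floor(theta * n_bins / math.pi)) + 1
        bin_index = min(bin_index, n_bins)   # assumption: clamp theta == pi
        hist[bin_index - 1] += magnitude     # Python lists are 0-based
    return hist


hist = subwindow_histogram([(2.0, 0.0), (3.0, 1.6), (1.0, 0.05)])
```

Here the angles 0.0 and 0.05 both fall into bin 1, which accumulates 3.0, and the angle 1.6 falls into bin 6, which also accumulates 3.0.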

S732: Concatenate the gradient histograms of all subwindow regions to obtain the feature vector of the region covered by the detection window.

S733: Normalize the feature vector of the region covered by the rectangular detection window.

S734: Judge whether the region covered by the detection window is a hand image; the result is shown in Fig. 9, in which box 1 marks the left and right effective detection regions and box 2 marks the regions covered by the rectangular detection window that are classified as hand images. Specifically, feed the feature vector of the region covered by the detection window into the hand SVM classifier to predict its class. If it is a hand image, add the position of the rectangular detection window to the candidate region list.
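Steps S732 to S734 can be sketched as follows. The text does not state which normalization or which SVM kernel is used, so the sketch assumes L2 normalization and a linear decision function; the function names, the weights and the bias are all hypothetical placeholders for a trained classifier.

```python
import math


def window_feature(subwindow_hists):
    """Concatenate subwindow histograms and L2-normalize (assumed norm)."""
    vector = [v for hist in subwindow_hists for v in hist]
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm > 0 else vector


def linear_svm_is_hand(feature, weights, bias):
    """Assumed linear SVM decision: hand if w . x + b > 0."""
    score = sum(w * x for w, x in zip(weights, feature)) + bias
    return score > 0


candidates = []                                   # candidate region list
feature = window_feature([[3.0, 0.0], [4.0, 0.0]])
if linear_svm_is_hand(feature, [1.0, 0.0, 1.0, 0.0], -1.0):
    candidates.append((0, 0))                     # window position (x, y)
```

In a real system the weights would come from the hand detection classifier trained in step S12, and the feature length would be ten bins times the number of subwindows.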

S74: Judge whether the detection window has slid to the end of the current image. If it has not, slide the rectangular detection window to the next position and perform step S73 again; if it has, perform step S75.

S75: Let the size of the current image be W_image by H_image. Rescale the current image so that its size becomes σW_image by σH_image, and take the rescaled image as the current image, where 0 < σ < 1.

S76: Judge whether the size of the current image is below the minimum threshold. If it is below the threshold, the current image is too small to contain a hand, and step S77 is performed; if it is above the threshold, the current image may still contain a hand, and the detection process continues from step S71.
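The loop formed by steps S73 to S76 is a sliding-window scan over a shrinking image pyramid. The sketch below only enumerates window positions and pyramid sizes; the function names are hypothetical, and treating the minimum threshold as a separate minimum width and height is an assumption.

```python
def window_positions(img_w, img_h, win_w, win_h, step):
    """Yield top-left (x, y) positions of the window, row by row (S73, S74)."""
    for y in range(0, img_h - win_h + 1, step):
        for x in range(0, img_w - win_w + 1, step):
            yield x, y


def pyramid_sizes(img_w, img_h, sigma, min_w, min_h):
    """Yield successive image sizes until one falls below the minimum (S75, S76)."""
    w, h = img_w, img_h
    while w >= min_w and h >= min_h:
        yield w, h
        w, h = int(w * sigma), int(h * sigma)   # rescale by sigma, 0 < sigma < 1


sizes = list(pyramid_sizes(100, 80, 0.5, 20, 20))
positions = list(window_positions(10, 8, 6, 6, 2))
```

Because the window size stays fixed while the image shrinks, each pyramid level effectively searches for larger hands in the original image.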

S77: Determine the position of the hand in the current frame from the number and positions of the candidate targets in the candidate region list; the result is shown in Fig. 10, in which box 1 marks the left and right effective detection regions and box 2 marks the final detected position of the hand.

S78: Judge whether the accumulated number of frames has reached the unit number of frames. If it has, perform step S79; if not, continue the detection process from step S71.

S79: Compute the proportion of frames containing a hand image within the unit number of frames. If this proportion exceeds a given threshold, the driver is currently on a phone call.
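The decision in step S79 can be sketched as a simple ratio test over one unit of frames. The function name `is_on_call` and the threshold value in the example are hypothetical, since the text only refers to a certain threshold.

```python
def is_on_call(hand_flags, ratio_threshold):
    """Return True if the share of frames with a detected hand exceeds the threshold.

    hand_flags : one boolean per frame in the unit, True if a hand was found
    """
    if not hand_flags:
        return False
    ratio = sum(1 for flag in hand_flags if flag) / len(hand_flags)
    return ratio > ratio_threshold


on_call = is_on_call([True] * 8 + [False] * 2, 0.7)   # 8 of 10 frames
```

Using a proportion rather than requiring a hand in every frame tolerates occasional missed detections within the unit.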

S710: Turn on the trigger module and turn off the monitoring module.
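The switching between the trigger module and the monitoring module described in steps S4, S6 and S710 behaves like a two-state machine. The following sketch is one possible reading under that assumption; the mode names and the function `next_mode` are hypothetical.

```python
def next_mode(mode, raise_hand_detected=False, unit_finished=False):
    """Return the module that should run on the next frame.

    mode                : "trigger" or "monitor"
    raise_hand_detected : result of step S55 (only used in trigger mode)
    unit_finished       : True once step S79 has judged one unit of frames
    """
    if mode == "trigger":
        # S6: a preparatory action turns the monitoring module on.
        return "monitor" if raise_hand_detected else "trigger"
    if mode == "monitor":
        # S710: after the unit has been judged, go back to triggering.
        return "trigger" if unit_finished else "monitor"
    raise ValueError(f"unknown mode: {mode}")
```

Keeping the inexpensive motion test as the default state and running the more expensive SVM detection only after a trigger is what allows the method to run in real time.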

S8: Execute the communication module.

When the driver is on a phone call, the communication module sends a real-time video of the call to the remote server, so that the supervision department of the transport enterprise can respond promptly based on this video. If a conversation with the driver is needed, remote commands can also be received through this module.

The embodiment described above merely illustrates a preferred embodiment of the present invention and does not limit its scope. Without departing from the design spirit of the present invention, all variations and improvements made to the technical solution of the present invention by those of ordinary skill in the art shall fall within the scope of protection defined by the claims of the present invention.

Claims

1. An SVM-based method for monitoring a driver making or answering phone calls, characterized in that the monitoring method comprises the following sequential steps:

(2) collecting driving-state images of the driver in real time;

2. The SVM-based method for monitoring a driver making or answering phone calls according to claim 1, characterized in that in step (3), selecting suitable image subregions as the effective detection regions for hand-motion detection and phone-call detection specifically comprises the following sequential steps:

(33) precisely locating the eyes;

(34) selecting suitable effective image subregions.

3. The SVM-based method for monitoring a driver making or answering phone calls according to claim 1, characterized in that in step (4), judging in real time, from the effective detection region images, whether the driver performs a preparatory action for a phone call specifically comprises the following sequential steps:

(42) tracking the sampled points accurately;

(43) obtaining the motion information of the correctly tracked sampled points;

(45) judging whether there is a hand-raising action toward the ear.

4. The SVM-based method for monitoring a driver making or answering phone calls according to claim 1, characterized in that in step (5), monitoring closely, from the effective detection region images, how long the driver's hand stays beside the ear, and judging from that duration whether the driver is on a phone call, specifically comprises the following sequential steps:

5. The SVM-based method for monitoring a driver making or answering phone calls according to claim 2, characterized in that in step (34), the suitable effective image subregions are selected using the following formulas:

\begin{cases} rect_{left}.x = point_{l}.x - rect.width \times 0.6 \\ rect_{left}.y = point_{l}.y - rect.height \times 0.16 \\ rect_{left}.width = rect.width \times 0.6 \\ rect_{left}.height = rect.height \times 1.16 \end{cases}

\begin{cases} rect_{right}.x = point_{r}.x \\ rect_{right}.y = point_{r}.y - rect.height \times 0.16 \\ rect_{right}.width = rect.width \times 0.6 \\ rect_{right}.height = rect.height \times 1.16 \end{cases}

6. The SVM-based method for monitoring a driver making or answering phone calls according to claim 3, characterized in that in step (43), the motion information of the correctly tracked sampled points is obtained using the following formulas:

\begin{cases} M(i) = \sqrt{D_{x}^{2} + D_{y}^{2}} \\ \theta(i) = \arctan\left(\frac{D_{y}}{D_{x}}\right) \end{cases}

\begin{cases} D_{x} = xpoint_{i}.x - ypoint_{i}.x \\ D_{y} = xpoint_{i}.y - ypoint_{i}.y \end{cases}

7. The SVM-based method for monitoring a driver making or answering phone calls according to claim 3, characterized in that in step (44), the statistical features of the effective detection regions are obtained using the following formulas:

\begin{cases} avem_{l} = \frac{sum_{l}}{N_{ml}} \\ R_{l} = \frac{N_{l}}{N_{ml}} \\ avem_{r} = \frac{sum_{r}}{N_{mr}} \\ R_{r} = \frac{N_{r}}{N_{mr}} \end{cases}

\begin{cases} sum_{l} = sum_{l} + M_{l}(i), & M_{l}(i) \geq 2 \\ N_{ml} = N_{ml} + 1, & M_{l}(i) \geq 2 \\ N_{l} = N_{l} + 1, & M_{l}(i) \geq 2 \text{ and } \frac{\pi}{4} < \theta_{l}(i) \leq \frac{\pi}{2} \\ sum_{r} = sum_{r} + M_{r}(i), & M_{r}(i) \geq 2 \\ N_{mr} = N_{mr} + 1, & M_{r}(i) \geq 2 \\ N_{r} = N_{r} + 1, & M_{r}(i) \geq 2 \text{ and } \frac{\pi}{2} < \theta_{r}(i) \leq \frac{3\pi}{4} \end{cases}

8. The SVM-based method for monitoring a driver making or answering phone calls according to claim 3, characterized in that in step (45), whether there is a hand-raising action toward the ear is judged using the following formulas:

exist = \begin{cases} 1, & s_{l} + s_{r} = 1 \\ 0, & s_{l} + s_{r} \neq 1 \end{cases}

s_{l} = \begin{cases} 1, & avem_{l} \geq T_{ml} \text{ and } R_{l} \geq T_{\theta l} \\ 0, & avem_{l} < T_{ml} \text{ or } R_{l} < T_{\theta l} \end{cases}

s_{r} = \begin{cases} 1, & avem_{r} \geq T_{mr} \text{ and } R_{r} \geq T_{\theta r} \\ 0, & avem_{r} < T_{mr} \text{ or } R_{r} < T_{\theta r} \end{cases}

\begin{cases} T_{ml} = T_{mlb} \times 0.7 \\ T_{\theta l} = T_{\theta lb} \times 0.7 \\ T_{mr} = T_{mrb} \times 0.7 \\ T_{\theta r} = T_{\theta rb} \times 0.7 \end{cases}

9. The SVM-based method for monitoring a driver making or answering phone calls according to claim 4, characterized in that in step (51), the gradient features of the current image are computed and the gradient direction angle is corrected using the following formulas:

\begin{cases} M(x, y) = \sqrt{G_{x}^{2} + G_{y}^{2}} \\ \theta(x, y) = \arctan\left(\frac{G_{y}}{G_{x}}\right) \end{cases}

\begin{cases} G_{x} = f(x-1, y) + f(x+1, y) - 2 f(x, y) \\ G_{y} = f(x, y-1) + f(x, y+1) - 2 f(x, y) \end{cases}

\theta(x, y) = \begin{cases} \theta(x, y) + \pi, & \theta(x, y) < 0 \\ \theta(x, y), & \theta(x, y) \geq 0 \end{cases}

10. The SVM-based method for monitoring a driver making or answering phone calls according to claim 4, characterized in that in step (53), sliding the rectangular detection window through the effective detection region with a fixed step and judging whether the region covered by the window contains a hand image is carried out in the following sequential steps:

(531) accumulating the gradient histogram of the region covered by each subwindow;

(532) obtaining the feature vector of the region covered by the rectangular detection window;

(533) normalizing the feature vector of the region covered by the rectangular detection window;