CN110119672A - Embedded fatigue state detection system and method - Google Patents
Embedded fatigue state detection system and method
- Publication number
- CN110119672A (application CN201910232993.8A)
- Authority
- CN
- China
- Prior art keywords
- eye
- mouth
- frame number
- image
- unit time
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000001514 detection method Methods 0.000 title claims abstract description 34
- 238000000034 method Methods 0.000 title claims abstract description 17
- 238000013527 convolutional neural network Methods 0.000 claims abstract description 40
- 238000012545 processing Methods 0.000 claims abstract description 16
- 239000011121 hardwood Substances 0.000 claims abstract description 8
- 206010048232 Yawning Diseases 0.000 claims abstract description 4
- 238000006243 chemical reaction Methods 0.000 claims abstract description 4
- 230000001815 facial effect Effects 0.000 claims description 18
- 230000004399 eye closure Effects 0.000 claims description 17
- 238000012549 training Methods 0.000 claims description 10
- 238000000605 extraction Methods 0.000 claims description 8
- 230000003287 optical effect Effects 0.000 claims description 7
- 238000013461 design Methods 0.000 claims description 6
- 238000011897 real-time detection Methods 0.000 claims description 6
- 230000001537 neural effect Effects 0.000 claims description 2
- 238000010586 diagram Methods 0.000 description 6
- 238000013528 artificial neural network Methods 0.000 description 5
- 238000005516 engineering process Methods 0.000 description 4
- 230000006870 function Effects 0.000 description 3
- 238000012512 characterization method Methods 0.000 description 2
- 238000003780 insertion Methods 0.000 description 2
- 230000037431 insertion Effects 0.000 description 2
- 238000010606 normalization Methods 0.000 description 2
- 210000004556 brain Anatomy 0.000 description 1
- 238000004891 communication Methods 0.000 description 1
- 230000006835 compression Effects 0.000 description 1
- 238000007906 compression Methods 0.000 description 1
- 230000003247 decreasing effect Effects 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 230000018109 developmental process Effects 0.000 description 1
- 230000000694 effects Effects 0.000 description 1
- 230000005611 electricity Effects 0.000 description 1
- 210000004709 eyebrow Anatomy 0.000 description 1
- 230000004927 fusion Effects 0.000 description 1
- 210000004209 hair Anatomy 0.000 description 1
- 238000003709 image segmentation Methods 0.000 description 1
- 210000004218 nerve net Anatomy 0.000 description 1
- 238000003062 neural network model Methods 0.000 description 1
- 238000005457 optimization Methods 0.000 description 1
- 230000035479 physiological effects, processes and functions Effects 0.000 description 1
- 238000013442 quality metrics Methods 0.000 description 1
- 230000029058 respiratory gaseous exchange Effects 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/50—Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
- G06V10/507—Summing image-intensity values; Histogram projection analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/59—Context or environment of the image inside of a vehicle, e.g. relating to seat occupancy, driver state or inner lighting conditions
- G06V20/597—Recognising the driver's state or behaviour, e.g. attention or drowsiness
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Oral & Maxillofacial Surgery (AREA)
- General Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Human Computer Interaction (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Biomedical Technology (AREA)
- Evolutionary Biology (AREA)
- Image Analysis (AREA)
Abstract
The present invention proposes an embedded fatigue state detection system and method. The system comprises: an image acquisition module, for dynamically capturing video of a learner through a camera, converting the video into frame images at a preset frame interval, and normalizing them; a data processing module, for locating the face in each frame image, then locating and cropping the eye and mouth images, identifying the open and closed states of the eyes and mouth with a trained convolutional neural network, computing the frequencies of closed-eye and yawning frames using the PERCLOS and FOM rules, comparing them against preset joint decision thresholds, and judging whether the target is fatigued; and an output decision module, for controlling a loudspeaker and a display screen according to the judgment result of the data processing module. The system and method of the invention are robust, practical, and easy to convert into a product.
Description
Technical field
The invention belongs to the field of fatigue state detection, and in particular relates to an embedded learning-fatigue state detection system and method based on convolutional neural networks.
Background technique
With the rapid development of information technology, more and more learners study online using Internet resources and learn autonomously with information tools. In such settings, the lack of supervision of the learning state may reduce learning efficiency and make fatigue more likely. Fatigue state effectively reflects a learner's interest in, and depth of understanding of, the current content, so detecting learning fatigue is of great value for reminding learners to refocus and adjust their state.
Current fatigue detection methods mainly fall into two categories: detecting physiological indicators such as EEG, ECG, heart rate, and respiration with biosensors; and capturing images of the learning state, analyzing the subject's facial features, and comparing manually defined characterization information (eye contour, mouth-opening distance, etc.) to decide whether the subject is fatigued. Physiological-indicator detection requires physiological sensors, most of which must be attached to the learner's body, and is therefore poorly suited to learning scenarios. Facial-feature detection with manually defined parameters involves complex algorithms and a cumbersome system structure; moreover, the captured images are easily disturbed by external factors such as camera angle, glasses, hair, and lighting, so the method lacks stability, which complicates feature extraction and comparison and significantly degrades judgment accuracy. Improving the discrimination methods and technology is therefore of practical significance.
Summary of the invention
In view of the problems of existing fatigue state detection methods (manually defined state features are difficult to design, cumbersome algorithms make the system complex, recognition accuracy is low, and applicability across scenarios is poor), the present invention provides an embedded learning-fatigue state detection system and method based on deep learning. The recognition method introduces a convolutional neural network to learn and classify the facial features of the test object, and combines the PERCLOS and FOM decision rules to realize fatigue state detection.
In a first aspect, the present invention proposes an embedded fatigue state detection system comprising:
an image acquisition module, for dynamically capturing video of a learner through a camera, converting the video into frame images at a preset frame interval, and normalizing them;
a data processing module, for locating the face in each frame image, then locating and cropping the eye and mouth images; identifying the open and closed states of the eyes and mouth with a trained convolutional neural network; computing the frequencies of the closed-eye and yawning frames output by the convolutional neural network using the PERCLOS and FOM rules, comparing them against preset joint decision thresholds, and judging whether the target is fatigued;
an output decision module, for controlling a loudspeaker and a display screen according to the judgment result of the data processing module.
Optionally, the data processing module specifically includes:
a face detection unit, which extracts the face image from the frame image using a histogram-of-oriented-gradients (HOG) based algorithm;
a feature extraction unit, which detects facial landmarks with a facial-landmark estimation algorithm and aligns the face: an ensemble-of-regression-trees (ERT) algorithm detects the key points in the face image, aligns the face, locates the eye and mouth regions according to the key-point indices, and segments out the eye and mouth images;
a model training unit, which designs the convolutional neural network structure and trains the network on a dataset composed of the public CelebA database and the eye and mouth images obtained by the feature extraction unit;
a classification and recognition unit, which identifies the cropped eye and mouth images with the trained convolutional neural network, obtains the open and closed states of the eyes and mouth, and saves them into a fixed-length queue;
a joint judgment unit, which presets joint decision thresholds, monitors the distribution of values in the queue in real time, and uses the PERCLOS and FOM rules to compute, respectively, the closed-eye and open-mouth frame frequencies per unit time; if either frequency exceeds the joint decision threshold, the unit judges the target fatigued and sends the result to the output decision module.
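As an illustration, the fixed-length queue and the per-unit-time frequency computation used by the joint judgment unit can be sketched in Python as follows. The window length and the boolean per-frame encoding are assumptions for illustration; the patent does not fix them.

```python
from collections import deque

WINDOW = 50  # assumed fixed queue length (frames per unit time)

class StateQueue:
    """Fixed-length queue of per-frame (eye_closed, mouth_open) states."""

    def __init__(self, maxlen=WINDOW):
        self.frames = deque(maxlen=maxlen)  # oldest entries drop off automatically

    def push(self, eye_closed, mouth_open):
        self.frames.append((bool(eye_closed), bool(mouth_open)))

    def frequencies(self):
        """Return (f_per, f_fom): closed-eye and open-mouth frame ratios."""
        n_total = len(self.frames)
        if n_total == 0:
            return 0.0, 0.0
        n_closed = sum(1 for eye, _ in self.frames if eye)
        n_open = sum(1 for _, mouth in self.frames if mouth)
        return n_closed / n_total, n_open / n_total
```

A `collections.deque` with `maxlen` gives the sliding-window behavior directly: pushing the state of each new frame evicts the oldest one once the window is full.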
Optionally, in the classification and recognition unit, the fully connected layer of the convolutional neural network uses a softmax classifier, defined as:
p_j = exp(y'_j) / Σ_k exp(y'_k)
where j = 0, 1 and p_j denotes the probability that the output belongs to class j; y'_j = Σ_i h_i·w_{i,j} + b_j is the output of the last fully connected layer of the convolutional neural network; h_i is the output of the previous layer (i = 0, 1); and w_{i,j} and b_j are the weight and bias of the last layer, respectively.
Optionally, in the joint judgment unit, the PERCLOS and FOM rules compute the closed-eye and open-mouth frame frequencies per unit time as follows.
Let f_per be the closed-eye frame frequency, i.e., the ratio of closed-eye frames to total frames in the unit time:
f_per = n / N
where n is the number of closed-eye frames in the unit time and N is the total number of video frames captured by the camera in the unit time.
Let f_fom be the open-mouth frame frequency, i.e., the ratio of open-mouth frames to total frames in the unit time:
f_fom = n' / N
where n' is the number of open-mouth frames in the unit time and N is the total number of video frames captured by the camera in the unit time.
Optionally, in the joint judgment unit, the joint decision thresholds and decision rules are set as follows: fatigue is judged when f_per > 0.4 or f_fom > 0.4; fatigue is likewise judged when f_per > 0.25 and f_fom > 0.25; otherwise, the target is not fatigued.
Optionally, when the target is judged fatigued, the output decision module controls the loudspeaker to sound a ring reminding the learner to pay attention to his or her state, while the display screen records the number of fatigue episodes and the study duration; if the study duration reaches a set study duration, the loudspeaker rings to remind the learner to rest.
In a second aspect, the present invention proposes an embedded fatigue state detection method, comprising:
S1: a camera dynamically captures video of the learner's activity while studying, and the video is converted into frame images at a preset frame interval;
S2: the face image is extracted from the frame image using a histogram-of-oriented-gradients (HOG) based algorithm;
S3: the key points in the face image are detected with an ensemble-of-regression-trees (ERT) algorithm, the face is aligned, and the eye and mouth images are located and cropped according to the key-point indices;
S4: the convolutional neural network structure is designed, and the network is trained on a dataset composed of the public CelebA database and the cropped eye and mouth images;
S5: the cropped eye and mouth images are identified with the trained convolutional neural network, and the open and closed states of the eyes and mouth are obtained and saved into a fixed-length queue;
S6: the distribution of values in the queue is monitored in real time; joint decision thresholds are preset, and the PERCLOS and FOM rules compute, respectively, the closed-eye and open-mouth frame frequencies per unit time; if either frequency exceeds the joint decision threshold, the learner is judged fatigued.
Optionally, after step S6 the method further comprises:
S7: if the learner is judged fatigued, a ring is sounded through the loudspeaker as a reminder, and the number of fatigue episodes and the study duration are recorded on the display screen; if the study duration exceeds the set study duration, a ring reminder is likewise sounded.
Optionally, in step S6, the PERCLOS and FOM rules compute the closed-eye and open-mouth frame frequencies per unit time as follows.
Let f_per be the closed-eye frame frequency, i.e., the ratio of closed-eye frames to total frames in the unit time:
f_per = n / N
where n is the number of closed-eye frames in the unit time and N is the total number of video frames captured by the camera in the unit time.
Let f_fom be the open-mouth frame frequency, i.e., the ratio of open-mouth frames to total frames in the unit time:
f_fom = n' / N
where n' is the number of open-mouth frames in the unit time and N is the total number of video frames captured by the camera in the unit time.
Optionally, in step S6, the preset joint decision thresholds and decision rules are: fatigue is judged when f_per > 0.4 or f_fom > 0.4; fatigue is likewise judged when f_per > 0.25 and f_fom > 0.25; otherwise, the learner is not fatigued.
The present invention uses a convolutional neural network to recognize features; the network learns each category of features autonomously during training, does not depend on manual involvement, achieves practically usable generalization, and is more robust. Joint thresholds are set on the two features (eyes and mouth) in combination with the PERCLOS and FOM rules; compared with existing single-feature recognition methods, this multi-information fusion has better characterization ability and accuracy. The entire detection system and method are easy to port to an embedded platform, simple in structure, and broadly applicable, which facilitates product conversion.
Detailed description of the invention
To explain the technical solution of the present invention more clearly, the drawings needed for its description are briefly introduced below. Obviously, the drawings described below are only some embodiments of the invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic structural diagram of the fatigue detection system provided in an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of the fatigue detection device used in the system provided in an embodiment of the present invention;
Fig. 3 is a flow diagram of the fatigue detection method provided in an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of the convolutional neural network provided in an embodiment of the present invention.
Specific embodiment
To make the purpose, features, and advantages of the invention more obvious and understandable, the technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings. Obviously, the embodiments described below are only some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the invention.
Referring to Fig. 1, the embedded fatigue state detection system proposed by the present invention comprises:
an image acquisition module 110, including a video acquisition unit 1101 for dynamically capturing video of the learner through a camera, and an image conversion unit 1102 for converting the video into frame images at a preset frame interval and normalizing them;
a data processing module 120, for locating the face in each frame image, then locating and cropping the eye and mouth images; identifying the open and closed states of the eyes and mouth with a trained convolutional neural network; computing the frequencies of the closed-eye and yawning frames output by the convolutional neural network using the PERCLOS and FOM rules, comparing them against preset joint decision thresholds, and judging whether the target is fatigued;
an output decision module 130, including a loudspeaker control unit 1301 and a display control unit 1302, for controlling the loudspeaker and the display screen according to the judgment result of the data processing module.
Preferably, the data processing module 120 may specifically include:
a face detection unit 1201, which extracts the face image from the frame image using a histogram-of-oriented-gradients (HOG) based algorithm;
a feature extraction unit 1202, which detects facial landmarks with a facial-landmark estimation algorithm and aligns the face: the ERT algorithm detects the key points in the face image, aligns the face, locates the eye and mouth regions according to the key-point indices, and segments out the eye and mouth images;
a model training unit 1203, which designs the convolutional neural network structure and trains the network on a dataset composed of the public CelebA database and the eye and mouth images obtained by the feature extraction unit;
a classification and recognition unit 1204, which identifies the cropped eye and mouth images with the trained convolutional neural network, obtains the open and closed states of the eyes and mouth, and saves them into a fixed-length queue. The fully connected layer of the convolutional neural network uses a softmax classifier, defined as:
p_j = exp(y'_j) / Σ_k exp(y'_k)
where j = 0, 1 and p_j denotes the probability that the output belongs to class j; y'_j = Σ_i h_i·w_{i,j} + b_j is the output of the last fully connected layer; h_i is the output of the previous layer (i = 0, 1); and w_{i,j} and b_j are the weight and bias of the last layer, respectively;
a joint judgment unit 1205, which presets joint decision thresholds, monitors the distribution of values in the queue in real time, and uses the PERCLOS and FOM rules to compute, respectively, the closed-eye and open-mouth frame frequencies per unit time; if either frequency exceeds the joint decision threshold, the unit judges the target fatigued and sends the result to the output decision module.
In the joint judgment unit 1205, let f_per be the closed-eye frame frequency, i.e., the ratio of closed-eye frames to total frames in the unit time: f_per = n / N, where n is the number of closed-eye frames in the unit time and N is the total number of video frames captured by the camera in the unit time. Let f_fom be the open-mouth frame frequency, i.e., the ratio of open-mouth frames to total frames in the unit time: f_fom = n' / N, where n' is the number of open-mouth frames in the unit time and N is the total number of video frames captured by the camera in the unit time.
In the joint judgment unit 1205, the joint decision thresholds and decision rules are set as follows: fatigue is judged when f_per > 0.4 or f_fom > 0.4; fatigue is likewise judged when f_per > 0.25 and f_fom > 0.25; otherwise, the target is not fatigued.
Fig. 2 is a schematic structural diagram of the fatigue detection device used in the fatigue state detection system; this device is only one embodiment of the system. The fatigue detection device includes an ARM processor 2 and, communicatively connected to it, a camera 1, an embedded GPU 3, a display screen 4, and a loudspeaker 5. The camera 1 acquires video of the subject's learning activity; the embedded ARM processor 2 receives and processes system information; the embedded GPU 3 performs graphics processing and classification; the display screen 4 records and displays the study duration and the number of fatigue episodes; and the loudspeaker 5 sounds the ring. Implanting the different functional modules of the system into the fatigue detection device realizes learner fatigue state detection, specifically:
The image acquisition module 110 includes the camera 1 shown in Fig. 2, which acquires the learner's video.
The data processing module comprises a series of video and image processing operations together with the deployment and operation of the convolutional neural network; its settings and processing algorithms are embedded in the ARM processor 2 and the GPU 3, which perform video conversion, image processing and recognition, program control, and so on. The convolutional neural network algorithm runs on the embedded GPU 3 and identifies the open and closed states of the eye and mouth regions; the ARM processor 2 jointly judges the fatigue state from the open and closed states of the eye and mouth regions.
The output decision module includes the display screen 4 and the loudspeaker 5, which receive signals from the ARM processor 2 and respond accordingly.
The present invention also proposes an embedded fatigue state detection method. Referring to Fig. 3, a flow diagram of the fatigue state detection method, the method comprises:
S1: the camera dynamically captures video of the learner's activity while studying, and the video is converted into frame images at a preset frame interval.
Specifically, after the fatigue detection system is switched on, the camera first acquires video of the learner's activity during study; the captured video is stored in the ARM processor, and a designed program converts the video into frame images as required and stores them in a specified directory as detection input data. Since the real-time requirements of learning-fatigue detection are not high, the frame interval can be chosen according to actual conditions.
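The frame-interval sampling of step S1 amounts to simple index arithmetic. The helper below is an illustrative sketch of which frames are kept; the interval value is a free choice, as noted above, and the function name is hypothetical:

```python
def sampled_frame_indices(total_frames, frame_interval):
    """Indices of the frames kept when every `frame_interval`-th frame of a
    `total_frames`-frame video is converted to an image, starting at frame 0."""
    if frame_interval < 1:
        raise ValueError("frame interval must be >= 1")
    return list(range(0, total_frames, frame_interval))
```

For example, a 10-frame clip sampled every 3 frames keeps frames 0, 3, 6, and 9; a larger interval lowers the processing load at the cost of temporal resolution.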
When the system is switched on, the operating duration is recorded and shown on the display screen of Fig. 2, and a study duration is set; once the recorded study time exceeds the set study duration, the system controls the loudspeaker to ring appropriately, reminding the learner to rest.
S2: the face image is extracted from the frame image using a histogram-of-oriented-gradients (HOG) based algorithm.
Specifically, since the face occupies only part of the full frame, face detection is needed to quickly extract the face image from the original image after the frame image is obtained. The face image is extracted with a HOG-based algorithm, whose basic steps are: (1) color space normalization; (2) gradient computation; (3) gradient orientation histogram computation; (4) histogram normalization.
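Steps (2) and (3) above can be sketched for a single cell. This is a minimal pure-Python illustration of the gradient-histogram idea, not the full HOG descriptor; step (4), block normalization, is omitted, and the bin count and cell handling are assumptions:

```python
import math

def hog_cell_histogram(cell, n_bins=9):
    """Magnitude-weighted gradient orientation histogram for one cell.

    `cell` is a list of rows of grayscale intensities. Gradients use
    central differences; orientation is unsigned (0-180 degrees), as is
    common for HOG. Border pixels are skipped for simplicity."""
    height, width = len(cell), len(cell[0])
    bins = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]   # horizontal gradient
            gy = cell[y + 1][x] - cell[y - 1][x]   # vertical gradient
            mag = math.hypot(gx, gy)
            ang = math.degrees(math.atan2(gy, gx)) % 180.0
            bins[int(ang // bin_width) % n_bins] += mag
    return bins
```

A vertical edge in the cell concentrates all of its weight in the 0-degree bin, which is what makes the descriptor sensitive to local edge orientation.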
S3: the key points in the face image are detected with an ensemble-of-regression-trees (ERT) algorithm, the face is aligned, and the eye and mouth images are located and cropped according to the key-point indices.
Specifically, after the face image is obtained, the eye and mouth pictures on which the fatigue judgment is based must still be extracted. The present invention detects facial landmarks and aligns the face with a facial-landmark estimation algorithm. First, a 68-point facial landmark model is chosen, covering the face from the outer eyebrows to the bottom of the jaw, including the eye contours and mouth contour. The algorithm then uses a gradient-boosting framework that learns an ensemble of regression trees (ERT) by optimizing the sum of a loss function and the errors, detects the 68 key points in the face image, and finally aligns the face. The required eye and mouth regions are located according to the key-point indices, and the eye and mouth feature pictures are segmented out from the located regions.
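Cropping the eye and mouth regions by key-point index can be sketched as below. The index ranges follow the common iBUG 68-point convention (left eye 36-41, right eye 42-47, mouth 48-67); this mapping is an assumption, since the patent does not enumerate the indices:

```python
# Assumed 68-point landmark index ranges (iBUG convention).
REGIONS = {
    "left_eye": range(36, 42),
    "right_eye": range(42, 48),
    "mouth": range(48, 68),
}

def region_bbox(landmarks, region, margin=2):
    """Axis-aligned crop box (x0, y0, x1, y1) around a landmark region.

    `landmarks` is a list of 68 (x, y) points; `margin` pads the box so the
    crop includes a little context around the contour."""
    pts = [landmarks[i] for i in REGIONS[region]]
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return (min(xs) - margin, min(ys) - margin,
            max(xs) + margin, max(ys) + margin)
```

The returned box can then be used to slice the aligned face image into the eye and mouth feature pictures fed to the network.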
S4: the convolutional neural network structure is designed, and the network is trained on a dataset composed of the public CelebA database and the cropped eye and mouth images.
Referring to Fig. 4, the feature recognition convolutional neural network designed by the present invention includes 2 convolutional layers, 2 pooling layers, and 1 fully connected layer. The convolutional layers extract features from the input data; the pooling layers compress the input feature maps, shrinking them to simplify the network's computational complexity while compressing the features and extracting the main ones; the fully connected layer connects all features and produces the output. A picture of size 36 × 28 is fed into the convolutional neural network; after several convolution and pooling operations, the fully connected layer identifies the eye or mouth image with the softmax classifier and judges the state: eyes open, eyes closed, mouth open, or mouth closed.
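The spatial sizes through the 2 conv + 2 pool blocks can be traced from the 36 × 28 input. Kernel size, padding, and pooling stride are not given in the text, so this sketch assumes same-padded convolutions (which preserve spatial size) and 2 × 2 pooling:

```python
def trace_shapes(height, width, n_blocks=2, pool=2):
    """Feature-map sizes through `n_blocks` of (same-padded conv -> pool).

    The same-padding and 2x2-pooling choices are illustrative assumptions;
    a same-padded convolution keeps the spatial size, and each pooling
    layer divides it by `pool`."""
    shapes = [(height, width)]
    for _ in range(n_blocks):
        height, width = height // pool, width // pool
        shapes.append((height, width))
    return shapes
```

Under these assumptions the 36 × 28 input shrinks to 18 × 14 after the first block and 9 × 7 after the second, before being flattened for the fully connected layer.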
With reference to Fig. 3, the training process of the convolutional neural network is as follows: the frame pictures obtained in step S1 serve as subject samples, and the eye and mouth feature pictures obtained after the face detection and feature segmentation of steps S2 and S3 are used as part of the training set. They are combined with the public CelebA database to jointly form the training set of the convolutional neural network, which is then loaded into the designed network structure; the convolutional neural network model is trained and the recognition model is finally generated. The model is trained continuously on the training set to produce the eye and mouth state recognition model.
S5: the cropped eye and mouth images are identified with the trained convolutional neural network; the open and closed states of the eyes and mouth are obtained and saved into a fixed-length queue.
The fully connected layer of the convolutional neural network uses a softmax classifier, defined as:
p_j = exp(y'_j) / Σ_k exp(y'_k)
where j = 0, 1 and p_j denotes the probability that the output belongs to class j; y'_j = Σ_i h_i·w_{i,j} + b_j is the output of the last fully connected layer; h_i is the output of the previous layer (i = 0, 1); and w_{i,j} and b_j are the weight and bias of the last layer, respectively.
The distribution situation being respectively worth in queue described in S6, real-time detection presets joint decision threshold, using Parclos and Fom
Rule calculates separately the frame number frequency that eye closure and mouth open in the unit time, if the frame number that eye closure or mouth open
Frequency then determines learner's fatigue when being more than the threshold value of the joint judgement.
The computation of the eye-closure and mouth-opening frame-count frequencies per unit time using the Perclos and Fom rules is specifically as follows. Let f_per be the eye-closure frame-count frequency, i.e. the ratio of eye-closed frames to total frames within the unit time:
f_per = n / N
where n is the number of eye-closed frames in the unit time and N is the total number of video frames captured by the camera in the unit time. Let f_fom be the mouth-opening frame-count frequency, i.e. the ratio of mouth-open frames to total frames within the unit time:
f_fom = n' / N
where n' is the number of mouth-open frames in the unit time and N is again the total number of video frames captured by the camera in the unit time. If T is the chosen unit time and f_0 is the frame rate of the camera's video capture, then N = T × f_0. f_per and f_fom quantify the degree of eye and mouth closure well; for both indices, a larger value indicates a greater degree of fatigue.
In step S6, the preset joint decision threshold and decision rules are as follows: fatigue is determined when f_per > 0.4 or f_fom > 0.4; fatigue is likewise determined when f_per > 0.25 and f_fom > 0.25; in all other cases the learner is not fatigued.
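The S6 joint decision can be sketched as a small function over the per-frame state flags. The 100-frame window contents below are simulated values; the thresholds (0.4 for either index alone, 0.25 for both jointly) are those stated above.

```python
# Sketch of the S6 joint decision rule: compute f_per = n/N and f_fom = n'/N
# over the fixed-length queue, then apply the single and joint thresholds.
def is_fatigued(eye_closed_flags, mouth_open_flags):
    """Apply the Perclos/Fom joint rule to per-frame 0/1 state flags."""
    N = len(eye_closed_flags)
    f_per = sum(eye_closed_flags) / N  # eye-closure frame-count frequency
    f_fom = sum(mouth_open_flags) / N  # mouth-opening frame-count frequency
    if f_per > 0.4 or f_fom > 0.4:     # either index alone exceeds 0.4
        return True
    if f_per > 0.25 and f_fom > 0.25:  # both indices exceed 0.25 jointly
        return True
    return False

# 100 frames: 30 eye-closed and 30 mouth-open frames -> the joint rule fires
eyes = [1] * 30 + [0] * 70
mouth = [1] * 30 + [0] * 70
```

The joint 0.25 branch catches cases where neither index alone is extreme but both are moderately elevated, which a single-index rule would miss.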
S7: if the learner is judged to be fatigued, a chime is issued through the loudspeaker as a reminder, and the number of fatigue events and the study duration are recorded on the display screen; if the study duration exceeds the set study duration, the same chime reminder is issued.
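The S7 bookkeeping can be sketched as follows. The class name, the tick interface, and the study-duration limit are assumptions for illustration; the actual loudspeaker and display-screen calls are hardware-specific and omitted.

```python
# Hypothetical sketch of the S7 logic: count fatigue events, accumulate
# study time, and decide when the chime reminder should sound.
class StudyMonitor:
    def __init__(self, study_limit_s):
        self.fatigue_count = 0   # fatigue events recorded on the display
        self.study_seconds = 0   # accumulated study duration
        self.study_limit_s = study_limit_s

    def tick(self, seconds, fatigued):
        """Advance the study clock; return True when a chime should sound."""
        self.study_seconds += seconds
        if fatigued:
            self.fatigue_count += 1
            return True          # fatigue detected -> chime immediately
        return self.study_seconds > self.study_limit_s  # over the limit

m = StudyMonitor(study_limit_s=60)  # 60 s limit, an assumed example value
```

The same chime path serves both triggers, matching the "same ring reminder" behavior described in S7.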
The above embodiments are merely illustrative of the technical solutions of the present invention and are not intended to limit them. Although the invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (10)
1. An embedded fatigue state detection system, characterized in that the system comprises:
an image acquisition module: for dynamically capturing video of a learner through a camera, converting the video into frame images at a preset frame interval, and normalizing them;
a data processing module: for locating the face in the frame images, then locating and cropping the eye and mouth images; identifying the open and closed states of the eyes and the mouth respectively through a trained convolutional neural network; computing the eye-closure and yawning frame-count frequencies output by the convolutional neural network using the Perclos and Fom methods, comparing them with a preset joint decision threshold, and judging whether the target is fatigued;
an output decision module: for controlling the loudspeaker and the display screen according to the judgment result of the data processing module.
2. The embedded fatigue state detection system according to claim 1, characterized in that the data processing module specifically comprises:
a face detection unit: extracting the facial image from the frame image using an algorithm based on the histogram of oriented gradients (HOG);
a feature extraction unit: detecting facial landmark points using a facial landmark estimation algorithm and aligning the face; the ERT algorithm detects the key points in the facial image, aligns the face, locates the eye and mouth regions according to the key-point indices, and segments out the eye and mouth images;
a model training unit: designing the convolutional neural network structure, and training the convolutional neural network on the dataset composed of the public CelebA dataset and the eye and mouth images obtained by the feature extraction unit;
a classification and recognition unit: identifying the captured eye and mouth images through the trained convolutional neural network to obtain the two states, open and closed, of the eyes and the mouth, and saving them in a fixed-length queue;
a joint decision unit: presetting a joint decision threshold, monitoring in real time the distribution of the values in the queue, computing the eye-closure and mouth-opening frame-count frequencies per unit time using the Perclos and Fom rules, determining fatigue when either frequency exceeds the joint decision threshold, and sending the judgment result to the output decision module.
3. The embedded fatigue state detection system according to claim 2, characterized in that in the classification and recognition unit, the fully connected layer of the convolutional neural network uses a softmax classifier, defined as:
p_j = exp(y'_j) / Σ_k exp(y'_k)
where j = 0, 1, and p_j denotes the probability that the output belongs to class j; y'_j = Σ_i h_i·w_{i,j} + b_j is the output of the last fully connected layer of the convolutional neural network; i = 0, 1, and h_i is the output of the previous layer; w_{i,j} and b_j are the weights and bias of the last layer, respectively.
4. The embedded fatigue state detection system according to claim 2, characterized in that in the joint decision unit, the computation of the eye-closure and mouth-opening frame-count frequencies per unit time using the Perclos and Fom rules is specifically:
let f_per be the eye-closure frame-count frequency, i.e. the ratio of eye-closed frames to total frames within the unit time:
f_per = n / N
where n is the number of eye-closed frames in the unit time and N is the total number of video frames captured by the camera in the unit time;
let f_fom be the mouth-opening frame-count frequency, i.e. the ratio of mouth-open frames to total frames within the unit time:
f_fom = n' / N
where n' is the number of mouth-open frames in the unit time and N is the total number of video frames captured by the camera in the unit time.
5. The embedded fatigue state detection system according to claim 4, characterized in that in the joint decision unit, the joint decision threshold and decision rules are set as follows: fatigue is determined when f_per > 0.4 or f_fom > 0.4; fatigue is likewise determined when f_per > 0.25 and f_fom > 0.25; in all other cases the target is not fatigued.
6. The embedded fatigue state detection system according to claim 1, characterized in that when fatigue of the target is determined, the output decision module controls the loudspeaker to issue a chime reminding the learner of his or her state, and the display screen records the number of fatigue events and the study duration; if the study duration exceeds the set study duration, the loudspeaker chimes to remind the learner to rest.
7. An embedded fatigue state detection method, characterized in that the method comprises:
S1: a camera dynamically captures video of the learner studying, and the video is converted into frame images at a preset frame interval;
S2: the facial image is extracted from the frame image using an algorithm based on the histogram of oriented gradients (HOG);
S3: the key points in the facial image are detected using the ensemble-of-regression-trees (ERT) algorithm, the face is aligned, and the eye and mouth images are located and cropped according to the key-point indices;
S4: the convolutional neural network structure is designed, and the convolutional neural network is trained on the dataset composed of the public CelebA dataset and the cropped eye and mouth images;
S5: the captured eye and mouth images are identified by the trained convolutional neural network to obtain the open and closed states of the eyes and the mouth, which are saved in a fixed-length queue;
S6: the distribution of the values in the queue is monitored in real time; a joint decision threshold is preset, and the Perclos and Fom rules are used to compute separately the eye-closure and mouth-opening frame-count frequencies per unit time; if either frequency exceeds the joint decision threshold, the learner is judged to be fatigued.
8. The embedded fatigue state detection method according to claim 7, characterized in that after step S6 the method further comprises:
S7: if the learner is judged to be fatigued, a chime is issued through the loudspeaker as a reminder, and the number of fatigue events and the study duration are recorded on the display screen; if the study duration exceeds the set study duration, the same chime reminder is issued.
9. The embedded fatigue state detection method according to claim 7, characterized in that in step S6, the computation of the eye-closure and mouth-opening frame-count frequencies per unit time using the Perclos and Fom rules is specifically:
let f_per be the eye-closure frame-count frequency, i.e. the ratio of eye-closed frames to total frames within the unit time:
f_per = n / N
where n is the number of eye-closed frames in the unit time and N is the total number of video frames captured by the camera in the unit time;
let f_fom be the mouth-opening frame-count frequency, i.e. the ratio of mouth-open frames to total frames within the unit time:
f_fom = n' / N
where n' is the number of mouth-open frames in the unit time and N is the total number of video frames captured by the camera in the unit time.
10. The embedded fatigue state detection method according to claim 9, characterized in that in step S6, the preset joint decision threshold and decision rules are as follows: fatigue is determined when f_per > 0.4 or f_fom > 0.4; fatigue is likewise determined when f_per > 0.25 and f_fom > 0.25; in all other cases the learner is not fatigued.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910232993.8A CN110119672A (en) | 2019-03-26 | 2019-03-26 | A kind of embedded fatigue state detection system and method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110119672A true CN110119672A (en) | 2019-08-13 |
Family
ID=67520643
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910232993.8A Pending CN110119672A (en) | 2019-03-26 | 2019-03-26 | A kind of embedded fatigue state detection system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110119672A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103617421A (en) * | 2013-12-17 | 2014-03-05 | 上海电机学院 | Fatigue detecting method and system based on comprehensive video feature analysis |
CN108309311A (en) * | 2018-03-27 | 2018-07-24 | 北京华纵科技有限公司 | A kind of real-time doze of train driver sleeps detection device and detection algorithm |
CN108545080A (en) * | 2018-03-20 | 2018-09-18 | 北京理工大学 | Driver Fatigue Detection and system |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110532976A (en) * | 2019-09-03 | 2019-12-03 | 湘潭大学 | Method for detecting fatigue driving and system based on machine learning and multiple features fusion |
CN110705453A (en) * | 2019-09-29 | 2020-01-17 | 中国科学技术大学 | Real-time fatigue driving detection method |
CN111179552A (en) * | 2019-12-31 | 2020-05-19 | 苏州清研微视电子科技有限公司 | Driver state monitoring method and system based on multi-sensor fusion |
CN111445496A (en) * | 2020-02-26 | 2020-07-24 | 沈阳大学 | Underwater image recognition tracking system and method |
CN111445496B (en) * | 2020-02-26 | 2023-06-30 | 沈阳大学 | Underwater image recognition tracking system and method |
CN112052775A (en) * | 2020-08-31 | 2020-12-08 | 同济大学 | Fatigue driving detection method based on gradient histogram video recognition technology |
CN113066264A (en) * | 2021-02-22 | 2021-07-02 | 广州铁路职业技术学院(广州铁路机械学校) | Fatigue state identification method and desk lamp |
CN113239841A (en) * | 2021-05-24 | 2021-08-10 | 桂林理工大学博文管理学院 | Classroom concentration state detection method based on face recognition and related instrument |
WO2022252673A1 (en) * | 2021-05-31 | 2022-12-08 | 青岛海尔空调器有限总公司 | Control method and apparatus for household appliance for adjusting fatigue degree, and household appliance |
CN113524182B (en) * | 2021-07-13 | 2023-05-16 | 东北石油大学 | Device and method for intelligently adjusting distance between person and screen |
CN113524182A (en) * | 2021-07-13 | 2021-10-22 | 东北石油大学 | Device and method for intelligently adjusting distance between person and screen |
CN114049676A (en) * | 2021-11-29 | 2022-02-15 | 中国平安财产保险股份有限公司 | Fatigue state detection method, device, equipment and storage medium |
CN114241719A (en) * | 2021-12-03 | 2022-03-25 | 广州宏途教育网络科技有限公司 | Visual fatigue state monitoring method and device in student learning and storage medium |
CN114241719B (en) * | 2021-12-03 | 2023-10-31 | 广州宏途数字科技有限公司 | Visual fatigue state monitoring method, device and storage medium in student learning |
CN114298189A (en) * | 2021-12-20 | 2022-04-08 | 深圳市海清视讯科技有限公司 | Fatigue driving detection method, device, equipment and storage medium |
CN116645917A (en) * | 2023-06-09 | 2023-08-25 | 浙江技加智能科技有限公司 | LED display screen brightness adjusting system and method thereof |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110119672A (en) | A kind of embedded fatigue state detection system and method | |
Liao et al. | Deep facial spatiotemporal network for engagement prediction in online learning | |
CN105516280B (en) | A kind of Multimodal Learning process state information packed record method | |
Dewan et al. | A deep learning approach to detecting engagement of online learners | |
CN109635727A (en) | A kind of facial expression recognizing method and device | |
KR102174595B1 (en) | System and method for identifying faces in unconstrained media | |
CN108549854B (en) | A kind of human face in-vivo detection method | |
CN106951867A (en) | Face identification method, device, system and equipment based on convolutional neural networks | |
CN107085715A (en) | A kind of television set intelligently detects the dormant system and method for user | |
Liu et al. | Region based parallel hierarchy convolutional neural network for automatic facial nerve paralysis evaluation | |
CN108182409A (en) | Biopsy method, device, equipment and storage medium | |
Hu et al. | Research on abnormal behavior detection of online examination based on image information | |
WO2020140723A1 (en) | Method, apparatus and device for detecting dynamic facial expression, and storage medium | |
CN109902558A (en) | A kind of human health deep learning prediction technique based on CNN-LSTM | |
CN106599800A (en) | Face micro-expression recognition method based on deep learning | |
CN104143079A (en) | Method and system for face attribute recognition | |
Huang et al. | RF-DCM: multi-granularity deep convolutional model based on feature recalibration and fusion for driver fatigue detection | |
KR102263840B1 (en) | AI (Artificial Intelligence) based fitness solution display device and method | |
CN103479367A (en) | Driver fatigue detection method based on facial action unit recognition | |
CN109431523A (en) | Autism primary screening apparatus based on asocial's sonic stimulation behavior normal form | |
Li et al. | Research on learner's emotion recognition for intelligent education system | |
CN106874867A (en) | A kind of face self-adapting detecting and tracking for merging the colour of skin and profile screening | |
CN106485232A (en) | A kind of personal identification method based on nose image feature in respiratory | |
Cowie et al. | Recognition of emotional states in natural human-computer interaction | |
CN110473176A (en) | Image processing method and device, method for processing fundus images, electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20190813 |