CN110826396A - Method and device for detecting eye state in video - Google Patents


Info

Publication number
CN110826396A
Authority
CN
China
Prior art keywords: eye, eye region, detecting, video, features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910883511.5A
Other languages
Chinese (zh)
Other versions
CN110826396B (en)
Inventor
张晋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Unisound Intelligent Technology Co Ltd
Original Assignee
Unisound Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Unisound Intelligent Technology Co Ltd filed Critical Unisound Intelligent Technology Co Ltd
Priority to CN201910883511.5A priority Critical patent/CN110826396B/en
Publication of CN110826396A publication Critical patent/CN110826396A/en
Application granted granted Critical
Publication of CN110826396B publication Critical patent/CN110826396B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06V 40/161 Human faces: detection; localisation; normalisation
    • G06F 18/2411 Classification techniques based on the proximity to a decision surface, e.g. support vector machines
    • G06N 3/08 Neural networks: learning methods
    • G06V 40/171 Human faces: local features and components; facial parts; occluding parts, e.g. glasses
    • G06V 40/193 Eye characteristics: preprocessing; feature extraction


Abstract

The invention discloses a method for detecting eye states in a video, which comprises the following steps: performing face detection on a current video frame of a current user to obtain first eye region features of the current video frame; calling a pre-trained SVM classifier; and sending the first eye region features into the SVM classifier for classification, and detecting the eye state of the current user according to the classification result. The eye region features can be detected, acquired, and used for training with fewer neural networks, the detection process is simple, and the efficiency of acquiring eye region features is greatly improved; this solves the problem that the prior art performs complex operations with two neural networks, and the eye state of the current user can be judged accurately.

Description

Method and device for detecting eye state in video
Technical Field
The present disclosure relates to the field of face recognition technologies, and in particular, to a method and an apparatus for detecting eye states in a video.
Background
Eyes are the most important part of the human face and the most direct sense organ for perceiving external objects, and methods for detecting the eye state are finding an ever wider range of applications. For example, a motor vehicle driver becomes fatigued after driving for a long time; detecting the eye state can reveal the driver's eye fatigue and thus help prevent traffic accidents. Parents can use eye state detection to check whether a child is talking or sleeping, and so on; for example, an online learning platform can use such state detection to remind the user to concentrate and study carefully.
At present, most eye state detection methods for video are based on feature analysis, with the following steps: capture the face picture from the video, detect it with a face neural network, and then acquire the eye region features with the face neural network and an eye neural network for training. This approach needs several neural networks to obtain the eye region features, which is cumbersome, time-consuming, and inefficient, and judging the eye state of the person in the video from a single frame is not stable enough.
In view of the above problems, a stable and efficient method for detecting the eye state in video is needed.
Disclosure of Invention
To address the problems set out above, the present method detects the eye state with a pattern classification approach: it obtains the eye region features of the current video frame and judges the eye state by means of an SVM (support vector machine), a generalized linear classifier that performs binary classification on data in a supervised learning manner.
The eye state includes an open eye state and a closed eye state.
A method for detecting eye states in videos comprises the following steps:
s101, carrying out face detection on a current video frame of a current user to obtain a first eye region characteristic of the current video frame;
s102, calling a pre-trained SVM classifier;
s103, sending the first eye region characteristics into the SVM classifier for classification, and detecting the eye state of the current user according to the classification result.
Preferably, the performing face detection on the current video frame to obtain the first eye region feature of the current video frame includes:
detecting key points of the human face by using a neural network;
acquiring human eye key points from human face key points;
and combining the human eye key points and the human face key points to map and output the first eye region characteristics.
Preferably, the performing face detection on the current video frame to obtain the first eye region feature of the current video frame further includes:
generating two minimum circumscribed rectangles of the left eye and the right eye according to the landmarks coordinates of the eyes of the face region;
and mapping the minimum circumscribed rectangle to a representation layer of the human face features in the neural network according to a receptive field calculation formula to obtain the high-dimensional features of the eye region.
Preferably, before invoking the pre-trained SVM classifier, the method further comprises:
detecting a second eye region feature of a preset user in a preset video;
and carrying out interpolation processing on the second eye region characteristics to obtain characteristics with the same dimensionality for training to obtain the SVM classifier.
Preferably, the detecting a second eye region feature of the preset user in the preset video includes:
acquiring N continuous preset video frames of a preset video in each preset time period;
N is a positive integer greater than or equal to 2;
taking the eye region features of N continuous preset video frames in each preset time period as a second eye region feature to obtain M second eye region features;
the M second eye region features include the eye region features of the left and right eyes of the preset user in the N frames, where generally M = 2N;
carrying out interpolation processing on the second eye region characteristics to change the second eye region characteristics into characteristics with the same dimensionality, and training to obtain the SVM classifier, wherein the method comprises the following steps:
and sequentially carrying out interpolation processing on each second eye region feature in the M second eye region features to obtain features with the same dimensionality, and training to obtain the SVM classifier.
An eye state detection device in video, comprising:
the detection module is used for carrying out face detection on a current video frame of a current user to obtain a first eye region characteristic of the current video frame;
the calling module is used for calling a pre-trained SVM classifier;
and the classification module is used for sending the first eye region characteristics into the SVM classifier for classification, and detecting the eye state of the current user according to a classification result.
Preferably, the detection module includes:
the detection submodule is used for detecting key points of the human face by utilizing a neural network;
the first acquisition submodule is used for acquiring human eye key points from the human face key points;
and the output submodule is used for combining the human eye key points and the human face key points to map and output the first eye region characteristics.
Preferably, the detection module further includes:
the generation submodule is used for generating minimum circumscribed rectangles for the left and right eyes according to the landmarks coordinates of the eyes of the face region;
and the second acquisition submodule is used for mapping the minimum circumscribed rectangle to a representation layer of the face features in the neural network according to a receptive field calculation formula to acquire the high-dimensional features of the eye region.
Preferably, the detection module is further configured to detect a second eye region feature of a preset user in a preset video before the pre-trained SVM classifier is called;
the eye state detection device further includes:
and the training submodule is used for carrying out interpolation processing on the second eye region characteristic to obtain the characteristic with the same dimension for training to obtain the SVM classifier.
Preferably, the detection module further includes:
the third acquisition submodule is used for acquiring N continuous preset video frames of the preset video in each preset time period;
taking eye region features of N continuous preset video frames in each preset time period as a second eye region feature, and obtaining M second eye region features in total;
the training submodule is used for:
and sequentially carrying out interpolation processing on each second eye region feature in the M second eye region features to obtain features with the same dimensionality, and training to obtain the SVM classifier.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
FIG. 1 is a flowchart illustrating a method for detecting eye state in video according to the present invention;
FIG. 2 is a flow chart of a method for training an SVM classifier according to the present invention;
FIG. 3 is a block diagram of an apparatus for detecting eye state in video according to the present invention;
fig. 4 is another structural diagram of an eye state detection apparatus in a video according to the present invention.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
Eyes are the most important part of the human face and the most direct sense organ for perceiving external objects, and methods for detecting the eye state are finding an ever wider range of applications. For example, a motor vehicle driver becomes fatigued after driving for a long time; detecting the eye state can reveal the driver's eye fatigue and thus help prevent traffic accidents. Parents can use eye state detection to check whether a child is talking or sleeping, and so on. For example, an online learning platform can use such state detection to remind the user to concentrate and study carefully.
At present, most eye state detection methods for video are based on feature analysis, with the following steps: detect the captured video face picture with a face neural network, and then acquire the eye region features with the face neural network and an eye neural network for training. This approach needs several neural networks to obtain the eye region features, which is cumbersome, time-consuming, and inefficient, and judging the eye state of the person in the video from a single frame is not stable enough. To solve the above technical problem, an embodiment of the present disclosure provides a method for detecting an eye state in a video, as shown in fig. 1:
the invention detects the eye state based on a mode classification detection method, extracts the eye region characteristics of the current video frame, and judges the eye state by means of an SVM classifier, comprising the following steps:
s101, carrying out face detection on a current video frame of a current user in a current video, and simultaneously obtaining a first eye region characteristic of the current video frame;
the current video frame of the current user is usually a continuous R frame in a preset time in the current video, and the first eye feature is obtained by combining G eye features obtained according to the R frame (that is, the eye feature sent to the SVM classifier is an eye feature of a multi-frame combination), where R is a positive integer greater than or equal to 2, G is an eye region feature of left and right eyes in the R frame, and usually R is 2G. Compared with the scheme of detecting and determining the eye state by using a single frame in the prior art, the eye state detection effect is clearer and more accurate.
S102, calling a pre-trained SVM classifier;
s103, sending the first eye region characteristics into an SVM classifier for classification, and detecting the eye state of the current user according to the classification result.
The working principle of the method is as follows: face detection is performed on the video frame with a face neural network, and the first eye region features of the current user are obtained with a feature extraction network (in this acquisition step, key point detection is performed on the detected face and the eye features are obtained at the same time). Several second eye region features of a preset user in a preset video are processed and combined for training, obtaining an SVM classifier; the first eye region features of the current user are then compared in the SVM classifier to determine the eye state of the current user.
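To make this working principle concrete, the following is a minimal sketch in Python. The patent does not name specific networks or libraries, so dlib's face detector and 68-point landmark predictor stand in for the face detection and key point networks, raw landmark coordinates stand in for the mapped CNN features, and scikit-learn's SVC plays the role of the SVM classifier; all component names here are illustrative assumptions, not the patent's own implementation.

```python
# A minimal sketch of the working principle, assuming dlib for face and
# landmark detection and scikit-learn for the SVM; the patent does not
# name specific libraries, so these are placeholder components.
import dlib
import numpy as np
from sklearn.svm import SVC

detector = dlib.get_frontal_face_detector()
# 68-point iBUG scheme: indices 36-41 and 42-47 are the two eyes
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

def eye_features(frame_gray):
    """Detect the face, extract the eye landmarks, and return a per-frame
    eye feature vector (flattened landmark coordinates as a stand-in for
    the mapped high-dimensional CNN features)."""
    faces = detector(frame_gray)
    if not faces:
        return None
    pts = predictor(frame_gray, faces[0])
    eye_a = np.array([(pts.part(i).x, pts.part(i).y) for i in range(36, 42)])
    eye_b = np.array([(pts.part(i).x, pts.part(i).y) for i in range(42, 48)])
    return np.concatenate([eye_a.ravel(), eye_b.ravel()]).astype(float)

def train_classifier(X, y):
    """Train the SVM on second eye region features X (one row per sample)
    with open/closed labels y; used later to classify the first features."""
    clf = SVC(kernel="linear")  # generalized linear binary classifier
    clf.fit(X, y)
    return clf
```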
The beneficial effects of the method are as follows: the eye state of the current user can be detected by training an SVM classifier with eye region features acquired from a preset user, without adding a dedicated eye neural network to acquire eye region features for recognition as in the related art. The eye region features can be detected, acquired, and used for training with fewer neural networks, the detection process is simple, and the efficiency of acquiring eye region features is greatly improved; this solves the problem that the current technology performs complex operations with three neural networks, and the eye state of the current user can be judged more accurately.
In one embodiment, performing face detection on a current video frame to obtain a first eye region feature of the current video frame includes:
detecting a face in a current video frame by using a face neural network;
acquiring key points of the face (by using a feature extraction network);
acquiring human eye key points from the human face key points (by using a feature extraction network);
and (utilizing a feature extraction network) combining the human face key points and the human eye key points to map and output first eye region features.
The specific process of this embodiment is as follows: the face is detected first; the face key points are then acquired by the feature extraction network; the eye key points are acquired from the face key points; and the eye key points and the face key points are then mapped in the feature extraction network to obtain the eye features. Compared with the prior art, the number of networks is reduced, the efficiency of acquiring eye features is improved, and the time consumed in acquiring eye features is shortened.
Alternatively, the specific implementation procedure of this embodiment is: a neural network first detects the face; a key point detection network then extracts the eye features, which are obtained by mapping the positions of the key points (namely the eye key points) onto the face features of the intermediate network layer (namely the features of the face key points, including contours, facial features, and so on).
The beneficial effects of the method are as follows: the required eye region features can be obtained quickly for a video frame with the two neural networks, avoiding three separate neural network detection passes, which shortens the time and improves efficiency; moreover, the eye region features are obtained without training a dedicated network.
In one embodiment, performing face detection on the current video frame to obtain the first eye region features of the current video frame (the first eye region features being obtained by combining and mapping the face key points and the eye key points) further includes:
generating two minimum circumscribed rectangles of the left eye and the right eye according to the landmarks coordinates of the eyes of the face region;
and mapping the minimum circumscribed rectangles to a representation layer of the face features in the neural network (namely the feature extraction network; the content represented by the representation layer is the features of the face key points) according to a receptive field calculation formula, acquiring the high-dimensional features of the eye region (namely the first eye region features). The high-dimensional features may be 64-dimensional features, for example.
The landmarks coordinates are the coordinates of the eye key points in the current video frame.
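As an illustration of this mapping step, here is a short sketch under stated assumptions: the patent only refers to "a receptive field calculation formula", so the simplest such formula is used, in which an image coordinate x corresponds to roughly x / S on a feature map with cumulative stride S. The value total_stride=16 and the (C, H, W) feature-map layout are assumptions, not taken from the patent.

```python
import numpy as np

def min_circumscribed_rect(eye_landmarks):
    """Minimum axis-aligned circumscribed rectangle of one eye's landmarks.

    eye_landmarks: (K, 2) array of (x, y) landmarks coordinates.
    Returns (x_min, y_min, x_max, y_max) in image coordinates.
    """
    x_min, y_min = eye_landmarks.min(axis=0)
    x_max, y_max = eye_landmarks.max(axis=0)
    return x_min, y_min, x_max, y_max

def map_rect_to_feature_layer(rect, total_stride=16):
    """Project an image-space rectangle onto the face feature representation
    layer. With cumulative stride S, an image coordinate x corresponds to
    roughly x / S on the feature map (the simplest receptive-field formula);
    total_stride=16 is an assumed value, not taken from the patent."""
    x0, y0, x1, y1 = rect
    fx0 = int(np.floor(x0 / total_stride))
    fy0 = int(np.floor(y0 / total_stride))
    fx1 = int(np.ceil(x1 / total_stride))
    fy1 = int(np.ceil(y1 / total_stride))
    return fx0, fy0, fx1, fy1

def crop_high_dim_features(feature_map, rect_on_map):
    """Slice the mapped rectangle out of a (C, H, W) feature map and flatten
    it into a high-dimensional eye region feature vector."""
    fx0, fy0, fx1, fy1 = rect_on_map
    return feature_map[:, fy0:fy1 + 1, fx0:fx1 + 1].reshape(-1)
```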
The beneficial effects of the method are as follows: the position of the eyes in the video frame can be narrowed to a smaller range according to the minimum circumscribed rectangles generated from the landmarks coordinates, and the high-dimensional eye region features obtained through the receptive field calculation formula serve as a same-dimension standard for the later training of the SVM.
In one embodiment, before invoking the pre-trained SVM classifier, the method further comprises:
detecting a second eye region feature of a preset user in a preset video;
and carrying out interpolation processing on the second eye region features to obtain features with the same dimensionality for training, obtaining the SVM classifier.
The beneficial effects of the method are as follows: the second eye region features of a preset user in a preset video are acquired and converted to the dimensionality of the high-dimensional features, so that they can be combined and trained more precisely to obtain the SVM classifier.
In one embodiment, detecting a second eye region feature of the preset user in the preset video, as shown in fig. 2, includes:
acquiring N continuous preset video frames of a preset video in each preset time period;
taking the eye region features of the N continuous preset video frames in each preset time period as second eye region features to obtain M second eye region features, where generally M = 2N, i.e., the eye region features of the left and right eyes;
Carrying out interpolation processing on the second eye region characteristics to obtain characteristics with the same dimensionality for training, and obtaining the SVM classifier, wherein the method comprises the following steps:
and sequentially carrying out interpolation processing on each second eye region feature in the M second eye region features to obtain features with the same dimensionality, and training to obtain the SVM classifier.
The beneficial effects of the method are as follows: the N continuous video frames within the preset time enable multi-frame processing. Compared with the single-frame classification training network in the prior art, putting multi-frame training into the SVM classifier requires only a small amount of data, and using the SVM classifier speeds up the overall flow of judging the eye state.
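A minimal sketch of this interpolation-and-training step follows, assuming np.interp for the resampling and scikit-learn's SVC as the SVM; the target dimension of 64 echoes the 64-dimensional example above, and the label encoding is illustrative.

```python
import numpy as np
from sklearn.svm import SVC

def interp_to_dim(feature, target_dim=64):
    """Resample a 1-D eye region feature vector to a fixed dimensionality by
    linear interpolation, so features cropped from rectangles of different
    sizes become comparable; target_dim=64 follows the 64-dimensional example."""
    src = np.linspace(0.0, 1.0, num=len(feature))
    dst = np.linspace(0.0, 1.0, num=target_dim)
    return np.interp(dst, src, feature)

def build_sample(per_frame_features, target_dim=64):
    """Interpolate each of the M second eye region features to the same
    dimension and concatenate them into one training sample (the multi-frame
    combination that is sent to the SVM)."""
    return np.concatenate([interp_to_dim(f, target_dim) for f in per_frame_features])

def train_svm(samples, labels):
    """Train the SVM classifier; labels: 1 = eyes open, 0 = eyes closed
    (an illustrative encoding, not specified by the patent)."""
    clf = SVC(kernel="linear")
    clf.fit(np.stack(samples), labels)
    return clf
```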
In one embodiment, face detection is performed on a current video frame;
the face key points are detected by the face key point network, and the first eye region features are output by combining the face key points with the face feature mapping from the middle of the network;
the input of the face key point network is the detected face; the network detects the face key point information, including the eye key points, and outputs the first eye region features by combining them with the intermediate-layer face feature mapping of the network.
Minimum circumscribed rectangles for the left eye and the right eye are generated according to the landmarks coordinates of the eyes of the face region. The circumscribed rectangles are mapped to the face feature representation layer in the network by referring to the receptive field calculation formula, and the high-dimensional first eye region features are acquired;
10 preset video frames within a preset time in a preset video are acquired, 20 second eye region features are obtained from the 10 frames, and the second eye region features are interpolated to obtain a uniform dimension;
the processed second eye region features of the continuous 10 frames are connected and combined for training, obtaining the SVM classifier;
the first eye region features extracted from 10 frames of the current user are placed into the SVM classifier for classification, and the eye state of the current user is determined according to the classification result.
The beneficial effects of the method are as follows: the SVM classifier for judging the open/closed eye state does not need a large amount of labeled data to train a classification model; the eye region features are obtained without training a dedicated network, since the region coordinates are mapped to an intermediate feature layer through the landmarks network, as in Faster R-CNN; and training needs only a small amount of data. Meanwhile, using the SVM classifier makes the whole open/closed eye judgment process fast. Under video monitoring, compared with the single-frame classification training network in the prior art, integrating multi-frame information allows open and closed eyes to be judged more stably and accurately.
In the above case, the eye region features of the left and right eyes can also be processed separately, which includes the following steps (see the sketch after this embodiment):
10 preset video frames within a preset time in a preset video are acquired, and 20 second eye region features are obtained from the 10 frames; the 10 second eye region features of the left eye and the 10 of the right eye are kept separate, and the second eye region features are interpolated and otherwise processed into a unified dimension (for example, a unified dimension such as 18 or 64 features);
the processed second eye region features of the left eye and of the right eye over the continuous 10 frames are connected and combined for training separately, obtaining the SVM classifiers;
the first eye region features of the left eye and of the right eye of the current user are placed into the respective SVM classifier for classification, so that the respective states of the current user's left and right eyes can be determined according to the classification results.
The beneficial effects of the method are as follows: with second eye region features trained separately for the left and right eyes, the states of the current user's left and right eyes can each be detected; compared with the embodiment above, in which all second eye features are combined for training and the left and right eyes are not distinguished, the detection result is more accurate.
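Under the same assumptions as the earlier training sketch (reusing its build_sample and train_svm helpers), training the left and right eyes separately is a small variation; left_eye_sequences, right_eye_sequences, and the label arrays are hypothetical placeholders for the separated per-eye training data.

```python
# Train one SVM per eye so the left-eye and right-eye states can be judged
# independently; the *_sequences and *_labels variables are hypothetical
# placeholders for the per-eye second eye region features and their labels.
left_clf = train_svm([build_sample(seq) for seq in left_eye_sequences], left_labels)
right_clf = train_svm([build_sample(seq) for seq in right_eye_sequences], right_labels)

# Classify the current user's left and right first eye region features
# (each a sequence of 10 per-frame feature vectors) separately.
left_state = left_clf.predict([build_sample(current_left_10_frames)])[0]
right_state = right_clf.predict([build_sample(current_right_10_frames)])[0]
```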
In one embodiment, face detection is performed on a current video frame;
a face neural network detects the face, a feature extraction network detects the face key points, the eye key points are acquired from the face key points, and at the same time the first eye region features are output by combining the mapping of the eye key points and the face key points;
Minimum circumscribed rectangles for the left eye and the right eye are generated according to the landmarks coordinates of the eyes of the face region. The circumscribed rectangles are mapped to the face feature representation layer in the network by referring to the receptive field calculation formula, and the high-dimensional first eye region features are acquired;
5 preset video frames within a preset time in a preset video are acquired, 10 second eye region features are obtained from the 5 frames, and the second eye region features are interpolated and otherwise processed into a unified dimension;
the processed second eye region features of the continuous 5 frames are connected and combined for training, obtaining the SVM classifier;
the first eye features of the current user are sent into the SVM classifier for classification. In the initial condition (that is, when fewer than 5 frames are available), the single frame input is copied 4 times so that it is combined into 5 frames for output; the second frame input is copied 3 times to combine into 5 frames for output; the third frame input is copied 2 times to combine into 5 frames for output; and so on. The eye region features of 5 continuous frames of the current user are acquired cyclically, and once 5 frames have been input they are placed into the SVM classifier for classification; each time, the eye region features of the 5 continuous frames are classified as a whole, so that the eye state of the current user can be determined more accurately according to the classification result.
An apparatus for detecting eye state in video, as shown in fig. 3, comprises:
the detection module is used for carrying out face detection on a current video frame of a current user to obtain a first eye region characteristic of the current video frame;
the calling module is used for calling a pre-trained SVM classifier;
and the classification module is used for sending the first eye region characteristics to an SVM classifier for classification and detecting the eye state of the current user according to the classification result.
In one embodiment, the detection module, as shown in fig. 4, includes:
the detection submodule is used for detecting key points of the human face by utilizing a neural network;
the first acquisition submodule is used for acquiring human eye key points from the human face key points;
and the output submodule is used for combining the human eye key points and the human face key points to map and output the first eye region characteristics.
In one embodiment, the detection module further includes:
the generation submodule is used for generating minimum circumscribed rectangles for the left and right eyes according to the landmarks coordinates of the eyes of the face region;
and the second acquisition submodule is used for mapping the minimum circumscribed rectangle to a representation layer of the face features in the neural network according to a receptive field calculation formula and acquiring the high-dimensional features of the eye region.
In one embodiment, the detection module is further configured to detect a second eye region feature of a preset user in a preset video before the pre-trained SVM classifier is called;
the eye state detection device further includes:
and the training submodule is used for carrying out interpolation processing on the second eye region characteristic to change the second eye region characteristic into a characteristic with the same dimensionality for training to obtain the SVM classifier.
In one embodiment, the detection module further includes:
the third acquisition submodule is used for acquiring N continuous preset video frames of the preset video in each preset time period;
taking the eye region features of N continuous preset video frames in each preset time period as a second eye region feature to obtain M second eye region features;
the training submodule is configured to:
and sequentially carrying out interpolation processing on each second eye region feature in the M second eye region features to obtain features with the same dimensionality, and training to obtain the SVM classifier.
It will be understood by those skilled in the art that the terms "first" and "second" in the present invention refer to different application stages. For example, the first eye region features are the eye features in the detection stage, and the second eye region features are those in the training stage; the eye features may be the size of the eyes, the distance between the upper and lower eyelids, and so on. The key points may be positions; for example, the face key points may be the contour of the face and the positions of the facial features, and the eye key points may be the positions of the eyes.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. A method for detecting eye state in video is characterized by comprising the following steps:
carrying out face detection on a current video frame of a current user to obtain a first eye region characteristic of the current video frame;
calling a pre-trained SVM classifier;
and sending the first eye region feature into the SVM classifier for classification, and detecting the eye state of the current user according to the classification result.
2. The method for detecting eye state in video according to claim 1, wherein the performing face detection on the current video frame to obtain the first eye region feature of the current video frame comprises:
detecting key points of the human face by using a neural network;
acquiring human eye key points from the human face key points;
and combining the human eye key points and the human face key points to map and output the first eye region characteristics.
3. The method for detecting eye state in video according to claim 2, wherein said performing face detection on the current video frame to obtain the first eye region feature of the current video frame further comprises:
generating two minimum circumscribed rectangles of the left eye and the right eye according to the landmarks coordinates of the eyes of the face region;
and mapping the minimum circumscribed rectangle to a representation layer of the human face features in the neural network according to a receptive field calculation formula to obtain the high-dimensional features of the eye region.
4. The method of eye state detection in video of any one of claims 1 to 3, wherein prior to invoking the pre-trained SVM classifier, the method further comprises:
detecting a second eye region feature of a preset user in a preset video;
and carrying out interpolation processing on the second eye region characteristics to obtain characteristics with the same dimensionality for training, and obtaining the SVM classifier.
5. The method for detecting eye state in video according to claim 4, wherein the detecting the second eye region feature of the preset user in the preset video comprises:
acquiring N continuous preset video frames of the preset video in each preset time period;
taking the eye region features of the N continuous preset video frames in each preset time period as a second eye region feature, and obtaining M second eye region features in total;
the interpolating the second eye region feature to obtain a feature with the same dimension, and training to obtain the SVM classifier includes:
and sequentially carrying out interpolation processing on each second eye region feature in the M second eye region features to obtain features with the same dimensionality, and training to obtain the SVM classifier.
6. An apparatus for detecting an eye state in a video, comprising:
the detection module is used for carrying out face detection on a current video frame of a current user to obtain a first eye region characteristic of the current video frame;
the calling module is used for calling a pre-trained SVM classifier;
and the classification module is used for sending the first eye region characteristics into the SVM classifier for classification, and detecting the eye state of the current user according to a classification result.
7. The apparatus for detecting eye state in video according to claim 6, wherein the detecting module comprises:
the detection submodule is used for detecting key points of the human face by utilizing a neural network;
the first acquisition submodule is used for acquiring human eye key points from the human face key points;
and the output sub-module is used for combining the human eye key points and the human face key points, mapping and outputting the first eye region characteristics.
8. The apparatus for detecting eye state in video according to claim 7, wherein the detecting module further comprises:
the generation submodule is used for generating minimum circumscribed rectangles for the left and right eyes according to the landmarks coordinates of the eyes of the face region;
and the second acquisition submodule is used for mapping the minimum circumscribed rectangle to a representation layer of the face features in the neural network according to a receptive field calculation formula to acquire the high-dimensional features of the eye region.
9. The apparatus for detecting eye state in video according to any one of claims 6 to 8,
the detection module is further used for detecting a second eye region feature of a preset user in a preset video before the pre-trained SVM classifier is called;
the eye state detection device further includes:
and the training submodule is used for carrying out interpolation processing on the second eye region characteristic to obtain the characteristic with the same dimension for training to obtain the SVM classifier.
10. An apparatus for detecting eye state in video according to claim 9,
the detection module further comprises:
the third obtaining submodule is used for obtaining N continuous preset video frames of the preset video in each preset time period;
taking the eye region features of the N continuous preset video frames in each preset time period as a second eye region feature, and obtaining M second eye region features in total;
the training submodule is configured to:
and sequentially carrying out interpolation processing on each second eye region feature in the M second eye region features to obtain features with the same dimensionality, and training to obtain the SVM classifier.
CN201910883511.5A 2019-09-18 2019-09-18 Method and device for detecting eye state in video Active CN110826396B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910883511.5A CN110826396B (en) 2019-09-18 2019-09-18 Method and device for detecting eye state in video

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910883511.5A CN110826396B (en) 2019-09-18 2019-09-18 Method and device for detecting eye state in video

Publications (2)

Publication Number Publication Date
CN110826396A (en) 2020-02-21
CN110826396B CN110826396B (en) 2022-04-22

Family

ID=69547999

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910883511.5A Active CN110826396B (en) 2019-09-18 2019-09-18 Method and device for detecting eye state in video

Country Status (1)

Country Link
CN (1) CN110826396B (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103198332A (en) * 2012-12-14 2013-07-10 华南理工大学 Real-time robust far infrared vehicle-mounted pedestrian detection method
CN107704805A (en) * 2017-09-01 2018-02-16 深圳市爱培科技术股份有限公司 method for detecting fatigue driving, drive recorder and storage device
CN107992864A (en) * 2018-01-15 2018-05-04 武汉神目信息技术有限公司 A kind of vivo identification method and device based on image texture
CN108446661A (en) * 2018-04-01 2018-08-24 桂林电子科技大学 A kind of deep learning parallelization face identification method
CN108960071A (en) * 2018-06-06 2018-12-07 武汉幻视智能科技有限公司 A kind of eye opening closed-eye state detection method
CN109460704A (en) * 2018-09-18 2019-03-12 厦门瑞为信息技术有限公司 A kind of fatigue detection method based on deep learning, system and computer equipment

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113076885A (en) * 2021-04-09 2021-07-06 中山大学 Concentration degree grading method and system based on human eye action characteristics
CN113076885B (en) * 2021-04-09 2023-11-10 中山大学 Concentration degree grading method and system based on human eye action characteristics
CN115953389A (en) * 2023-02-24 2023-04-11 广州视景医疗软件有限公司 Strabismus discrimination method and device based on face key point detection
CN115953389B (en) * 2023-02-24 2023-11-24 广州视景医疗软件有限公司 Strabismus judging method and device based on face key point detection

Also Published As

Publication number Publication date
CN110826396B (en) 2022-04-22

Similar Documents

Publication Publication Date Title
CN105354986B (en) Driver's driving condition supervision system and method
US20230116801A1 (en) Image authenticity detection method and device, computer device, and storage medium
CN108446645B (en) Vehicle-mounted face recognition method based on deep learning
CN109815867A (en) A kind of crowd density estimation and people flow rate statistical method
CN111709497B (en) Information processing method and device and computer readable storage medium
CN110021051A (en) One kind passing through text Conrad object image generation method based on confrontation network is generated
Ibrahim et al. Embedded system for eye blink detection using machine learning technique
Wimmer et al. Low-level fusion of audio and video feature for multi-modal emotion recognition
CN106022317A (en) Face identification method and apparatus
Chen et al. Driver fatigue detection based on facial key points and LSTM
CN111209878A (en) Cross-age face recognition method and device
CN111950497B (en) AI face-changing video detection method based on multitask learning model
CN110826396B (en) Method and device for detecting eye state in video
Ashwin et al. An e-learning system with multifacial emotion recognition using supervised machine learning
Liu et al. A 3 GAN: an attribute-aware attentive generative adversarial network for face aging
CN112036276A (en) Artificial intelligent video question-answering method
CN112906617A (en) Driver abnormal behavior identification method and system based on hand detection
CN114708658A (en) Online learning concentration degree identification method
Sinha et al. Identity-preserving realistic talking face generation
Li et al. Image manipulation localization using attentional cross-domain CNN features
CN106326980A (en) Robot and method for simulating human facial movements by robot
CN111626197B (en) Recognition method based on human behavior recognition network model
Ren et al. Student behavior detection based on YOLOv4-Bi
Liu et al. A3GAN: An attribute-aware attentive generative adversarial network for face aging
Guo et al. Design of a smart art classroom system based on Internet of Things

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant