CN109101103B - Blink detection method and device - Google Patents

Blink detection method and device

Info

Publication number
CN109101103B
Authority
CN
China
Prior art keywords
image
eye
detection
state
blink
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710474295.XA
Other languages
Chinese (zh)
Other versions
CN109101103A (en)
Inventor
肖蒴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN201710474295.XA priority Critical patent/CN109101103B/en
Publication of CN109101103A publication Critical patent/CN109101103A/en
Application granted granted Critical
Publication of CN109101103B publication Critical patent/CN109101103B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/013Eye tracking input arrangements
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18Eye characteristics, e.g. of the iris
    • G06V40/19Sensors therefor
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18Eye characteristics, e.g. of the iris
    • G06V40/193Preprocessing; Feature extraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18Eye characteristics, e.g. of the iris
    • G06V40/197Matching; Classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Ophthalmology & Optometry (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The application provides a blink detection method and a blink detection device. The method comprises the following steps: collecting a plurality of frames of detection images; acquiring an eye feature vector of each detection image; forming the eye feature vectors of the plurality of frames of detection images into a first feature vector set; and querying a mapping relationship with the first feature vector set to determine the blink state corresponding to the first feature vector set, the mapping relationship being a mapping relationship between a second feature vector set and a blink state. With this technical solution, in application scenarios where the user does not use a wearable device, the operation of an application program can be controlled based on the blinking motion; the blink detection is accurate and reliable, the computational overhead is small, and few resources of the terminal device are occupied.

Description

Blink detection method and device
Technical Field
The application relates to the technical field of internet, in particular to a blink detection method and device.
Background
Because eye movements carry rich emotional and physiological information, blinking actions can replace traditional key operations and be used directly for interaction between the user and a program, which can significantly improve the user experience.
At present, a sensor can be embedded in a wearable device (such as smart glasses) so that the device can detect an electro-oculogram waveform signal and judge, based on the correspondence between amplitude changes of the signal and blink types, whether a blinking action is currently occurring. The detection result is accurate and the detection efficiency is high.
However, the above implementation requires a wearable device in order to perform blink detection. For a user of a terminal device (such as a mobile terminal or a PC) who is not using a wearable device, it cannot be detected whether the user blinks; that is, the application scenarios are limited. For example, when a user plays a game on a mobile terminal, the user cannot control the game by blinking.
Disclosure of Invention
The application provides a blink detection method, which is applied to a terminal device and comprises the following steps:
collecting a plurality of frames of detection images;
acquiring an eye feature vector of each detection image;
forming the eye feature vectors of the plurality of frames of detection images into a first feature vector set;
querying a mapping relationship with the first feature vector set, and determining a blink state corresponding to the first feature vector set; the mapping relationship is a mapping relationship between a second feature vector set and a blink state.
The application provides a blink detection method, which is applied to a terminal device and comprises the following steps:
collecting a plurality of frames of detection images;
acquiring an eye feature vector of each detection image;
forming the eye feature vectors of the plurality of frames of detection images into a first feature vector set;
querying a mapping relationship with the first feature vector set, and determining a blink state corresponding to the first feature vector set; the mapping relationship is a mapping relationship between a second feature vector set and a blink state;
querying, from a command set, an operation command corresponding to the blink state; the command set is used for recording the correspondence between blink states and operation commands;
and executing the queried operation command.
The application provides a blink detection method, which is applied to a terminal device and comprises the following steps:
collecting a plurality of frames of detection images;
acquiring an eye feature vector of each detection image;
forming the eye feature vectors of the plurality of frames of detection images into a first feature vector set;
querying a mapping relationship with the first feature vector set, and determining a blink state corresponding to the first feature vector set; the mapping relationship is a mapping relationship between a second feature vector set and a blink state;
and when the blink state is blinking, simulating a user click operation or a mouse click operation.
The application provides a blink detection method, which is applied to a server and comprises the following steps:
collecting a plurality of frames of sample images;
acquiring an eye feature vector of each sample image;
forming the eye feature vectors of the plurality of frames of sample images into a second feature vector set;
generating a mapping relationship between the second feature vector set and the blink state corresponding to the plurality of frames of sample images;
and sending the mapping relationship between the second feature vector set and the blink state to a terminal device through a notification message, so that the terminal device detects the blink state of the user according to the mapping relationship.
The application provides a blink detection device, which is applied to a terminal device and comprises:
an acquisition module, configured to collect a plurality of frames of detection images;
an obtaining module, configured to obtain an eye feature vector of each detection image;
a combining module, configured to form the eye feature vectors of the plurality of frames of detection images into a first feature vector set;
a determining module, configured to query a mapping relationship with the first feature vector set and determine a blink state corresponding to the first feature vector set;
the mapping relationship is specifically a mapping relationship between a second feature vector set and a blink state.
The application provides a blink detection device, which is applied to a terminal device and comprises:
an acquisition module, configured to collect a plurality of frames of detection images;
an obtaining module, configured to obtain an eye feature vector of each detection image;
a combining module, configured to form the eye feature vectors of the plurality of frames of detection images into a first feature vector set;
a determining module, configured to query a mapping relationship with the first feature vector set and determine a blink state corresponding to the first feature vector set; the mapping relationship is specifically a mapping relationship between a second feature vector set and a blink state;
a query module, configured to query, from a command set, an operation command corresponding to the blink state; the command set is used for recording the correspondence between blink states and operation commands;
and a processing module, configured to execute the queried operation command.
The application provides a blink detection device, which is applied to a terminal device and comprises:
an acquisition module, configured to collect a plurality of frames of detection images;
an obtaining module, configured to obtain an eye feature vector of each detection image;
a combining module, configured to form the eye feature vectors of the plurality of frames of detection images into a first feature vector set;
a determining module, configured to query a mapping relationship with the first feature vector set and determine a blink state corresponding to the first feature vector set; the mapping relationship is specifically a mapping relationship between a second feature vector set and a blink state;
and a processing module, configured to simulate a user click operation or a mouse click operation when the blink state is blinking.
The application provides a blink detection device, which is applied to a server and comprises:
an acquisition module, configured to collect a plurality of frames of sample images;
an obtaining module, configured to obtain an eye feature vector of each sample image;
a combining module, configured to form the eye feature vectors of the plurality of frames of sample images into a second feature vector set;
a generating module, configured to generate a mapping relationship between the second feature vector set and the blink state corresponding to the plurality of frames of sample images;
and a sending module, configured to send the mapping relationship between the second feature vector set and the blink state to a terminal device through a notification message, so that the terminal device can detect the blink state of the user according to the mapping relationship.
Based on the above technical solution, in the embodiments of the application, for application scenarios in which the user does not use a wearable device, whether the user blinks can be detected directly by the terminal device, and when the user operates an application program on the terminal device, the operation of the application program can be controlled based on the blinking motion. Moreover, the blink detection is accurate and reliable, has low computational overhead, and occupies few resources of the terminal device.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed in the description of the embodiments or of the prior art are briefly introduced below. It is obvious that the drawings in the following description are only some of the embodiments described in the present application, and those skilled in the art can derive other drawings from them.
FIG. 1 is a flow chart of a blink detection method in one embodiment of the present application;
FIG. 2 is a flow chart of a blink detection method in another embodiment of the present application;
FIG. 3 is a flow chart of a blink detection method in another embodiment of the present application;
FIG. 4 is a flow chart of a blink detection method in another embodiment of the present application;
FIGS. 5A-5C are flow diagrams of a blink detection method in another embodiment of the present application;
FIG. 6 is a diagram of a hardware configuration of a terminal device in one embodiment of the present application;
FIG. 7 is a block diagram of a blink detection device in an embodiment of the present application;
FIG. 8 is a block diagram of a blink detection device in another embodiment of the present application;
FIG. 9 is a block diagram of a blink detection device in another embodiment of the present application;
FIG. 10 is a diagram of a hardware configuration of a server in one embodiment of the present application;
FIG. 11 is a block diagram of a blink detection device in an embodiment of the present application.
Detailed Description
The terminology used in the embodiments of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein is meant to encompass any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used in the embodiments of the present application to describe various information, the information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present application. Moreover, depending on the context, the word "if" as used herein may be interpreted as "upon", "when", or "in response to determining".
The embodiment of the application provides a blink detection method, which is used for detecting the blink state (such as blinking or non-blinking) of a user and then controlling the operation of an application program on a terminal device according to the blink state. In order to realize the detection of the blink state, a training phase and a detection phase are involved. In the training stage, a mapping relation between the second feature vector set and the blink state may be generated, and in the detection stage, the mapping relation may be queried based on the first feature vector set, and then the blink state corresponding to the first feature vector set is obtained. Moreover, in this embodiment of the application, the feature vector set obtained in the training stage may be referred to as a second feature vector set, and the feature vector set obtained in the detection stage may be referred to as a first feature vector set, where the first feature vector set and the second feature vector set are only examples provided for convenience of distinguishing, and are not limited to this.
The following describes the processing procedure of the training phase and the processing procedure of the detection phase in detail.
For the training phase, in one example, the terminal device may perform the training process itself; this places higher demands on the performance of the terminal device, and every terminal device supporting blink detection has to perform the training process. In another example, the server may perform the training process and send the mapping relationship generated by training to each terminal device supporting blink detection, so that the terminal devices do not need to perform the training process individually; only the server needs to perform the training process, which reduces the processing pressure on the terminal devices.
Referring to fig. 1, a flowchart of a blink detection method for a training phase in an embodiment of the present application is shown, where in this embodiment, taking a terminal device as an example to perform a training process, the method may include:
step 101, a terminal device collects a multi-frame sample image.
The sample image is a video image marked with a blink state. For example, when user A performs a blinking motion, the terminal device may capture a plurality of consecutive video images of user A; the blink state of these video images is blinking, and the video images marked with blinking are used as sample images. For another example, when user A does not perform a blinking motion, the terminal device may capture consecutive video images of user A; the blink state of these video images is non-blinking, and the video images marked with non-blinking are used as sample images.
The terminal device may perform the step of collecting multiple frames of sample images, and the subsequent steps, for a large number of users. The users may be of different genders and ages and may include users wearing glasses and users not wearing glasses; the type of user is not limited. The number of users can be selected according to actual needs and is not limited; the more users there are, the more accurate the training result. For convenience of description, the following takes acquiring multiple frames of sample images for one user as an example.
In step 102, the terminal device acquires the eye feature vector of each sample image.
The eye feature vector may include, but is not limited to, one or any combination of the following: the eye state (e.g., open or closed), the center-of-gravity position of the connected domain, and the black pixel proportion increase rate.
The eye state of a sample image may be acquired in the following manner: feature detection is performed on the sample image by a feature classifier, and the eye state of the sample image is obtained according to the feature detection result.
The center-of-gravity position of the connected domain of a sample image may be acquired in the following manner: for the frame of sample image, a difference image between this frame and the previous frame of sample image is acquired, and the center-of-gravity position of the connected domain of the difference image is determined as the center-of-gravity position of the connected domain of this frame of sample image.
The black pixel proportion increase rate of a sample image may be acquired in the following manner: the frame of sample image is binarized to obtain a binarized image, and the proportion of black pixels in the binarized image is counted; the black pixel proportion increase rate of this frame of sample image is then obtained from the black pixel proportion value corresponding to this frame of sample image and the black pixel proportion value corresponding to the previous frame of sample image.
The processes of obtaining the eye state, the center-of-gravity position of the connected domain, and the black pixel proportion increase rate of the sample image are described in detail in subsequent embodiments of the present application and are not repeated here.
In one example, the eye state, the center-of-gravity position of the connected domain, and the black pixel proportion increase rate are obtained for each frame of the multiple frames of sample images; the details are not repeated.
In step 103, the terminal device forms the eye feature vectors of the multiple frames of sample images into a second feature vector set. For example, the terminal device generates a second feature vector set and then sequentially records the eye feature vectors of the multiple frames of sample images into the second feature vector set according to the acquisition order of the sample images.
In step 104, the terminal device generates a mapping relationship between the second feature vector set and the blink state. The blink state here refers to the blink state corresponding to the multiple frames of sample images, such as blinking or non-blinking.
After the terminal device performs steps 101 to 104 for a large number of users, the mapping relationship shown in Table 1 may be generated. For example, the terminal device collects sample images of 200 blinking users to obtain 200 second feature vector sets whose blink state is blinking, and records the correspondence between these 200 second feature vector sets and blinking in Table 1. The terminal device collects sample images of 300 non-blinking users to obtain 300 second feature vector sets whose blink state is non-blinking, and records the correspondence between these 300 second feature vector sets and non-blinking in Table 1.
TABLE 1
Second feature vector set      Blink state
Feature vector set 1           Blinking
...                            ...
Feature vector set 200         Blinking
Feature vector set 201         Non-blinking
...                            ...
Feature vector set 500         Non-blinking
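As an illustration, the following Python sketch shows one way steps 103 and 104 could be realized: per-frame eye feature vectors are concatenated in acquisition order into a second feature vector set, and the set is associated with its blink state, playing the role of one row of Table 1. The function names, the rounding, and the tuple representation (with increase rates written as fractions, e.g. 0.15 for 15%) are assumptions made for this sketch, not part of the claimed method.

```python
# Minimal sketch of steps 103-104, assuming the per-frame eye feature vectors
# (eye_state, center_x, center_y, black_pixel_increase_rate) have already been
# computed for one labeled sample clip.

def build_second_feature_vector_set(per_frame_vectors):
    """Concatenate per-frame eye feature vectors in acquisition order (step 103)."""
    feature_set = []
    for eye_state, cx, cy, increase in per_frame_vectors:
        feature_set.extend([eye_state, round(cx, 2), round(cy, 2), round(increase, 2)])
    return tuple(feature_set)

# mapping_relation plays the role of Table 1: second feature vector set -> blink state.
mapping_relation = {}

def record_mapping(per_frame_vectors, blink_state):
    """Generate the mapping between a second feature vector set and its blink state (step 104)."""
    mapping_relation[build_second_feature_vector_set(per_frame_vectors)] = blink_state

# Example with the 3-frame sample from the description (0 = eyes open, 1 = eyes closed):
record_mapping([(0, 0.3, 0.4, 0.15), (1, 0.3, 0.1, 0.03), (0, 0.3, 0.6, 0.25)], "blinking")
```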
Referring to fig. 2, a flowchart of a blink detection method for a training phase in an embodiment of the present application is shown, where in this embodiment, taking a training process performed by a server as an example, the method may include:
step 201, the server collects a multi-frame sample image.
Step 202, the server side obtains the eye feature vector of the sample image.
In step 203, the server forms the eye feature vectors of the multiple frames of sample images into a second feature vector set.
In step 204, the server generates a mapping relationship between the second feature vector set and the blink state.
The processing of steps 201 to 204 is similar to that of steps 101 to 104, except that the executing entity is changed from the terminal device to the server, so the description of steps 201 to 204 is not repeated.
In step 205, the server sends a notification message to the terminal device (e.g., each terminal device supporting blink detection), where the notification message carries a mapping relationship between the second feature vector set and the blink state.
In step 206, the terminal device analyzes the mapping relationship between the second feature vector set and the blink state from the notification message, and generates a mapping relationship between the second feature vector set and the blink state.
Based on the process shown in fig. 1 or the process shown in fig. 2, the terminal device may generate a mapping relationship between the second feature vector set and the blink state, so as to complete the training phase. In the detection stage, the terminal device may query the mapping relationship through the first feature vector set, and then obtain a blink state corresponding to the first feature vector set. Referring to fig. 3, a flowchart of a blink detection method for a detection phase in an embodiment of the present application, where the detection phase may be applied to a terminal device, the method may include:
step 301, the terminal device collects multiple frames of detection images.
The detection image is a video image not marked with a blink state. For example, in order to know whether user A blinks, the terminal device may capture a plurality of consecutive video images of user A as detection images, and whether the blink state of user A is blinking or non-blinking needs to be analyzed based on these detection images.
In one example, the number of detection images acquired by the terminal device may be the same as or different from the number of sample images acquired in step 101/step 201; for example, both may be 20 frames.
Step 302, the terminal device obtains the eye feature vector of the detected image.
The eye feature vector may include, but is not limited to, one or any combination of the following: the eye state (e.g., open or closed), the center-of-gravity position of the connected domain, and the black pixel proportion increase rate.
The eye state of a detection image may be acquired in the following manner: feature detection is performed on the detection image by a feature classifier, and the eye state of the detection image is obtained according to the feature detection result. The center-of-gravity position of the connected domain of a detection image may be acquired in the following manner: for the frame of detection image, a difference image between this frame and the previous frame of detection image is acquired, and the center-of-gravity position of the connected domain of the difference image is determined as the center-of-gravity position of the connected domain of this frame of detection image. The black pixel proportion increase rate of a detection image may be acquired in the following manner: the frame of detection image is binarized to obtain a binarized image, and the proportion of black pixels in the binarized image is counted; the black pixel proportion increase rate of this frame of detection image is then obtained from the black pixel proportion value corresponding to this frame of detection image and the black pixel proportion value corresponding to the previous frame of detection image.
The processes of obtaining the eye state, the center-of-gravity position of the connected domain, and the black pixel proportion increase rate of the detection image are described in detail in subsequent embodiments of the present application and are not repeated here.
Step 303, the terminal device combines the eye feature vectors of the multiple frames of detected images into a first feature vector set. For example, the terminal device generates a first feature vector set, and then sequentially records the eye feature vectors of the plurality of frames of detection images to the first feature vector set according to the acquisition order of the detection images.
In step 304, the terminal device queries the mapping relationship (i.e. the mapping relationship generated in step 104/step 206) through the first feature vector set, and determines the blink state corresponding to the first feature vector set.
Assuming that the barycentric position of the connected domain includes an abscissa value and an ordinate value of the barycentric of the connected domain, the terminal device may record the eye state of the multi-frame detection image, the abscissa value of the barycentric of the connected domain, the ordinate value of the barycentric of the connected domain, and the black pixel proportion increase rate in the first feature vector set, respectively. For example, assuming that the eye feature vector of the first frame detection image is (0, 0.3, 0.4, 15%), the eye feature vector of the second frame detection image is (1, 0.3, 0.1, 3%), and the eye feature vector of the third frame detection image is (0, 0.3, 0.6, 25%), the first feature vector set may be (0, 0.3, 0.4, 15%, 1, 0.3, 0.1, 3%, 0, 0.3, 0.6, 25%), 0 indicates that the eye state is open, and 1 indicates that the eye state is closed.
Then, when the terminal device queries the mapping relationship with the first feature vector set, if a mapping between "(0, 0.3, 0.4, 15%, 1, 0.3, 0.1, 3%, 0, 0.3, 0.6, 25%)" and blinking exists in the mapping relationship, the terminal device may determine that the blink state corresponding to the first feature vector set is blinking.
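The following sketch illustrates how steps 303 and 304 could look in code: the first feature vector set is assembled in the same way as during training and looked up in the mapping relationship. An exact lookup is shown to match the example above; a deployed system might use a tolerance-based or learned matcher instead, which the description leaves open. The default value for an unknown set is an assumption of this sketch.

```python
# Minimal sketch of steps 303-304 (detection phase lookup).

def build_feature_vector_set(per_frame_vectors):
    """Concatenate per-frame eye feature vectors in acquisition order (step 303)."""
    feature_set = []
    for eye_state, cx, cy, increase in per_frame_vectors:
        feature_set.extend([eye_state, round(cx, 2), round(cy, 2), round(increase, 2)])
    return tuple(feature_set)

# mapping_relation is the table produced during training (see the training sketch above).
mapping_relation = {
    (0, 0.3, 0.4, 0.15, 1, 0.3, 0.1, 0.03, 0, 0.3, 0.6, 0.25): "blinking",
}

def detect_blink_state(per_frame_vectors):
    first_feature_vector_set = build_feature_vector_set(per_frame_vectors)
    # Exact lookup, as in the example above; unknown sets default to "non-blinking",
    # which is an assumption of this sketch.
    return mapping_relation.get(first_feature_vector_set, "non-blinking")

print(detect_blink_state([(0, 0.3, 0.4, 0.15), (1, 0.3, 0.1, 0.03), (0, 0.3, 0.6, 0.25)]))
# -> blinking
```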
After the training stage and the detection stage, the blinking state of the user can be detected, and the operation of the application program on the terminal equipment is controlled according to the blinking state. Moreover, for an application scene in which the user does not use the wearable device, whether the user generates a blinking motion or not can be directly detected by the terminal device, and when the user operates the application program on the terminal device, the operation of the application program can be controlled based on the blinking motion. Moreover, the blink action detection has high accuracy, high reliability, low calculation overhead and less occupied resources of terminal equipment.
In the training phase, if the terminal device executes the training process, a camera can be installed on the terminal device so that the terminal device can collect multiple frames of sample images; if the server executes the training process, a camera can be installed on the server so that the server can collect multiple frames of sample images. Moreover, the sample image collected by the terminal device/server may include the face, the body, and so on, while blink detection only needs to be performed on the eye region. Therefore, the flow shown in fig. 4 may be adopted to process the sample image collected by the terminal device/server (i.e., step 101/step 201) to obtain a sample image covering only the eye region, from which the eye feature vector of the sample image is then obtained (i.e., step 102/step 202).
Step 401, the terminal device/the server performs face detection on the sample image to obtain a face region image.
Because the sample image collected by the terminal device/the server may include regions such as a face and a body, the sample image may be subjected to face detection to obtain an image only including the face region, and the image is referred to as a face region image. For the way of performing face detection on the sample image, a face detection algorithm such as AdaBoost may be adopted, and the algorithm is not limited as long as a face region image can be obtained based on the sample image.
In one example, considering that the size of the sample image collected by the terminal device/server may be large, in order to reduce the amount of calculation, the sample image may be scaled before step 401, for example, by scaling the width of the sample image to 320 pixels or scaling the height of the sample image to 240 pixels. Then, the terminal device/server performs face detection on the scaled sample image to obtain a face region image, i.e., executes step 401.
For example, assuming that the size of the sample image collected by the terminal device/server is 960 × 480, the size of the sample image may be scaled to 320 × 160, or the size of the sample image may be scaled to 480 × 240.
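As an illustration of the scaling step and step 401, the following Python sketch uses OpenCV's Haar cascade face detector (an AdaBoost-based detector); the cascade file, the scaling target, and the choice of the first detected face are assumptions of this sketch rather than requirements of the description.

```python
# Rough sketch of scaling plus step 401 (face detection), assuming OpenCV.
import cv2

def face_region(sample_image_bgr, target_width=320):
    # Scale the image down proportionally to reduce the amount of computation.
    h, w = sample_image_bgr.shape[:2]
    scale = target_width / float(w)
    small = cv2.resize(sample_image_bgr, (target_width, int(h * scale)))

    gray = cv2.cvtColor(small, cv2.COLOR_BGR2GRAY)
    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    x, y, fw, fh = faces[0]           # take the first detected face (assumption)
    return small[y:y + fh, x:x + fw]  # the face region image
```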
In step 402, the terminal device/server obtains an eye region image from the face region image.
Since the face region may include the eyes, nose, mouth, eyebrows, ears, and other regions, the terminal device/server may further obtain an eye region image from the face region image. For example, the eye region image is cut out from the face region image according to the statistical proportions of the facial features; the eye region image does not include the nose, mouth, eyebrows, ears, and other regions, and may include only the eye region. The cutting manner is not limited.
In step 403, the terminal device/server preprocesses the eye region image.
The preprocessing of the eye region image may include, but is not limited to: denoising, image enhancement, brightness transformation, and the like. Conventional methods may be used for the specific implementation of the denoising, image enhancement, and brightness transformation, and details are not described here.
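For illustration, a rough sketch of steps 402 and 403 is given below; the crop ratios stand in for the statistical facial proportions mentioned above and are assumptions of this sketch, and the preprocessing uses ordinary OpenCV denoising and histogram equalization as one possible choice.

```python
# Rough sketch of step 402 (eye region crop) and step 403 (preprocessing).
import cv2

def eye_region(face_img_bgr):
    h, w = face_img_bgr.shape[:2]
    # Eyes typically lie in the upper-middle band of a frontal face crop (assumed ratios).
    return face_img_bgr[int(0.20 * h):int(0.55 * h), int(0.10 * w):int(0.90 * w)]

def preprocess(eye_img_bgr):
    gray = cv2.cvtColor(eye_img_bgr, cv2.COLOR_BGR2GRAY)
    denoised = cv2.GaussianBlur(gray, (5, 5), 0)   # denoising
    return cv2.equalizeHist(denoised)              # brightness / contrast normalization
```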
Further, the terminal device/the server may analyze the pre-processed eye region image to obtain an eye feature vector corresponding to the sample image, that is, perform the above step 102/step 202.
Similar to the training process, in the detection stage, the detection image acquired by the terminal device may also include regions such as a human face and a body, and therefore, the terminal device may further perform the following processing on the detection image to obtain a detection image only for the eye region: the terminal equipment carries out face detection on the detected image to obtain a face area image; the terminal equipment acquires an eye area image from the face area image; the terminal device preprocesses the eye area image, thereby obtaining a detection image only for the eye area.
Further, the terminal device may further analyze the pre-processed eye region image to obtain an eye feature vector corresponding to the detected image, that is, perform step 302, which is not described in detail herein.
In the training phase, the following describes a process of acquiring the eye state of the sample image, the center of gravity position of the connected domain, and the black pixel proportion increase rate by the terminal device/the server side, with reference to a specific application scenario.
Case 1: acquiring the eye state of the sample image, as shown in fig. 5A.
In step 5011, the terminal device/server trains a feature classifier. The feature classifier may be used to record, among other things, the correspondence of image features to eye states (e.g., open or closed).
The feature classifier may be a Haar classifier, an SVM (Support Vector Machine) classifier, or a random forest classifier; the type of the feature classifier is not limited.
For the training process of the feature classifier, the terminal device/server may first acquire a training image (for example, one frame of training image); to distinguish it from the sample image, a video image marked with an eye state is referred to as a training image, while a video image marked with a blink state is referred to as a sample image. For example, when the eye state of user A is open, one frame of video image of user A with the eyes open may be captured, and the video image labeled "open" is taken as a training image. Moreover, the terminal device/server may perform the step of acquiring training images for a large number of users; the number of users may be selected according to actual needs and is not limited. For convenience of description, the following takes acquiring a training image for one user as an example.
Then, the terminal device/server obtains the image feature of the training image. The image feature may be a texture feature (a texture feature is a global feature that describes the surface properties of the scene in the training image and requires statistical calculation over a region containing a plurality of pixel points). The manner of obtaining the image feature is not limited; for example, gray-level co-occurrence matrix texture analysis, geometric methods, or model-based methods may be used.
The terminal device/server may then train a feature classifier for recording the correspondence of image features of the training image to the eye state (e.g., open or closed) of the training image.
In step 5012, the terminal device/the server performs feature detection on the sample image through the feature classifier.
In step 5013, the terminal device/the server obtains the eye state of the sample image according to the feature detection result of the sample image, where the eye state of the sample image may be open or closed.
The feature classifier was trained on training images in step 5011 and records the correspondence between image features and eye states. Based on this, after the terminal device/server collects a sample image, the sample image may be fed to the feature classifier, which performs feature detection on the sample image to obtain the image feature (i.e., the feature detection result) of the sample image; for this feature detection, gray-level co-occurrence matrix texture analysis, geometric methods, model-based methods, and the like may be adopted, and details are not repeated here. Then, the feature classifier can query the correspondence between image features and eye states with the image feature of the sample image, thereby obtaining the eye state corresponding to that image feature, which is the eye state of the sample image.
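A simplified sketch of steps 5011 to 5013 follows. The description allows a Haar, SVM, or random forest classifier over texture features; purely for brevity, this sketch substitutes an SVM over flattened, resized eye patches, so the feature choice here is an assumption and not the method prescribed by the text.

```python
# Simplified sketch of case 1: train an eye-state classifier and apply it.
import cv2
import numpy as np
from sklearn.svm import SVC

PATCH = (24, 24)

def eye_patch_feature(eye_gray):
    # Flattened, resized grayscale patch used as a stand-in for texture features.
    return cv2.resize(eye_gray, PATCH).flatten().astype(np.float32) / 255.0

def train_eye_state_classifier(training_images, labels):
    """Step 5011: train on eye images labeled with their eye state (0 = open, 1 = closed)."""
    X = np.stack([eye_patch_feature(img) for img in training_images])
    clf = SVC(kernel="rbf")
    clf.fit(X, labels)
    return clf

def eye_state(clf, eye_gray):
    """Steps 5012-5013: feature detection on a sample image and eye-state lookup."""
    return int(clf.predict(eye_patch_feature(eye_gray).reshape(1, -1))[0])
```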
Case 2: acquiring the center-of-gravity position of the connected domain of the sample image, as shown in fig. 5B. Fig. 5B refers to "the frame sample image" and "the previous frame sample image": the frame sample image is the sample image of the current frame, and the previous frame sample image is the sample image of the frame before it. For example, assuming that a first, a second, and a third frame of sample image are acquired, then when the first frame is processed, the frame sample image is the first frame and the previous frame sample image may be a default image, which can be selected according to actual needs and configured in the terminal device/server in advance; when the second frame is processed, the frame sample image is the second frame and the previous frame sample image is the first frame; when the third frame is processed, the frame sample image is the third frame and the previous frame sample image is the second frame. For another example, assuming that a first, a second, and a third frame of sample image are acquired, the second and third frames are used as sample images to be processed and the first frame is not; when the second frame is processed, the frame sample image is the second frame and the previous frame sample image is the first frame, and when the third frame is processed, the frame sample image is the third frame and the previous frame sample image is the second frame.
In step 5021, for a sample image, the terminal device/server obtains a difference image between the frame sample image and the previous frame sample image (also referred to as a differential image).
The terminal device/the server firstly obtains the gray value of each pixel point in the frame of sample image and the gray value of each pixel point in the previous frame of sample image; then, for each pixel point, subtracting the gray value in the previous frame of sample image from the gray value in the frame of sample image to obtain a gray value which is the gray value of the pixel point in the difference image; based on the gray value of each pixel point in the difference image, the difference image can be obtained.
Step 5022, the terminal device/the server determines the gravity center position of the connected domain of the difference image as the gravity center position of the connected domain of the frame sample image, wherein the gravity center position of the connected domain may include an abscissa value of the gravity center of the connected domain and an ordinate value of the gravity center of the connected domain, in the subsequent process, the abscissa value of the gravity center of the connected domain may be recorded as a position x, and the ordinate value of the gravity center of the connected domain may be recorded as a position y.
Specifically, if the user does not blink (for example, the user opens or closes the eyes all the time), the gray value of each pixel point of the sample image is not changed, and the gray value of each pixel point of the frame sample image is the same as the gray value of the previous frame sample image, so that the gray value of each pixel point in the difference image is 0, and thus, the region formed by all the pixel points of the difference image can be used as a connected domain, and the barycentric position (i.e., the centroid position) of the connected domain can be obtained. If the user blinks (for example, from eye opening to eye closing or from eye closing to eye opening), the gray value of the pixel point of the frame sample image in the blink area is different from the gray value of the previous frame sample image, and therefore, the gray value of the pixel point of the blink area in the difference image is not 0; moreover, the gray value of the pixel point of the frame sample image in other regions (regions outside the blinking region) is the same as the gray value of the previous frame sample image, so the gray value of the pixel point of the other regions in the difference image is 0; based on this, the region composed of the pixel points whose gray values are not 0 can be used as the connected domain, and the barycentric position (i.e., the centroid position) of the connected domain is obtained.
In one implementation, after obtaining the position x and the position y, the position x and the position y may be directly used as the center of gravity position of the final output. In another implementation, after the position x and the position y are obtained, it may be determined whether the position x and the position y are located in a specified interval (e.g., the specified interval is 0.0-1.0); if so, the position x and the position y can be taken as the center of gravity position of the final output; if not, normalization processing can be performed on the position x and the position y, so that the position x and the position y after normalization processing are located in a specified interval, and the position x and the position y after normalization processing are used as the gravity center position of final output.
In the process of normalizing the position x and the position y, scaling may be performed on the position x and the position y so that the position x and the position y after the normalization processing are located in the designated interval. For example, when the position x is 3 and the position y is 2, the position x and the position y may be reduced by 10 times, so that the position x after the normalization processing is 0.3 and the position y after the normalization processing is 0.2; alternatively, the position x and the position y may be reduced by 5 times, so that the position x after the normalization process is 0.6 and the position y after the normalization process is 0.4. Of course, the above normalization methods are only a few examples, and the normalization method is not limited.
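The following Python sketch illustrates steps 5021 and 5022 under the assumptions of this sketch: the difference image is taken as the absolute gray-value difference between the current and previous eye-region frames, the connected domain is taken as the set of non-zero pixels, and the centroid is normalized by dividing by the image width and height so that it falls in the interval 0.0-1.0. This is one possible realization of the description, not the only one.

```python
# Minimal sketch of case 2: difference image and normalized connected-domain centroid.
import cv2
import numpy as np

def connected_domain_centroid(curr_gray, prev_gray):
    diff = cv2.absdiff(curr_gray, prev_gray)   # step 5021: difference image (absolute difference)
    ys, xs = np.nonzero(diff)                  # pixels whose gray value is not 0
    h, w = diff.shape
    if len(xs) == 0:
        # No blink: every pixel is 0, so the whole image is the connected domain
        # and its centroid is the image center (normalized to about 0.5, 0.5).
        return 0.5, 0.5
    x = xs.mean() / w                          # step 5022: centroid, normalized into [0, 1]
    y = ys.mean() / h
    return x, y
```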
Case 3: acquiring the black pixel proportion increase rate of the sample image, as shown in fig. 5C. Fig. 5C also refers to the frame sample image and the previous frame sample image; for their meaning, see case 2.
Step 5031, for the sample image, the terminal device/server performs binarization processing on the frame sample image to obtain a binarized image, which is a black-and-white image.
Specifically, the terminal device/the server may count a gray value of each pixel point of the frame sample image, and then adjust the gray value to a first value (e.g., 255) if the gray value is greater than a preset threshold (which may be configured empirically), and adjust the gray value to a second value (e.g., 0) if the gray value is not greater than the preset threshold. After each pixel point is processed, the obtained image is a binary image, the gray value of the binary image is only two values, the first value is represented as white, and the second value is represented as black.
Step 5032, the terminal device/server counts the proportion value of black pixels in the binarized image.
The binarized image contains pixels whose gray value is the first value and pixels whose gray value is the second value; the pixels with the first value are white pixels, and the pixels with the second value are black pixels. Based on this, the black pixel proportion value = (number of pixels with the second gray value) / (number of pixels with the first gray value + number of pixels with the second gray value).
In step 5033, the terminal device/server obtains the black pixel proportion increase rate of the frame sample image from the black pixel proportion value corresponding to the frame sample image and the black pixel proportion value corresponding to the previous frame sample image. That is: black pixel proportion increase rate = (black pixel proportion of the frame sample image - black pixel proportion of the previous frame sample image) / black pixel proportion of the previous frame sample image.
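A minimal sketch of steps 5031 to 5033 follows; the binarization threshold value is an assumption (the description only says it is configured empirically), and the guard against division by zero is added for the sketch rather than taken from the description.

```python
# Minimal sketch of case 3: binarization, black pixel proportion, and its increase rate.
import cv2
import numpy as np

def black_pixel_ratio(eye_gray, threshold=100):
    _, binary = cv2.threshold(eye_gray, threshold, 255, cv2.THRESH_BINARY)  # step 5031
    return float(np.count_nonzero(binary == 0)) / binary.size               # step 5032

def black_pixel_increase_rate(curr_gray, prev_gray, threshold=100):
    prev_ratio = black_pixel_ratio(prev_gray, threshold)
    curr_ratio = black_pixel_ratio(curr_gray, threshold)
    if prev_ratio == 0:
        return 0.0  # guard against division by zero; not specified in the description
    return (curr_ratio - prev_ratio) / prev_ratio                            # step 5033
```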
Based on the above first, second, and third cases, the terminal device/the server may obtain the eye state, the position x, the position y, and the black pixel proportion increase rate of the sample image, and then may record the eye state, the position x, the position y, and the black pixel proportion increase rate of the multi-frame sample image to the second feature vector set in sequence, so as to obtain a mapping relationship between the second feature vector set and the blink state. For example, for 3 frame sample images marked with blinks, assuming that the eye feature vector of the first frame sample image is (0, 0.3, 0.4, 15%), the eye feature vector of the second frame sample image is (1, 0.3, 0.1, 3%), and the eye feature vector of the third frame sample image is (0, 0.3, 0.6, 25%), the second feature vector set may be (0, 0.3, 0.4, 15%, 1, 0.3, 0.1, 3%, 0, 0.3, 0.6, 25%), 0 indicates that the eye state is open, and 1 indicates that the eye state is closed. Then, a mapping of a second set of feature vectors (0, 0.3, 0.4, 15%, 1, 0.3, 0.1, 3%, 0, 0.3, 0.6, 25%) to the blink state may be generated.
In the above embodiment, the reason why the eye state is selected as the eye feature vector is: when the user blinks, the eye state changes, such as "open-close-open", and when the user does not blink, the eye state does not change, such as "open-open", "close-close", and the like, so that the eye state can be used as an index of the eye feature vector, and whether the user blinks or not can be detected. In addition, the reason for selecting the barycentric position of the connected domain as the eye feature vector is: when the user blinks, the barycentric position of the connected domain will change, and when the user does not blink, the barycentric position of the connected domain will not change, so that the barycentric position of the connected domain can be used as an index of the eye feature vector to detect whether the user blinks. In addition, the reason for choosing the black pixel proportion increase rate as the eye feature vector is: when the user blinks, the black pixel proportion growth rate is changed, and when the user does not blink, the black pixel proportion growth rate is not changed, so that the black pixel proportion growth rate can be used as an index of the eye feature vector, and whether the user blinks or not can be detected.
The above process describes an example in which the terminal device/the server acquires the eye state of the sample image, the gravity center position of the connected domain, and the black pixel proportion increase rate in the training phase, and similarly, in the detection phase, the terminal device may also acquire the eye state of the detected image, the gravity center position of the connected domain, and the black pixel proportion increase rate, and a specific implementation manner is similar to the above process, and is not repeated here.
After the training phase and the detection phase, the terminal device may detect the blink state of the user, and then may control the operation of the application program based on the blink state, which will be described below.
In one example, after the terminal device detects the blink state, an operation command corresponding to the blink state can be inquired from the command set, the inquired operation command is executed, and then the operation of the application program is controlled through the operation command; wherein, the command set can be used for recording the corresponding relation between the blink state and the operation command. In another example, after the terminal device detects the blink state, when the blink state is blinking, the user or mouse clicking operation may be directly simulated without querying the command set.
The correspondence between the blinking state and the operation command will be described below with reference to several specific applications.
Application 1: in order to control a photographing program (such as an APP with a photographing function) by a blinking motion, the correspondence between "blinking and photographing" may be recorded in the command set in advance, so that executing the queried operation command may include, but is not limited to: when the blink state is blinking, performing the photographing operation after a delay of M seconds (e.g., 3 seconds). In this way, even if the terminal device is far away from the user, the user does not need to operate the terminal device manually to take a picture; the user can control the terminal device to take a picture simply by blinking at it, which significantly improves the user experience.
Application 2: in order to control, by a blinking motion, an application program operated with a mouse (such as a game APP) on a terminal device that requires a mouse (such as a PC), the correspondence between "blinking and clicking" may be recorded in the command set in advance, so that executing the queried operation command may include, but is not limited to: when the blink state is blinking, simulating a mouse click operation. In this way, the user does not need to operate the mouse manually; the user can make the terminal device simulate a mouse click simply by blinking at it, which significantly improves the user experience.
Application 3: in order to control, by a blinking motion, an application program operated by finger clicks (such as a game APP) on a terminal device that needs to be clicked by hand (such as a mobile terminal), the correspondence between "blinking and clicking" may be recorded in the command set in advance, so that executing the queried operation command may include, but is not limited to: when the blink state is blinking, simulating a user click operation. In this way, the user does not need to operate the terminal device manually; the user can make the terminal device simulate a click simply by blinking at it, which significantly improves the user experience.
Application 4: in order to control a live-streaming program (such as an APP with a live interaction function) by a blinking motion, the correspondence between "blinking and expression" may be recorded in the command set in advance, so that executing the queried operation command may include, but is not limited to: when the blink state is blinking, controlling the expression of a virtual game character. In this way, the user can make the terminal device output the expression of a virtual game character simply by blinking at it, and the expression can then be displayed in the live-streaming program so that other users watching the stream can see it.
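As a rough illustration of looking up and executing an operation command from the command set, the sketch below maps the blink state to a callable; the delay value and the use of the pyautogui library to simulate a click are assumptions made for this example and are not mandated by the description.

```python
# Rough sketch of the command set: blink state -> operation command (applications 1-3).
import time

def take_photo_after_delay(m_seconds=3):
    time.sleep(m_seconds)
    # ... trigger the camera of the terminal device here ...

def simulate_click():
    import pyautogui          # third-party library, used here only as an example
    pyautogui.click()

command_set = {"blinking": take_photo_after_delay}   # or simulate_click, per application

def on_blink_state(blink_state):
    command = command_set.get(blink_state)
    if command is not None:
        command()
```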
Based on the same concept as the above method, an embodiment of the present application further provides a blink detection device 120, which is applied to the terminal device 10. The blink detection device 120 may be implemented by software, by hardware, or by a combination of hardware and software. Taking a software implementation as an example, the device as a logical apparatus is formed by the processor 11 of the terminal device 10 reading corresponding computer program instructions from the non-volatile memory 12. In terms of hardware, fig. 6 shows the hardware structure of the terminal device 10 where the blink detection device 120 is located; besides the processor 11 and the non-volatile memory 12 shown in fig. 6, the terminal device 10 may further include other hardware, such as a forwarding chip responsible for processing messages, a network interface, and a memory. In terms of hardware structure, the terminal device 10 may also be a distributed device and may include a plurality of interface cards so as to extend message processing at the hardware level.
As shown in fig. 7, a structure of a blink detection device according to the present application includes:
the acquisition module 1201 is used for acquiring multi-frame detection images;
an obtaining module 1202, configured to obtain an eye feature vector of a detection image;
a combining module 1203, configured to combine the eye feature vectors of the multiple frames of detection images into a first feature vector set;
a determining module 1204, configured to query a mapping relationship through the first feature vector set, and determine a blink state corresponding to the first feature vector set; the mapping relationship is specifically a mapping relationship between the second feature vector set and the blink state.
In one example, the blink detection device further comprises: a generation module (not shown in the figure); the generating module is used for generating a mapping relation between the second characteristic vector set and the blink state;
the generating module is specifically used for acquiring multi-frame sample images; acquiring an eye characteristic vector of a sample image; forming a second feature vector set by the eye feature vectors of the multi-frame sample images; generating a mapping relation between the second characteristic vector set and the blink state corresponding to the multi-frame sample image; or receiving a notification message sent by a server; analyzing the mapping relation between the second characteristic vector set and the blink state from the notification message; generating a mapping relation between the second feature vector set and the blink state.
The eye feature vector of the detection image comprises one or any combination of the following components: an eye state, the eye state being open or closed; the position of the center of gravity of the connected domain; black pixel proportional growth rate;
the obtaining module 1202 is specifically configured to, in the process of obtaining the eye feature vector of a detection image: perform feature detection on the detection image through the feature classifier, and obtain the eye state of the detection image according to the feature detection result; and/or obtain, for the detection image, a difference image between this frame and the previous frame, and determine the gravity center position of the connected domain of the difference image as the gravity center position of the connected domain of this frame; and/or binarize this frame to obtain a binarized image, count the black pixel proportion value in the binarized image, and obtain the black pixel proportion growth rate of this frame according to the black pixel proportion values of this frame and of the previous frame.
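The three eye features handled by the obtaining module can be sketched as follows. This is an illustration only: the description does not name a concrete classifier, thresholds or growth-rate formula, so the Haar cascade, the fixed threshold values and the relative growth rate below are assumptions.

    import cv2
    import numpy as np

    # The "feature classifier" is not named in the description; a Haar eye cascade
    # is used below purely as an illustrative stand-in.
    eye_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_eye.xml")

    def eye_state(gray_eye_region):
        # 'open' if the classifier still finds an eye in the region, else 'closed'.
        hits = eye_cascade.detectMultiScale(gray_eye_region, 1.1, 3)
        return "open" if len(hits) > 0 else "closed"

    def connected_domain_gravity_center(prev_gray, cur_gray, diff_thresh=25):
        # Gravity center of the largest connected domain of the difference image
        # between this frame and the previous frame (threshold value is a guess).
        diff = cv2.absdiff(cur_gray, prev_gray)
        _, mask = cv2.threshold(diff, diff_thresh, 255, cv2.THRESH_BINARY)
        num, _, stats, centroids = cv2.connectedComponentsWithStats(mask)
        if num <= 1:                      # only the background label: no change detected
            return None
        largest = 1 + int(np.argmax(stats[1:, cv2.CC_STAT_AREA]))
        return tuple(centroids[largest])  # (x, y) gravity center position

    def black_pixel_proportion(gray_eye_region, bin_thresh=60):
        # Proportion of black pixels after binarization (threshold value is a guess).
        _, binary = cv2.threshold(gray_eye_region, bin_thresh, 255, cv2.THRESH_BINARY)
        return float(np.mean(binary == 0))

    def black_pixel_growth_rate(prev_proportion, cur_proportion, eps=1e-6):
        # The description only says the growth rate is obtained from the two
        # proportion values; a relative change is one possible formula.
        return (cur_proportion - prev_proportion) / (prev_proportion + eps)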
As shown in fig. 8, a structure of a blink detection device according to the present application includes:
the acquisition module 1205 is configured to acquire multiple frames of detection images;
an obtaining module 1206, configured to obtain an eye feature vector of the detected image;
the combining module 1207 is used for combining the eye feature vectors of the multiple frames of detection images into a first feature vector set;
a determining module 1208, configured to query a mapping relationship through the first feature vector set, and determine a blink state corresponding to the first feature vector set; the mapping relation is specifically a mapping relation between a second characteristic vector set and a blink state;
a query module 1209, configured to query, from a command set, an operation command corresponding to the blink state; the command set is used for recording the corresponding relation between the blink state and the operation command;
and a processing module 1210 configured to execute the queried operation command.
The processing module 1210 may be specifically configured to, in the process of executing the queried operation command: perform a photographing operation after a delay of M seconds when the blink state is a blink; or simulate a mouse click operation when the blink state is a blink; or simulate the user's click operation when the blink state is a blink; or control the expression of a virtual game character when the blink state is a blink.
As shown in fig. 9, a structure of a blink detection device according to the present application includes:
the acquisition module 1211 is configured to acquire a plurality of frames of detection images;
an obtaining module 1212, configured to obtain an eye feature vector of the detected image;
a combining module 1213, configured to combine the eye feature vectors of the multiple frames of detected images into a first feature vector set;
a determining module 1214, configured to query a mapping relationship through the first feature vector set, and determine a blink state corresponding to the first feature vector set; the mapping relation is specifically a mapping relation between a second characteristic vector set and a blink state;
and a processing module 1215, configured to simulate the click operation of a user or a mouse when the blink state is a blink.
Based on the same application concept as the method, an embodiment of the present application further provides a blink detection apparatus 220, which may be applied to the server 20. The blink detection apparatus 220 may be implemented in software, in hardware, or in a combination of the two. Taking a software implementation as an example, the apparatus is formed as a logical device by the processor 21 of the server 20 in which it is located reading the corresponding computer program instructions from the non-volatile memory 22. At the hardware level, fig. 10 shows the hardware structure of the server 20 in which the blink detection apparatus 220 is located; besides the processor 21 and the non-volatile memory 22 shown in fig. 10, the server 20 may further include other hardware responsible for message processing, such as a forwarding chip, a network interface, and a memory. In terms of hardware structure, the server 20 may also be a distributed device and may include a plurality of interface cards, so that message processing can be extended at the hardware level.
As shown in fig. 11, a structure of a blink detection device according to the present application includes:
the acquisition module 2201 is used for acquiring multi-frame sample images;
an obtaining module 2202, configured to obtain an eye feature vector of the sample image;
a combining module 2203, configured to combine the eye feature vectors of the multiple frames of sample images into a second feature vector set;
a generating module 2204, configured to generate a mapping relationship between the second feature vector set and a blink state corresponding to the multi-frame sample image;
a sending module 2205, configured to send, through a notification message, the mapping relationship between the second feature vector set and the blink state to a terminal device, so that the terminal device detects the blink state of the user according to the mapping relationship.
The eye feature vector of the sample image comprises one or any combination of the following: an eye state, the eye state being open or closed; the position of the center of gravity of the connected domain; black pixel proportional growth rate;
the obtaining module 2202 is specifically configured to, in the process of obtaining the eye feature vector of a sample image: perform feature detection on the sample image through the feature classifier, and obtain the eye state of the sample image according to the feature detection result; and/or obtain, for the sample image, a difference image between this frame and the previous frame, and determine the gravity center position of the connected domain of the difference image as the gravity center position of the connected domain of this frame; and/or binarize this frame to obtain a binarized image, count the black pixel proportion value in the binarized image, and obtain the black pixel proportion growth rate of this frame according to the black pixel proportion values of this frame and of the previous frame.
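A server-side sketch of generating the mapping relationship from labelled sample sequences and packaging it in a notification message; the JSON encoding and the message fields are assumptions, since the description only states that a notification message carrying the mapping relationship is sent to the terminal device.

    import json

    def build_mapping(labelled_samples):
        # labelled_samples: list of (second feature vector set, blink state) pairs,
        # e.g. ([["open"], ["closed"], ["open"]], "blink").
        mapping = {}
        for feature_vector_set, blink_state in labelled_samples:
            # The set keeps the acquisition order of the multi-frame sample images.
            mapping[json.dumps(feature_vector_set)] = blink_state
        return mapping

    def notification_message(mapping):
        # The wire format is a guess; only the existence of a notification message
        # carrying the mapping relationship is taken from the description.
        return json.dumps({"type": "blink_mapping_update", "mapping": mapping})

    msg = notification_message(build_mapping([([["open"], ["closed"], ["open"]], "blink")]))
    print(msg)   # the terminal device would parse this message and install the mapping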
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email messaging device, game console, tablet computer, wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being divided into various units by function, and are described separately. Of course, the functionality of the units may be implemented in one or more software and/or hardware when implementing the present application.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Furthermore, these computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (which may include, but is not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (21)

1. A blink detection method is applied to a terminal device, and is characterized by comprising the following steps:
collecting a plurality of frames of detection images;
acquiring an eye characteristic vector of a detection image; wherein, the eye feature vector of the detection image comprises one or any combination of the following: an eye state, the eye state being open or closed; the position of the center of gravity of the connected domain; black pixel proportional growth rate;
forming a first feature vector set by eye feature vectors of a plurality of frames of detection images;
searching a mapping relation through the first feature vector set, and determining a blink state corresponding to the first feature vector set; the mapping relation is the mapping relation between the second characteristic vector set and the blink state.
2. The method of claim 1,
the mapping relation between the second characteristic vector set and the blink state is determined by the following steps:
collecting a plurality of frame sample images;
acquiring an eye characteristic vector of a sample image;
forming a second feature vector set by the eye feature vectors of the multi-frame sample images;
generating a mapping relation between the second characteristic vector set and the blink state corresponding to the multi-frame sample image;
alternatively,
receiving a notification message sent by a server;
analyzing the mapping relation between the second characteristic vector set and the blink state from the notification message;
generating a mapping relation between the second feature vector set and the blink state.
3. The method of claim 1, wherein the eye feature vector of the detected image comprises an eye state; the acquiring of the eye feature vector of the detection image comprises:
carrying out feature detection on the detected image through a feature classifier;
and acquiring the eye state of the detection image according to the characteristic detection result of the detection image.
4. The method of claim 1, wherein the eye feature vector of the inspection image comprises a barycentric location of a connected component; the acquiring of the eye feature vector of the detection image comprises:
and acquiring a difference image of the frame detection image and the previous frame detection image aiming at the detection image, and determining the gravity center position of the connected domain of the difference image as the gravity center position of the connected domain of the frame detection image.
5. The method of claim 1, wherein the eye feature vector of the detected image comprises a black pixel scale growth rate; the acquiring of the eye feature vector of the detection image comprises:
carrying out binarization processing on the frame detection image aiming at the detection image to obtain a binarized image, and counting a black pixel proportion value in the binarized image;
and acquiring the proportion increase rate of the black pixels of the frame detection image according to the proportion value of the black pixels corresponding to the frame detection image and the proportion value of the black pixels corresponding to the previous frame detection image.
6. The method of claim 1,
the process of obtaining the eye feature vector of the detection image specifically includes:
carrying out face detection on a detection image to obtain a face region image, acquiring an eye region image from the face region image, and preprocessing the eye region image;
and analyzing the pre-processed eye region image corresponding to the detection image to obtain the eye characteristic vector of the detection image.
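Purely as an illustration of the steps recited in claim 6 (face detection, eye region extraction, preprocessing), and not as the claimed implementation, a sketch using an OpenCV Haar cascade might look like the following; the cascade, the upper-band crop and the histogram equalization are assumptions.

    import cv2

    face_cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

    def preprocessed_eye_region(frame_bgr):
        gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
        faces = face_cascade.detectMultiScale(gray, 1.1, 5)
        if len(faces) == 0:
            return None                                        # no face region image found
        x, y, w, h = faces[0]                                  # face region image
        eye_band = gray[y + h // 5 : y + h // 2, x : x + w]    # rough eye region image
        return cv2.equalizeHist(eye_band)                      # simple preprocessing step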
7. The method according to claim 1, wherein the process of grouping the eye feature vectors of the multiple frames of detected images into the first feature vector set specifically includes:
and generating a first characteristic vector set, and sequentially recording the eye characteristic vectors of the multiple frames of detection images to the first characteristic vector set according to the acquisition sequence of the multiple frames of detection images.
8. A blink detection method is applied to a terminal device, and is characterized by comprising the following steps:
detecting a blink state of the user based on the method of any one of claims 1-7;
inquiring an operation command corresponding to the blink state from a command set; the command set is used for recording the corresponding relation between the blink state and the operation command;
and executing the inquired operation command.
9. The method of claim 8,
the process of executing the queried operation command specifically includes:
when the blink state is a blink, performing a photographing operation after delaying for M seconds; alternatively,
when the blinking state is blinking, simulating the clicking operation of a mouse; alternatively,
when the blinking state is blinking, simulating the clicking operation of the user; alternatively,
and when the blinking state is blinking, controlling the expression of the virtual game character.
10. A blink detection method is applied to a terminal device, and is characterized by comprising the following steps:
detecting a blink state of the user based on the method of any one of claims 1-7;
and when the blinking state is blinking, simulating the clicking operation of a user or a mouse.
11. A blink detection method is applied to a server side and is characterized by comprising the following steps:
collecting a plurality of frame sample images;
acquiring an eye characteristic vector of a sample image; wherein the eye feature vector of the sample image comprises one or any combination of the following: an eye state, the eye state being open or closed; the position of the center of gravity of the connected domain; black pixel proportional growth rate;
forming a second feature vector set by the eye feature vectors of the multi-frame sample images;
generating a mapping relation between the second characteristic vector set and the blink state corresponding to the multi-frame sample image;
and sending the mapping relation between the second characteristic vector set and the blink state to terminal equipment through a notification message, so that the terminal equipment detects the blink state of the user according to the mapping relation.
12. The method of claim 11,
the obtaining of the eye feature vector of the sample image comprises:
carrying out feature detection on the sample image through a feature classifier, and acquiring the eye state of the sample image according to the feature detection result of the sample image; and/or the presence of a gas in the gas,
acquiring a difference image of the frame of sample image and a previous frame of sample image aiming at the sample image, and determining the gravity center position of a connected domain of the difference image as the gravity center position of the connected domain of the frame of sample image; and/or the presence of a gas in the gas,
carrying out binarization processing on the frame of sample image aiming at the sample image to obtain a binarized image, and counting a black pixel proportion value in the binarized image; and acquiring the black pixel proportion increase rate of the frame sample image according to the black pixel proportion value corresponding to the frame sample image and the black pixel proportion value corresponding to the previous frame sample image.
13. The method of claim 11,
the process of obtaining the eye feature vector of the sample image specifically includes:
carrying out face detection on a sample image to obtain a face region image, acquiring an eye region image from the face region image, and preprocessing the eye region image;
and analyzing the preprocessed eye area image corresponding to the sample image to obtain the eye characteristic vector of the sample image.
14. An apparatus for detecting blink, applied to a terminal device, the apparatus comprising:
the acquisition module is used for acquiring multi-frame detection images;
the acquisition module is used for acquiring the eye characteristic vector of the detection image; wherein, the eye feature vector of the detection image comprises one or any combination of the following: an eye state, the eye state being open or closed; the position of the center of gravity of the connected domain; black pixel proportional growth rate;
the combination module is used for combining the eye characteristic vectors of the multi-frame detection images into a first characteristic vector set;
the determining module is used for querying a mapping relation through the first feature vector set and determining a blink state corresponding to the first feature vector set;
the mapping relationship is specifically a mapping relationship between the second feature vector set and the blink state.
15. The apparatus of claim 14, further comprising: the generating module is used for generating a mapping relation between the second characteristic vector set and the blink state;
the generating module is specifically used for acquiring multi-frame sample images; acquiring an eye characteristic vector of a sample image; forming a second feature vector set by the eye feature vectors of the multi-frame sample images; generating a mapping relation between the second characteristic vector set and the blink state corresponding to the multi-frame sample image; alternatively,
receiving a notification message sent by a server; analyzing the mapping relation between the second characteristic vector set and the blink state from the notification message; generating a mapping relation between the second feature vector set and the blink state.
16. The apparatus of claim 14,
the acquisition module is specifically used for carrying out feature detection on the detection image through the feature classifier in the process of acquiring the eye feature vector of the detection image and acquiring the eye state of the detection image according to the feature detection result of the detection image; and/or acquiring a difference image of the frame detection image and the previous frame detection image aiming at the detection image, and determining the gravity center position of the connected domain of the difference image as the gravity center position of the connected domain of the frame detection image; and/or, carrying out binarization processing on the frame detection image aiming at the detection image to obtain a binarized image, counting the proportion value of black pixels in the binarized image, and acquiring the proportion increase rate of the black pixels of the frame detection image according to the proportion value of the black pixels corresponding to the frame detection image and the proportion value of the black pixels corresponding to the previous frame detection image.
17. An apparatus for detecting blink, applied to a terminal device, the apparatus comprising:
the acquisition module is used for acquiring multi-frame detection images;
the acquisition module is used for acquiring the eye characteristic vector of the detection image; wherein, the eye feature vector of the detection image comprises one or any combination of the following: an eye state, the eye state being open or closed; the position of the center of gravity of the connected domain; black pixel proportional growth rate;
the combination module is used for combining the eye characteristic vectors of the multi-frame detection images into a first characteristic vector set;
the determining module is used for querying a mapping relation through the first feature vector set and determining a blink state corresponding to the first feature vector set; the mapping relation is specifically a mapping relation between a second characteristic vector set and a blink state;
the query module is used for querying an operation command corresponding to the blink state from a command set; the command set is used for recording the corresponding relation between the blink state and the operation command;
and the processing module is used for executing the inquired operation command.
18. The apparatus of claim 17,
the processing module is specifically used for executing photographing operation after delaying M seconds when the blinking state is blinking in the process of executing the inquired operation command; or when the blinking state is blinking, simulating the clicking operation of a mouse; or when the blink state is blink, simulating the click operation of the user; alternatively, the expression of the virtual game character is controlled when the blinking state is blinking.
19. An apparatus for detecting blink, applied to a terminal device, the apparatus comprising:
the acquisition module is used for acquiring multi-frame detection images;
the acquisition module is used for acquiring the eye characteristic vector of the detection image; wherein, the eye feature vector of the detection image comprises one or any combination of the following: an eye state, the eye state being open or closed; the position of the center of gravity of the connected domain; black pixel proportional growth rate;
the combination module is used for combining the eye characteristic vectors of the multi-frame detection images into a first characteristic vector set;
the determining module is used for querying a mapping relation through the first feature vector set and determining a blink state corresponding to the first feature vector set; the mapping relation is specifically a mapping relation between a second characteristic vector set and a blink state;
and the processing module is used for simulating the clicking operation of a user or a mouse when the blinking state is blinking.
20. An eye blink detection device applied to a server side, the device comprising:
the acquisition module is used for acquiring multi-frame sample images;
the acquisition module is used for acquiring the eye characteristic vector of the sample image; wherein the eye feature vector of the sample image comprises one or any combination of the following: an eye state, the eye state being open or closed; the position of the center of gravity of the connected domain; black pixel proportional growth rate;
the combining module is used for combining the eye characteristic vectors of the multi-frame sample images into a second characteristic vector set;
the generating module is used for generating a mapping relation between the second characteristic vector set and the blink state corresponding to the multi-frame sample image;
and the sending module is used for sending the mapping relation between the second characteristic vector set and the blink state to the terminal equipment through a notification message so that the terminal equipment can detect the blink state of the user according to the mapping relation.
21. The apparatus of claim 20,
the acquisition module is specifically used for carrying out feature detection on the sample image through the feature classifier in the process of acquiring the eye feature vector of the sample image and acquiring the eye state of the sample image according to the feature detection result of the sample image; and/or acquiring a difference image of the frame sample image and the previous frame sample image aiming at the sample image, and determining the gravity center position of the connected domain of the difference image as the gravity center position of the connected domain of the frame sample image; and/or, carrying out binarization processing on the frame of sample image aiming at the sample image to obtain a binarized image, and counting a black pixel proportion value in the binarized image; and acquiring the black pixel proportion increase rate of the frame sample image according to the black pixel proportion value corresponding to the frame sample image and the black pixel proportion value corresponding to the previous frame sample image.
CN201710474295.XA 2017-06-21 2017-06-21 Blink detection method and device Active CN109101103B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710474295.XA CN109101103B (en) 2017-06-21 2017-06-21 Blink detection method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710474295.XA CN109101103B (en) 2017-06-21 2017-06-21 Blink detection method and device

Publications (2)

Publication Number Publication Date
CN109101103A CN109101103A (en) 2018-12-28
CN109101103B true CN109101103B (en) 2022-04-12

Family

ID=64795999

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710474295.XA Active CN109101103B (en) 2017-06-21 2017-06-21 Blink detection method and device

Country Status (1)

Country Link
CN (1) CN109101103B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103971131A (en) * 2014-05-13 2014-08-06 华为技术有限公司 Preset facial expression recognition method and device
CN104571508A (en) * 2014-12-29 2015-04-29 北京元心科技有限公司 Method for operating data displayed by mobile terminal
US20160041614A1 (en) * 2014-08-06 2016-02-11 Samsung Display Co., Ltd. System and method of inducing user eye blink
CN105678250A (en) * 2015-12-31 2016-06-15 北京小孔科技有限公司 Face identification method in video and face identification device in video
CN106709400A (en) * 2015-11-12 2017-05-24 阿里巴巴集团控股有限公司 Sense organ opening and closing state recognition method, sense organ opening and closing state recognition device and client

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103577516A (en) * 2013-07-01 2014-02-12 北京百纳威尔科技有限公司 Method and device for displaying contents
CN106127139B (en) * 2016-06-21 2019-06-25 东北大学 A kind of dynamic identifying method of MOOC course middle school student's facial expression

Also Published As

Publication number Publication date
CN109101103A (en) 2018-12-28

Similar Documents

Publication Publication Date Title
Jegham et al. Vision-based human action recognition: An overview and real world challenges
CN107197384B (en) The multi-modal exchange method of virtual robot and system applied to net cast platform
CN109522815B (en) Concentration degree evaluation method and device and electronic equipment
CN105809144B (en) A kind of gesture recognition system and method using movement cutting
CN104794462B (en) A kind of character image processing method and processing device
US20170238859A1 (en) Mental state data tagging and mood analysis for data collected from multiple sources
Abd El Meguid et al. Fully automated recognition of spontaneous facial expressions in videos using random forest classifiers
CN113422977B (en) Live broadcast method and device, computer equipment and storage medium
Khan et al. Saliency-based framework for facial expression recognition
US20190222806A1 (en) Communication system and method
US20150313530A1 (en) Mental state event definition generation
US20190340780A1 (en) Engagement value processing system and engagement value processing apparatus
CN105451029B (en) A kind of processing method and processing device of video image
JP5771127B2 (en) Attention level estimation device and program thereof
CN113723530B (en) Intelligent psychological assessment system based on video analysis and electronic psychological sand table
CN113160231A (en) Sample generation method, sample generation device and electronic equipment
CN111860057A (en) Face image blurring and living body detection method and device, storage medium and equipment
Zhu et al. Egoobjects: A large-scale egocentric dataset for fine-grained object understanding
KR20160046399A (en) Method and Apparatus for Generation Texture Map, and Database Generation Method
CN112492297B (en) Video processing method and related equipment
CN113688804A (en) Multi-angle video-based action identification method and related equipment
JP2015011526A (en) Action recognition system, method, and program, and recognizer construction system
US9501710B2 (en) Systems, methods, and media for identifying object characteristics based on fixation points
JP7438690B2 (en) Information processing device, image recognition method, and learning model generation method
CN107944424A (en) Front end human image collecting and Multi-angle human are distributed as comparison method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant