WO2020173135A1 - Neural network training and eye opening and closing state detection method, apparatus, and device - Google Patents

Neural network training and eye opening and closing state detection method, apparatus, and device

Info

Publication number
WO2020173135A1
Authority
WO
WIPO (PCT)
Prior art keywords
eye
open
closed
detection
eyes
Prior art date
Application number
PCT/CN2019/118127
Other languages
French (fr)
Chinese (zh)
Inventor
王飞
钱晨
Original Assignee
北京市商汤科技开发有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京市商汤科技开发有限公司
Priority to JP2021541183A (patent JP7227385B2)
Priority to KR1020217023286A (patent KR20210113621A)
Publication of WO2020173135A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/18 Eye characteristics, e.g. of the iris
    • G06V40/197 Matching; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects

Definitions

  • The present disclosure relates to computer vision technology, and in particular to a neural network training method, a neural network training device, an eye open and closed state detection method, an eye open and closed state detection device, an intelligent driving control method, an intelligent driving control device, electronic equipment, a computer-readable storage medium, and a computer program.
  • Eye open and closed state detection determines whether the eyes are open or closed. It can be used in fatigue monitoring, living body recognition, facial expression recognition, and other fields. For example, in assisted driving technology, the eye open and closed state of the driver is detected, and the detection result is used to determine whether the driver is in a fatigued driving state, so as to realize fatigue driving monitoring. Accurately detecting the open and closed state of the eyes, and avoiding misjudgment as far as possible, helps improve the safety of vehicle driving.
  • the embodiments of the present disclosure provide a technical solution for neural network training, eye open and closed state detection, and intelligent driving control.
  • A neural network training method is provided, which includes: performing, via a neural network for open and closed eye detection to be trained, eye open and closed state detection processing on multiple eye images in the image set corresponding to each of at least two open and closed eye detection training tasks, and outputting eye open and closed state detection results, wherein the eye images contained in different image sets are at least partially different; and determining, according to the eye open and closed annotation information of the eye images and the eye open and closed state detection results output by the neural network, the loss corresponding to each of the at least two open and closed eye detection training tasks, and adjusting the network parameters of the neural network according to the respective losses of the at least two open and closed eye detection training tasks.
  • A method for detecting the open and closed state of eyes is provided, including: acquiring an image to be processed; and performing eye open and closed state detection processing on the image to be processed through a neural network, and outputting an eye open and closed state detection result; wherein the neural network is obtained by training with the neural network training method described in the foregoing implementation.
  • An intelligent driving control method is provided, including: acquiring an image to be processed collected by a camera set on a vehicle; performing eye open and closed state detection processing on the image to be processed via a neural network, and outputting an eye open and closed state detection result; determining the fatigue state of a target object at least according to the eye open and closed state detection results that belong to the same target object in multiple images to be processed with a time series relationship; and forming a corresponding instruction according to the fatigue state of the target object, and outputting the instruction; wherein the neural network is obtained by training with the neural network training method described in the foregoing embodiment.
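The step from per-frame detection results to a fatigue state can be illustrated with a short sketch. The disclosure does not specify the rule at this point, so the sliding-window closed-eye ratio used here, together with the `window` and `threshold` values and the instruction names, is purely an assumption for illustration.

```python
from collections import deque


def closed_eye_ratio(states):
    """Return the fraction of entries in `states` equal to "closed"."""
    if not states:
        return 0.0
    return sum(1 for s in states if s == "closed") / len(states)


def fatigue_flags(frame_states, window=5, threshold=0.5):
    """Return one flag per frame for a single target object.

    A flag is True once `window` frames are available and the closed-eye
    ratio over the most recent `window` frames reaches `threshold`.
    Both values are hypothetical, not taken from the disclosure.
    """
    recent = deque(maxlen=window)
    flags = []
    for state in frame_states:
        recent.append(state)
        full = len(recent) == window
        flags.append(full and closed_eye_ratio(recent) >= threshold)
    return flags


# Time-ordered detection results for the same target object.
frames = ["open", "open", "closed", "closed", "closed", "closed", "open"]
flags = fatigue_flags(frames)

# Hypothetical instruction derived from the fatigue state.
instruction = "alert_driver" if any(flags) else "no_action"
```

A real system would more likely combine this signal with other cues, which is consistent with the wording "at least according to" in the method.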
  • A neural network training device is provided, which includes: a neural network for open and closed eye detection to be trained, used to perform eye open and closed state detection processing on multiple eye images in the image set corresponding to each of at least two open and closed eye detection training tasks, and to output eye open and closed state detection results, wherein the eye images contained in different image sets are at least partially different; and an adjustment module, used to determine, according to the eye open and closed annotation information of the eye images and the eye open and closed state detection results output by the neural network, the loss corresponding to each of the at least two open and closed eye detection training tasks, and to adjust the network parameters of the neural network according to the respective losses of the at least two open and closed eye detection training tasks.
  • An eye open and closed state detection device is provided, including: an acquisition module for acquiring an image to be processed; and a neural network for performing eye open and closed state detection processing on the image to be processed and outputting an eye open and closed state detection result; wherein the neural network is obtained by training with the neural network training device described in the foregoing embodiment.
  • An intelligent driving control device is provided, including: an acquisition module for acquiring an image to be processed collected by a camera set on a vehicle; a neural network for performing eye open and closed state detection processing on the image to be processed and outputting an eye open and closed state detection result; a fatigue state determination module, used to determine the fatigue state of a target object at least according to the eye open and closed state detection results that belong to the same target object in multiple images to be processed with a time series relationship; and an instruction module for forming a corresponding instruction according to the fatigue state of the target object and outputting the instruction; wherein the neural network is obtained by training with the neural network training device described in the above embodiment.
  • An electronic device is provided, including: a memory for storing a computer program; and a processor for executing the computer program stored in the memory, wherein any method embodiment of the present disclosure is implemented when the computer program is executed.
  • A computer-readable storage medium is provided, having a computer program stored thereon, wherein the computer program, when executed by a processor, implements any method embodiment of the present disclosure.
  • A computer program is provided, including computer instructions which, when run in a processor of a device, implement any method embodiment of the present disclosure.
  • The inventor found that a neural network trained in the traditional single-task manner, that is, on the image set of one task, usually detects open and closed eyes accurately in the scene corresponding to that task.
  • In other scenes, however, it is difficult to guarantee the accuracy of open and closed eye detection.
  • If the images collected from multiple different scenes are simply pooled into one image set for neural network training, the training does not distinguish whether the images come from different scenes or correspond to different training tasks, and each iteration feeds the neural network an image subset (batch) taken from the pooled set.
  • The distribution of these image subsets is then uncontrollable: a subset may contain many images from one scene but few or no images from the other scenes.
  • The distribution of the image subsets also differs from one training iteration to the next.
  • In other words, the distribution of the image subset in each iteration is too random, and no targeted loss is calculated for the different training tasks, so the training process cannot control how well the neural network accounts for the different training tasks, and the detection accuracy of the trained neural network across the scenes of the different tasks cannot be guaranteed.
  • Based on the neural network training method and device, eye open and closed state detection method and device, intelligent driving control method and device, electronic equipment, computer-readable storage medium, and computer program provided by the present disclosure, a corresponding image set is determined for each of multiple different open and closed eye detection tasks, the eye images for a single training iteration of the neural network are taken from the multiple image sets, the loss of the neural network's eye open and closed detection results is determined for each training task from those eye images, and the network parameters of the neural network are adjusted according to each loss.
  • As a result, the eye image subset fed to the neural network in each training iteration includes eye images corresponding to every training task, and the loss of every training task is calculated in a targeted manner. The neural network therefore learns open and closed eye detection for every training task during training and balances the learning across the different tasks, so that the trained neural network improves its detection accuracy for all of the tasks at the same time.
  • Improving the accuracy of open and closed eye detection for eye images from each of the multiple scenes corresponding to the training tasks helps improve the universality and generalization of neural-network-based open and closed eye detection across different scenarios, and better meets the actual application requirements of multiple scenarios.
  • FIG. 1 is a flowchart of an embodiment of the neural network training method of the present disclosure
  • FIG. 2 is a schematic diagram of an embodiment of multiple open and closed eye detection training tasks in the present disclosure
  • FIG. 3 is a flowchart of an embodiment of the method for detecting the open and closed state of the eyes of the present disclosure
  • FIG. 5 is a schematic structural diagram of an embodiment of the neural network training device of the present disclosure.
  • FIG. 6 is a schematic structural diagram of an embodiment of the eye open/close state detection device of the present disclosure.
  • FIG. 7 is a schematic structural diagram of an embodiment of the intelligent driving control device of the present disclosure.
  • FIG. 8 is a block diagram of an exemplary device for implementing the embodiments of the present disclosure.
  • The embodiments of the present disclosure can be applied to electronic devices such as terminal devices, computer systems, and servers, which can operate with many other general-purpose or special-purpose computing system environments or configurations.
  • Examples of well-known terminal devices, computing systems, environments, and/or configurations suitable for use with electronic devices such as terminal devices, computer systems, and servers include but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, network personal computers, small computer systems, large computer systems, and distributed cloud computing technology environments that include any of the above systems.
  • Electronic devices such as terminal devices, computer systems, and servers can be described in the general context of computer-system-executable instructions (such as program modules) executed by the computer system.
  • Generally, program modules can include routines, programs, object programs, components, logic, data structures, and so on, which perform specific tasks or implement specific abstract data types.
  • The computer system/server can be implemented in a distributed cloud computing environment, in which tasks are executed by remote processing devices linked through a communication network.
  • In a distributed cloud computing environment, program modules may be located on storage media of local or remote computing systems, including storage devices.
  • FIG. 1 is a flowchart of an embodiment of the neural network training method of the present disclosure. As shown in FIG. 1, the method of this embodiment includes steps S100 and S110. Each step in FIG. 1 is described in detail below.
  • After being successfully trained, the neural network for open and closed eye detection to be trained can be used to detect the eye open and closed state of an image to be processed and to output the eye open and closed state detection result of that image.
  • For example, for an image to be processed, the neural network outputs two probability values. One represents the probability that the eyes of the target object in the image to be processed are in the open state; the larger this value, the closer the eyes are to the open state. The other represents the probability that the eyes of the target object in the image to be processed are in the closed state; the larger this value, the closer the eyes are to the closed state.
  • The sum of the two probability values can be 1.
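One common way to obtain two probabilities that sum to 1 is a two-class softmax over raw scores. The text does not state how the probabilities are produced, so the softmax and the names `open_score` and `closed_score` below are assumptions used only to make the output format concrete.

```python
import math


def softmax2(open_score, closed_score):
    """Map two raw scores to (p_open, p_closed) with p_open + p_closed = 1."""
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(open_score, closed_score)
    e_open = math.exp(open_score - m)
    e_closed = math.exp(closed_score - m)
    total = e_open + e_closed
    return e_open / total, e_closed / total


# Hypothetical raw scores from the last layer of the network.
p_open, p_closed = softmax2(2.0, -1.0)

# The larger probability decides the reported state.
state = "open" if p_open >= p_closed else "closed"
```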
  • the neural network in the present disclosure may be a convolutional neural network.
  • The neural network in the present disclosure may include but is not limited to: a convolutional layer, a ReLU (Rectified Linear Unit) layer (also called an activation layer), a pooling layer, a fully connected layer, and a layer for classification (such as binary classification).
  • the present disclosure does not limit the specific structure of the neural network.
  • Each open and closed eye detection training task is part of the overall training task that enables the neural network to detect open and closed eyes.
  • The training targets corresponding to different open and closed eye detection training tasks are not exactly the same. That is, the present disclosure can divide the overall training task of the neural network into multiple training tasks, each aimed at one type of training target, with different training tasks corresponding to different training targets.
  • The at least two open and closed eye detection training tasks in the present disclosure may include at least two of the following tasks: the open and closed eye detection task where the eyes have an attachment, the task where the eyes have no attachment, the task in an indoor environment, the task in an outdoor environment, the task where the eyes have an attachment and there is a light spot on the attachment, and the task where the eyes have an attachment and there is no light spot on the attachment.
  • the above-mentioned attachments may be glasses or transparent plastic sheets.
  • the aforementioned light spot may be a light spot formed on the attachment due to reflection of light from the attachment.
  • The glasses in the present disclosure generally refer to glasses through whose lenses the eyes of the wearer can be seen.
  • The open and closed eye detection task where the eyes have an attachment may be the open and closed eye detection task with glasses.
  • The task with glasses may cover at least one of: open and closed eye detection with glasses indoors, and open and closed eye detection with glasses outdoors.
  • The open and closed eye detection task where the eyes have no attachment may be the open and closed eye detection task without glasses.
  • The task without glasses may cover at least one of: open and closed eye detection without glasses indoors, and open and closed eye detection without glasses outdoors.
  • The open and closed eye detection task in an indoor environment may cover at least one of: indoor detection without glasses, indoor detection with glasses where the glasses are reflective, and indoor detection with glasses where the glasses are not reflective.
  • The open and closed eye detection task in an outdoor environment may cover at least one of: outdoor detection without glasses, outdoor detection with glasses where the glasses are reflective, and outdoor detection with glasses where the glasses are not reflective.
  • The open and closed eye detection task where the eyes have an attachment and there is a light spot on the attachment may be the open and closed eye detection task with glasses where the glasses are reflective.
  • The task with reflective glasses may cover at least one of: indoor detection with reflective glasses, and outdoor detection with reflective glasses.
  • The open and closed eye detection task where the eyes have an attachment and there is no light spot on the attachment may be the open and closed eye detection task with glasses where the glasses are not reflective.
  • The task with non-reflective glasses may cover at least one of: indoor detection with non-reflective glasses, and outdoor detection with non-reflective glasses.
  • The six open and closed eye detection training tasks may therefore overlap. For example, the open and closed eye detection task with glasses overlaps both the open and closed eye detection task in an indoor environment and the open and closed eye detection task in an outdoor environment.
  • The remaining overlaps among the six open and closed eye detection training tasks mentioned above are not explained one by one here.
  • The present disclosure does not limit the number of open and closed eye detection training tasks, which can be determined according to actual needs, nor does it limit the specific form of any open and closed eye detection training task.
  • The at least two open and closed eye detection training tasks in the present disclosure may include the following three open and closed eye detection training tasks:
  • Open and closed eye detection training task a: the open and closed eye detection task in an indoor environment;
  • Open and closed eye detection training task b: the open and closed eye detection task in an outdoor environment;
  • Open and closed eye detection training task c: the open and closed eye detection task where the eyes have an attachment and there is a light spot on the attachment.
  • Each of the at least two open and closed eye detection training tasks in the present disclosure corresponds to an image set. For example, open and closed eye detection training tasks a, b, and c in the figure each correspond to an image set.
  • Each image set usually includes multiple eye images.
  • the eye images contained in different image sets are at least partially different. That is, for any image set, at least part of the eye images in the image set will not appear in other image sets.
  • the eye images contained in different image sets may have an intersection.
  • The image sets corresponding to the six open and closed eye detection training tasks mentioned above can be, respectively: an eye image set where the eyes have an attachment, an eye image set where the eyes have no attachment, an eye image set collected in an indoor environment, an eye image set collected in an outdoor environment, an eye image set where the eyes have an attachment and there is a light spot on the attachment, and an eye image set where the eyes have an attachment and there is no light spot on the attachment.
  • All eye images in the eye image set where the eyes have an attachment may be eye images with glasses.
  • For example, this eye image set may include eye images with glasses collected in an indoor environment and eye images with glasses collected in an outdoor environment.
  • All eye images in the eye image set where the eyes have no attachment may be eye images without glasses.
  • For example, this eye image set may include eye images without glasses collected in an indoor environment and eye images without glasses collected in an outdoor environment.
  • The eye image set collected in an indoor environment may include eye images without glasses collected in an indoor environment and eye images with glasses collected in an indoor environment.
  • The eye image set collected in an outdoor environment may include eye images without glasses collected in an outdoor environment and eye images with glasses collected in an outdoor environment.
  • All eye images in the eye image set where the eyes have an attachment and there is a light spot on the attachment may be eye images with glasses and a light spot on the glasses.
  • For example, this eye image set may include eye images with glasses and a light spot on the glasses collected in an indoor environment, and eye images with glasses and a light spot on the glasses collected in an outdoor environment.
  • All eye images in the eye image set where the eyes have an attachment and there is no light spot on the attachment may be eye images with glasses and no light spot on the glasses.
  • For example, this eye image set may include eye images with glasses and no light spot on the glasses collected in an indoor environment, and eye images with glasses and no light spot on the glasses collected in an outdoor environment.
  • the image set included in the present disclosure is determined by the open and closed eye detection training task included in the present disclosure. For example, if the present disclosure includes at least two of the above-mentioned six open and closed eye detection training tasks, the present disclosure includes respective eye image sets corresponding to the at least two open and closed eye detection training tasks.
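The relationship between training tasks and image sets can be sketched by tagging each eye image with a few attributes and deriving one set per task. The record fields (`glasses`, `indoor`, `glare`) and the task names are hypothetical; the sketch only shows that the resulting sets are partially different yet may share images.

```python
# Each record describes one eye image; the field names are hypothetical.
images = [
    {"id": "img0", "glasses": False, "indoor": True, "glare": False},
    {"id": "img1", "glasses": True, "indoor": True, "glare": True},
    {"id": "img2", "glasses": True, "indoor": False, "glare": False},
    {"id": "img3", "glasses": False, "indoor": False, "glare": False},
]

# One membership rule per open and closed eye detection training task.
TASKS = {
    "with_attachment": lambda im: im["glasses"],
    "without_attachment": lambda im: not im["glasses"],
    "indoor": lambda im: im["indoor"],
    "outdoor": lambda im: not im["indoor"],
    "attachment_with_spot": lambda im: im["glasses"] and im["glare"],
    "attachment_without_spot": lambda im: im["glasses"] and not im["glare"],
}

# Build the image set (as a set of image ids) for every task.
image_sets = {
    name: {im["id"] for im in images if keep(im)}
    for name, keep in TASKS.items()
}

# Images shared by the "with attachment" set and the "indoor" set.
shared = image_sets["with_attachment"] & image_sets["indoor"]
```

Here `img1` belongs to three sets at once (with attachment, indoor, attachment with light spot), while the sets as a whole still differ from one another.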
  • the eye image used in the neural network training process of the present disclosure may also be called an eye image sample, and the image content of the eye image sample usually includes eyes.
  • the eye image sample in the present disclosure is usually a monocular-based eye image sample, that is, the image content of the eye image sample does not include two eyes, but includes one eye.
  • the eye image sample may be an eye image sample based on a single side eye, for example, an eye image sample based on the left eye.
  • However, the present disclosure does not exclude the case where the eye image sample is based on both eyes or on the eye of either side.
  • The eye image in the present disclosure may generally be an eye image block cut out from an image containing eyes captured by a camera device.
  • For example, the process of forming an eye image in the present disclosure may include: performing eye detection on the image captured by the camera device to determine the eye part in the image, and then segmenting the detected eye part from the image. Optionally, the segmented image block can be further processed by zooming and/or image content mapping (such as converting a right-eye image block into a left-eye image block through image content mapping), thereby forming an eye image for training the neural network for open and closed eye detection.
  • Of course, the present disclosure does not rule out using the complete eye-containing image captured by the camera device as the eye image.
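The crop, mirror, and zoom steps described above can be sketched on a tiny grayscale image stored as nested lists. The eye bounding box is assumed to come from a separate eye detector that is not shown, horizontal flipping is assumed as the "image content mapping" from right eye to left eye, and nearest-neighbour resizing stands in for whatever zooming a real pipeline uses.

```python
def crop(image, top, left, height, width):
    """Cut the block of `height` x `width` pixels starting at (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]


def mirror(block):
    """Flip a block horizontally, e.g. to map a right eye onto a left eye."""
    return [row[::-1] for row in block]


def resize_nearest(block, out_h, out_w):
    """Resize a block to out_h x out_w with nearest-neighbour sampling."""
    in_h, in_w = len(block), len(block[0])
    return [
        [block[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


# A 4 x 6 dummy frame whose pixel value encodes its position (10 * row + col).
frame = [[10 * r + c for c in range(6)] for r in range(4)]

# Hypothetical box reported by an eye detector for the right eye.
right_eye = crop(frame, top=1, left=2, height=2, width=3)
as_left_eye = mirror(right_eye)
sample = resize_nearest(as_left_eye, 4, 6)
```

Mapping every sample to the same side lets a single monocular network handle both eyes, which matches the left-eye example given earlier in the text.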
  • the eye image in the present disclosure may be the eye image in the corresponding training sample set.
  • the eye image used for training the neural network for detecting open and closed eyes in the present disclosure usually has annotation information, and the annotation information may indicate the open and closed state of the eyes in the eye image.
  • the annotation information can indicate whether the eyes in the eye image are in an open state or a closed state.
  • For example, annotation information of 1 indicates that the eyes in the eye image are in the open state, and annotation information of 0 indicates that the eyes in the eye image are in the closed state.
  • The present disclosure usually obtains a corresponding number of eye images from each of the eye image sets corresponding to the different training tasks.
  • For example, a corresponding number of eye images are obtained from the image set corresponding to open and closed eye detection training task a, from the image set corresponding to training task b, and from the image set corresponding to training task c, and all of them are provided to the neural network for open and closed eye detection to be trained.
  • The present disclosure may obtain a corresponding number of eye images from the eye image set corresponding to each training task according to a preset ratio of image numbers among the different training tasks. In addition, the preset batch size is usually taken into account when obtaining the eye images.
  • For example, the present disclosure can obtain 200 eye images from the eye image set corresponding to open and closed eye detection training task a, 200 eye images from the eye image set corresponding to training task b, and 200 eye images from the eye image set corresponding to training task c.
  • Where needed, a corresponding number of additional eye images can be obtained from the eye image sets of the other open and closed eye detection training tasks so that the batch size is still reached.
  • For example, 250 eye images can be obtained from the eye image set corresponding to training task a, 250 eye images from the eye image set corresponding to training task b, and 100 eye images from the eye image set corresponding to training task c, so that a total of 600 eye images is obtained. This increases the flexibility of obtaining eye images.
  • the present disclosure may also adopt a method of randomly setting the number to obtain a corresponding number of eye images from the eye image sets corresponding to different training tasks.
  • the present disclosure does not limit the specific implementation of obtaining a corresponding number of eye images from eye image sets corresponding to different training tasks.
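A minimal sketch of composing one training batch from the per-task image sets follows. The per-task counts (200/200/200 and 250/250/100, both totalling 600) are the ones quoted in the text; the function name `draw_batch`, the set sizes, and the image identifiers are hypothetical.

```python
import random


def draw_batch(image_sets, counts, rng):
    """Draw counts[task] distinct images from each task's image set.

    Returns a shuffled list of (task, image_id) pairs so that the task of
    every image is still known when the per-task losses are computed.
    """
    batch = []
    for task, n in counts.items():
        for image_id in rng.sample(image_sets[task], n):
            batch.append((task, image_id))
    rng.shuffle(batch)
    return batch


# Dummy image sets for training tasks a, b, and c.
image_sets = {
    "a": [f"a_{i}" for i in range(1000)],
    "b": [f"b_{i}" for i in range(1000)],
    "c": [f"c_{i}" for i in range(1000)],
}

rng = random.Random(0)  # fixed seed only to make the sketch repeatable
equal_batch = draw_batch(image_sets, {"a": 200, "b": 200, "c": 200}, rng)
uneven_batch = draw_batch(image_sets, {"a": 250, "b": 250, "c": 100}, rng)
```

Because the counts are fixed per task, every batch is guaranteed to contain images from every task, which is the property that pooled random sampling cannot provide.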
  • The present disclosure may sequentially provide the acquired eye images to the neural network for open and closed eye detection to be trained, which performs eye open and closed state detection processing on each input eye image and sequentially outputs the eye open and closed state detection result of each eye image.
  • an eye image input to the neural network for open and closed eyes detection to be trained is processed by the convolutional layer, the fully connected layer, and the layer for classification.
  • the neural network is used to output two probability values, the ranges of the two probability values are both 0 to 1, and the sum of the two probability values is 1.
  • One of the probability values corresponds to the open state. The closer the probability value is to 1, the closer the eyes in the eye image are to the open state.
  • the other probability value corresponds to the closed state, and the closer the probability value is to 1, the closer the eyes in the eye image are to the closed state.
  • The present disclosure determines the loss corresponding to each open and closed eye detection training task, determines a comprehensive loss from the losses of the individual training tasks, and uses the comprehensive loss to adjust the network parameters of the neural network.
  • The network parameters in the present disclosure may include but are not limited to convolution kernel parameters and/or matrix weights. The present disclosure does not limit the specific content of the network parameters.
  • the present disclosure may output the largest of the eye open and closed state detection results respectively output by the neural network for multiple eye images in the image set corresponding to the training task.
  • the angle between the probability value and the interface corresponding to the annotation information of the corresponding eye image in the image set is used to determine the loss corresponding to the training task.
  • the present disclosure may use the A-softmax (a normalized exponential with an angle term) loss function to determine the loss corresponding to each of the different open and closed eye detection training tasks based on the eye open and closed annotation information of the eye images and the eye open and closed state detection results output by the neural network, determine the comprehensive loss (such as the sum of the losses) according to the losses corresponding to the different open and closed eye detection training tasks, and use the stochastic gradient descent method to adjust the network parameters of the neural network.
  • the present disclosure can use the A-softmax loss function to calculate the loss of each open and closed eye detection training task and perform back propagation based on the sum of the losses of all open and closed eye detection training tasks, so that the network parameters of the to-be-trained neural network for open and closed eye detection are updated by gradient descent on the loss.
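  • The summation of per-task losses and the gradient descent update can be sketched as follows. This is only an illustration: an ordinary cross-entropy loss is swapped in for the A-softmax loss to keep the example short, and the task names, predictions, labels, and learning rate are hypothetical.

```python
import math

def cross_entropy(p_open, label):
    """Per-image loss; label is 1 for open, 0 for closed (stand-in for A-softmax)."""
    p_true = p_open if label == 1 else 1.0 - p_open
    return -math.log(max(p_true, 1e-12))

def task_loss(predictions, labels):
    """Mean loss over the eye images that belong to one training task."""
    losses = [cross_entropy(p, y) for p, y in zip(predictions, labels)]
    return sum(losses) / len(losses)

def comprehensive_loss(per_task_batches):
    """Sum of the per-task losses; maps task name -> (predictions, labels)."""
    return sum(task_loss(p, y) for p, y in per_task_batches.values())

def sgd_step(params, grads, learning_rate=0.01):
    """One plain gradient descent update of a flat parameter list."""
    return [w - learning_rate * g for w, g in zip(params, grads)]

# Hypothetical predicted open-eye probabilities and labels for two tasks.
batches = {
    "with_attachment": ([0.9, 0.2], [1, 0]),
    "indoor": ([0.6, 0.4], [1, 0]),
}
total = comprehensive_loss(batches)
```

  • Because the comprehensive loss is a plain sum, its gradient is the sum of the per-task gradients, so every task contributes to each parameter update.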
  • all eye images provided to the neural network for each iteration of training can form a subset of eye images.
  • the eye image subset includes eye images corresponding to each training task.
  • the neural network can learn the open and closed eye detection ability for each training task during the training process, taking into account the ability learning of the different training tasks, so that the trained neural network can simultaneously improve the accuracy of open and closed eye detection for eye images of each of the multiple scenes corresponding to the multiple training tasks. This helps improve the universality and generalization of the technical solution for accurately detecting open and closed eyes in different scenes based on the neural network, and better meets the actual application requirements of multiple scenes.
  • the A-softmax loss function in the present disclosure can be represented by the following formula (1):
  • L_ang represents the loss corresponding to a training task
  • N represents the number of eye images of the training task
  • ‖*‖ represents the modulus of *
  • x_i represents the i-th eye image corresponding to the training task
  • y_i represents the label value of the i-th eye image corresponding to the training task
  • m is a constant, and the minimum value of m is usually not less than a predetermined value, for example, the minimum value of m is not less than
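  • The image of formula (1) did not survive in this text. For orientation only, the A-softmax loss as it is commonly written in the face recognition literature, which is consistent with the symbols defined above, is given below. The symbols θ, ψ, and k are not defined in this text and are assumptions taken from that common form: θ_{j,i} is the angle between the feature of x_i and the weight vector of class j.

```latex
L_{ang} = \frac{1}{N} \sum_{i=1}^{N} -\log\left(
  \frac{e^{\|x_i\|\,\psi(\theta_{y_i,i})}}
       {e^{\|x_i\|\,\psi(\theta_{y_i,i})} + \sum_{j \neq y_i} e^{\|x_i\|\cos(\theta_{j,i})}}
\right),
\qquad
\psi(\theta) = (-1)^k \cos(m\theta) - 2k,
\quad \theta \in \left[\tfrac{k\pi}{m}, \tfrac{(k+1)\pi}{m}\right],\; k \in \{0, \dots, m-1\}
```

  • In this form a larger m demands a larger angular margin between the two classes, which matches the role of m as a constant with a lower bound described above.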
  • this training process ends when the training of the neural network for detecting open and closed eyes to be trained reaches a predetermined iterative condition.
  • the predetermined iterative conditions in the present disclosure may include: the difference between the eye open and closed state detection result output for an eye image by the to-be-trained neural network for open and closed eye detection and the annotation information of that eye image meets a predetermined difference requirement. When the difference meets the predetermined difference requirement, this training of the neural network is successfully completed.
  • the predetermined iterative conditions in the present disclosure may also include: the number of eye images used to train the to-be-trained neural network for open and closed eye detection reaches a predetermined number requirement, and so on. When the number of eye images used reaches the predetermined number requirement but the difference does not meet the predetermined difference requirement, this training of the neural network is not successful.
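  • The two iterative conditions above can be sketched as a single check. This is a sketch under assumptions: the function name, the use of a mean difference, and the threshold values are hypothetical, since the disclosure does not fix how the difference is measured.

```python
def training_outcome(mean_difference, images_used, max_difference, image_budget):
    """Return 'success', 'failure', or 'continue' for one check of the stop condition."""
    # Success: the outputs are close enough to the annotation information.
    if mean_difference <= max_difference:
        return "success"
    # Failure: the image budget is exhausted while the difference is still too large.
    if images_used >= image_budget:
        return "failure"
    return "continue"

# Hypothetical check after 6,000 images with a budget of 100,000 images.
status = training_outcome(0.30, 6_000, max_difference=0.05, image_budget=100_000)
```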
  • the neural network that has been successfully trained can be used for the detection and processing of the eye open and closed state.
  • the present disclosure forms a comprehensive loss based on the losses of the different training tasks and uses the comprehensive loss to adjust the network parameters of the neural network for open and closed eye detection, so that the neural network can learn the open and closed eye detection ability for each training task during the training process while taking into account the ability learning of the different training tasks. The trained neural network can therefore simultaneously improve the accuracy of open and closed eye detection for eye images of each of the multiple scenes corresponding to the multiple training tasks, which helps improve the universality and generalization of the technical solution for accurately detecting open and closed eyes in different scenes based on the neural network, and better meets the actual application requirements of multiple scenes.
  • FIG. 3 is a flowchart of an embodiment of the method for detecting the open and closed state of the eyes of the present disclosure.
  • the method of this embodiment includes steps: S300 and S310. Each step in Figure 3 will be described in detail below.
  • the image to be processed in the present disclosure may be a static image such as a picture or a photo, or may be a video frame in a dynamic video, for example, a video frame in a video captured by a camera mounted on a moving object, or a video frame in a video captured by a camera mounted at a fixed position.
  • the above-mentioned moving objects may be vehicles, robots, or robotic arms.
  • the above-mentioned fixed position can be a desktop or a wall.
  • the present disclosure does not limit the specific manifestations of moving objects and fixed positions.
  • the present disclosure may detect the location area of the eyes in the image to be processed. For example, face detection or face key point detection may be used to determine the circumscribed frame (bounding box) of the eye in the image to be processed.
  • the present disclosure can then segment the image of the eye area from the image to be processed according to the circumscribed frame of the eye, and the segmented eye image block is provided to the neural network.
  • the segmented eye image blocks can also be provided to the neural network after certain preprocessing.
  • for example, the segmented eye image block is scaled so that the size of the scaled eye image block meets the size requirement of the neural network for the input image.
  • for another example, after the eye image blocks of the two eyes of the target object are segmented, the eye image block on a predetermined side is subjected to mapping processing, thereby forming two eye image blocks on the same side of the target object.
  • the two eye image blocks on the same side can also be scaled.
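  • The segmentation and preprocessing steps above can be sketched on a plain 2-D list of pixel values. The sketch assumes that the "mapping processing" is a horizontal mirror, which is an interpretation rather than something the text states; the helper names, the box layout, and the nearest-neighbour scaling are likewise hypothetical choices.

```python
def crop(image, box):
    """Cut box = (top, left, bottom, right), with exclusive ends, from a 2-D list."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]

def mirror(block):
    """Flip an eye image block horizontally (assumed meaning of 'mapping')."""
    return [row[::-1] for row in block]

def resize_nearest(block, out_h, out_w):
    """Scale a block to out_h x out_w with nearest-neighbour sampling."""
    in_h, in_w = len(block), len(block[0])
    return [
        [block[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]

# Hypothetical 4x6 image whose pixel value encodes its row and column.
image = [[r * 10 + c for c in range(6)] for r in range(4)]
eye_block = crop(image, (1, 2, 3, 5))        # circumscribed frame of one eye
same_side = mirror(eye_block)                # map it onto the other side
network_input = resize_nearest(same_side, 4, 6)
```

  • Mapping both eyes onto the same side lets one network see eye image blocks in a single orientation, which is one plausible reason for this step.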
  • S310 Perform an eye open/close state detection process on the above-mentioned image to be processed via a neural network, and output an eye open/close state detection result.
  • the neural network in the present disclosure is obtained through successful training using the implementation of the neural network training method in the present disclosure.
  • for an input eye image block, the eye open and closed state detection result output by the neural network in the present disclosure may be at least one probability value, for example, a probability value indicating that the eye is in the open state and a probability value indicating that the eye is in the closed state.
  • the value range of both probability values may be 0 to 1, and the sum of the two probability values for the same eye image block is 1. The closer the probability value of the open state is to 1, the closer the eyes in the eye image block are to the open state. The closer the probability value of the closed state is to 1, the closer the eyes in the eye image block are to the closed state.
  • the present disclosure can make a further judgment based on the time-ordered eye open and closed state detection results output by the neural network, so as to determine the eye movements of the target object in the multiple time-ordered images to be processed, for example, fast blinking, opening one eye while closing the other, or squinting.
  • the present disclosure can determine the facial expression of the target object in the multiple time-ordered images to be processed, for example, smiling, laughing, crying, or sadness, based on the time-ordered eye open and closed state detection results output by the neural network and the states of other organs of the target object's face.
  • the present disclosure can make a further judgment based on the time-ordered eye open and closed state detection results output by the neural network, so as to determine the fatigue state of the target object in the multiple time-ordered images to be processed, for example, mild fatigue, dozing off, or being asleep.
  • the present disclosure can make a further judgment based on the time-ordered eye open and closed state detection results output by the neural network, so as to determine the eye movements of the target object in the multiple time-ordered images to be processed, and can then determine, at least according to the eye movements, the interactive control information expressed by the target object in those images.
  • the eye movements, facial expressions, fatigue states, and interactive control information determined by the present disclosure can be used by various applications. For example, predetermined eye movements and/or facial expressions of the target object can trigger predetermined special effects during live broadcasting or rebroadcasting, or realize corresponding human-computer interaction, which facilitates rich applications. For another example, in intelligent driving technology, detecting the driver's fatigue state in real time helps prevent fatigue driving.
  • the present disclosure does not limit the specific application of the eye open and closed state detection results output by the neural network.
  • FIG. 4 is a flowchart of an embodiment of the intelligent driving control method of the present disclosure.
  • the intelligent driving control method of the present disclosure can be applied in an automatic driving environment and also in a cruise driving environment.
  • the present disclosure does not limit the applicable environment of the intelligent driving control method.
  • the method of this embodiment includes steps: S400, S410, S420, and S430.
  • the steps in Figure 4 will be described in detail below.
  • S400 Acquire an image to be processed collected by a camera device provided on the vehicle.
  • For the specific implementation manner of this step, reference may be made to the description of S300 in FIG. 3 in the foregoing method implementation, which is not described in detail here.
  • S410 Perform an eye open/close state detection process on the above-mentioned image to be processed via a neural network, and output an eye open/close state detection result.
  • the neural network in this embodiment is obtained through successful training using the implementation of the neural network training method described above.
  • For the specific implementation manner of this step reference may be made to the description of S310 in FIG. 3 in the foregoing method implementation, which is not described in detail here.
  • S420 Determine the fatigue state of the target object at least according to the eye open and closed state detection results that belong to the same target object in multiple images to be processed with a time sequence relationship.
  • the target object in the present disclosure is usually the driver of the vehicle.
  • the present disclosure can determine index parameters such as the number of blinks per unit time, the duration of a single eye closure, or the duration of a single eye opening of the target object (such as a driver) based on multiple eye open and closed state detection results that belong to the same target object and have a time sequence relationship. By further judging the corresponding index parameters against predetermined index requirements, it can be determined whether the target object (such as the driver) is in a state of fatigue.
  • the fatigue state in the present disclosure may include various fatigue states of different degrees, for example, a mild fatigue state, a moderate fatigue state, or a severe fatigue state. The present disclosure does not limit the specific implementation of determining the fatigue state of the target object.
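  • One way the index parameters above could be derived from a time-ordered sequence of closed-eye probabilities is sketched below. The disclosure leaves the concrete rule open, so the 0.5 decision threshold, the duration thresholds, and the function names are all hypothetical.

```python
def closed_flags(closed_probs, threshold=0.5):
    """Mark each frame as closed when its closed-eye probability reaches the threshold."""
    return [p >= threshold for p in closed_probs]

def closure_runs(flags):
    """Lengths, in frames, of each consecutive run of closed-eye frames."""
    runs, current = [], 0
    for closed in flags:
        if closed:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs

def fatigue_level(closed_probs, fps, mild_s=0.5, severe_s=2.0):
    """Grade fatigue by the longest single eye closure, in seconds (assumed rule)."""
    runs = closure_runs(closed_flags(closed_probs))
    longest_s = max(runs, default=0) / fps
    if longest_s >= severe_s:
        return "severe"
    if longest_s >= mild_s:
        return "mild"
    return "none"

# Hypothetical closed-eye probabilities for six consecutive frames at 10 frames/s.
probs = [0.1, 0.9, 0.9, 0.1, 0.8, 0.2]
blink_count = len(closure_runs(closed_flags(probs)))
level = fatigue_level(probs, fps=10)
```

  • The number of closure runs gives a blink count for the window, and the run lengths divided by the frame rate give the single-closure durations.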
  • S430 Form a corresponding instruction according to the fatigue state of the target object, and output the instruction.
  • the instructions generated by the present disclosure may include at least one of: an instruction to switch to the smart driving state, an instruction to give a voice alert about fatigue driving, an instruction to wake the driver by vibration, and an instruction to report dangerous driving information. The present disclosure does not limit the specific manifestation of the instruction.
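  • Step S430 can be pictured as a lookup from fatigue state to instructions. The pairing below is purely illustrative: the disclosure lists the instruction types but does not say which fatigue degree triggers which instruction, and the identifiers are hypothetical.

```python
# Hypothetical pairing of fatigue degrees with the instruction types listed above.
INSTRUCTIONS_BY_FATIGUE = {
    "mild": ["voice_alert_fatigue_driving"],
    "moderate": ["voice_alert_fatigue_driving", "vibration_wake_up_driver"],
    "severe": [
        "vibration_wake_up_driver",
        "switch_to_smart_driving_state",
        "report_dangerous_driving_information",
    ],
}

def instructions_for(fatigue_state):
    """Return the instructions to output; no instruction for an unknown or absent state."""
    return list(INSTRUCTIONS_BY_FATIGUE.get(fatigue_state, []))

issued = instructions_for("moderate")
```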
  • since a neural network successfully trained by the neural network training method of the present disclosure helps improve the accuracy of the eye open and closed state detection results, using the eye open and closed state detection results output by the neural network to judge the fatigue state helps improve the accuracy of fatigue state detection. Forming corresponding instructions according to the detected fatigue state therefore helps avoid fatigue driving and improves driving safety.
  • FIG. 5 is a schematic structural diagram of an embodiment of the neural network training device of the present disclosure.
  • the neural network training device as shown in FIG. 5 includes: a neural network 500 for detecting open and closed eyes to be trained and an adjustment module 510.
  • the device may further include: an input module 520.
  • the to-be-trained neural network 500 for open and closed eye detection is used to perform eye open and closed state detection processing on multiple eye images in the image sets corresponding to at least two open and closed eye detection training tasks, and to output eye open and closed state detection results.
  • the eye images contained in different image sets are at least partially different.
  • after being successfully trained, the to-be-trained neural network 500 for open and closed eye detection of the present disclosure can be used to detect the eye open and closed state of an image to be processed and output the eye open and closed state detection result of the image to be processed.
  • the neural network 500 outputs two probability values. One probability value represents the probability that the eyes of the target object in the image to be processed are in the open state; the larger this value, the closer the eyes are to the open state. The other probability value represents the probability that the eyes of the target object in the image to be processed are in the closed state; the larger this value, the closer the eyes are to the closed state.
  • the sum of the two probability values can be 1.
  • the neural network 500 in the present disclosure may be a convolutional neural network.
  • the neural network 500 in the present disclosure may include, but is not limited to: a convolutional layer, a ReLU layer (also referred to as an activation layer), a pooling layer, a fully connected layer, and a layer for classification (such as binary classification).
  • each open and closed eye detection training task should belong to the total training task of enabling the neural network to detect the eye open and closed state.
  • the training targets corresponding to different open and closed eye detection training tasks are not exactly the same. That is to say, the present disclosure can divide the total training task of the neural network 500 into multiple training tasks, each training task is aimed at one type of training target, and different training tasks correspond to different training targets.
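  • the at least two open and closed eye detection training tasks in the present disclosure may include at least two of the following tasks: the open and closed eye detection task when the eye has an attachment, the open and closed eye detection task when the eye has no attachment, the open and closed eye detection task in an indoor environment, the open and closed eye detection task in an outdoor environment, the open and closed eye detection task when the eye has an attachment and there is a light spot on the attachment, and the open and closed eye detection task when the eye has an attachment and there is no light spot on the attachment.
  • the aforementioned attachment may be glasses or a transparent plastic sheet.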
  • the aforementioned light spot may be a light spot formed on the attachment due to reflection of light from the attachment.
  • At least two open and closed eye detection training tasks in the present disclosure each correspond to an image set.
  • Each image set usually includes multiple eye images.
  • the eye images contained in different image sets are at least partially different. That is, for any image set, at least part of the eye images in the image set will not appear in other image sets.
  • the eye images contained in different image sets may have an intersection.
  • the image sets corresponding to the six open and closed eye detection training tasks mentioned above can be, respectively: an eye image set in which the eyes have attachments, an eye image set in which the eyes have no attachments, an eye image set collected in an indoor environment, an eye image set collected in an outdoor environment, an eye image set in which the eyes have attachments and there are light spots on the attachments, and an eye image set in which the eyes have attachments and there are no light spots on the attachments.
  • the image set included in the present disclosure is determined by the open and closed eye detection training task included in the present disclosure. For example, if the present disclosure includes at least two of the above-mentioned six open and closed eye detection training tasks, the present disclosure includes respective eye image sets corresponding to the at least two open and closed eye detection training tasks.
  • the eye image in the present disclosure may generally be: an eye image block cut out from an image containing the eye captured by the camera.
  • for the formation process of the eye image in the present disclosure, reference may be made to the description in the foregoing method embodiment, which is not described in detail here.
  • the eye image used for training the neural network 500 for detecting open and closed eyes in the present disclosure usually has annotation information, and the annotation information may indicate the open and closed state of the eyes in the eye image.
  • the annotation information in the present disclosure may also indicate that the eyes in the eye image are in an uncertain open and closed state.
  • however, the eye images used for training the neural network 500 in the present disclosure generally do not include eye images whose annotation information indicates an uncertain open and closed state. This helps avoid the influence of such eye images on the neural network 500 and helps improve the detection accuracy of the neural network 500 for open and closed eye detection.
  • the input module 520 is used to obtain a corresponding number of eye images from the different image sets and provide them to the to-be-trained neural network 500 for open and closed eye detection. For example, the input module 520 obtains a corresponding number of eye images from the image sets of the different open and closed eye detection training tasks according to image quantity ratios preset for those tasks, and provides them to the to-be-trained neural network 500 for open and closed eye detection. In addition, the input module 520 usually takes the preset batch size into account when acquiring the eye images.
  • the input module 520 can obtain 200 eye images from the eye image set corresponding to open and closed eye detection training task a, 200 eye images from the eye image set corresponding to open and closed eye detection training task b, and 200 eye images from the eye image set corresponding to open and closed eye detection training task c.
  • the input module 520 may obtain a corresponding number of eye images from the eye image sets corresponding to other open and closed eye detection training tasks so that the batch size is reached.
  • the input module 520 can obtain 250 eye images from the eye image set corresponding to open and closed eye detection training task a, 250 eye images from the eye image set corresponding to open and closed eye detection training task b, and 100 eye images from the eye image set corresponding to open and closed eye detection training task c, so that the input module 520 obtains 600 eye images in total.
  • the input module 520 may also adopt a manner of randomly setting a number to obtain a corresponding number of eye images from respective eye image sets corresponding to different training tasks.
  • the present disclosure does not limit the specific implementation manner in which the input module 520 obtains a corresponding number of eye images from eye image sets corresponding to different training tasks.
  • when acquiring eye images, the input module 520 should avoid eye images whose annotation information indicates an uncertain open and closed state, which helps improve the detection accuracy of the neural network for open and closed eye detection.
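  • The ratio-based batch assembly described above can be sketched as follows. The ratios 1:1:1 and 2.5:2.5:1 are inferred from the 200/200/200 and 250/250/100 examples and are assumptions, as are the function names and the dictionary layout of an eye image record.

```python
import random

def per_task_counts(ratios, batch_size):
    """Split a batch among tasks by ratio; rounding may drift for uneven ratios."""
    total = sum(ratios.values())
    return {task: round(batch_size * r / total) for task, r in ratios.items()}

def sample_batch(image_sets, ratios, batch_size, rng=random):
    """Draw each task's share from its image set, skipping 'uncertain' annotations."""
    batch = []
    for task, count in per_task_counts(ratios, batch_size).items():
        usable = [img for img in image_sets[task] if img["label"] != "uncertain"]
        # rng.sample raises ValueError if a set has fewer usable images than needed.
        batch.extend(rng.sample(usable, count))
    return batch

# Hypothetical image sets: 300 images per task, every tenth one annotated uncertain.
image_sets = {
    task: [
        {"id": f"{task}{i}", "label": "uncertain" if i % 10 == 0 else "open"}
        for i in range(300)
    ]
    for task in "abc"
}
batch = sample_batch(image_sets, {"a": 2.5, "b": 2.5, "c": 1}, 600, random.Random(0))
```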
  • the input module 520 may sequentially provide the acquired eye images to the to-be-trained neural network 500 for open and closed eye detection, and the neural network 500 performs eye open and closed state detection processing on each input eye image separately, so that it sequentially outputs the eye open and closed state detection result of each eye image.
  • after an eye image input to the to-be-trained neural network 500 for open and closed eye detection is processed by the convolutional layer, the fully connected layer, and the layer for classification, the neural network 500 outputs two probability values; each probability value ranges from 0 to 1, and the sum of the two probability values is 1.
  • One of the probability values corresponds to the open state. The closer the probability value is to 1, the closer the eyes in the eye image are to the open state. The other probability value corresponds to the closed state, and the closer the probability value is to 1, the closer the eyes in the eye image are to the closed state.
  • the adjustment module 510 is configured to determine the losses corresponding to the at least two open and closed eye detection training tasks according to the eye open and closed annotation information of the eye images and the eye open and closed state detection results output by the neural network 500, and to adjust the network parameters of the neural network 500 according to the losses corresponding to the at least two open and closed eye detection training tasks.
  • the adjustment module 510 should determine the loss corresponding to each open and closed eye detection training task and determine the comprehensive loss according to the losses of all training tasks, and the adjustment module 510 uses the comprehensive loss to adjust the network parameters of the neural network 500. The network parameters may include but are not limited to: convolution kernel parameters and/or matrix weights. The present disclosure does not limit the specific content contained in the network parameters.
  • the adjustment module 510 may determine the loss corresponding to a training task according to the angle between the maximum probability value among the eye open and closed state detection results output by the neural network for the multiple eye images in the image set corresponding to the training task, and the interface (that is, the decision boundary) corresponding to the annotation information of the corresponding eye image in the image set.
  • the adjustment module 510 may use the A-softmax (a normalized exponential with an angle term) loss function to determine the loss corresponding to each of the different open and closed eye detection training tasks based on the eye open and closed annotation information of the eye images and the eye open and closed state detection results output by the neural network, and determine the comprehensive loss (such as the sum of the losses) according to the losses corresponding to the different open and closed eye detection training tasks. The adjustment module 510 then uses the stochastic gradient descent method to adjust the network parameters of the neural network.
  • the adjustment module 510 can use the A-softmax loss function to calculate the loss of each open and closed eye detection training task and perform back propagation based on the sum of the losses of all open and closed eye detection training tasks, so that the network parameters of the to-be-trained neural network 500 for open and closed eye detection are updated by gradient descent on the loss.
  • when the training of the to-be-trained neural network 500 for open and closed eye detection reaches a predetermined iterative condition, the adjustment module 510 may control the end of this training process.
  • the predetermined iterative condition in the present disclosure may include: the difference between the eye open and closed state detection result output for an eye image by the to-be-trained neural network 500 for open and closed eye detection and the annotation information of that eye image meets the predetermined difference requirement. When the difference meets the predetermined difference requirement, this training of the neural network 500 is successfully completed.
  • the predetermined iterative conditions used by the adjustment module 510 may also include: the number of eye images used to train the to-be-trained neural network for open and closed eye detection reaches a predetermined number requirement, and so on. When the number of eye images used reaches the predetermined number requirement but the difference does not meet the predetermined difference requirement, this training of the neural network 500 is not successful.
  • the neural network 500 that has been successfully trained can be used for the detection processing of the eye open and closed state.
  • Fig. 6 is a schematic structural diagram of an embodiment of an eye open-close state detection device of the present disclosure.
  • the device of this embodiment includes: an acquisition module 600 and a neural network 610.
  • the device for detecting the eye open and closed state may further include: a determining module 620.
  • the acquiring module 600 is used to acquire the image to be processed.
  • the image to be processed obtained by the acquisition module 600 may be a static image such as a picture or a photo, or may be a video frame in a dynamic video, for example, a video frame in a video captured by a camera mounted on a moving object, or a video frame in a video captured by a camera mounted at a fixed position.
  • the above-mentioned moving objects may be vehicles, robots, or robotic arms.
  • the above-mentioned fixed position can be a desktop or a wall.
  • the acquisition module 600 may detect the location area of the eyes in the image to be processed. For example, the acquisition module 600 may use face detection or face key point detection to determine the circumscribed frame (bounding box) of the eye in the image to be processed. The acquisition module 600 can then segment the image of the eye area from the image to be processed according to the circumscribed frame of the eye, and the segmented eye image block is provided to the neural network 610. Of course, the acquisition module 600 may also perform certain preprocessing on the segmented eye image blocks before providing them to the neural network 610.
  • for example, the acquisition module 600 scales the segmented eye image blocks so that the size of the scaled eye image blocks meets the size requirement of the neural network 610 for the input image. For another example, after segmenting the eye image blocks of the two eyes of the target object, the acquisition module 600 performs mapping processing on the eye image block on a predetermined side, thereby forming two eye image blocks on the same side of the target object. Of course, the acquisition module 600 can also scale the two eye image blocks on the same side.
  • the present disclosure does not limit the specific implementation manner of the acquisition module 600 segmenting the eye image blocks from the image to be processed, nor the specific implementation manner of the acquisition module 600 preprocessing the segmented eye image blocks.
  • the neural network 610 is used to perform eye open and closed state detection processing on the image to be processed and to output the eye open and closed state detection result.
  • for an input eye image block, the eye open and closed state detection result output by the neural network 610 in the present disclosure may be at least one probability value, for example, a probability value indicating that the eye is in the open state and a probability value indicating that the eye is in the closed state.
  • the value range of both probability values may be 0 to 1, and the sum of the two probability values for the same eye image block is 1. The closer the probability value of the open state is to 1, the closer the eyes in the eye image block are to the open state. The closer the probability value of the closed state is to 1, the closer the eyes in the eye image block are to the closed state.
  • the determining module 620 is configured to determine the eye movements and/or facial expressions and/or fatigue state and/or interactive control information of the target object at least according to the eye open and closed state detection results that belong to the same target object in multiple images to be processed with a time sequence relationship.
  • the eye movements of the target object are, for example, a quick blinking motion, opening one eye while closing the other, or a squinting motion.
  • the facial expressions of the target object are, for example, smiling, laughing, crying, or sadness.
  • the fatigue state of the target object is, for example, mild fatigue, dozing off, or being deeply asleep.
  • the interactive control information expressed by the target object is, for example, confirmation or denial.
  • FIG. 7 is a schematic structural diagram of an embodiment of the intelligent driving control device of the present disclosure.
  • the device in FIG. 7 mainly includes: an acquisition module 600, a neural network 610, a fatigue state determination module 700, and an instruction module 710.
  • the acquisition module 600 is used to acquire the to-be-processed image collected by the camera device installed on the vehicle.
  • the neural network 610 is used to perform eye open and closed state detection processing on the image to be processed and to output the eye open and closed state detection result.
  • the fatigue state determining module 700 is configured to determine the fatigue state of the target object at least according to the detection results of the open/closed state of the eyes belonging to the same target object in the plurality of images to be processed with a time series relationship.
  • the target object in this disclosure is usually a driver.
  • the fatigue state determining module 700 can determine index parameters such as the number of blinks per unit time, the duration of a single eye closure, or the duration of a single eye opening of the target object (such as a driver) based on multiple eye open and closed state detection results that belong to the same target object and have a time sequence relationship. By further judging the corresponding index parameters against predetermined index requirements, the fatigue state determining module 700 can determine whether the target object (such as the driver) is in a fatigue state.
  • the fatigue state in the present disclosure may include various fatigue states of different degrees, for example, a mild fatigue state, a moderate fatigue state, or a severe fatigue state. The present disclosure does not limit the specific implementation manner of determining the fatigue state of the target object by the fatigue state determining module 700.
  • the instruction module 710 is used to form a corresponding instruction according to the fatigue state of the target object, and output the instruction.
  • The instruction generated by the instruction module 710 according to the fatigue state of the target object may include at least one of: an instruction to switch to an intelligent driving state, a voice alert instruction for fatigued driving, a vibration instruction to wake the driver, an instruction to report dangerous driving information, and so on. The present disclosure does not limit the specific form of the instruction.
  • Judging the fatigue state from the eye open/closed state detection results output by the neural network 610 helps improve the accuracy of fatigue state detection. The instruction module 710 then forms a corresponding instruction according to the detected fatigue state, which helps avoid fatigued driving and thus improves driving safety.
  • FIG. 8 shows an exemplary device 800 suitable for implementing the present disclosure.
  • The device 800 may be a control system/electronic system configured in a car, a mobile terminal (for example, a smartphone), a personal computer (PC, for example, a desktop or notebook computer), a tablet computer, a server, etc.
  • The device 800 includes one or more processors, a communication part, and so on. The one or more processors may be one or more central processing units (CPU) 801 and/or one or more acceleration units 813 (for example, a graphics processor, GPU). The processor can perform various appropriate actions and processing according to executable instructions stored in a read-only memory (ROM) 802 or executable instructions loaded from the storage part 808 into a random access memory (RAM) 803.
  • the communication part 812 may include, but is not limited to, a network card, and the network card may include, but is not limited to, an IB (InfiniBand) network card.
  • The processor can communicate with the read-only memory 802 and/or the random access memory 803 to execute executable instructions, is connected to the communication part 812 through the bus 804, and communicates with other target devices through the communication part 812, thereby completing the corresponding steps of the present disclosure.
  • the RAM 803 can also store various programs and data required for device operation.
  • the CPU 801, the ROM 802, and the RAM 803 are connected to each other through a bus 804.
  • The ROM 802 is an optional module.
  • The RAM 803 stores executable instructions, or executable instructions are written into the ROM 802 at run time, and the executable instructions cause the central processing unit 801 to execute the steps included in the above method.
  • An input/output (I/O) interface 805 is also connected to the bus 804.
  • the communication part 812 may be integrated, or may be configured with multiple sub-modules (for example, multiple IB network cards), each connected to the bus.
  • the following components are connected to the I/O interface 805: an input part 806 including a keyboard, a mouse, and the like; an output part 807 including a cathode ray tube (CRT) or liquid crystal display (LCD), a speaker, and the like; a storage part 808 including a hard disk and the like; and a communication part 809 including a network interface card such as a LAN card or a modem.
  • the communication part 809 performs communication processing via a network such as the Internet.
  • the drive 810 is also connected to the I/O interface 805 as needed.
  • a removable medium 811, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 810 as needed, so that a computer program read from it can be installed into the storage part 808 as needed.
  • The architecture shown in FIG. 8 is only an optional implementation. In practice, the number and types of the components in FIG. 8 can be selected, removed, added, or replaced according to actual needs. Different functional components can also be arranged separately or integrated.
  • For example, the acceleration unit 813 and the CPU 801 can be arranged separately.
  • Alternatively, the acceleration unit 813 can be integrated on the CPU 801. The communication part can be arranged separately, or can be integrated on the CPU 801 or the acceleration unit 813.
  • the process described below with reference to the flowcharts can be implemented as a computer software program.
  • the embodiments of the present disclosure include a computer program product, which includes a computer program tangibly contained on a machine-readable medium.
  • the computer program includes program code for executing the steps shown in the flowchart.
  • the program code may include instructions corresponding to the steps in the method provided by the present disclosure.
  • the computer program may be downloaded and installed from the network through the communication part 809, and/or installed from the removable medium 811.
  • When the computer program is executed by the central processing unit (CPU) 801, the instructions described in the present disclosure that implement the above-mentioned corresponding steps are executed.
  • The embodiments of the present disclosure also provide a computer program product for storing computer-readable instructions which, when executed, cause a computer to execute the procedures described in any of the foregoing embodiments.
  • the computer program product can be specifically implemented by hardware, software or a combination thereof.
  • the computer program product is specifically embodied as a computer storage medium.
  • the computer program product is specifically embodied as a software product, such as a software development kit (SDK).
  • The embodiments of the present disclosure also provide another eye open/closed state detection method, intelligent driving control method, and neural network training method, together with corresponding devices, electronic equipment, computer storage media, computer programs, and computer program products. The method includes: a first device sends a neural network training instruction, an eye open/closed state detection instruction, or an intelligent driving control instruction to a second device, the instruction causing the second device to execute the neural network training method, the eye open/closed state detection method, or the intelligent driving control method of any of the above possible embodiments; and the first device receives the neural network training result, the eye open/closed state detection result, or the intelligent driving control result sent by the second device.
  • The neural network training instruction, the eye open/closed state detection instruction, or the intelligent driving control instruction may specifically be a call instruction. The first device may instruct the second device, by way of a call, to perform the neural network training operation, the eye open/closed state detection operation, or the intelligent driving control operation. Correspondingly, in response to receiving the call instruction, the second device may execute the steps and/or processes of any embodiment of the above neural network training method, eye open/closed state detection method, or intelligent driving control method.
  • the method and apparatus, electronic equipment, and computer-readable storage medium of the present disclosure may be implemented in many ways.
  • the method and apparatus, electronic equipment, and computer-readable storage medium of the present disclosure can be implemented by software, hardware, firmware or any combination of software, hardware, and firmware.
  • the above-mentioned order of the steps for the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above, unless otherwise specifically stated.
  • the present disclosure can also be implemented as programs recorded in a recording medium, and these programs include machine-readable instructions for implementing the method according to the present disclosure.
  • the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.

Abstract

Disclosed in the embodiments of the present disclosure are a neural network training method, an eye opening and closing state detection method, a smart driving control method, an apparatus, an electronic device, a computer readable storage medium, and a computer program, the neural network training method comprising: by means of a neural network to be trained for eye opening and closing state detection, respectively performing eye opening and closing state detection processing on multiple eye images in image sets corresponding to each of at least two eye opening and closing detection training tasks, and outputting eye opening and closing state detection results, the eye images contained in different image sets being at least partially different; on the basis of eye opening and closing label information of the eye images and the eye opening and closing state detection results outputted by the neural network, respectively determining the loss corresponding to each of the at least two eye opening and closing detection training tasks and, on the basis of the loss corresponding to each of the at least two eye opening and closing detection training tasks, adjusting the network parameters of the neural network.

Description

Neural Network Training and Eye Open/Closed State Detection Method, Apparatus, and Device
The present disclosure claims priority to Chinese patent application No. 201910153463.4, filed with the Chinese Patent Office on February 28, 2019 and entitled "Neural Network Training and Eye Open/Closed State Detection Method, Apparatus, and Device", the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to computer vision technology, and in particular to a neural network training method, a neural network training device, an eye open/closed state detection method, an eye open/closed state detection device, an intelligent driving control method, an intelligent driving control device, an electronic device, a computer-readable storage medium, and a computer program.
Background
Eye open/closed state detection determines whether an eye is open or closed. It can be used in fields such as fatigue monitoring, liveness recognition, and expression recognition. For example, in driver assistance technology, the driver's eye open/closed state needs to be detected, and whether the driver is driving while fatigued is judged from the detection results, thereby implementing fatigue driving monitoring. Detecting the eye open/closed state accurately and avoiding misjudgment as far as possible helps improve driving safety.
Summary
The embodiments of the present disclosure provide technical solutions for neural network training, eye open/closed state detection, and intelligent driving control.
According to one aspect of the embodiments of the present disclosure, a neural network training method is provided, including: performing, via a neural network to be trained for open/closed eye detection, eye open/closed state detection processing on multiple eye images in the image sets respectively corresponding to at least two open/closed eye detection training tasks, and outputting eye open/closed state detection results, wherein the eye images contained in different image sets are at least partially different; and determining, according to the eye open/closed annotation information of the eye images and the eye open/closed state detection results output by the neural network, the loss corresponding to each of the at least two open/closed eye detection training tasks, and adjusting the network parameters of the neural network according to the losses corresponding to the at least two open/closed eye detection training tasks.
According to another aspect of the embodiments of the present disclosure, an eye open/closed state detection method is provided, including: acquiring an image to be processed; and performing, via a neural network, eye open/closed state detection processing on the image to be processed, and outputting an eye open/closed state detection result, wherein the neural network is obtained by training with the neural network training method described in the foregoing embodiment.
According to yet another aspect of the embodiments of the present disclosure, an intelligent driving control method is provided, including: acquiring an image to be processed that is collected by a camera device installed on a vehicle; performing, via a neural network, eye open/closed state detection processing on the image to be processed, and outputting an eye open/closed state detection result; determining the fatigue state of a target object at least according to the eye open/closed state detection results that belong to the same target object in multiple images to be processed having a time-series relationship; and forming a corresponding instruction according to the fatigue state of the target object, and outputting the instruction, wherein the neural network is obtained by training with the neural network training method described in the foregoing embodiment.
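As an illustration of the fatigue-determination step, the following minimal sketch derives two of the index parameters the disclosure mentions (blinks per unit time and the duration of a single eye closure) from a time-ordered sequence of per-frame detection results. The function names, the frame rate, and both thresholds are assumptions made for this example; the disclosure does not fix any particular way of determining the fatigue state.

```python
def eye_closure_stats(states, fps):
    """Compute index parameters from per-frame 'open'/'closed' results.

    states: detection results for ONE target object, in time order.
    fps: frames per second of the camera (assumed known).
    """
    blinks = 0
    longest_closed = 0  # longest run of consecutive closed frames
    run = 0             # length of the current closed run
    for s in states:
        if s == "closed":
            run += 1
            longest_closed = max(longest_closed, run)
        else:
            if run > 0:
                blinks += 1  # a closed run that has ended counts as one closure
            run = 0
    duration_s = len(states) / fps
    return {
        "blinks_per_minute": blinks * 60.0 / duration_s if duration_s else 0.0,
        "longest_closure_s": longest_closed / fps,
    }


def is_fatigued(stats, max_closure_s=1.5, max_blinks_per_minute=30.0):
    """Hypothetical index requirements; both thresholds are illustrative only."""
    return (stats["longest_closure_s"] > max_closure_s
            or stats["blinks_per_minute"] > max_blinks_per_minute)


# 1 s open, 2 s closed, 1 s open at 10 frames per second.
states = ["open"] * 10 + ["closed"] * 20 + ["open"] * 10
stats = eye_closure_stats(states, fps=10)
instruction = "voice_alert" if is_fatigued(stats) else None
```

A closed run that is still in progress at the end of the sequence contributes to the longest-closure value but is not counted as a completed blink in this sketch.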
According to still another aspect of the embodiments of the present disclosure, a neural network training device is provided, including: a neural network to be trained for open/closed eye detection, used to perform eye open/closed state detection processing on multiple eye images in the image sets respectively corresponding to at least two open/closed eye detection training tasks, and to output eye open/closed state detection results, wherein the eye images contained in different image sets are at least partially different; and an adjustment module, used to determine, according to the eye open/closed annotation information of the eye images and the eye open/closed state detection results output by the neural network, the loss corresponding to each of the at least two open/closed eye detection training tasks, and to adjust the network parameters of the neural network according to the losses corresponding to the at least two open/closed eye detection training tasks.
According to still another aspect of the embodiments of the present disclosure, an eye open/closed state detection device is provided, including: an acquisition module, used to acquire an image to be processed; and a neural network, used to perform eye open/closed state detection processing on the image to be processed and to output an eye open/closed state detection result, wherein the neural network is obtained through training by the neural network training device described in the foregoing embodiment.
According to still another aspect of the embodiments of the present disclosure, an intelligent driving control device is provided, including: an acquisition module, used to acquire an image to be processed that is collected by a camera device installed on a vehicle; a neural network, used to perform eye open/closed state detection processing on the image to be processed and to output an eye open/closed state detection result; a fatigue state determining module, used to determine the fatigue state of a target object at least according to the eye open/closed state detection results that belong to the same target object in multiple images to be processed having a time-series relationship; and an instruction module, used to form a corresponding instruction according to the fatigue state of the target object and to output the instruction, wherein the neural network is obtained through training by the neural network training device described in the foregoing embodiment.
According to still another aspect of the embodiments of the present disclosure, an electronic device is provided, including: a memory, used to store a computer program; and a processor, used to execute the computer program stored in the memory, wherein any method embodiment of the present disclosure is implemented when the computer program is executed.
According to still another aspect of the embodiments of the present disclosure, a computer-readable storage medium is provided, on which a computer program is stored; when the computer program is executed by a processor, any method embodiment of the present disclosure is implemented.
According to still another aspect of the embodiments of the present disclosure, a computer program is provided, including computer instructions; when the computer instructions run in a processor of a device, any method embodiment of the present disclosure is implemented.
In practicing the embodiments of the present disclosure, the inventors found that a neural network trained in the conventional single-task manner on the image set of one task often achieves good open/closed eye detection accuracy in the scene corresponding to that task, whereas its accuracy in other scenes is hard to guarantee. If images collected from multiple different scenes are simply merged into one image set for training, without distinguishing whether the images come from different scenes or correspond to different training tasks, the distribution of the image subset (batch) fed to the neural network in each iteration is uncontrollable: a batch may contain many images of one scene and few or none of the others, and the distribution differs from iteration to iteration. In other words, the batch distribution in each iteration is too random and no loss is computed specifically for each training task, so the training process cannot be steered to learn the capability needed for every training task, and the accuracy of the trained neural network in the different scenes corresponding to different tasks cannot be guaranteed.
With the neural network training method and device, the eye open/closed state detection method and device, the intelligent driving control method and device, the electronic device, the computer-readable storage medium, and the computer program provided by the present disclosure, a corresponding image set is determined for each of multiple different open/closed eye detection tasks, the eye images used in a single training pass of the neural network are drawn from the multiple image sets, the loss of the neural network's open/closed eye detection results is determined separately for each training task in that pass from the eye images of the respective image sets, and the network parameters of the neural network are adjusted according to these losses. In this way, the eye image subset fed to the neural network in every training iteration includes eye images corresponding to every training task, and the loss of each training task is computed specifically. The neural network can thus learn open/closed eye detection for each training task while balancing the learning across tasks, so that the trained neural network improves the open/closed eye detection accuracy for eye images in each of the scenes corresponding to the multiple training tasks at the same time. This helps improve the universality and generalization of technical solutions that use the neural network to detect open and closed eyes accurately in different scenes, and helps better meet the practical needs of multi-scene applications.
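The batch construction and per-task loss described above can be sketched roughly as follows. This is a toy, dependency-free illustration rather than the disclosed implementation: the task names, the per-task sample counts, the use of cross-entropy, and the plain sum used to combine the losses are all assumptions made for the example, and the "network" is a stand-in function.

```python
import math
import random


def cross_entropy(p_open, label_open):
    """Negative log-likelihood of the annotated state (1 = open, 0 = closed)."""
    p = p_open if label_open == 1 else 1.0 - p_open
    return -math.log(max(p, 1e-12))


def sample_batch(image_sets, per_task_counts, rng):
    """Draw a fixed number of (image, label) samples from EVERY task's image set."""
    return {task: rng.sample(samples, per_task_counts[task])
            for task, samples in image_sets.items()}


def per_task_losses(batch, predict_open_prob):
    """Average the loss separately over the samples of each training task."""
    losses = {}
    for task, samples in batch.items():
        total = sum(cross_entropy(predict_open_prob(img), label)
                    for img, label in samples)
        losses[task] = total / len(samples)
    return losses


# Toy data: each "image" is just a number; label 1 = open, 0 = closed.
image_sets = {
    "with_glasses": [(0.9, 1), (0.2, 0), (0.7, 1), (0.1, 0)],
    "without_glasses": [(0.8, 1), (0.3, 0), (0.6, 1)],
}
per_task_counts = {"with_glasses": 2, "without_glasses": 2}

rng = random.Random(0)
batch = sample_batch(image_sets, per_task_counts, rng)
# Stand-in for the network: treat the toy image value as P(open).
losses = per_task_losses(batch, lambda img: img)
# Assumed combination rule: sum the per-task losses before the parameter update.
total_loss = sum(losses.values())
```

Because every iteration draws a fixed quota from each image set, no task can be absent from a batch, which is the property the paragraph above contrasts with sampling from a single merged image set.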
The technical solutions of the present disclosure are described in further detail below with reference to the drawings and embodiments.
Brief Description of the Drawings
The drawings, which form a part of the specification, illustrate embodiments of the present disclosure and, together with the description, serve to explain the principles of the present disclosure.
The present disclosure can be understood more clearly from the following detailed description with reference to the drawings, in which:
FIG. 1 is a flowchart of an embodiment of the neural network training method of the present disclosure;
FIG. 2 is a schematic diagram of an embodiment of multiple open/closed eye detection training tasks of the present disclosure;
FIG. 3 is a flowchart of an embodiment of the eye open/closed state detection method of the present disclosure;
FIG. 4 is a flowchart of an embodiment of the intelligent driving control method of the present disclosure;
FIG. 5 is a schematic structural diagram of an embodiment of the neural network training device of the present disclosure;
FIG. 6 is a schematic structural diagram of an embodiment of the eye open/closed state detection device of the present disclosure;
FIG. 7 is a schematic structural diagram of an embodiment of the intelligent driving control device of the present disclosure;
FIG. 8 is a block diagram of an exemplary device for implementing the embodiments of the present disclosure.
Detailed Description
Various exemplary embodiments of the present disclosure will now be described in detail with reference to the drawings. It should be noted that, unless specifically stated otherwise, the relative arrangement of the components and steps, the numerical expressions, and the numerical values set forth in these embodiments do not limit the scope of the present disclosure.
It should also be understood that, for ease of description, the parts shown in the drawings are not drawn to actual scale.
The following description of at least one exemplary embodiment is merely illustrative and in no way limits the present disclosure or its application or use.
Technologies, methods, and devices known to those of ordinary skill in the relevant fields may not be discussed in detail, but where appropriate they should be regarded as part of the specification.
It should be noted that similar reference numerals and letters denote similar items in the following drawings, so once an item is defined in one drawing, it need not be discussed further in subsequent drawings.
The embodiments of the present disclosure can be applied to electronic devices such as terminal devices, computer systems, and servers, which can operate with numerous other general-purpose or special-purpose computing system environments or configurations. Examples of well-known terminal devices, computing systems, environments, and/or configurations suitable for use with electronic devices such as terminal devices, computer systems, and servers include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, network personal computers, minicomputer systems, mainframe computer systems, distributed cloud computing environments that include any of the above systems, and so on.
Electronic devices such as terminal devices, computer systems, and servers can be described in the general context of computer-system-executable instructions (such as program modules) executed by a computer system. Generally, program modules may include routines, programs, object programs, components, logic, data structures, and so on, which perform specific tasks or implement specific abstract data types. The computer system/server can be implemented in a distributed cloud computing environment, in which tasks are performed by remote processing devices linked through a communication network. In a distributed cloud computing environment, program modules may be located on local or remote computing system storage media that include storage devices.
Exemplary Embodiments
FIG. 1 is a flowchart of an embodiment of the neural network training method of the present disclosure. As shown in FIG. 1, the method of this embodiment includes steps S100 and S110. Each step in FIG. 1 is described in detail below.
S100: Via the neural network to be trained for open/closed eye detection, perform eye open/closed state detection processing on multiple eye images in the image sets respectively corresponding to at least two open/closed eye detection training tasks, and output eye open/closed state detection results.
In an optional example, after being successfully trained, the neural network to be trained for open/closed eye detection can be used to detect the eye open/closed state in an image to be processed and to output the detection result. For example, for an image to be processed, the neural network outputs two probability values. One represents the probability that the eyes of the target object in the image are open; the larger this value, the closer the eyes are to the open state. The other represents the probability that the eyes are closed; the larger this value, the closer the eyes are to the closed state. The two probability values may sum to 1.
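One common way to obtain two probability values that sum to 1 is a softmax over two raw scores. The sketch below assumes such a two-score output head; the disclosure does not state which normalization is used, and the function name and inputs are hypothetical.

```python
import math


def open_closed_probabilities(logit_open, logit_closed):
    """Turn two raw scores into (p_open, p_closed) using a softmax."""
    m = max(logit_open, logit_closed)  # subtract the max for numerical stability
    e_open = math.exp(logit_open - m)
    e_closed = math.exp(logit_closed - m)
    total = e_open + e_closed
    return e_open / total, e_closed / total


p_open, p_closed = open_closed_probabilities(2.0, -1.0)
# Pick the state with the larger probability as the detection result.
state = "open" if p_open >= p_closed else "closed"
```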
In an optional example, the neural network in the present disclosure may be a convolutional neural network. It may include, but is not limited to, convolutional layers, ReLU (Rectified Linear Unit) layers (also called activation layers), pooling layers, fully connected layers, and layers for classification (such as binary classification). The more layers the neural network contains, the deeper the network. The present disclosure does not limit the specific structure of the neural network.
In an optional example, training the neural network in the present disclosure involves at least two open/closed eye detection training tasks, each of which belongs to the overall training task of enabling the neural network to detect the eye open/closed state. The training objectives of different open/closed eye detection training tasks are not exactly the same. That is, the present disclosure can divide the overall training task of the neural network into multiple training tasks, each targeting one training objective, with different training tasks having different training objectives.
In an optional example, the at least two eye open/closed detection training tasks in the present disclosure may include at least two of the following tasks: an eye open/closed detection task for the case where the eye has an attachment; an eye open/closed detection task for the case where the eye has no attachment; an eye open/closed detection task in an indoor environment; an eye open/closed detection task in an outdoor environment; an eye open/closed detection task for the case where the eye has an attachment and there is a light spot on the attachment; and an eye open/closed detection task for the case where the eye has an attachment and there is no light spot on the attachment. The attachment may be glasses, a transparent plastic sheet, or the like. The light spot may be a spot formed on the attachment by light reflecting off it. Glasses in the present disclosure generally refers to glasses through whose lenses the wearer's eyes can be seen.
Optionally, the eye open/closed detection task for the case where the eye has an attachment may be an eye open/closed detection task with glasses worn. This task can implement at least one of: eye open/closed detection indoors with glasses worn, and eye open/closed detection outdoors with glasses worn.
Optionally, the eye open/closed detection task for the case where the eye has no attachment may be an eye open/closed detection task without glasses. This task can implement at least one of: eye open/closed detection indoors without glasses, and eye open/closed detection outdoors without glasses.
Optionally, the eye open/closed detection task in an indoor environment can implement at least one of: eye open/closed detection indoors without glasses, eye open/closed detection indoors with glasses that reflect light, and eye open/closed detection indoors with glasses that do not reflect light.
Optionally, the eye open/closed detection task in an outdoor environment can implement at least one of: eye open/closed detection outdoors without glasses, eye open/closed detection outdoors with glasses that reflect light, and eye open/closed detection outdoors with glasses that do not reflect light.
Optionally, the eye open/closed detection task for the case where the eye has an attachment and there is a light spot on the attachment may be an eye open/closed detection task with glasses that reflect light. This task can implement at least one of: eye open/closed detection indoors with glasses that reflect light, and eye open/closed detection outdoors with glasses that reflect light.
Optionally, the eye open/closed detection task for the case where the eye has an attachment and there is no light spot on the attachment may be an eye open/closed detection task with glasses that do not reflect light. This task can implement at least one of: eye open/closed detection indoors with glasses that do not reflect light, and eye open/closed detection outdoors with glasses that do not reflect light.
As can be seen from the above description, different eye open/closed detection training tasks in the present disclosure can overlap. For example, the eye open/closed detection task with glasses worn overlaps with each of the following: the eye open/closed detection task in an indoor environment, the eye open/closed detection task in an outdoor environment, the eye open/closed detection task for the case where the eye has an attachment with a light spot on it, and the eye open/closed detection task for the case where the eye has an attachment with no light spot on it. The remaining overlaps among the six eye open/closed detection training tasks listed above are not described one by one here. In addition, the present disclosure does not limit the number of eye open/closed detection training tasks involved, which can be determined according to actual requirements, nor does it limit the specific form of any eye open/closed detection training task.
Optionally, as shown in FIG. 2, the at least two eye open/closed detection training tasks in the present disclosure may include the following three eye open/closed detection training tasks:
Eye open/closed detection training task a: the eye open/closed detection training task in an indoor environment;
Eye open/closed detection training task b: the eye open/closed detection task in an outdoor environment;
Eye open/closed detection training task c: the eye open/closed detection task for the case where the eye has an attachment and there is a light spot on the attachment.
Eye open/closed detection training task a and eye open/closed detection training task b do not overlap; training task a and training task c may overlap; and training task b and training task c may overlap.
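The overlap relations among tasks a, b, and c can be checked with a small set model. This is a sketch only: describing each scene by the hypothetical attributes (environment, glasses worn, light spot on the glasses) is an assumption made for illustration, not a structure taken from the source.

```python
from itertools import product

# Hypothetical scene attributes: (environment, has_glasses, has_light_spot).
scenes = [
    (env, glasses, spot)
    for env, glasses, spot in product(("indoor", "outdoor"), (False, True), (False, True))
    if glasses or not spot  # a light spot only exists on an attachment
]

task_a = {s for s in scenes if s[0] == "indoor"}   # indoor environment
task_b = {s for s in scenes if s[0] == "outdoor"}  # outdoor environment
task_c = {s for s in scenes if s[1] and s[2]}      # glasses with a light spot

a_and_b = task_a & task_b
a_and_c = task_a & task_c
b_and_c = task_b & task_c
```

Under this model `a_and_b` is empty, while `a_and_c` and `b_and_c` each contain the scene with reflective glasses in the respective environment.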
In an optional example, each of the at least two eye open/closed detection training tasks in the present disclosure corresponds to an image set. For example, eye open/closed detection training tasks a, b, and c in FIG. 2 each correspond to an image set. Each image set usually includes multiple eye images. The eye images contained in different image sets are at least partially different. That is, for any image set, at least some of its eye images do not appear in the other image sets. Optionally, the eye images contained in different image sets may overlap.
Optionally, the image sets corresponding to the six eye open/closed detection training tasks listed above may be, respectively: a set of eye images in which the eye has an attachment; a set of eye images in which the eye has no attachment; a set of eye images captured in an indoor environment; a set of eye images captured in an outdoor environment; a set of eye images in which the eye has an attachment with a light spot on it; and a set of eye images in which the eye has an attachment with no light spot on it.
Optionally, all eye images in the set of eye images in which the eye has an attachment may be eye images with glasses worn. For example, the set may include eye images with glasses captured in an indoor environment and eye images with glasses captured in an outdoor environment.
Optionally, all eye images in the set of eye images in which the eye has no attachment may be eye images without glasses. For example, the set may include eye images without glasses captured in an indoor environment and eye images without glasses captured in an outdoor environment.
Optionally, the set of eye images captured in an indoor environment may include eye images without glasses captured in an indoor environment and eye images with glasses captured in an indoor environment.
Optionally, the set of eye images captured in an outdoor environment may include eye images without glasses captured in an outdoor environment and eye images with glasses captured in an outdoor environment.
Optionally, all eye images in the set of eye images in which the eye has an attachment with a light spot on it may be eye images with glasses that have a light spot on them. For example, the set may include such eye images captured in an indoor environment and such eye images captured in an outdoor environment.
Optionally, all eye images in the set of eye images in which the eye has an attachment with no light spot on it may be eye images with glasses that have no light spot on them. For example, the set may include such eye images captured in an indoor environment and such eye images captured in an outdoor environment.
In an optional example, the image sets included in the present disclosure are determined by the eye open/closed detection training tasks it includes. For example, if the present disclosure includes at least two of the six eye open/closed detection training tasks above, it includes the eye image sets corresponding to those at least two tasks.
In an optional example, an eye image used in the neural network training process of the present disclosure may also be called an eye image sample, and the image content of an eye image sample usually contains an eye. An eye image sample in the present disclosure is usually a single-eye sample, that is, its image content contains one eye rather than both eyes. Optionally, the eye image samples may be based on the eye of one particular side, for example, the left eye. Of course, the present disclosure does not exclude eye image samples based on both eyes or on an eye of either side.
In an optional example, an eye image in the present disclosure is usually an eye image block cut out of an image, captured by a camera device, that contains an eye. For example, forming an eye image may include performing eye detection on the image captured by the camera device to determine the eye region in the image, and then cutting the detected eye region out of the image. Optionally, the cut-out image block may be scaled and/or subjected to image content mapping (for example, converting a right-eye image block into a left-eye image block through image content mapping), thereby forming an eye image for training the neural network for eye open/closed detection. Of course, the present disclosure does not rule out using the complete image captured by the camera device, which contains an eye, as the eye image. In addition, an eye image in the present disclosure may be an eye image in the corresponding training sample set.
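The cut-and-map step can be sketched on a toy image stored as nested lists. The box format `(top, left, bottom, right)`, the helper names, and the use of a horizontal flip as the right-eye to left-eye content mapping are all assumptions; the source does not specify how the mapping is performed.

```python
def crop(image, box):
    """Cut the region box = (top, left, bottom, right) out of a row-major image."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]

def mirror(patch):
    """Flip a patch horizontally, e.g. to give a right-eye patch a left-eye layout."""
    return [row[::-1] for row in patch]

# A 4x6 toy image whose pixel values encode their column index.
image = [[col for col in range(6)] for _ in range(4)]
right_eye_box = (1, 2, 3, 5)  # hypothetical eye detector output
patch = mirror(crop(image, right_eye_box))
```

Here `patch` holds the two cropped rows with their columns reversed, so every sample reaches the network in a single-side layout.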
In an optional example, the eye images used in the present disclosure to train the neural network for eye open/closed detection usually carry annotation information, which indicates the open/closed state of the eye in the eye image. That is, the annotation information indicates whether the eye in the eye image is open or closed. In one optional example, annotation information of 1 indicates that the eye in the eye image is open, and annotation information of 0 indicates that the eye in the eye image is closed.
In an optional example, the present disclosure usually obtains a corresponding number of eye images from the eye image set of each training task. For example, in FIG. 2, a corresponding number of eye images is obtained from the image set of eye open/closed detection training task a and provided to the neural network to be trained, a corresponding number of eye images is obtained from the image set of eye open/closed detection training task b and provided to the neural network to be trained, and a corresponding number of eye images is obtained from the image set of eye open/closed detection training task c and provided to the neural network to be trained.
In one optional example, the present disclosure may obtain the corresponding number of eye images from the eye image set of each training task according to a preset ratio of image counts among the training tasks. The preset batch size is usually also taken into account when obtaining the eye images. For example, if the preset ratio of image counts for eye open/closed detection training tasks a, b, and c is 1:1:1 and the preset batch size is 600, then 200 eye images may be obtained from the eye image set of task a, 200 from the eye image set of task b, and 200 from the eye image set of task c.
Optionally, if the eye image set of some eye open/closed detection training task contains fewer eye images than the corresponding number (for example, fewer than 200), additional eye images may be obtained from the eye image sets of the other eye open/closed detection training tasks to reach the batch size. For example, suppose the eye image set of task c contains only 100 eye images, while the eye image sets of tasks a and b each contain more than 250 eye images. Then 250 eye images may be obtained from the eye image set of task a, 250 from the eye image set of task b, and 100 from the eye image set of task c, for a total of 600 eye images. This makes obtaining eye images more flexible.
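The ratio-based allocation and the shortfall handling described above can be sketched as follows. The function name and the rule of handing the shortfall out one image at a time to sets with spare images are assumptions; the source only requires that the batch size be reached.

```python
def allocate_batch(set_sizes, ratio, batch_size):
    """Return how many images to draw from each task's image set."""
    total_ratio = sum(ratio)
    # Start from the preset proportion, capped by what each set actually holds.
    quota = [min(size, batch_size * r // total_ratio)
             for size, r in zip(set_sizes, ratio)]
    shortfall = batch_size - sum(quota)
    # Hand the shortfall to sets that still have spare images, one image at a time.
    while shortfall > 0:
        spare = [i for i, size in enumerate(set_sizes) if quota[i] < size]
        if not spare:
            break  # not enough images overall to fill the batch
        for i in spare:
            if shortfall == 0:
                break
            quota[i] += 1
            shortfall -= 1
    return quota

# Sets a and b hold 300 images each (hypothetical), set c only 100.
counts = allocate_batch([300, 300, 100], [1, 1, 1], 600)
```

With these inputs the sketch reproduces the 250/250/100 split from the example above; when every set is large enough it returns the plain 200/200/200 split.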
It should be noted that the present disclosure may also set the numbers randomly when obtaining eye images from the eye image sets of the different training tasks. The present disclosure does not limit the specific way in which the corresponding number of eye images is obtained from each eye image set. In addition, when obtaining eye images from an eye image set, eye images whose annotation information indicates an uncertain open/closed state should be avoided, which helps improve the detection accuracy of the neural network for eye open/closed detection.
In an optional example, the present disclosure may provide the obtained eye images one after another to the neural network to be trained, which performs eye open/closed state detection on each input eye image and therefore outputs the eye open/closed state detection result of each eye image in sequence. For example, after an eye image input to the neural network to be trained has passed in turn through the convolutional layers, the fully connected layer, and the layer used for classification, the neural network outputs two probability values, each in the range 0 to 1 and summing to 1. One probability value corresponds to the open state; the closer it is to 1, the closer the eye in the eye image is to the open state. The other probability value corresponds to the closed state; the closer it is to 1, the closer the eye in the eye image is to the closed state.
S110. According to the eye open/closed annotation information of the eye images and the eye open/closed state detection results output by the neural network, determine the loss corresponding to each of the at least two eye open/closed detection training tasks, and adjust the network parameters of the neural network according to these losses.
In an optional example, the present disclosure should determine the loss corresponding to each eye open/closed detection training task, determine a combined loss from the losses of all the training tasks, and use the combined loss to adjust the network parameters of the neural network. The network parameters in the present disclosure may include, but are not limited to, convolution kernel parameters and/or matrix weights. The present disclosure does not limit the specific content of the network parameters.
In an optional example, for any eye open/closed detection training task, the present disclosure may determine the loss corresponding to that training task from the angle between (i) the maximum probability value in the eye open/closed state detection result that the neural network outputs for each of multiple eye images in the image set of the task and (ii) the decision boundary corresponding to the annotation information of the respective eye image. Optionally, based on the eye open/closed annotation information of the eye images and the eye open/closed state detection results output by the neural network, the present disclosure may use the A-softmax (angular softmax) loss function to determine the loss corresponding to each eye open/closed detection training task, determine a combined loss (for example, the sum of the losses) from these per-task losses, and adjust the network parameters of the neural network by stochastic gradient descent. For example, the present disclosure may use the A-softmax loss function to compute the loss corresponding to each eye open/closed detection training task and perform back propagation based on the sum of the losses of all the tasks, so that the network parameters of the neural network to be trained are updated along the descending gradient of the loss.
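The per-task loss and its sum can be sketched as below. To stay short and dependency-free, this sketch swaps in plain cross-entropy for the A-softmax loss named in the text, and it assigns each image to exactly one task. Both choices, along with the task names and probabilities, are assumptions for illustration.

```python
import math

def cross_entropy(p_open, label):
    """Negative log-likelihood of the annotated state (1 = open, 0 = closed)."""
    p_true = p_open if label == 1 else 1.0 - p_open
    return -math.log(p_true)

def combined_loss(batch):
    """batch: list of (task_name, p_open, label). Returns (per_task, total)."""
    grouped = {}
    for task, p_open, label in batch:
        grouped.setdefault(task, []).append(cross_entropy(p_open, label))
    # One mean loss per training task, then their sum as the combined loss.
    per_task = {task: sum(v) / len(v) for task, v in grouped.items()}
    return per_task, sum(per_task.values())

# Hypothetical predictions for one iteration, grouped by training task.
batch = [
    ("a_indoor", 0.9, 1), ("a_indoor", 0.2, 0),
    ("b_outdoor", 0.6, 1),
    ("c_light_spot", 0.5, 0),
]
per_task, total = combined_loss(batch)
```

In a real training loop `total` would be the quantity that is back-propagated, so a task with few images still contributes its own term rather than being averaged away.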
As can be seen from the above description, during training all eye images provided to the neural network in one training iteration can form an eye image subset, which includes eye images corresponding to each training task. By computing the loss of each training task separately, the present disclosure enables the neural network to learn eye open/closed detection for every training task during training while balancing what is learned across the different tasks. The trained neural network can therefore improve the accuracy of eye open/closed detection on eye images from each of the scenes that the training tasks correspond to, which helps make a technical solution based on this neural network more broadly applicable and better generalized across scenes, and better meets the practical needs of multi-scene applications.
The A-softmax loss function in the present disclosure may be as shown in formula (1) below:
[Formula (1): image PCTCN2019118127-appb-000001]
In formula (1), L_ang denotes the loss corresponding to one training task; N denotes the number of eye images of that training task; ||*|| denotes the modulus of *; x_i denotes the i-th eye image corresponding to the training task; y_i denotes the annotation value of the i-th eye image corresponding to the training task; m is a constant whose minimum value is usually not less than a predetermined value, for example not less than [image PCTCN2019118127-appb-000002]; [image PCTCN2019118127-appb-000003] denotes, for the i-th eye image, the angle between the maximum probability value in the eye open/closed state detection result output by the neural network and the decision boundary corresponding to the annotation value; and [image PCTCN2019118127-appb-000004] denotes the product of m and that angle.
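Because the formula survives here only as an image reference, the following LaTeX gives the A-softmax loss in the form commonly published for it (the SphereFace formulation), written with the symbols defined above. The angle notation \theta_{y_i,i} and \theta_{j,i} is assumed, and this text does not confirm that the expression matches the formula image term by term.

```latex
L_{ang} = \frac{1}{N}\sum_{i} -\log\left(
  \frac{e^{\|x_i\|\cos\left(m\,\theta_{y_i,i}\right)}}
       {e^{\|x_i\|\cos\left(m\,\theta_{y_i,i}\right)}
        + \sum_{j \neq y_i} e^{\|x_i\|\cos\left(\theta_{j,i}\right)}}
\right)
```

In this form \theta_{y_i,i} is the angle associated with the annotated class of the i-th image, m\,\theta_{y_i,i} is the product of m and that angle, and the sum over j \neq y_i runs over the other class. For two classes, that published formulation gives a lower bound on m of 2 + \sqrt{3}; whether the predetermined value in the formula image is the same bound is likewise not confirmed here.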
In an optional example, the current training process ends when the training of the neural network for eye open/closed detection reaches a predetermined iteration condition. The predetermined iteration condition may include: the difference between the eye open/closed state detection results that the neural network outputs for the eye images and the annotation information of those eye images meets a predetermined difference requirement. If the difference meets the requirement, the neural network has been trained successfully. The predetermined iteration condition may also include: the number of eye images used to train the neural network reaches a predetermined number. If the number of eye images used reaches the predetermined number but the difference does not meet the predetermined difference requirement, the neural network has not been trained successfully this time. A successfully trained neural network can be used for eye open/closed state detection.
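The two stopping conditions and the resulting outcome can be sketched as a small decision routine. The scalar `difference` per iteration (for example a loss value), the parameter names, and the numeric settings are hypothetical; the source does not define how the difference is measured.

```python
def training_outcome(differences, batch_size, max_images, tolerance):
    """Walk through per-iteration differences and report how training ends.

    differences: hypothetical scalar gap between predictions and annotations
    measured after each iteration.
    """
    images_used = 0
    for step, difference in enumerate(differences, start=1):
        images_used += batch_size
        if difference <= tolerance:
            return "success", step  # difference requirement met
        if images_used >= max_images:
            return "failed", step   # image budget used up first
    return "unfinished", len(differences)

converged = training_outcome([0.9, 0.5, 0.1], 600, 6000, 0.2)
exhausted = training_outcome([0.9, 0.8, 0.7], 600, 1200, 0.2)
```

The first call ends successfully at the third iteration; the second runs out of its image budget at the second iteration and is reported as failed.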
The present disclosure forms a combined loss from the losses of the different training tasks and uses it to adjust the network parameters of the neural network for eye open/closed detection. This enables the neural network to learn eye open/closed detection for every training task during training while balancing what is learned across the different tasks. The trained neural network can therefore improve the accuracy of eye open/closed detection on eye images from each of the scenes that the training tasks correspond to, which helps make a technical solution based on this neural network more broadly applicable and better generalized across scenes, and better meets the practical needs of multi-scene applications.
FIG. 3 is a flowchart of an embodiment of the eye open/closed state detection method of the present disclosure.
As shown in FIG. 3, the method of this embodiment includes steps S300 and S310. Each step in FIG. 3 is described in detail below.
S300. Obtain an image to be processed.
In an optional example, the image to be processed in the present disclosure may be a static image such as a picture or photograph, or a video frame from a dynamic video, for example, a frame of a video captured by a camera device mounted on a moving object, or a frame of a video captured by a camera device mounted at a fixed position. The moving object may be a vehicle, a robot, a robotic arm, or the like. The fixed position may be a desktop, a wall, or the like. The present disclosure does not limit the specific form of the moving object or the fixed position.
In an optional example, after obtaining the image to be processed, the present disclosure may detect the region where the eyes are located in the image. For example, the bounding box of an eye in the image to be processed may be determined by face detection, face key point detection, or a similar method. The image of the eye region may then be cut out of the image to be processed according to the eye bounding box, and the cut-out eye image block is provided to the neural network. Of course, the cut-out eye image block may be preprocessed before it is provided to the neural network. For example, the eye image block may be scaled so that its size meets the neural network's input size requirement. As another example, after the eye image blocks of both eyes of the target object are cut out, the eye image block of a predetermined side may be mapped so that the two eye image blocks of the target object are of the same side; optionally, the two same-side eye image blocks may also be scaled. The present disclosure limits neither the specific way of cutting the eye image block out of the image to be processed nor the specific way of preprocessing the cut-out eye image block.
S310: Perform eye open/closed state detection on the image to be processed via a neural network, and output an eye open/closed state detection result. The neural network in the present disclosure is obtained through successful training using an embodiment of the neural network training method of the present disclosure.
In an optional example, the eye open/closed state detection result that the neural network in the present disclosure outputs for an input eye image block may be at least one probability value, for example, a probability value indicating that the eye is in the open state and a probability value indicating that the eye is in the closed state. Both probability values may range from 0 to 1, and the two probability values for the same eye image block sum to 1. The closer the open-state probability value is to 1, the closer the eye in the eye image block is to the open state. The closer the closed-state probability value is to 1, the closer the eye in the eye image block is to the closed state.
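One common way to obtain two probability values that each lie between 0 and 1 and sum to 1 is a two-class softmax over the raw scores of the classification layer. The sketch below assumes that form; the function name `open_closed_probabilities` and its score parameters are hypothetical, and the disclosure does not prescribe how the probabilities are computed.

```python
import math
from typing import Tuple


def open_closed_probabilities(open_score: float, closed_score: float) -> Tuple[float, float]:
    """Turn two raw scores into (p_open, p_closed) with p_open + p_closed == 1."""
    # Subtract the larger score first so that exp() cannot overflow.
    largest = max(open_score, closed_score)
    e_open = math.exp(open_score - largest)
    e_closed = math.exp(closed_score - largest)
    total = e_open + e_closed
    return e_open / total, e_closed / total
```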
In an optional example, the present disclosure may further evaluate the temporally related eye open/closed state detection results output by the neural network, and thereby determine the eye action of the target object in multiple images to be processed that have a temporal relationship, for example, a rapid blink, a wink with one eye open and the other closed, or a squint.
In an optional example, the present disclosure may determine the facial expression of the target object in multiple images to be processed that have a temporal relationship, based on the temporally related eye open/closed state detection results output by the neural network and on the states of other organs of the target object's face. Examples of facial expressions include smiling, laughing, crying, or looking sad.
In an optional example, the present disclosure may further evaluate the temporally related eye open/closed state detection results output by the neural network, and thereby determine the fatigue state of the target object in multiple images to be processed that have a temporal relationship, for example, mild fatigue, dozing, or sound sleep.
In an optional example, the present disclosure may further evaluate the temporally related eye open/closed state detection results output by the neural network, and thereby determine the eye action of the target object in multiple images to be processed that have a temporal relationship. The interaction control information expressed by the target object in those images may then be determined at least according to the eye action.
In an optional example, the eye actions, facial expressions, fatigue states, and interaction control information determined by the present disclosure may be used by a variety of applications. For example, a predetermined eye action and/or facial expression of the target object may be used to trigger a predetermined special effect during a live broadcast or rebroadcast, or to implement corresponding human-computer interaction, which helps enrich the ways in which applications can be implemented. As another example, in intelligent driving, detecting the driver's fatigue state in real time helps prevent fatigued driving. The present disclosure does not limit the specific application of the eye open/closed state detection results output by the neural network.
FIG. 4 is a flowchart of an embodiment of the intelligent driving control method of the present disclosure. The method may be applied in an autonomous driving environment or in a cruise driving environment. The present disclosure does not limit the environments in which the intelligent driving control method is applied.
As shown in FIG. 4, the method of this embodiment includes steps S400, S410, S420, and S430. Each step in FIG. 4 is described in detail below.
S400: Acquire an image to be processed captured by a camera installed on a vehicle. For the specific implementation of this step, see the description of S300 in FIG. 3 in the method embodiment above; the details are not repeated here.
S410: Perform eye open/closed state detection on the image to be processed via a neural network, and output an eye open/closed state detection result. The neural network in this embodiment is obtained through successful training using an embodiment of the neural network training method described above. For the specific implementation of this step, see the description of S310 in FIG. 3 in the method embodiment above; the details are not repeated here.
S420: Determine the fatigue state of a target object at least according to the eye open/closed state detection results that belong to the same target object in multiple images to be processed having a temporal relationship.
In an optional example, the target object in the present disclosure is usually the driver of the vehicle. Based on multiple eye open/closed state detection results that belong to the same target object and have a temporal relationship, the present disclosure may determine index parameters of the target object (such as the driver), for example, the number of blinks per unit time, the duration of a single eye closure, or the duration of a single eye opening. By checking these index parameters against predetermined index requirements, it can be determined whether the target object (such as the driver) is in a fatigue state. The fatigue state in the present disclosure may include several degrees of fatigue, for example, a mild fatigue state, a moderate fatigue state, or a severe fatigue state. The present disclosure does not limit the specific way in which the fatigue state of the target object is determined.
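The index parameters above can be derived from a sequence of per-frame detection results roughly as follows. This is a sketch under stated assumptions: the function names `eye_metrics` and `fatigue_level`, the 0.5 decision threshold, the frame rate parameter, and the closure-duration limits of 0.5, 1.0, and 2.0 seconds are all hypothetical illustration values, not values taken from the disclosure.

```python
from typing import Dict, List


def eye_metrics(closed_probs: List[float], fps: float, threshold: float = 0.5) -> Dict[str, float]:
    """Summarize per-frame closed-eye probabilities of one target object."""
    blink_count = 0   # number of closed runs that ended with the eye reopening
    longest_run = 0   # longest run of consecutive closed frames
    run = 0
    for p_closed in closed_probs:
        if p_closed >= threshold:
            run += 1
            longest_run = max(longest_run, run)
        else:
            if run > 0:
                blink_count += 1
            run = 0
    duration_s = len(closed_probs) / fps
    blinks_per_minute = blink_count * 60.0 / duration_s if duration_s > 0 else 0.0
    return {"blinks_per_minute": blinks_per_minute, "longest_closure_s": longest_run / fps}


def fatigue_level(metrics: Dict[str, float], mild_s: float = 0.5,
                  moderate_s: float = 1.0, severe_s: float = 2.0) -> str:
    """Map the longest single eye closure to a fatigue degree (hypothetical limits)."""
    longest = metrics["longest_closure_s"]
    if longest >= severe_s:
        return "severe"
    if longest >= moderate_s:
        return "moderate"
    if longest >= mild_s:
        return "mild"
    return "none"
```

Only the longest closure is used for the decision here; a fuller rule set could also weigh the blink rate or the open-eye duration, which the text lists as further index parameters.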
S430: Generate a corresponding instruction according to the fatigue state of the target object, and output the instruction.
In an optional example, the instruction generated according to the fatigue state of the target object may include at least one of: an instruction to switch to an intelligent driving state, an instruction to issue a voice warning about fatigued driving, an instruction to wake the driver by vibration, and an instruction to report dangerous driving information. The present disclosure does not limit the specific form of the instruction.
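A simple way to picture step S430 is a lookup from fatigue degree to a list of instructions. The sketch below uses the four instruction types named in the text; the `Instruction` enum, the `POLICY` table, the level names, and the particular assignment of instructions to levels are hypothetical choices made for illustration.

```python
from enum import Enum
from typing import Dict, List


class Instruction(Enum):
    SWITCH_TO_INTELLIGENT_DRIVING = "switch_to_intelligent_driving"
    VOICE_WARNING = "voice_warning"
    VIBRATION_WAKE_UP = "vibration_wake_up"
    REPORT_DANGEROUS_DRIVING = "report_dangerous_driving"


# Hypothetical policy: stronger fatigue triggers more instructions.
POLICY: Dict[str, List[Instruction]] = {
    "none": [],
    "mild": [Instruction.VOICE_WARNING],
    "moderate": [Instruction.VOICE_WARNING, Instruction.VIBRATION_WAKE_UP],
    "severe": [
        Instruction.VOICE_WARNING,
        Instruction.VIBRATION_WAKE_UP,
        Instruction.SWITCH_TO_INTELLIGENT_DRIVING,
        Instruction.REPORT_DANGEROUS_DRIVING,
    ],
}


def instructions_for(fatigue_level: str) -> List[Instruction]:
    """Return the instructions to output for a fatigue level; raises KeyError if unknown."""
    return list(POLICY[fatigue_level])
```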
Because a neural network successfully trained with the neural network training method of the present disclosure helps improve the accuracy of the eye open/closed state detection results, using the detection results output by that neural network to judge the fatigue state helps improve the accuracy of fatigue state detection. Generating corresponding instructions according to the detected fatigue state therefore helps avoid fatigued driving, which in turn helps improve driving safety.
FIG. 5 is a schematic structural diagram of an embodiment of the neural network training apparatus of the present disclosure. The neural network training apparatus shown in FIG. 5 includes a neural network 500 for open/closed-eye detection to be trained and an adjustment module 510. Optionally, the apparatus may further include an input module 520.
The neural network 500 for open/closed-eye detection to be trained is configured to perform eye open/closed state detection on each of multiple eye images in the image sets corresponding to at least two open/closed-eye detection training tasks, and to output eye open/closed state detection results. The eye images contained in different image sets are at least partially different.
In an optional example, after it has been successfully trained, the neural network 500 for open/closed-eye detection of the present disclosure may be used to perform eye open/closed state detection on an image to be processed and to output the eye open/closed state detection result for that image. For example, for one image to be processed, the neural network 500 outputs two probability values. One probability value indicates the probability that the eye of the target object in the image to be processed is in the open state; the larger this value, the closer the eye is to the open state. The other probability value indicates the probability that the eye of the target object in the image to be processed is in the closed state; the larger this value, the closer the eye is to the closed state. The two probability values may sum to 1.
In an optional example, the neural network 500 in the present disclosure may be a convolutional neural network. The neural network 500 may include, but is not limited to, convolutional layers, ReLU layers (which may also be called activation layers), pooling layers, fully connected layers, and a layer for classification (such as binary classification). The more layers the neural network 500 contains, the deeper the network. The present disclosure does not limit the specific structure of the neural network 500.
In an optional example, the process of training the neural network 500 in the present disclosure involves at least two open/closed-eye detection training tasks, and each of them belongs to the overall training task of enabling the neural network to detect the eye open/closed state. The training targets of different open/closed-eye detection training tasks are not exactly the same. In other words, the present disclosure may divide the overall training task of the neural network 500 into multiple training tasks, each of which addresses one training target, with different training tasks addressing different training targets.
In an optional example, the at least two open/closed-eye detection training tasks in the present disclosure may include at least two of the following tasks: an open/closed-eye detection task for the case where the eye has an attachment, an open/closed-eye detection task for the case where the eye has no attachment, an open/closed-eye detection task in an indoor environment, an open/closed-eye detection task in an outdoor environment, an open/closed-eye detection task for the case where the eye has an attachment and there is a light spot on the attachment, and an open/closed-eye detection task for the case where the eye has an attachment and there is no light spot on the attachment. The attachment may be glasses, a transparent plastic sheet, or the like. The light spot may be a spot formed on the attachment by light reflected from the attachment. For a detailed description of the tasks listed above, see the description in the method embodiment above; the details are not repeated here.
In an optional example, each of the at least two open/closed-eye detection training tasks in the present disclosure has a corresponding image set. Each image set usually includes multiple eye images. The eye images contained in different image sets are at least partially different. That is, for any image set, at least some of its eye images do not appear in the other image sets. Optionally, the eye images contained in different image sets may overlap.
Optionally, the image sets corresponding to the six open/closed-eye detection training tasks listed above may be, respectively: a set of eye images in which the eye has an attachment, a set of eye images in which the eye has no attachment, a set of eye images captured in an indoor environment, a set of eye images captured in an outdoor environment, a set of eye images in which the eye has an attachment with a light spot on it, and a set of eye images in which the eye has an attachment with no light spot on it. For a detailed description of the image sets listed above, see the description in the method embodiment above; the details are not repeated here.
In an optional example, the image sets used in the present disclosure are determined by the open/closed-eye detection training tasks used in the present disclosure. For example, if the present disclosure uses at least two of the six open/closed-eye detection training tasks above, it uses the eye image sets corresponding to those at least two tasks.
In an optional example, an eye image in the present disclosure is usually an eye image block cropped from an image that contains an eye and was captured by a camera. For the process by which eye images are formed, see the description in the method embodiment above; the details are not repeated here.
In an optional example, the eye images used to train the neural network 500 for open/closed-eye detection usually carry annotation information, and the annotation information can indicate the open/closed state of the eye in the eye image. Optionally, the annotation information may also indicate that the eye in the eye image is in an uncertain open/closed state. However, the eye images used to train the neural network 500 usually do not include eye images annotated as being in the uncertain open/closed state. This helps avoid the influence of such images on the neural network 500 and helps improve the detection accuracy of the neural network 500 for open/closed-eye detection.
The input module 520 is configured to obtain a corresponding number of eye images from different image sets and to provide them to the neural network 500 for open/closed-eye detection to be trained. For example, according to a preset ratio of image quantities for the different open/closed-eye detection training tasks, the input module 520 obtains, for each task, a corresponding number of eye images from the corresponding image set and provides them to the neural network 500 to be trained. In addition, when obtaining eye images, the input module 520 usually also takes a preset batch size into account. For example, if the preset image quantity ratio for open/closed-eye detection training task a, training task b, and training task c is 1:1:1 and the preset batch size is 600, the input module 520 may obtain 200 eye images from the eye image set corresponding to training task a, 200 eye images from the eye image set corresponding to training task b, and 200 eye images from the eye image set corresponding to training task c.
Optionally, if the number of eye images in the eye image set corresponding to some open/closed-eye detection training task falls short of the required number (for example, fewer than 200), the input module 520 may obtain a corresponding number of eye images from the eye image sets corresponding to the other training tasks in order to reach the batch size. For example, suppose the eye image set corresponding to training task c contains only 100 eye images, while the eye image sets corresponding to training task a and training task b each contain more than 250 eye images. The input module 520 may then obtain 250 eye images from the eye image set corresponding to training task a, 250 eye images from the eye image set corresponding to training task b, and 100 eye images from the eye image set corresponding to training task c, so that it obtains 600 eye images in total.
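The quota calculation in the two preceding paragraphs can be sketched as below. The function name `allocate_batch` and its parameters are hypothetical, and the round-robin way of spreading a shortfall over the remaining tasks is one possible choice that reproduces the 250/250/100 example; the disclosure does not fix a particular redistribution rule.

```python
from typing import Dict


def allocate_batch(ratios: Dict[str, int], available: Dict[str, int],
                   batch_size: int) -> Dict[str, int]:
    """Decide how many eye images to draw from each task's image set."""
    total_ratio = sum(ratios.values())
    # Start from the preset ratio, capped by what each image set actually holds.
    counts = {
        task: min(batch_size * ratio // total_ratio, available[task])
        for task, ratio in ratios.items()
    }
    shortfall = batch_size - sum(counts.values())
    # Hand the missing images, one at a time, to tasks that still have spare images.
    while shortfall > 0:
        spare = [task for task in ratios if available[task] > counts[task]]
        if not spare:
            break  # every set is exhausted, so the batch stays smaller than requested
        for task in spare:
            if shortfall == 0:
                break
            counts[task] += 1
            shortfall -= 1
    return counts
```

With a 1:1:1 ratio, a batch size of 600, and image sets of 300, 300, and 100 images, this yields 250, 250, and 100 images, matching the example in the text.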
It should be noted that the input module 520 may also use randomly set quantities when obtaining the corresponding numbers of eye images from the eye image sets of the different training tasks. The present disclosure does not limit the specific way in which the input module 520 obtains the corresponding numbers of eye images from the eye image sets of the different training tasks. In addition, when obtaining eye images from an eye image set, the input module 520 should avoid eye images annotated as being in the uncertain open/closed state, which helps improve the detection accuracy of the neural network for open/closed-eye detection.
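Excluding images annotated as uncertain while sampling can be sketched as follows. The label strings `"open"`, `"closed"`, and `"uncertain"`, the `(image_id, label)` tuple layout, and the function name `sample_trainable` are assumptions made for this illustration.

```python
import random
from typing import List, Tuple

LabeledImage = Tuple[str, str]  # (image id, label), label in {"open", "closed", "uncertain"}


def sample_trainable(image_set: List[LabeledImage], count: int,
                     rng: random.Random) -> List[LabeledImage]:
    """Randomly draw up to `count` images, skipping those annotated as uncertain."""
    usable = [item for item in image_set if item[1] != "uncertain"]
    return rng.sample(usable, min(count, len(usable)))
```

Passing an explicit `random.Random` instance keeps the draw reproducible when a fixed seed is wanted.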
In an optional example, the input module 520 may provide the obtained eye images one after another to the neural network 500 for open/closed-eye detection to be trained. The neural network 500 performs eye open/closed state detection on each input eye image and outputs the eye open/closed state detection result of each eye image in turn. For example, after an eye image input to the neural network 500 has been processed in sequence by the convolutional layers, the fully connected layers, and the layer for classification, the neural network 500 outputs two probability values. Each probability value ranges from 0 to 1, and the two probability values sum to 1. One probability value corresponds to the open state; the closer it is to 1, the closer the eye in the eye image is to the open state. The other probability value corresponds to the closed state; the closer it is to 1, the closer the eye in the eye image is to the closed state.
The adjustment module 510 is configured to determine, according to the eye open/closed annotation information of the eye images and the eye open/closed state detection results output by the neural network 500, the loss corresponding to each of the at least two open/closed-eye detection training tasks, and to adjust the network parameters of the neural network 500 according to the losses corresponding to the at least two open/closed-eye detection training tasks.
In an optional example, the adjustment module 510 determines the loss corresponding to each open/closed-eye detection training task and determines a comprehensive loss from the losses of all training tasks. The adjustment module 510 then uses the comprehensive loss to adjust the network parameters of the neural network. The network parameters in the present disclosure may include, but are not limited to, convolution kernel parameters and/or matrix weights. The present disclosure does not limit the specific content of the network parameters.
In an optional example, for any open/closed-eye detection training task, the adjustment module 510 may determine the loss corresponding to that training task according to the angle between the maximum probability value in the eye open/closed state detection results that the neural network outputs for the eye images in the image set corresponding to the training task, and the decision boundary corresponding to the annotation information of the respective eye image in that image set.
Optionally, the adjustment module 510 may use an A-softmax (angular softmax) loss function to determine, according to the eye open/closed annotation information of the eye images and the eye open/closed state detection results output by the neural network, the loss corresponding to each open/closed-eye detection training task, and may determine a comprehensive loss (such as the sum of the losses) from the losses of the different training tasks. The adjustment module 510 then adjusts the network parameters of the neural network by stochastic gradient descent. For example, the adjustment module 510 may use the A-softmax loss function to compute the loss corresponding to each open/closed-eye detection training task and perform back-propagation based on the sum of the losses of all training tasks, so that the network parameters of the neural network 500 to be trained are updated in the direction of decreasing loss gradient.
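The structure of the comprehensive loss, one loss per training task followed by a sum, can be sketched as follows. Note the substitution: plain binary cross-entropy stands in for the A-softmax loss named in the text, purely to keep the example short, so this shows how the per-task losses are combined rather than the angular margin itself. The names `task_loss` and `comprehensive_loss` and the `(p_open, label)` sample layout are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Sample = Tuple[float, int]  # (predicted probability of "open", label: 1 = open, 0 = closed)


def task_loss(samples: List[Sample]) -> float:
    """Mean cross-entropy over one task's samples (stand-in for A-softmax)."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for p_open, label in samples:
        p_true = p_open if label == 1 else 1.0 - p_open
        total += -math.log(max(p_true, eps))
    return total / len(samples)


def comprehensive_loss(batches: Dict[str, List[Sample]]) -> float:
    """Sum of the per-task losses; this is the value that would be back-propagated."""
    return sum(task_loss(samples) for samples in batches.values())
```

In a training framework, the summed value would be the scalar on which back-propagation and the stochastic gradient descent update are run.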
In an optional example, when the training of the neural network 500 for open/closed-eye detection reaches a predetermined iteration condition, the adjustment module 510 may end the current training process. The predetermined iteration condition in the present disclosure may include: the difference between the eye open/closed state detection results that the neural network 500 to be trained outputs for the eye images and the annotation information of those eye images meets a predetermined difference requirement. When the difference meets the predetermined difference requirement, the current training of the neural network 500 has been completed successfully.
Optionally, the predetermined iteration condition used by the adjustment module 510 may also include: the number of eye images used to train the neural network for open/closed-eye detection reaches a predetermined quantity requirement. If the number of eye images used reaches the predetermined quantity requirement but the difference does not meet the predetermined difference requirement, the current training of the neural network 500 has not succeeded. A neural network 500 that has been successfully trained can be used for eye open/closed state detection.
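The two iteration conditions can be combined into a single check, sketched below. The function name `training_status`, the returned status strings, and the use of one scalar to represent the difference between detection results and annotations are assumptions for illustration.

```python
def training_status(difference: float, images_used: int,
                    max_difference: float, max_images: int) -> str:
    """Decide whether training succeeded, failed, or should continue."""
    if difference <= max_difference:
        return "succeeded"   # the difference requirement is met
    if images_used >= max_images:
        return "failed"      # image budget used up without meeting the requirement
    return "continue"
```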
FIG. 6 is a schematic structural diagram of an embodiment of the eye open/closed state detection apparatus of the present disclosure. As shown in FIG. 6, the apparatus of this embodiment includes an acquisition module 600 and a neural network 610. Optionally, the eye open/closed state detection apparatus may further include a determination module 620.
The acquisition module 600 is configured to acquire an image to be processed.
In an optional example, the image to be processed acquired by the acquisition module 600 may be a static image such as a picture or a photograph, or a video frame from a video, for example, a video frame from a video captured by a camera mounted on a moving object, or a video frame from a video captured by a camera installed at a fixed position. The moving object may be a vehicle, a robot, a robotic arm, or the like. The fixed position may be a desktop, a wall, or the like.
In an optional example, after acquiring the image to be processed, the acquisition module 600 may detect the region in which the eyes are located in that image. For example, the acquisition module 600 may determine the eye bounding box in the image to be processed by face detection, face key point detection, or a similar method. The acquisition module 600 may then crop the eye region from the image to be processed according to the eye bounding box, and the cropped eye image block is provided to the neural network 610. The acquisition module 600 may also apply some preprocessing to the cropped eye image block before providing it to the neural network 610. For example, the acquisition module 600 may scale the cropped eye image block so that its size meets the input size requirement of the neural network 610. As another example, after the eye image blocks of the two eyes of the target object have been cropped, the acquisition module 600 may apply mapping processing to the eye image block of a predetermined side, so that two same-side eye image blocks of the target object are formed; optionally, the acquisition module 600 may also scale the two same-side eye image blocks. The present disclosure does not limit the specific way in which the acquisition module 600 crops eye image blocks from the image to be processed, nor the specific way in which the acquisition module 600 preprocesses the cropped eye image blocks.
The neural network 610 is configured to perform eye open/closed state detection on the image to be processed and to output an eye open/closed state detection result.
In an optional example, the eye open/closed state detection result that the neural network 610 in the present disclosure outputs for an input eye image block may be at least one probability value, for example, a probability value indicating that the eye is in the open state and a probability value indicating that the eye is in the closed state. Both probability values may range from 0 to 1, and the two probability values for the same eye image block sum to 1. The closer the open-state probability value is to 1, the closer the eye in the eye image block is to the open state. The closer the closed-state probability value is to 1, the closer the eye in the eye image block is to the closed state.
The determining module 620 is configured to determine the eye action and/or facial expression and/or fatigue state and/or interactive control information of a target object at least according to the eye open/closed state detection results that belong to the same target object in multiple images to be processed having a temporal relationship.
In an optional example, the eye action of the target object may be, for example, a rapid blink, a wink (one eye open and one eye closed), or a squint. The facial expression of the target object may be, for example, a smile, a laugh, crying, or sorrow. The fatigue state of the target object may be, for example, mild fatigue, dozing, or sound sleep. The interactive control information expressed by the target object may be, for example, confirmation or denial.
FIG. 7 is a schematic structural diagram of an embodiment of the intelligent driving control apparatus of the present disclosure. The apparatus in FIG. 7 mainly includes an acquisition module 600, a neural network 610, a fatigue state determining module 700, and an instruction module 710.
The acquisition module 600 is configured to acquire an image to be processed captured by a camera installed on a vehicle.
The neural network 610 is configured to perform eye open/closed state detection processing on the image to be processed and to output an eye open/closed state detection result.
For the specific operations performed by the acquisition module 600 and the neural network 610, refer to the description in the foregoing apparatus embodiment; it is not repeated here.
The fatigue state determining module 700 is configured to determine the fatigue state of a target object at least according to the eye open/closed state detection results that belong to the same target object in multiple images to be processed having a temporal relationship.
In an optional example, the target object in the present disclosure is typically a driver. Based on multiple eye open/closed state detection results that belong to the same target object and have a temporal relationship, the fatigue state determining module 700 may determine index parameters of the target object (e.g., the driver), such as the number of blinks per unit time, the duration of a single eye closure, or the duration of a single eye opening. By checking these index parameters against predetermined index requirements, the fatigue state determining module 700 can determine whether the target object (e.g., the driver) is in a fatigue state. The fatigue state in the present disclosure may include multiple degrees of fatigue, for example, mild fatigue, moderate fatigue, or severe fatigue. The present disclosure does not limit the specific manner in which the fatigue state determining module 700 determines the fatigue state of the target object.
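The index parameters mentioned above can be derived from a time-ordered sequence of detection results roughly as sketched below. The disclosure leaves the index requirements open, so the 0.5 closed-probability threshold, the 1-second and 2-second closure limits, the 30-blinks-per-minute limit, and the function names are hypothetical values chosen only to make the example concrete.

```python
def closed_intervals(frames, threshold=0.5):
    """Return (start, end) times of each eye closure.

    frames: list of (timestamp_seconds, closed_probability), sorted by time.
    A closure starts at the first closed frame and ends at the next open frame
    (or at the last closed frame if the sequence ends while the eye is closed).
    """
    intervals = []
    start = None
    last_closed = None
    for timestamp, p_closed in frames:
        if p_closed >= threshold:
            if start is None:
                start = timestamp
            last_closed = timestamp
        elif start is not None:
            intervals.append((start, timestamp))
            start = None
    if start is not None:
        intervals.append((start, last_closed))
    return intervals


def fatigue_level(frames, threshold=0.5, long_closure_s=1.0, blinks_per_min_limit=30.0):
    """Grade fatigue from the longest single closure and the blink rate (hypothetical rules)."""
    if not frames:
        return "none"
    intervals = closed_intervals(frames, threshold)
    duration = frames[-1][0] - frames[0][0]
    longest = max((end - start for start, end in intervals), default=0.0)
    blink_rate = len(intervals) * 60.0 / duration if duration > 0 else 0.0
    if longest >= 2 * long_closure_s:
        return "severe"
    if longest >= long_closure_s:
        return "moderate"
    if blink_rate > blinks_per_min_limit:
        return "mild"
    return "none"
```

For example, a 4-second sequence sampled at 10 frames per second in which the eye stays closed from 1.0 s to 2.5 s yields one 1.5-second closure, which these illustrative thresholds grade as moderate fatigue.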
The instruction module 710 is configured to form a corresponding instruction according to the fatigue state of the target object and to output the instruction.
In an optional example, the instruction generated by the instruction module 710 according to the fatigue state of the target object may include at least one of: an instruction to switch to an intelligent driving state, an instruction to issue a voice warning of fatigued driving, an instruction to wake the driver by vibration, and an instruction to report dangerous driving information. The present disclosure does not limit the specific form of the instruction.
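A mapping from fatigue state to instructions might look like the following sketch. The four instruction kinds come from the paragraph above; the enum names, the fatigue level labels, and which instructions are paired with which level are assumptions, since the disclosure does not fix a policy.

```python
from enum import Enum


class Instruction(Enum):
    SWITCH_TO_INTELLIGENT_DRIVING = "switch_to_intelligent_driving"
    VOICE_WARNING = "voice_warning_fatigued_driving"
    VIBRATION_WAKE = "vibration_wake_driver"
    REPORT_DANGEROUS_DRIVING = "report_dangerous_driving"


# Hypothetical policy: stronger responses for more severe fatigue.
POLICY = {
    "none": [],
    "mild": [Instruction.VOICE_WARNING],
    "moderate": [Instruction.VOICE_WARNING, Instruction.VIBRATION_WAKE],
    "severe": [
        Instruction.VIBRATION_WAKE,
        Instruction.SWITCH_TO_INTELLIGENT_DRIVING,
        Instruction.REPORT_DANGEROUS_DRIVING,
    ],
}


def instructions_for(level):
    """Return the list of instructions to output for a fatigue level."""
    if level not in POLICY:
        raise ValueError(f"unknown fatigue level: {level!r}")
    # Copy so that callers cannot modify the shared policy table.
    return list(POLICY[level])
```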
Because the neural network 610 successfully trained with the neural network training method of the present disclosure helps improve the accuracy of the eye open/closed state detection result, having the fatigue state determining module 700 judge the fatigue state from the detection result output by the neural network 610 helps improve the accuracy of fatigue state detection. The instruction module 710 then forms a corresponding instruction according to the detected fatigue state, which helps avoid fatigued driving and thus helps improve driving safety.
Exemplary device
FIG. 8 shows an exemplary device 800 suitable for implementing the present disclosure. The device 800 may be a control system/electronic system configured in a vehicle, a mobile terminal (e.g., a smartphone), a personal computer (PC, e.g., a desktop or notebook computer), a tablet computer, a server, or the like. In FIG. 8, the device 800 includes one or more processors, a communication unit, and so on. The one or more processors may be one or more central processing units (CPUs) 801 and/or one or more acceleration units 813; the acceleration unit 813 may be a graphics processing unit (GPU) or the like. The processor can perform various appropriate actions and processing according to executable instructions stored in a read-only memory (ROM) 802 or executable instructions loaded from a storage section 808 into a random access memory (RAM) 803. The communication unit 812 may include, but is not limited to, a network card, and the network card may include, but is not limited to, an IB (InfiniBand) network card. The processor may communicate with the ROM 802 and/or the RAM 803 to execute the executable instructions, is connected to the communication unit 812 via a bus 804, and communicates with other target devices via the communication unit 812, thereby completing the corresponding steps of the present disclosure.
For the operations performed by the above instructions, refer to the related description in the foregoing method embodiments; they are not described in detail here. In addition, the RAM 803 may also store various programs and data required for operation of the apparatus. The CPU 801, the ROM 802, and the RAM 803 are connected to one another via the bus 804.
Where the RAM 803 is present, the ROM 802 is an optional module. The RAM 803 stores executable instructions, or executable instructions are written into the ROM 802 at run time, and the executable instructions cause the CPU 801 to perform the steps of the above methods. An input/output (I/O) interface 805 is also connected to the bus 804. The communication unit 812 may be integrated, or may be provided with multiple sub-modules (e.g., multiple IB network cards), each connected to the bus.
The following components are connected to the I/O interface 805: an input section 806 including a keyboard, a mouse, and the like; an output section 807 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage section 808 including a hard disk and the like; and a communication section 809 including a network interface card such as a LAN card or a modem. The communication section 809 performs communication processing via a network such as the Internet. A drive 810 is also connected to the I/O interface 805 as needed. A removable medium 811, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 810 as needed, so that a computer program read from it can be installed into the storage section 808 as needed.
It should be noted that the architecture shown in FIG. 8 is only one optional implementation. In practice, the number and types of the components in FIG. 8 may be selected, removed, added, or replaced according to actual needs. Different functional components may also be provided separately or in an integrated manner. For example, the acceleration unit 813 and the CPU 801 may be provided separately, or the acceleration unit 813 may be integrated on the CPU 801; the communication unit may be provided separately, or may be integrated on the CPU 801 or the acceleration unit 813. These alternative implementations all fall within the scope of protection of the present disclosure.
In particular, according to embodiments of the present disclosure, the processes described below with reference to the flowcharts may be implemented as computer software programs. For example, an embodiment of the present disclosure includes a computer program product that includes a computer program tangibly embodied on a machine-readable medium; the computer program contains program code for performing the steps shown in the flowcharts, and the program code may include instructions corresponding to the steps of the methods provided by the present disclosure.
In such an embodiment, the computer program may be downloaded and installed from a network via the communication section 809 and/or installed from the removable medium 811. When the computer program is executed by the central processing unit (CPU) 801, the instructions described in the present disclosure for implementing the corresponding steps are executed.
In one or more optional implementations, embodiments of the present disclosure further provide a computer program product for storing computer-readable instructions that, when executed, cause a computer to perform the neural network training method, the eye open/closed state detection method, or the intelligent driving control method described in any of the foregoing embodiments.
The computer program product may be implemented in hardware, software, or a combination thereof. In one optional example, the computer program product is embodied as a computer storage medium; in another optional example, the computer program product is embodied as a software product, such as a software development kit (SDK).
In one or more optional implementations, embodiments of the present disclosure further provide another eye open/closed state detection method, intelligent driving control method, and neural network training method, together with corresponding apparatuses, electronic devices, computer storage media, computer programs, and computer program products. The method includes: a first apparatus sends a neural network training indication, an eye open/closed state detection indication, or an intelligent driving control indication to a second apparatus, the indication causing the second apparatus to perform the neural network training method, the eye open/closed state detection method, or the intelligent driving control method in any of the foregoing possible embodiments; and the first apparatus receives the neural network training result, the eye open/closed state detection result, or the intelligent driving control result sent by the second apparatus.
In some embodiments, the neural network training indication, the eye open/closed state detection indication, or the intelligent driving control indication may specifically be a call instruction. The first apparatus may, by way of a call, instruct the second apparatus to perform the neural network training operation, the eye open/closed state detection operation, or the intelligent driving control operation. Accordingly, in response to receiving the call instruction, the second apparatus may perform the steps and/or procedures of any embodiment of the above neural network training method, eye open/closed state detection method, or intelligent driving control method.
It should be understood that terms such as "first" and "second" in the embodiments of the present disclosure are used only for distinction and should not be construed as limiting the embodiments of the present disclosure. It should also be understood that, in the present disclosure, "multiple" may mean two or more, and "at least one" may mean one, two, or more. It should also be understood that any component, data, or structure mentioned in the present disclosure may generally be understood as one or more, unless explicitly limited or unless the context indicates otherwise. It should also be understood that the description of the embodiments in the present disclosure emphasizes the differences between the embodiments; for the same or similar parts, the embodiments may be referred to one another, and for brevity these parts are not repeated.
The methods and apparatuses, electronic devices, and computer-readable storage media of the present disclosure may be implemented in many ways, for example, by software, hardware, firmware, or any combination of software, hardware, and firmware. The above order of the method steps is for illustration only; the steps of the methods of the present disclosure are not limited to the order specifically described above unless otherwise specifically stated. In addition, in some implementations, the present disclosure may also be implemented as programs recorded on a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure. The present disclosure therefore also covers a recording medium storing a program for executing the methods according to the present disclosure.
The description of the present disclosure is given for purposes of illustration and description; it is not exhaustive and does not limit the present disclosure to the disclosed form. Many modifications and variations will be apparent to those of ordinary skill in the art. The implementations were selected and described to better explain the principles and practical applications of the present disclosure, and to enable those of ordinary skill in the art to understand the embodiments of the present disclosure and thereby design various implementations with various modifications suited to particular uses.

Claims (19)

  1. A neural network training method, characterized by comprising:
    performing, via a neural network to be trained for eye open/closed detection, eye open/closed state detection processing on multiple eye images in the image sets respectively corresponding to at least two eye open/closed detection training tasks, and outputting eye open/closed state detection results, wherein the eye images contained in different image sets are at least partially different; and
    determining, according to eye open/closed annotation information of the eye images and the eye open/closed state detection results output by the neural network, the loss corresponding to each of the at least two eye open/closed detection training tasks, and adjusting network parameters of the neural network according to the losses respectively corresponding to the at least two eye open/closed detection training tasks.
  2. The method according to claim 1, characterized in that:
    the at least two eye open/closed detection training tasks comprise at least two of the following tasks: an eye open/closed detection task for the case where the eye has an attachment, an eye open/closed detection task for the case where the eye has no attachment, an eye open/closed detection task in an indoor environment, an eye open/closed detection task in an outdoor environment, an eye open/closed detection task for the case where the eye has an attachment with a light spot on the attachment, and an eye open/closed detection task for the case where the eye has an attachment with no light spot on the attachment; and
    the image sets respectively corresponding to the at least two eye open/closed detection training tasks comprise the corresponding at least two of the following image sets: an eye image set in which the eye has an attachment, an eye image set in which the eye has no attachment, an eye image set captured in an indoor environment, an eye image set captured in an outdoor environment, an eye image set in which the eye has an attachment with a light spot on the attachment, and an eye image set in which the eye has an attachment with no light spot on the attachment.
  3. The method according to claim 1 or 2, characterized in that the performing, via the neural network to be trained for eye open/closed detection, eye open/closed state detection processing on multiple eye images in the image sets respectively corresponding to at least two eye open/closed detection training tasks, and outputting eye open/closed state detection results comprises:
    obtaining, according to a preset ratio of image quantities for the different eye open/closed detection training tasks, a corresponding number of eye images from the different image sets for the different eye open/closed detection training tasks; and
    performing, via the neural network to be trained for eye open/closed detection, eye open/closed state detection processing on the corresponding number of eye images, and outputting the eye open/closed state detection result corresponding to each eye image.
  4. The method according to any one of claims 1 to 3, characterized in that the determining, according to the eye open/closed annotation information of the eye images and the eye open/closed state detection results output by the neural network, the loss corresponding to each of the at least two eye open/closed detection training tasks comprises:
    for any eye open/closed detection training task, determining the loss corresponding to the training task according to the angle between the maximum probability value in the eye open/closed state detection results that the neural network respectively outputs for multiple eye images in the image set corresponding to the training task and the decision boundary corresponding to the annotation information of the corresponding eye images in the image set.
  5. The method according to any one of claims 1 to 4, characterized in that the adjusting network parameters of the neural network according to the losses respectively corresponding to the at least two eye open/closed detection training tasks comprises:
    determining a combined loss of the at least two eye open/closed detection training tasks according to the losses respectively corresponding to the at least two eye open/closed detection training tasks; and
    adjusting the network parameters of the neural network according to the combined loss.
  6. An eye open/closed state detection method, characterized by comprising:
    acquiring an image to be processed; and
    performing, via a neural network, eye open/closed state detection processing on the image to be processed, and outputting an eye open/closed state detection result;
    wherein the neural network is obtained by training with the method according to any one of claims 1 to 5.
  7. The method according to claim 6, characterized in that the eye open/closed state detection method further comprises:
    determining the eye action and/or facial expression and/or fatigue state and/or interactive control information of a target object at least according to the eye open/closed state detection results that belong to the same target object in multiple images to be processed having a temporal relationship.
  8. An intelligent driving control method, characterized by comprising:
    acquiring an image to be processed captured by a camera installed on a vehicle;
    performing, via a neural network, eye open/closed state detection processing on the image to be processed, and outputting an eye open/closed state detection result;
    determining the fatigue state of a target object at least according to the eye open/closed state detection results that belong to the same target object in multiple images to be processed having a temporal relationship; and
    forming a corresponding instruction according to the fatigue state of the target object, and outputting the instruction;
    wherein the neural network is obtained by training with the method according to any one of claims 1 to 5.
  9. A neural network training apparatus, characterized by comprising:
    a neural network to be trained for eye open/closed detection, configured to perform eye open/closed state detection processing on multiple eye images in the image sets respectively corresponding to at least two eye open/closed detection training tasks, and to output eye open/closed state detection results, wherein the eye images contained in different image sets are at least partially different; and
    an adjustment module, configured to determine, according to eye open/closed annotation information of the eye images and the eye open/closed state detection results output by the neural network, the loss corresponding to each of the at least two eye open/closed detection training tasks, and to adjust network parameters of the neural network according to the losses respectively corresponding to the at least two eye open/closed detection training tasks.
  10. The apparatus according to claim 9, characterized in that:
    the at least two eye open/closed detection training tasks comprise at least two of the following tasks: an eye open/closed detection task for the case where the eye has an attachment, an eye open/closed detection task for the case where the eye has no attachment, an eye open/closed detection task in an indoor environment, an eye open/closed detection task in an outdoor environment, an eye open/closed detection task for the case where the eye has an attachment with a light spot on the attachment, and an eye open/closed detection task for the case where the eye has an attachment with no light spot on the attachment; and
    the image sets respectively corresponding to the at least two eye open/closed detection training tasks comprise the corresponding at least two of the following image sets: an eye image set in which the eye has an attachment, an eye image set in which the eye has no attachment, an eye image set captured in an indoor environment, an eye image set captured in an outdoor environment, an eye image set in which the eye has an attachment with a light spot on the attachment, and an eye image set in which the eye has an attachment with no light spot on the attachment.
  11. The apparatus according to claim 9 or 10, characterized in that the apparatus further comprises:
    an input module, configured to obtain, according to a preset ratio of image quantities for the different eye open/closed detection training tasks, a corresponding number of eye images from the different image sets for the different eye open/closed detection training tasks, and to provide them to the neural network to be trained for eye open/closed detection;
    wherein the neural network to be trained for eye open/closed detection performs eye open/closed state detection processing on the corresponding number of eye images and outputs the eye open/closed state detection result corresponding to each eye image.
  12. The apparatus according to any one of claims 9 to 11, characterized in that the adjustment module is further configured to:
    for any eye open/closed detection training task, determine the loss corresponding to the training task according to the angle between the maximum probability value in the eye open/closed state detection results that the neural network respectively outputs for multiple eye images in the image set corresponding to the training task and the decision boundary corresponding to the annotation information of the corresponding eye images in the image set.
  13. The apparatus according to any one of claims 9 to 12, characterized in that the adjustment module is further configured to:
    determine a combined loss of the at least two eye open/closed detection training tasks according to the losses respectively corresponding to the at least two eye open/closed detection training tasks; and
    adjust the network parameters of the neural network according to the combined loss.
  14. An eye open/closed state detection apparatus, characterized by comprising:
    an acquisition module, configured to acquire an image to be processed; and
    a neural network, configured to perform eye open/closed state detection processing on the image to be processed and to output an eye open/closed state detection result;
    wherein the neural network is obtained by training with the apparatus according to any one of claims 9 to 13.
  15. The apparatus according to claim 14, characterized in that the eye open/closed state detection apparatus further comprises:
    a determining module, configured to determine the eye action and/or facial expression and/or fatigue state and/or interactive control information of a target object at least according to the eye open/closed state detection results that belong to the same target object in multiple images to be processed having a temporal relationship.
  16. An intelligent driving control apparatus, characterized by comprising:
    an acquisition module, configured to acquire an image to be processed captured by a camera installed on a vehicle;
    a neural network, configured to perform eye open/closed state detection processing on the image to be processed and to output an eye open/closed state detection result;
    a fatigue state determining module, configured to determine the fatigue state of a target object at least according to the eye open/closed state detection results that belong to the same target object in multiple images to be processed having a temporal relationship; and
    an instruction module, configured to form a corresponding instruction according to the fatigue state of the target object and to output the instruction;
    wherein the neural network is obtained by training with the apparatus according to any one of claims 9 to 13.
  17. An electronic device, comprising:
    a memory, configured to store a computer program; and
    a processor, configured to execute the computer program stored in the memory, wherein the method according to any one of claims 1 to 8 is implemented when the computer program is executed.
  18. A computer-readable storage medium storing a computer program, wherein the method according to any one of claims 1 to 8 is implemented when the computer program is executed by a processor.
  19. A computer program comprising computer instructions, wherein the method according to any one of claims 1 to 8 is implemented when the computer instructions run in a processor of a device.
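
Illustrative sketch (not part of the claims): claims 15 and 16 recite determining an eye action or a fatigue state from time-ordered eye open/closed detection results of the same target object, and forming an instruction from the fatigue state, but they do not fix a particular rule. The Python sketch below shows one possible way such modules might be arranged. The function names, the closed-frame ratio rule, the thresholds (`ratio_threshold`, `max_closed_frames`) and the instruction strings are all assumptions made for illustration and are not taken from the patent.

```python
from typing import Sequence


def count_blinks(closed: Sequence[bool]) -> int:
    """Count closed-to-open transitions in a time-ordered sequence.

    Each element is a hypothetical per-frame detection result for one
    target object: True means "eyes closed", False means "eyes open".
    """
    blinks = 0
    for prev, cur in zip(closed, closed[1:]):
        if prev and not cur:
            blinks += 1
    return blinks


def closed_ratio(closed: Sequence[bool]) -> float:
    """Fraction of frames in which the eyes were detected as closed."""
    return sum(closed) / len(closed) if closed else 0.0


def longest_closed_run(closed: Sequence[bool]) -> int:
    """Length, in frames, of the longest uninterrupted closed-eye run."""
    longest = current = 0
    for is_closed in closed:
        current = current + 1 if is_closed else 0
        longest = max(longest, current)
    return longest


def fatigue_state(
    closed: Sequence[bool],
    ratio_threshold: float = 0.4,   # assumed value, not from the patent
    max_closed_frames: int = 15,    # assumed value, not from the patent
) -> str:
    """Map a window of per-frame results to a coarse fatigue label."""
    if (closed_ratio(closed) >= ratio_threshold
            or longest_closed_run(closed) >= max_closed_frames):
        return "fatigued"
    return "normal"


def make_instruction(state: str) -> str:
    """Form a hypothetical instruction from the fatigue state."""
    return "issue_warning" if state == "fatigued" else "no_action"


if __name__ == "__main__":
    window = [False, True, True, False, True, False]
    print(count_blinks(window), fatigue_state(window),
          make_instruction(fatigue_state(window)))
```

In this sketch the neural network of claim 14 is assumed to have already produced one open/closed decision per frame; a real system would also need to associate detections with the same target object across frames, which is outside the scope of the example.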
PCT/CN2019/118127 2019-02-28 2019-11-13 Neural network training and eye opening and closing state detection method, apparatus, and device WO2020173135A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2021541183A JP7227385B2 (en) 2019-02-28 2019-11-13 Neural network training and eye open/close state detection method, apparatus and equipment
KR1020217023286A KR20210113621A (en) 2019-02-28 2019-11-13 Method, apparatus and apparatus for training neural network and detecting eye opening/closing state

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910153463.4 2019-02-28
CN201910153463.4A CN111626087A (en) 2019-02-28 2019-02-28 Neural network training and eye opening and closing state detection method, device and equipment

Publications (1)

Publication Number Publication Date
WO2020173135A1 (en) 2020-09-03

Family

ID=72238751

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/118127 WO2020173135A1 (en) 2019-02-28 2019-11-13 Neural network training and eye opening and closing state detection method, apparatus, and device

Country Status (4)

Country Link
JP (1) JP7227385B2 (en)
KR (1) KR20210113621A (en)
CN (1) CN111626087A (en)
WO (1) WO2020173135A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114283488A (en) * 2022-03-08 2022-04-05 北京万里红科技有限公司 Method for generating detection model and method for detecting eye state by using detection model

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113313790A (en) * 2021-05-31 2021-08-27 北京字跳网络技术有限公司 Video generation method, device, equipment and storage medium
CN113537176A (en) * 2021-09-16 2021-10-22 武汉未来幻影科技有限公司 Method, device and equipment for determining fatigue state of driver

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106585629A (en) * 2016-12-06 2017-04-26 广州市科恩电脑有限公司 Automobile control method and device
US20170294010A1 (en) * 2016-04-12 2017-10-12 Adobe Systems Incorporated Utilizing deep learning for rating aesthetics of digital images
CN108805185A (en) * 2018-05-29 2018-11-13 腾讯科技(深圳)有限公司 Training method, device, storage medium and the computer equipment of model
CN108985135A (en) * 2017-06-02 2018-12-11 腾讯科技(深圳)有限公司 A kind of human-face detector training method, device and electronic equipment
WO2019028798A1 (en) * 2017-08-10 2019-02-14 北京市商汤科技开发有限公司 Method and device for monitoring driving condition, and electronic device

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4687150B2 (en) 2005-03-08 2011-05-25 日産自動車株式会社 Direct light detector
JP4770218B2 (en) 2005-03-22 2011-09-14 日産自動車株式会社 Visual behavior determination device
JP4978227B2 (en) 2007-02-15 2012-07-18 トヨタ自動車株式会社 Image detection device
CN107003834B (en) * 2014-12-15 2018-07-06 北京市商汤科技开发有限公司 Pedestrian detection device and method
JP2016176699A (en) 2015-03-18 2016-10-06 株式会社オートネットワーク技術研究所 Route search device, route search method, and computer program
JP6582604B2 (en) 2015-06-23 2019-10-02 富士通株式会社 Pupil detection program, pupil detection method, pupil detection device, and gaze detection system
JP6762794B2 (en) 2016-07-29 2020-09-30 アルパイン株式会社 Eyelid opening / closing detection device and eyelid opening / closing detection method
JP6892231B2 (en) 2016-07-29 2021-06-23 アルパイン株式会社 Eyelid opening / closing detection device and eyelid opening / closing detection method
CN106529402B (en) * 2016-09-27 2019-05-28 中国科学院自动化研究所 The face character analysis method of convolutional neural networks based on multi-task learning
JP2018075208A (en) 2016-11-10 2018-05-17 パナソニックIpマネジメント株式会社 Operator condition detection system and operator condition detection method
CN108022238B (en) * 2017-08-09 2020-07-03 深圳科亚医疗科技有限公司 Method, computer storage medium, and system for detecting object in 3D image
CN108614999B (en) * 2018-04-16 2022-09-16 贵州大学 Eye opening and closing state detection method based on deep learning
CN108960071A (en) * 2018-06-06 2018-12-07 武汉幻视智能科技有限公司 A kind of eye opening closed-eye state detection method
CN108932536B (en) * 2018-07-18 2021-11-09 电子科技大学 Face posture reconstruction method based on deep neural network

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114283488A (en) * 2022-03-08 2022-04-05 北京万里红科技有限公司 Method for generating detection model and method for detecting eye state by using detection model
CN114283488B (en) * 2022-03-08 2022-06-14 北京万里红科技有限公司 Method for generating detection model and method for detecting eye state by using detection model

Also Published As

Publication number Publication date
JP2022517398A (en) 2022-03-08
JP7227385B2 (en) 2023-02-21
KR20210113621A (en) 2021-09-16
CN111626087A (en) 2020-09-04

Similar Documents

Publication Publication Date Title
US11551377B2 (en) Eye gaze tracking using neural networks
CN108229284B (en) Sight tracking and training method and device, system, electronic equipment and storage medium
WO2019179464A1 (en) Method for predicting direction of movement of target object, vehicle control method, and device
WO2019128932A1 (en) Face pose analysis method and apparatus, device, storage medium, and program
WO2018177379A1 (en) Gesture recognition, gesture control and neural network training methods and apparatuses, and electronic device
WO2020173135A1 (en) Neural network training and eye opening and closing state detection method, apparatus, and device
WO2022156640A1 (en) Gaze correction method and apparatus for image, electronic device, computer-readable storage medium, and computer program product
US20180088663A1 (en) Method and system for gesture-based interactions
WO2018054329A1 (en) Object detection method and device, electronic apparatus, computer program and storage medium
WO2020125499A1 (en) Operation prompting method and glasses
US20210124928A1 (en) Object tracking methods and apparatuses, electronic devices and storage media
US11704563B2 (en) Classifying time series image data
WO2017114168A1 (en) Method and device for target detection
WO2019029459A1 (en) Method and device for recognizing facial age, and electronic device
EP4053735A1 (en) Method for structuring pedestrian information, device, apparatus and storage medium
WO2022082999A1 (en) Object recognition method and apparatus, and terminal device and storage medium
WO2023178906A1 (en) Liveness detection method and apparatus, and electronic device, storage medium, computer program and computer program product
US11868523B2 (en) Eye gaze classification
WO2021217973A1 (en) Emotion information recognition method and apparatus, and storage medium and computer device
KR20210000671A (en) Head pose estimation
WO2021238586A1 (en) Training method and apparatus, device, and computer readable storage medium
CN114461078B (en) Man-machine interaction method based on artificial intelligence
US11796801B2 (en) Reducing light leakage via external gaze detection
Tazhigaliyeva et al. Cyrillic manual alphabet recognition in RGB and RGB-D data for sign language interpreting robotic system (SLIRS)
EP4287123A1 (en) Method of estimating a three-dimensional position of an object

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 19916840; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2021541183; Country of ref document: JP; Kind code of ref document: A)
ENP Entry into the national phase (Ref document number: 20217023286; Country of ref document: KR; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 19916840; Country of ref document: EP; Kind code of ref document: A1)