WO2020173135A1 - Neural network training method, apparatus and device for eye open/closed state detection - Google Patents


Info

Publication number
WO2020173135A1
Authority
WO
WIPO (PCT)
Prior art keywords
eye, open, closed, detection, eyes
Application number
PCT/CN2019/118127
Other languages
English (en)
Chinese (zh)
Inventor
王飞
钱晨
Original Assignee
北京市商汤科技开发有限公司
Application filed by 北京市商汤科技开发有限公司
Priority to KR1020217023286A (publication KR20210113621A)
Priority to JP2021541183A (publication JP7227385B2)
Publication of WO2020173135A1


Classifications

    • G PHYSICS
        • G06 COMPUTING; CALCULATING OR COUNTING
            • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
                • G06V 40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
                    • G06V 40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
                        • G06V 40/18 Eye characteristics, e.g. of the iris
                            • G06V 40/197 Matching; Classification
                • G06V 20/00 Scenes; Scene-specific elements
                    • G06V 20/50 Context or environment of the image
                        • G06V 20/52 Surveillance or monitoring of activities, e.g. for recognising suspicious objects
            • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
                • G06N 3/00 Computing arrangements based on biological models
                    • G06N 3/02 Neural networks
                        • G06N 3/04 Architecture, e.g. interconnection topology
                            • G06N 3/045 Combinations of networks
                        • G06N 3/08 Learning methods

Definitions

  • the present disclosure relates to computer vision technology, in particular to a neural network training method, a neural network training device, an eye open/closed state detection method, an eye open/closed state detection device, an intelligent driving control method, an intelligent driving control device, electronic equipment, a computer-readable storage medium, and a computer program.
  • eye open/closed state detection is the detection of the open and closed conditions of the eyes. It can be used in fatigue monitoring, living body recognition, facial expression recognition and other fields. For example, in assisted driving technology, it is necessary to detect the eye open/closed state of the driver and determine, based on the detection result, whether the driver is in a fatigued driving state, so as to realize fatigue driving monitoring. Accurately detecting the open/closed state of the eyes and avoiding misjudgment as much as possible helps improve the safety of vehicle driving.
  • the embodiments of the present disclosure provide a technical solution for neural network training, eye open and closed state detection, and intelligent driving control.
  • a neural network training method is provided, which includes: performing, via a to-be-trained neural network for open/closed-eye detection, eye open/closed state detection processing on multiple eye images in the image set corresponding to each of at least two open/closed-eye detection training tasks, and outputting eye open/closed state detection results, where the eye images contained in different image sets are at least partially different; determining, according to the eye open/closed annotation information of the eye images and the eye open/closed state detection results output by the neural network, the loss corresponding to each of the at least two open/closed-eye detection training tasks; and adjusting the network parameters of the neural network according to the respective losses of the at least two open/closed-eye detection training tasks.
  • a method for detecting the open/closed state of eyes is provided, including: acquiring an image to be processed; performing eye open/closed state detection processing on the image to be processed through a neural network, and outputting the eye open/closed state detection result; where the neural network is obtained by training using the neural network training method described in the foregoing implementation.
  • an intelligent driving control method is provided, including: acquiring a to-be-processed image collected by a camera set on a vehicle; performing eye open/closed state detection processing on the to-be-processed image via a neural network, and outputting the eye open/closed state detection result; determining the fatigue state of the target object at least according to the eye open/closed state detection results belonging to the same target object in multiple to-be-processed images with a time-series relationship; and forming a corresponding instruction according to the fatigue state of the target object and outputting the instruction; where the neural network is obtained by training using the neural network training method described in the foregoing embodiment.
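The disclosure leaves open how the time-series detection results are turned into a fatigue state. One common, illustrative rule is a PERCLOS-style criterion: treat the target object as fatigued when the fraction of closed-eye frames within a recent window of frames exceeds a threshold. The function names, window size, threshold, and instruction texts below are assumptions, not part of the patent:

```python
def fatigue_state(closed_flags, window=30, threshold=0.6):
    """Decide whether the target object is fatigued from the eye
    open/closed detection results of time-ordered frames: compute the
    fraction of closed-eye frames (1 = closed) within the most recent
    `window` frames and compare it with `threshold`."""
    recent = closed_flags[-window:]
    ratio = sum(recent) / len(recent)
    return "fatigued" if ratio >= threshold else "normal"

def control_instruction(state):
    """Form a corresponding instruction from the fatigue state."""
    return "issue fatigue alert" if state == "fatigued" else "no action"

# 35 frames: eyes open for 10 frames, then detected closed for 25 frames
frames = [0] * 10 + [1] * 25
state = fatigue_state(frames)
```

A production system would also debounce single-frame detection errors, which is one reason the patent aggregates over multiple time-ordered images rather than deciding per frame.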
  • a neural network training device is provided, which includes: a to-be-trained neural network for open/closed-eye detection, used to perform eye open/closed state detection processing on multiple eye images in the image set corresponding to each of at least two open/closed-eye detection training tasks and to output eye open/closed state detection results, where the eye images contained in different image sets are at least partially different; and an adjustment module, used to determine, according to the eye open/closed annotation information of the eye images and the eye open/closed state detection results output by the neural network, the loss corresponding to each of the at least two open/closed-eye detection training tasks, and to adjust the network parameters of the neural network according to the respective losses of the at least two open/closed-eye detection training tasks.
  • an eye open/closed state detection device is provided, including: an acquisition module, used to acquire an image to be processed; and a neural network, used to perform eye open/closed state detection processing on the image to be processed and output the eye open/closed state detection result; where the neural network is obtained by training using the neural network training device described in the foregoing embodiment.
  • an intelligent driving control device is provided, including: an acquisition module, used to acquire images to be processed collected by a camera set on a vehicle; a neural network, used to perform eye open/closed state detection processing on the images to be processed and output eye open/closed state detection results; a fatigue state determination module, used to determine the fatigue state of the target object at least according to the eye open/closed state detection results of the same target object in multiple to-be-processed images with a time-series relationship; and an instruction module, used to form a corresponding instruction according to the fatigue state of the target object and output the instruction; where the neural network is obtained by training using the neural network training device described in the above embodiment.
  • an electronic device is provided, including: a memory for storing a computer program; and a processor for executing the computer program stored in the memory, where, when the computer program is executed, any method embodiment of the present disclosure is implemented.
  • a computer-readable storage medium having a computer program stored thereon, and when the computer program is executed by a processor, it implements any method embodiment of the present disclosure.
  • a computer program including computer instructions, which, when the computer instructions run in a processor of the device, implement any method implementation of the present disclosure.
  • the inventor found that, with traditional single-task training, a neural network trained on the image set of one task often achieves good accuracy in detecting open and closed eyes in the scene corresponding to that task, but in other scenes it is difficult to guarantee the accuracy of open/closed-eye detection.
  • if the images collected from multiple different scenes are simply used as one whole image set for neural network training, without distinguishing whether the images in the image set come from different scenes or correspond to different training tasks, and the whole image set is input to the neural network every time, then the distribution of image subsets (batches) during network training is uncontrollable: a batch may contain many images from one scene but few or no images from other scenes, and the distribution of image subsets in different training iterations is not exactly the same. In other words, the distribution of image subsets in each iteration is too random, and no targeted loss calculation is performed for the different training tasks, so it is impossible to control the neural network's ability to take different training tasks into account during training, and the training result cannot be guaranteed.
  • based on the neural network training method and device, eye open/closed state detection method and device, intelligent driving control method and device, electronic equipment, computer-readable storage medium, and computer program provided by the present disclosure, a corresponding image set is determined for each of multiple different open/closed-eye detection training tasks, multiple eye images for a single training iteration of the neural network are drawn from the multiple image sets, the loss of the neural network's open/closed-eye detection results is determined for each training task according to the eye images from its image set, and the network parameters of the neural network are adjusted according to each loss. In this way, the eye image subset fed to the neural network in each training iteration includes eye images corresponding to every training task, and the loss of each training task is calculated in a targeted manner, so that during training the neural network learns the open/closed-eye detection ability for every training task, taking the ability learning of different training tasks into account. The trained neural network can therefore simultaneously improve the accuracy of open/closed-eye detection for the eye images of each of the multiple scenes corresponding to the training tasks, which helps improve the universality and generalization of the neural-network-based technical solution for accurate open/closed-eye detection in different scenarios, and is conducive to better meeting the actual application requirements of multiple scenarios.
  • FIG. 1 is a flowchart of an embodiment of the neural network training method of the present disclosure
  • FIG. 2 is a schematic diagram of an embodiment of multiple open and closed eye detection training tasks in the present disclosure
  • FIG. 3 is a flowchart of an embodiment of the method for detecting the open and closed state of the eyes of the present disclosure
  • FIG. 5 is a schematic structural diagram of an embodiment of the neural network training device of the present disclosure.
  • FIG. 6 is a schematic structural diagram of an embodiment of the eye open/close state detection device of the present disclosure.
  • FIG. 7 is a schematic structural diagram of an embodiment of the intelligent driving control device of the present disclosure.
  • FIG. 8 is a block diagram of an exemplary device for implementing the embodiments of the present disclosure.
  • the embodiments of the present disclosure can be applied to electronic devices such as terminal devices, computer systems, and servers, which can operate together with numerous other general-purpose or special-purpose computing system environments or configurations.
  • Examples of well-known terminal devices, computing systems, environments, and/or configurations suitable for use with electronic devices such as terminal devices, computer systems, and servers include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, network personal computers, minicomputer systems, mainframe computer systems, and distributed cloud computing technology environments including any of the above systems.
  • Electronic devices such as terminal devices, computer systems, and servers can be described in the general context of computer system executable instructions (such as program modules) executed by the computer system.
  • program modules can include routines, programs, target programs, components, logic, and data structures, etc., which perform specific tasks or implement specific abstract data types.
  • the computer system/server can be implemented in a distributed cloud computing environment. In the distributed cloud computing environment, tasks are executed by remote processing equipment linked through a communication network.
  • program modules may be located on a storage medium of a local or remote computing system including a storage device.
  • FIG. 1 is a flowchart of an embodiment of the neural network training method of the present disclosure. As shown in FIG. 1, the method of this embodiment includes steps S100 and S110. Each step in FIG. 1 will be described in detail below.
  • after being successfully trained, the to-be-trained neural network for open/closed-eye detection of the present disclosure can be used to perform eye open/closed state detection on an image to be processed and output the eye open/closed state detection result of the image to be processed.
  • the neural network outputs two probability values, where one probability value represents the probability that the eyes of the target object in the image to be processed are in the open state (the greater the probability value, the closer to the open state), and the other probability value represents the probability that the eyes of the target object in the image to be processed are in the closed state (the greater the probability value, the closer to the closed state).
  • the sum of the two probability values can be 1.
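The two complementary probability values can be produced by a two-way softmax over the network's final scores; the patent does not fix this implementation, so the sketch below is illustrative (function and variable names are assumptions):

```python
import math

def eye_state_probabilities(logit_open, logit_closed):
    """Map the network's two raw output scores (logits) to the two
    probability values described above: p_open for the open-eye state
    and p_closed for the closed-eye state. A softmax guarantees both
    lie in [0, 1] and that they sum to 1."""
    m = max(logit_open, logit_closed)   # subtract max for numerical stability
    e_open = math.exp(logit_open - m)
    e_closed = math.exp(logit_closed - m)
    total = e_open + e_closed
    return e_open / total, e_closed / total

# a score pair favouring the open state
p_open, p_closed = eye_state_probabilities(2.0, -1.0)
```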
  • the neural network in the present disclosure may be a convolutional neural network.
  • the neural network in the present disclosure may include, but is not limited to: a convolutional layer, a ReLU (Rectified Linear Unit) layer (also called an activation layer), a pooling layer, a fully connected layer, and a layer for classification (such as binary classification).
  • the present disclosure does not limit the specific structure of the neural network.
  • each open/closed-eye detection training task belongs to the total training task of realizing open/closed-eye detection by the neural network.
  • the training targets corresponding to different open and closed eye detection training tasks are not exactly the same. That is to say, the present disclosure can divide the total training task of the neural network into multiple training tasks, each training task is aimed at one type of training target, and different training tasks correspond to different training targets.
  • the at least two open/closed-eye detection training tasks in the present disclosure may include at least two of the following tasks: the open/closed-eye detection task when the eyes have an attachment, the open/closed-eye detection task when the eyes have no attachment, the open/closed-eye detection task in an indoor environment, the open/closed-eye detection task in an outdoor environment, the open/closed-eye detection task when there is an attachment on the eyes and a light spot on the attachment, and the open/closed-eye detection task when there is an attachment on the eyes and no light spot on the attachment.
  • the above-mentioned attachments may be glasses or transparent plastic sheets.
  • the aforementioned light spot may be a light spot formed on the attachment due to reflection of light from the attachment.
  • the glasses in the present disclosure generally refer to glasses that can see the eye of the wearer through the lens.
  • the open/closed-eye detection task in the case where the eyes have attachments may be the open/closed-eye detection task with glasses.
  • the open/closed-eye detection task with glasses can realize at least one of: detection of open/closed eyes with glasses indoors and detection of open/closed eyes with glasses outdoors.
  • the open/closed-eye detection task in the case where the eyes have no attachment may be the open/closed-eye detection task without glasses.
  • the open/closed-eye detection task without glasses can realize at least one of: detection of open/closed eyes without glasses indoors and detection of open/closed eyes without glasses outdoors.
  • the open/closed-eye detection task in an indoor environment can realize at least one of: detection of open/closed eyes without glasses indoors, detection of open/closed eyes with glasses and glasses reflection indoors, and detection of open/closed eyes with glasses and no glasses reflection indoors.
  • the open/closed-eye detection task in an outdoor environment can realize at least one of: detection of open/closed eyes without glasses outdoors, detection of open/closed eyes with glasses and glasses reflection outdoors, and detection of open/closed eyes with glasses and no glasses reflection outdoors.
  • the open/closed-eye detection task where there is an attachment on the eye and a light spot on the attachment may be the open/closed-eye detection task with glasses and glasses reflection.
  • the open/closed-eye detection task with glasses and glasses reflection can realize at least one of: detection of open/closed eyes with glasses and glasses reflection indoors and detection of open/closed eyes with glasses and glasses reflection outdoors.
  • the open/closed-eye detection task where there is an attachment on the eye and no light spot on the attachment may be the open/closed-eye detection task with glasses and no glasses reflection.
  • the open/closed-eye detection task with glasses and no glasses reflection can realize at least one of: detection of open/closed eyes with glasses and no glasses reflection indoors and detection of open/closed eyes with glasses and no glasses reflection outdoors.
  • the open/closed-eye detection task with glasses may intersect with the open/closed-eye detection task in an indoor environment and the open/closed-eye detection task in an outdoor environment.
  • the situations where intersections exist among the six open/closed-eye detection training tasks mentioned above will not be explained one by one here.
  • the present disclosure does not limit the number of open/closed-eye detection training tasks involved; the number can be determined according to actual needs, and the present disclosure does not limit the specific form of any open/closed-eye detection training task.
  • the at least two open/closed-eye detection training tasks in the present disclosure may include the following three open/closed-eye detection training tasks:
  • open/closed-eye detection training task a: the open/closed-eye detection task in an indoor environment;
  • open/closed-eye detection training task b: the open/closed-eye detection task in an outdoor environment;
  • open/closed-eye detection training task c: the open/closed-eye detection task with attachments on the eyes and light spots on the attachments.
  • the at least two open/closed-eye detection training tasks in the present disclosure each correspond to an image set; for example, the open/closed-eye detection training task a, the open/closed-eye detection training task b, and the open/closed-eye detection training task c in FIG. 2 each correspond to an image set.
  • Each image set usually includes multiple eye images.
  • the eye images contained in different image sets are at least partially different. That is, for any image set, at least part of the eye images in the image set will not appear in other image sets.
  • the eye images contained in different image sets may have an intersection.
  • the image sets corresponding to each of the six open/closed-eye detection training tasks mentioned above can be, respectively: an eye image set with eye attachments, an eye image set without eye attachments, an eye image set collected in an indoor environment, an eye image set collected in an outdoor environment, an eye image set with attachments on the eyes and light spots on the attachments, and an eye image set with attachments on the eyes and no light spots on the attachments.
  • all eye images in the eye image set with eye attachments may be eye images with glasses.
  • the eye image set may include: eye images with glasses collected in an indoor environment and eye images with glasses collected in an outdoor environment.
  • all eye images in the eye image set without eye attachments may be eye images without glasses.
  • the eye image set may include: eye images without glasses collected in an indoor environment and eye images without glasses collected in an outdoor environment.
  • the set of eye images collected in an indoor environment may include: eye images without glasses collected in an indoor environment, and eye images with glasses collected in an indoor environment.
  • the set of eye images collected in an outdoor environment may include: eye images without glasses collected in an outdoor environment, and eye images with glasses collected in an outdoor environment.
  • all eye images in the eye image set with attachments on the eyes and spots on the attachments may be eye images with glasses and spots on the glasses.
  • the eye image set may include: eye images with glasses and light spots on the glasses collected in an indoor environment and eye images with glasses and light spots on the glasses collected in an outdoor environment.
  • all eye images in the eye image set with attachments to the eyes and no spots on the attachments may be eye images with glasses and no spots on the glasses.
  • the eye image set may include: eye images with glasses and no light spots on the glasses collected in an indoor environment and eye images with glasses and no light spots on the glasses collected in an outdoor environment.
  • the image set included in the present disclosure is determined by the open and closed eye detection training task included in the present disclosure. For example, if the present disclosure includes at least two of the above-mentioned six open and closed eye detection training tasks, the present disclosure includes respective eye image sets corresponding to the at least two open and closed eye detection training tasks.
  • the eye image used in the neural network training process of the present disclosure may also be called an eye image sample, and the image content of the eye image sample usually includes eyes.
  • the eye image sample in the present disclosure is usually a monocular-based eye image sample, that is, the image content of the eye image sample does not include two eyes, but includes one eye.
  • the eye image sample may be an eye image sample based on a single side eye, for example, an eye image sample based on the left eye.
  • the present disclosure does not exclude the case where the eye image sample is an eye image sample based on both eyes or an eye image sample based on any side of the eye.
  • the eye image in the present disclosure may generally be: an eye image block cut out from an image containing the eye captured by the camera.
  • the process of forming an eye image in the present disclosure may include: performing eye detection on the image taken by the camera device to determine the eye part in the image, and then segmenting the detected eye part from the image. Optionally, the present disclosure can perform processing such as zooming and/or image content mapping (for example, converting a right-eye image block into a left-eye image block through image content mapping) on the segmented image blocks, thereby forming eye images for training the neural network for open/closed-eye detection.
  • the eye image in the present disclosure does not rule out the possibility of using the complete image including the eye captured by the camera as the eye image.
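As a concrete illustration of the cropping, mirroring, and zooming steps above, the sketch below cuts an eye patch out of a face image given a detected bounding box, optionally mirrors a right-eye patch into a left-eye layout, and resizes it to a fixed training size. The `(x, y, w, h)` box format, the 24-by-24 output size, and the nearest-neighbour resize are assumptions; a real pipeline would use an eye detector and a proper image-resize routine such as `cv2.resize`:

```python
import numpy as np

def make_eye_sample(image, box, flip_to_left=False, out_h=24, out_w=24):
    """Cut an eye patch out of a full image given a detected bounding
    box (x, y, w, h), optionally mirror a right-eye patch so it looks
    like a left eye, and resize it to a fixed training size by
    nearest-neighbour sampling."""
    x, y, w, h = box
    patch = image[y:y + h, x:x + w]
    if flip_to_left:
        patch = patch[:, ::-1]               # horizontal mirror
    rows = np.arange(out_h) * patch.shape[0] // out_h
    cols = np.arange(out_w) * patch.shape[1] // out_w
    return patch[rows][:, cols]

# a synthetic 100x100 "face" image with pixel value = row*100 + col
face = np.arange(100 * 100).reshape(100, 100)
sample = make_eye_sample(face, (10, 20, 40, 30), flip_to_left=True)
```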
  • the eye image in the present disclosure may be the eye image in the corresponding training sample set.
  • the eye image used for training the neural network for detecting open and closed eyes in the present disclosure usually has annotation information, and the annotation information may indicate the open and closed state of the eyes in the eye image.
  • the annotation information can indicate whether the eyes in the eye image are in an open state or a closed state.
  • the label information of the eye image is 1, which means that the eyes in the eye image are in the open state, and the label information of the eye image is 0, which means that the eyes in the eye image are in the closed state.
  • the present disclosure usually obtains a corresponding number of eye images from the eye image sets corresponding to different training tasks.
  • eye images of the corresponding number obtained from the image set corresponding to the open/closed-eye detection training task a are provided to the to-be-trained neural network for open/closed-eye detection; eye images of the corresponding number obtained from the image set corresponding to the open/closed-eye detection training task b are provided to the neural network; and eye images of the corresponding number obtained from the image set corresponding to the open/closed-eye detection training task c are provided to the neural network.
  • the present disclosure may obtain a corresponding number of eye images from the eye image set corresponding to each training task according to a preset ratio of image numbers for the different training tasks; in addition, in the process of obtaining eye images, the preset batch size is usually also considered.
  • for example, the present disclosure can obtain 200 eye images from the eye image set corresponding to the open/closed-eye detection training task a, 200 eye images from the eye image set corresponding to the open/closed-eye detection training task b, and 200 eye images from the eye image set corresponding to the open/closed-eye detection training task c; a corresponding number of eye images can likewise be obtained from the eye image sets corresponding to the other open/closed-eye detection training tasks, so as to achieve batch processing.
  • for another example, 250 eye images can be obtained from the eye image set corresponding to the open/closed-eye detection training task a, 250 eye images from the eye image set corresponding to the open/closed-eye detection training task b, and 100 eye images from the eye image set corresponding to the open/closed-eye detection training task c, so that a total of 600 eye images are obtained. In this way, the flexibility of obtaining eye images can be increased.
  • the present disclosure may also adopt a method of randomly setting the number to obtain a corresponding number of eye images from the eye image sets corresponding to different training tasks.
  • the present disclosure does not limit the specific implementation of obtaining a corresponding number of eye images from eye image sets corresponding to different training tasks.
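The per-task sampling described above can be sketched as follows; the task names mirror tasks a, b, and c, the per-task counts mirror the 250/250/100 example, and everything else is illustrative:

```python
import random

def sample_batch(image_sets, counts, seed=None):
    """Draw a fixed number of samples from the image set of each
    open/closed-eye detection training task, so that every training
    batch contains eye images from every task. `image_sets` maps a
    task name to its list of samples; `counts` maps the same task
    name to how many samples to draw for one batch."""
    rng = random.Random(seed)
    batch = []
    for task, images in image_sets.items():
        batch.extend((task, img) for img in rng.sample(images, counts[task]))
    rng.shuffle(batch)            # mix tasks within the batch
    return batch

# stand-in image sets of 1000 samples each
sets = {"task_a_indoor": list(range(1000)),
        "task_b_outdoor": list(range(1000)),
        "task_c_glasses_glare": list(range(1000))}
batch = sample_batch(sets, {"task_a_indoor": 250,
                            "task_b_outdoor": 250,
                            "task_c_glasses_glare": 100})
```

Because each batch is assembled per task rather than drawn from one pooled set, the batch composition is controllable, which is exactly the property the disclosure says pooled sampling lacks.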
  • the present disclosure may sequentially provide the acquired multiple eye images to the to-be-trained neural network for open/closed-eye detection, and the neural network performs eye open/closed state detection processing on each input eye image, thereby sequentially outputting the eye open/closed state detection result of each eye image.
  • an eye image input to the neural network for open and closed eyes detection to be trained is processed by the convolutional layer, the fully connected layer, and the layer for classification.
  • the neural network is used to output two probability values; both probability values range from 0 to 1, and the sum of the two probability values is 1.
  • One of the probability values corresponds to the open state. The closer the probability value is to 1, the closer the eyes in the eye image are to the open state.
  • the other probability value corresponds to the closed state, and the closer the probability value is to 1, the closer the eyes in the eye image are to the closed state.
  • the present disclosure should determine the loss corresponding to each open/closed-eye detection training task, determine the comprehensive loss according to the losses corresponding to the training tasks, and use the comprehensive loss to adjust the network parameters of the neural network.
  • the network parameters in the present disclosure may include but are not limited to: convolution kernel parameters and/or matrix weights. The present disclosure does not limit the specific content contained in the network parameters.
  • for each training task, the present disclosure may determine the corresponding loss using the angle between the largest probability value among the eye open/closed state detection results output by the neural network for the multiple eye images in the image set corresponding to the training task and the decision interface corresponding to the annotation information of the corresponding eye images in the image set.
  • the present disclosure may use the A-Softmax (angular softmax) loss function to determine the loss corresponding to each of the different open/closed-eye detection training tasks based on the eye open/closed annotation information of the eye images and the eye open/closed state detection results output by the neural network, determine the comprehensive loss (such as the sum of the individual losses) according to the losses corresponding to the different open/closed-eye detection training tasks, and use the stochastic gradient descent method to adjust the network parameters of the neural network.
  • the present disclosure can use the A-softmax loss function to calculate the loss of each open and closed eye detection training task, perform back propagation based on the sum of the losses of all open and closed eye detection training tasks, and update the network parameters of the neural network for open and closed eye detection to be trained by gradient descent on the loss.
  • all eye images provided to the neural network for each iteration of training can form a subset of eye images.
  • the eye image subset includes eye images corresponding to each training task.
  • the neural network can thereby learn the ability to detect open and closed eyes for each training task during the training process, taking into account the ability learning of the different training tasks, so that the trained neural network can simultaneously improve the accuracy of open and closed eye detection for the eye images of each of the multiple scenes corresponding to the multiple training tasks. This helps improve the universality and generalization of the technical solution for accurate detection of open and closed eyes in different scenarios based on the neural network, and better meets the actual application requirements of multiple scenarios.
  • the A-softmax loss function in the present disclosure can be represented by the following formula (1):
  • L_ang = (1/N) · Σ_{i=1..N} −log( e^{‖x_i‖·ψ(θ_{y_i,i})} / ( e^{‖x_i‖·ψ(θ_{y_i,i})} + Σ_{j≠y_i} e^{‖x_i‖·cos(θ_{j,i})} ) )    (1)
  • where ψ(θ_{y_i,i}) = (−1)^k · cos(m·θ_{y_i,i}) − 2k, for θ_{y_i,i} ∈ [kπ/m, (k+1)π/m] and k ∈ [0, m−1];
  • L_ang represents the loss corresponding to a training task;
  • N represents the number of eye images of the training task;
  • ‖·‖ represents the modulus of its argument;
  • x_i represents the feature of the i-th eye image corresponding to the training task;
  • y_i represents the label value of the i-th eye image corresponding to the training task;
  • θ_{j,i} represents the angle between x_i and the weight vector of class j;
  • m is a constant whose minimum value is usually not less than a predetermined value.
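The A-softmax loss of formula (1) can be sketched in stdlib Python as follows. This is a hedged illustration, not the disclosure's implementation: the per-sample feature norms, per-class angles, and the value m = 4 are assumed inputs, and the labels use an illustrative 0 = open / 1 = closed coding.

```python
import math

def psi(theta: float, m: int) -> float:
    """Angular margin function of A-softmax:
    psi(theta) = (-1)^k * cos(m*theta) - 2k for theta in [k*pi/m, (k+1)*pi/m]."""
    k = min(int(theta * m / math.pi), m - 1)  # which angular interval theta falls in
    return (-1) ** k * math.cos(m * theta) - 2 * k

def a_softmax_loss(feat_norms, angles, labels, m=4):
    """Per-task A-softmax loss over a batch of N samples.

    feat_norms[i] : ||x_i||, modulus of the i-th feature vector
    angles[i][j]  : theta_{j,i}, angle between x_i and the weight of class j
    labels[i]     : y_i, e.g. 0 = eyes open, 1 = eyes closed (illustrative)
    """
    total = 0.0
    for norm, thetas, y in zip(feat_norms, angles, labels):
        target = math.exp(norm * psi(thetas[y], m))
        others = sum(math.exp(norm * math.cos(t))
                     for j, t in enumerate(thetas) if j != y)
        total += -math.log(target / (target + others))
    return total / len(labels)
```

With m = 1, psi reduces to cos and the expression becomes an ordinary angular softmax; larger m enlarges the angular margin around the decision boundary, which increases the loss for the same angles.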
  • this training process ends when the training of the neural network for detecting open and closed eyes to be trained reaches a predetermined iterative condition.
  • the predetermined iterative conditions in the present disclosure may include: the difference between the eye open and closed state detection result output by the neural network for eye open and closed detection to be trained for the eye image and the label information of the eye image, which meets the predetermined difference requirement. In the case that the difference meets the predetermined difference requirement, the training of the neural network is successfully completed this time.
  • the predetermined iterative conditions in the present disclosure may also include: the number of eye images used to train the neural network for open and closed eye detection to be trained reaching a predetermined number requirement, etc. If the number of eye images used reaches the predetermined number requirement but the difference still does not meet the predetermined difference requirement, the neural network was not successfully trained this time.
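The two predetermined iterative conditions above amount to a simple stopping check on each round of training; a sketch with hypothetical threshold names and values (the disclosure does not fix concrete numbers):

```python
def training_finished(difference: float, images_used: int,
                      diff_threshold: float = 0.05,
                      max_images: int = 1_000_000) -> tuple:
    """One check of the predetermined iterative conditions.

    Returns (stop, success):
    stop    -- whether this training process should end
    success -- whether the network counts as successfully trained
    """
    if difference <= diff_threshold:   # difference meets the predetermined requirement
        return True, True
    if images_used >= max_images:      # image budget exhausted without convergence
        return True, False
    return False, False
```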
  • the neural network that has been successfully trained can be used for the detection and processing of the eye open and closed state.
  • the present disclosure forms a comprehensive loss based on the losses of the different training tasks, and uses the comprehensive loss to adjust the network parameters of the neural network for open and closed eye detection, so that during the training process the neural network's learning of the ability to detect open and closed eyes for each training task takes into account the ability learning of the different training tasks. The trained neural network can thus simultaneously improve the accuracy of open and closed eye detection for the eye images of each of the multiple scenes corresponding to the multiple training tasks, which helps improve the universality and generalization of the technical solution for accurate detection of open and closed eyes in different scenarios based on the neural network, and better meets the actual application requirements of multiple scenarios.
  • FIG. 3 is a flowchart of an embodiment of the method for detecting the open and closed state of the eyes of the present disclosure.
  • the method of this embodiment includes steps: S300 and S310. Each step in Figure 3 will be described in detail below.
  • S300 Acquire an image to be processed.
  • the image to be processed in the present disclosure may be an image presenting a static picture or a photo, or may be a video frame in a dynamic video, for example, a video frame in a video captured by a camera set on a moving object, or a video frame in a video captured by a camera set at a fixed position.
  • the above-mentioned moving objects may be vehicles, robots, or robotic arms.
  • the above-mentioned fixed position can be a desktop or a wall.
  • the present disclosure does not limit the specific manifestations of moving objects and fixed positions.
  • the present disclosure may detect the location area of the eyes in the image to be processed. For example, the method of face detection or face key point detection may be used to determine the circumscribed frame of the eyes in the image to be processed.
  • the present disclosure can segment the image of the eye area from the image to be processed according to the circumscribed frame of the eye, and the segmented eye image block is provided to the neural network.
  • the segmented eye image blocks can be provided to the neural network after certain preprocessing.
  • For example, the segmented eye image blocks are scaled so that the size of the scaled eye image blocks meets the size requirement of the neural network for the input image.
  • For another example, after the eye image blocks of the target object's two eyes are segmented, the eye image block on a predetermined side is mirror-mapped, so that two eye image blocks on the same side of the target object are formed.
  • Of course, the two eye image blocks on the same side can also be scaled.
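The cropping, mirroring, and scaling steps above can be sketched on a plain nested-list grayscale image (nearest-neighbour scaling; the helper names are illustrative — a real pipeline would use an image library):

```python
def crop(image, box):
    """Cut the eye region out of a row-major grayscale image.
    box = (top, left, bottom, right); bottom and right are exclusive."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]

def mirror(patch):
    """Horizontally flip a patch so both eyes appear on the same side."""
    return [row[::-1] for row in patch]

def resize(patch, out_h, out_w):
    """Nearest-neighbour scaling to the input size the network expects."""
    in_h, in_w = len(patch), len(patch[0])
    return [[patch[r * in_h // out_h][c * in_w // out_w]
             for c in range(out_w)] for r in range(out_h)]
```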
  • S310 Perform an eye open/close state detection process on the above-mentioned image to be processed via a neural network, and output an eye open/close state detection result.
  • the neural network in the present disclosure is obtained through successful training using the implementation of the neural network training method in the present disclosure.
  • the neural network in the present disclosure, for an input eye image block, may output an eye open and closed state detection result of at least one probability value, for example, a probability value indicating that the eyes are in the open state and a probability value indicating that the eyes are in the closed state.
  • the value ranges of the two probability values may both be 0 to 1, and the sum of the two probability values for the same eye image block is 1. The closer the probability value that the eyes are in the open state is to 1, the closer the eyes in the eye image block are to the open-eye state; the closer the probability value that the eyes are in the closed state is to 1, the closer the eyes in the eye image block are to the closed-eye state.
  • the present disclosure can make further judgments based on the eye open and closed state detection results with a timing relationship output by the neural network, so as to determine the eye movements of the target object in multiple images to be processed with a timing relationship, for example, fast blinking, opening one eye while closing the other, or squinting.
  • the present disclosure can determine the facial expressions of the target object in the multiple images to be processed with a timing relationship, for example, smiling, laughing, crying, or sadness, based on the eye open and closed state detection results with a timing relationship output by the neural network and the states of other organs of the target object's face.
  • the present disclosure can make further judgments based on the eye open and closed state detection results with a timing relationship output by the neural network, so as to determine the fatigue state of the target object in the multiple images to be processed with a timing relationship, for example, mild fatigue, dozing off, or being asleep.
  • the present disclosure can make further judgments based on the eye open and closed state detection results with a timing relationship output by the neural network, so as to determine the eye actions of the target object in the multiple images to be processed with a timing relationship, so that the present disclosure can determine, at least according to the eye actions, the interactive control information expressed by the target objects in the multiple images to be processed with a time sequence relationship.
  • the eye movements, facial expressions, fatigue states, and interactive control information determined by the present disclosure can be utilized by various applications. For example, predetermined eye movements and/or facial expressions of the target object can trigger predetermined special effects in a live broadcast/rebroadcast process or realize corresponding human-computer interaction, which facilitates rich applications; for another example, in intelligent driving technology, real-time detection of the driver's fatigue state helps prevent fatigue driving.
  • the present disclosure does not limit the specific application of the eye open and closed state detection results output by the neural network.
  • FIG. 4 is a flowchart of an embodiment of the intelligent driving control method of the present disclosure.
  • the intelligent driving control method of the present disclosure can be applied in an automatic driving environment and also in a cruise driving environment.
  • the present disclosure does not limit the applicable environment of the intelligent driving control method.
  • the method of this embodiment includes steps: S400, S410, S420, and S430.
  • the steps in Figure 4 will be described in detail below.
  • S400 Acquire an image to be processed collected by a camera device provided on the vehicle.
  • For the specific implementation manner of this step, reference may be made to the description of S300 in FIG. 3 in the foregoing method implementation, which is not described in detail here.
  • S410 Perform an eye open/close state detection process on the above-mentioned image to be processed via a neural network, and output an eye open/close state detection result.
  • the neural network in this embodiment is obtained through successful training using the implementation of the neural network training method described above.
  • For the specific implementation manner of this step, reference may be made to the description of S310 in FIG. 3 in the foregoing method implementation, which is not described in detail here.
  • S420 Determine the fatigue state of the target object at least according to the detection results of the open and closed eyes of the same target object of the multiple images to be processed with a time series relationship.
  • the target object in the present disclosure is usually the driver of the vehicle.
  • the present disclosure can determine index parameters of the target object (such as a driver), such as the number of blinks per unit time, the duration of a single eye closure, or the duration of a single eye opening, based on multiple eye open and closed state detection results that belong to the same target object and have a time sequence relationship, and then judge the corresponding index parameters against predetermined index requirements to determine whether the target object (such as the driver) is in a fatigue state.
  • the fatigue state in the present disclosure may include various fatigue states of different degrees, for example, a mild fatigue state, a moderate fatigue state, or a severe fatigue state. The present disclosure does not limit the specific implementation of determining the fatigue state of the target object.
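The index parameters above (blinks per unit time, longest single eye closure) can be extracted from a time-ordered sequence of per-frame open/closed decisions. A sketch with a hypothetical frame rate and closure threshold (the disclosure does not fix the predetermined index requirements):

```python
def fatigue_indices(closed_flags):
    """closed_flags: time-ordered booleans, True = eyes detected closed.

    Returns (blink_count, longest_closure): a blink is one maximal run of
    closed frames; longest_closure is the length in frames of the longest run.
    """
    blinks, longest, run = 0, 0, 0
    for closed in closed_flags:
        if closed:
            run += 1
            longest = max(longest, run)
        else:
            if run:
                blinks += 1
            run = 0
    if run:                      # sequence ended while the eyes were closed
        blinks += 1
    return blinks, longest

def is_fatigued(longest_closure, fps=25, max_closure_s=1.0):
    """Hypothetical index requirement: a single closure longer than
    max_closure_s seconds counts as fatigue."""
    return longest_closure / fps > max_closure_s
```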
  • S430 Form a corresponding instruction according to the fatigue state of the target object, and output the instruction.
  • the instructions generated by the present disclosure may include at least one of: an instruction to switch to the smart driving state, an instruction to voice-alert fatigue driving, an instruction to vibrate to wake up the driver, and an instruction to report dangerous driving information. The present disclosure does not limit the specific manifestation of the instruction.
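A minimal sketch of forming instructions from the detected fatigue state; the mapping from fatigue degrees to instructions is illustrative only and not prescribed by the disclosure:

```python
# illustrative mapping: which of the instructions named above fire per degree
FATIGUE_INSTRUCTIONS = {
    "mild":     ["voice-alert fatigue driving"],
    "moderate": ["voice-alert fatigue driving", "vibrate to wake up driver"],
    "severe":   ["switch to smart driving state",
                 "report dangerous driving information"],
}

def instructions_for(fatigue_state: str):
    """Return the instruction(s) to output for a detected fatigue state."""
    return FATIGUE_INSTRUCTIONS.get(fatigue_state, [])
```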
  • the neural network successfully trained by the neural network training method of the present disclosure helps improve the accuracy of the neural network's eye open and closed state detection results. Using the eye open and closed state detection results output by the neural network to judge the fatigue state helps improve the accuracy of fatigue state detection, so that the corresponding instructions formed according to the detected fatigue state help avoid fatigue driving and thus help improve driving safety.
  • FIG. 5 is a schematic structural diagram of an embodiment of the neural network training device of the present disclosure.
  • the neural network training device as shown in FIG. 5 includes: a neural network 500 for detecting open and closed eyes to be trained and an adjustment module 510.
  • the device may further include: an input module 520.
  • the neural network 500 for open and closed eye detection to be trained is used to perform eye open and closed state detection processing on multiple eye images in the image sets corresponding to at least two open and closed eye detection training tasks, respectively, and output eye open and closed state detection results.
  • the eye images contained in different image sets are at least partially different.
  • the neural network 500 for open and closed eye detection to be trained of the present disclosure can, after being successfully trained, be used to detect the eye open and closed state of the image to be processed and output the eye open and closed state detection result of the image to be processed.
  • For example, the neural network 500 outputs two probability values, one of which represents the probability that the eyes of the target object in the image to be processed are in the open state: the larger the probability value, the closer to the open state. The other probability value represents the probability that the eyes of the target object in the image to be processed are in the closed state: the larger the probability value, the closer to the closed state.
  • the sum of the two probability values can be 1.
  • the neural network 500 in the present disclosure may be a convolutional neural network.
  • the neural network 500 in the present disclosure may include, but is not limited to: a convolutional layer, a Relu layer (also referred to as an activation layer), a pooling layer, a fully connected layer, and a layer for classification (such as binary classification).
  • each open and closed eye detection training task belongs to the total training task by which the network realizes detection of the open and closed state of the eyes.
  • the training targets corresponding to different open and closed eye detection training tasks are not exactly the same. That is to say, the present disclosure can divide the total training task of the neural network 500 into multiple training tasks, each training task is aimed at one type of training target, and different training tasks correspond to different training targets.
  • the at least two open and closed eye detection training tasks in the present disclosure may include at least two of the following tasks: the open and closed eye detection task when the eye has an attachment, the open and closed eye detection task when the eye has no attachment, the open and closed eye detection task in an indoor environment, the open and closed eye detection task in an outdoor environment, the open and closed eye detection task when the eye has an attachment with light spots on it, and the open and closed eye detection task when the eye has an attachment without light spots on it.
  • the attachment in the open and closed eye detection tasks may be glasses or a transparent plastic sheet.
  • the aforementioned light spot may be a light spot formed on the attachment due to reflection of light by the attachment.
  • At least two open and closed eye detection training tasks in the present disclosure each correspond to an image set.
  • Each image set usually includes multiple eye images.
  • the eye images contained in different image sets are at least partially different. That is, for any image set, at least part of the eye images in the image set will not appear in other image sets.
  • the eye images contained in different image sets may have an intersection.
  • the image sets corresponding to the six open and closed eye detection training tasks mentioned above can respectively be: an eye image set in which the eyes have an attachment, an eye image set in which the eyes have no attachment, an eye image set collected in an indoor environment, an eye image set collected in an outdoor environment, an eye image set in which the eyes have an attachment with light spots on the attachment, and an eye image set in which the eyes have an attachment without light spots on the attachment.
  • the image set included in the present disclosure is determined by the open and closed eye detection training task included in the present disclosure. For example, if the present disclosure includes at least two of the above-mentioned six open and closed eye detection training tasks, the present disclosure includes respective eye image sets corresponding to the at least two open and closed eye detection training tasks.
  • the eye image in the present disclosure may generally be: an eye image block cut out from an image containing the eye captured by the camera.
  • the formation process of the eye image in the present disclosure reference may be made to the description in the foregoing method embodiment, which is not described in detail here.
  • the eye image used for training the neural network 500 for detecting open and closed eyes in the present disclosure usually has annotation information, and the annotation information may indicate the open and closed state of the eyes in the eye image.
  • the labeling information in the present disclosure may also indicate that the eyes in the eye image are in an uncertain open or closed state.
  • However, the eye images used for training the neural network 500 in the present disclosure generally do not include eye images whose labeling information indicates an uncertain open or closed state. This helps avoid the influence of eye images in an uncertain open or closed state on the neural network 500, and helps improve the detection accuracy of the neural network 500 for open and closed eye detection.
  • the input module 520 is used to obtain a corresponding number of eye images from different image sets and provide them to the neural network 500 for open and closed eye detection to be trained. For example, the input module 520 obtains a corresponding number of eye images from the different image sets according to preset image quantity ratios for the different open and closed eye detection training tasks, and provides them to the neural network 500 for open and closed eye detection to be trained. In addition, the input module 520 usually considers the preset batch processing quantity when acquiring the eye images.
  • For example, the input module 520 can obtain 200 eye images from the eye image set corresponding to open and closed eye detection training task a, 200 eye images from the eye image set corresponding to open and closed eye detection training task b, and 200 eye images from the eye image set corresponding to open and closed eye detection training task c.
  • the input module 520 may likewise obtain corresponding numbers of eye images from the eye image sets corresponding to other open and closed eye detection training tasks to achieve batch processing.
  • For another example, the input module 520 can obtain 250 eye images from the eye image set corresponding to open and closed eye detection training task a, 250 eye images from the eye image set corresponding to open and closed eye detection training task b, and 100 eye images from the eye image set corresponding to open and closed eye detection training task c, so that the input module 520 acquires 600 eye images in total.
  • the input module 520 may also adopt a manner of randomly setting a number to obtain a corresponding number of eye images from respective eye image sets corresponding to different training tasks.
  • the present disclosure does not limit the specific implementation manner in which the input module 520 obtains a corresponding number of eye images from eye image sets corresponding to different training tasks.
  • When acquiring eye images, the input module 520 should avoid acquiring eye images whose labeling information indicates an uncertain open or closed state, which helps improve the detection accuracy of the neural network for open and closed eye detection.
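The ratio-based sampling performed by the input module can be sketched as follows. The set names, ratios, and batch size below reuse the illustrative figures from the description; the helper itself is an assumption, not the disclosed implementation:

```python
import random

def sample_batch(image_sets: dict, ratios: dict, batch_size: int):
    """Draw one training batch from several task-specific eye-image sets.

    image_sets : task name -> list of eye images (images labeled as being
                 in an uncertain open/closed state are assumed filtered out)
    ratios     : task name -> preset share of the batch, summing to 1
    """
    batch = []
    for task, share in ratios.items():
        count = round(batch_size * share)
        batch.extend(random.sample(image_sets[task], count))
    return batch
```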
  • the input module 520 may sequentially provide the acquired multiple eye images to the neural network 500 for open and closed eye detection to be trained, and the neural network 500 performs eye open and closed state detection processing on each input eye image separately, so that the neural network 500 for open and closed eye detection to be trained sequentially outputs the eye open and closed state detection result of each eye image.
  • After an eye image input to the neural network 500 for open and closed eye detection to be trained is processed by the convolutional layer, the fully connected layer, and the layer for classification, the neural network 500 outputs two probability values, the value ranges of the two probability values are both 0 to 1, and the sum of the two probability values is 1.
  • One of the probability values corresponds to the open state: the closer the probability value is to 1, the closer the eyes in the eye image are to the open state. The other probability value corresponds to the closed state: the closer the probability value is to 1, the closer the eyes in the eye image are to the closed state.
  • the adjustment module 510 is configured to determine the respective losses of the at least two open and closed eye detection training tasks according to the eye open and closed annotation information of the eye images and the eye open and closed state detection results output by the neural network 500, and to adjust the network parameters of the neural network 500 according to the respective losses of the at least two open and closed eye detection training tasks.
  • the adjustment module 510 should determine the loss corresponding to each open and closed eye detection training task, determine the comprehensive loss according to the respective losses of all training tasks, and use the comprehensive loss to adjust the network parameters of the neural network. The network parameters may include but are not limited to: convolution kernel parameters and/or matrix weights. The present disclosure does not limit the specific content contained in the network parameters.
  • the adjustment module 510 may use the angle between the largest probability value in the eye open and closed state detection results output by the neural network for the multiple eye images in the image set corresponding to a training task and the decision boundary corresponding to the annotation information of the corresponding eye images in the image set, to determine the loss corresponding to the training task.
  • the adjustment module 510 may use the A-softmax (angular softmax) loss function to determine the loss corresponding to each of the different open and closed eye detection training tasks based on the eye open and closed annotation information of the eye images and the eye open and closed state detection results output by the neural network, and determine the comprehensive loss (such as the sum of the individual losses) according to the respective losses of the different open and closed eye detection training tasks.
  • the adjustment module 510 adopts a stochastic gradient descent method to adjust the network parameters of the neural network.
  • the adjustment module 510 can use the A-softmax loss function to calculate the loss of each open and closed eye detection training task, perform back propagation based on the sum of the losses of all open and closed eye detection training tasks, and update the network parameters of the neural network 500 for open and closed eye detection to be trained by gradient descent on the loss.
  • When the training of the neural network 500 for open and closed eye detection to be trained reaches a predetermined iterative condition, the adjustment module 510 may control this training process to end.
  • the predetermined iterative condition in the present disclosure may include: the difference between the eye open and closed state detection result output by the neural network 500 for eye open and closed detection to be trained for the eye image and the annotation information of the eye image meets the predetermined difference requirement. In the case where the difference meets the predetermined difference requirement, the neural network 500 is successfully trained this time.
  • the predetermined iterative conditions used by the adjustment module 510 may also include: the number of eye images used to train the neural network for open and closed eye detection to be trained reaching a predetermined number requirement, etc. If the number of eye images used reaches the predetermined number requirement but the difference does not meet the predetermined difference requirement, the neural network 500 was not successfully trained this time.
  • the neural network 500 that has been successfully trained can be used for the detection processing of the eye open and closed state.
  • Fig. 6 is a schematic structural diagram of an embodiment of an eye open-close state detection device of the present disclosure.
  • the device of this embodiment includes: an acquisition module 600 and a neural network 610.
  • the device for detecting the eye open and closed state may further include: a determining module 620.
  • the acquiring module 600 is used to acquire the image to be processed.
  • the image to be processed obtained by the acquiring module 600 may be an image presenting a static picture or a photo, or may be a video frame in a dynamic video, for example, a video frame in a video captured by a camera set on a moving object, or a video frame in a video captured by a camera set at a fixed position.
  • the above-mentioned moving objects may be vehicles, robots, or robotic arms.
  • the above-mentioned fixed position can be a desktop or a wall.
  • the acquiring module 600 may detect the location area of the eyes in the image to be processed. For example, the acquiring module 600 may use methods such as face detection or face key point detection to determine the circumscribed frame of the eyes in the image to be processed. Then, the acquiring module 600 can segment the image of the eye area from the image to be processed according to the circumscribed frame of the eyes, and the segmented eye image block is provided to the neural network 610. Of course, the acquiring module 600 may perform certain preprocessing on the segmented eye image blocks before providing them to the neural network 610.
  • For example, the acquiring module 600 performs scaling processing on the segmented eye image blocks so that the size of the scaled eye image blocks meets the size requirement of the neural network 610 for the input image. For another example, after segmenting the eye image blocks of the target object's two eyes, the acquiring module 600 performs mirror-mapping processing on the eye image block on a predetermined side, thereby forming two eye image blocks on the same side of the target object. Of course, the acquiring module 600 can also perform scaling processing on the two eye image blocks on the same side.
  • the present disclosure does not limit the specific implementation manner of the acquisition module 600 segmenting the eye image blocks from the image to be processed, nor the specific implementation manner of the acquisition module 600 preprocessing the segmented eye image blocks.
  • the neural network 610 is used to perform eye open and closed state detection processing on the image to be processed and output an eye open and closed state detection result.
  • the neural network 610 in the present disclosure, for an input eye image block, may output an eye open and closed state detection result of at least one probability value, for example, a probability value indicating that the eyes are in the open state and a probability value indicating that the eyes are in the closed state.
  • the value ranges of the two probability values may both be 0 to 1, and the sum of the two probability values for the same eye image block is 1. The closer the probability value that the eyes are in the open state is to 1, the closer the eyes in the eye image block are to the open-eye state; the closer the probability value that the eyes are in the closed state is to 1, the closer the eyes in the eye image block are to the closed-eye state.
  • the determining module 620 is configured to determine the eye movements and/or facial expressions and/or fatigue state and/or interactive control information of the target object at least according to the eye open and closed state detection results of the same target object in the multiple to-be-processed images with a time sequence relationship.
  • the eye motions of the target object are, for example, a quick blinking motion, an opening-one-eye-while-closing-the-other motion, or a squinting motion.
  • the facial expressions of the target object are, for example, smiling, laughing, crying, or sadness.
  • the fatigue state of the target object is, for example, mild fatigue, dozing off, or being deeply asleep.
  • the interactive control information expressed by the target object is, for example, confirmation or denial.
  • FIG. 7 is a schematic structural diagram of an embodiment of the intelligent driving control device of the present disclosure.
  • the device in FIG. 7 mainly includes: an acquisition module 600, a neural network 610, a fatigue state determination module 700, and an instruction module 710.
  • the acquisition module 600 is used to acquire the to-be-processed image collected by the camera device installed on the vehicle.
  • the neural network 610 is used to perform eye open and closed state detection processing on the image to be processed and output an eye open and closed state detection result.
  • the fatigue state determining module 700 is configured to determine the fatigue state of the target object at least according to the detection results of the open/closed state of the eyes belonging to the same target object in the plurality of images to be processed with a time series relationship.
  • the target object in this disclosure is usually a driver.
  • the fatigue state determining module 700 can determine index parameters of the target object (such as a driver), such as the number of blinks per unit time, the duration of a single eye closure, or the duration of a single eye opening, based on multiple eye open and closed state detection results that belong to the same target object and have a time sequence relationship. The fatigue state determining module 700 then judges the corresponding index parameters against predetermined index requirements to determine whether the target object (such as the driver) is in a fatigue state.
  • the fatigue state in the present disclosure may include various fatigue states of different degrees, for example, a mild fatigue state, a moderate fatigue state, or a severe fatigue state. The present disclosure does not limit the specific implementation manner of determining the fatigue state of the target object by the fatigue state determining module 700.
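  • The index parameters described above (for example, blinks per unit time and the duration of a single eye closure) can be computed from a time-ordered sequence of open/closed detection results. The following is a hedged sketch; the thresholds in `classify_fatigue` are invented for illustration and are not specified by the disclosure.

```python
# Hedged sketch of computing the index parameters described above from a
# time-ordered list of (timestamp_seconds, is_closed) detection results.
# The thresholds in classify_fatigue are invented for illustration and
# are not specified by the disclosure.
from typing import List, Tuple

def fatigue_metrics(states: List[Tuple[float, bool]]) -> Tuple[int, float]:
    """Return (blink_count, longest_single_closure_seconds)."""
    blinks, longest = 0, 0.0
    closure_start = None
    prev_closed = False
    for ts, closed in states:
        if closed and not prev_closed:
            closure_start = ts                # a closure (possible blink) begins
        elif not closed and prev_closed and closure_start is not None:
            blinks += 1                       # closure ended: count one blink
            longest = max(longest, ts - closure_start)
        prev_closed = closed
    return blinks, longest

def classify_fatigue(blinks_per_minute: float, longest_closure: float) -> str:
    """Map index parameters to a coarse fatigue level (assumed thresholds)."""
    if longest_closure > 2.0:
        return "severe"
    if longest_closure > 1.0 or blinks_per_minute > 30:
        return "moderate"
    if blinks_per_minute > 20:
        return "mild"
    return "none"
```

  • A closure that never reopens within the window is deliberately not counted as a blink here; a production system would also need to handle such truncated closures.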
  • the instruction module 710 is used to form a corresponding instruction according to the fatigue state of the target object, and output the instruction.
  • the instruction module 710 generates instructions based on the fatigue state of the target object. The generated instructions may include at least one of: an instruction to switch to a smart driving state, an instruction to issue a voice alert against fatigued driving, an instruction to vibrate to wake up the driver, an instruction to report dangerous driving information, etc. The present disclosure does not limit the specific manifestation of the instruction.
  • since the fatigue state determining module 700 judges the fatigue state based on the eye open and closed state detection results output by the neural network 610, the accuracy of fatigue state detection is improved. The instruction module 710 then forms a corresponding instruction according to the detected fatigue state, which is beneficial for avoiding fatigued driving and thus for improving driving safety.
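  • A minimal sketch of how the instruction module 710 might map a detected fatigue state to output instructions. The instruction names mirror the examples listed above, but the specific mapping is an assumption for illustration only.

```python
# Minimal sketch of the instruction module's fatigue-state-to-instruction
# mapping. The instruction names mirror the examples listed above; the
# mapping itself is an assumption for illustration.
FATIGUE_INSTRUCTIONS = {
    "mild": ["voice_alert_fatigue_driving"],
    "moderate": ["voice_alert_fatigue_driving", "vibrate_to_wake_driver"],
    "severe": [
        "switch_to_smart_driving_state",
        "vibrate_to_wake_driver",
        "report_dangerous_driving_information",
    ],
}

def form_instructions(fatigue_state: str) -> list:
    """Return the instruction(s) for the detected fatigue state, if any."""
    return FATIGUE_INSTRUCTIONS.get(fatigue_state, [])
```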
  • FIG. 8 shows an exemplary device 800 suitable for implementing the present disclosure.
  • the device 800 may be a control system/electronic system configured in a car, a mobile terminal (for example, a smart mobile phone, etc.), a personal computer (PC, for example, a desktop computer or notebook computer, etc.), a tablet computer, a server, etc.
  • the device 800 includes one or more processors, a communication part, etc. The one or more processors may be one or more central processing units (CPU) 801 and/or one or more acceleration units 813; the acceleration unit 813 may be, for example, a graphics processing unit (GPU). The processor can perform various appropriate actions and processing according to executable instructions stored in a read-only memory (ROM) 802 or loaded from the storage part 808 into a random access memory (RAM) 803.
  • the communication part 812 may include, but is not limited to, a network card, and the network card may include, but is not limited to, an IB (Infiniband) network card.
  • the processor can communicate with the read-only memory 802 and/or the random access memory 803 to execute executable instructions, connect with the communication part 812 through the bus 804, and communicate with other target devices through the communication part 812, thereby completing the corresponding steps in the present disclosure.
  • the RAM 803 can also store various programs and data required for device operation.
  • the CPU 801, the ROM 802, and the RAM 803 are connected to each other through a bus 804.
  • the ROM 802 is an optional module.
  • the RAM 803 stores executable instructions, or writes executable instructions into the ROM 802 during operation, and the executable instructions cause the central processing unit 801 to execute the steps included in the above method.
  • An input/output (I/O) interface 805 is also connected to the bus 804.
  • the communication unit 812 may be integrated, or may be configured to have multiple sub-modules (for example, multiple IB network cards) and be connected to the bus respectively.
  • the following components are connected to the I/O interface 805: an input part 806 including a keyboard, a mouse, etc.; an output part 807 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, etc.; a storage part 808 including a hard disk, etc.; and a communication part 809 including a network interface card such as a LAN card, a modem, etc.
  • the communication section 809 performs communication processing via a network such as the Internet.
  • the drive 810 is also connected to the I/O interface 805 as needed.
  • a removable medium 811 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, etc., is installed on the drive 810 as needed, so that the computer program read from it is installed in the storage section 808 as needed.
  • the architecture shown in FIG. 8 is only an optional implementation. In specific practice, the number and types of the components in FIG. 8 can be selected, deleted, added, or replaced according to actual needs; for different functional components, implementation manners such as separate arrangement or integrated arrangement can also be adopted.
  • for example, the acceleration unit 813 and the CPU 801 can be separately arranged; alternatively, the acceleration unit 813 can be integrated on the CPU 801, and the communication part can be separately arranged or integrated on the CPU 801 or the acceleration unit 813.
  • the process described above with reference to the flowcharts can be implemented as a computer software program.
  • the embodiments of the present disclosure include a computer program product, which includes a computer program tangibly embodied on a machine-readable medium.
  • the computer program includes program code for executing the steps shown in the flowchart.
  • the program code may include instructions corresponding to the steps in the method provided by the present disclosure.
  • the computer program may be downloaded and installed from the network through the communication part 809, and/or installed from the removable medium 811.
  • when the computer program is executed by the central processing unit (CPU) 801, the instructions described in the present disclosure for implementing the above-mentioned corresponding steps are executed.
  • the embodiments of the present disclosure also provide a computer program product for storing computer-readable instructions, which, when executed, cause a computer to execute the procedures described in any of the foregoing embodiments.
  • the computer program product can be specifically implemented by hardware, software or a combination thereof.
  • the computer program product is specifically embodied as a computer storage medium.
  • the computer program product is specifically embodied as a software product, such as a software development kit (SDK), etc.
  • the embodiments of the present disclosure also provide another method for detecting the open and closed state of eyes, a method for intelligent driving control, and a training method for neural networks, and corresponding devices, electronic equipment, and computer storage media.
  • a computer program and a computer program product, wherein the method includes: the first device sends a neural network training instruction, an eye open and closed state detection instruction, or an intelligent driving control instruction to the second device, and the instruction causes the second device to perform the neural network training method, the eye open and closed state detection method, or the intelligent driving control method in any of the above possible embodiments; and the first device receives the neural network training result, the eye open and closed state detection result, or the intelligent driving control result sent by the second device.
  • the neural network training instruction, the eye open and closed state detection instruction, or the intelligent driving control instruction may specifically be a call instruction, and the first device may instruct the second device, by way of calling, to perform the neural network training operation, the eye open and closed state detection operation, or the intelligent driving control operation; correspondingly, in response to receiving the call instruction, the second device may execute the steps and/or processes in any embodiment of the above-mentioned neural network training method, eye open and closed state detection method, or intelligent driving control method.
  • the method and apparatus, electronic equipment, and computer-readable storage medium of the present disclosure may be implemented in many ways.
  • the method and apparatus, electronic equipment, and computer-readable storage medium of the present disclosure can be implemented by software, hardware, firmware or any combination of software, hardware, and firmware.
  • the above-mentioned order of the steps for the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above, unless otherwise specifically stated.
  • the present disclosure can also be implemented as programs recorded in a recording medium, and these programs include machine-readable instructions for implementing the method according to the present disclosure.
  • the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Ophthalmology & Optometry (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the present disclosure relate to a neural network training method, an eye open and closed state detection method, an intelligent driving control method, an apparatus, an electronic device, a computer-readable storage medium, and a computer program. The neural network training method comprises: performing, by means of a neural network to be trained for eye open and closed state detection, eye open and closed state detection processing on multiple eye images in the image sets corresponding to each of at least two open and closed eye detection learning tasks, and outputting the eye open and closed state detection results, wherein the eye images contained in different image sets are at least partially different; determining, on the basis of the eye open and closed label information of the eye images and the eye open and closed state detection results output by the neural network, the loss corresponding to each of the at least two open and closed eye detection learning tasks; and adjusting the network parameters of the neural network on the basis of the loss corresponding to each of the at least two open and closed eye detection learning tasks.
PCT/CN2019/118127 2019-02-28 2019-11-13 Neural network training and eye open and closed state detection method, apparatus and device WO2020173135A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
KR1020217023286A KR20210113621A (ko) 2019-02-28 2019-11-13 Neural network training and eye open and closed state detection method, apparatus and device
JP2021541183A JP7227385B2 (ja) 2019-02-28 2019-11-13 Neural network training and eye open and closed state detection method, apparatus and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910153463.4 2019-02-28
CN201910153463.4A CN111626087A (zh) 2019-02-28 2019-02-28 神经网络训练及眼睛睁闭状态检测方法、装置及设备

Publications (1)

Publication Number Publication Date
WO2020173135A1 true WO2020173135A1 (fr) 2020-09-03

Family

ID=72238751

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/118127 WO2020173135A1 (fr) 2019-02-28 2019-11-13 Apprentissage de réseau neuronal et procédé, appareil et dispositif de détection d'état d'ouverture et de fermeture d'œil

Country Status (4)

Country Link
JP (1) JP7227385B2 (fr)
KR (1) KR20210113621A (fr)
CN (1) CN111626087A (fr)
WO (1) WO2020173135A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114283488A (zh) * 2022-03-08 2022-04-05 北京万里红科技有限公司 Method for generating a detection model and method for detecting an eye state using the detection model

Families Citing this family (3)

Publication number Priority date Publication date Assignee Title
CN113313790A (zh) * 2021-05-31 2021-08-27 北京字跳网络技术有限公司 Video generation method, apparatus, device and storage medium
CN113537176A (zh) * 2021-09-16 2021-10-22 武汉未来幻影科技有限公司 Method, apparatus and device for determining a driver fatigue state
CN117687313B (zh) * 2023-12-29 2024-07-12 广东福临门世家智能家居有限公司 Smart home device control method and system based on a smart door lock

Citations (5)

Publication number Priority date Publication date Assignee Title
CN106585629A (zh) * 2016-12-06 2017-04-26 广州市科恩电脑有限公司 Vehicle control method and device
US20170294010A1 (en) * 2016-04-12 2017-10-12 Adobe Systems Incorporated Utilizing deep learning for rating aesthetics of digital images
CN108805185A (zh) * 2018-05-29 2018-11-13 腾讯科技(深圳)有限公司 Model training method and apparatus, storage medium, and computer device
CN108985135A (zh) * 2017-06-02 2018-12-11 腾讯科技(深圳)有限公司 Face detector training method, apparatus and electronic device
WO2019028798A1 (fr) * 2017-08-10 2019-02-14 北京市商汤科技开发有限公司 Driving state monitoring method and device, and electronic device

Family Cites Families (14)

Publication number Priority date Publication date Assignee Title
JP4687150B2 (ja) 2005-03-08 2011-05-25 日産自動車株式会社 直射光検出装置
JP4770218B2 (ja) 2005-03-22 2011-09-14 日産自動車株式会社 視認行動判定装置
JP4978227B2 (ja) 2007-02-15 2012-07-18 トヨタ自動車株式会社 画像検出装置
CN107003834B (zh) * 2014-12-15 2018-07-06 北京市商汤科技开发有限公司 行人检测设备和方法
JP2016176699A (ja) 2015-03-18 2016-10-06 株式会社オートネットワーク技術研究所 経路探索装置、経路探索方法及びコンピュータプログラム
JP6582604B2 (ja) 2015-06-23 2019-10-02 富士通株式会社 瞳孔検出プログラム、瞳孔検出方法、瞳孔検出装置および視線検出システム
JP6892231B2 (ja) 2016-07-29 2021-06-23 アルパイン株式会社 瞼開閉検出装置および瞼開閉検出方法
JP6762794B2 (ja) 2016-07-29 2020-09-30 アルパイン株式会社 瞼開閉検出装置および瞼開閉検出方法
CN106529402B (zh) * 2016-09-27 2019-05-28 中国科学院自动化研究所 基于多任务学习的卷积神经网络的人脸属性分析方法
JP2018075208A (ja) 2016-11-10 2018-05-17 パナソニックIpマネジメント株式会社 運転者の状態検出システムおよび状態検出方法
CN108022238B (zh) * 2017-08-09 2020-07-03 深圳科亚医疗科技有限公司 对3d图像中对象进行检测的方法、计算机存储介质和系统
CN108614999B (zh) * 2018-04-16 2022-09-16 贵州大学 基于深度学习的眼睛睁闭状态检测方法
CN108960071A (zh) * 2018-06-06 2018-12-07 武汉幻视智能科技有限公司 一种睁眼闭眼状态检测方法
CN108932536B (zh) * 2018-07-18 2021-11-09 电子科技大学 基于深度神经网络的人脸姿态重建方法

Cited By (2)

Publication number Priority date Publication date Assignee Title
CN114283488A (zh) * 2022-03-08 2022-04-05 北京万里红科技有限公司 Method for generating a detection model and method for detecting an eye state using the detection model
CN114283488B (zh) * 2022-03-08 2022-06-14 北京万里红科技有限公司 Method for generating a detection model and method for detecting an eye state using the detection model

Also Published As

Publication number Publication date
KR20210113621A (ko) 2021-09-16
JP2022517398A (ja) 2022-03-08
JP7227385B2 (ja) 2023-02-21
CN111626087A (zh) 2020-09-04

Similar Documents

Publication Publication Date Title
WO2020173135A1 (fr) Apprentissage de réseau neuronal et procédé, appareil et dispositif de détection d'état d'ouverture et de fermeture d'œil
US11551377B2 (en) Eye gaze tracking using neural networks
CN108229284B (zh) 视线追踪及训练方法和装置、系统、电子设备和存储介质
WO2019128932A1 (fr) Procédé et appareil d'analyse de pose de visage, dispositif, support d'informations et programme
WO2022156640A1 (fr) Procédé et appareil de correction du regard pour image, dispositif électronique, support d'enregistrement lisible par ordinateur et produit programme d'ordinateur
WO2018054329A1 (fr) Procédé et dispositif de détection d'objets, appareil électronique, programme informatique et support de stockage
EP4053735A1 (fr) Procédé de structuration d'informations sur un piéton, dispositif, appareil et support d'enregistrement
WO2020125499A1 (fr) Procédé d'invite d'opération et lunettes
US20210124928A1 (en) Object tracking methods and apparatuses, electronic devices and storage media
US11704563B2 (en) Classifying time series image data
WO2017114168A1 (fr) Procédé et dispositif de détection de cibles
WO2019029459A1 (fr) Procédé et dispositif de reconnaissance d'âge facial et dispositif électronique
WO2021238586A1 (fr) Procédé et appareil d'entraînement, dispositif, et support de stockage lisible par ordinateur
WO2022082999A1 (fr) Procédé et appareil de reconnaissance d'objets, dispositif terminal et support de stockage
WO2023178906A1 (fr) Procédé et appareil de détection de vivacité, et dispositif électronique, support de stockage, programme informatique et produit-programme informatique
US11868523B2 (en) Eye gaze classification
Liu et al. Rgbd video based human hand trajectory tracking and gesture recognition system
KR20210000671A (ko) 헤드 포즈 추정
CN114461078B (zh) 一种基于人工智能的人机交互方法
US11796801B2 (en) Reducing light leakage via external gaze detection
Tazhigaliyeva et al. Cyrillic manual alphabet recognition in RGB and RGB-D data for sign language interpreting robotic system (SLIRS)
EP4287123A1 (fr) Procédé d'estimation d'une position tridimensionnelle d'un objet
Hanche et al. Comparative Analysis of Methods of Gesture Recognition in Image Processing
CN118043859A (zh) 高效视觉感知
CN117392527A (zh) 一种高精度水下目标分类检测方法及其模型搭建方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19916840

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021541183

Country of ref document: JP

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 20217023286

Country of ref document: KR

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19916840

Country of ref document: EP

Kind code of ref document: A1