WO2023090618A1 - Vehicle device for determining a driver's gaze state using artificial intelligence, and control method therefor - Google Patents

Vehicle device for determining a driver's gaze state using artificial intelligence, and control method therefor

Info

Publication number
WO2023090618A1
WO2023090618A1 (PCT/KR2022/014220)
Authority
WO
WIPO (PCT)
Prior art keywords
driver
gaze
data
learning
image
Prior art date
Application number
PCT/KR2022/014220
Other languages
English (en)
Korean (ko)
Inventor
강현욱
김나경
김병욱
강동희
Original Assignee
전남대학교 산학협력단
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 전남대학교 산학협력단
Publication of WO2023090618A1

Classifications

    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W40/00Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models
    • B60W40/08Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models related to drivers or passengers
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W40/00Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models
    • B60W40/10Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models related to vehicle motion
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W50/00Details of control systems for road vehicle drive control not related to the control of a particular sub-unit, e.g. process diagnostic or vehicle driver interfaces
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W40/00Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models
    • B60W40/08Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models related to drivers or passengers
    • B60W2040/0818Inactivity or incapacity of driver
    • B60W2040/0863Inactivity or incapacity of driver due to erroneous selection or response of the driver
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W50/00Details of control systems for road vehicle drive control not related to the control of a particular sub-unit, e.g. process diagnostic or vehicle driver interfaces
    • B60W2050/0001Details of the control system
    • B60W2050/0002Automatic control, details of type of controller or control system architecture
    • B60W2050/0004In digital systems, e.g. discrete-time systems involving sampling
    • B60W2050/0005Processor details or data handling, e.g. memory registers or chip architecture
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W50/00Details of control systems for road vehicle drive control not related to the control of a particular sub-unit, e.g. process diagnostic or vehicle driver interfaces
    • B60W2050/0001Details of the control system
    • B60W2050/0002Automatic control, details of type of controller or control system architecture
    • B60W2050/0008Feedback, closed loop systems or details of feedback error signal
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W2420/00Indexing codes relating to the type of sensors based on the principle of their operation
    • B60W2420/40Photo, light or radio wave sensitive means, e.g. infrared sensors
    • B60W2420/403Image sensing, e.g. optical camera
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W2520/00Input parameters relating to overall vehicle dynamics
    • B60W2520/10Longitudinal speed
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W2540/00Input parameters relating to occupants
    • B60W2540/225Direction of gaze
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W2554/00Input parameters relating to objects
    • B60W2554/80Spatial relation or speed relative to objects
    • B60W2554/801Lateral distance

Definitions

  • the present disclosure relates to a vehicle device and a control method thereof, and more particularly, to a vehicle device for determining a driver's gaze state and a control method thereof.
  • For example, when a driver looks away from the road for about 2 seconds while traveling at 100 km/h, the car's moving distance is about 55 m or more, which is effectively the same as driving with one's eyes closed and can lead to a dangerous large-scale accident.
  • The present disclosure has been made in view of the above-mentioned needs, and an object of the present disclosure is to provide a vehicle device and a control method thereof that obtain a driver's gaze data using an artificial neural network model, determine the driver's forward gaze state based on the gaze data, and provide corresponding feedback.
  • According to an embodiment, a vehicle apparatus includes a camera, a memory in which a learned artificial neural network model is stored, and a processor that inputs a driver's photographed image obtained through the camera to the artificial neural network model to acquire the driver's gaze data, determines the driver's gaze state based on the gaze data, and provides feedback corresponding to the gaze state.
  • The artificial neural network model is a model learned using driver images for learning as input data and gaze data corresponding to each driver image for learning as output data, and the gaze data corresponding to each driver image for learning can be obtained based on the coordinates and direction vectors of the driver's pupils.
  • The gaze data corresponding to each driver image for learning is obtained based on 3D absolute coordinates and 3D direction vectors of the driver's eyes corresponding to the driver image for learning, and the driver's gaze data output from the artificial neural network model may be 3D gaze coordinates corresponding to the driver's eyes.
  • The processor acquires a driver image for learning corresponding to the driver's careless state, extracts pixel coordinates of the driver's facial feature points and direction vectors of the driver's pupils from the driver image for learning, extracts the pixel coordinates of the pupils based on the pixel coordinates of the facial feature points, converts the pixel coordinates of the pupils into 3D absolute coordinates, and obtains gaze data corresponding to the driver image for learning based on the 3D absolute coordinates and the direction vectors of the pupils.
  • The processor may obtain driving data of the vehicle device and provide feedback corresponding to the driver's gaze state based on the driving data of the vehicle device, and the driving data of the vehicle device may include vehicle speed data and separation distance information from the center of the lane.
  • the processor may determine the driver's gaze state based on the driver's gaze data and additional information, and the additional information may include at least one of driving environment information and driver profile information.
  • According to an embodiment, a control method of a vehicle device includes acquiring a driver's gaze data by inputting a driver's photographed image acquired through a camera to a learned artificial neural network model, determining the driver's gaze state based on the driver's gaze data, and providing feedback corresponding to the driver's gaze state. The artificial neural network model is a model learned using driver images for learning as input data and gaze data corresponding to each driver image for learning as output data, and the gaze data corresponding to each driver image for learning may be obtained based on the coordinates and direction vectors of the driver's eyes corresponding to the driver image for learning.
  • The gaze data corresponding to each driver image for learning is obtained based on 3D absolute coordinates and 3D direction vectors of the driver's eyes corresponding to the driver image for learning, and the driver's gaze data output from the artificial neural network model may be 3D gaze coordinates corresponding to the driver's eyes.
  • The control method may further include acquiring a driver image for learning corresponding to the driver's inattention state, extracting pixel coordinates of the driver's facial feature points and direction vectors of the driver's pupils from the driver image for learning, extracting the pixel coordinates of the pupils based on the pixel coordinates of the facial feature points, converting the pixel coordinates of the pupils into 3D absolute coordinates, and obtaining gaze data corresponding to the driver image for learning based on the 3D absolute coordinates and the direction vectors of the pupils.
  • the method may further include obtaining driving data of the vehicle device, and providing feedback corresponding to the driver's gaze state may include providing feedback corresponding to the driver's gaze state based on the driving data of the vehicle device.
  • the driving data of the vehicle device may include vehicle speed data and separation distance information from the center of the lane.
  • The determining of the driver's gazing state may include determining the driver's gazing state based on the driver's gaze data and additional information, and the additional information may include at least one of driving environment information and driver profile information.
  • FIG. 1 is a block diagram illustrating a configuration of a vehicle device according to an exemplary embodiment of the present disclosure.
  • FIGS. 2A and 2B are diagrams for explaining a learning method of an artificial neural network model according to an embodiment.
  • FIGS. 3 to 6 are diagrams for explaining a learning data acquisition method according to an exemplary embodiment.
  • FIG. 7 is a diagram for explaining the operation of a learned artificial neural network model according to an embodiment.
  • FIG. 8 is a diagram for explaining an actual application example and effects for determining a driver's gaze state according to an exemplary embodiment.
  • FIG. 9 is a diagram illustrating an implementation example of a vehicle device according to an exemplary embodiment.
  • FIG. 10 is a flowchart illustrating a vehicle control method according to an exemplary embodiment.
  • In the present disclosure, expressions such as "has," "can have," "includes," or "can include" indicate the existence of a corresponding feature (e.g., a numerical value, function, operation, or component such as a part) and do not preclude the existence of additional features.
  • When a component (e.g., a first component) is described as being connected to another component (e.g., a second component), it should be understood that the component may be directly connected to the other component or may be connected through yet another component (e.g., a third component).
  • a “module” or “unit” performs at least one function or operation, and may be implemented in hardware or software or a combination of hardware and software.
  • A plurality of "modules" or "units" may be integrated into at least one module and implemented by at least one processor (not shown), except for "modules" or "units" that need to be implemented with specific hardware.
  • FIG. 1 is a block diagram illustrating a configuration of a vehicle device according to an exemplary embodiment of the present disclosure.
  • a vehicle device 100 includes a camera 110 , a memory 120 and a processor 130 .
  • the camera 110 may be turned on according to a preset event to take a picture.
  • the camera 110 may convert a captured image into an electrical signal and generate image data based on the converted signal.
  • For example, an image of a subject may be converted into an electrical image signal through a charge coupled device (CCD), and the converted signal may be amplified, converted into a digital signal, and then signal-processed.
  • the camera 110 may be implemented as a general camera, a stereo camera, or a depth camera.
  • the camera 110 may be disposed at a location within the vehicle device 100 capable of capturing the driver's face and obtain an image of the driver's face.
  • the camera 110 may be disposed on a dashboard within the vehicle device 100 .
  • the memory 120 may store data necessary for various embodiments of the present disclosure.
  • the memory 120 may be implemented in the form of a memory embedded in the vehicle device 100 or in the form of a memory capable of communicating with (or detachable from) the vehicle device 100 according to a data storage purpose.
  • For example, data for driving the vehicle device 100 may be stored in a memory embedded in the vehicle device 100, and data for an extended function of the vehicle device 100 may be stored in a memory capable of communicating with the vehicle device 100.
  • The memory embedded in the vehicle device 100 may be implemented as at least one of volatile memory (e.g., dynamic RAM (DRAM), static RAM (SRAM), synchronous dynamic RAM (SDRAM), etc.) or non-volatile memory (e.g., one time programmable ROM (OTPROM), programmable ROM (PROM), erasable and programmable ROM (EPROM), electrically erasable and programmable ROM (EEPROM), mask ROM, flash ROM, flash memory (such as NAND flash or NOR flash), or a solid state drive (SSD)).
  • The memory detachable from the vehicle device 100 may be implemented as a memory card (e.g., compact flash (CF), secure digital (SD), micro secure digital (Micro-SD), mini secure digital (Mini-SD), extreme digital (xD), or multi-media card (MMC)).
  • the memory 120 may store at least one instruction or a computer program including instructions for controlling the vehicle device 100 .
  • the memory 120 may store various data, programs, or applications for driving/controlling the vehicle device 100 .
  • the vehicle device 100 may store a control program for controlling the vehicle device 100 and the processor 130, an application initially provided by a manufacturer or downloaded from the outside, databases, or related data.
  • the memory 120 may store information for determining a gaze state based on gaze data, feedback information corresponding to a driver's gaze state, and the like, according to an embodiment.
  • the memory 120 may store information about an artificial neural network model (or artificial intelligence model) including a plurality of layers.
  • Here, storing information about the artificial neural network model means storing various information related to the operation of the artificial neural network model, for example, information about a plurality of layers included in the model and parameters used in each of the plurality of layers (e.g., filter coefficients, biases, etc.).
  • the memory 120 may store information about an artificial neural network model learned to output driver's gaze data according to an embodiment.
  • When the processor 130 is implemented as hardware dedicated to the artificial neural network model, information on the artificial neural network model may be stored in an internal memory of the processor 130.
  • According to another embodiment, the artificial neural network model may be stored in an external device such as a server, and the vehicle device 100 may acquire the driver's gaze data from the external device by transmitting the driver's captured image to the external device.
  • the memory 120 may be implemented as a single memory that stores data generated in various operations according to the present disclosure. However, according to another embodiment, the memory 120 may be implemented to include a plurality of memories each storing different types of data or each storing data generated in different steps.
  • the processor 130 is electrically connected to the camera 110 and the memory 120 to control overall operations of the vehicle device 100 .
  • Processor 130 may be composed of one or a plurality of processors. Specifically, the processor 130 may perform the operation of the vehicle device 100 according to various embodiments of the present disclosure by executing at least one instruction stored in a memory (not shown).
  • According to an embodiment, the processor 130 may include one or more of a digital signal processor (DSP) for processing digital image signals, a microprocessor, a graphics processing unit (GPU), an artificial intelligence (AI) processor, a neural processing unit (NPU), a time controller, a central processing unit (CPU), a micro controller unit (MCU), a micro processing unit (MPU), a controller, an application processor (AP), or a communication processor (CP).
  • the processor 130 may be implemented in the form of a system on chip (SoC) with a built-in processing algorithm, large scale integration (LSI), application specific integrated circuit (ASIC), or field programmable gate array (FPGA).
  • According to an embodiment, the processor 130 for executing the artificial neural network model may be implemented through a combination of software and a general-purpose processor such as a CPU, an AP, or a digital signal processor (DSP), a graphics-only processor such as a GPU or a vision processing unit (VPU), or an artificial intelligence dedicated processor such as an NPU.
  • the processor 130 may control input data to be processed according to a predefined operation rule or an artificial neural network model stored in the memory 120 .
  • When the processor 130 is a dedicated processor (or an artificial intelligence dedicated processor), it may be designed as a hardware structure specialized for processing a specific artificial neural network model.
  • hardware specialized for processing a specific artificial neural network model may be designed as a hardware chip such as an ASIC or FPGA.
  • When the processor 130 is implemented as a dedicated processor, it may be implemented to include a memory for implementing an embodiment of the present disclosure or to include a memory processing function for using an external memory.
  • the processor 130 may obtain the driver's gaze data using the learned artificial neural network model.
  • The artificial neural network model may be, for example, a convolutional neural network (CNN), a recurrent neural network (RNN), a restricted Boltzmann machine (RBM), a deep belief network (DBN), a bidirectional recurrent deep neural network (BRDNN), or a deep Q-network, but is not limited thereto.
  • the processor 130 may acquire the driver's gaze data by inputting the driver's photographed image acquired in real time through the camera 110 to the artificial neural network model, and determine the driver's gaze state based on the driver's gaze data.
  • the driver's gaze data may include 3D gaze coordinates corresponding to the driver's eyes.
  • the 3D gaze coordinates may be coordinates including a depth direction (perspective).
  • the driver's photographed image may be an image including the driver's whole body, face, and other environments.
  • the artificial neural network model may be a model learned by using driver images for learning as input data and gaze data corresponding to each driver image for learning as output data.
  • gaze data corresponding to each driver's image for learning may be obtained based on the coordinates and direction vectors of the driver's eyes corresponding to the driver's image for learning.
  • gaze data corresponding to each driver image for learning may be obtained based on 3D absolute coordinates and 3D direction vectors of the driver's eyes corresponding to the driver's image for learning. Accordingly, the artificial neural network model may output 3D gaze coordinates corresponding to the driver's eyes when the driver's image is input.
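  • For illustration only, the following is a minimal sketch of the supervised training setup described above, assuming a PyTorch implementation, a small convolutional backbone, 224x224 RGB driver images, and a mean-squared-error loss over 3D gaze coordinates; none of these choices (framework, architecture, image size, loss) are specified by the present disclosure.

```python
# Minimal sketch (assumptions: PyTorch, 224x224 RGB input, 3D gaze-coordinate target).
import torch
import torch.nn as nn

class GazeNet(nn.Module):
    """Small CNN that regresses 3D gaze coordinates from a driver image."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 3)  # (x, y, z) gaze coordinates

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

model = GazeNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Dummy batch standing in for (driver image for learning, gaze data) pairs.
images = torch.randn(8, 3, 224, 224)   # input training data
gaze_targets = torch.randn(8, 3)       # output training data (3D gaze coordinates)

for step in range(10):                 # short illustrative training loop
    optimizer.zero_grad()
    loss = loss_fn(model(images), gaze_targets)
    loss.backward()
    optimizer.step()
```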
  • The processor 130 may determine the driver's gaze state based on the driver's gaze data obtained by inputting the driver's captured image to the artificial neural network model. According to an example, whether the driver is in a forward-looking state or a non-looking state (or careless state) may be determined based on the driver's 3D gaze coordinates. For example, the driver may be identified as being in a non-looking state (or inattentive state) when the obtained gaze data matches gaze data obtained based on a driver image for learning corresponding to the driver's non-looking state (or inattentive state).
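  • One hedged way to realize such a looking/non-looking decision from the 3D gaze coordinates is to test whether the predicted gaze point falls inside a forward-view region; the region bounds in the sketch below are hypothetical placeholders, not values from the disclosure.

```python
# Sketch only: the forward-view region bounds are assumed, not taken from the disclosure.
from typing import Tuple

FORWARD_REGION = {"x": (-0.5, 0.5), "y": (0.8, 1.6), "z": (1.0, 50.0)}  # meters, hypothetical

def is_looking_forward(gaze_xyz: Tuple[float, float, float]) -> bool:
    """Return True if the 3D gaze coordinate lies inside the assumed forward-view region."""
    x, y, z = gaze_xyz
    return (FORWARD_REGION["x"][0] <= x <= FORWARD_REGION["x"][1]
            and FORWARD_REGION["y"][0] <= y <= FORWARD_REGION["y"][1]
            and FORWARD_REGION["z"][0] <= z <= FORWARD_REGION["z"][1])

print(is_looking_forward((0.1, 1.2, 10.0)))  # True: gaze falls in the forward region
print(is_looking_forward((1.4, 0.2, 0.3)))   # False: treated as a non-looking state
```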
  • the processor 130 may provide feedback corresponding to the driver's gazing state.
  • the processor 130 may provide feedback corresponding to the driver's gazing state based on driving data of the vehicle device 100 .
  • the driving data of the vehicle device may include vehicle speed data and separation distance information from the center of the lane.
  • the processor 130 may provide a warning alarm to the driver or switch to an autonomous driving mode when it is determined that the driver is in a frontal non-looking state.
  • the processor 130 may switch to an autonomous driving mode when it is determined that the driver's attention state is not recovered after providing a warning alarm.
  • the processor 130 may provide only a warning alarm according to the level of the driver's gazing state, or may immediately switch to the autonomous driving mode simultaneously with the warning alarm.
  • the processor 130 may control the driving speed of the vehicle simultaneously with a warning alarm when the speed of the vehicle is greater than or equal to the threshold value.
  • the processor 130 may control the driving speed of the vehicle using an electronic control unit (ECU).
  • the warning alarm may be provided in various forms such as a sound alarm, a haptic alarm, and a visual alarm (ex: strong lighting).
  • a sound alarm may be provided by outputting a human voice or a predetermined alarm sound through a speaker (eg, a speaker of a car audio system, an AV system, a navigation device, or a telematics terminal).
  • a haptic alarm may be provided through a vibration device installed on a driver's seat or steering wheel.
  • a visual alarm may be provided by turning on an LED light installed inside the vehicle.
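  • The sketch below illustrates one possible reading of the feedback policy described above (warning alarm first, then switching to an autonomous driving mode or limiting speed when the vehicle speed is at or above a threshold); the threshold value and the callback names are assumptions introduced only for illustration.

```python
# Sketch of the feedback logic; SPEED_THRESHOLD_KMH and the callback names are assumptions.
SPEED_THRESHOLD_KMH = 80.0

def provide_feedback(is_looking_forward: bool, attention_recovered: bool,
                     speed_kmh: float, alarm, switch_to_autonomous, limit_speed):
    """Issue a warning alarm and escalate to vehicle control based on the gaze state."""
    if is_looking_forward:
        return
    alarm()                                # sound / haptic / visual warning
    if speed_kmh >= SPEED_THRESHOLD_KMH:
        limit_speed()                      # e.g., request the ECU to reduce driving speed
    if not attention_recovered:
        switch_to_autonomous()             # escalate if the driver does not recover

# Example usage with stub callbacks:
provide_feedback(False, False, 100.0,
                 alarm=lambda: print("warning"),
                 switch_to_autonomous=lambda: print("autonomous mode"),
                 limit_speed=lambda: print("reduce speed"))
```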
  • the processor 130 acquires a driver image for learning corresponding to the driver's careless state, extracts pixel coordinates of the driver's facial feature points and direction vectors of the driver's pupils from the driver image for learning, and based on the pixel coordinates of the facial feature points The pixel coordinates of the pupil can be extracted. Subsequently, the processor 130 may convert the pixel coordinates of the pupil into 3D absolute coordinates, and obtain gaze data corresponding to the driver's image for training based on the 3D absolute coordinates of the pupil and the direction vector of the pupil.
  • FIGS. 2A and 2B are diagrams for explaining a learning method of an artificial neural network model according to an embodiment.
  • the artificial neural network model 10 may be learned based on a pair of input training data and output training data or may be learned based on the input training data.
  • Here, learning of the artificial neural network model means that a basic artificial neural network model (e.g., an artificial neural network model including random parameters) is trained using a plurality of training data by a learning algorithm, so that a predefined operation rule or an artificial neural network model set to perform a desired characteristic (or purpose) is created.
  • Such learning may be performed through the vehicle device 100, but is not limited thereto and may be performed through a separate server and/or system.
  • Examples of learning algorithms include supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning, but are not limited thereto. The mapping of input and output training data described here is an example of supervised learning, and it goes without saying that the artificial neural network model can also be trained based on unsupervised learning, in which only input data are provided without output data.
  • According to an example, an artificial neural network model may be composed of an input layer, a hidden layer, an output layer, and an activation function (f).
  • The activation function (f) can be implemented as a linear, sigmoid, hyperbolic tangent (tanh), or Rectified Linear Unit (ReLU) function, or the like.
  • For example, the activation functions of the hidden layer and the output layer may be implemented as sigmoid and ReLU, respectively, but are not limited thereto.
  • training data may be randomly mixed and used for learning an artificial neural network model. However, some of the training data may be used to verify the artificial neural network model. For example, 80% of the training data may be used for training and the remaining 20% for verification.
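  • The random shuffling and 80/20 split mentioned above can be realized, for example, with a random permutation of indices; the NumPy-based sketch below is one possible way to do it and is not prescribed by the disclosure.

```python
# Sketch of a random 80/20 train/validation split over the (image, gaze) training pairs.
import numpy as np

def split_train_val(images: np.ndarray, gaze: np.ndarray, train_ratio: float = 0.8, seed: int = 0):
    """Shuffle image/gaze pairs together and split them into train and validation sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(images))
    cut = int(len(images) * train_ratio)
    train_idx, val_idx = idx[:cut], idx[cut:]
    return (images[train_idx], gaze[train_idx]), (images[val_idx], gaze[val_idx])

# Example with dummy data: 100 images of 224x224x3 and 100 gaze targets.
imgs = np.zeros((100, 224, 224, 3), dtype=np.float32)
targets = np.zeros((100, 3), dtype=np.float32)
(train_x, train_y), (val_x, val_y) = split_train_val(imgs, targets)
print(len(train_x), len(val_x))  # 80 20
```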
  • training of an artificial neural network model may be performed in an external device such as a server.
  • However, hereinafter, for convenience of description, a case in which the learning of the artificial neural network model is performed in the vehicle device 100 itself, that is, the processor 130 of the vehicle device 100 performs the learning, will be assumed and described.
  • the processor 130 may train the artificial neural network model by mapping gaze data for each driver image for learning (or driver's face image for learning).
  • Here, mapping may mean providing a pair of input training data (a driver image for learning) and output training data (gaze data) as shown in FIG. 2A, or providing labeled input training data (a driver image for learning labeled with gaze data) as shown in FIG. 2B.
  • Hereinafter, a method of mapping the driver image for learning and the gaze data used for learning will be described.
  • Although the processor 130 is described as the subject for convenience of description, learning of the artificial neural network model and/or acquisition of the training data used for learning can of course be performed in an external device.
  • the processor 130 may detect a face region from a driver's photographic image for learning captured through the camera 110 .
  • As a face area detection method, various conventional methods may be used; specifically, a direct recognition method or a method using statistics may be used.
  • In the direct recognition method, rules are created using physical characteristics such as the outline, skin color, size of components, and distance between components of a face image, and comparison, inspection, and measurement are performed according to the rules.
  • the method using statistics may detect a face region according to a pre-learned algorithm. That is, it is a method of comparing and analyzing the unique characteristics of the input face with a large database prepared by converting them into data.
  • a facial area may be detected according to a pre-learned algorithm, and methods such as a multi-layer perceptron (MLP) and a support vector machine (SVM) may be used.
  • an eye image is identified from a photographed image through a face modeling technique.
  • the face modeling technology is an analysis process of converting a face image into digital information for processing and transmission, and one of an Active Shape Modeling (ASM) technique and an Active Appearance Modeling (AAM) technique may be used.
  • the face region identified in this way may be used to obtain pixel coordinates of the facial feature points of the driver as will be described later.
  • FIGS. 3 to 6 are diagrams for explaining a learning data acquisition method according to an exemplary embodiment.
  • FIG. 3 is a diagram for explaining a driving simulation environment according to an exemplary embodiment.
  • a driving simulation environment may be established, and image data generated while driving in the driving simulation environment may be obtained and used for learning an artificial neural network model.
  • a camera module for acquiring image data such as an IR camera, a normal camera, and a depth camera may be used in a driving simulation environment.
  • image data of a driver may be obtained through a camera module attached to a vehicle dashboard.
  • the driver's image data may be an image including the driver's whole body, face, and other environments.
  • a workload may be assigned to the driver by assuming the driver's careless situation.
  • For example, the driver may be asked to solve an arithmetic problem displayed on a mobile phone while driving, while keeping the vehicle speed (100 km/h) and staying in the lane in the process of solving the arithmetic problem.
  • driver driving data and arithmetic calculation accuracy according to a workload level may be acquired.
  • In order to obtain the gaze data used for learning, the 3D absolute coordinates of the driver's pupils and the 3D direction vector of the driver's eyes are required.
  • the processor 130 may identify a facial region in the driver's photographed image as described above, as shown in FIG. 4 , and obtain pixel coordinates of facial feature points of the driver in the identified facial region.
  • the processor 130 may extract pixel coordinates of 68 facial feature points of the driver by using a convolutional experts constrained local model (CE-CLM), which is one of deep learning models.
  • the type and number of feature points of the deep learning model for extracting pixel coordinates are examples, and are not limited thereto.
  • The processor 130 may obtain a vector value by applying a constrained local neural field landmark detector to the driver image data in order to calculate the 3D vector of the driver's eyes.
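  • As a rough illustration of the landmark-based extraction of pupil pixel coordinates, the sketch below uses dlib's 68-point face landmark predictor as a stand-in for the CE-CLM model named in the disclosure, and approximates the pupil pixel coordinates by the centroid of the eye landmarks; the model file path is an assumption.

```python
# Sketch only: dlib's 68-point predictor stands in for the CE-CLM model used in the disclosure,
# and pupil pixel coordinates are approximated by the centroid of the eye landmarks.
import numpy as np
import dlib
import cv2

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")  # path is an assumption

def pupil_pixel_coords(image_bgr: np.ndarray):
    """Return approximate pixel coordinates of the two eyes from 68 facial landmarks."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    faces = detector(gray)
    if not faces:
        return None
    shape = predictor(gray, faces[0])
    pts = np.array([(shape.part(i).x, shape.part(i).y) for i in range(68)])
    eye_a = pts[36:42].mean(axis=0)   # landmarks 36-41: one eye region
    eye_b = pts[42:48].mean(axis=0)   # landmarks 42-47: the other eye region
    return eye_a, eye_b
```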
  • the processor 130 may reconstruct the extracted pixel coordinates of the driver's pupil into 3D absolute coordinates of the driver's pupil, as shown in FIG. 5 .
  • In this case, coordinate system transformation may be performed through the internal (K) parameters and the external (R, T) parameters of the camera.
  • internal parameters may include a focal length, a principal point, and a skew coefficient.
  • External parameters may indicate the direction of the camera (ex: rotation and translation).
  • [Equation 1] represents an example of the internal (K) parameters of the camera, and [Equation 2] represents an example of the external (R, T) parameters of the camera. Here, f_x and f_y represent the focal lengths, c_x and c_y represent the principal point, and skew_c represents the skew coefficient.
  • Specifically, the processor 130 may convert the pixel coordinates of the driver's pupils into the camera coordinate system using the internal parameters and then convert the camera coordinates into the absolute coordinate system using the external parameters, so that the pixel coordinates of the driver's pupils can be expressed in the absolute coordinate system.
  • the internal parameters of the camera are parameters describing the conversion relationship between the pixel coordinate system and the camera coordinate system, and refer to intrinsic parameters of the camera itself, such as the focal length, aspect ratio, and center point of the camera.
  • the camera's extrinsic parameters are parameters describing the transformation relationship between the camera coordinate system and the world coordinate system, and are expressed as rotation and translation transformation between the two coordinate systems. Since the external parameters of the camera are not intrinsic to the camera, they may vary depending on the position and orientation of the camera and how the absolute coordinate system is defined.
  • [Equation 3] represents a formula for converting the pixel coordinate system into the camera coordinate system, and [Equation 4] represents a formula for converting the camera coordinate system into the absolute coordinate system. Here, u and v represent the pixel coordinates, DOP represents the length per unit pixel (cm/pixel), width represents the pixel width, and height represents the pixel height. P_x, P_y, and P_z are the absolute coordinates of the camera on the x-, y-, and z-axes, respectively, and the remaining parameter represents the tilt of the camera.
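  • The pixel-to-camera-to-absolute conversion described above can be sketched with the standard pinhole camera model; the exact forms of [Equation 1] through [Equation 4] are not reproduced in this text, and the numerical values below (focal lengths, principal point, skew, rotation, translation, depth) are illustrative assumptions only.

```python
# Sketch of the pixel -> camera -> absolute (world) conversion using the standard pinhole model.
# All numerical values are illustrative; they are not parameters from the disclosure.
import numpy as np

fx, fy = 800.0, 800.0          # focal lengths (pixels)
cx, cy = 320.0, 240.0          # principal point (pixels)
skew_c = 0.0                   # skew coefficient

K = np.array([[fx, skew_c * fx, cx],
              [0.0, fy,          cy],
              [0.0, 0.0,        1.0]])    # internal (K) parameters

R = np.eye(3)                              # external rotation (camera orientation in the world)
T = np.array([0.0, 1.2, 0.5])              # external translation (camera position), meters

def pixel_to_world(u: float, v: float, depth: float) -> np.ndarray:
    """Back-project a pixel (u, v) at an assumed depth into the absolute coordinate system."""
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])   # pixel -> normalized camera coordinates
    p_cam = ray * depth                               # camera coordinates at the assumed depth
    return R @ p_cam + T                              # camera -> absolute (world) coordinates

print(pixel_to_world(320.0, 240.0, 1.0))  # point on the optical axis, 1 m in front of the camera
```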
  • Gaze data may be extracted from the driver's image data through the 3D absolute coordinates of the driver's pupil and the direction vector of the pupil obtained in the above manner.
  • the gaze data extracted in this way may be labeled with the driver's image data (or the driver's face data) and used for learning the artificial neural network model.
  • FIG. 7 is a diagram for explaining the operation of a learned artificial neural network model according to an embodiment.
  • the trained artificial neural network model 10 ′ may output gaze data when a driver's photographed image is input.
  • gaze data may be 3D absolute coordinates.
  • the trained artificial neural network model 10' may output a probability value corresponding to each of a plurality of gaze data.
  • the processor 130 may obtain the driver's gaze data based on a probability value corresponding to each of a plurality of gaze data output from the learned artificial neural network model 10'.
  • the output part of the artificial neural network model 10' may be implemented to enable softmax processing.
  • Here, softmax is a function that normalizes all input values to values between 0 and 1 so that the sum of the output values is always 1, and it can output a probability value corresponding to each class, for example, gaze data a, gaze data b, and so on.
  • the output part of the artificial neural network model 10' may be implemented to enable Argmax processing.
  • Argmax is a function that selects the most probable one among multiple labels; here, it can serve to select the class with the largest value among the probability values for each class. That is, when each output part of the artificial neural network model 10' is Argmax-processed, only the state information (e.g., gaze data a) having the highest probability value can be output.
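  • The following small NumPy sketch illustrates the softmax normalization and argmax selection just described; the class names and scores are placeholders, not outputs of the disclosed model.

```python
# Sketch of softmax/argmax post-processing over per-class scores (class names are placeholders).
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    """Normalize raw scores into probabilities that lie between 0 and 1 and sum to 1."""
    z = logits - logits.max()              # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

classes = ["gaze data a", "gaze data b", "gaze data c"]
logits = np.array([2.0, 0.5, -1.0])        # example output of the model's last layer

probs = softmax(logits)
print(dict(zip(classes, probs.round(3))))  # probability per class, summing to 1
print(classes[int(np.argmax(probs))])      # argmax keeps only the most probable class
```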
  • the artificial neural network model 10' is trained to output information on the driver's gaze state, and may output information on the driver's gaze state when a driver's photographed image is input.
  • Alternatively, the artificial neural network model 10' may be implemented to include a first artificial neural network model that outputs the driver's gaze data when the driver's photographed image is input, and a second artificial neural network model that receives the driver's gaze data from the first artificial neural network model and outputs the driver's gaze state information.
  • According to another embodiment, not only the driver's photographed image but also additional information may be used to determine the driver's gaze state.
  • a driver's gaze state may be determined based on gaze data and additional information obtained from an artificial neural network model.
  • the additional information may be various information such as driving environment information (ex: weather information, temperature information, humidity information, etc.), driver profile information (ex: gender, age, etc.).
  • additional information along with the driver's image may be input to the artificial neural network model 10', and the corresponding information may be used to output gaze data.
  • additional information may be used for learning of the artificial neural network model 10'.
  • FIG. 8 is a diagram for explaining an actual application example and effects for determining a driver's gaze state according to an exemplary embodiment.
  • a gaze dispersion ratio and a maximum gaze dispersion time may be utilized to determine driving gaze through driver gaze coordinates.
  • Gaze dispersion means a state in which the driver's gaze deviates from the simulator screen while driving, and the maximum gaze dispersion time may mean the maximum duration for which the driver's gaze deviates from the simulator screen. For example, if the driver neglects to look forward for several seconds (e.g., 2 seconds) while driving at a preset speed (e.g., 100 km/h), the moving distance of the vehicle is about 55 m or more; accordingly, if the driver neglects to look forward for more than 2 seconds, the system can notify the driver with a warning and perform vehicle control.
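  • The arithmetic behind this example, together with a simple dispersion-based trigger, is sketched below; the 2-second limit and 100 km/h speed come from the example above, while the trigger logic itself is an assumption consistent with it.

```python
# Sketch: distance covered while the gaze is off the road, and a simple dispersion-based trigger.
LOOK_AWAY_LIMIT_S = 2.0                      # example threshold from the description

def distance_travelled_m(speed_kmh: float, seconds: float) -> float:
    """Distance covered at a constant speed, in meters."""
    return speed_kmh / 3.6 * seconds

def gaze_dispersion_ratio(off_screen_s: float, total_s: float) -> float:
    """Fraction of driving time during which the gaze deviated from the (simulator) screen."""
    return off_screen_s / total_s if total_s > 0 else 0.0

print(round(distance_travelled_m(100.0, 2.0), 1))   # ~55.6 m at 100 km/h for 2 s
look_away_s = 2.5                                    # example: 2.5 s of continuous look-away
if look_away_s >= LOOK_AWAY_LIMIT_S:
    print("warning + vehicle control")               # notify the driver and intervene
```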
  • FIG. 9 is a diagram illustrating an implementation example of a vehicle device according to an exemplary embodiment.
  • According to FIG. 9, a vehicle device 100' includes a camera 110, a memory 120, a processor 130, a display 140, a speaker 150, a user interface 160, and a communication interface 170. Among the components shown in FIG. 9, detailed descriptions of components overlapping with those shown in FIG. 1 will be omitted.
  • the display 140 may be implemented as a display including a self-light emitting element or a display including a non-light emitting element and a backlight.
  • For example, the display 140 may be implemented as a Liquid Crystal Display (LCD), Organic Light Emitting Diodes (OLED), Light Emitting Diodes (LED), micro LED, Mini LED, Plasma Display Panel (PDP), Quantum dot (QD) display, Quantum dot light-emitting diodes (QLED), or the like.
  • The display 140 may also include a driving circuit, a backlight unit, and the like, which may be implemented in the form of an a-Si TFT, a low temperature poly silicon (LTPS) TFT, or an organic TFT (OTFT).
  • In addition, the display 140 may be implemented as a touch screen combined with a touch sensor, a flexible display, a rollable display, a 3D display, a display in which a plurality of display modules are physically connected, and the like. Also, since the display 140 may have a built-in touch screen, a program can be executed using a finger or a pen (e.g., a stylus pen).
  • the speaker 150 may be a component that outputs not only various kinds of audio data processed by the processor 130 but also various notification sounds or voice messages. According to one example, the processor 130 may control the speaker 150 to output a warning notification according to various embodiments of the present disclosure.
  • The communication interface 160 is a component for communicating with various external devices and may include a wireless communication module, for example, a Wi-Fi module, a Bluetooth module, and the like. However, it is not limited thereto, and the communication interface 160 may perform communication using various wireless communication standards such as Zigbee, 3rd generation (3G), 3rd generation partnership project (3GPP), long term evolution (LTE), LTE Advanced (LTE-A), 4th generation (4G), and 5th generation (5G), as well as infrared communication (IrDA, Infrared Data Association) technology, in addition to the above-described communication methods.
  • the communication interface 160 may include various other wired communication interfaces (ex: USB terminal).
  • the user interface 170 is a component for receiving various user commands, and can be implemented as a button, a touch pad, a wheel, or the like, depending on the implementation of the electronic device 100'.
  • the vehicle device 100' may further include a microphone (not shown).
  • the microphone is a component for receiving a user's voice or other sounds and converting them into audio data. For example, a user voice command related to various embodiments of the present disclosure may be received through a microphone (not shown).
  • FIG. 10 is a flowchart illustrating a vehicle control method according to an exemplary embodiment.
  • the driver's gaze data is acquired by inputting the driver's captured image acquired through the camera to the learned artificial neural network model (S1010).
  • Then, the driver's gaze state is determined based on the driver's gaze data (S1020), and feedback corresponding to the driver's gaze state is provided (S1030).
  • the artificial neural network model may be a model learned by using driver images for learning as input data and gaze data corresponding to each driver image for learning as output data.
  • gaze data corresponding to each driver's image for learning may be obtained based on the coordinates and direction vectors of the driver's eyes corresponding to the driver's image for learning.
  • gaze data corresponding to each driver image for learning may be obtained based on 3D absolute coordinates and 3D direction vectors of the driver's eyes corresponding to the driver's image for learning.
  • the driver's gaze data output from the artificial neural network model may be three-dimensional gaze coordinates corresponding to the driver's eyes.
  • In addition, the control method may further include acquiring a driver image for learning corresponding to the driver's careless state, extracting pixel coordinates of the driver's facial feature points and direction vectors of the driver's pupils from the driver image for learning, extracting the pixel coordinates of the pupils based on the pixel coordinates of the facial feature points, converting the pixel coordinates of the pupils into 3D absolute coordinates, and obtaining gaze data corresponding to the driver image for learning based on the 3D absolute coordinates and the direction vectors of the pupils.
  • the method may further include acquiring driving data of the vehicle device, and in step S1030 , feedback corresponding to the driver's gaze state may be provided based on the driving data of the vehicle device.
  • the driving data of the vehicle device may include vehicle speed data and separation distance information from the center of the lane.
  • the driver's gaze state is determined based on the driver's gaze data and additional information, and the additional information may include at least one of driving environment information and driver profile information.
  • According to an embodiment, an EEG sensor that measures the driver's brain waves (electroencephalogram, EEG), an EOG sensor that measures the driver's eye movements (electrooculogram, EOG), and a PPG sensor that measures the driver's photoplethysmogram (PPG) may also be used as auxiliary indicators for determining the driver's attention state.
  • According to the various embodiments described above, it is possible to track not only the direction of the driver's gaze but also the focus of the driver's gaze by using the artificial neural network model. Accordingly, by accurately acquiring the coordinates of the driver's gaze, it is possible to respond to an emergency through rapid vehicle control not only in the event of the driver's carelessness but also in the event of the driver's drowsiness or an emergency situation (e.g., a fall, cardiac arrest, etc.).
  • In addition, since the coordinates of the driver's gaze can be accurately obtained, the driver's state (e.g., drowsiness, neglect of looking ahead, staring at the road, distraction, absence of a driver, etc.) can be accurately determined and corresponding information can be provided.
  • the methods according to various embodiments of the present disclosure described above may be implemented in the form of an application that can be installed in an existing electronic device.
  • the above-described methods according to various embodiments of the present disclosure may be performed using a deep learning-based artificial neural network (or deep artificial neural network), that is, a learning network model.
  • various embodiments of the present disclosure described above may be performed through an embedded server included in the electronic device or an external server of the electronic device.
  • a device is a device capable of calling a stored command from a storage medium and operating according to the called command, and may include an electronic device (eg, the electronic device A) according to the disclosed embodiments.
  • the processor may perform a function corresponding to the command directly or by using other components under the control of the processor.
  • An instruction may include code generated or executed by a compiler or interpreter.
  • the device-readable storage medium may be provided in the form of a non-transitory storage medium.
  • 'non-temporary' only means that the storage medium does not contain a signal and is tangible, but does not distinguish whether data is stored semi-permanently or temporarily in the storage medium.
  • the method according to the various embodiments described above may be included in a computer program product and provided.
  • Computer program products may be traded between sellers and buyers as commodities.
  • the computer program product may be distributed in the form of a device-readable storage medium (eg compact disc read only memory, CD-ROM) or online through an application store (eg Play Store TM ).
  • at least part of the computer program product may be temporarily stored or temporarily created in a storage medium such as a manufacturer's server, an application store server, or a relay server's memory.
  • Each of the components described above may be composed of a single entity or a plurality of entities, and some of the aforementioned sub-components may be omitted or other sub-components may be further included in various embodiments. Alternatively or additionally, some components (e.g., modules or programs) may be integrated into one entity and perform the same or similar functions performed by each corresponding component prior to integration. According to various embodiments, operations performed by modules, programs, or other components may be executed sequentially, in parallel, repetitively, or heuristically, or at least some operations may be executed in a different order or omitted, or other operations may be added.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Automation & Control Theory (AREA)
  • Transportation (AREA)
  • Mechanical Engineering (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Traffic Control Systems (AREA)
  • Image Analysis (AREA)

Abstract

A vehicle device for determining a driver's gaze state using artificial intelligence is disclosed. The vehicle device includes a camera, a memory in which a trained artificial neural network model is stored, and a processor that obtains a driver's gaze data by inputting an image obtained by capturing the driver with the camera to the artificial neural network model, determines the driver's gaze state based on the driver's gaze data, and provides feedback corresponding to the driver's gaze state. The artificial neural network model is a model trained using a driver image for learning as input data and gaze data corresponding to each driver image for learning as output data, and the gaze data corresponding to each driver image for learning can be obtained based on a direction vector and coordinates of the driver's pupils corresponding to the driver image for learning.
PCT/KR2022/014220 2021-11-16 2022-09-23 Dispositif automobile permettant de déterminer l'état du regard d'un conducteur en utilisant l'intelligence artificielle, et son procédé de commande WO2023090618A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020210157964A KR102597068B1 (ko) 2021-11-16 2021-11-16 인공 지능을 이용하여 운전자의 주시 상태를 판단하는 차량 장치 및 그 제어 방법
KR10-2021-0157964 2021-11-16

Publications (1)

Publication Number Publication Date
WO2023090618A1 true WO2023090618A1 (fr) 2023-05-25

Family

ID=86397215

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2022/014220 WO2023090618A1 (fr) 2021-11-16 2022-09-23 Dispositif automobile permettant de déterminer l'état du regard d'un conducteur en utilisant l'intelligence artificielle, et son procédé de commande

Country Status (2)

Country Link
KR (1) KR102597068B1 (fr)
WO (1) WO2023090618A1 (fr)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002367100A (ja) * 2001-06-12 2002-12-20 Nissan Motor Co Ltd 運転者状態検出装置
JP2006163828A (ja) * 2004-12-07 2006-06-22 Nissan Motor Co Ltd 車両用警報装置、車両周囲状況の警報方法
JP2007249757A (ja) * 2006-03-17 2007-09-27 Denso It Laboratory Inc 警報装置
KR20090104607A (ko) * 2008-03-31 2009-10-06 현대자동차주식회사 전방 미주시 운전 검출 경보 시스템
KR20120037253A (ko) * 2010-10-11 2012-04-19 현대자동차주식회사 운전자 주시방향 연동 전방충돌 위험경보 시스템, 그 방법 및 그를 이용한 차량

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3776347A1 (fr) * 2019-06-17 2021-02-17 Google LLC Entrée en contact avec l'occupant de véhicule à l'aide de vecteurs de regard en trois dimensions
KR20210052634A (ko) * 2019-10-29 2021-05-11 엘지전자 주식회사 운전자의 부주의를 판단하는 인공 지능 장치 및 그 방법
KR102338067B1 (ko) * 2019-12-26 2021-12-10 경북대학교 산학협력단 관심영역을 이용한 운전자 모니터링 시스템


Also Published As

Publication number Publication date
KR20230071593A (ko) 2023-05-23
KR102597068B1 (ko) 2023-10-31

Similar Documents

Publication Publication Date Title
Ramzan et al. A survey on state-of-the-art drowsiness detection techniques
Li et al. Detection of driver manual distraction via image-based hand and ear recognition
García et al. Driver monitoring based on low-cost 3-D sensors
CN112016457A (zh) 驾驶员分神以及危险驾驶行为识别方法、设备和存储介质
WO2018061616A1 (fr) Système de surveillance
WO2020122432A1 (fr) Dispositif électronique et procédé d'affichage d'une image tridimensionnelle de celui-ci
CN112307855A (zh) 一种用户状态检测方法、装置、电子设备及存储介质
CN113316805A (zh) 使用红外线和可见光监视人的方法和系统
US10268903B2 (en) Method and system for automatic calibration of an operator monitor
WO2023075161A1 (fr) Appareil de véhicule pour déterminer l'état d'un conducteur en utilisant l'intelligence artificielle et son procédé de commande
Sharara et al. A real-time automotive safety system based on advanced ai facial detection algorithms
EP4064113A1 (fr) Procédé et système de détection d'informations d'utilisateur, et dispositif électronique
WO2023090618A1 (fr) Dispositif automobile permettant de déterminer l'état du regard d'un conducteur en utilisant l'intelligence artificielle, et son procédé de commande
Isaza et al. Dynamic set point model for driver alert state using digital image processing
WO2022182096A1 (fr) Suivi du mouvement de membre en temps réel
CN114037979A (zh) 一种轻量化的驾驶员疲劳状态检测方法
CN109484330B (zh) 基于Logistic模型的新手驾驶员驾驶技能辅助提升系统
Chhabria et al. Multimodal interface for disabled persons
Li et al. Motion fatigue state detection based on neural networks
Moazen et al. Implementation of a low-cost driver drowsiness evaluation system using a thermal camera
Oommen et al. Drowsiness Detection System
Kono et al. Suppression of Vestibulo-Ocular Reflex with Increased Mental Workload While Driving
JP7412514B1 (ja) キャビンモニタリング方法及び上記キャビンモニタリング方法を実行するキャビンモニタリングシステム
Pradhan et al. Driver Drowsiness Detection Model System Using EAR
EP4332886A1 (fr) Dispositif électronique, procédé de commande de dispositif électronique et programme

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22895827

Country of ref document: EP

Kind code of ref document: A1