CN117485348A - Driver intention recognition method - Google Patents

Driver intention recognition method

Info

Publication number
CN117485348A
Authority
CN
China
Prior art keywords
driver
intention
information
semantic
driving environment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311618805.8A
Other languages
Chinese (zh)
Other versions
CN117485348B (en)
Inventor
牛超
赵运
郑岳琦
马天发
全杰
孙德荣
王晋武
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changchun Automotive Test Center Co ltd
Original Assignee
Changchun Automotive Test Center Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changchun Automotive Test Center Co ltd filed Critical Changchun Automotive Test Center Co ltd
Priority to CN202311618805.8A priority Critical patent/CN117485348B/en
Publication of CN117485348A publication Critical patent/CN117485348A/en
Application granted granted Critical
Publication of CN117485348B publication Critical patent/CN117485348B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W40/00Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models
    • B60W40/08Estimation or calculation of non-directly measurable driving parameters for road vehicle drive control systems not related to the control of a particular sub unit, e.g. by using mathematical models related to drivers or passengers
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W2540/00Input parameters relating to occupants
    • B60W2540/043Identity of occupants
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W2540/00Input parameters relating to occupants
    • B60W2540/223Posture, e.g. hand, foot, or seat position, turned or inclined
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W2552/00Input parameters relating to infrastructure
    • B60W2552/50Barriers
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W2552/00Input parameters relating to infrastructure
    • B60W2552/53Road markings, e.g. lane marker or crosswalk
    • BPERFORMING OPERATIONS; TRANSPORTING
    • B60VEHICLES IN GENERAL
    • B60WCONJOINT CONTROL OF VEHICLE SUB-UNITS OF DIFFERENT TYPE OR DIFFERENT FUNCTION; CONTROL SYSTEMS SPECIALLY ADAPTED FOR HYBRID VEHICLES; ROAD VEHICLE DRIVE CONTROL SYSTEMS FOR PURPOSES NOT RELATED TO THE CONTROL OF A PARTICULAR SUB-UNIT
    • B60W2554/00Input parameters relating to objects
    • B60W2554/40Dynamic objects, e.g. animals, windblown objects
    • B60W2554/402Type
    • B60W2554/4029Pedestrians
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Automation & Control Theory (AREA)
  • Mathematical Physics (AREA)
  • Transportation (AREA)
  • Mechanical Engineering (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention provides a driver intention recognition method comprising the following steps. Step S1: collecting driver behavior data and image and video data of the driving environment. Step S2: extracting key visual features from the data of step S1 for subsequent intention recognition. Step S3: performing semantic analysis on the acquired driving environment images and videos and extracting key semantic information. Step S4: fusing the feature information obtained from the visual feature extraction and the semantic analysis to obtain a driver intention recognition result.

Description

Driver intention recognition method
Technical Field
The invention relates to the technical field of intelligent interaction of automobiles, in particular to a driver intention recognition method.
Background
As automobiles become increasingly intelligent, users expect the vehicle to understand their state and needs and to customize services and driving assistance accordingly. Accurately recognizing the driver's intention therefore plays an extremely important role in providing more humanized services and safer, more comfortable assisted driving.
At present, existing driver-intention recognition functions are integrated into different ADAS controllers and usually infer the intention from a single signal, such as a turn-signal or brake-switch signal. Because few signals are used and the recognition logic is simple, misjudgments are common and system performance suffers; when several judgment signals are available they may conflict, and the driver's intention still cannot be recognized accurately.
Disclosure of Invention
The present invention is directed to a driver intention recognition method, which solves the above-mentioned problems of the related art.
The invention is realized by the following technical scheme: a driver intention recognition method, the method comprising the steps of:
step S1: collecting data of driver behaviors and image and video data of driving environment;
step S2: extracting key visual features from the data of the step S1 for subsequent intention recognition;
step S3: carrying out semantic analysis on the acquired driving environment image and video data, and extracting key semantic information;
step S4: and fusing various feature information according to the visual feature extraction and the semantic analysis to obtain a driver intention recognition result.
Specifically, the visual feature extraction in the step S2 includes the following steps:
step S2.1: face detection and recognition, namely detecting the face in the driver behavior data through an MTCNN neural network, and recognizing the detected face through a FaceNet face recognition algorithm to acquire the identity information of the driver;
step S2.2: expression analysis, namely analyzing the facial expression of the driver through a FERNET network, and extracting the current expression state of the driver, including anger, happiness and confusion, through a facial expression recognition algorithm;
step S2.3: gesture recognition, namely recognizing the hand actions of the driver through a spatio-temporal attention network, and extracting the type and state of the driver's gesture;
step S2.4: spatial pose estimation, namely estimating the head pose of the driver through a 3D pose estimation network to acquire information such as the rotation angle and direction of the head;
step S2.5: driving scene analysis, namely performing target detection and tracking on images of the driving environment through a Faster R-CNN model, and extracting scene information such as road signs, traffic lights, pedestrians and obstacles.
Specifically, the key semantic information in the step S3 is extracted through the following steps:
step S3.1: segmenting the driving environment image and extracting semantic information of different regions, specifically segmenting regions such as road, sky and buildings and inferring the semantic meaning of each region;
step S3.2: establishing a semantic relation model through a graph neural network and analyzing the relations between target objects in the driving environment, such as the relation between traffic signals and pedestrians and between vehicles and road signs;
step S3.3: action recognition and intention reasoning, namely comprehensively analyzing the behavior of the driver and the driving environment through an action recognition and intention reasoning algorithm, and inferring the intention of the driver from the driver's actions and the environmental semantic information;
step S3.4: modeling and analyzing the driver behavior and the driving environment through a recurrent neural network in combination with historical data, and predicting possible future events, including traffic jams and road conditions, by taking the historical data of the current driving environment into account.
Specifically, the multi-modal fusion in the step S4 proceeds as follows:
step S4.1: fusing the feature information from visual feature extraction and semantic analysis, specifically fusing the visual features from face recognition, expression analysis and gesture recognition with the semantic features from target detection and scene segmentation to form a multi-modal feature vector;
step S4.2: adjusting the weight of the feature information according to the contribution of each modality, and further adjusting the weights of different features according to the complexity of the driving scene and the criticality of the driver's behavior, so that important feature information has greater influence on the final intention recognition result during fusion;
step S4.3: resolving conflicts when different modality information disagrees, for example when the face analysis indicates that the driver is angry but the other visual features and the semantic analysis indicate a normal driving scene, so as to determine the final intention recognition result;
step S4.4: predicting the intention of the driver at the current moment by combining the intention recognition result of the previous moment with the current driving environment state, so as to obtain a more accurate intention recognition result that adapts to changes in the driving environment.
Specifically, in the step S4.2, the weights of the feature information are adjusted as follows:
first, the weight of each modality is initialized, either randomly or according to prior knowledge;
second, the feature information of each modality in the multi-modal feature vector is forward-propagated and the output of each modality is computed;
third, based on the outputs of the modalities, the total loss is computed as the weighted sum of the per-modality losses;
fourth, the loss is back-propagated through the network and the contribution of each modality to the total loss is computed;
fifth, the gradient of each modality weight is computed from that modality's contribution to the total loss;
sixth, the modality weights are updated with a gradient-descent optimizer.
Specifically, in step S4.3, when different modality information conflicts, the conflict is resolved as follows:
conflicts are detected in the fused multi-modal data through a support vector machine model and the conflicting regions of the fused data are marked; for each detected conflict, a decision tree algorithm makes a decision that combines the weights of the modality information, thereby determining the final intention recognition result.
Compared with the prior art, the invention has the following beneficial effects:
the invention provides a driver intention recognition method, which fuses multi-mode information, can accurately recognize the intention of a driver by carrying out weight adjustment and conflict resolution on the multi-mode information, provides accurate and comprehensive driver intention information for an intelligent driving system, and improves driving safety and driving experience.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only preferred embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is an overall structure diagram of a driver intention recognition method provided by the present invention.
Fig. 2 is a specific flowchart of step S2 visual feature extraction provided in the present invention.
Fig. 3 is a specific flowchart of the semantic analysis of step S3 provided in the present invention.
Fig. 4 is a specific flowchart of step S4 multi-mode fusion provided in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, exemplary embodiments according to the present invention will be described in detail with reference to the accompanying drawings. It should be apparent that the described embodiments are only some embodiments of the present invention and not all embodiments of the present invention, and it should be understood that the present invention is not limited by the example embodiments described herein. Based on the embodiments of the invention described in the present application, all other embodiments that a person skilled in the art would have without inventive effort shall fall within the scope of the invention.
In the following description, numerous specific details are set forth in order to provide a more thorough understanding of the present invention. It will be apparent, however, to one skilled in the art that the invention may be practiced without one or more of these details. In other instances, well-known features have not been described in detail in order to avoid obscuring the invention.
It should be understood that the present invention may be embodied in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. As used herein, the term "and/or" includes any and all combinations of the associated listed items.
In order to provide a thorough understanding of the present invention, detailed structures will be presented in the following description in order to illustrate the technical solutions presented by the present invention. Alternative embodiments of the invention are described in detail below, however, the invention may have other implementations in addition to these detailed descriptions.
Referring to fig. 1, a driver intention recognition method includes the following steps:
step S1: collecting data of driver behaviors and image and video data of the driving environment;
step S2: extracting key visual features from the data of step S1 for subsequent intention recognition;
step S3: carrying out semantic analysis on the acquired driving environment image and video data, and extracting key semantic information;
step S4: fusing the feature information obtained from the visual feature extraction and the semantic analysis to obtain a driver intention recognition result.
According to the driver intention recognition method, the face in the driver behavior data is first detected with an MTCNN neural network, and the detected face is then recognized with the FaceNet face recognition algorithm to obtain the driver's identity information. Facial expression recognition is performed with the FERNET network: the driver's facial expression is analyzed and the current expression state, such as anger, happiness or confusion, is extracted from the driver behavior data in order to judge the driver's emotion and intention. At the same time, the driver's hand actions are recognized with a spatio-temporal attention network to extract the type and state of the driver's gestures, such as turning the steering wheel or pressing a button, which helps to judge the driver's operating intention. On this basis, target objects in the driving environment are detected and tracked with the Faster R-CNN network so that information such as their positions and motion states can be extracted to assist intention judgment, and the driver's head pose is estimated with a 3D pose estimation network to obtain the head rotation angle and direction and to judge the driver's line of sight and direction of attention, which helps to determine the driving scene the driver is currently in and the corresponding intention.
A semantic relation model built with a graph neural network then analyzes the relations between target objects in the driving environment, for example the relation between traffic lights and pedestrians or between vehicles and road signs; such models provide richer semantic information that supports intention judgment. A recurrent neural network (RNN) comprehensively analyzes the driver's behavior and the driving environment and infers the driver's intention, such as lane changing, acceleration or parking, from the driver's actions and the environmental semantic information; a long short-term memory (LSTM) model is then used to model and analyze the driver behavior and the driving environment, taking the historical data of the current driving environment into account to predict events that may occur in the future, such as traffic jams and road conditions, which improves the accuracy of driver intention recognition. The feature information obtained from visual feature extraction and semantic analysis is finally fused, making comprehensive use of the different modalities to improve recognition accuracy, and post-processing and reasoning are applied to the fused result: statistical methods infer the driver's next behavior, and the recognized intention is converted into corresponding driving operations such as acceleration, braking or steering, that is, into specific driving instructions that support the decision-making and control of the intelligent driving system.
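As an illustration of the LSTM-based prediction mentioned above, the following minimal sketch (written in PyTorch, an assumption; the patent does not name a framework) feeds a sequence of per-frame driving-environment feature vectors to an LSTM and outputs probabilities of upcoming events such as congestion. The feature size, hidden size and event classes are illustrative choices, not values from the patent.

```python
import torch
import torch.nn as nn

class EventPredictor(nn.Module):
    """Predicts probabilities of upcoming driving events from a feature history."""
    def __init__(self, feat_dim=128, hidden_dim=64, num_events=2):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_events)

    def forward(self, history):                       # history: [batch, time, feat_dim]
        _, (h_n, _) = self.lstm(history)               # final hidden state: [1, batch, hidden]
        return torch.sigmoid(self.head(h_n.squeeze(0)))  # per-event probabilities

# Example: 30 past frames of 128-d environment features for one sequence.
probs = EventPredictor()(torch.randn(1, 30, 128))      # tensor of shape [1, 2]
```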
Specifically, referring to fig. 2, the visual feature extraction in step S2 includes the following steps:
step S2.1: face detection and recognition, namely detecting the face in the driver behavior data through an MTCNN neural network, and recognizing the detected face through a FaceNet face recognition algorithm to acquire the identity information of the driver;
By way of example, the specific steps of step S2.1 are as follows:
first, a driver behavior data set is prepared for training and testing; the data set should contain the driver's behavior data together with the face images associated with each behavior, and the face images should already be labeled with the corresponding identity information.
An image from the driver behavior data is input to the MTCNN model, which outputs the detected face positions;
the face regions detected by MTCNN are extracted from the original image and preprocessed; preprocessing may include cropping, resizing and normalization, so that all face images have the same size and feature representation;
the preprocessed face images are recognized with the FaceNet face recognition algorithm: each image is mapped into a high-dimensional feature space and the similarity between faces is computed, so the most similar identity can be found by comparing the face to be recognized with the faces of known identity;
a suitable similarity threshold is then chosen based on the FaceNet output; if the similarity between the face to be recognized and a face of known identity exceeds the threshold, the face is considered to belong to that identity and the corresponding identity information is extracted.
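The following sketch illustrates one way these sub-steps could be wired together using the open-source facenet-pytorch package, which bundles an MTCNN detector and a FaceNet-style embedder; the package choice, the distance threshold and the enrollment dictionary are assumptions for illustration, not part of the patent.

```python
import torch
from PIL import Image
from facenet_pytorch import MTCNN, InceptionResnetV1

mtcnn = MTCNN(image_size=160)                                # face detector / cropper
embedder = InceptionResnetV1(pretrained='vggface2').eval()   # FaceNet-style embedder

def identify_driver(frame_path, enrolled, threshold=0.8):
    """Compare the face in the frame with enrolled drivers (id -> 512-d embedding)
    and return the closest id, or None if no face is found or no match is close enough."""
    face = mtcnn(Image.open(frame_path))          # cropped, normalized face tensor or None
    if face is None:
        return None
    with torch.no_grad():
        emb = embedder(face.unsqueeze(0))         # [1, 512] embedding
    distances = {pid: torch.dist(emb, ref).item() for pid, ref in enrolled.items()}
    pid, dist = min(distances.items(), key=lambda kv: kv[1])
    return pid if dist < threshold else None      # threshold is an illustrative value
```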
Step S2.2: the method comprises the following steps of performing expression analysis, namely analyzing the facial expression of a driver through a FERNET network, and extracting the current expression state of the driver through a facial expression recognition algorithm, wherein the expression state comprises anger, happiness and confusion;
exemplary, specific steps of step S2.2 are as follows:
firstly, a driver facial-expression data set is prepared for training and testing; the data set contains facial expression images of the driver under different emotions, labeled with the corresponding expression categories;
secondly, the data set is divided into a training set and a test set, and the FERNET network is trained with the training set: the preprocessed expression images are input to the FERNET network, the network weights and biases are updated with the back-propagation algorithm and a suitable optimizer (such as Adam), and a suitable loss function (such as cross-entropy loss) is defined during training to measure the difference between the model's predictions and the true labels;
finally, a threshold is set on the FERNET output: when the predicted probability of the anger class exceeds the threshold, the driver is judged to be angry; when the predicted probability of the happiness class exceeds the threshold, the driver is judged to be happy; and when the predicted probability of the confusion class exceeds the threshold, the driver is judged to be confused.
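A minimal sketch of this thresholding rule follows; `fernet_model` stands in for the trained FERNET network (its interface is assumed), and the class list and threshold value are illustrative.

```python
import torch
import torch.nn.functional as F

EXPRESSIONS = ["anger", "happiness", "confusion"]   # the states named in the text

def expression_state(fernet_model, face_tensor, threshold=0.6):
    """Return the expression whose softmax probability exceeds the threshold, else None."""
    with torch.no_grad():
        logits = fernet_model(face_tensor.unsqueeze(0))      # assumed shape: [1, 3]
        probs = F.softmax(logits, dim=1).squeeze(0)
    best = int(torch.argmax(probs))
    return EXPRESSIONS[best] if probs[best].item() > threshold else None
```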
Step S2.3: gesture recognition, namely recognizing hand actions of a driver through a space-time attention network, and extracting the type and state of the gesture of the driver;
exemplary, specific steps of step S2.3 are as follows:
first, a driver hand motion data set is prepared for training and testing. The data set should contain video sequences of the driver under different gestures and be annotated with the corresponding gesture types and states.
The spatio-temporal attention network is then trained with the training set: the preprocessed video sequences are input to the network, and a loss function measures the difference between the model's predictions and the true labels;
a threshold is set on the output of the spatio-temporal attention network: when the predicted probability of a gesture type exceeds the threshold, the driver is judged to be performing that gesture; when the predicted probability of a gesture state exceeds the threshold, the driver is judged to be in that gesture state.
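Analogously, the gesture thresholding could look like the sketch below, assuming the spatio-temporal attention network exposes two output heads (gesture type and gesture state); the label sets and threshold are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

GESTURE_TYPES = ["steering", "button_press", "gear_shift"]   # illustrative labels
GESTURE_STATES = ["starting", "in_progress", "finished"]     # illustrative labels

def recognize_gesture(stan_model, clip, threshold=0.6):
    """clip: [frames, C, H, W] video tensor. Returns (gesture_type, gesture_state),
    where either element is None if that head is not confident enough."""
    with torch.no_grad():
        type_logits, state_logits = stan_model(clip.unsqueeze(0))   # assumed two-head output
        type_p = F.softmax(type_logits, dim=1).squeeze(0)
        state_p = F.softmax(state_logits, dim=1).squeeze(0)
    g_type = GESTURE_TYPES[int(type_p.argmax())] if type_p.max() > threshold else None
    g_state = GESTURE_STATES[int(state_p.argmax())] if state_p.max() > threshold else None
    return g_type, g_state
```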
Step S2.4: estimating the space gesture, namely estimating the head gesture of a driver through a 3D gesture estimation network to acquire information such as the rotation angle and the direction of the head;
exemplary, specific steps of step S2.4 are as follows:
first, a driver head-pose data set is prepared for training and testing; the data set should contain head images or video sequences of the driver in different poses, labeled with the corresponding head rotation angles and directions.
The 3D pose estimation network is trained with the training set by inputting the preprocessed head images or video sequences into the network;
from the output of the 3D pose estimation network, information such as the rotation angle and direction of the driver's head can be obtained; the rotation angle and direction can be extracted from the Euler angles or the rotation matrix output by the network.
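As a small worked example of this last step, the sketch below recovers yaw, pitch and roll (in degrees) from a 3x3 rotation matrix such as a pose network might output; the ZYX angle convention is an assumption, since the patent does not fix one.

```python
import numpy as np

def rotation_to_euler(R):
    """Convert a 3x3 rotation matrix to (yaw, pitch, roll) in degrees (ZYX convention)."""
    sy = np.sqrt(R[0, 0] ** 2 + R[1, 0] ** 2)
    if sy > 1e-6:                                  # regular case
        yaw = np.arctan2(R[1, 0], R[0, 0])
        pitch = np.arctan2(-R[2, 0], sy)
        roll = np.arctan2(R[2, 1], R[2, 2])
    else:                                          # gimbal-lock fallback
        yaw = 0.0
        pitch = np.arctan2(-R[2, 0], sy)
        roll = np.arctan2(-R[1, 2], R[1, 1])
    return tuple(np.degrees([yaw, pitch, roll]))

# Example: the identity rotation corresponds to a neutral, forward-facing head pose.
print(rotation_to_euler(np.eye(3)))                # (0.0, 0.0, 0.0)
```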
Step S2.5: and (3) analyzing driving scenes, namely performing target detection and tracking on images of driving environments through a fast R-CNN model, and extracting scene information such as road signs, traffic lights, pedestrians, obstacles and the like.
Exemplary, specific steps of step S2.5 are as follows:
first, a driving environment image dataset for training and testing is prepared. The dataset should contain images of various objects in the driving environment (e.g., road signs, traffic lights, pedestrians, obstacles, etc.), and be labeled with bounding box locations and category labels for the corresponding objects.
The Faster R-CNN model is trained with the training set: the preprocessed driving environment images are input to the model, which outputs the detected bounding boxes, categories and confidence scores;
from the output of the Faster R-CNN model, scene information such as road signs, traffic lights, pedestrians and obstacles in the driving environment can then be extracted.
Specifically, referring to fig. 3, the key semantic information in step S3 is extracted through the following steps:
step S3.1: segmenting the driving environment image and extracting semantic information of different regions, specifically segmenting regions such as road, sky and buildings and inferring the semantic meaning of each region (a segmentation sketch follows this list);
step S3.2: establishing a semantic relation model through a graph neural network and analyzing the relations between target objects in the driving environment, such as the relation between traffic signals and pedestrians and between vehicles and road signs;
step S3.3: action recognition and intention reasoning, namely comprehensively analyzing the behavior of the driver and the driving environment through an action recognition and intention reasoning algorithm, and inferring the intention of the driver from the driver's actions and the environmental semantic information;
step S3.4: modeling and analyzing the driver behavior and the driving environment through a recurrent neural network in combination with historical data, and predicting possible future events, including traffic jams and road conditions, by taking the historical data of the current driving environment into account.
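For step S3.1, the sketch below uses torchvision's DeepLabV3 as a stand-in segmenter to produce a per-pixel class map; the model choice is an assumption, and its pretrained classes do not include road, sky or buildings, so it would need fine-tuning on a driving data set (e.g. Cityscapes) to match the regions named in the text.

```python
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

segmenter = torchvision.models.segmentation.deeplabv3_resnet50(weights="DEFAULT")
segmenter.eval()

def segment_regions(image_path):
    """Return an [H, W] map of per-pixel class ids for the driving-environment image;
    with driving-specific fine-tuning the ids would correspond to road, sky, buildings, etc."""
    img = to_tensor(Image.open(image_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        logits = segmenter(img)["out"]             # [1, num_classes, H, W]
    return logits.argmax(dim=1).squeeze(0)
```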
Specifically, referring to fig. 4, the multi-modal fusion in step S4 proceeds as follows:
step S4.1: fusing the feature information from visual feature extraction and semantic analysis, specifically fusing the visual features from face recognition, expression analysis and gesture recognition with the semantic features from target detection and scene segmentation to form a multi-modal feature vector;
step S4.2: adjusting the weight of the feature information according to the contribution of each modality, and further adjusting the weights of different features according to the complexity of the driving scene and the criticality of the driver's behavior, so that important feature information has greater influence on the final intention recognition result during fusion;
step S4.3: resolving conflicts when different modality information disagrees, for example when the face analysis indicates that the driver is angry but the other visual features and the semantic analysis indicate a normal driving scene, so as to determine the final intention recognition result;
step S4.4: predicting the intention of the driver at the current moment by combining the intention recognition result of the previous moment with the current driving environment state, so as to obtain a more accurate intention recognition result that adapts to changes in the driving environment; a simple sketch of this temporal update follows.
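A minimal sketch of the temporal update in step S4.4, assuming intentions are represented as a probability distribution over intention classes; the previous result is blended with the evidence from the current driving environment, and the blending factor alpha is an illustrative assumption.

```python
import numpy as np

def update_intention(prev_probs, current_evidence, alpha=0.7):
    """Blend the previous intention distribution with evidence from the current
    driving-environment state; both inputs are arrays over intention classes."""
    blended = alpha * np.asarray(current_evidence) + (1 - alpha) * np.asarray(prev_probs)
    return blended / blended.sum()

# Example: the previous result favoured "lane change"; fresh evidence agrees, so it strengthens.
print(update_intention([0.2, 0.7, 0.1], [0.1, 0.8, 0.1]))
```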
Specifically, in the step S4.2, the weights of the feature information are adjusted as follows:
first, the weight of each modality is initialized, either randomly or according to prior knowledge;
second, the feature information of each modality in the multi-modal feature vector is forward-propagated and the output of each modality is computed;
third, based on the outputs of the modalities, the total loss is computed as the weighted sum of the per-modality losses;
fourth, the loss is back-propagated through the network and the contribution of each modality to the total loss is computed;
fifth, the gradient of each modality weight is computed from that modality's contribution to the total loss;
sixth, the modality weights are updated with a gradient-descent optimizer.
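The six steps above can be sketched as follows in PyTorch (an assumption), with one learnable weight per modality, per-modality losses combined into a total loss, and a gradient-descent optimizer updating the weights; the dimensions, intention count and exact loss combination are illustrative choices.

```python
import torch
import torch.nn as nn

class WeightedFusion(nn.Module):
    def __init__(self, modal_dims, num_intentions):
        super().__init__()
        # step 1: initialize one weight per modality (uniformly here)
        self.weights = nn.Parameter(torch.ones(len(modal_dims)))
        self.heads = nn.ModuleList([nn.Linear(d, num_intentions) for d in modal_dims])

    def forward(self, modal_feats):
        # step 2: forward-propagate each modality and collect its output
        outs = [head(f) for head, f in zip(self.heads, modal_feats)]
        w = torch.softmax(self.weights, dim=0)           # normalized modality weights
        fused = sum(wi * o for wi, o in zip(w, outs))    # weighted fusion of the outputs
        return fused, outs

# modal_feats: e.g. [visual_feats, semantic_feats, gesture_feats], each [batch, dim]
model = WeightedFusion(modal_dims=[128, 64, 32], num_intentions=5)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)   # step 6 uses gradient descent
criterion = nn.CrossEntropyLoss()

def training_step(modal_feats, target):
    fused, outs = model(modal_feats)
    # step 3: total loss = fused loss plus the weighted sum of per-modality losses
    per_modal = torch.stack([criterion(o, target) for o in outs])
    loss = criterion(fused, target) + (torch.softmax(model.weights, dim=0) * per_modal).sum()
    optimizer.zero_grad()
    loss.backward()      # steps 4-5: back-propagation yields each weight's gradient
    optimizer.step()     # step 6: the optimizer updates the modality weights (and heads)
    return loss.item()
```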
Specifically, in step S4.3, when different modality information conflicts, the conflict is resolved as follows:
conflicts are detected in the fused multi-modal data through a support vector machine model and the conflicting regions of the fused data are marked; for each detected conflict, a decision tree algorithm makes a decision that combines the weights of the modality information, thereby determining the final intention recognition result.
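A hedged sketch of this conflict handling using scikit-learn (an assumption): an SVM flags conflicting fused samples, and a decision tree, given the fused features together with the modality weights, decides the final intention for flagged samples; the feature layout and labels are illustrative.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

conflict_detector = SVC(kernel="rbf")            # trained on fused vectors; 1 = conflict
arbiter = DecisionTreeClassifier(max_depth=5)    # trained to output the final intention

def fit(fused_X, conflict_y, arbiter_X, intent_y):
    """fused_X: fused multi-modal vectors; arbiter_X: fused vectors concatenated with
    the modality weights; conflict_y / intent_y: the corresponding labels."""
    conflict_detector.fit(fused_X, conflict_y)
    arbiter.fit(arbiter_X, intent_y)

def resolve(fused_x, modal_weights, default_intent):
    """Return the final intention for one fused sample, arbitrating only on conflict."""
    if conflict_detector.predict(fused_x.reshape(1, -1))[0] == 1:    # conflict flagged
        features = np.concatenate([fused_x, modal_weights]).reshape(1, -1)
        return arbiter.predict(features)[0]
    return default_intent
```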
The foregoing describes only preferred embodiments of the invention and is not intended to limit it; any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (6)

1. A driver intention recognition method, characterized in that the method comprises the steps of:
step S1: collecting data of driver behaviors and image and video data of driving environment;
step S2: extracting key visual features from the data of the step S1 for subsequent intention recognition;
step S3: carrying out semantic analysis on the acquired driving environment image and video data, and extracting key semantic information;
step S4: and fusing various feature information according to the visual feature extraction and the semantic analysis to obtain a driver intention recognition result.
2. The method for recognizing the intention of a driver according to claim 1, wherein the visual feature extraction in the step S2 specifically includes the following steps:
step S2.1: face detection and recognition, namely detecting the face in the driver behavior data through an MTCNN neural network, and recognizing the detected face through a FaceNet face recognition algorithm to acquire the identity information of the driver;
step S2.2: expression analysis, namely analyzing the facial expression of the driver through a FERNET network, and extracting the current expression state of the driver, including anger, happiness and confusion, through a facial expression recognition algorithm;
step S2.3: gesture recognition, namely recognizing the hand actions of the driver through a spatio-temporal attention network, and extracting the type and state of the driver's gesture;
step S2.4: spatial pose estimation, namely estimating the head pose of the driver through a 3D pose estimation network to acquire information such as the rotation angle and direction of the head;
step S2.5: driving scene analysis, namely performing target detection and tracking on images of the driving environment through a Faster R-CNN model, and extracting scene information such as road signs, traffic lights, pedestrians and obstacles.
3. The method for identifying the intention of the driver according to claim 2, wherein the key semantic information in the step S3 is extracted through the following steps:
step S3.1: segmenting the driving environment image and extracting semantic information of different regions, specifically segmenting regions such as road, sky and buildings and inferring the semantic meaning of each region;
step S3.2: establishing a semantic relation model through a graph neural network and analyzing the relations between target objects in the driving environment, such as the relation between traffic signals and pedestrians and between vehicles and road signs;
step S3.3: action recognition and intention reasoning, namely comprehensively analyzing the behavior of the driver and the driving environment through an action recognition and intention reasoning algorithm, and inferring the intention of the driver from the driver's actions and the environmental semantic information;
step S3.4: modeling and analyzing the driver behavior and the driving environment through a recurrent neural network in combination with historical data, and predicting possible future events, including traffic jams and road conditions, by taking the historical data of the current driving environment into account.
4. The method for identifying the driver's intention according to claim 3, wherein the multi-modal fusion in step S4 proceeds as follows:
step S4.1: fusing the feature information from visual feature extraction and semantic analysis, specifically fusing the visual features from face recognition, expression analysis and gesture recognition with the semantic features from target detection and scene segmentation to form a multi-modal feature vector;
step S4.2: adjusting the weight of the feature information according to the contribution of each modality, and further adjusting the weights of different features according to the complexity of the driving scene and the criticality of the driver's behavior, so that important feature information has greater influence on the final intention recognition result during fusion;
step S4.3: resolving conflicts when different modality information disagrees, for example when the face analysis indicates that the driver is angry but the other visual features and the semantic analysis indicate a normal driving scene, so as to determine the final intention recognition result;
step S4.4: predicting the intention of the driver at the current moment by combining the intention recognition result of the previous moment with the current driving environment state, so as to obtain a more accurate intention recognition result that adapts to changes in the driving environment.
5. The method for identifying the intention of the driver according to claim 4, wherein in the step S4.2, the weights of the feature information are adjusted as follows:
first, the weight of each modality is initialized, either randomly or according to prior knowledge;
second, the feature information of each modality in the multi-modal feature vector is forward-propagated and the output of each modality is computed;
third, based on the outputs of the modalities, the total loss is computed as the weighted sum of the per-modality losses;
fourth, the loss is back-propagated through the network and the contribution of each modality to the total loss is computed;
fifth, the gradient of each modality weight is computed from that modality's contribution to the total loss;
sixth, the modality weights are updated with a gradient-descent optimizer.
6. The method for identifying the intention of the driver according to claim 5, wherein in the step S4.3, when different modality information conflicts, the conflict is resolved as follows:
conflicts are detected in the fused multi-modal data through a support vector machine model and the conflicting regions of the fused data are marked; for each detected conflict, a decision tree algorithm makes a decision that combines the weights of the modality information, thereby determining the final intention recognition result.
CN202311618805.8A 2023-11-30 2023-11-30 Driver intention recognition method Active CN117485348B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311618805.8A CN117485348B (en) 2023-11-30 2023-11-30 Driver intention recognition method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311618805.8A CN117485348B (en) 2023-11-30 2023-11-30 Driver intention recognition method

Publications (2)

Publication Number Publication Date
CN117485348A true CN117485348A (en) 2024-02-02
CN117485348B CN117485348B (en) 2024-07-19

Family

ID=89676424

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311618805.8A Active CN117485348B (en) 2023-11-30 2023-11-30 Driver intention recognition method

Country Status (1)

Country Link
CN (1) CN117485348B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109658503A (en) * 2018-12-29 2019-04-19 北京理工大学 A kind of driving behavior intention detection method merging EEG signals
WO2020074565A1 (en) * 2018-10-11 2020-04-16 Continental Automotive Gmbh Driver assistance system for a vehicle
CN112434588A (en) * 2020-11-18 2021-03-02 青岛慧拓智能机器有限公司 Inference method for end-to-end driver expressway lane change intention
CN113392692A (en) * 2020-02-26 2021-09-14 本田技研工业株式会社 Driver-centric risk assessment: risk object identification via causal reasoning for intent-aware driving models
CN113386775A (en) * 2021-06-16 2021-09-14 杭州电子科技大学 Driver intention identification method considering human-vehicle-road characteristics
CN115027484A (en) * 2022-05-23 2022-09-09 吉林大学 Human-computer fusion perception method for high-degree automatic driving

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020074565A1 (en) * 2018-10-11 2020-04-16 Continental Automotive Gmbh Driver assistance system for a vehicle
CN109658503A (en) * 2018-12-29 2019-04-19 北京理工大学 A kind of driving behavior intention detection method merging EEG signals
CN113392692A (en) * 2020-02-26 2021-09-14 本田技研工业株式会社 Driver-centric risk assessment: risk object identification via causal reasoning for intent-aware driving models
CN112434588A (en) * 2020-11-18 2021-03-02 青岛慧拓智能机器有限公司 Inference method for end-to-end driver expressway lane change intention
CN113386775A (en) * 2021-06-16 2021-09-14 杭州电子科技大学 Driver intention identification method considering human-vehicle-road characteristics
CN115027484A (en) * 2022-05-23 2022-09-09 吉林大学 Human-computer fusion perception method for high-degree automatic driving

Also Published As

Publication number Publication date
CN117485348B (en) 2024-07-19

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant