CN117218716B - DVS-based automobile cabin gesture recognition system and method - Google Patents

Info

Publication number: CN117218716B
Application number: CN202311005673.1A
Authority: CN (China)
Other versions: CN117218716A (Chinese, zh)
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Prior art keywords: gesture, vehicle, dimensional, feedback, DVS
Inventors: 孙晓凯, 郝敬宾, 刘新华, 华德正, 梁赐, 曹戎格, 徐通, 刘晓帆, 周皓
Current and original assignee: China University of Mining and Technology (CUMT) (the listed assignees may be inaccurate)
Application filed by China University of Mining and Technology (CUMT)
Priority: CN202311005673.1A

Landscapes

  • User Interface Of Digital Computer (AREA)

Abstract

The invention discloses a DVS-based automobile cabin gesture recognition system and method. The system comprises a perception layer, a decision layer, and an execution layer connected in sequence: the perception layer acquires gesture images of the user with a DVS; the decision layer classifies the gesture image signals through an algorithm processing module and an output module; and the execution layer executes gesture commands through a feedback module. The method collects gesture data with the DVS, transmits the image signal to the algorithm processing module for processing, extracts depth information and three-dimensional hand-skeleton features for multi-modal fusion, outputs a signal to the execution layer after the output module's algorithm verification, signal analysis, and model verification, and finally executes the command with the feedback module. The system and method can filter out the gesture-habit differences of individual users and address low recognition accuracy and speed.

Description

DVS-based automobile cabin gesture recognition system and method
Technical Field
The invention relates to gesture recognition systems, and in particular to a DVS-based automobile cabin gesture recognition system and method, belonging to the technical field of automobile cabin and vehicle-mounted gesture recognition.
Background
With the development of automobile intelligence, gesture recognition technology is widely applied to human-machine interaction and control in automobile cabins, enriching daily life and making driving more pleasant. Advances in human-machine interaction can improve driving safety and convenience, reduce the driver's visual and operational load, and have pushed in-cabin gesture interaction systems into a period of rapid development.
In the prior art, for example publication CN110119200A, an infrared distance sensor is wired to an optical sensor, which is in turn wired to a gyroscope, an accelerometer, and an MCU processor; the MCU processor is wired to a gesture storage module. Such an automobile gesture recognition system can reduce the number of physical keys, simplify vehicle operation, make it faster and more efficient, and even add some entertainment value for the operator. However, existing cabin gesture recognition is mainly static: wearable sensor devices are common and fairly robust, but their cost is relatively high and unsuitable for mass production, and vehicle-mounted static gesture recognition, although accurate and easy to implement, no longer matches current production and usage requirements. Based on a DVS, gesture actions can be captured in real time, spatio-temporal features can be fused through LSTM and MobileNetV3 networks to obtain a three-dimensional gesture database of the user's gestures, and the matched gesture signals can be fed back in real time through the output signal of gesture matching. Compared with a conventional camera, this approach offers a clear precision advantage and sharper imaging.
Therefore, to overcome the drawbacks of the prior art and meet the market demand for intelligent automobile cabins, it is necessary to design a DVS-based gesture recognition system and method for the intelligent automobile cabin that solves the above problems.
Disclosure of Invention
The invention aims to solve at least one of the above technical problems by providing a DVS-based automobile cabin gesture recognition system and method with high recognition precision and safety, whose gesture learning and optimization adapts intelligently to environmental change and provides users with a good human-machine interaction experience.
The invention achieves this purpose through the following technical scheme: a DVS-based automobile cabin gesture recognition system comprises a perception layer, a decision layer, and an execution layer connected in sequence, characterized in that: the perception layer consists of a DVS; the decision layer consists of an algorithm processing module and an output module; and the execution layer consists of a feedback module.
As a still further aspect of the invention: the DVS of the perception layer is mounted in the cabin, captures the gesture motion (brightness or distance change information) of the driver or a passenger with microsecond time resolution, and generates time-correlated events, each carrying a timestamp and the spatial location of the event.
As a still further aspect of the invention: the DVS is configured to receive the events and to filter, cluster, and sort them in time to reconstruct a time series of the gesture action.
As a still further aspect of the invention: the decision layer receives the gesture signals acquired by the DVS of the perception layer, processes them through the decision layer algorithm, and outputs the result to the execution layer.
As a still further aspect of the invention: the algorithm processing module combines depth information with three-dimensional hand-skeleton features and performs multi-modal fusion of MobileNetV3 with LSTM.
As a still further aspect of the invention: the output module processes the information output by the algorithm processing module through algorithm verification, signal analysis, and model verification, and passes it to the execution layer;
wherein the output module operates through the following steps:
S1, DVS gesture library construction: collect varied three-dimensional gesture data according to vehicle function requirements and interaction habits and build a three-dimensional gesture library, in which each gesture corresponds to one vehicle control operation;
S2, gesture matching: match the acquired three-dimensional gesture image sequence, using the fusion model of LSTM and MobileNetV3, against each gesture template in the gesture library to obtain the best-matching gesture category and its matching degree;
S3, gesture filtering: set a threshold on the three-dimensional gesture matching degree, filter out gestures with low matching degree, and select only gestures above the threshold for subsequent control operations;
S4, control instruction generation: generate the corresponding vehicle control instruction from the gesture template that best matches the input three-dimensional gesture;
S5, scene judgment: judge the current driving scene of the vehicle; if the best-matching gesture operation does not fit the current scene, generate no control instruction and give a warning;
S6, visual feedback: display the best-matching gesture template on the vehicle display screen together with the execution effect of the corresponding vehicle function;
S7, operation record: record the driver's three-dimensional gesture operations and the system's feedback for incremental learning of the gesture library and optimization of template matching;
S8, matching optimization: continuously optimize the matching model between three-dimensional gestures and templates using an incremental learning method.
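Steps S2-S4 can be sketched as template matching followed by threshold filtering and instruction lookup. This is a hedged illustration, not the patent's actual model: cosine similarity stands in for the LSTM/MobileNetV3 matching score, and the template vectors, gesture names, and instruction strings are hypothetical.

```python
import numpy as np

# Hypothetical gesture library: template feature vectors paired with
# vehicle-control operations (names are illustrative, not from the patent).
GESTURE_LIBRARY = {
    "pinch":      (np.array([1.0, 0.0, 0.0]), "adjust_media_volume"),
    "palm_swing": (np.array([0.0, 1.0, 0.0]), "move_window"),
    "fist":       (np.array([0.0, 0.0, 1.0]), "decelerate"),
}

def match_gesture(feature, threshold=0.8):
    """S2: score the fused gesture feature against every template;
    S3: filter out low-confidence matches; S4: emit the mapped instruction."""
    best_name, best_score = None, -1.0
    for name, (template, _) in GESTURE_LIBRARY.items():
        score = float(np.dot(feature, template) /
                      (np.linalg.norm(feature) * np.linalg.norm(template)))
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:        # S3: reject gestures below the threshold
        return None, best_score
    return GESTURE_LIBRARY[best_name][1], best_score   # S4: control instruction
```

A rejected match (returning `None`) corresponds to the case where no control instruction is generated.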
As a still further aspect of the invention: the feedback module of the execution layer receives the command signal output by the decision layer and executes the command; the output modes include air-conditioner air volume, media volume, window lifting, and center-control-screen page turning;
wherein the feedback module operates through the following steps:
S1, visual feedback: display the best-matching gesture template and the execution effect of the corresponding vehicle function (such as a window opening) on the vehicle display screen, giving the driver appropriate visual feedback and making it easy to correct or re-enter a gesture;
S2, voice feedback: the system announces the best-matching gesture operation and the executed vehicle control instruction by voice, with any necessary voice reminders and interaction;
S3, function execution: after receiving the mapped control instruction, the vehicle system directs the corresponding vehicle function module to execute the operation (such as opening a window); the execution result is also displayed as feedback;
S4, matching result: the system informs the driver of the matching result between the three-dimensional gesture and the gesture templates, including the best-matching gesture type and matching degree, which also lets the driver judge the accuracy of the gesture input;
S5, error reporting: if the system detects that the three-dimensional gesture does not fit the current driving scene and generates no control instruction, it issues a visual and voice error prompt asking the driver to re-enter the gesture;
S6, operation record: the system records the whole three-dimensional gesture operation and feedback process for later analysis of interaction effectiveness and experience improvement; the records can also be used for online learning and optimization of the matching model;
S7, user evaluation: the system asks the driver for evaluation feedback on the three-dimensional gesture interaction and uses it to decide whether the matching model and interaction rules need updating for personalized optimization.
As a still further aspect of the invention: the feedback module specifically comprises the air conditioner, the volume control, the central control screen, the vehicle windows, and so on, each responding differently to command signals: the central control screen responds to gestures with page scrolling, page turning, and mid-air clicks; the volume is adjusted according to the thumb-index distance along the z direction of the spatial coordinate system; a window rises or falls with an up-and-down swing of the gesture, while a forward-and-backward swing opens or closes the sunroof; and the air-conditioner air volume is adjusted according to the thumb-index distance along the x direction of the spatial coordinate system.
As a still further aspect of the invention: the feedback output of the feedback module is activated mainly by the following gesture-to-command mappings: when the index finger touches the thumb, changing the thumb-index distance outputs a media-volume adjustment signal and changes the media volume; when the palm is horizontal with the other four fingers pointing downward and swings up and down, the window falls with the palm, and rises on the opposite motion; when the palm is upright and swings forward and backward, the sunroof opens or closes forward and backward with it; when the interaction involves the vehicle screen, a fist with only the index finger extended moves left, right, up, and down to control left-right page turning and up-down content scrolling; when the index-finger joint shows a large angle change, a single-tap command is issued once at the corresponding screen position, and two such changes less than 0.5 seconds apart merge into one double-tap operation on the screen; and when a fist gesture command occurs, the user judges autonomously whether the driving environment is safe, and the vehicle immediately enters a decelerating state.
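The distance-to-volume mappings above (thumb-index distance along z for media volume, along x for air-conditioner air volume) can be sketched as a clamped linear map. The 1 cm to 12 cm usable range and the 0-100 output scale are assumptions for illustration, not values stated in the patent.

```python
def pinch_to_volume(thumb, index, axis=2, d_min=0.01, d_max=0.12):
    """Map the thumb-index distance along one spatial axis (metres) to a
    0-100 level. axis=2 is the z-direction media-volume mapping; passing
    axis=0 would give the x-direction air-volume mapping."""
    d = abs(thumb[axis] - index[axis])
    d = min(max(d, d_min), d_max)                        # clamp to usable range
    return round(100 * (d - d_min) / (d_max - d_min))    # linear scaling
```

The clamp keeps sensor jitter near a closed pinch from producing negative levels.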
A DVS-based automobile cabin gesture recognition method uses the above automobile cabin gesture recognition system; the method performs detection based on a multi-modal fusion network model and comprises the following steps:
S1, gesture capture: capture the user's gesture actions in real time with the DVS and generate gesture sequence frame images, forming an event sequence of the gesture image; the event sequence is further processed as the raw network input to extract temporal and spatial features, acquiring gestures from different viewing angles and modalities, enriching the input information, and improving detection robustness;
S2, gesture keypoint detection: detect gesture keypoints in each frame of the gesture image time series using the keypoint detection model OpenPose, obtaining multi-view time-series coordinates of the gesture keypoints and capturing the fine motion characteristics and three-dimensional spatial information of the gesture;
S3, gesture image preprocessing: apply scale normalization, image rotation, noise filtering, frame selection, and modal registration to the acquired gesture image time series to improve the alignment of the time series across modalities and the feature extraction effect;
S4, spatial feature extraction: extract features from each frame of the preprocessed gesture image time series using a lightweight network such as MobileNetV3 to obtain high-level spatial feature maps of the images and strengthen the understanding of gesture details;
S5, multi-modal spatio-temporal feature fusion: fuse the multi-view gesture keypoint time series and the image spatial feature maps obtained in steps S2 and S4 to construct the spatio-temporal features of the gesture, which serve as the input of an LSTM network for gesture detection;
S6, post-processing of detection results: post-process the LSTM's predictions, including multi-modal feature mapping, coordinate mapping, smoothing, and three-dimensional reconstruction, to obtain the gesture category, the multi-view keypoint time series, and the three-dimensional gesture spatial information.
The beneficial effects of the invention are as follows:
1. Personalization: by recording and analyzing the driver's gesture operations and evaluation feedback, the system can personalize its three-dimensional gesture interaction, select the interaction rules and mapping model that best fit the driver's operating habits, and deliver a personalized human-machine interaction experience.
2. Safety: the operation-record and feedback mechanism provides data support for the after-the-fact investigation of possible traffic accidents, makes it easy to determine how a control instruction was generated and who is responsible, and helps continuously improve the system's safety-assurance mechanism and error-prevention capability.
3. Learning and optimization: the large amount of collected gesture-operation data supports system learning and optimization; analyzing it strengthens the system's understanding of complex scenes and personal habits, continuously improves the human-machine interaction mechanism, and achieves intelligent adaptation to environmental change.
4. Practicality: the three-dimensional gesture interaction system and its targeted control-feedback mechanism apply well to intelligent driving scenarios, achieve close coordination between natural gesture interaction and the vehicle system, and improve driver convenience and system usability, giving the method high practical value.
Drawings
FIG. 1 is a block diagram of a gesture recognition system of the present invention;
FIG. 2 is a block diagram of a gesture recognition method according to the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Embodiment one: as shown in fig. 1, a DVS-based automobile cabin gesture recognition system includes a perception layer, a decision layer, and an execution layer connected in sequence, characterized in that: the perception layer consists of a DVS; the decision layer consists of an algorithm processing module and an output module; and the execution layer consists of a feedback module.
Embodiment two: in addition to all the technical features in the first embodiment, the present embodiment further includes:
the DVS of the perception layer is assembled within the cabin, captures the gesture motion (brightness or distance change information) of the driver or passenger with microsecond time resolution, and generates time-related events that carry the time stamp and spatial location information of event findings.
The DVS is configured to receive the events, filter, cluster, and sort the time to reconstruct a time series of gesture actions.
The decision layer receives the gesture signals acquired by the DVS of the perception layer, processes them through the decision layer algorithm, and outputs the result to the execution layer.
The algorithm processing module combines depth information with three-dimensional hand-skeleton features and performs multi-modal fusion of MobileNetV3 with LSTM.
Embodiment three: in addition to all the technical features in the first embodiment, the present embodiment further includes:
the output module processes the information output by the algorithm processing module through algorithm verification, signal analysis and model verification, and inputs the information to the execution layer;
wherein the output module operates through the following steps:
S1, DVS gesture library construction: collect varied three-dimensional gesture data according to vehicle function requirements and interaction habits and build a three-dimensional gesture library, in which each gesture corresponds to one vehicle control operation;
S2, gesture matching: match the acquired three-dimensional gesture image sequence, using the fusion model of LSTM and MobileNetV3, against each gesture template in the gesture library to obtain the best-matching gesture category and its matching degree;
S3, gesture filtering: set a threshold on the three-dimensional gesture matching degree, filter out gestures with low matching degree, and select only gestures above the threshold for subsequent control operations, improving instruction accuracy and system robustness;
S4, control instruction generation: generate the corresponding vehicle control instruction from the gesture template that best matches the input three-dimensional gesture; for example, if the best-matching template corresponds to opening the left front door window, a control instruction to open the left front door window is generated;
S5, scene judgment: judge the current driving scene of the vehicle; if the best-matching gesture operation does not fit the current scene, generate no control instruction and give an alarm, avoiding erroneous operations caused by scene changes during three-dimensional gesture interaction;
S6, visual feedback: display the best-matching gesture template on the vehicle display screen together with the execution effect of the corresponding vehicle function, giving the driver appropriate visual feedback and making it easy to correct or re-enter a gesture;
S7, operation record: record the driver's three-dimensional gesture operations and the system's feedback for incremental learning of the gesture library and optimization of template matching, enabling personalized customization;
S8, matching optimization: continuously optimize the matching model between three-dimensional gestures and templates using an incremental learning method, improving the accuracy and robustness of gesture matching and laying a foundation for high-performance human-machine interaction.
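One minimal way to realize the S8 incremental update is to pull a confirmed gesture's template toward each newly recorded sample. This is a sketch under assumptions: the exponential moving average and the unit-norm convention stand in for the patent's unspecified incremental-learning method, and the learning rate is illustrative.

```python
import numpy as np

def update_template(template, new_feature, lr=0.05):
    """S8 sketch: nudge a gesture template toward a newly confirmed
    sample with an exponential moving average, then renormalize so the
    template stays unit-length for cosine-similarity matching."""
    t = (1 - lr) * np.asarray(template) + lr * np.asarray(new_feature)
    return t / np.linalg.norm(t)
```

Applied after every confirmed recognition, this slowly adapts each template to an individual driver's gesture habits.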
The output method can improve the data processing speed, maintain good accuracy and robustness and provide personalized man-machine interaction.
Embodiment four: in addition to all the technical features in the first embodiment, the present embodiment further includes:
the feedback module of the execution layer receives the command signal output by the decision layer and executes the command, and the output mode comprises air conditioner air volume, media volume, window lifting and center control screen page turning;
wherein the feedback module operates through the following steps:
S1, visual feedback: display the best-matching gesture template and the execution effect of the corresponding vehicle function (such as a window opening) on the vehicle display screen, giving the driver appropriate visual feedback and making it easy to correct or re-enter a gesture;
S2, voice feedback: the system announces the best-matching gesture operation and the executed vehicle control instruction by voice, with any necessary voice reminders and interaction;
S3, function execution: after receiving the mapped control instruction, the vehicle system directs the corresponding vehicle function module to execute the operation (such as opening a window); the execution result is also displayed as feedback;
S4, matching result: the system informs the driver of the matching result between the three-dimensional gesture and the gesture templates, including the best-matching gesture type and matching degree, which also lets the driver judge the accuracy of the gesture input;
S5, error reporting: if the system detects that the three-dimensional gesture does not fit the current driving scene and generates no control instruction, it issues a visual and voice error prompt asking the driver to re-enter the gesture;
S6, operation record: the system records the whole three-dimensional gesture operation and feedback process for later analysis of interaction effectiveness and experience improvement; the records can also be used for online learning and optimization of the matching model;
S7, user evaluation: the system asks the driver for evaluation feedback on the three-dimensional gesture interaction and uses it to decide whether the matching model and interaction rules need updating for personalized optimization.
Through the feedback module, interactive feedback and communication of man-machine interaction can be effectively realized, dangerous operation is avoided, the intelligence of the system is enhanced, and the safety of the system is ensured.
Embodiment five: in addition to all the technical features in the first embodiment, the present embodiment further includes:
the feedback module specifically comprises: the air conditioner, the volume, the central control screen, the car window and the like can respond differently when receiving command signals, wherein the central control screen can respond according to gesture page scrolling, page turning, space clicking and the like; the volume can be adjusted according to the distance between the thumb and the index finger in the z direction of the space coordinate system; the vehicle window can ascend or descend according to the up-and-down swing of the gesture, and the front-and-back swing enables the skylight to be opened or closed; the air quantity of the air conditioner is adjusted according to the distance between the thumb and the index finger in the x direction of the space coordinate system.
The feedback output of the feedback module is activated mainly by the following gesture-to-command mappings: when the index finger touches the thumb, changing the thumb-index distance outputs a media-volume adjustment signal and changes the media volume; when the palm is horizontal with the other four fingers pointing downward and swings up and down, the window falls with the palm, and rises on the opposite motion; when the palm is upright and swings forward and backward, the sunroof opens or closes forward and backward with it; when the interaction involves the vehicle screen, a fist with only the index finger extended moves left, right, up, and down to control left-right page turning and up-down content scrolling; when the index-finger joint shows a large angle change, a single-tap command is issued once at the corresponding screen position, and two such changes less than 0.5 seconds apart merge into one double-tap operation on the screen; and when a fist gesture command occurs, the user judges autonomously whether the driving environment is safe, and the vehicle immediately enters a decelerating state.
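The screen-tap timing described above can be sketched directly from the stated rule: two index-finger joint flexions less than 0.5 s apart merge into one double-tap, while isolated flexions are single taps. The function name and the list-of-timestamps interface are illustrative.

```python
def classify_taps(tap_times, double_window=0.5):
    """Classify a sorted sequence of joint-flexion times (seconds) into
    single-tap and double-tap commands, using the 0.5 s merge window
    stated in the mapping rules above."""
    actions, i = [], 0
    while i < len(tap_times):
        if i + 1 < len(tap_times) and tap_times[i + 1] - tap_times[i] < double_window:
            actions.append(("double_tap", tap_times[i]))   # two flexions merge
            i += 2
        else:
            actions.append(("single_tap", tap_times[i]))
            i += 1
    return actions
```

Each emitted action would be dispatched to the screen position reported for the first flexion of the pair.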
Embodiment six: as shown in fig. 2, a DVS-based automobile cabin gesture recognition method uses the above automobile cabin gesture recognition system; the method performs detection based on a multi-modal fusion network model and comprises the following steps:
S1, gesture capture: capture the user's gesture actions in real time with the DVS and generate gesture sequence frame images, forming an event sequence of the gesture image; the event sequence is further processed as the raw network input to extract temporal and spatial features, acquiring gestures from different viewing angles and modalities, enriching the input information, and improving detection robustness;
S2, gesture keypoint detection: detect gesture keypoints in each frame of the gesture image time series using the keypoint detection model OpenPose, obtaining multi-view time-series coordinates of the gesture keypoints and capturing the fine motion characteristics and three-dimensional spatial information of the gesture;
S3, gesture image preprocessing: apply scale normalization, image rotation, noise filtering, frame selection, and modal registration to the acquired gesture image time series to improve the alignment of the time series across modalities and the feature extraction effect;
S4, spatial feature extraction: extract features from each frame of the preprocessed gesture image time series using a lightweight network such as MobileNetV3 to obtain high-level spatial feature maps of the images and strengthen the understanding of gesture details;
S5, multi-modal spatio-temporal feature fusion: fuse the multi-view gesture keypoint time series and the image spatial feature maps obtained in steps S2 and S4 to construct the spatio-temporal features of the gesture, which serve as the input of an LSTM network for gesture detection;
S6, post-processing of detection results: post-process the LSTM's predictions, including multi-modal feature mapping, coordinate mapping, smoothing, and three-dimensional reconstruction, to obtain the gesture category, the multi-view keypoint time series, and the three-dimensional gesture spatial information.
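Of the S6 post-processing stages, the smoothing step can be sketched as a centered moving average over the keypoint time series, suppressing frame-to-frame jitter before three-dimensional reconstruction. The window length, edge padding, and (T, K, 3) array layout are assumptions for illustration.

```python
import numpy as np

def smooth_keypoints(seq, window=5):
    """S6 smoothing sketch: centered moving average along the time axis
    of a (T, K, 3) keypoint sequence; edges are padded by repetition so
    the output keeps the same length T."""
    pad = window // 2
    padded = np.pad(seq, ((pad, pad), (0, 0), (0, 0)), mode="edge")
    kernel = np.ones(window) / window
    # Convolve each (keypoint, coordinate) track independently over time.
    return np.apply_along_axis(
        lambda track: np.convolve(track, kernel, mode="valid"), 0, padded)
```

A Kalman or One-Euro filter would be a lower-latency alternative for the same slot in the pipeline.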
Through multi-mode network fusion, a new network model with high precision, strong timeliness and high robustness can be obtained.
The method is realized based on a DVS, which improves the overall stability of the gesture recognition system; it can effectively improve the accuracy of automobile cabin gesture recognition, reduce recognition delay, increase overall system robustness, and provide a better human-machine interaction experience. The operation-record and feedback mechanism provides data support for the after-the-fact investigation of possible traffic accidents, makes it easy to determine how a control instruction was generated and who is responsible, and helps continuously improve the system's safety-assurance mechanism and error-prevention capability. The gesture recognition system and its targeted feedback mechanism apply well to intelligent driving scenarios, achieving close coordination between natural gesture interaction and the automobile cabin system, improving driver convenience and system usability, and giving the method high practical value.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Furthermore, it should be understood that although this specification is described in terms of embodiments, not every embodiment contains only a single independent technical solution; this manner of description is adopted for clarity only, and the specification should be taken as a whole, the embodiments being combinable as appropriate to form other implementations that will be apparent to those skilled in the art.

Claims (2)

1. A DVS-based automobile cabin gesture recognition system, comprising a perception layer, a decision layer and an execution layer connected in sequence, characterized in that: the perception layer consists of a DVS; the decision layer consists of an algorithm processing module and an output module; the execution layer consists of a feedback module;
the DVS of the perception layer is mounted in the vehicle cabin, captures the gesture actions of a driver or passenger with microsecond time resolution, and generates time-correlated events, each event carrying a time stamp and the spatial position of its occurrence; the received events are filtered, clustered and ordered in time to reconstruct the time sequence of the gesture action;
the decision layer receives the gesture signals acquired by the DVS of the perception layer, processes them through the decision-layer algorithm and outputs them to the execution layer; the algorithm processing module combines depth information with three-dimensional hand skeleton features and performs multi-modal fusion of MobileNetV3 with LSTM;
the output module processes the information output by the algorithm processing module through algorithm verification, signal analysis and model verification, and passes it to the execution layer;
wherein the processing of the output module comprises the following steps:
s1, constructing a DVS gesture library: collecting various three-dimensional gesture data according to vehicle-mounted function requirements and interaction habits, and constructing a three-dimensional gesture library in which each gesture corresponds to a vehicle control operation;
s2, gesture matching: performing gesture matching on the acquired three-dimensional gesture image sequence using a fusion model of LSTM and MobileNetV3, matching it against each gesture template in the gesture library to obtain the best-matching gesture category and its matching degree;
s3, gesture filtering: setting a threshold on the matching degree of the three-dimensional gestures, filtering out gestures with low matching degree, and selecting only gestures whose matching degree exceeds the threshold for subsequent control operations;
s4, control instruction generation: generating the corresponding vehicle control instruction according to the gesture template that best matches the input three-dimensional gesture; for example, if the operation corresponding to the best-matching gesture template is opening the left front window, a control instruction for opening the left front window is generated;
s5, scene judgment: judging the current driving scene of the vehicle; if the best-matching gesture operation does not fit the current scene, no control instruction is generated and a warning is given;
s6, visual feedback: displaying the best-matching gesture template on the vehicle-mounted display screen, showing the execution effect of the corresponding vehicle-mounted function, and giving the driver appropriate visual feedback;
s7, operation recording: recording the driver's three-dimensional gesture operation process and the system's feedback process, which are used for incremental learning of the gesture library and optimization of template matching;
s8, matching optimization: continuously optimizing the matching model between three-dimensional gestures and templates by incremental learning, improving the accuracy and robustness of gesture matching and providing a basis for high-performance human-machine interaction;
the feedback module of the execution layer receives the command signal output by the decision layer and executes the command; the output modes include air-conditioner air volume, media volume, window lifting and center-control-screen page turning;
wherein the processing of the feedback module comprises the following steps:
s1, visual feedback: the best-matching gesture template and the execution effect of the corresponding vehicle-mounted function are displayed on the vehicle-mounted display screen, giving the driver appropriate visual feedback and making it convenient to correct or re-enter the gesture;
s2, voice feedback: the system informs the driver by voice of the best-matching gesture operation and the executed vehicle-mounted control instruction, and performs any necessary voice reminders and interaction;
s3, function execution: after receiving the control instruction generated by the mapping, the vehicle-mounted system controls the corresponding vehicle-mounted functional module to execute the operation;
s4, matching result: the system informs the driver of the matching result between the three-dimensional gesture and the gesture templates, including the best-matching gesture category and the matching degree;
s5, error reporting and reminding: if the system detects that the three-dimensional gesture does not match the current driving scene and no control instruction is generated, an error prompt is given visually and by voice, prompting the driver to input the gesture again;
s6, operation recording: the system records the entire three-dimensional gesture operation process and feedback process for subsequent analysis of the interaction effect and improvement of the experience;
s7, user evaluation: the system asks the driver for evaluation feedback on the three-dimensional gesture interaction effect, and accordingly decides whether the matching model and interaction rules need to be updated to achieve personalized optimization;
the feedback module specifically comprises the air conditioner, the volume control, the center control screen, the windows and the like, each of which responds differently upon receiving a command signal: the center control screen responds to gestures with page scrolling, page turning, mid-air clicking and the like; the volume is adjusted according to the distance between the thumb and the index finger along the z direction of the spatial coordinate system; the window rises or falls according to the up-and-down swing of the gesture, while a front-and-back swing opens or closes the sunroof; and the air-conditioner air volume is adjusted according to the distance between the thumb and the index finger along the x direction of the spatial coordinate system;
the feedback output of the feedback module comprises activation of the following mappings between gestures and control instructions:
when the index finger and the thumb are in contact, changing the distance between them outputs a media-volume adjustment signal, changing the media volume;
when the palm is horizontal with the four fingers pointing downward and swinging up and down, the window descends with the downward swing of the palm and ascends otherwise;
when the palm is upright and swings back and forth, the sunroof at the top of the vehicle opens or closes following the palm; when the human-computer interaction involves the vehicle-mounted screen, making a fist with only the index finger extended and moving it left, right, up and down controls left-right page turning of the screen and up-down scrolling of its contents; when the index-finger joint undergoes a large angle change, a single-click command is triggered once at the corresponding screen position, and two such changes within an interval of less than 0.5 seconds generate a double-click command, performing one double-click operation on the screen; and when a fist gesture command occurs, the user having autonomously judged whether the driving environment is safe, the vehicle immediately enters a decelerating driving state.
2. A DVS-based automobile cabin gesture recognition method, applied to the automobile cabin gesture recognition system of claim 1, wherein detection is based on a multi-modal fusion network model, the gesture recognition method comprising the following steps:
s1, capturing the user's gesture actions in real time through the DVS, generating gesture sequence frame images that form an event sequence of gesture images, which is further processed as the raw input of the network to extract temporal and spatial features, wherein gestures of different view angles and modes need to be acquired;
s2, detecting gesture key points: detecting the gesture key points in each frame of the gesture image time sequence using the key point detection model OpenPose, obtaining multi-view time-sequence coordinates of the gesture key points and capturing fine motion features and three-dimensional spatial information of the gesture;
s3, preprocessing the gesture images: performing scale normalization, image rotation, noise filtering, frame selection and modal registration preprocessing on the acquired gesture image time sequence;
s4, extracting spatial features of the gesture images: extracting features of each frame of the preprocessed gesture image time sequence using a lightweight MobileNetV3 network;
s5, multi-modal spatio-temporal feature network fusion: fusing the multi-view gesture key point time sequences and the image spatial feature maps obtained in steps s2 and s4 to construct spatio-temporal features of the gesture, which serve as the input of an LSTM network for gesture detection;
s6, post-processing of detection results: post-processing the prediction results of the LSTM, including multi-modal feature mapping, coordinate mapping, smoothing and three-dimensional reconstruction, to obtain the gesture category, the multi-view key point time sequences and the three-dimensional gesture spatial information.
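The gesture-matching and threshold-filtering steps of the output module (s2 and s3 of claim 1) can be sketched as follows. Cosine similarity as the matching-degree measure, the toy three-dimensional feature vectors and the 0.8 threshold are illustrative assumptions, not values specified by the patent:

```python
import numpy as np

def match_gesture(features, templates, threshold=0.8):
    """Match a gesture feature vector against library templates.

    Returns (best_label, score) if the best cosine match clears the
    threshold, otherwise (None, score) so the gesture is filtered out.
    """
    best_label, best_score = None, -1.0
    for label, tmpl in templates.items():
        # Cosine similarity as an assumed stand-in for the matching degree.
        score = float(np.dot(features, tmpl)
                      / (np.linalg.norm(features) * np.linalg.norm(tmpl)))
        if score > best_score:
            best_label, best_score = label, score
    if best_score < threshold:
        return None, best_score  # filtered: no control instruction generated
    return best_label, best_score

# Hypothetical gesture library; each template maps to one vehicle operation.
templates = {
    "open_left_front_window": np.array([1.0, 0.0, 0.0]),
    "raise_volume": np.array([0.0, 1.0, 0.0]),
}
label, score = match_gesture(np.array([0.95, 0.1, 0.0]), templates)
label2, score2 = match_gesture(np.array([0.5, 0.5, 0.7]), templates)  # ambiguous input
```

The first input matches the "open_left_front_window" template above the threshold and would generate the corresponding control instruction; the second is ambiguous, falls below the threshold and is filtered out, as in step s3.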
CN202311005673.1A 2023-08-10 2023-08-10 DVS-based automobile cabin gesture recognition system and method Active CN117218716B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311005673.1A CN117218716B (en) 2023-08-10 2023-08-10 DVS-based automobile cabin gesture recognition system and method


Publications (2)

Publication Number Publication Date
CN117218716A CN117218716A (en) 2023-12-12
CN117218716B true CN117218716B (en) 2024-04-09

Family

ID=89050131


Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8396252B2 (en) * 2010-05-20 2013-03-12 Edge 3 Technologies Systems and related methods for three dimensional gesture recognition in vehicles
WO2015184308A1 (en) * 2014-05-29 2015-12-03 Northwestern University Motion contrast depth scanning
KR102530219B1 (en) * 2015-10-30 2023-05-09 삼성전자주식회사 Method and apparatus of detecting gesture recognition error
KR20190104929A (en) * 2019-08-22 2019-09-11 엘지전자 주식회사 Method for performing user authentication and function execution simultaneously and electronic device for the same
US11144129B2 (en) * 2020-03-04 2021-10-12 Panasonic Avionics Corporation Depth sensing infrared input device and associated methods thereof

Patent Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103076877A (en) * 2011-12-16 2013-05-01 微软公司 Interacting with a mobile device within a vehicle using gestures
KR20140101276A (en) * 2013-02-07 2014-08-19 삼성전자주식회사 Method of displaying menu based on depth information and space gesture of user
KR20200055202A (en) * 2018-11-12 2020-05-21 삼성전자주식회사 Electronic device which provides voice recognition service triggered by gesture and method of operating the same
CN111988493A (en) * 2019-05-21 2020-11-24 北京小米移动软件有限公司 Interaction processing method, device, equipment and storage medium
CN111813224A (en) * 2020-07-09 2020-10-23 电子科技大学 Method for establishing and identifying fine gesture library based on ultrahigh-resolution radar
CN112507898A (en) * 2020-12-14 2021-03-16 重庆邮电大学 Multi-modal dynamic gesture recognition method based on lightweight 3D residual error network and TCN
CN112558305A (en) * 2020-12-22 2021-03-26 华人运通(上海)云计算科技有限公司 Control method, device and medium for display picture and head-up display control system
CN112905004A (en) * 2021-01-21 2021-06-04 浙江吉利控股集团有限公司 Gesture control method and device for vehicle-mounted display screen and storage medium
CN112949512A (en) * 2021-03-08 2021-06-11 豪威芯仑传感器(上海)有限公司 Dynamic gesture recognition method, gesture interaction method and interaction system
WO2022188259A1 (en) * 2021-03-08 2022-09-15 豪威芯仑传感器(上海)有限公司 Dynamic gesture recognition method, gesture interaction method, and interaction system
CN113807287A (en) * 2021-09-24 2021-12-17 福建平潭瑞谦智能科技有限公司 3D structured light face recognition method
CN114265498A (en) * 2021-12-16 2022-04-01 中国电子科技集团公司第二十八研究所 Method for combining multi-modal gesture recognition and visual feedback mechanism
CN114973408A (en) * 2022-05-10 2022-08-30 西安交通大学 Dynamic gesture recognition method and device
CN116071817A (en) * 2022-10-25 2023-05-05 中国矿业大学 Network architecture and training method of gesture recognition system for automobile cabin
CN116449947A (en) * 2023-03-22 2023-07-18 江苏北斗星通汽车电子有限公司 Automobile cabin domain gesture recognition system and method based on TOF camera

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Dynamic hand gesture recognition of Arabic sign language by using deep convolutional neural networks; Mohammad H. Ismail et al.; Indonesian Journal of Electrical Engineering and Computer Science; 2022-02-28; pp. 952-962 *
Research and application of a deep-learning-based multi-branch lightweight network model; Wang Linghao; China Master's Theses Full-text Database; 2023-01-15; pp. 30-40 *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant