CN114241456A - Safe driving monitoring method using feature adaptive weighting - Google Patents

Safe driving monitoring method using feature adaptive weighting

Info

Publication number
CN114241456A
CN114241456A
Authority
CN
China
Prior art keywords
features, convolution, input, adaptive weighting, feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111564304.7A
Other languages
Chinese (zh)
Inventor
路小波
陆明琦
胡耀聪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University
Priority to CN202111564304.7A
Publication of CN114241456A
Legal status: Pending (current)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/25 Fusion techniques
    • G06F 18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/042 Knowledge-based neural networks; Logical representations of neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a safe driving monitoring method using feature adaptive weighting. It studies a strategy for fusing global features and keypoint features at different scales. To address the attention problem in the fusion process, the invention does not directly cascade the global and keypoint features; instead, it provides a pose-based feature fusion module for the global and keypoint features. Category differences in driver behavior appear in different image regions, so the model must focus on different regions of different input images. An adaptive weighting module is therefore proposed that learns a set of input-specific expert weights to select the convolution kernels used in computation. This provides a new direction for driver action recognition and further improves the accuracy of driver behavior recognition. The invention has important application value in the field of traffic safety.

Description

Safe driving monitoring method using feature adaptive weighting
Technical Field
The invention belongs to the field of image processing and pattern recognition, and particularly relates to a safe driving monitoring method using feature adaptive weighting.
Background
Despite improvements in the safety of road and vehicle designs, the total number of fatal accidents is still increasing. The World Health Organization (WHO) 2017 global status report estimates that road traffic accidents cause about 1.25 million deaths worldwide each year and up to 50 million non-fatal injuries. In addition, road traffic accidents cause enormous property damage, and the number of road accidents due to distracted driving is rising steadily, so research on driver behavior recognition algorithms is an important but challenging task for road safety.
Disclosure of Invention
To solve these problems, the invention discloses a safe driving monitoring method using feature adaptive weighting; the feature fusion and dynamic convolution methods it employs improve the accuracy of driver behavior recognition at the testing stage.
To achieve this purpose, the technical scheme of the invention is as follows:
a safe driving monitoring method using feature adaptive weighting includes the following steps:
Step 1: use the existing StateFarm distracted driving dataset as the experimental dataset;
Step 2: construct the feature adaptive weighting model. ResNet is used as the network's global feature extractor, a pose estimation model captures keypoint-level semantics, and a pose-based feature fusion module fuses the global features with the keypoint features. The module adopts a multi-branch structure: one branch uses a global average pooling layer to extract attention for the global features, while the other directly applies pointwise convolution to extract channel attention for the keypoint features. The fused features are then fed into an adaptive weighting module to dynamically adjust the convolutional neural network: a fully connected layer dynamically generates a set of input-dependent weights, which are combined with the corresponding convolution parameters to generate a new convolution kernel. Finally, the module's output features are input to a classifier;
Step 201: for an input driving image, extract global features using ResNet as the model backbone;
Step 202: detect the driver's keypoints using a pose estimation model, generate a bounding box for each keypoint through post-processing, and extract and model the keypoint features via RoI Align; because keypoints may go undetected owing to occlusion and similar factors, a threshold is set on the keypoint response, and keypoints whose response falls below the threshold do not take part in subsequent computation; the network generates a separate feature map for each keypoint;
Step 203: the category differences of driver actions are mainly reflected in keypoint details, so the global features and the keypoint features are fused by cascading (channel-wise concatenation);
Step 204: to enhance the representation of useful keypoint feature channels and suppress irrelevant features, an adaptive weighting module is proposed to recalibrate the activation strengths of the different keypoint feature channels; the adaptive weighting module transforms the fused features before they are passed to the classifier; each input is treated as a linear combination of n experts when computing the convolution kernel; a detailed description of this module follows:
The adaptive weighting module places several convolution kernels in the convolution layer; the weight of each convolution kernel is determined from the convolution layer's input through a fully connected layer, and a weighted sum finally yields a set of convolution kernels customized to the input, with which one convolution is performed. The fused global and keypoint features serve as the input of the adaptive weighting module; because the fused feature reveals the driver action category, it guides the different experts to focus on the inputs they are interested in, and the n expert weights in the convolution layer are determined by the fused feature. In other words, the weights of the n experts differ across samples, so each input is processed with its own weights. Specifically, the expert weights α = r(f) are generated dynamically and combined with the corresponding original parameters to generate a new convolution kernel:
α = r(f) = S(FC(GAP(f)))    (5)
where S denotes the Sigmoid activation function, FC a fully connected layer, and GAP a global average pooling layer; the fully connected layer maps the processed fused features to the n expert weights. The convolution kernel associated with the input is computed as a function of the input sample and parameterized as (α₁·W₁ + … + αₙ·Wₙ); the generated convolution feature f_ID is calculated as:
f_ID = σ((α₁·W₁ + … + αₙ·Wₙ) * f)    (6)
where f denotes the input features, each αᵢ = rᵢ(f) is an input-dependent scalar weight, n is the number of experts, and σ is an activation function; evidently, the model capacity grows with the number of experts for different inputs, while only a small inference cost is added, because the convolution kernel is computed as a linear combination of the n expert kernels rather than by enlarging the kernel parameters or the channel count of the convolution layer;
The standard convolution layer is replaced with the adaptive weighting module to construct an adaptive block capable of learning specific convolution kernel parameters for each input driving image; to avoid the severe overfitting that an overly deep network would cause, and because the expert weights in the adaptive weighting module become more class-specific in the deeper layers of the network, adaptive blocks are used only in the last convolution group of ResNet;
Step 3: train the feature adaptive weighting model, using the SGD optimizer on the open-source platform PyTorch;
Step 4: test the feature adaptive weighting model.
The beneficial effects of the invention are as follows:
(1) the invention provides a feature fusion module based on gestures to fuse global and key point features, which enhances channel attention and fuses multi-scale feature context.
(2) The invention provides an adaptive weighting module, which customizes an independent convolution layer for each input sample and adjusts it dynamically.
(3) The method further improves the accuracy of identifying the driver behavior, and has important application value in the field of traffic safety.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 shows sample pictures of different driving behaviors in the present invention;
FIG. 3 is a schematic diagram of the structure of the feature adaptive weighting model in the present invention;
FIG. 4 is a schematic diagram of the structure of the adaptive weighting module of the present invention.
Detailed Description
The present invention will be further illustrated with reference to the accompanying drawings and specific embodiments, which are to be understood as merely illustrative of the invention and not as limiting the scope of the invention.
The invention, a safe driving monitoring method using feature adaptive weighting, is implemented in the following specific steps:
Step 1: use the existing StateFarm distracted driving dataset as the experimental dataset;
Step 2: construct the feature adaptive weighting model; FIG. 3 is a schematic diagram of the model's structure. The method uses ResNet as the network's global feature extractor and a pose estimation model to capture keypoint-level semantics, and fuses the global features with the keypoint features through a pose-based feature fusion module. The module adopts a multi-branch structure: one branch uses a global average pooling layer to extract attention for the global features, while the other directly applies pointwise convolution to extract channel attention for the keypoint features. The fused features are then fed into an adaptive weighting module to dynamically adjust the convolutional neural network: a fully connected layer dynamically generates a set of input-dependent weights, which are combined with the corresponding convolution parameters to generate a new convolution kernel. Finally, the module's output features are input to a classifier.
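By way of illustration, a minimal PyTorch sketch of such a two-branch fusion module follows. The class name, channel sizes, and the assumption that both feature maps share the same spatial resolution are illustrative choices for the sketch, not the patent's reference implementation:

```python
import torch
import torch.nn as nn

class PoseFeatureFusion(nn.Module):
    """Two-branch fusion of global and keypoint features (step 2).

    Branch 1 derives attention for the global features via global
    average pooling; branch 2 derives channel attention for the
    keypoint features via pointwise (1x1) convolution. The re-weighted
    features are then cascaded (concatenated) along the channel axis,
    as in step 203.
    """

    def __init__(self, global_ch=2048, keypoint_ch=256):
        super().__init__()
        # Branch 1: GAP -> 1x1 conv -> Sigmoid, one weight per global channel
        self.global_att = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(global_ch, global_ch, kernel_size=1),
            nn.Sigmoid(),
        )
        # Branch 2: pointwise convolution -> Sigmoid over keypoint channels
        self.keypoint_att = nn.Sequential(
            nn.Conv2d(keypoint_ch, keypoint_ch, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, f_global, f_keypoint):
        # Assumes both maps share the same spatial size (e.g. 7x7)
        g = f_global * self.global_att(f_global)
        k = f_keypoint * self.keypoint_att(f_keypoint)
        return torch.cat([g, k], dim=1)  # fused feature for the next module
```

Concatenating after re-weighting keeps both feature scales available to the adaptive weighting module downstream.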
Step 201: for an input driving image, extract global features using ResNet as the model backbone;
Step 202: detect the driver's keypoints using a pose estimation model, generate a bounding box for each keypoint through post-processing, and extract and model the keypoint features via RoI Align; because keypoints may go undetected owing to occlusion and similar factors, a threshold is set on the keypoint response, and keypoints whose response falls below the threshold do not take part in subsequent computation; the network generates a separate feature map for each keypoint;
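A sketch of the response thresholding and RoI Align extraction described in step 202 is given below; the threshold value, box size, feature-map scale, and function name are assumptions for illustration, with torchvision's roi_align standing in for the embodiment's own RoI feature extractor:

```python
import torch
from torchvision.ops import roi_align

def extract_keypoint_features(feature_map, keypoints, responses,
                              box_size=32.0, threshold=0.3,
                              spatial_scale=0.25):
    """Illustrative step 202 post-processing (all values assumed).

    feature_map: (1, C, H, W) backbone features for one image.
    keypoints:   (K, 2) image-space (x, y) keypoint locations.
    responses:   (K,) keypoint confidence from the pose model.
    Keypoints whose response falls below the threshold are discarded,
    so occluded joints do not enter subsequent computation; each
    surviving keypoint yields its own (C, 7, 7) feature map.
    """
    keep = responses >= threshold
    pts = keypoints[keep]
    half = box_size / 2
    # Square bounding box (x1, y1, x2, y2) centred on each keypoint
    boxes = torch.cat([pts - half, pts + half], dim=1)
    # roi_align expects rows of [batch_index, x1, y1, x2, y2]
    rois = torch.cat(
        [torch.zeros(len(boxes), 1, device=boxes.device), boxes], dim=1)
    return roi_align(feature_map, rois, output_size=(7, 7),
                     spatial_scale=spatial_scale)
```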
Step 203: the category differences of driver actions are mainly reflected in keypoint details, so the global features and the keypoint features are fused by cascading (channel-wise concatenation);
Step 204: to enhance the representation of useful keypoint feature channels and suppress irrelevant features, an adaptive weighting module is proposed to recalibrate the activation strengths of the different keypoint feature channels; in FIG. 3, the adaptive weighting module transforms the fused features before passing them to the classifier; as shown in FIG. 4, each input is treated as a linear combination of n experts when computing the convolution kernel; a detailed description of this module follows:
The adaptive weighting module places several convolution kernels in the convolution layer; the weight of each convolution kernel is determined from the convolution layer's input through a fully connected layer, and a weighted sum finally yields a set of convolution kernels customized to the input, with which one convolution is performed. The fused global and keypoint features serve as the input of the adaptive weighting module; because the fused feature reveals the driver action category, it guides the different experts to focus on the inputs they are interested in, and the n expert weights in the convolution layer are determined by the fused feature. In other words, the weights of the n experts differ across samples, so each input is processed with its own weights. Specifically, the expert weights α = r(f) are generated dynamically and combined with the corresponding original parameters to generate a new convolution kernel:
α = r(f) = S(FC(GAP(f)))    (5)
where S denotes the Sigmoid activation function, FC a fully connected layer, and GAP a global average pooling layer; the fully connected layer maps the processed fused features to the n expert weights. The convolution kernel associated with the input is computed as a function of the input sample and parameterized as (α₁·W₁ + … + αₙ·Wₙ); the generated convolution feature f_ID is calculated as:
f_ID = σ((α₁·W₁ + … + αₙ·Wₙ) * f)    (6)
where f denotes the input features, each αᵢ = rᵢ(f) is an input-dependent scalar weight, n is the number of experts, and σ is an activation function; evidently, the model capacity grows with the number of experts for different inputs, while only a small inference cost is added, because the convolution kernel is computed as a linear combination of the n expert kernels rather than by enlarging the kernel parameters or the channel count of the convolution layer;
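One way Eqs. (5) and (6) could be realized in PyTorch is sketched below. The grouped-convolution trick for applying a different kernel to each sample in a batch, the default of n = 4 experts, and the choice of ReLU for σ are assumptions of this sketch rather than details fixed by the text:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveWeightingConv(nn.Module):
    """Dynamic convolution per Eqs. (5)-(6): n expert kernels W_1..W_n
    are mixed per input sample with weights α = Sigmoid(FC(GAP(f))),
    and the mixed kernel is convolved with f."""

    def __init__(self, in_ch, out_ch, kernel_size=3, n_experts=4,
                 stride=1, act=None):
        super().__init__()
        # n expert convolution kernels, shape (n, out, in, k, k)
        self.weight = nn.Parameter(
            0.01 * torch.randn(n_experts, out_ch, in_ch,
                               kernel_size, kernel_size))
        self.router = nn.Linear(in_ch, n_experts)  # the FC layer of Eq. (5)
        self.stride = stride
        self.pad = kernel_size // 2
        # σ of Eq. (6); ReLU assumed by default, replaceable (e.g. Identity
        # when a BatchNorm + ReLU already follow in the surrounding block)
        self.act = act if act is not None else nn.ReLU()

    def forward(self, f):
        b = f.size(0)
        # Eq. (5): one expert-weight vector per input sample
        alpha = torch.sigmoid(
            self.router(F.adaptive_avg_pool2d(f, 1).flatten(1)))  # (b, n)
        # Per-sample kernel: α_1·W_1 + ... + α_n·W_n
        w = torch.einsum('bn,noihw->boihw', alpha, self.weight)
        out_ch = w.size(1)
        # Grouped convolution applies each sample's own kernel in one call
        f = f.reshape(1, -1, f.size(2), f.size(3))
        w = w.reshape(b * out_ch, -1, w.size(3), w.size(4))
        out = F.conv2d(f, w, stride=self.stride, padding=self.pad, groups=b)
        return self.act(out.reshape(b, out_ch, out.size(2), out.size(3)))
```

Because only the mixing weights depend on the input, the extra cost over a standard convolution is one global pooling, one small fully connected layer, and the kernel summation, matching the small inference cost noted above.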
The standard convolution layer is replaced with the adaptive weighting module to construct an adaptive block capable of learning specific convolution kernel parameters for each input driving image; to avoid the severe overfitting that an overly deep network would cause, and because the expert weights in the adaptive weighting module become more class-specific in the deeper layers of the network, adaptive blocks are used only in the last convolution group of ResNet.
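Assuming a torchvision ResNet-50 backbone, restricting adaptive blocks to the last convolution group (layer4) could be wired as follows, building on the AdaptiveWeightingConv sketch above; the ten output classes correspond to the StateFarm behavior categories:

```python
import torch
import torchvision

def build_adaptive_resnet(n_experts=4, num_classes=10):
    """Swap the 3x3 convolutions of ResNet-50's last stage (layer4)
    for adaptive weighting modules; earlier stages keep standard
    convolutions to limit overfitting, as described above."""
    net = torchvision.models.resnet50(weights=None)
    for block in net.layer4:
        conv = block.conv2  # the 3x3 convolution in each bottleneck
        block.conv2 = AdaptiveWeightingConv(
            conv.in_channels, conv.out_channels, kernel_size=3,
            n_experts=n_experts, stride=conv.stride[0],
            act=torch.nn.Identity())  # bn2 + ReLU of the block still follow
    net.fc = torch.nn.Linear(net.fc.in_features, num_classes)
    return net
```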
Step 3: train the feature adaptive weighting model, using the SGD optimizer on the open-source platform PyTorch;
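A minimal training loop consistent with step 3 (the SGD optimizer on PyTorch) is sketched below; the learning rate, momentum, weight decay, and epoch count are illustrative assumptions, since the text specifies only the optimizer and the platform:

```python
import torch
from torch import nn, optim

def train(model, loader, epochs=30, lr=0.01):
    """Train the feature adaptive weighting model with SGD (step 3)."""
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    model = model.to(device).train()
    criterion = nn.CrossEntropyLoss()  # 10-way driver-behavior classification
    optimizer = optim.SGD(model.parameters(), lr=lr,
                          momentum=0.9, weight_decay=1e-4)
    for _ in range(epochs):
        for images, labels in loader:  # loader yields StateFarm batches
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
    return model
```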
Step 4: test the feature adaptive weighting model.
It should be noted that the foregoing merely illustrates the technical idea of the present invention and does not thereby limit its protection scope; it will be apparent to those skilled in the art that several modifications and refinements can be made without departing from the principle of the invention, and such modifications and refinements also fall within the protection scope of the claims of the present invention.

Claims (2)

1. A safe driving monitoring method using feature adaptive weighting is characterized by comprising the following steps:
step 1: use the existing StateFarm distracted driving dataset as the experimental dataset;
step 2: construct a feature adaptive weighting model; ResNet is used as the network global feature extractor, a pose estimation model captures keypoint-level semantics, and the global features and keypoint features are fused through a pose-based feature fusion module; the module adopts a multi-branch structure, one branch using a global average pooling layer to extract attention for the global features and the other directly applying pointwise convolution to extract channel attention for the keypoint features; the fused features are then fed into an adaptive weighting module to dynamically adjust the convolutional neural network; a fully connected layer dynamically generates a set of input-dependent weights, which are combined with the corresponding convolution parameters to generate a new convolution kernel; finally, the module's output features are input to a classifier;
step 3: train the feature adaptive weighting model, using the SGD optimizer on the open-source platform PyTorch;
step 4: test the feature adaptive weighting model.
2. The safe driving monitoring method using feature adaptive weighting according to claim 1, wherein step 2 constructs the feature adaptive weighting model as follows: ResNet is used as the network global feature extractor, a pose estimation model captures keypoint-level semantics, and the global features and keypoint features are fused through a pose-based feature fusion module; the module adopts a multi-branch structure, one branch using a global average pooling layer to extract attention for the global features and the other directly applying pointwise convolution to extract channel attention for the keypoint features; the fused features are then fed into an adaptive weighting module to dynamically adjust the convolutional neural network; a fully connected layer dynamically generates a set of input-dependent weights, which are combined with the corresponding convolution parameters to generate a new convolution kernel; finally, the module's output features are input to a classifier; the specific steps are as follows:
step 201: for an input driving image, extract global features using ResNet as the model backbone;
step 202: detect the driver's keypoints using a pose estimation model, generate a bounding box for each keypoint through post-processing, and extract and model the keypoint features via RoI Align; because keypoints may go undetected owing to occlusion and similar factors, a threshold is set on the keypoint response, and keypoints whose response falls below the threshold do not take part in subsequent computation; the network generates a separate feature map for each keypoint;
step 203: the category differences of driver actions are mainly reflected in keypoint details, so the global features and the keypoint features are fused by cascading (channel-wise concatenation);
step 204: to enhance the representation of useful keypoint feature channels and suppress irrelevant features, an adaptive weighting module is proposed to recalibrate the activation strengths of the different keypoint feature channels; the adaptive weighting module transforms the fused features before they are passed to the classifier; each input is treated as a linear combination of n experts when computing the convolution kernel; a detailed description of this module follows:
the adaptive weighting module places several convolution kernels in the convolution layer; the weight of each convolution kernel is determined from the convolution layer's input through a fully connected layer, and a weighted sum finally yields a set of convolution kernels customized to the input, with which one convolution is performed; the fused global and keypoint features serve as the input of the adaptive weighting module; because the fused feature reveals the driver action category, it guides the different experts to focus on the inputs they are interested in, and the n expert weights in the convolution layer are determined by the fused feature; in other words, the weights of the n experts differ across samples, so each input is processed with its own weights; specifically, the expert weights α = r(f) are generated dynamically and combined with the corresponding original parameters to generate a new convolution kernel:
α = r(f) = S(FC(GAP(f)))    (5)
where S denotes the Sigmoid activation function, FC a fully connected layer, and GAP a global average pooling layer; the fully connected layer maps the processed fused features to the n expert weights; the convolution kernel associated with the input is computed as a function of the input sample and parameterized as (α₁·W₁ + … + αₙ·Wₙ); the generated convolution feature f_ID is calculated as:
f_ID = σ((α₁·W₁ + … + αₙ·Wₙ) * f)    (6)
where f denotes the input features, each αᵢ = rᵢ(f) is an input-dependent scalar weight, n is the number of experts, and σ is an activation function; evidently, the model capacity grows with the number of experts for different inputs, while only a small inference cost is added, because the convolution kernel is computed as a linear combination of the n expert kernels rather than by enlarging the kernel parameters or the channel count of the convolution layer;
the standard convolution layer is replaced with the adaptive weighting module to construct an adaptive block capable of learning specific convolution kernel parameters for each input driving image; to avoid the severe overfitting that an overly deep network would cause, and because the expert weights in the adaptive weighting module become more class-specific in the deeper layers of the network, adaptive blocks are used only in the last convolution group of ResNet.
Application CN202111564304.7A, priority date 2021-12-20, filing date 2021-12-20: Safe driving monitoring method using feature adaptive weighting. Status: Pending. Publication: CN114241456A (en).

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111564304.7A CN114241456A (en) 2021-12-20 2021-12-20 Safe driving monitoring method using feature adaptive weighting

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111564304.7A CN114241456A (en) 2021-12-20 2021-12-20 Safe driving monitoring method using feature adaptive weighting

Publications (1)

Publication Number Publication Date
CN114241456A true CN114241456A (en) 2022-03-25

Family

ID=80759466

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111564304.7A Pending CN114241456A (en) 2021-12-20 2021-12-20 Safe driving monitoring method using feature adaptive weighting

Country Status (1)

Country Link
CN (1) CN114241456A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115272992A (en) * 2022-09-30 2022-11-01 松立控股集团股份有限公司 Vehicle attitude estimation method
CN117576666A (en) * 2023-11-17 2024-02-20 合肥工业大学 Dangerous driving behavior detection method based on multi-scale dynamic convolution attention weighting
CN117576666B (en) * 2023-11-17 2024-05-10 合肥工业大学 Dangerous driving behavior detection method based on multi-scale dynamic convolution attention weighting

Similar Documents

Publication Publication Date Title
Guo et al. Driver drowsiness detection using hybrid convolutional neural network and long short-term memory
CN108133188B (en) Behavior identification method based on motion history image and convolutional neural network
CN110276765B (en) Image panorama segmentation method based on multitask learning deep neural network
Neil et al. Learning to be efficient: Algorithms for training low-latency, low-compute deep spiking neural networks
CN110516536B (en) Weak supervision video behavior detection method based on time sequence class activation graph complementation
CN110110689B (en) Pedestrian re-identification method
CN109034264B (en) CSP-CNN model for predicting severity of traffic accident and modeling method thereof
CN109145712B (en) Text information fused GIF short video emotion recognition method and system
CN110276248B (en) Facial expression recognition method based on sample weight distribution and deep learning
CN114241456A (en) Safe driving monitoring method using feature adaptive weighting
CN115082698B (en) Distraction driving behavior detection method based on multi-scale attention module
CN112861945B (en) Multi-mode fusion lie detection method
Pratama et al. Deep convolutional neural network for hand sign language recognition using model E
CN115280373A (en) Managing occlusions in twin network tracking using structured dropping
CN110633689B (en) Face recognition model based on semi-supervised attention network
CN114241458B (en) Driver behavior recognition method based on attitude estimation feature fusion
CN111797705A (en) Action recognition method based on character relation modeling
Siddiqi Fruit-classification model resilience under adversarial attack
CN109190471B (en) Attention model method for video monitoring pedestrian search based on natural language description
CN114492634A (en) Fine-grained equipment image classification and identification method and system
CN113221683A (en) Expression recognition method based on CNN model in teaching scene
CN113361466A (en) Multi-modal cross-directed learning-based multi-spectral target detection method
CN113205044B (en) Deep fake video detection method based on characterization contrast prediction learning
Singh Image spam classification using deep learning
Karthigayan et al. Genetic algorithm and neural network for face emotion recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination